Ticket #1611 (closed enhancement: fixed)
--enable-charset by default
Reported by: | egmont | Owned by: | iNode |
---|---|---|---|
Priority: | major | Milestone: | 4.7.0-pre3 |
Component: | mc-core | Version: | master |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Branch state: | | Votes for changeset: | committed-master |
Description (last modified by iNode)
viewer/editor not doing utf-8
mc-4.7.0-pre2 with UTF-8 everywhere. Locale is set to UTF-8, mc's Display bits is UTF-8 too. The main screen is fine.
mc's builtin viewer and editor, though, still use some 8-bit character set, so accents don't appear correctly.
This is a serious regression from 4.6.x+utf8 patches where the viewer and editor had reasonably good UTF-8 support.
Change History
comment:1 in reply to: ↑ description Changed 15 years ago by andrew_b
comment:2 Changed 15 years ago by egmont
Sorry, I wasn't clear. They do assume UTF-8 when communicating with the terminal, but they assume ISO-8859-whatever encoding for the content of the file. So it appears in "double utf8" encoding, every accented letter replaced by two symbols.
E.g. The file contains (in UTF-8 encoding): áéõûőű
What I see in mcview/mcedit: áéõûÅ<ű
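The "double utf8" effect can be reproduced outside mc. The following is a minimal Python sketch, not mc's actual code: it assumes the viewer treats each byte of the file as one Latin-1 character and then re-encodes the result as UTF-8 for the terminal. The helper name `misread_as_latin1` is made up for illustration.

```python
def misread_as_latin1(text: str) -> str:
    """Simulate a viewer that decodes UTF-8 file content as Latin-1.

    Hypothetical helper for illustration only; it is not part of mc.
    """
    raw = text.encode("utf-8")      # bytes as stored in the file
    return raw.decode("latin-1")    # each byte becomes one character


original = "áéõûőű"
garbled = misread_as_latin1(original)

# Every accented letter here is two bytes in UTF-8,
# so the misread text has twice as many characters.
print(len(original), len(garbled))

# The damage is reversible: undoing the wrong decode restores the text.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored == original)
```

Because Latin-1 maps every byte value to a character, the wrong decode never fails; it silently produces two symbols per accented letter, which matches the symptom described above.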
Is it supposed to work correctly? I'd be more than happy to hear that it's already implemented and it's just something unusual in my environment. Perhaps I forgot an option to ./configure, or need to change something in the settings? Any ideas? I'm eager to figure it out.
Thanks!
comment:3 Changed 15 years ago by egmont
Hah, --enable-charset lets you choose the charset of the file, including "No translation" and UTF-8. Then it works fine.
Without charset support, the default is "no translation" for filenames, but "latin1" for file content. This doesn't sound logical to me. I think the behavior should be "no translation" for file contents too.
Or, alternatively, charset support should be turned on by default.
Nowadays more and more distributions use UTF-8 by default and it's the recommended encoding for everything: filenames, file content etc. Imagine thousands of users downloading and installing mc-4.7 just as I did and figuring out that file contents are not displayed correctly. Imagine tons of stupid bug reports just as this one :) You don't want that, users don't want that either. The default behavior (simplest way of compiling and running mc) should provide proper support for fully UTF-8 systems.
I've got some other similar philosophical corner cases; I'll file separate reports for them.
Overall, however, mc's forthcoming official UTF-8 support looks super great, HUGE THANKS to everyone involved!!!
comment:4 Changed 15 years ago by iNode
- Owner set to iNode
- Status changed from new to accepted
- Description modified
- Summary changed from viewer/editor not doing utf-8 to --enable-charset by default
Yes, egmont, you are right.
I also propose --enable-charset by default.
comment:5 Changed 15 years ago by iNode
- Version changed from 4.7.0-pre2 to master
- Type changed from defect to enhancement
- severity changed from no branch to on review
- Milestone changed from 4.7 to 4.7.0-pre3
branch: 1611_autoconf_enable_charset (parent: master)
changeset: 95201d8574747ae459f0f64b2fcdcaa095839ef1
comment:7 Changed 15 years ago by slavazanko
- Votes for changeset changed from angel_il to angel_il slavazanko
- severity changed from on review to approved
Replying to egmont:
What do you mean?