Ticket #2386 (closed defect: fixed)
Interpretation of LANG variable needs to be case insensitive.
Reported by: | urkle | Owned by: | andrew_b |
---|---|---|---|
Priority: | major | Milestone: | 4.8.3 |
Component: | mc-core | Version: | 4.7.4 |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Branch state: | merged | Votes for changeset: | committed-master committed-stable |
Description
Related bug in iTerm 2
http://code.google.com/p/iterm2/issues/detail?id=204
When the LANG variable is set to en_US.utf-8, mcedit does not accept input correctly (every key press is interpreted as a '.'). When LANG is set to en_US.UTF-8, mcedit works correctly.
Work on the iTerm 2 bug revealed that the real problem is in Midnight Commander, which does not handle the LANG and LC_* environment variables correctly.
From the IANA document on character sets:
The character set names may be up to 40 characters taken from the
printable characters of US-ASCII. However, no distinction is made
between use of upper and lower case letters.
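The rule quoted above can be illustrated with a minimal C sketch of a case-insensitive comparison of character set names. This is not the code from the MC fix; the helper names `ascii_lower` and `charset_names_equal` are hypothetical. Because charset names are restricted to US-ASCII, the sketch folds only the letters A-Z and does not depend on the current locale.

```c
#include <stddef.h>

/* Fold an ASCII uppercase letter to lowercase; leave every other byte unchanged. */
static char ascii_lower(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/*
 * Hypothetical helper (not part of MC): return 1 if two charset names are
 * equal when ASCII case is ignored, 0 otherwise. NULL never matches.
 */
int charset_names_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return 0;

    while (*a != '\0' && *b != '\0') {
        if (ascii_lower(*a) != ascii_lower(*b))
            return 0;
        a++;
        b++;
    }

    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}
```

With this helper, `charset_names_equal("utf-8", "UTF-8")` is true, so both spellings of the LANG codeset would be treated the same way.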
Attachments
Change History
comment:2 Changed 14 years ago by urkle
MC version is 4.7.+ (first noticed with 4.7.0.3; currently using 4.7.4)
glib2 version is 2.22.4
MC is currently built with slang (issue occurred when built with ncurses as well)
This is on Mac OS X 10.6.4.
The issue is NOT specific to iTerm either: the standard Mac OS X Terminal exhibits the same behavior if LANG is set to a lowercase utf-8 (the default there is upper case, though).
BTW, I can't reproduce it on my Linux box, only on the Mac system.
comment:3 Changed 14 years ago by urkle
I attached a test C++ program that I originally wrote for a different purpose, but it does show some "oddities" in how Mac OS X and Linux report the character set.
Specifically, the nl_langinfo(CODESET) call:
On Linux it ALWAYS returns upper case UTF-8, whether LANG is set to utf-8 or UTF-8.
On Mac OS X, it returns the same case as the LANG input.
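The difference described above can be observed with a small C sketch. This is not the attached test program, which is not reproduced here; the function name `current_codeset` is hypothetical. It relies only on the POSIX calls `setlocale` and `nl_langinfo`.

```c
#include <langinfo.h>
#include <locale.h>

/*
 * Hypothetical helper: adopt the locale from the environment (LANG, LC_*)
 * and return the codeset name the C library reports for it. The returned
 * string is owned by the C library and may be overwritten by later calls
 * to nl_langinfo() or setlocale().
 */
const char *current_codeset(void)
{
    /* An empty locale name selects the locale from the environment.
     * If this call fails, the process stays in the default "C" locale. */
    setlocale(LC_ALL, "");

    return nl_langinfo(CODESET);
}
```

Printing the result from a small `main` while running once with LANG=en_US.utf-8 and once with LANG=en_US.UTF-8 should, per the report above, give the same string on Linux and differently cased strings on Mac OS X.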
comment:4 Changed 13 years ago by andrew_b
- Branch state set to no branch
- Milestone changed from 4.7 to Future Releases
comment:5 Changed 13 years ago by andrew_b
- Owner set to andrew_b
- Status changed from new to accepted
- Component changed from mcedit to mc-core
- Branch state changed from no branch to on review
- Milestone changed from Future Releases to 4.8.3
Branch: 2386_LANG_case_insensitive (parent: master).
changeset:c45e5a67123f6c483a4032a7130042295a273254
urkle, please test this fix.
comment:7 Changed 13 years ago by angel_il
- Votes for changeset changed from slavazanko to slavazanko angel_il
- Branch state changed from on review to approved
comment:8 Changed 13 years ago by andrew_b
- Keywords stable-candidate added
- Status changed from accepted to testing
- Votes for changeset changed from slavazanko angel_il to committed-master
- Resolution set to fixed
- Branch state changed from approved to merged
Merged to master: [91ff90f87b3ea7dd17974097a7b6e67190db11cc].
comment:9 Changed 13 years ago by andrew_b
- Status changed from testing to closed
- Keywords stable-candidate removed
- Votes for changeset changed from committed-master to committed-master committed-stable
Merged to 4.8.1-stable: [d474cad4e35680b2976ba714c0ea344ef23b6746].
MC doesn't directly interpret the LC_* and LANG variables. It detects the encoding using nl_langinfo (CODESET).
I cannot reproduce this bug on Linux. Both ru_RU.UTF-8 and ru_RU.utf-8 values of LANG are interpreted as a UTF-8 locale, and MC works fine for me with both values.
I can't find MC details at http://code.google.com/p/iterm2/issues/detail?id=204: MC version, GLib version, and which screen library MC is built with (S-Lang or NCurses).