Ticket #2386 (closed defect: fixed)

Opened 7 years ago

Last modified 6 years ago

Interpretation of LANG variable needs to be case insensitive.

Reported by: urkle
Owned by: andrew_b
Priority: major
Milestone: 4.8.3
Component: mc-core
Version: 4.7.4
Keywords:
Cc:
Blocked By:
Blocking:
Branch state: merged
Votes for changeset: committed-master committed-stable


Related bug in iTerm 2


When the LANG variable is set to en_US.utf-8, mcedit does not accept input correctly: every key press is interpreted as a '.'. When LANG is set to en_US.UTF-8, mcedit works correctly.

Work on the bug filed against iTerm 2 showed that Midnight Commander is in fact not handling the LANG and LC_* environment variables correctly.

From the IANA document on character sets:

The character set names may be up to 40 characters taken from the
printable characters of US-ASCII. However, no distinction is made
between use of upper and lower case letters.
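
A minimal sketch of what a case-insensitive check on a character set name could look like in C. This is not the fix that was committed to MC; the function names ascii_lower, charset_name_equal and codeset_is_utf8 are hypothetical and chosen only for illustration. The comparison folds ASCII letters by hand so that it does not depend on the current locale.

```c
#include <stddef.h>

/* Lower-case one ASCII letter; every other byte passes through unchanged.
   Deliberately locale-independent, unlike tolower(). */
unsigned char
ascii_lower (unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char) (c - 'A' + 'a') : c;
}

/* Return 1 if two charset names are equal ignoring ASCII case, else 0.
   A NULL argument never matches. */
int
charset_name_equal (const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return 0;

    while (*a != '\0' && *b != '\0')
    {
        if (ascii_lower ((unsigned char) *a) != ascii_lower ((unsigned char) *b))
            return 0;
        a++;
        b++;
    }

    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper: does this codeset name spell "UTF-8" in any case? */
int
codeset_is_utf8 (const char *codeset)
{
    return charset_name_equal (codeset, "UTF-8");
}
```

With this sketch, codeset_is_utf8 ("utf-8") and codeset_is_utf8 ("UTF-8") both return 1. Aliases without the hyphen, such as "utf8", are not handled here.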



utf8.cc (2.1 KB) - added by urkle 7 years ago.
UTF test script

Change History

comment:1 Changed 7 years ago by andrew_b

MC doesn't directly interpret the LC_* and LANG variables. It detects the encoding using nl_langinfo (CODESET).

I cannot reproduce this bug on Linux. Both ru_RU.UTF-8 and ru_RU.utf-8 values of LANG are interpreted as a UTF-8 locale, and MC works fine for me with both values.

I can't find MC details at http://code.google.com/p/iterm2/issues/detail?id=204: MC version, GLib version, which screen library MC is built with (S-Lang or NCurses).

comment:2 Changed 7 years ago by urkle

MC version has been 4.7.+ (currently using 4.7.4)

glib2 version is 2.22.4
MC is currently built with slang (issue occurred when built with ncurses as well)

This is on Mac OS X 10.6.4.

And the issue is NOT specific to iTerm either: the standard Mac OS X Terminal exhibits the same behavior if LANG is set to a lowercase utf-8 (the default there is upper case, though).

BTW, I can't reproduce it on my Linux box either, only on the Mac system.

Changed 7 years ago by urkle

UTF test script

comment:3 Changed 7 years ago by urkle

I attached a C++ test program that I originally wrote for a different purpose, but it shows some "oddities" in how Mac OS X and Linux report information about the character set.

Specifically, the nl_langinfo(CODESET); call.

On Linux, it ALWAYS returns upper case UTF-8 whether LANG is set to utf-8 or UTF-8.
On Mac OS X, it returns the same case as the LANG input.
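
A rough way to observe this difference, sketched in C. This is not the attached utf8.cc; codeset_for_locale and print_codesets are hypothetical names, and the locale names en_US.UTF-8 and en_US.utf-8 are assumed to be installed on the system under test. Note that setlocale changes the process-wide locale.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Switch LC_CTYPE to locale_name and copy the reported codeset into buf.
   Returns 0 on success, -1 if the locale is unavailable or buf is too small. */
int
codeset_for_locale (const char *locale_name, char *buf, size_t buf_size)
{
    const char *codeset;

    if (setlocale (LC_CTYPE, locale_name) == NULL)
        return -1;

    codeset = nl_langinfo (CODESET);
    if (strlen (codeset) >= buf_size)
        return -1;

    strcpy (buf, codeset);
    return 0;
}

/* Print the codeset reported for each spelling of the locale name. */
void
print_codesets (void)
{
    static const char *const names[] = { "en_US.UTF-8", "en_US.utf-8" };
    char buf[64];
    size_t i;

    for (i = 0; i < sizeof (names) / sizeof (names[0]); i++)
    {
        if (codeset_for_locale (names[i], buf, sizeof (buf)) == 0)
            printf ("%-12s -> %s\n", names[i], buf);
        else
            printf ("%-12s -> (locale not available)\n", names[i]);
    }
}
```

Calling print_codesets () from main on each platform should show whether the codeset string follows the case used in the locale name.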

comment:4 Changed 6 years ago by andrew_b

  • Branch state set to no branch
  • Milestone changed from 4.7 to Future Releases

comment:5 Changed 6 years ago by andrew_b

  • Owner set to andrew_b
  • Status changed from new to accepted
  • Component changed from mcedit to mc-core
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.3

Branch: 2386_LANG_case_insensitive (parent: master).

urkle, please test this fix.

comment:6 Changed 6 years ago by slavazanko

  • Votes for changeset set to slavazanko

comment:7 Changed 6 years ago by angel_il

  • Votes for changeset changed from slavazanko to slavazanko angel_il
  • Branch state changed from on review to approved

comment:8 Changed 6 years ago by andrew_b

  • Status changed from accepted to testing
  • Keywords stable-candidate added
  • Votes for changeset changed from slavazanko angel_il to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

comment:9 Changed 6 years ago by andrew_b

  • Keywords stable-candidate removed
  • Status changed from testing to closed
  • Votes for changeset changed from committed-master to committed-master committed-stable