Ticket #1668 (closed defect: fixed)

Opened 10 years ago

Last modified 8 years ago

[patch] Screen and input corruption under xterm [non-UTF]

Reported by: gotar Owned by: andrew_b
Priority: critical Milestone: 4.8.0-pre1
Component: mcview Version: master
Keywords: Cc: arekm@…, paul.hartman+gentoo@…, zaytsev, andrixnet@…
Blocked By: Blocking:
Branch state: Votes for changeset: committed-master committed-stable

Description

LANG=
LC_CTYPE=pl_PL
LC_COLLATE=pl_PL
LC_[*]="POSIX"
LC_ALL=

Input / display codepage (i.e. display_codepage): ISO-8859-2
source_codepage (set via 'Fix it'): ISO-8859-2

However the same happens with any display_codepage different from 7-bit ASCII or UTF-8 and any (non-UTF at least) locale setting.

The problem:
viewing binary files leads to massive screen corruption and Search dialog pops up with 1;2c search string (multiple times depending on actual screen contents). So it looks like the file 'presses' F7 or / and shift-right_arrow for every specified character combination occurrence. In case of bigger files it's impossible to exit from such viewer, as search dialog keeps popping up after closing.
No such behaviour on linux console with any circumstances.

I've minimized the problem to 2-byte file: 0x9A 0x14 (attached).

GNU Midnight Commander 4.7.0-pre3
Virtual File System: tarfs, extfs, cpiofs, ftpfs, fish, mcfs
With builtin Editor
Using system-installed S-Lang library with terminfo database
With subshell support as default
With support for background operations
With mouse support on xterm and Linux console
With support for X11 events
With internationalization support
With multiple codepages support
Data types: char 8 int 32 long 32 void * 32 off_t 64 ecs_char 8

Attachments

mcview (2 bytes) - added by gotar 10 years ago.
viewer-printable_only.patch (478 bytes) - added by gotar 10 years ago.
slang-8bit_xterm.patch (468 bytes) - added by gotar 10 years ago.

Change History

Changed 10 years ago by gotar

comment:1 Changed 10 years ago by gotar

The problem exist when current panel translation (c-x t) is set to no translation (my default) or any ISO. I don't see any connection of translating file names and displaying file contents, but apparently something here depends on something that it shouldn't...

comment:2 Changed 10 years ago by arekm

  • Cc arekm@… added

comment:3 Changed 10 years ago by gotar

  • Version changed from 4.7.0-pre3 to 4.7.0-pre4

This error manifest themself only when using slang library.

comment:4 Changed 10 years ago by gotar

..and only when compiled with --enable-charset.

comment:5 Changed 10 years ago by gotar

  • Summary changed from Screen and input corruption under xterm [non-UTF] to [patch] Screen and input corruption under xterm [non-UTF]

OK, as apparently noone even took a look at this ticket so here's what I've found on my own:

  1. the problem is 0x9A (154) character itself (Ü - U Dieresis in CP850, U Trema in CP1252 or љ in CP1251). As it is vt220 control code on xterm it causes the terminal to dump it's state. Being KOI8-R printable character it does no charm if such display codepage was selected (and so Latin1).

2a. util.c/strip_ctrl_codes mentions http://www.xfree86.org/current/ctlseqs.html for skipping xterm's OSC (Operating System Command), where

ESC Z Return Terminal ID (DECID is 0x9a). Obsolete form of CSI c (DA).

but this stripping doesn't occour in viewer.

2b. there are is*_printable functions in util.c which use xterm_flag to exclude characters 128-154,156-159 as '"Full 8 bits output" doesn't work on xterm'. None of these functions are used either in viewer now, while one was used in view.c/display before 4.7.0pre.

  1. main.c states:
  • If true, also allow characters in the range 128-159.
  • This is reported to break on many terminals (xterm, qansi-m).

and defines constant full_eight_bits (1) if charset support is enabled (it can't be disabled via full_eight_bits=0 in ini file).

  1. editor checks for g_unichar_isprint or is_printable according to UTF8, hex viewer checks g_ascii_isprint and they replace nonprintable characters with dot mark, while nroff and plain viewer just call tty_print_anychar which in turn calls (when compiled with slang not ncurses) SLsmg_write_char - as full_eight_bits is hardcoded on and so SLsmg_Display_Eight_Bit is set to 128 instead of 160 this leads to problem from subject.

There are two solutions:
a) bring back missing is_printable checks (someone please check UTF8 interaction),
a) make tty/tty-slang.c/tty_display_8bit use xterm_flag (this seems the most logical).

The difference is that a) display dots where b) displays mostly carets - the latter is some kind of mistake, because in real string like <8D> is printed, but cursor position moves just 1 character forward, so any next overwrites second one ('8' here) and so on.

Changed 10 years ago by gotar

Changed 10 years ago by gotar

comment:6 Changed 10 years ago by gotar

Oops, tty_display_8bit is called from setup_pre and it's too early to use xterm_flag (always equals 0 then). Updated patch to recall when needed in init_xterm_support.
Therefore SLsmg_Display_Eight_Bit is always raised to 160 on xterm. Doesn't this way break KOI8-R? Maybe is_printable() approach is better.

comment:7 Changed 9 years ago by slavazanko

  • Owner set to slavazanko
  • Status changed from new to accepted

comment:8 Changed 9 years ago by slavazanko

  • severity changed from no branch to on rework

Created branch 1668_io_corruption_xterm_nonutf

Initial changeset: efe0b37d6c164fd962ae55f34847b1b0a96cc973
Status of branch: on rework.

I tested on 'mcview' attachment and I don't see massive screen corruption.
May be, my tests was incorrect.. :(

This branch for testing only. As I tested under koi8-r - all seems ok, but need to ask 'koi8'-native users :)

comment:9 follow-up: ↓ 16 Changed 9 years ago by gotar

At first just do:

cat mcview

you should end up with:

$ cat mcview
[[?1;2c
$ 1;2c
with this in command line. The same happens when mcview displays character with 194 code. My test file won't cause _massive_ corruption, as it has only single occurence (in contrary to random binary file which has many of them).

As for my patches, only slang-8bit_xterm.patch could influence somehow koi8-r users, viewer-printable_only.patch is fully safe (but I'm not sure how 'proper' in terms of mc code). And both of them address the same issue, so there's no need to use them together (unless both would be considered as proper).

BTW there is -k8/allowC1Printable xterm option since: http://cseweb.ucsd.edu/~mbtaylor/xterm.log.html#xterm_175 which addresses the problem for KOI8-R users and so 'fixes' mc behaviour - make sure you don't have it enabled for testing.

comment:10 Changed 9 years ago by paul.hartman+gentoo@…

  • Cc paul.hartman+gentoo@… added

For me this problem is temporarily fixed if I choose Options -> Display Bits and then select "OK". I don't actually have to change anything, just open that dialog and then "OK" is enough. From there it does not have the problem for the remainder of the session. So I don't know if this "OK" initializes something differently than what mc does on startup?

I use ISO 8859-1 and do not use "Full 8 bits support".

GNU Midnight Commander 4.7.0.2
Virtual File System: tarfs, extfs, cpiofs, ftpfs, fish, smbfs, undelfs
With builtin Editor
Using system-installed S-Lang library with terminfo database
With subshell support as default
With support for background operations
With mouse support on xterm and Linux console
With support for X11 events
With internationalization support
With multiple codepages support
Data types: char 8 int 32 long 64 void * 64 off_t 64 ecs_char 8

comment:11 Changed 9 years ago by zaytsev

  • Cc zaytsev added

Can't confirm on Xterm, but Xterm + screen exhibits same issue. Previous comment fixes it for me. Might be related to #2140 ...

comment:12 Changed 9 years ago by andrew_b

Created 1668_screen_corruption branch.
Parent branch is master.
changest:97967701ff7f7a776b47d571c7db42fc646991f0

In this changeset, the call of 'Display bits' dialog is emulated.

comment:13 follow-up: ↓ 14 Changed 9 years ago by zaytsev

Guys, it seems to me that you are quite crazy :) To me changeset:97967701ff7f7a776b47d571c7db42fc646991f0 looks like a horrible kludge. Is there any particular reason why you can't just follow gotar's approach with is_printable() checks, he did a great job at figuring out the true reason for this breakage...

comment:14 in reply to: ↑ 13 Changed 9 years ago by andrew_b

Replying to zaytsev:

Guys, it seems to me that you are quite crazy :) To me changeset:97967701ff7f7a776b47d571c7db42fc646991f0 looks like a horrible kludge.

Ok, forget this. :) In any case, this bug is not reproducable for me at all.

comment:15 Changed 9 years ago by zaytsev

Of course you cannot reproduce this bug! You said that you are using KOI8-R as a system locale:

Replying to gotar:

  1. the problem is 0x9A (154) character itself (Ü - U Dieresis in CP850, U Trema in CP1252 or љ in CP1251). As it is vt220 control code on xterm it causes the terminal to dump it's state. Being KOI8-R printable character it does no charm if such display codepage was selected (and so Latin1).

Before or after applying the patch you will not notice anything at all if the patch is correct. To reproduce this we need someone using single-byte locale where these symbols are unprintable.

I think we need to investigate branch 1668_io_corruption_xterm_nonutf (changeset: efe0b37d6c164fd962ae55f34847b1b0a96cc973) which seems to contain a correct fix. However what has to be tested is whether it breaks anything in UTF-8 mode. That's it.

comment:16 in reply to: ↑ 9 ; follow-up: ↓ 18 Changed 9 years ago by andrew_b

Gotar, your patches are S-Lang-related only. Does this bug not happen in NCurses-based MC. Or you didn't check that?

comment:17 Changed 9 years ago by zaytsev

If I build with ncurses it works fine for me and I can't reproduce this issue.

comment:18 in reply to: ↑ 16 Changed 9 years ago by gotar

Replying to andrew_b:

Gotar, your patches are S-Lang-related only. Does this bug not happen in NCurses-based MC. Or you didn't check that?

The bug doesn't occour when using ncurses library - see 4. for explanation. This is purely either wrong Slang setting (SLsmg_Display_Eight_Bit) OR missing simple pre-filtering out 154-char. Slang approach filters every 128-160 which is almost identical to 128-154,156-159 filtered out by is*_printable (see 2b), so unless there is some legitimate character in this range there's no difference which solution is chosen.

In short: ncurses handles 154 properly, while Slang requires SLsmg_Display_Eight_Bit>=154 (OR manual filtering if xterm display detected - but checking TERM would fail under screen, OR running xterm with allowC1Printable).

comment:19 Changed 9 years ago by zaytsev

OK, since Paul's workaround actually works, we probably should investigate the SLsmg_Display_Eight_Bit route? For me, actually, it only fails under screen so filtering does work for xterm, it's just that the terminals that have different TERM from xterm are failing.

comment:20 Changed 9 years ago by zaytsev

  • Cc birdie added
  • Version changed from 4.7.0-pre4 to master

comment:21 Changed 9 years ago by zaytsev

See duplicate (#2140) for the screenshot.

comment:22 Changed 9 years ago by angel_il

  • Milestone changed from 4.7 to 4.7.3

comment:23 Changed 9 years ago by angel_il

  • Milestone changed from 4.7.3 to 4.7

comment:24 follow-up: ↓ 26 Changed 8 years ago by andrixnet

  • Cc andrixnet@… added

I'm not sure if it is directly related to this bug, or it's something completely different.

Midnight Commander, Slackware build (all versions built in distro slack-11.0 at least).
Terminal is set to non-utf8.
Anytime I try to view a binary file I run the risc of entering an endless loop similar to what is described here :

  • does not matter if local console, xterm, remote via putty, screen
  • a regex searchbox pops up wanting to seach for "6c" or a series of "6c?"

Sometimes repeated presses of ESC key gets around the problem.
Other, seems to go into endless loop.

comment:25 follow-up: ↓ 27 Changed 8 years ago by angel_il

1) mc -V
2) please, step-by-step how to reproduce this.
3.1) ./configure <common-options>
3.2) make CFLAGS=-ggdb3
3.3) make install
3.4) run mc
4) when mc hangup run gdb -q -pid pid_of_mc_process
5) (gdb) bt
<usefull-output-1>
(gdb) bt full
<usefull-output-2>

comment:26 in reply to: ↑ 24 Changed 8 years ago by gotar

Replying to andrixnet:

I'm not sure if it is directly related to this bug, or it's something completely different.

Just check test file (the 2-bytes one) or workarounds or proposed patches. It looks you have hit the same problem, except for:

  • does not matter if local console, xterm, remote via putty, screen

it can't happen on local console.

comment:27 in reply to: ↑ 25 Changed 8 years ago by gotar

Replying to angel_il:

4) when mc hangup run gdb -q -pid pid_of_mc_process

There is no real hangup, one just can't exit from search dialog as it pops back over and over again every time he presses esc or enter.
As for the rest it's already above. This is my last comment in this ticket, because it's ridiculous that 1 line of code can't be commited in 16 months, so:

please
filter
out
0x9A
character
from
output

By any means. Thank you.

comment:29 Changed 8 years ago by angel_il

andrixnet:
please show output of 'mc -V'

comment:30 follow-up: ↓ 31 Changed 8 years ago by angel_il

gotar: can you reproduce issue with 0x9A on current master or 4.7.5 ?

comment:31 in reply to: ↑ 30 Changed 8 years ago by gotar

Replying to angel_il:

gotar: can you reproduce issue with 0x9A on current master or 4.7.5 ?

Yes, I've just compiled 4.7.5.1 tarball (./configure with no options except paths) and the problem still exists. The only difference I've noticed is that it happens after F9 (ViewToggleNroffMode) is switched to the state where keybar shows '9 Unform' (i.e. to the place where my patch shows carets).
And I cannot locate the changeset that should have fixed this on master branch, which one is it?

comment:32 Changed 8 years ago by andrixnet

Oh, my. I am ashamed I was unaware of this : the mc build I use most often is from aug 2008. No version number though.
I don't have access to a system having a more recent build installed, but I can obtain from our admin an upgrade here, with the latest stable.
Be back with more tests. Thank you for understanding.

comment:33 Changed 8 years ago by angel_il

  • Owner changed from slavazanko to angel_il

comment:34 Changed 8 years ago by angel_il

branch: 1668_io_corruption_xterm_nonutf (parent)

comment:35 Changed 8 years ago by birdie

  • Cc birdie removed

comment:36 Changed 8 years ago by angel_il

  • Votes for changeset set to angel_il

ok, i don't know how to reproduce this, my vote here...

comment:37 Changed 8 years ago by andrew_b

Well, seems we initialize S-Lang incorrectly. As John E. Davis wrote in http://mailman.jedsoft.org/pipermail/slang-users-l/2011/000724.html, SLutf8_enable has to be called first before any of the other functions.

Gotar, would you try the following patch:

--- a/lib/tty/tty-slang.c
+++ b/lib/tty/tty-slang.c
@@ -247,8 +247,8 @@ tty_init (gboolean slow, gboolean ugly_lines)
     ugly_line_drawing = ugly_lines;
 
     SLtt_Ignore_Beep = 1;
-    SLtt_get_terminfo ();
     SLutf8_enable (-1);
+    SLtt_get_terminfo ();
     /*
      * If the terminal in not in terminfo but begins with a well-known
      * string such as "linux" or "xterm" S-Lang will go on, but the

comment:38 follow-up: ↓ 39 Changed 8 years ago by gotar

I did (over 4.5.7.1 tarball) and it doesn't fix the problem (and doesn't break anything neither as far as I can see). I doubt this one could be solved via anything at terminfo level, so unless SLang gets native xterm (and screen?) output fiter it needs to be done like fe32d549dfd711e37eeeb4402c2a7cb6ae4ec681 (please verify if second chunk doesn't break KOI8-R output).

comment:39 in reply to: ↑ 38 Changed 8 years ago by andrew_b

Replying to gotar:

please verify if second chunk doesn't break KOI8-R output.

It doesn't.

comment:40 Changed 8 years ago by andrew_b

  • Owner changed from angel_il to andrew_b
  • severity changed from on rework to on review

comment:41 Changed 8 years ago by andrew_b

  • Keywords stable-candidate added
  • Milestone changed from 4.7 to 4.8.0-pre1

comment:42 Changed 8 years ago by slavazanko

  • Votes for changeset changed from angel_il to angel_il slavazanko
  • severity changed from on review to approved

comment:43 Changed 8 years ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from angel_il slavazanko to committed-master
  • Resolution set to fixed
  • severity changed from approved to merged

Merged to master.
changeset:2a97cbf7e71e9fe8053a5c06efc0d85cbf013a19

git log --pretty=oneline 8f92ba1..2a97cbf

comment:44 Changed 8 years ago by zaytsev

Hi guys!

Please confirm that everything works for you on latest master and nothing is broken as a consequence of this patch. I'd love to cherry-pick it to stable, but I'm the only from the dev team who is able to reproduce it, so it's holding us back...

comment:45 Changed 8 years ago by slavazanko

  • Keywords stable-candidate removed
  • Status changed from testing to closed
  • Votes for changeset changed from committed-master to committed-master committed-stable

cherry-picked in stable:

Note: See TracTickets for help on using tickets.