Ticket #2283 (closed defect: fixed)
mcview scrolling issues with heavy utf-8 files
Reported by: | egmont | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 4.8.14 |
Component: | mcview | Version: | 4.7.3 |
Keywords: | | Cc: | |
Blocked By: | #2132 | Blocking: | |
Branch state: | no branch | Votes for changeset: | |
Description (last modified by andrew_b) (diff)
wget http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
mcview UTF-8-demo.txt
Scroll up and down with the arrow (or PgUp/PgDn) keys. Very often a partial line appears on the top row, and you have to press the arrow key twice or more to actually scroll by one line.
This happens when the topmost line that you're scrolling in or out contains lots of non-ascii characters. More precisely, I believe this occurs exactly when the number of bytes forming the topmost row is bigger than the terminal's width.
Reproducible in both 4.7.3 and 4.7.0.7, in a fully UTF-8 environment.
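To illustrate the suspected mismatch, here is a minimal Python sketch (not mc code) that compares the UTF-8 byte length of a line with its approximate display width. The helper names `display_width` and `compare` and the sample strings are made up for illustration, and the width rule (East Asian wide/fullwidth = 2 columns, combining marks = 0) is a simplification of what a real terminal does.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate number of terminal columns needed to show text."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # Wide and fullwidth characters (e.g. CJK) take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

def compare(line: str) -> tuple:
    """Return (UTF-8 byte length, approximate display width)."""
    return len(line.encode("utf-8")), display_width(line)

if __name__ == "__main__":
    # Illustrative samples: ASCII, Greek, Katakana.
    for sample in ["plain ASCII", "\u03b1\u03b2\u03b3\u03b4", "\u30b3\u30f3\u30cb\u30c1\u30cf"]:
        print(repr(sample), compare(sample))
```

For ASCII the two numbers agree. For the other samples the byte length exceeds the display width, so a row that fits on screen can still be longer in bytes than the terminal is wide.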
Attachments
Change History
Changed 14 years ago by egmont
- Attachment mcview-utf8-scroll.png added
comment:1 Changed 14 years ago by egmont
Note: the bug only happens when word wrapping is enabled (that is, you see 2UnWrap in the button bar), and it happens even when the terminal is wider than the file.
comment:2 follow-up: ↓ 3 Changed 14 years ago by egmont
I'm looking at mc-4.7.0.7. Here the bug is in src/viewer/move.c, in the view->text_wrap_mode branches of mcview_move_up() and mcview_move_down(). The logic that modifies col (e.g. col += width, col -= width) assumes that width and byte length are the same notion (because col actually means an offset in the file), and hence does not handle UTF-8 or CJK (double-width) characters correctly.
I don't see what the best solution would be; someone more familiar with mc's utf8/width functions could probably fix it much faster than I could.
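As a rough illustration of the difference (a Python sketch, not a patch for move.c), the snippet below contrasts stepping through a line in byte units with stepping in display columns while still reporting file offsets. The function names `naive_row_starts`, `row_starts` and `char_width` are hypothetical, and the width rule is a simplified assumption rather than mc's own width logic.

```python
import unicodedata

def char_width(ch: str) -> int:
    """Simplified column width: 0 for combining marks, 2 for wide, else 1."""
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def naive_row_starts(line: bytes, columns: int) -> list:
    """Row starts when one byte is assumed to be one column (the buggy model)."""
    return list(range(0, max(len(line), 1), columns))

def row_starts(line: bytes, columns: int) -> list:
    """Byte offsets where each wrapped row starts, counting display columns."""
    offsets = [0]
    used = 0  # columns consumed on the current row
    pos = 0   # byte offset of the current character
    for ch in line.decode("utf-8"):
        w = char_width(ch)
        if used > 0 and used + w > columns:
            offsets.append(pos)  # this character begins a new row
            used = 0
        used += w
        pos += len(ch.encode("utf-8"))
    return offsets

if __name__ == "__main__":
    greek = "\u03b1\u03b2\u03b3\u03b4\u03b5\u03b6".encode("utf-8")  # 12 bytes, 6 columns
    print(naive_row_starts(greek, 4))  # three "rows" in the byte model
    print(row_starts(greek, 4))        # two rows on screen
```

With a 4-column window the byte model sees three rows where the screen shows two, which would be consistent with having to press the arrow key more than once to scroll one visible line.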
comment:3 in reply to: ↑ 2 Changed 14 years ago by andrew_b
- Blocked By 2132 added
Replying to egmont:
The logic that modifies col (e.g. col += width, col -= width) assumes that width and byte length are the same notion (because col actually means an offset in the file), and hence does not handle UTF-8 or CJK (double-width) characters correctly.
Yes, this is a known issue. At least ticket #2132 requires such a fix.
comment:4 Changed 14 years ago by andrew_b
- Component changed from mc-core to mcview
- Description modified (diff)
comment:5 Changed 13 years ago by andrew_b
- Branch state set to no branch
- Milestone changed from 4.7 to Future Releases
comment:6 Changed 10 years ago by egmont
Similar scrolling issues are also reproducible with:
- no UTF-8 but nroff formatting (e.g. an English manual page) [same underlying cause]
- no UTF-8 and no nroff either (just plain ASCII text) and a line that's exactly as wide as the window [the top line becomes empty but doesn't scroll out; probably an off-by-one somewhere]
Screenshot - though experiencing the behavior is much more useful
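For the nroff case, a small Python sketch (not mc code) of why the byte count and the visible width diverge even for pure ASCII: nroff output marks bold as `X\bX` and underline as `_\bX`, so each visible column can cost three bytes. The function name `nroff_visible_width` is made up for illustration, and the sketch assumes ASCII input with well-formed overstrike sequences.

```python
def nroff_visible_width(line: str) -> int:
    """Columns occupied by ASCII text containing backspace overstrikes.

    Tracks the cursor position; for well-formed overstrike sequences the
    final position equals the visible width.
    """
    width = 0
    for ch in line:
        if ch == "\b":
            width = max(0, width - 1)  # backspace moves the cursor back
        else:
            width += 1
    return width

if __name__ == "__main__":
    bold_name = "N\bNA\bAM\bME\bE"  # "NAME" in nroff bold
    print(len(bold_name), nroff_visible_width(bold_name))
```

A bold heading such as NAME takes 12 bytes but only 4 columns, so the same byte-versus-width confusion applies as in the UTF-8 case.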