Ticket #256 (closed defect: fixed)
Alt+Backspace off-by-one and segfault (UTF-8)
Reported by: | egmont | Owned by: | angel_il |
---|---|---|---|
Priority: | critical | Milestone: | 4.7.0 |
Component: | mc-core | Version: | 4.7.0-pre4 |
Keywords: | Cc: | dborca@…, egmont@… | |
Blocked By: | Blocking: | ||
Branch state: | Votes for changeset: | committed-master |
Description
The UTF-8 patch introduces this regression: Ctrl+Left and Alt+Backspace proceed one more character than they should.
(Consistently with unpatched mc, bash, and probably lots of other tools, Alt+Backspace should remove a sequence of non-alphanumeric characters followed by a sequence of alphanumeric ones, but instead it removes one more non-alphanumeric character at the end. Similar for Ctrl+Left.)
Trivial patch is attached. I've created this patch years ago and posted to mc-devel; probably it's already included in some major Linux distributions as well.
Please apply this fix in your UTF-8 patch. Thanks!
Attachments
Change History
Changed 16 years ago by egmont
- Attachment 00-79-utf8-backward-word-off-by-one.patch added
comment:1 Changed 16 years ago by winnie
- Status changed from new to accepted
- Owner set to winnie
- Milestone changed from 4.6.2.1 to UTF8 Support
The same applies here. This is no bug for the 4.6.2.1 release...
comment:2 Changed 16 years ago by winnie
- Status changed from accepted to testing
- Resolution set to fixed
Patch is applied to both HACK_utf8 branches and will be included in the next release
comment:4 Changed 16 years ago by angel_il
- Status changed from closed to reopened
- Resolution fixed deleted
- Milestone UTF8 Support deleted
need check it
comment:6 Changed 16 years ago by angel_il
- Status changed from reopened to closed
- Resolution set to fixed
now all correct
comment:8 Changed 15 years ago by egmont
- Cc egmont@… added
- Status changed from closed to reopened
- Version changed from 4.6.2 to 4.7.0-pre2
- Resolution fixed deleted
Reopening: same bug is present in 4.7-pre2. Probably it was fixed in the 4.6 branch, but not in the complete UTF-8 rewrite for 4.7.
comment:9 Changed 15 years ago by egmont
Patch attached, please test it thoroughly. (It's slightly different than the one used for 4.6. As a matter of fact, I don't fully understand how that one worked correctly :))
The trick is: p points to the location of the cursor, that is, the first byte of the character it stands under. When moving forward, you have to examine this character and see if it's alphanumerical or not - easy story. When moving backwards, however, you need to check the preceding character, not this one.
Changed 15 years ago by egmont
- Attachment mc-4.7-pre2-backward-word-off-by-one.patch added
Fix for off-by-one error in mc 4.7-pre2.
comment:10 Changed 15 years ago by angel_il
- Owner changed from winnie to angel_il
- Status changed from reopened to accepted
- Milestone changed from 4.7 to 4.7.0-pre3
comment:11 Changed 15 years ago by iNode
- Blocking 1481 added
#1481 seems to be related or the same bug.
comment:12 Changed 15 years ago by egmont
Yep, exactly the same bug.
comment:13 Changed 15 years ago by iNode
- Status changed from accepted to testing
- Version 4.7.0-pre2 deleted
- Resolution set to duplicate
- Blocking 1481 removed
- Milestone 4.7.0-pre3 deleted
Ok, I close this as duplicate to cleanup roadmap.
comment:14 Changed 15 years ago by slavazanko
- Status changed from testing to reopened
- Cc dborca@… added
- Resolution duplicate deleted
comment:17 Changed 15 years ago by angel_il
- Votes for changeset set to angel_il
- severity changed from no branch to on review
branch: 256_widget_backward_kill_word_fix (parent: master)
changeset: a9c0acc89560548a8df3ca9d0fb4d24b47022434
comment:18 Changed 15 years ago by slavazanko
- Votes for changeset angel_il deleted
comment:19 Changed 15 years ago by slavazanko
Ops, I forgot to remove some code: changeset:a12abcf77a94f1bd4bf71b2ef3990d5fa6d4be06
Need rebase
comment:21 Changed 15 years ago by andrew_b
- Votes for changeset changed from angel_il to angel_il andrew_b
- severity changed from on review to approved
comment:22 Changed 15 years ago by angel_il
- Status changed from assigned to testing
- Votes for changeset changed from angel_il andrew_b to commited-master
- Resolution set to fixed
- severity changed from approved to merged
comment:24 Changed 15 years ago by egmont
- Priority changed from minor to critical
- Status changed from closed to reopened
- Version changed from 4.6.1 to 4.7.0-pre4
- Resolution fixed deleted
- Milestone changed from 4.7.0-pre4 to 4.7.0
Reopening. Unfortunately still not fixed, and the new behavior might even cause segfault, hence is much more harmful than the old one.
If you type English letters only (e.g. "patch.gz") then Alt+Backspace works as expected, it removes "gz" first, then removes "patch.", then does nothing.
The bug happens when you start using accented letters. Type "áá.éé" for example, and then press Alt+Backspace. It removes ".éé" - already buggy. Press it once again, seems to do nothing. Press once or twice again, still seems to do nothing.
Then start typing normal letters. The first few ones don't appear, but then suddenly they start appearing. If you keep on doing various things in mc, you'll pretty soon get a segmentation fault.
Definitely looks like a buffer underrun.
I can't tell at this moment if my originally propsed patch was already buggy or not. I will take a closer look and try to come up with a fix later, I don't have time right now.
comment:25 Changed 15 years ago by egmont
My previously proposed w/ 4.7-pre2 is already broken. Sorry for that. Working on the fix...
Changed 15 years ago by egmont
- Attachment mc-4.7.0-pre4-backward-word-segfault.patch added
fix for segfault and other weird errors
comment:26 Changed 15 years ago by egmont
Fix attached.
The for loop, whose purpose is to remove exactly 1 character (hence I don't get why it's a loop, but nevermind) was not UTF8-ready. So if the character preceding the cursor was an accented one, it jumped to the middle of the UTF-8 sequence, causing the rest of the stuff go unpredictable.
Although it *should* never happen (which, as we all know, does not equal to "never happens"), in this case "p" simply jumped over "in->buffer". The function has a "p != in->buffer" check three times, it might it more robust if you replaced that with "p >= in->buffer". This should prevent the segfault, and just stay with a slightly buggy but otherwise harmless alt-backspace behavior, should there be any UTF-8 or similar bugs left. This change is not included in my patch.
comment:27 Changed 15 years ago by andrew_b
- Votes for changeset commited-master deleted
- severity changed from merged to no branch
comment:28 Changed 15 years ago by zaytsev
Use assert() to catch this?
comment:29 Changed 15 years ago by egmont
- Summary changed from Alt+Backspace off-by-one (UTF-8) to Alt+Backspace off-by-one and segfault (UTF-8)
=, assert, whatever... up to you :)
comment:30 Changed 15 years ago by angel_il
- Votes for changeset set to angel_il
- severity changed from no branch to on review
branch: 256_backward_word_fix
changeset: ecee0cd4b9cd8b44cff1fd14d897037626c97c54
comment:31 Changed 15 years ago by andrew_b
- Votes for changeset changed from angel_il to angel_il andrew_b
- severity changed from on review to approved
comment:32 Changed 15 years ago by angel_il
- Status changed from reopened to closed
- Votes for changeset angel_il andrew_b deleted
- Resolution set to fixed
- severity changed from approved to merged
Fix for the bug.