Ticket #256 (closed defect: fixed)

Opened 16 years ago

Last modified 15 years ago

Alt+Backspace off-by-one and segfault (UTF-8)

Reported by: egmont Owned by: angel_il
Priority: critical Milestone: 4.7.0
Component: mc-core Version: 4.7.0-pre4
Keywords: Cc: dborca@…, egmont@…
Blocked By: Blocking:
Branch state: Votes for changeset: committed-master

Description

The UTF-8 patch introduces this regression: Ctrl+Left and Alt+Backspace proceed one more character than they should.

(Consistently with unpatched mc, bash, and probably lots of other tools, Alt+Backspace should remove a sequence of non-alphanumeric characters followed by a sequence of alphanumeric ones, but instead it removes one more non-alphanumeric character at the end. Similar for Ctrl+Left.)

Trivial patch is attached. I've created this patch years ago and posted to mc-devel; probably it's already included in some major Linux distributions as well.

Please apply this fix in your UTF-8 patch. Thanks!

Attachments

00-79-utf8-backward-word-off-by-one.patch (950 bytes) - added by egmont 16 years ago.
Fix for the bug.
mc-4.7-pre2-backward-word-off-by-one.patch (858 bytes) - added by egmont 15 years ago.
Fix for off-by-one error in mc 4.7-pre2.
mc-4.7.0-pre4-backward-word-segfault.patch (494 bytes) - added by egmont 15 years ago.
fix for segfault and other weird errors

Change History

Changed 16 years ago by egmont

Fix for the bug.

comment:1 Changed 16 years ago by winnie

  • Status changed from new to accepted
  • Owner set to winnie
  • Milestone changed from 4.6.2.1 to UTF8 Support

The same applies here. This is no bug for the 4.6.2.1 release...

comment:2 Changed 16 years ago by winnie

  • Status changed from accepted to testing
  • Resolution set to fixed

Patch is applied to both HACK_utf8 branches and will be included in the next release

comment:3 Changed 16 years ago by winnie

  • Status changed from testing to closed

comment:4 Changed 15 years ago by angel_il

  • Status changed from closed to reopened
  • Resolution fixed deleted
  • Milestone UTF8 Support deleted

need check it

comment:5 Changed 15 years ago by styx

  • severity set to no branch
  • Milestone set to future releases

comment:6 Changed 15 years ago by angel_il

  • Status changed from reopened to closed
  • Resolution set to fixed

now all correct

comment:7 Changed 15 years ago by angel_il

  • Milestone changed from future releases to 4.7

comment:8 Changed 15 years ago by egmont

  • Cc egmont@… added
  • Status changed from closed to reopened
  • Version changed from 4.6.2 to 4.7.0-pre2
  • Resolution fixed deleted

Reopening: same bug is present in 4.7-pre2. Probably it was fixed in the 4.6 branch, but not in the complete UTF-8 rewrite for 4.7.

comment:9 Changed 15 years ago by egmont

Patch attached, please test it thoroughly. (It's slightly different than the one used for 4.6. As a matter of fact, I don't fully understand how that one worked correctly :))

The trick is: p points to the location of the cursor, that is, the first byte of the character it stands under. When moving forward, you have to examine this character and see if it's alphanumerical or not - easy story. When moving backwards, however, you need to check the preceding character, not this one.

Changed 15 years ago by egmont

Fix for off-by-one error in mc 4.7-pre2.

comment:10 Changed 15 years ago by angel_il

  • Owner changed from winnie to angel_il
  • Status changed from reopened to accepted
  • Milestone changed from 4.7 to 4.7.0-pre3

comment:11 Changed 15 years ago by iNode

  • Blocking 1481 added

#1481 seems to be related or the same bug.

comment:12 Changed 15 years ago by egmont

Yep, exactly the same bug.

comment:13 Changed 15 years ago by iNode

  • Status changed from accepted to testing
  • Version 4.7.0-pre2 deleted
  • Resolution set to duplicate
  • Blocking 1481 removed
  • Milestone 4.7.0-pre3 deleted

Ok, I close this as duplicate to cleanup roadmap.

comment:14 Changed 15 years ago by slavazanko

  • Status changed from testing to reopened
  • Cc dborca@… added
  • Resolution duplicate deleted

#1481 is closed as duplicate.

WARNING: #1481 contain some patch.

comment:15 Changed 15 years ago by slavazanko

  • Status changed from reopened to assigned

comment:16 Changed 15 years ago by slavazanko

  • Version set to 4.6.1
  • Milestone set to 4.7.0-pre4

comment:17 Changed 15 years ago by angel_il

  • Votes for changeset set to angel_il
  • severity changed from no branch to on review

branch: 256_widget_backward_kill_word_fix (parent: master)
changeset: a9c0acc89560548a8df3ca9d0fb4d24b47022434

comment:18 Changed 15 years ago by slavazanko

  • Votes for changeset angel_il deleted

comment:19 Changed 15 years ago by slavazanko

Ops, I forgot to remove some code: changeset:a12abcf77a94f1bd4bf71b2ef3990d5fa6d4be06
Need rebase

comment:20 Changed 15 years ago by angel_il

  • Votes for changeset set to angel_il

comment:21 Changed 15 years ago by andrew_b

  • Votes for changeset changed from angel_il to angel_il andrew_b
  • severity changed from on review to approved

comment:22 Changed 15 years ago by angel_il

  • Status changed from assigned to testing
  • Votes for changeset changed from angel_il andrew_b to commited-master
  • Resolution set to fixed
  • severity changed from approved to merged

comment:23 Changed 15 years ago by angel_il

  • Status changed from testing to closed

comment:24 Changed 15 years ago by egmont

  • Priority changed from minor to critical
  • Status changed from closed to reopened
  • Version changed from 4.6.1 to 4.7.0-pre4
  • Resolution fixed deleted
  • Milestone changed from 4.7.0-pre4 to 4.7.0

Reopening. Unfortunately still not fixed, and the new behavior might even cause segfault, hence is much more harmful than the old one.

If you type English letters only (e.g. "patch.gz") then Alt+Backspace works as expected, it removes "gz" first, then removes "patch.", then does nothing.

The bug happens when you start using accented letters. Type "áá.éé" for example, and then press Alt+Backspace. It removes ".éé" - already buggy. Press it once again, seems to do nothing. Press once or twice again, still seems to do nothing.

Then start typing normal letters. The first few ones don't appear, but then suddenly they start appearing. If you keep on doing various things in mc, you'll pretty soon get a segmentation fault.

Definitely looks like a buffer underrun.

I can't tell at this moment if my originally propsed patch was already buggy or not. I will take a closer look and try to come up with a fix later, I don't have time right now.

comment:25 Changed 15 years ago by egmont

My previously proposed w/ 4.7-pre2 is already broken. Sorry for that. Working on the fix...

Changed 15 years ago by egmont

fix for segfault and other weird errors

comment:26 Changed 15 years ago by egmont

Fix attached.

The for loop, whose purpose is to remove exactly 1 character (hence I don't get why it's a loop, but nevermind) was not UTF8-ready. So if the character preceding the cursor was an accented one, it jumped to the middle of the UTF-8 sequence, causing the rest of the stuff go unpredictable.

Although it *should* never happen (which, as we all know, does not equal to "never happens"), in this case "p" simply jumped over "in->buffer". The function has a "p != in->buffer" check three times, it might it more robust if you replaced that with "p >= in->buffer". This should prevent the segfault, and just stay with a slightly buggy but otherwise harmless alt-backspace behavior, should there be any UTF-8 or similar bugs left. This change is not included in my patch.

comment:27 Changed 15 years ago by andrew_b

  • Votes for changeset commited-master deleted
  • severity changed from merged to no branch

comment:28 Changed 15 years ago by zaytsev

Use assert() to catch this?

comment:29 Changed 15 years ago by egmont

  • Summary changed from Alt+Backspace off-by-one (UTF-8) to Alt+Backspace off-by-one and segfault (UTF-8)

=, assert, whatever... up to you :)

comment:30 Changed 15 years ago by angel_il

  • Votes for changeset set to angel_il
  • severity changed from no branch to on review

branch: 256_backward_word_fix
changeset: ecee0cd4b9cd8b44cff1fd14d897037626c97c54

comment:31 Changed 15 years ago by andrew_b

  • Votes for changeset changed from angel_il to angel_il andrew_b
  • severity changed from on review to approved

comment:32 Changed 15 years ago by angel_il

  • Status changed from reopened to closed
  • Votes for changeset angel_il andrew_b deleted
  • Resolution set to fixed
  • severity changed from approved to merged

comment:33 Changed 15 years ago by angel_il

  • Votes for changeset set to committed-master
Note: See TracTickets for help on using tickets.