Ticket #3280 (new defect)
[patch] mcview: search off-by-one between plain and hex modes
Reported by: | egmont | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | Future Releases |
Component: | mcview | Version: | master |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Branch state: | no branch | Votes for changeset: |
Description
echo abcdefghijklmnopqrstuvwxyz > abc
mcview abc
Search for efg, it is highlighted.
Switch to Hex mode, now fgh is highlighted as the match.
The same (with opposite direction) the other way around: search for klm in hex mode, it is highlighted, switch back to plain mode, jkl is highlighted.
Attachments
Change History
Changed 10 years ago by and
- Attachment mc-4.8.13-search_selected-offbyone.patch added
comment:1 Changed 10 years ago by and
- Summary changed from mcview: search off-by-one between plain and hex modes to [patch] mcview: search off-by-one between plain and hex modes
comment:2 Changed 10 years ago by egmont
Hi Andreas,
I'm glad we have a new contributor, please keep up sending patches/fixes :)
You should work against git master instead of the latest tarball - especially the viewer has received modifications that make your patch no longer apply.
Unfortunately, I don't like what you're doing in the patch. The off-by-one presumably comes from sometimes using the right semantics for the start and end offsets and sometimes a wrong one. While your patch might actually fix the bug, it does it by switching to the wrong semantics everywhere, making it prone to breaking the next time someone touches it.
The standard and right way to handle intervals is the start being inclusive and the end being exclusive. Another, probably better way to think about it, which is also clearly demonstrated in Python's string slices (e.g. http://stackoverflow.com/a/509297/4457671) is that whenever possible you should think about pointers as pointing between objects rather than to one of them. With this way of thinking, both start and end should have the value 0 for the beginning of the file, 1 for the boundary between the first letter and the rest, and so on, and exactly the file's size (without ±1) if pointing to the end of the file. This is pretty logical: the start and end of the match are indeed boundaries between letters rather than concrete letters. If something needs to point to an actual object (e.g. a character), you should probably still imagine the pointer pointing to in between two characters just as start and end do, and take the character on its right.
Using this standard approach, the first two hunks of your patch modify the code from the right semantics to the wrong one, and should be reverted. The third hunk is obviously correct, off-by-one corrections should never be necessary.
Could you please locate where start and end are being used incorrectly according to these semantics, and fix those instead?
comment:3 Changed 10 years ago by egmont
Please see my fix.
The start and end values were correct. Removal of an off-by-one adjustment is indeed required as and's patch does.
The problem was that state->offset has already advanced by the character we're about to print, although we'd need its previous value instead.
While the first patch (or a port thereof to git head) might be technically perfect (although I'm not totally certain with combining accents etc. in the game), I think my proposed version has cleaner semantics.
Thanks for the original work!
comment:4 Changed 10 years ago by egmont
Hang on... both of the patches break the "search next" feature :(
comment:5 Changed 10 years ago by egmont
First we seem to hit actions_cmd.c:mcview_search() that does some adjustment for the hex mode (but not for the plain text mode).
Then it calls mcview_do_search which starts with some adjustments, along the lines of decrementing by 2 if searching backward but not altering if searching forwarad, which totally relies on finding a match incrementing the start value by 1 (potentially pointing in the middle of a UTF-8 or nroff sequence).
I guess the adjustment should be carefully modified here to be +1 for forward, -1 for backward search.
We'll need to double check the initial value, probably we'll have to begin with -1 instead of 0, so that a first search attempt that starts by incrementing it actually starts at the beginning of the file.
patch for search off-by-one between plain and hex mode