Ticket #3589 (closed defect: fixed)
Hexadecimal search fails
Reported by: | phelum | Owned by: | andrew_b |
---|---|---|---|
Priority: | major | Milestone: | 4.8.19 |
Component: | mc-search | Version: | 4.8.15 |
Keywords: | | Cc: | galtgendo@… |
Blocked By: | #3694 | Blocking: | |
Branch state: | merged | Votes for changeset: | committed-master |
Description
Searching for bytes > 0x7F fails. A search for 0x80 turns into a search for 0xC0 0x00. A search for any byte in 0x41 - 0x5A also produces false matches on 0x61 - 0x7A.
I've disabled the case-insensitivity logic and enabled raw mode in hex.c and regex.c and this seems to fix the problem. I can supply the new files if this will help. But perhaps these changes adversely affect UTF-8 searches.
Cheers,
Steven
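The two symptoms can be reproduced outside mc with a small sketch. This is not mc's code, only an illustration in Python of the general failure mode, assuming the hex byte is at some point treated as a character rather than as a raw byte. Note that a standard UTF-8 encoder gives C2 80 for U+0080, which is not the C0 00 reported above, so the exact conversion inside mc may differ.

```python
import re

# Sample data: 'I', ' ', raw byte 0x80, 'A', 'a'
data = bytes([0x49, 0x20, 0x80, 0x41, 0x61])

# Raw byte search: the pattern is the single byte 0x80.
raw_hit = re.search(re.escape(bytes([0x80])), data)

# Treating 0x80 as a code point and encoding it as UTF-8 yields two bytes,
# which do not occur in the data, so the search fails.
as_text = chr(0x80).encode("utf-8")
text_hit = re.search(re.escape(as_text), data)

# A case-insensitive search for 0x41 ('A') also matches 0x61 ('a').
ci_offsets = [m.start() for m in re.finditer(rb"\x41", data, re.IGNORECASE)]

# A case-sensitive search for 0x41 matches only the 'A'.
cs_offsets = [m.start() for m in re.finditer(rb"\x41", data)]
```

With this data the raw search finds the byte at offset 2, the re-encoded search finds nothing, and only the case-sensitive search avoids the false match on 'a'.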
Change History
comment:1 Changed 9 years ago by zaytsev-work
Could you please make a patch with diff -Naur mc-x.y.z.orig mc-x.y.z ? When you attach only the modified files, it's not clear which version they should be compared against. Thanks!
comment:3 Changed 9 years ago by mnk
While #3454 isn't fully related, could you have a look at it too?
comment:6 Changed 8 years ago by andrew_b
Ticket #3695 has been marked as a duplicate of this ticket.
comment:8 Changed 8 years ago by mooffie
Here's my 1.5-line patch to solve this. (I should get a prize for "the biggest comment for the smallest code"!)
(Although we're blocked by #3694, I don't want to wait any longer, just in case aliens kidnap me.)
Here are 4 tests with which to try out the patch. Rules:
- Be in UTF-8 locale / display (the bugs exist only there).
- Set search type (the radio button) to "Hexadecimal".
- Make sure the checkboxes "Case sensitive" and "All charsets" are off (because if they're on they'll mask the bugs, as explained in #3695).
Here we go:
(1) Open /usr/bin/mc in the viewer or editor, and search for 00 4e ed 00. You should find it.
Open a text file containing "I saw Aaron". Then:
(2) Search for 41 (ASCII of letter 'A'). You should not find the lowercase 'a's, only the upper 'A'.
(3) Search for 41 "A". You should find the "Aa" of "Aaron".
Turn "Case sensitive" checkbox on and:
(4) Search for 41 "A", as before. This time you should not find the "Aa" of "Aaron".
(BTW, if we decide that we want quoted strings (tests (3) and (4)) to always be case-sensitive, it's very easy to do this by making mc_search__hex_translate_to_regex() write them out as \xXX symbols instead of verbatim.)
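The behaviour expected by tests (2) to (4), and the \xXX idea above, can be sketched roughly as follows. This is a Python illustration, not the mc implementation: hex_to_regex and fold_case are hypothetical names, and the sketch assumes that case-insensitivity is applied by a separate pass that expands only verbatim letters and leaves \xXX escapes alone.

```python
import re

def hex_to_regex(pattern: str, quoted_as_hex: bool = False) -> bytes:
    """Translate a hex pattern such as '41 "A"' into a bytes regex.

    Hypothetical helper: hex bytes become \\xXX escapes, quoted strings
    are written verbatim unless quoted_as_hex is set.
    """
    out = []
    for m in re.finditer(r'"([^"]*)"|([0-9A-Fa-f]{1,2})|(\S)', pattern):
        quoted, hexbyte, junk = m.groups()
        if junk is not None:
            raise ValueError("unexpected character: " + junk)
        if hexbyte is not None:
            out.append(b"\\x%02X" % int(hexbyte, 16))
            continue
        for b in quoted.encode("utf-8"):
            if quoted_as_hex:
                out.append(b"\\x%02X" % b)
            else:
                out.append(re.escape(bytes([b])))
    return b"".join(out)

def fold_case(regex: bytes) -> bytes:
    """Expand verbatim ASCII letters to [Aa]; leave escapes untouched."""
    out = bytearray()
    i = 0
    while i < len(regex):
        if regex[i:i + 2] == b"\\x":
            out += regex[i:i + 4]      # keep \xXX as an exact byte
            i += 4
        elif regex[i:i + 1] == b"\\":
            out += regex[i:i + 2]      # keep other escaped characters
            i += 2
        elif regex[i:i + 1].isalpha():
            c = regex[i:i + 1]
            out += b"[" + c.upper() + c.lower() + b"]"
            i += 1
        else:
            out += regex[i:i + 1]
            i += 1
    return bytes(out)

def find_all(regex: bytes, data: bytes) -> list:
    """Return the start offsets of all matches."""
    return [m.start() for m in re.finditer(regex, data)]

text = b"I saw Aaron"
test2 = find_all(fold_case(hex_to_regex("41")), text)           # "Case sensitive" off
test3 = find_all(fold_case(hex_to_regex('41 "A"')), text)       # "Case sensitive" off
test4 = find_all(hex_to_regex('41 "A"'), text)                  # "Case sensitive" on
always_cs = find_all(fold_case(hex_to_regex('41 "A"', True)), text)
```

In this sketch test (2) matches only the upper 'A', test (3) matches the "Aa" of "Aaron", and test (4) matches nothing. With quoted_as_hex set, the quoted string stays case-sensitive even when the folding pass runs.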
comment:9 Changed 8 years ago by andrew_b
- Owner set to andrew_b
- Status changed from new to accepted
- Votes for changeset set to andrew_b
- Branch state changed from no branch to on review
- Milestone changed from Future Releases to 4.8.19
Branch: 3589_hex_search_binary_data
changeset:e4a70ff1d2e4773438920bafbdedb8035301e9c7
comment:11 Changed 8 years ago by andrew_b
- Status changed from accepted to testing
- Votes for changeset changed from andrew_b to committed-master
- Resolution set to fixed
- Branch state changed from approved to merged
Merged to master: [f2051b2e99447fe08c8c67d81b42c646c7de6363].