Ticket #3589 (closed defect: fixed)

Opened 14 months ago

Last modified 3 months ago

Hexadecimal search fails

Reported by: phelum Owned by: andrew_b
Priority: major Milestone: 4.8.19
Component: mc-search Version: 4.8.15
Keywords: Cc: galtgendo@…
Blocked By: #3694 Blocking:
Branch state: merged Votes for changeset: committed-master

Description

Searching for characters > 0x7F fails. A search for 0x80 causes a search for 0xC0 0x00. A search for 0x41 - 0x5A results in false finds of 0x61 - 0x7A.

I've disabled the case-insensitivity logic and enabled raw mode in hex.c and regex.c and this seems to fix the problem. I can supply the new files if this will help. But perhaps these changes adversely affect UTF-8 searches.
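The two symptoms above can be pictured with a short Python sketch. This is only an illustration of the general mechanism, not mc's code: it assumes the hex value is treated as a character and re-encoded as UTF-8, and that the pattern is matched case-insensitively. The exact bytes mc ends up searching for (0xC0 0x00 in this report) depend on its internal conversion, which is not reproduced here.

```python
import re

# Sample binary data: 0x00, 0x80, 'A' (0x41), 'a' (0x61).
data = bytes([0x00, 0x80, 0x41, 0x61])

# Symptom 1: treating 0x80 as a character and encoding it as UTF-8 gives
# two bytes, so the needle is no longer the single byte that was requested.
needle_as_char = chr(0x80).encode("utf-8")
needle_as_byte = bytes([0x80])

print(needle_as_char)             # b'\xc2\x80'
print(data.find(needle_as_char))  # -1 (not found)
print(data.find(needle_as_byte))  # 1  (found when searched as a raw byte)

# Symptom 2: a case-insensitive match on the byte 0x41 also hits 0x61.
hits = [m.start() for m in re.finditer(b"\x41", data, re.IGNORECASE)]
print(hits)                       # [2, 3]
```

Searching raw bytes with case folding switched off avoids both effects, which is consistent with the workaround described above.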

Cheers,
Steven

Attachments

hex.c (4.5 KB) - added by phelum 14 months ago.
regex.c (35.2 KB) - added by phelum 14 months ago.
hex.patch (1.2 KB) - added by phelum 14 months ago.
regex.patch (3.7 KB) - added by phelum 14 months ago.
3589-Make-hex-search-work-for-binary-data.patch (2.4 KB) - added by mooffie 4 months ago.

Change History

Changed 14 months ago by phelum

Changed 14 months ago by phelum

comment:1 Changed 14 months ago by zaytsev-work

Could you please make a patch with diff -Naur mc-x.y.z.orig mc-x.y.z ? When you attach the modified files, it's not clear what the reference to compare them with is. Thanks!

comment:2 Changed 14 months ago by phelum

Hi,

Patches for both files attached.

Cheers,
Steven

Changed 14 months ago by phelum

Changed 14 months ago by phelum

comment:3 Changed 14 months ago by mnk

While #3454 isn't fully related, could you have a look at it too ?

comment:4 Changed 14 months ago by mnk

  • Cc galtgendo@… added

comment:5 Changed 14 months ago by andrew_b

  • Component changed from mc-core to mc-search

comment:6 Changed 6 months ago by andrew_b

Ticket #3695 has been marked as a duplicate of this ticket.

comment:7 Changed 4 months ago by andrew_b

  • Blocked By 3694 added

Changed 4 months ago by mooffie

comment:8 Changed 4 months ago by mooffie

Here's my 1.5-line patch to solve this. (I should get a prize for "the biggest comment for the smallest code"!)

(Although we're blocked by #3694, I don't want to wait any longer, just in case aliens kidnap me.)

Here are 4 tests with which to try out the patch. Rules:

  • Be in UTF-8 locale / display (the bugs exist only there).
  • Set search type (the radio button) to "Hexadecimal".
  • Make sure the checkboxes "Case sensitive" and "All charsets" are off (because if they're on they'll mask the bugs, as explained in #3695).

Here we go:

(1) Open /usr/bin/mc in the viewer or editor, and search for 00 4e ed 00. You should find it.

Open a text file containing "I saw Aaron". Then:

(2) Search for 41 (ASCII of letter 'A'). You should not find the lowercase 'a's, only the upper 'A'.
(3) Search for 41 "A". You should find the "Aa" of "Aaron".

Turn "Case sensitive" checkbox on and:

(4) Search for 41 "A", as before. This time you should not find the "Aa" of "Aaron".

(BTW, if we decide that we want quoted strings (tests (3) and (4)) to always be case-sensitive, it's very easy to do this by making mc_search__hex_translate_to_regex() write them out as \xXX symbols instead of verbatim.)
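The behaviour expected by tests (1) to (4), and the \xXX idea, can be sketched in Python. Everything here is hypothetical: `hex_query_to_regex`, its parameters, and the way case-insensitivity is expanded into `[Aa]` classes are assumptions made for illustration, not the code of mc_search__hex_translate_to_regex(). The sketch assumes the query consists of bare hex pairs and double-quoted strings.

```python
import re

def hex_query_to_regex(query, case_sensitive=False, quoted_as_escapes=False):
    """Hypothetical translator from a hex search query to a bytes regex."""
    parts = []
    for m in re.finditer(r'"([^"]*)"|([0-9A-Fa-f]{1,2})', query):
        quoted, hex_byte = m.groups()
        if hex_byte is not None:
            # Hex values always become \xXX escapes: exact byte, exact case.
            parts.append(b"\\x%02x" % int(hex_byte, 16))
            continue
        for b in quoted.encode("utf-8"):
            ch = bytes([b])
            fold = not case_sensitive and not quoted_as_escapes
            if fold and ch.isalpha():
                # Verbatim letter in a case-insensitive search: match both cases.
                parts.append(b"[" + ch.upper() + ch.lower() + b"]")
            else:
                parts.append(b"\\x%02x" % b)
    return b"".join(parts)

text = b"I saw Aaron"

# Test (1): binary bytes are searched as-is.
print(hex_query_to_regex("00 4e ed 00"))  # b'\\x00\\x4e\\xed\\x00'

# Test (2): 41 finds only the uppercase 'A'.
print([m.start() for m in re.finditer(hex_query_to_regex("41"), text)])  # [6]

# Test (3): 41 "A" with "Case sensitive" off finds "Aa".
print(re.search(hex_query_to_regex('41 "A"'), text).span())  # (6, 8)

# Test (4): 41 "A" with "Case sensitive" on finds nothing.
print(re.search(hex_query_to_regex('41 "A"', case_sensitive=True), text))  # None
```

Passing `quoted_as_escapes=True` corresponds to the alternative mentioned above: quoted strings are written out as \xXX as well, so they stay case-sensitive regardless of the checkbox.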

comment:9 Changed 4 months ago by andrew_b

  • Status changed from new to accepted
  • Owner set to andrew_b
  • Votes for changeset set to andrew_b
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.19

Branch: 3589_hex_search_binary_data
changeset:e4a70ff1d2e4773438920bafbdedb8035301e9c7

comment:10 Changed 3 months ago by andrew_b

  • Branch state changed from on review to approved

comment:11 Changed 3 months ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from andrew_b to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

comment:12 Changed 3 months ago by andrew_b

  • Status changed from testing to closed