Ticket #4450 (closed enhancement: fixed)

Opened 20 months ago

Last modified 15 months ago

preliminary pcre2 support

Reported by: broly Owned by: andrew_b
Priority: major Milestone: 4.8.30
Component: mc-search Version: master
Keywords: Cc:
Blocked By: Blocking:
Branch state: merged Votes for changeset: committed-master

Description

hi,

since my DD-WRT build is now moving to pcre2, i thought i'd give a stab at providing pcre2 support.

since brainslayer is using an ass-old version of MC, i have provided a patch set of the changes necessary to use pcre2.

it compiles fine and from what i can see, it runs fine.

but i haven't tested the search properly because i don't want to restart my router to accommodate the newer glib/pcre (i used LD_LIBRARY_PATH with the newer libraries to test the new executable).

Attachments

shitty_pcre2.patch (8.2 KB) - added by broly 20 months ago.
fix patch corruption issue.

Change History

comment:1 Changed 20 months ago by broly

if anyone wants to fix any potential bugs that may arise from my changes, use these:

https://github.com/apache/httpd/commit/c602ba14811ede722017c4e59e4e30d9990227b4#diff-4d479e1bf17e6539c339b182d3c0861a1bd0b0ccbad5226c437d0e30f8cae0c7

https://github.com/i3/i3/issues/4682#issuecomment-973076704

i will note that, as of the second patch, i have tested the search and it works exactly as you expect!

very happy about this, considering how annoying privoxy is for similar changes.

ENJOY BOYS!

Last edited 20 months ago by broly (previous) (diff)

comment:2 Changed 20 months ago by andrew_b

  • Component changed from mc-core to mc-search

comment:3 follow-up: ↓ 4 Changed 20 months ago by andrew_b

The patch is corrupted:

$ git apply shitty_pcre2.patch
shitty_pcre2.patch:27: trailing whitespace.
    lc_mc_search->num_results = pcre2_match (regex, 
shitty_pcre2.patch:53: trailing whitespace.
        int pcre_options = 
shitty_pcre2.patch:56: trailing whitespace.
#else           
shitty_pcre2.patch:62: trailing whitespace.
            pcre_options |= 
shitty_pcre2.patch:70: trailing whitespace.
                pcre_options |= 
error: corrupt patch at line 113

comment:4 in reply to: ↑ 3 Changed 20 months ago by broly

Replying to andrew_b:

The patch is corrupted:

$ git apply shitty_pcre2.patch
shitty_pcre2.patch:27: trailing whitespace.
    lc_mc_search->num_results = pcre2_match (regex, 
shitty_pcre2.patch:53: trailing whitespace.
        int pcre_options = 
shitty_pcre2.patch:56: trailing whitespace.
#else           
shitty_pcre2.patch:62: trailing whitespace.
            pcre_options |= 
shitty_pcre2.patch:70: trailing whitespace.
                pcre_options |= 
error: corrupt patch at line 113

sorry for the late reply. turns out trying to jam in a forgotten macro corrupted the patch, because i didn't wanna make a new diff and then delete all of the differences for the */Makefile.in files xD

the updated patch has been tested and applied on my end this time. check it out.

Changed 20 months ago by broly

fix patch corruption issue.

comment:5 Changed 20 months ago by andrew_b

  • Status changed from new to accepted
  • Owner set to andrew_b
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.30

Branch: 4450_pcre2
changeset:49624e473bef82ab56e5e89271f6c4b4cd0f9e5c

@broly it would be nice if you revealed your real name for AUTHORS file.

comment:6 Changed 19 months ago by andrew_b

  • Votes for changeset set to andrew_b
  • Branch state changed from on review to approved

comment:7 Changed 19 months ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from andrew_b to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

comment:8 Changed 19 months ago by andrew_b

  • Status changed from testing to closed

comment:9 Changed 15 months ago by eugenesan

I've been doing some testing with PCRE2 enabled MC and found that it is ~2x faster but provides different search results, which is concerning.

I performed "Find File with Content (Regular expression)" ^.*pcre.*$" on a tree of mixed text, binaries and archives (total ~90k / 3GB).
PCRE2 compiled MC provided ~0.005% (4476 vs 4453) less results when compared to default search engine.
Most of the misses where in binary files (.jar archives to be precise).

Last edited 15 months ago by eugenesan (previous) (diff)

comment:10 Changed 15 months ago by zaytsev

This is certainly concerning. Did you figure out whether these missed are "real" misses, or the old search is actually wrong?

Note: See TracTickets for help on using tickets.