Ticket #4587 (new defect)

Opened 3 months ago

Last modified 2 weeks ago

regexp search issues in mcviewer

Reported by: dextarr
Owned by:
Priority: major
Milestone: Future Releases
Component: mcview
Version: master
Keywords: mcviewer regexp
Cc:
Blocked By:
Blocking:
Branch state: no branch
Votes for changeset:

Description

Searching with regexp patterns sometimes results in unexpected matches.

How to reproduce:

Create a test file:

printf "0\n1\n2\n3\n4\n5\n101\n11\n12\n13\n14\n15\n" >input.txt

Open the file in mcviewer (F3).

Search the file with F7 in regexp mode for "^1", then press 'n' for each next match. As expected, it finds:

the first character in 1, 101, 12, 13, 14, 15

but also _both_ 1s in 11


Reverse search for the same pattern in regexp mode (? or F7 with 'Backwards' set, or just 'N'). As expected, it finds:

the first character in 15, 14, 13, 12, 1

but also _both_ 1s in 11 and 101


The same tests work as expected in mcedit.
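
For reference, the matches one would expect for "^1" can be sketched outside of mc. This is only an illustration using Python's re module in multiline mode, not mc's search code; the helper name expected_matches is made up for this example.

```python
import re

# Same content as the input.txt created by the printf command above.
DATA = "0\n1\n2\n3\n4\n5\n101\n11\n12\n13\n14\n15\n"


def expected_matches(pattern, text):
    """Return 1-based (line, column) pairs for every match of pattern."""
    hits = []
    for m in re.finditer(pattern, text, re.MULTILINE):
        line_no = text.count("\n", 0, m.start()) + 1
        line_start = text.rfind("\n", 0, m.start()) + 1
        hits.append((line_no, m.start() - line_start + 1))
    return hits


forward = expected_matches(r"^1", DATA)
backward = list(reversed(forward))  # order expected from a reverse search

for line_no, col in forward:
    print(f"line {line_no}, column {col}")
```

Every expected match sits in column 1, one per line that starts with 1 (including 11). A match in any other column, such as the second 1 in 11, is the behaviour reported here.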

Thank you for looking into the issue!

Change History

comment:1 Changed 2 weeks ago by zaytsev

Another example:

https://lists.midnight-commander.org/pipermail/mc/2024-November/005786.html

Let's say we have the following file:

---
    A   A

    B   A

A
---

Now, let's open it in the internal viewer and start a regular expression
search for the following pattern: ^\s*A. 

At first, it finds the very first occurrence: beginning of the line,
4 spaces, A. So far so good. But if we then press 'S-F7' or 'n' to
find the next match, the cursor moves to the next space inside the
first match. That is, it looks as if it first found '4 spaces and A',
then '3 spaces and A', and so on. Then it moves to the second A on the
very same first row. Then it skips the third A (rightly so) and jumps
to the last A on the fifth row (again, rightly so).

The problem is

  - there's really no such match as ^\s\s\sA. Or any other matches on
    the first row (as if it just ignores ^ further on if a match has
    been found)
  - And even if we drop ^, it still would be rather strange to find,
    at first, the greediest match, and then continue searching inside
    that greediest match for the less greedy ones.
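
To make the expected result concrete, here is a minimal sketch that applies the pattern once per line, assuming a line-oriented search in which \s cannot cross a newline. It uses Python's re module purely for illustration and says nothing about how mcview implements its search; line_anchored_matches is a hypothetical helper.

```python
import re

# The sample file from above, without the '---' delimiters.
TEXT = "    A   A\n\n    B   A\n\nA\n"


def line_anchored_matches(pattern, text):
    """Try pattern only at the start of each line; return (line_no, match)."""
    compiled = re.compile(pattern)
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        m = compiled.match(line)  # match() anchors at the start of the line
        if m:
            hits.append((line_no, m.group(0)))
    return hits


hits = line_anchored_matches(r"^\s*A", TEXT)
for line_no, matched in hits:
    print(line_no, repr(matched))
```

Under that assumption there are exactly two matches: the four spaces plus A on row 1, and the lone A on row 5. Nothing else on row 1 qualifies.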

comment:2 Changed 2 weeks ago by andrew_b

The simplest file for reproducing the issue is the following:

11

The ^1 regexp matches both 1s.
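
The symptom can be imitated outside of mc. The sketch below is a hypothetical illustration in Python, not a diagnosis of mcview: the ticket does not confirm the root cause. It only shows that resuming a search on a truncated copy of the line lets ^ match at the resume point, whereas resuming with a start offset into the full line does not.

```python
import re

PATTERN = re.compile(r"^1")
LINE = "11"

# Resume at offset 1 inside the full line: ^ still refers to the real
# line start, so no second match is found.
resumed_with_offset = PATTERN.search(LINE, 1)

# Resume on a sliced copy: offset 1 now looks like a line start, so the
# second 1 matches as well. This mirrors the reported behaviour.
resumed_on_slice = PATTERN.search(LINE[1:])

print(resumed_with_offset)
print(resumed_on_slice.group(0))
```

If the viewer resumes in a similar way after each hit, that would be consistent with both 1s being matched, but this remains an assumption until the search code is checked.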
