Ticket #2707 (closed defect: fixed)

Opened 6 years ago

Last modified 11 months ago

u7z fails to parse a particular archive

Reported by: birdie Owned by: zaytsev-work
Priority: major Milestone: 4.8.19
Component: mc-vfs Version: master
Keywords: Cc:
Blocked By: Blocking: #3744
Branch state: merged Votes for changeset: committed-master

Description

Subject.

Attachments

list.txt (3.1 KB) - added by birdie 6 years ago.
7z l of this archive
mc-2707-Extfs-helper-u7z-fix-parsing-7z-archive-no-datetime.patch (5.0 KB) - added by and 21 months ago.
2707-0001-extfs-u7z-make-the-code-more-readable.patch (2.2 KB) - added by mooffie 11 months ago.
2707-0004-extfs-u7z-print-cleanup.patch (1.1 KB) - added by mooffie 11 months ago.
listing2.txt (14.4 KB) - added by birdie 11 months ago.
2707-0002-extfs-u7z-handle-missing-size.patch (2.1 KB) - added by mooffie 11 months ago.
2707-0003-extfs-u7z-sed-portability.patch (1.5 KB) - added by mooffie 11 months ago.

Change History

Changed 6 years ago by birdie

7z l of this archive

comment:1 Changed 6 years ago by andrew_b

  • Component changed from mc-core to mc-vfs

comment:2 Changed 21 months ago by and

Confirmed,
7z archive can hold entries without datetime info
which hide/skip u7z helper.

Patch attached.

Last edited 21 months ago by and (previous) (diff)

comment:3 Changed 21 months ago by zaytsev

@and, in the future, please submit whitespace only patches as separate commits. If you change the whitespace in the same commit, then it's not immediately clear from the diff whether the additional changes also concern the logic, or it's only whitespace, unless special options are used (ignore whitespace or word diff). This makes reviewing the history much more tedious than otherwise.

comment:4 Changed 21 months ago by zaytsev

  • Milestone changed from Future Releases to 4.8.17

I have just created a 4.8.17 milestone. I suggest to use this for all tickets with fresh patches to be reviewed.

We need to make a 4.8.16 at some point this month, so just assigning all newly processed tickets to 4.8.16 is not helpful, as we need to stop at some point. But if they are left for future releases, then one would need to go through them again and see which ones already have good patches on review. Assigning them to 4.8.17 sounds like a good solution to me.

comment:5 Changed 19 months ago by zaytsev

  • Milestone changed from 4.8.17 to 4.8.18

comment:6 Changed 14 months ago by zaytsev

  • Milestone changed from 4.8.18 to 4.8.19

comment:7 Changed 13 months ago by zaytsev-work

  • Owner set to zaytsev-work
  • Status changed from new to accepted

comment:8 Changed 11 months ago by zaytsev

  • Branch state changed from no branch to on review

Branch: 2707_extfs_u7z
Initial changeset: [7f6a700d9e2feea02cde4097d64a7ba522eea7b9]

comment:9 Changed 11 months ago by zaytsev

See also #3730 regarding making a possible test out of this.

comment:10 Changed 11 months ago by mooffie

Please hold on the merge for one day, ok? I'll have a quick look into the testability of this today.

comment:11 Changed 11 months ago by zaytsev

No worries, it has been waiting for 5 years already, so I have absolutely no problems with waiting a couple of days more.

comment:12 Changed 11 months ago by mooffie

There's another problem in the archive @birdie posted: the size field is missing for some files.

  • I'm attaching a patch to fix this.
  • The first patch, though, will be to make the code more readable (otherwise you won't understand my fix).
  • Then two other small patches will follow (portability, cleanup).

Don't worry: you won't have to wait 5 more years to commit this. I've already prepared tests (I'll start a new ticket shortly) so you can commit this outright.

Changed 11 months ago by mooffie

comment:13 Changed 11 months ago by mooffie

  • Blocking 3744 added

comment:14 follow-up: ↓ 16 Changed 11 months ago by zaytsev-work

Could you please give me an example of sed that has problems with \s ? Did you mean macOS?

zaytsevy:~ zaytsev$ echo "abc   def" | sed -e 's/\s*//g'
abc   def
zaytsevy:~ zaytsev$ echo "abc   def" | sed -e 's/[[:space:]]*//g' 
abcdef

I'm not sure that replacing \s with is the best approach, because \s at very least also includes tab, but maybe in this case it doesn't matter and if the input has some other whitespace we have a problem anyways.

comment:15 follow-up: ↓ 17 Changed 11 months ago by birdie

I can imagine 7z archives which lack both compressed _and_ uncompressed sizes simultaneously. Please consider this.

comment:16 in reply to: ↑ 14 Changed 11 months ago by mooffie

Replying to zaytsev-work:

I'm not sure that replacing \s with ' ' is the best approach, because \s at very least also includes tab [...] and if the input has some other whitespace we [...]


I don't see a problem with replacing \s with just ' '. On the contrary. If the input has whitespace zoo, our whole sed recipe would fail: we're very precise in the rest of the pattern(s).

Could you please give me an example of sed that has problems with \s ?


I'm not familiar with such sed personally. \s is just not standard.

Now, speaking of portability:

There was something that bothered me and I now verified it: the | (bar) I used in \($date_re\|$empty_date_re\) isn't portable either. sed is supposed to support "basic regular expressions" and | is of "extended regular expressions". So it's my mistake.

I'll soon re-upload patch #2 (after I get more info from @birdie).

comment:17 in reply to: ↑ 15 Changed 11 months ago by mooffie

Replying to birdie:

I can imagine 7z archives which lack both compressed _and_ uncompressed sizes simultaneously. Please consider this.


Could you please provide more info? I.e., why would they be missing?

(I'm asking this because sed is a bit limited, as you see, and I don't want to bloat the recipe needlessly.)

Last edited 11 months ago by mooffie (previous) (diff)

Changed 11 months ago by birdie

comment:18 Changed 11 months ago by birdie

Replying to mooffie:

Could you please provide more info? I.e., why would they be missing?

See listing2.txt - its last file has no sizes at all.

Changed 11 months ago by mooffie

Changed 11 months ago by mooffie

comment:19 Changed 11 months ago by mooffie

Ok, I updated patch 0002 (so I had to update 0003 too). Remember to commit them in order: 0001, 0002, 0003, 0004.

What's new in patch 0002?

  • It no longer uses | in the regex.
  • It handles the new case @birdie noted where both the uncompressed and compressed sizes are missing (see 'listing2.txt').

Don't worry about tricky bugs: the tests at #3744 were updated. 'u7z.missing-size-and-date.input' tests all the possibilities that patch 0002 needs to handle.

(As for counting 19 spaces in patch 0003: I felt ' *' would be too loose. Whatever. Let's not be fussy about this.)

comment:20 follow-up: ↓ 21 Changed 11 months ago by zaytsev

See listing2.txt - its last file has no sizes at all.

So, the answer is, because it's not a valid 7z archive, but rather 7z is used to explore other ~archive files that it supports, like NSIS installer packages in this case. But okay, fair enough.

I don't see a problem with replacing \s with just ' '. On the contrary. If the input has whitespace zoo, our whole sed recipe would fail: we're very precise in the rest of the pattern(s).

Alright, I was just saying that it's not an equivalent replacement, but you have a valid point.

Don't worry about tricky bugs: the tests at #3744 were updated.

Oh joys of test driven development ;-) I've just pushed out the new patches and Travis build was successful. Unless Andrew has anything to say against it, I'll try to merge it tomorrow.

Thank you mooffie for your help!

comment:21 in reply to: ↑ 20 Changed 11 months ago by andrew_b

Replying to zaytsev:

Unless Andrew has anything to say against it

No, I haven't. Everything is fine.

comment:22 Changed 11 months ago by zaytsev

  • Votes for changeset set to zaytsev andrew_b
  • Branch state changed from on review to approved

comment:23 Changed 11 months ago by zaytsev

  • Status changed from accepted to testing
  • Votes for changeset changed from zaytsev andrew_b to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

comment:24 Changed 11 months ago by zaytsev

  • Status changed from testing to closed
Note: See TracTickets for help on using tickets.