Ticket #2352 (new enhancement)

Opened 8 years ago

Last modified 4 years ago

[feature request] display most unique part of filename, if panel width is too short

Reported by: pille
Owned by:
Priority: minor
Milestone: Future Releases
Component: mc-core
Version: master
Keywords:
Cc: me@…, gotar@…, mooffie@…
Blocked By:
Blocking:
Branch state: no branch
Votes for changeset:

Description

if a filename displayed in the panel is too long to fit, only its beginning and end are displayed. the middle part that does not fit is replaced by '~'.

i like this feature, but it could be improved.
if you have a lot of files with similar beginnings and endings, you end up with entries that cannot be told apart.

e.g. take some files of a music collection named using the following scheme: artist - track - title (album).mp3

artist - track01 - title of track 01 (album).mp3
artist - track02 - title of track 02 (album).mp3

this may be shortened to:

artist - trac~ (album).mp3
artist - trac~ (album).mp3

it would be better if the filename were shortened by a smarter algorithm that takes the context (the surrounding filenames) into account. e.g. it could display some chars of the beginning and end as usual, plus the most distinctive chars of the middle part that still fit, separated by '~', like:

artist - tra~1~(album).mp3
artist - tra~2~(album).mp3
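
One way to read the proposal above, as a minimal sketch rather than anything taken from mc: compare the name with one neighbouring name, find the first position where they differ, and show that single character between two '~' markers. The function names `shorten_classic` and `shorten_with_context` are hypothetical, the code counts bytes rather than screen columns (so it assumes a single-byte encoding), and it falls back to the plain head~tail form whenever the difference is already visible there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Classic shortening: head + '~' + tail, exactly `width` bytes long. */
static char *shorten_classic(const char *name, size_t len, size_t width)
{
    size_t head = width / 2;        /* bytes kept from the start */
    size_t tail = (width - 1) / 2;  /* bytes kept from the end */
    char *out = malloc(width + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, name, head);
    out[head] = '~';
    memcpy(out + head + 1, name + len - tail, tail);
    out[width] = '\0';
    return out;
}

/* Hypothetical helper: shorten `name` to `width` bytes, using `neighbour`
   (an adjacent file name) to decide which middle character to keep.
   The caller frees the result. */
char *shorten_with_context(const char *name, const char *neighbour, size_t width)
{
    size_t len = strlen(name);
    size_t diff = 0;
    size_t head, tail;
    char *out;

    assert(width >= 1);
    if (len <= width) {             /* fits: return an unchanged copy */
        out = malloc(len + 1);
        if (out != NULL)
            memcpy(out, name, len + 1);
        return out;
    }

    /* Index of the first byte where the two names differ. */
    while (name[diff] != '\0' && name[diff] == neighbour[diff])
        diff++;

    /* Fall back to head~tail when there is no room for three parts or
       when the classic head or tail already shows the difference. */
    if (width < 5 || diff < width / 2 || diff >= len - (width - 1) / 2)
        return shorten_classic(name, len, width);

    head = (width - 2) / 2;
    tail = (width - 3) / 2;
    out = malloc(width + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, name, head);
    out[head] = '~';
    out[head + 1] = name[diff];     /* the distinguishing character */
    out[head + 2] = '~';
    memcpy(out + head + 3, name + len - tail, tail);
    out[width] = '\0';
    return out;
}
```

With the two example names above and a width of 26, this sketch yields the `artist - tra~1~(album).mp3` form. Keeping only one middle character is a simplification; a real implementation would probably keep as many distinguishing characters as fit.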

Change History

comment:1 Changed 8 years ago by gotar

  • Cc gotar@… added

comment:2 Changed 8 years ago by s01ja

I started to think about an algorithm that does not need the whole directory in a map to find 'the most unique' middle part.
The problems start with the first change in the most unique pattern:

staticfile97.backup.tar.bz2
staticfile97.backup-live.tar.bz2
staticfile98.backup.tar.bz2

As you can see in the output below, the trim is not perfect for the first file: the problem starts at max==20
and reoccurs at max==13.

     10        20        30        40
          0         0         0      
max:30
staticfile97.backup.tar.bz2
stat*ile97.backup-live.tar.bz2
staticfile98.backup.tar.bz2
max:29
staticfile97.backup.tar.bz2
stat*le97.backup-live.tar.bz2
staticfile98.backup.tar.bz2
max:28
staticfile97.backup.tar.bz2
stat*e97.backup-live.tar.bz2
staticfile98.backup.tar.bz2
max:27
stat*cfile97.backup.tar.bz2
stat*97.backup-live.tar.bz2
stat*cfile98.backup.tar.bz2
max:26
stat*file97.backup.tar.bz2
stat*7.backup-live.tar.bz2
stat*file98.backup.tar.bz2
max:25
stat*ile97.backup.tar.bz2
sta*7.backup-live.tar.*z2
stat*ile98.backup.tar.bz2
max:24
stat*le97.backup.tar.bz2
sta*7.backup-live.ta.*z2
stat*le98.backup.tar.bz2
max:23
stat*e97.backup.tar.bz2
sta*7.backup-live.t.*z2
stat*e98.backup.tar.bz2
max:22
stat*97.backup.tar.bz2
sta*7.backup-live..*z2
stat*98.backup.tar.bz2
max:21
stat*7.backup.tar.bz2
sta*7.backup-live.*z2
stat*8.backup.tar.bz2
max:20
stat*.backup.tar.bz2
sta*7.backup-liv.*z2
sta*8.backup.tar.*z2
max:19
stat*backup.tar.bz2
sta*7.backup-li.*z2
sta*8.backup.ta.*z2
max:18
stat*ackup.tar.bz2
sta*7.backup-l.*z2
sta*8.backup.t.*z2
max:17
stat*ckup.tar.bz2
sta*7.backup-.*z2
sta*8.backup..*z2
max:16
stat*kup.tar.bz2
sta*7.backup.*z2
sta*8.backup.*z2
max:15
stat*up.tar.bz2
sta*7.backu.*z2
sta*8.backu.*z2
max:14
stat*p.tar.bz2
sta*7.back.*z2
sta*8.back.*z2
max:13
stat*.tar.bz2
sta*7.bac.*z2
sta*8.bac.*z2
max:12
sta*.tar.*z2
sta*7.ba.*z2
sta*8.ba.*z2
max:11
sta*.ta.*z2
sta*7.b.*z2
sta*8.b.*z2
max:10
sta*.t.*z2
sta*7..*z2
sta*8..*z2
max:9
stat*.bz2
stat*.bz2
stat*.bz2
max:8
stat*bz2
stat*bz2
stat*bz2
max:7
sta*bz2
sta*bz2
sta*bz2

---
I'm still working on a better solution and will document this function and post it as a patch against the correct file. This is what currently produces the output above:

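/* Naming: 'r' = result, 'l' = length */
#include <stdio.h>
#include <string.h>
#include <glib.h>

char *
strim (const int max, const char *prev, const char *cur, const char *next)
{
    int i, il, rl, rt, maxkeep;
    char *r = g_strdup (cur);
    const char *ref = next;     /* neighbouring name used for comparison */

    rl = (int) strlen (r);
    /* note: a name of exactly max chars is still trimmed (see max:27 above) */
    if (rl < max)
        return r;
    r[max] = 0;
    maxkeep = max / 2;
    /* copy the last maxkeep chars (and the terminating 0) to the end of r */
    for (i = rl; i > rl - (maxkeep + 1); i--)
        r[max - (rl - i)] = cur[i];
    if (maxkeep <= 4)
    {
        /* too narrow for a middle part: plain head*tail */
        r[maxkeep] = '*';
    }
    else
    {
        /* max >= 10: try to show il chars of the middle part */
        il = max - 8;
        /* compare against next; fall back to prev if next is too short */
        if (strlen (next) < 4 && strlen (prev) > 4)
            ref = prev;
        if (strlen (ref) > 4)
            for (i = 4; i <= rl; i++)
            {
                if (ref[i] == 0)
                    break;
                if (cur[i] != ref[i])
                {
                    if (i + il + 4 > rl)
                    {
                        /* a window of il chars starting at the difference
                           would run past the end: keep 4 chars, '*', tail */
                        for (i = max; i > 4; i--)
                            r[i] = cur[rl - (max - i)];
                        r[4] = '*';
                    }
                    else
                    {
                        /* show il chars starting at the first difference */
                        for (rt = 0; rt < il && ref[i + rt]; rt++)
                            r[rt + 4] = cur[i + rt];
                        r[max - 3] = '*';
                        r[3] = '*';
                    }
                    break;
                }
            }
    }
    return r;
}

int
main (int argc, char *argv[])
{
    const char *s1 = "staticfile97.backup.tar.bz2";
    const char *s2 = "staticfile97.backup-live.tar.bz2";
    const char *s3 = "staticfile98.backup.tar.bz2";
    char *r1, *r2, *r3;
    int max;

    (void) argc;
    (void) argv;
    printf ("     10        20        30        40\n");
    printf ("          0         0         0      \n");
    for (max = 30; max > 6; max--)
    {
        r1 = strim (max, "", s1, s2);
        r2 = strim (max, s1, s2, s3);
        r3 = strim (max, s2, s3, "");
        printf ("max:%i\n%s\n%s\n%s\n", max, r1, r2, r3);
        g_free (r1);
        g_free (r2);
        g_free (r3);
    }
    printf ("%s\n%s\n%s\n", s1, s2, s3);
    return 0;
    /* .... */
}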

This is just hacked into mc::main.c.
If you come up with better ideas for how to find the middle string, please post them.

comment:3 Changed 8 years ago by gotar

And what about the performance penalty for huge directories? Could you compare, e.g., 10000 runs of this against a regular listing of 10000 files?
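
A rough way to get such numbers, sketched under assumptions: the harness below times any shortening routine over a number of synthetic names using `clock()`. The names `shorten_fn`, `shorten_plain` and `time_shortener` are made up for this illustration, and `shorten_plain` is only a stand-in for the regular head~tail trimming, not mc's actual code. The measured time also includes generating the names, so it is only useful for comparing two routines against each other.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any routine that returns a malloc'ed, shortened copy of `name`. */
typedef char *(*shorten_fn)(const char *name, size_t width);

/* Stand-in for the regular trimming: head + '~' + tail (bytes, not columns). */
static char *shorten_plain(const char *name, size_t width)
{
    size_t len = strlen(name);
    size_t head = width / 2;
    size_t tail = (width - 1) / 2;
    char *out;

    assert(width >= 1);
    if (len <= width) {             /* fits: return an unchanged copy */
        out = malloc(len + 1);
        if (out != NULL)
            memcpy(out, name, len + 1);
        return out;
    }
    out = malloc(width + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, name, head);
    out[head] = '~';
    memcpy(out + head + 1, name + len - tail, tail);
    out[width] = '\0';
    return out;
}

/* CPU seconds spent shortening `count` synthetic file names to `width`. */
double time_shortener(shorten_fn fn, size_t count, size_t width)
{
    char name[64];
    clock_t start = clock();

    for (size_t i = 0; i < count; i++) {
        snprintf(name, sizeof name, "staticfile%05zu.backup.tar.bz2", i);
        char *r = fn(name, width);
        assert(r != NULL);
        free(r);
    }
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_shortener(shorten_plain, 10000, 20)` and then the same with a context-aware routine wrapped to the `shorten_fn` signature would give the two figures to compare.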

Currently I've got a single 'name' field in the left "User mini status", plus CmdTogglePanelsSplit (alt-comma), for reading the worst file names.

comment:4 follow-up: ↓ 5 Changed 8 years ago by s01ja

The problem is that 'str_fit_to_term' is implemented three (3) times:

269:strutil8bit.c
212:strutilascii.c
587:strutilutf8.c

and is called about 20 times, so each call site would have to be able to provide a previous and a next string. That seems too big a change for a simple patch.

I'd prefer to change the three str_fit_to_term implementations for this. Is there a better idea than a 'global variable' that can be used to guess the 'most unique part'?
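
One possible alternative to a global variable, sketched here purely as an illustration: pass the neighbouring names in an optional context structure, where a NULL pointer means "no context" so that call sites without neighbours keep today's behaviour. The names `fit_context_t`, `fit_reference` and `fit_first_difference` are hypothetical and do not exist in mc, and the comparison is byte-wise, so the UTF-8 implementation would need its own character-aware variant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical optional context handed to a fitting routine. */
typedef struct {
    const char *prev;   /* name listed before the current one, or NULL */
    const char *next;   /* name listed after the current one, or NULL */
} fit_context_t;

/* Pick the name to compare against: prefer next, fall back to prev. */
static const char *fit_reference(const fit_context_t *ctx)
{
    if (ctx == NULL)
        return NULL;
    if (ctx->next != NULL && ctx->next[0] != '\0')
        return ctx->next;
    if (ctx->prev != NULL && ctx->prev[0] != '\0')
        return ctx->prev;
    return NULL;
}

/* Byte index of the first difference between `text` and its neighbour.
   Without usable context this returns strlen(text), which a caller can
   treat as "nothing distinctive, trim as before". */
size_t fit_first_difference(const char *text, const fit_context_t *ctx)
{
    const char *ref = fit_reference(ctx);
    size_t i = 0;

    assert(text != NULL);
    if (ref == NULL)
        return strlen(text);
    while (text[i] != '\0' && text[i] == ref[i])
        i++;
    return i;
}
```

Only the panel code would need to fill in the context; the other call sites could pass NULL unchanged. Whether an extra parameter on about 20 calls is acceptable is a separate question.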

comment:5 in reply to: ↑ 4 Changed 8 years ago by andrew_b

  • Version changed from version not selected to master
  • Milestone set to Future Releases

Replying to s01ja:

The problem is that 'str_fit_to_term' is implemented three (3) times:

Sure.

269:strutil8bit.c

This is for 8-bit locales.

212:strutilascii.c

This is for 7-bit ASCII locale.

587:strutilutf8.c

This is for UTF-8 locales.

comment:6 Changed 4 years ago by mooffie

  • Cc mooffie@… added
  • Branch state set to no branch