Ticket #55 (closed defect: fixed)

Opened 16 years ago

Last modified 11 years ago

savannah: tab completion vs. spaces and escaping

Reported by: egmont Owned by: slavazanko
Priority: major Milestone: 4.8.8
Component: mc-core Version: master
Keywords: Cc: ossi, egmont@…
Blocked By: Blocking:
Branch state: merged Votes for changeset: committed-master

Description (last modified by ossi) (diff)

Original: http://savannah.gnu.org/bugs/?18695

Submitted by:Egmont Koblinger <egmont>Submitted on:Fri 05 Jan 2007 01:07:06 PM UTC
Category:NoneSeverity:3 - Normal
Status:NonePrivacy:Public
Assigned to:NoneOpen/Closed:Open
Release:current (CVS or snapshot)Operating System:GNU/Linux

Original submission:

Tab completion (Alt+Tab or Escape+Tab, I'll refer to these simply 
as Tab) works exactly the same way in the command line and in 
dialog boxes (such as copy/move file). The function complete() in 
src/complete.c only takes a widget as its input and performs the 
completion.

This approach is wrong, since the dialog boxes and the command line 
have different word splitting and escaping rules, they cannot be 
treated the same way. Currently they both have bugs. (I found these 
while trying to tidy up my mp3s, where lots of file and directory 
names contain spaces.)

In the file copy and similar dialogs, filenames are typed as pure 
strings without any kind of escaping, and a text entry only takes 
one filename. However, splitting at spaces is still performed. 
Suppose I have a directory named "/tmp/a bcde". I press F5/F6 on a 
file, and type "/tmp/a" and then Tab, it completes to "/tmp/a 
bcde". This is okay. But if I type "/tmp/a bc" and press Tab now, 
it completes to the files in the current directory whose name 
begins with "bc", due to the splitting at the space. Similarly, if 
I already have "/tmp/a bcde/" in the edit field, it won't complete 
to subdirectories or files of this directory. In this case the 
whole content of the field should be treated as one, not being 
split at spaces.

In the command line, splitting at spaces is okay, however, we would 
need some escaping and unescaping too, which doesn't happen. (Note: 
Esc followed by Enter correctly escapes the filename when it puts 
it into the cmdline.) Example: type "ls /tmp/a" and then Tab, 
you'll have "ls /tmp/a bcdefgh/" in the command line, which clearly 
won't list the contents of that directory due to the unescaped 
space character. Another example: type "ls /tmp/a\ b" and hit Tab, 
it won't complete to "ls /tmp/a\ bcde/", but it should.

I understand that it is beyond the scope of mc to perform a full 
shell syntax parsing (single and double quotes, nesting them etc.). 
This doesn't even happen when Esc Enter is pressed: backslashes are 
inserted even between quotes, but I'm fine with that. However, mc 
should produce proper escaping when inserting spaces or other 
special chars as a result of completion, and should be able to 
parse the backslash escaping it created (or the user typed) when 
doing further completion.

Obviously the current design of the complete() function to take 
only a widget is not okay (unless there's some room in the widget 
to store splitting and escaping rules). There are two main possible 
solutions I can see, but probably they'd eventually lead to 
basically the same code.

1) modify complete() so that it receives information on word 
splitting and escaping, so that it becomes more complex and 
probably more buggy and harder to maintain, but solves everything 
we need.

2) remove the word splitting code from complete(), so that it 
becomes simpler and perfect for the file copy/move widgets. 
Slightly modify its interface to take and return string and cursor 
positions instead of widget. Then make the command line editor 
split at unescaped spaces, unescape the resulted word, call 
complete() on this temporary string, escape the result of 
completion and put back in the command line.

Finally, we should check how the internal "cd" command handles its 
arguments. Currently it seems to do a lot of heuristics. For 
example, if I type "cd a\\\\b" (four backslashes) then it tries to 
change to the directory called "a\\\\b" (four backslashes) but if 
there's no such directory, it changes to "a\\b" (two backslashes). 
This seems to be terrible, and leads to bugs, e.g. if the cursor is 
on "a\\b" and I type "cd " followed by a Tab, I get the command "cd 
a\\\\b" and hence it'll change to "a\\\\b" which is not the desired 
directory. I'd remove all the heuristics and perform the usual 
command line unescaping, just as if it was an external command. 

Comment 1 by Pavel Tsekov <ptsekov> at Fri 05 Jan 2007 01:53:04 PM UTC:

Sounds familiar...

http://savannah.gnu.org/bugs/?16176

Change History

comment:1 Changed 16 years ago by slavazanko

Related ticket: #41

comment:2 Changed 16 years ago by winnie

  • Milestone set to VFS Standardisation

comment:3 Changed 16 years ago by metux

  • Blocked By 157 added

comment:4 Changed 15 years ago by andrew_b

  • Version set to master
  • severity set to no branch
  • Blocked By 157 removed

comment:5 Changed 15 years ago by ossi

  • Cc ossi@… added

comment:6 Changed 14 years ago by andrew_b

  • Blocking 2450 added

comment:7 Changed 13 years ago by slavazanko

  • Owner set to slavazanko
  • Status changed from new to accepted
  • Branch state set to no branch

comment:8 Changed 13 years ago by slavazanko

  • Cc egmont added
  • Branch state changed from no branch to on review

Created branch 55_filename_complete (parent: master)
Initial 55_filename_complete

Review, please.

Last edited 13 years ago by angel_il (previous) (diff)

comment:9 Changed 13 years ago by andrew_b

  • Branch state changed from on review to on rework

comment:10 Changed 13 years ago by andrew_b

  • Blocking 2626 added

comment:11 Changed 13 years ago by slavazanko

  • Branch state changed from on rework to on review
  • Milestone changed from 4.8 to 4.8.3

comment:12 Changed 13 years ago by ossi

in lib/widget/input_complete.c, the tab char should also be treated like that. you can insert a tab with ctrl-q tab easily, even if this is not really useful in the normal case.

INPUT_COMPLETE_FILES_ESC sounds like a misnomer to me. at the level of the input api it doesn't relate to files at all (or at least it shouldn't - i can't see from the patch alone whether that's actually the case). INPUT_COMPLETE_SPACE_ESC sounds like a better name.

comment:13 Changed 13 years ago by slavazanko

Ossi, your wishes have been realized in temporary changesets dac8b78fb and b530609cb. Thanks.

comment:14 Changed 13 years ago by ossi

actually, the logic in lib/widget/input_complete.c is broken. i think it should be more or less like this:

// no space, no tab here.
// XXX this list seems pretty arbitrary to me. if shell, missing at least: ()&
if (strchr (";|<>", *s) != NULL)
    break;
if ((*s == 32 || *s == 9) &&
    (!(in->completion_flags & INPUT_COMPLETE_SPACE_ESC) ||
     !strutils_is_char_escaped (in->buffer, s)))
    break; 

comment:15 Changed 13 years ago by slavazanko

  • Branch state changed from on review to on rework

comment:16 Changed 13 years ago by slavazanko

  • Branch state changed from on rework to on review
  • Milestone changed from 4.8.3 to 4.8.4

rebased to latest master and have respected ossi's thoughts.

Review, please.

comment:17 Changed 13 years ago by ossi

a recommendation for the commit message: use the past tense for the broken things that your commit fixes. it makes no sense to describe *some* state in the present tense, because it is not clear whether that's the new or the old state.

your re-definition of INPUT_COMPLETE_DEFAULT makes it a misnomer - it's more like INPUT_COMPLETE_NONE now.

anyway, the more i look into this code, the less i understand. what exactly is the above fragment supposed to do? the "stop chars" in the strchr are applied irrespective of mode, which makes no sense to me at all. what is INPUT_COMPLETE_SPACE_ESC meant to actually do? as far as i can see, there are only two modes: plain (regular line inputs) and shell-like (command prompt. fwiw, the built-in cd should just pretend to be a regular cd command (i.e., follow the normal shell splitting and quoting rules), then there would be no ambiguity). i can't find the code which is supposed to quote completions for the command prompt - am i or the code missing something?

try_complete() contains a lot of fuzzy logic, in particular with backslash escaping being handled rather arbitrarily or not at all. strutils_is_char_escaped() should be consistently applied, with regard to the mode.

INPUT_COMPLETE_DIRNAMES needs to be split out from INPUT_COMPLETE_CD, otherwise the "cd" detection magic is activated outside command prompt context as well.

generally, i'd suggest that you *precisely* describe the intended semantics of the INPUT_COMPLETE_* enum values next to their declarations, and then match these descriptions against both the implementation and the usages.

wow. two hours gone.

comment:18 Changed 13 years ago by slavazanko

  • Branch state changed from on review to on rework

a recommendation for the commit message: use the past tense for the broken things that your commit fixes. it makes no sense to describe *some* state in the present tense, because it is not clear whether that's the new or the old state.

Thank for explanation, my English isn't good, and I accept all lessons and criticism.

your re-definition of INPUT_COMPLETE_DEFAULT makes it a misnomer - it's more like INPUT_COMPLETE_NONE now.

Yep, you right, INPUT_COMPLETE_NONE have more sense now. I changed the name of constant.

All your other notes I have to take a closer look in the evening with a relaxing cup of tea. So, ticket going to 'rework' stage.

wow. two hours gone.

But I hope, you got a fun in this two hours, because a fun it an engine of OpenSource (in most cases) :)

comment:19 Changed 13 years ago by andrew_b

  • Milestone changed from 4.8.4 to 4.8.5

comment:20 Changed 12 years ago by andrew_b

  • Milestone changed from 4.8.5 to Future Releases

comment:21 Changed 12 years ago by slavazanko

  • Branch state changed from on rework to on review
  • Milestone changed from Future Releases to 4.8.8

Branch was recreated. See 55_filename_complete for details. Review & vote, please.

comment:22 Changed 12 years ago by andrew_b

  • Votes for changeset set to andrew_b

comment:23 Changed 12 years ago by angel_il

  • Votes for changeset changed from andrew_b to andrew_b angel_il
  • Branch state changed from on review to approved

comment:24 Changed 12 years ago by slavazanko

  • Status changed from accepted to testing
  • Votes for changeset changed from andrew_b angel_il to committed-master
  • Resolution set to fixed
  • Blocking 2450, 2626 removed
  • Branch state changed from approved to merged

merged to master:

git log --pretty=oneline 74d71e7..0608af2

comment:25 Changed 12 years ago by slavazanko

  • Status changed from testing to closed

comment:26 Changed 11 years ago by ossi

  • Cc ossi added; ossi@…, egmont removed
  • Description modified (diff)
  • Reporter changed from slavazanko to egmont

comment:27 Changed 11 years ago by egmont

  • Cc egmont@… added
Note: See TracTickets for help on using tickets.