Ticket #2290 (closed defect: fixed)
"Find File" is very slow and consumes 100% of CPU
Reported by: | birdie | Owned by: | andrew_b |
---|---|---|---|
Priority: | major | Milestone: | 4.8.12 |
Component: | mc-core | Version: | master |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Branch state: | merged | Votes for changeset: | committed-master |
Description
For a mounted filesystem which contains roughly ten thousand files and a thousand directories, MC's "Find File" search with default parameters takes 5 (five) seconds with 100% CPU usage.
There's something terribly wrong, because:
time find . > /dev/null real 0m0.022s user 0m0.010s sys 0m0.010s
Other console file managers find all files and directories instantly.
Attachments
Change History
Changed 14 years ago by birdie
- Attachment mc_findfile2.png added
comment:1 Changed 14 years ago by birdie
- Version changed from 4.7.3 to master
- Milestone changed from 4.7.4 to 4.7.5
comment:4 Changed 11 years ago by sknaumov
- Branch state set to no branch
I've benchmarked mc find with linux-3.10.3 source code and USB_SPEED_LOW search pattern. On mc-4.8.9 it takes 30 seconds to complete the search.
First of all, couple of seconds could be cut down by commenting out mc_refresh() call in search_content function in find.c: then it takes 23,5 seconds to complete.
Commenting out also gsprintf( ... "Grepping in %s" ... ) and status_update calls in this function cuts time to 20,5 seconds.
But commenting all status line manipulations (internals of status_update and find_rotate_dash functions) get us only to about 19 seconds, so there is another bottleneck.
comment:5 Changed 11 years ago by slyfox
Do you have chance to run mc under 'perf' tool on that box? (from linux/tools/perf)
Should be quite straightforward:
perf record mc # perfrom action, exit from mc perf report # post 10-15 top lines from the output
You will need debugging symbols present in mc (CFLAGS+=-g)
comment:6 Changed 11 years ago by sknaumov
Samples: 154K of event 'cpu-clock', Event count (approx.): 36529 26,57% [kernel] [k] read_tsc 19,77% [kernel] [k] _raw_spin_unlock_irqrestore 8,63% mc [.] g_string_append_c_inline 6,25% mc [.] get_line_at 5,14% mc [.] mc_search__get_char 5,02% [kernel] [k] finish_task_switch 4,75% libc-2.15.so [.] 0x00077704 4,27% mc [.] mc_search__run_regex 2,58% libpcre.so.3.13.1 [.] pcre_exec 1,59% libglib-2.0.so.0.3400.1 [.] 0x0004c4b0 0,80% libpixman-1.so.0.26.0 [.] 0x00080b70 0,79% libc-2.15.so [.] __libc_calloc 0,51% vboxvideo_drv_113.so [.] VBoxHGSMIBufferSubmit 0,49% libslang.so.2.2.4 [.] 0x0006a1c1 0,47% libc-2.15.so [.] malloc 0,46% Xorg [.] 0x00125273 0,44% xterm [.] 0x000372ad 0,43% libglib-2.0.so.0.3400.1 [.] g_slice_free1 0,40% mc [.] mc_search__regex_found_cond
These figures are from perf top just after search completion, not from perf report, because perf report don't show any significant mc activity.
comment:7 Changed 11 years ago by slyfox
Yeah, looks like mc triggers kernel's subsystem churn quite a lot.
Can you describe your hardware setup a bit more?
Looks like it's a virtualbox thing. I guess it's framebuffer
is quite inefficient (memory copy kills all caches)
Namely:
- what guest $CHOST is that? (the 'gcc -v' output for example)
- do you use it on local machine or networking kicks-in? (like SSH's session or something)
- what video driver on host machine is used? (i guess some xorg's one)
- what video driver on guest machine is used? (linux dumb term, or some vbox fbdev)
comment:8 Changed 11 years ago by sknaumov
It was Xubuntu 12.10 (with VirtualBox? guest additions installed) on top of Windows 7. I will run the same test tomorrow at work with Xubuntu 12.10 installed on AMD Athlon X2. I thought that it could be only terminal redraw issue, but as I said before 2/3 of the search time is spent somewhere else.
comment:9 follow-up: ↓ 10 Changed 11 years ago by sknaumov
Yes, on native system difference is not so big: 13,5 seconds with status updates and dash rotation commented out (except notification that search is finished) and 15 seconds otherwise. But I can't see any reason why status bar should be updated so often - information about file being grepped is unreadable anyway. Also it is interesting to measure overhead over ssh. As to mc's search code - it seems to be too generalized - it scans input buffer char by char to get string, then scans resulted string again to get length, then it calls some generic functions and then it seems that in mc_search_run__regex some functions are called per char.
comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 11 years ago by andrew_b
Replying to sknaumov:
But I can't see any reason why status bar should be updated so often - information about file being grepped is unreadable anyway.
Unreadable until you find in small files. Search is fast for small file but long for large one. If you decimate (2, 3 or more times) output of file name and meet a large file after a lot of small ones during seach, you can see some previous file name in status instead of current large file.
As to mc's search code - it seems to be too generalized - it scans input buffer char by char to get string, then scans resulted string again to get length
It can be fixed easy.
then it calls some generic functions and then it seems that in mc_search_run__regex some functions are called per char.
This is issue about general speed up of mc search engine.
So, some minor optimization is performed in
2290_find_file_speedup branch.
Initial changeset:1b2e6883436d3a9e57d37c59abf44a1ab9ea8e2f
I guess this doesn't give us a large profit, but...
comment:11 in reply to: ↑ 10 Changed 11 years ago by sknaumov
Replying to andrew_b:
So, some minor optimization is performed in
2290_find_file_speedup branch.
Initial changeset:f66b1bac8e56a0cb8843ba9ac110d75124a35f0e
I guess this doesn't give us a large profit, but...
Actually this patch slows down things a lot. On my new Virtualbox setup (Xubuntu 13.04, VDI image is on SSD) this patch increases search time from 17,5 to 20,65 seconds. Maybe using glib functions like g_string_append_c is very inefficient approach, but...
Passing pointer to variable on stack to get_line_at function and setting resulting string length to it surprisingly increases search time to 19.3 seconds. Usage of global variable to pass length from get_line_at to mc_search_run decreases search time to 17.2 seconds. Sounds a bit unreal, but I launched tests three times in a row and results were repeatable.
Usage of SIGALARM and checking per line whether 1 second passed since previous status update reduces search time to 15.7 seconds. Now I should check whether this change improves or decreases performance in native case.
UPDATE: on Virtualbox results seem to change a bit from one "make install" to another, but not so much. Have to check everything tomorrow on native system at work.
comment:12 Changed 11 years ago by sknaumov
These patches decreases search time by 33% on Virtualbox and by 16% on real hardware. Not so big improvement, but not so bad given that changes are rather small. Please review and benchmark.
comment:13 Changed 11 years ago by sknaumov
As to your last changeset:df952a, Andrew, please benchmark it. When I passed length by reference in get_line_at function search time increased approximately by 5-6%. When I used global variable it decreased by 3%.
comment:14 follow-up: ↓ 15 Changed 11 years ago by sknaumov
Find without content pattern is VERY VERY slow partly due to usage of g_list_append function, because it uses g_list_last internally to traverse from the beginning to the end of a list, so forming a list of results has O(n2) complexity instead of O(n). I can't find g_list_insert_after in GLib Reference Manual, but is seems that this function exists. We should track the end of list to improve append time. Find time with find_add_match function commented is about 4 seconds, but in normal case it takes about 20-22 seconds to complete. Here is perf report output:
20,29% mc libglib-2.0.so.0.3600.0 [.] g_list_last 17,04% mc libglib-2.0.so.0.3600.0 [.] 0x00000000000405fd 13,17% mc libc-2.17.so [.] bsearch 12,90% mc libslang.so.2.2.4 [.] SLsmg_write_chars 12,26% mc mc [.] str_utf8_make_make_term_form 7,12% mc libglib-2.0.so.0.3600.0 [.] g_unichar_isprint 2,63% mc libglib-2.0.so.0.3600.0 [.] g_unichar_type 2,38% mc libglib-2.0.so.0.3600.0 [.] g_unichar_to_utf8 2,18% mc libglib-2.0.so.0.3600.0 [.] g_utf8_get_char_validated 1,89% mc mc [.] str_unichar_iscombiningmark 1,61% mc libglib-2.0.so.0.3600.0 [.] g_unichar_iswide 0,95% mc libglib-2.0.so.0.3600.0 [.] g_strlcpy 0,65% mc mc [.] utf8_tool_copy_chars_to_end 0,59% mc [kernel.kallsyms] [k] 0xffffffff81043e6a 0,56% mc mc [.] g_unichar_to_utf8@plt 0,53% mc libglib-2.0.so.0.3600.0 [.] bsearch@plt 0,53% mc mc [.] g_utf8_get_char_validated@plt
I've tried to use g_list_prepend and it resulted in inverse of list elements in output and constant redrawing, but it took only 16 seconds to complete.
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 11 years ago by andrew_b
Replying to sknaumov:
I can't find g_list_insert_after in GLib Reference Manual, but is seems that this function exists.
There is no such function in the GLib source tree.
We should track the end of list to improve append time.
There is the GQueue class with such functionality, but glib >= 2.14.0 is required. Another way is reimplementation of WListbox using GPtrArray.
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 11 years ago by sknaumov
Replying to andrew_b:
There is the GQueue class with such functionality, but glib >= 2.14.0 is required. Another way is reimplementation of WListbox using GPtrArray.
May be we can use g_list_prepend and traverse this list backward on show? Is it traversed only one/a few times?
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 18 Changed 11 years ago by andrew_b
Replying to sknaumov:
Replying to andrew_b:
There is the GQueue class with such functionality, but glib >= 2.14.0 is required. Another way is reimplementation of WListbox using GPtrArray.
May be we can use g_list_prepend and traverse this list backward on show? Is it traversed only one/a few times?
Store WListbox content in reverse order makes support of listbox_append_t flags more complicate.
Length of list is required in any case, so GArray is better than GList. From other hand, GAarray doesn't provide the sorted insertion, but this is cam be workarounded using some implementation of lock/unlock fuctionality:
listbox_lock_sort (); ... /* add a lot of items in some order */ listbox_unlock_sort (); /* and sort here */ ... /* draw listbox */
Also we can insert dummy sentinel element, use g_list_insert_before function and do not traverse this element at show time.
There is no such function in GLib.
comment:18 in reply to: ↑ 17 Changed 11 years ago by andrew_b
Changed 11 years ago by sknaumov
- Attachment mc-4.8.10-no-get-line-at.patch added
This patch gets rid of get_line_at function that was called only at one place and was tightly coupled with caller function. It allows to perform some minor optimizations, and as a result improve search time by 18-20%.
comment:19 Changed 11 years ago by andrew_b
- Milestone changed from 4.8 to Future Releases
mc-4.8.9-optimize-regex.patch with some modifications is applied as [bb2f5d35f3a005effb72ab9dc030a5495e9b0a4a].
Changed 11 years ago by sknaumov
- Attachment mc-4.8.10-refresh-size+time.patch added
New patch to optimize updates in status bar during search (based on examining file size and time of the last status update). This version is synchronous and is based on assumption that relatively small files are quickly processed even on slow hardware so if refresh timeout expires during their processing, the delay couldn't be noticed by human. Therefore only rather big files have to be reported unconditionally, and others - only if there is a match in this file or if timeout expires.
Changed 11 years ago by sknaumov
- Attachment mc-4.8.10-refresh-size+time-gtod.patch added
The same patch as mc-4.8.10-refresh-size+time.patch, but uses gettimeofday() instead of clock_gettime().
comment:20 Changed 11 years ago by andrew_b
- Status changed from new to accepted
- Owner set to andrew_b
- Branch state changed from no branch to on review
- Milestone changed from Future Releases to 4.8.12
Rebased 2290_find_file_speedup branch.
Initial changeset:6c784ed7a35926230d377d856f447ed458da81bd
comment:23 Changed 11 years ago by andrew_b
- Status changed from accepted to testing
- Votes for changeset changed from slavazanko to committed-master
- Resolution set to fixed
- Branch state changed from approved to merged
Merged to master: [f949a29422e9961cb429553a643bf298a8e4ead9].
git log --pretty=oneline 7abd8a1..f949a29
Search parameters