Ticket #2244 (assigned defect)
MC consumes 100% cpu after wake up from suspend
Reported by: | Spinal | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | Future Releases |
Component: | mc-tty | Version: | 4.7.4 |
Keywords: | high cpu | Cc: | zaytsev, slyfox, torohov_s_a@…, petre.rodan@…, graham@… |
Blocked By: | Blocking: | ||
Branch state: | no branch | Votes for changeset: |
Description
I often use hardware suspend to RAM or suspend to disk between sessions on my PC. Sometimes, after wake-up, something consumes 100% of my CPU (a 2.66 GHz CPU, by the way). When I run top, I see that it's an MC instance. The interesting thing is that I've closed all MC instances, but one is still running somewhere in the background, consuming 100% CPU. "killall mc" fixes it. I don't know exactly how to trigger the bug, but it only occurs after wake-up from suspend. Please let me know if I can assist you in fixing this.
P.S. The bug was introduced by Slavaz's version of MC. I didn't notice such behaviour before.
P.P.S. I'm a Gentoo user.
My MC version is 4.7.2
USE flags:
X edit gpm nls -samba -slang
Change History
comment:3 Changed 14 years ago by Spinal
Okay.
~ $ top
top - 21:21:55 up 1 day, 1:57, 4 users, load average: 1.18, 0.97, 0.62
Tasks: 145 total, 2 running, 143 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 74.5%sy, 24.5%ni, 0.0%id, 0.0%wa, 0.3%hi, 0.7%si, 0.0%st
Mem: 1035200k total, 850560k used, 184640k free, 117364k buffers
Swap: 3903784k total, 8896k used, 3894888k free, 371240k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10460 spinal 30 10 8360 3776 2548 R 95.3 0.4 3:05.41 mc
30336 root 30 10 70316 39m 9304 S 2.3 4.0 2:27.24 X
10032 spinal 30 10 176m 86m 20m S 1.0 8.6 1:13.24 opera
10784 spinal 30 10 47632 20m 9548 S 0.7 2.0 0:00.34 Terminal
30426 spinal 30 10 26388 14m 6956 S 0.3 1.4 0:01.64 xfce4-netload-p
1 root 20 0 1624 520 496 S 0.0 0.1 0:00.32 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.49 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.38 events/0
5 root 20 0 0 0 0 S 0.0 0.0 0:00.01 khelper
6 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
7 root 20 0 0 0 0 S 0.0 0.0 0:01.78 sync_supers
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 bdi-default
9 root 20 0 0 0 0 S 0.0 0.0 0:00.35 kblockd/0
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpid
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_notify
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_hotplug

~ $ ps ax
PID TTY STAT TIME COMMAND
1 ? Ss 0:00 init [3]
2 ? S 0:00 [kthreadd]
3 ? S 0:00 [ksoftirqd/0]
4 ? S 0:00 [events/0]
5 ? S 0:00 [khelper]
6 ? S 0:00 [async/mgr]
7 ? S 0:01 [sync_supers]
8 ? S 0:00 [bdi-default]
9 ? S 0:00 [kblockd/0]
10 ? S 0:00 [kacpid]
11 ? S 0:00 [kacpi_notify]
12 ? S 0:00 [kacpi_hotplug]
13 ? S 0:04 [ata/0]
14 ? S 0:00 [ata_aux]
15 ? S 0:00 [ksuspend_usbd]
16 ? S 0:00 [khubd]
17 ? S 0:00 [kseriod]
18 ? S 0:02 [kswapd0]
19 ? S 0:00 [aio/0]
20 ? S 0:00 [crypto/0]
24 ? S 0:08 [scsi_eh_0]
25 ? S 0:00 [scsi_eh_1]
26 ? S 0:00 [scsi_eh_2]
29 ? S 0:00 [scsi_eh_3]
32 ? S 0:00 [edac-poller]
33 ? S 0:00 [usbhid_resumer]
36 ? S 0:00 [jbd2/sda5-8]
37 ? S 0:00 [ext4-dio-unwrit]
129 ? S<s 0:00 /sbin/udevd --daemon
267 ? S 0:00 [kpsmoused]
500 ? S 0:04 [flush-8:0]
529 ? S 0:00 [reiserfs/0]
532 ? S 0:00 [xfs_mru_cache]
533 ? S 0:02 [xfslogd/0]
534 ? S 0:00 [xfsdatad/0]
535 ? S 0:00 [xfsconvertd/0]
536 ? S 0:00 [xfsbufd]
537 ? S 0:00 [xfsaild]
538 ? S 0:00 [xfssyncd]
539 ? S 0:00 [jbd2/sda8-8]
540 ? S 0:00 [ext4-dio-unwrit]
541 ? S 0:00 [jbd2/sda9-8]
542 ? S 0:00 [ext4-dio-unwrit]
543 ? S 0:00 [xfsbufd]
544 ? S 0:00 [xfsaild]
545 ? S 0:00 [xfssyncd]
546 ? S 0:00 [xfsbufd]
547 ? S 0:00 [xfsaild]
548 ? S 0:00 [xfssyncd]
3496 ? S 0:00 supervising syslog-ng
3497 ? Ss 0:00 /usr/sbin/syslog-ng
3560 ? Ss 0:00 /usr/sbin/acpid
3623 ? Ss 0:00 /bin/bash /opt/scripts/sbin/acpid-helper
3691 ? Ss 0:00 /usr/bin/dbus-daemon --system
3754 ? Ssl 0:00 /usr/sbin/console-kit-daemon
4525 ? SNs 0:00 /usr/bin/distccd --daemon --pid-file /var/run/distccd/distccd.pid --user distcc --port 3632 --log-level critical --all
4529 ? SN 0:00 /usr/bin/distccd --daemon --pid-file /var/run/distccd/distccd.pid --user distcc --port 3632 --log-level critical --all
4589 ? Ss 0:00 /usr/sbin/gpm -m /dev/input/mice -t ps2
4652 ? Ss 0:00 /usr/sbin/hald --use-syslog --verbose=no
4653 ? S 0:00 hald-runner
4654 ? SN 0:00 /usr/bin/distccd --daemon --pid-file /var/run/distccd/distccd.pid --user distcc --port 3632 --log-level critical --all
4676 ? S 0:00 hald-addon-input: Listening on /dev/input/event4 /dev/input/event3 /dev/input/event0 /dev/input/event1
4683 ? S 0:09 hald-addon-storage: polling /dev/sr0 (every 2 sec)
4700 ? S 0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
4720 ? SN 0:00 /usr/bin/distccd --daemon --pid-file /var/run/distccd/distccd.pid --user distcc --port 3632 --log-level critical --all
4819 ? Ss 0:00 /sbin/portmap
4884 ? Ss 0:00 /sbin/rpc.statd --no-notify
4946 ? S 0:00 [rpciod/0]
4954 ? Ss 0:00 /usr/sbin/rpc.mountd
4956 ? S 0:00 [lockd]
4957 ? S 0:00 [nfsd]
4958 ? S 0:00 [nfsd]
4959 ? S 0:00 [nfsd]
4960 ? S 0:00 [nfsd]
4961 ? S 0:00 [nfsd]
4962 ? S 0:00 [nfsd]
4963 ? S 0:00 [nfsd]
4964 ? S 0:00 [nfsd]
5077 ? Ssl 0:03 /usr/bin/mpd /etc/mpd.conf
5138 ? Ss 0:00 /usr/bin/mpdscribble --pidfile /var/run/mpdscribble.pid
5201 ? Ss 0:00 /usr/sbin/smbd -D
5210 ? S 0:00 /usr/sbin/smbd -D
5211 ? Ss 0:00 /usr/sbin/nmbd -D
5279 ? Ss 0:00 /usr/sbin/sshd
5409 ? Ss 0:00 /usr/sbin/cron
5475 ? Ss 0:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
5644 tty1 Ss 0:00 /bin/login --
5645 tty2 Ss+ 0:00 /sbin/agetty 38400 tty2 linux
5646 tty3 Ss+ 0:00 /sbin/agetty 38400 tty3 linux
5647 tty4 Ss+ 0:00 /sbin/agetty 38400 tty4 linux
5648 tty5 Ss+ 0:00 /sbin/agetty 38400 tty5 linux
5649 tty6 Ss+ 0:00 /sbin/agetty 38400 tty6 linux
5677 ? SNsl 0:01 /usr/bin/smbnetfs /home/spinal/net
5984 ? S< 0:00 /sbin/udevd --daemon
5985 ? S< 0:00 /sbin/udevd --daemon
10032 ? SNl 1:20 /opt/opera/lib/opera/10.10/opera -notrayicon
10042 ? SN 0:00 /usr/libexec/gconfd-2
10161 ? S 0:00 /usr/bin/inotifywait -qq -e close_write -e move -e delete_self /dev/shm/acpid.status
10169 ? SN 0:01 /usr/bin/smbnetfs /home/spinal/net
10460 ? RN 6:24 /usr/bin/mc -P /tmp/mc-spinal/mc.pwd.10447
10462 pts/1 SNs+ 0:00 bash -rcfile .bashrc
10784 ? SN 0:01 /usr/bin/Terminal
10785 ? SN 0:00 gnome-pty-helper
10786 pts/2 SNs+ 0:00 bash
10799 pts/3 SNs+ 0:00 bash
10833 pts/4 SNs 0:00 bash
10843 pts/4 SN+ 0:00 screen -r
10849 ? SN 0:00 /usr/bin/smbnetfs /home/spinal/net
10850 ? SN 0:00 /usr/bin/smbnetfs /home/spinal/net
10851 ? SN 0:00 /usr/bin/smbnetfs /home/spinal/net
10852 ? SN 0:00 /usr/bin/smbnetfs /home/spinal/net
10853 ? SN 0:00 /usr/bin/smbnetfs /home/spinal/net
10887 pts/6 SNs 0:00 bash
10897 pts/6 RN+ 0:00 ps ax
30056 ? SNs 0:00 SCREEN
30057 pts/5 SNs+ 0:00 -/bin/bash
30200 tty1 S+ 0:00 -bash
30327 ? SNs 0:00 /usr/bin/gdm
30333 ? SN 0:00 /usr/bin/gdm
30336 tty7 RNs+ 2:40 /usr/bin/X :0 -audit 0 -auth /var/gdm/:0.Xauth vt7
30355 ? SNs 0:00 /bin/sh /etc/xdg/xfce4/xinitrc -- /etc/X11/xinit/xserverrc
30371 ? SN 0:00 /usr/bin/dbus-launch --sh-syntax --exit-with-session
30372 ? SNs 0:00 /usr/bin/dbus-daemon --fork --print-pid 6 --print-address 9 --session
30377 ? SNs 0:00 /usr/bin/ssh-agent -- startxfce4
30384 ? SN 0:00 xscreensaver -no-splash
30389 ? SN 0:00 /usr/bin/xfce4-session
30391 ? SN 0:00 /usr/libexec/xfconfd
30395 ? SN 0:00 xfsettingsd
30397 ? SN 0:02 xfwm4
30399 ? SN 0:06 xfce4-panel
30401 ? SN 0:00 Thunar --daemon
30403 ? SN 0:02 xfdesktop
30405 ? SN 0:00 /usr/libexec/gam_server
30406 ? SN 0:00 /usr/libexec/xfce4/panel-plugins/xfce4-menu-plugin socket_id 16777244 name xfce4-menu id 5 display_name Меню Xfce size
30414 ? SN 0:00 /bin/bash /home/spinal/.config/autorun/autorun.sh
30418 ? SN 0:01 stardict
30419 ? SN 0:00 xfce4-settings-helper
30424 ? SN 0:00 /usr/libexec/xfce4/panel-plugins/xfce4-mpc-plugin socket_id 16777252 name xfce4-mpc-plugin id 12679081321 display_name
30425 ? SNl 0:00 /usr/libexec/xfce4/panel-plugins/xfce4-mixer-plugin socket_id 16777253 name xfce4-mixer-plugin id 125438493015 display
30426 ? SN 0:01 /usr/libexec/xfce4/panel-plugins/xfce4-netload-plugin socket_id 16777254 name netload id 125438500817 display_name Net
30427 ? SN 0:00 /usr/libexec/xfce4/panel-plugins/xfce4-genmon-plugin socket_id 16777256 name genmon id 125438385110 display_name Gener
30428 ? SN 0:00 /usr/libexec/xfce4/panel-plugins/xfce4-notes-plugin socket_id 16777257 name xfce4-notes-plugin id 12595153000 display_
30429 ? SN 0:05 /usr/libexec/xfce4/panel-plugins/xfce4-cpugraph-plugin socket_id 16777258 name cpugraph id 12582873440 display_name CP
30430 ? SN 0:01 /usr/libexec/xfce4/panel-plugins/xfce4-time-out-plugin socket_id 16777259 name xfce4-time-out-plugin id 12744763684 di
30431 ? SN 0:01 /usr/libexec/xfce4/panel-plugins/orageclock socket_id 16777260 name orageclock id 12543837287 display_name Часы с дато
30439 ? SN 0:00 /bin/bash /home/spinal/.config/autorun/autorun.sh
30442 ? SN 0:00 trix
comment:4 Changed 14 years ago by ossi
it would be probably much more helpful to attach first strace (to see whether it is looping around some syscalls) and then gdb to the process.
comment:5 Changed 14 years ago by Spinal
Hello, Ossi.
Could you please tell me what I should do with gdb?
I'm a complete noob with it...
comment:6 Changed 14 years ago by angel_il
1. Run mc.
2. Wait for mc to hang.
3. In another terminal, run 'top' or 'ps ax' and copy the PID of mc.
4. Start gdb -p PID_of_mc.
5. Run bt (copy/paste the output).
comment:7 Changed 14 years ago by Spinal
Okay. Here's the strace's output:
...
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99992})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99992})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
...
And here's gdb's:
~ $ gdb -p 28087
warning: Can not parse XML syscalls information; XML support was disabled at compile time.
GNU gdb (Gentoo 7.0.1 p1) 7.0.1
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Attaching to process 28087
Reading symbols from /usr/bin/mc...(no debugging symbols found)...done.
Reading symbols from /lib/libext2fs.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libext2fs.so.2
Reading symbols from /lib/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /lib/libgpm.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgpm.so.1
Reading symbols from /lib/libncursesw.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libncursesw.so.5
Reading symbols from /usr/lib/libgmodule-2.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgmodule-2.0.so.0
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libglib-2.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libglib-2.0.so.0
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libncurses.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/libnss_compat.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_nis.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/lib/libX11.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libX11.so
Reading symbols from /usr/lib/libxcb.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libxcb.so.1
Reading symbols from /usr/lib/libXau.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXau.so.6
Reading symbols from /usr/lib/libXdmcp.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXdmcp.so.6
0xb77f9424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb77f9424 in __kernel_vsyscall ()
#1 0xb75f98fd in select () from /lib/libc.so.6
#2 0x080ad5fd in try_channels ()
#3 0x080ae4da in tty_get_event ()
#4 0x0806116d in run_dlg ()
#5 0x08097130 in main ()
(gdb)
~ $ file /usr/bin/mc
/usr/bin/mc: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped
The mc binary is not stripped but gdb says
Reading symbols from /usr/bin/mc...(no debugging symbols found)...done.
I'm now going to recompile mc with the -ggdb gcc option enabled to see if it makes any difference.
Please let me know if there's anything else I should consider doing.
comment:10 Changed 14 years ago by slyfox
It looks like an mc bug in the (mis)handling of file descriptor exceptions. I wonder what forces the input fd to close or 'error out' exactly on suspend/resume.
Do you use any sort of terminal multiplexers? (screen or something like that)
comment:11 Changed 14 years ago by slyfox
At least I seem to have found an easy way to reproduce the bug (many thanks to ossi):
$ mc
# attach to it in another session and close fd=4
$ gdb -p $mc_pid
call close(4)
quit
The poor mc process starts to eat CPU in a dead loop.
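A minimal C sketch of the likely mechanism behind this reproduction, assuming the event loop keeps the closed descriptor in its select() set: select() then fails immediately with EBADF instead of sleeping, so a loop that does not check for that error never blocks. The helper name `select_errno_on_closed_fd` is hypothetical and is not part of mc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not mc code): returns the errno that select()
 * reports when asked to watch an already-closed descriptor, 0 if select()
 * unexpectedly succeeds, or -1 if the pipe could not be created. */
int select_errno_on_closed_fd(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int stale = fds[0];
    assert(stale >= 0 && stale < FD_SETSIZE);   /* FD_SET precondition */
    close(fds[0]);                              /* like "call close(4)" in gdb */
    close(fds[1]);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(stale, &readable);

    /* Same 100 ms timeout as in the strace; it is never waited out,
     * because the call fails right away. */
    struct timeval timeout = { 0, 100000 };
    int v = select(stale + 1, &readable, NULL, NULL, &timeout);
    return v < 0 ? errno : 0;
}
```

If the caller only handles `v > 0` and retries otherwise, every iteration returns at once with an error, which would match the dead loop seen after closing fd 4.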
comment:12 follow-up: ↓ 13 Changed 14 years ago by Spinal
I only use screen to run rtorrent. This is not connected to the bug.
What does file descriptor #4 point to?
Could you tell me how I can check whether that's actually the cause of the bug in my case?
comment:13 in reply to: ↑ 12 Changed 14 years ago by slyfox
Replying to Spinal:
I only use screen to run rtorrent. This is not connected to the bug.
What does file descriptor #4 point to?
Could you tell me how I can check whether that's actually the cause of the bug in my case?
According to the whole strace log, it's the descriptor of the subshell spawned by mc (the one available via Ctrl+O). Can you run mc under strace before suspend and then reproduce the hangup?
It should look like:
$ strace -omc-log.strace mc
# reproduce the bug
# attach mc-log.strace here
I'd like to see the whole log to understand what caused the tty close/break.
And I'd like to see your exact linux kernel version:
$ uname -a
comment:14 Changed 14 years ago by Spinal
It's a pity, but I don't know how to reproduce the bug. It occurs about once a week, i.e. pretty rarely, even though I use mc extensively. And I'm not really sure that it's connected to suspend. It may be just a coincidence.
~ $ uname -a
Linux supervisor 2.6.32-gentoo-r7 #2 PREEMPT Wed May 26 22:50:15 EEST 2010 i686 Intel(R) Celeron(R) CPU 2.66GHz GenuineIntel GNU/Linux
So, how can I do the needed checks, given how rare this bug is?
comment:15 Changed 14 years ago by slyfox
Ah, i thought it was not so rare. I think we have enough info available at least to fix the dead loop.
So we can let the root cause live in the tree for a while :]
comment:16 Changed 14 years ago by Zenith88
- Keywords high cpu added
- Version changed from 4.7.2 to 4.7.0
Similar problem, but not related to suspend/resume. In my system mc sometimes 'disconnects' from terminal and consumes 100% CPU. Exiting from mc via F10 does not help - the mc process remains running and only killing it helps. I remember noticing that for almost 10 years since Slackware 2.
That happens under X term and in console. Don't know what steps to take to reproduce - it feels absolutely random.
comment:17 follow-ups: ↓ 18 ↓ 19 Changed 14 years ago by andrew_b
Try to compile mc without gpm support. Or simply switch off the gpm service.
comment:18 in reply to: ↑ 17 Changed 14 years ago by Spinal
Replying to andrew_b:
Try to compile mc without gpm support. Or simply switch off the gpm service.
Ok. I will notify you if the problem occurs again (with gpm disabled).
If there's no answer in two months, please consider the solution helpful.
Thanks for the advice.
comment:19 in reply to: ↑ 17 Changed 14 years ago by Spinal
- Version changed from 4.7.0 to 4.7.4
- Milestone changed from 4.7 to 4.7.5
Try to compile mc without gpm support. Or simply switch off the gpm service.
That didn't help.
Installed version: mc-4.7.4-r1(13:04:56 25.09.2010)(X edit nls -gpm -samba -slang)
The same behaviour occurred. Here's a copy of the strace (this output repeats in an infinite loop):
...
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99993})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
...
What did I do? I just launched smplayer by pressing Enter on an avi file. Then, when the movie finished, I saw 100% CPU load. It was Midnight Commander again, even though I had closed all mc instances.
So this time it wasn't connected to gpm (mc is compiled with the -gpm USE flag).
Only "killall mc" helped.
comment:20 follow-up: ↓ 21 Changed 14 years ago by andrew_b
- Milestone changed from 4.7.5 to 4.7
Please build mc with full debug info. When mc consumes 100% CPU, attach to the mc process using gdb -p and step through the code where the infinite loop occurs. Then post your results here. Thanks!
comment:21 in reply to: ↑ 20 ; follow-up: ↓ 22 Changed 14 years ago by Spinal
Please build mc with full debug info. When mc consumes 100% CPU, attach to the mc process using gdb -p and step through the code where the infinite loop occurs.
Thanks for the answer, Andrew.
Could you please advise me on how to do these steps properly:
1) build mc with full debug info
2) walk step-by-step through the code with gdb
I'm a Gentoo user, if that matters.
Thanks again!
comment:22 in reply to: ↑ 21 Changed 14 years ago by Spinal
FORGET MY LAST POST :)
I've found this article: http://www.unknownroad.com/rtfm/gdbtut/gdbinfloop.html
Probably it should help.
Sorry for being so naive and lame. Googling is not my strong suit ))
Replying to Spinal:
Please build mc with full debug info. When mc consumes 100% CPU, attach to the mc process using gdb -p and step through the code where the infinite loop occurs.
Thanks for the answer, Andrew.
Could you please advise me on how to do these steps properly:
1) build mc with full debug info
2) walk step-by-step through the code with gdb
I'm a Gentoo user, if that matters.
Thanks again!
comment:23 Changed 14 years ago by Spinal
(gdb) bt
#0 0xb76ea424 in __kernel_vsyscall ()
#1 0xb74d184d in select () from /lib/libc.so.6
#2 0x080ac8ab in try_channels (set_timeout=<value optimized out>) at key.c:589
#3 0x080ad445 in getch_with_delay (event=0xbf92fc50, redo_event=0, block=1) at key.c:661
#4 tty_get_event (event=0xbf92fc50, redo_event=0, block=1) at key.c:1684
#5 0x08061718 in frontend_run_dlg (h=0x9a70358) at dialog.c:1043
#6 run_dlg (h=0x9a70358) at dialog.c:1075
#7 0x0809619f in create_panels_and_run_mc (argc=3, argv=0xbf92fe64) at main.c:1716
#8 do_nc (argc=3, argv=0xbf92fe64) at main.c:1798
#9 main (argc=3, argv=0xbf92fe64) at main.c:2048
(gdb) next
Single stepping until exit from function __kernel_vsyscall,
which has no line number information.
0xb74d184d in select () from /lib/libc.so.6
(gdb) next
Single stepping until exit from function select,
which has no line number information.
try_channels (set_timeout=<value optimized out>) at key.c:590
590 if (v > 0) {
(gdb) next
591 check_selects (&select_set);
(gdb) next
592 if (FD_ISSET (input_fd, &select_set))
(gdb) next
596 }
(gdb) next
tty_get_event (event=0xbf92fc50, redo_event=0, block=1) at key.c:1684
1684 c = block ? getch_with_delay () : get_key_code (1);
(gdb) next
And then gdb hangs here doing nothing and consuming 100% cpu.
Will this be helpful?
Should I do something else to debug this?
Thank you.
comment:24 Changed 14 years ago by zaytsev
- Cc zaytsev, slyfox added
Hi! Sorry, we are busy with release preps but thank you for helping with debugging this issue. Maybe Sly has something to say about it. We'll get back to it later.
comment:25 Changed 14 years ago by andrew_b
- Component changed from mc-core to mc-tty
Some additional info can be found in #2416.
comment:26 follow-up: ↓ 40 Changed 14 years ago by Spinal
Please disregard my explanation of the bug! It's not connected to suspend to ram.
It's more common than I thought. I experienced this bug on my n900 about 2 times last month.
I suddenly discovered that my battery was draining faster than usual.
"top" showed me that it was ... yes, it was midnight commander. Killall fixed things.
But it's interesting that I don't see many comments here from other mc users...
Am I a magnet for that bug or something? :-)
P.S. The packaged mc version is 4.7.4-maemo3 on the phone.
comment:27 follow-up: ↓ 28 Changed 14 years ago by angel_il
try building MC without subshell support
comment:28 in reply to: ↑ 27 Changed 14 years ago by Spinal
Replying to angel_il:
try building MC without subshell support
And what's the reason?
1) I don't know how to build software for the phone.
I use binary (.deb) packages prepared by maemo community.
2) (More important) I use subshell actively. What's the point of removing it?
3) I don't know how to reproduce bug. It's reproduced randomly.
comment:29 Changed 14 years ago by angel_il
just do it, don't ask me 'why' :) i want to know whether the bug still reproduces if the subshell is disabled.
1) I don't know how to build software for the phone.
99% - bug in mc.
comment:30 Changed 14 years ago by Spinal
As I stated above "I don't know how to reproduce bug. It's reproduced randomly"
I cannot work without a subshell for a week or two, waiting to see whether the bug appears.
Midnight commander without a subshell is a useless thing, IMHO.
comment:31 Changed 14 years ago by slyfox
- Status changed from new to assigned
- Owner set to slyfox
- severity changed from no branch to on review
Created branch:2244_busy_loop
aka changeset:8de43bfa2c776a6142665cd78cb94b39617e5038
I don't guarantee the patch fixes this exact problem, but it will not hurt in any way.
I think (not sure) Spinal's case is the following:
- He closes the terminal window mc is running in
- this causes the subshell (and its descriptor) to die
- signal delivery is:
  - too late or
  - absent or
  - SIGCHLD is blocked or
  - something else (it's the major thing to find out)
- and mc is fast enough to call select() on that invalid descriptor.
So we get a busy loop. A full strace log (from the very start of mc) would certainly help.
Please review.
comment:32 follow-up: ↓ 37 Changed 14 years ago by ossi
that should be an else-if, to make it clear that no fall-through from the v > 0 is possible.
anyway, while i agree that the patch won't hurt, it makes plain no sense in this context. the strace indicates clearly that the select does not fail (and it never will, unless an FD is actually actively closed somewhere or the system is in real trouble). as i said five months ago, the problem is that the EOF (on stdin) is not handled.
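A minimal C sketch of the pattern in the strace, assuming a pipe with its write end closed stands in for mc's stdin after the terminal goes away: the descriptor stays "readable" forever and every read() returns 0, so select() never sleeps. The helper name `count_eof_wakeups` is hypothetical and is not part of mc.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not mc code): on a pipe whose writer is gone, count
 * how many of `rounds` consecutive select()+read() rounds report the
 * descriptor as readable and then read 0 bytes (EOF).
 * Returns -1 if the pipe could not be created. */
int count_eof_wakeups(int rounds)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    assert(fds[0] >= 0 && fds[0] < FD_SETSIZE);  /* FD_SET precondition */
    close(fds[1]);              /* no writer left: the reader is at EOF */

    int wakeups = 0;
    for (int i = 0; i < rounds; i++) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fds[0], &readable);

        /* Same 100 ms timeout as in the strace. */
        struct timeval timeout = { 0, 100000 };
        int v = select(fds[0] + 1, &readable, NULL, NULL, &timeout);
        if (v <= 0 || !FD_ISSET(fds[0], &readable))
            break;              /* error or timeout: not the EOF pattern */

        char c;
        if (read(fds[0], &c, 1) != 0)
            break;              /* got data or an error, not EOF */
        wakeups++;
    }
    close(fds[0]);
    return wakeups;
}
```

Every round is expected to count as a wakeup, which is the `select(...) = 1 (in [0])` followed by `read(0, "", 1) = 0` sequence from the log: select() succeeds each time, so error handling on select() alone would not stop the loop.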
comment:36 Changed 14 years ago by slavazanko
- Votes for changeset changed from andrew_b to andrew_b slavazanko
- severity changed from on review to approved
Branch rebased to new master, changeset:0880dbcea223f3c2357606ccd2ae2ee2d1b7d08e
comment:37 in reply to: ↑ 32 Changed 14 years ago by slyfox
- Votes for changeset andrew_b slavazanko deleted
- severity changed from approved to on rework
Replying to ossi:
that should be an else-if, to make it clear that no fall-through from the v > 0 is possible.
Agreed, will amend.
anyway, while i agree that the patch won't hurt, it makes plain no sense in this context. the strace indicates clearly that the select does not fail (and it never will, unless an FD is actually actively closed somewhere or the system is in real trouble). as i said five months ago, the problem is that the EOF (on stdin) is not handled.
Oh my, right. I'm not sure how smplayer managed to close mc's stdin though.
I tried to address some nasty issue (which i can't reproduce with hangup) happening when one closes terminal window.
andrew_b slavazanko
Guys, i am removing your votes as I'll have to fix the EOF issue as well.
Might last for a while so block on #2409 can be dropped at some time.
Sorry.
comment:38 Changed 14 years ago by slyfox
Another theory: Spinal's terminal sometimes does not send SIGHUP to mc.
Steps to reproduce another hangup:
- apply the following patch (mc ignores SIGHUP signal)
diff --git a/lib/utilunix.c b/lib/utilunix.c
index 5d0a207..3f7b8b2 100644
--- a/lib/utilunix.c
+++ b/lib/utilunix.c
@@ -215,6 +215,7 @@ my_system (int flags, const char *shell, const char *command)
         signal (SIGQUIT, SIG_DFL);
         signal (SIGTSTP, SIG_DFL);
         signal (SIGCHLD, SIG_DFL);
+        signal (SIGHUP, SIG_DFL);
 
         if (flags & EXECUTE_AS_SHELL)
             execl (shell, shell, "-c", command, (char *) NULL);
diff --git a/src/main.c b/src/main.c
index 7b54789..6fb9e7c 100644
--- a/src/main.c
+++ b/src/main.c
@@ -522,6 +522,15 @@ main (int argc, char *argv[])
 #endif /* HAVE_SUBSHELL_SUPPORT */
     mc_prompt = (geteuid () == 0) ? "# " : "$ ";
 
+    {
+        struct sigaction ignore;
+        ignore.sa_handler = SIG_IGN;
+        sigemptyset (&ignore.sa_mask);
+        ignore.sa_flags = 0;
+
+        sigaction (SIGHUP, &ignore, NULL);
+    }
+
     /* Program main loop */
     if (!midnight_shutdown)
         do_nc ();
- build mc with ncurses (slang seems to handle it properly)
F="$F -ggdb3" ../mc/configure --prefix=$(pwd)/_mc-bin \
    --with-samba \
    --with-mcserver \
    --enable-charset \
    --enable-extcharset \
    --enable-maintainer-mode \
    --with-screen=ncurses \
    && make CFLAGS="$F" && make install -j 3
- run ./_mc-bin/bin/mc in xterm and close xterm's window
- contemplate 100% CPU load
strace log:
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99993})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99994})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99993})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99993})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99994})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99993})
gdb log:
#0 0x00007f5c76c348f3 in __select_nocancel () from /lib/libc.so.6
#1 0x0000000000442764 in try_channels (set_timeout=1) at ../../../mc/lib/tty/key.c:609
#2 0x00000000004429e9 in getch_with_delay () at ../../../mc/lib/tty/key.c:693
#3 0x00000000004447ba in tty_get_event (event=0x7fff8d16a500, redo_event=0, block=1) at ../../../mc/lib/tty/key.c:1877
#4 0x000000000044839e in frontend_run_dlg (h=0x1d65fd0) at ../../../mc/lib/widget/dialog.c:527
#5 0x000000000044952d in run_dlg (h=0x1d65fd0) at ../../../mc/lib/widget/dialog.c:1145
#6 0x000000000048af11 in create_panels_and_run_mc () at ../../../mc/src/filemanager/midnight.c:883
#7 0x000000000048c538 in do_nc () at ../../../mc/src/filemanager/midnight.c:1606
#8 0x0000000000422e75 in main (argc=<value optimized out>, argv=<value optimized out>) at ../../mc/src/main.c:536
Looks similar, eh?
comment:40 in reply to: ↑ 26 Changed 14 years ago by zap
But it's interesting that I don't see many comments here from other mc users...
Am I a magnet for that bug or something? :-)
You're not a magnet. I experienced the same bug when porting mc to the N900 and reported it here:
http://www.midnight-commander.org/ticket/2416
but it was marked as a duplicate of this bug (that ticket also contains debug info, which may shed additional light on the problem). The bug is triggered when you close the terminal without quitting mc first. This will quickly drain your battery, so it's a very serious bug for a smartphone.
Please use version 4.6.2-pre1-1maemo10 (the latest stable mc port); it's old but doesn't have this (and many other) bugs.
comment:41 Changed 14 years ago by zaytsev
Hi! Is there an N900 emulator or something? We don't have an N900, so, you know, it's not easy to fix something if you only have strange backtraces, can't reproduce the bug yourself, and can't check whether it is fixed or not.
comment:42 follow-up: ↓ 44 Changed 14 years ago by ossi
oh, c'mon, don't be silly. the bug is rather obviously that get_key_code() returns -1 on EOF, which is interpreted as "try again" by getch_with_delay(). or something very similar. i found that after three minutes of just looking over the code, so it can't be that hard to find the problem when you do an actual review.
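A minimal C sketch of the distinction described above, under the assumption that the fix is to give EOF its own return code instead of folding it into "no key yet". The names `INPUT_RETRY`, `INPUT_EOF`, `read_key` and `wait_for_key` are hypothetical; this is not the code of mc's get_key_code() or getch_with_delay().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical result codes; not mc's real ones. */
enum { INPUT_RETRY = -1, INPUT_EOF = -2 };

/* Read one byte from fd. Returns the byte (0..255), INPUT_RETRY if the
 * read was interrupted or would block, or INPUT_EOF if input is gone. */
int read_key(int fd)
{
    unsigned char c;
    ssize_t n = read(fd, &c, 1);
    if (n == 1)
        return c;
    if (n == 0)
        return INPUT_EOF;       /* peer closed: retrying can never succeed */
    if (errno == EINTR || errno == EAGAIN)
        return INPUT_RETRY;
    return INPUT_EOF;           /* hard error: treat like EOF */
}

/* Block until a key arrives or the input is gone. Only INPUT_RETRY sends
 * the loop back to select(); EOF ends it instead of spinning. */
int wait_for_key(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);         /* FD_SET precondition */
    for (;;) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        if (select(fd + 1, &readable, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return INPUT_EOF;   /* e.g. EBADF: the descriptor is unusable */
        }
        int key = read_key(fd);
        if (key != INPUT_RETRY)
            return key;
    }
}
```

A caller would treat `INPUT_EOF` as a reason to shut down cleanly, much as it would after a hangup signal.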
comment:43 Changed 14 years ago by zaytsev
Is this "don't be silly" comment for me? WTF? Go talk like this to someone who appreciates it.
comment:44 in reply to: ↑ 42 Changed 14 years ago by slyfox
Replying to ossi:
oh, c'mon, don't be silly. the bug is rather obviously that get_key_code() returns -1 on EOF, which is interpreted as "try again" by getch_with_delay(). or something very similar. i found that after three minutes of just looking over the code, so it can't be that hard to find the problem when you do an actual review.
That's cool and I agree mc needs a workaround, but I'd also like to know who stole SIGHUP.
It could easily be a flaw in n900 terminal.
comment:45 Changed 14 years ago by ossi
zaytsev: you don't have to appreciate it, as it was an expression of *my* disappreciation for your somewhat unconvincing approach to this problem.
slyfox: good point. otoh, some sources i found indicate that only the session leader (which would be the shell mc was started from) receives the hangup, and propagating it to the children is part of the shell's job control - which can be intentionally suppressed (nohup or disown -h) or could fail, for example, if the shell simply died (which may even be the reason for the terminal exiting in the first place).
so i think we have an explanation and a tentative solution.
ps: merry xmas! :)
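A minimal C sketch of the tentative solution, assuming the event loop should stop on either signal: a delivered SIGHUP or EOF on stdin, so that a hangup nobody forwards is still noticed. The names `hangup_seen`, `install_hangup_handler` and `terminal_is_gone` are hypothetical and are not part of mc.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t makes the store async-signal-safe. */
static volatile sig_atomic_t hangup_seen = 0;

static void on_hangup(int signo)
{
    (void) signo;
    hangup_seen = 1;
}

/* Hypothetical helper: install the SIGHUP handler. Returns 0 on success. */
int install_hangup_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_hangup;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

/* Hypothetical predicate for an event loop: the terminal counts as gone if
 * SIGHUP arrived OR the last read() on stdin returned 0 (EOF). Checking
 * both covers the case where the shell never propagates the hangup. */
int terminal_is_gone(long last_read_result)
{
    assert(last_read_result >= -1);     /* read() returns -1 or a count */
    return hangup_seen || last_read_result == 0;
}
```

With this shape, a process started under nohup or left behind by a dead shell would still leave its loop once read() reports EOF.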
comment:46 Changed 14 years ago by angel_il
Orthodox Christmas on Jan. 7:)
comment:47 Changed 14 years ago by zaytsev
I am not considering the technical merits of your input; however, I find that you repeatedly range from boorish to plain rude in expressing your opinions. It might be that being harsh to each other to pass for a 1337 h4x0r is the norm in the other open source communities that you are involved in, or a subtle trait of your personality which gives it such a unique touch, but unfortunately I couldn't care less.
Consequently, by saying that your comments are not appreciated I was trying to politely indicate that maybe you should for once drop your mentor tone and force yourself to try to be a bit more convivial, if not friendly. Your attitude is considered intimidating not only by me, but also by the other members of the group; therefore, if you absolutely want to stick to it, it would be better if you kept your comments to yourself. Am I making myself clear enough now?
comment:48 Changed 14 years ago by ossi
ilya: accumulate the wishes for later then. :D
zyv: and being "part of the group" allows you to be exactly that? sorry, but the irony of *you* trying to teach me good manners borders on grotesque. have *i* made myself clear enough now?
comment:49 Changed 14 years ago by zaytsev
Using your own words, I don't expect you to appreciate it: if you see nothing wrong with your behavior there's frankly not much that I can do. The allusion to other group members was to keep you from being deluded that I am the only one who finds your tone irritating and hence that it is my personal problem.
You are more than welcome to create your own community and express yourself in every imaginable way that you think you should. However, if you want to keep commenting on this specific trac you are expected to behave at least neutrally in the eyes of those whom you are addressing. I think this is a reasonable requirement.
I hope you have enough self-esteem not to engage in a follow-up discussion on how I am expected to enforce it.
comment:50 Changed 14 years ago by ossi
I think this is a reasonable requirement.
i just wonder why you think it doesn't apply to you. or do you really not notice how aggressive, sarcastic and plain rude you often are to the bug reporters and sometimes your team mates?
and don't get me wrong: it's my philosophy to be just that towards those who are taxing my patience. but if you do that, you better make damn sure that it's not *you* who is wasting others' time.
comment:51 follow-up: ↓ 52 Changed 14 years ago by zaytsev
I don't think that the policies that I'm advocating do not equally apply to me; however, to my mind, my own behavior is within the realms of the acceptable. You have a track record of getting on my nerves for more than one year on different occasions. Now you present this as your consistent response to my aggression, but I can't see how I could have triggered your rudeness to me and others in the first place, when you were initially commenting on messages that were not addressed to you directly in any way.
Your usual communication tactics are:
(1) Fish out something from the commit list and reply back something along the lines of "wtf, how this could have even been committed? of course anyone with the slightest clue would have done X and Y instead" (which obviously implies that you have the clue, but the day-to-day routine has to be performed by lower-grade programmers and you will be sending them your directions when they really irritate you with their lack of competence beyond what you can handle)
(2) Go through the bugs backlog and wonder how come this was not noticed, or that has not been done yet, while for a person with your qualification this would have only taken a few moments. Sometimes, you have brought up points years ago on the mailing list with former developers, and yet nobody fixed it! Of course, anyone should be devoted to fixing things that annoy you in the first place. If one wants to work on mc, one has to be doing what you think is needed.
(3) Chime into a discussion on the trac and declare that the only true way to implement X is by doing Y and Z. Obviously, anyone that dares to disagree with you is an idiot. Moreover, this idiot has to implement it for you the way you want it to, because you have already pointed him the right way.
(4) Whenever, once in a while, you are in the mood for writing some divine code, you attach it to the ticket saying something like "Fix it!". Roger, sir! Of course, this was a spark of humor, which stupid slaves did not get.
So am I really the one that provokes you doing this all the time? Maybe I need to see a therapist.
The explanation I came up with is that the emphasis on your superiority comes naturally without any ulterior motives and you are sincerely surprised by subsequent reactions, but it doesn't make it any more pleasant.
Maybe your close friends that know all your virtues IRL will appreciate you calling them silly, but not me, sorry. And I don't think that my provocations are the reason for your lack of positive communication.
comment:52 in reply to: ↑ 51 Changed 14 years ago by Spinal
Sorry for my 50 cents but...
That really seems ridiculous, reading comments about manners on this tracker.
Is there any update on the bug? Is it going to be fixed?
comment:53 Changed 14 years ago by zap
- I can provide a remote SSH session and assistance to any mc developer that wishes to debug the problem on my N900. Contact me via jabber zap#jabber.ozerki.net if you want it.
- I will try to debug the problem myself; I'm just not familiar with the inner workings of mc. Now that I know someone must catch SIGHUP, I can set some breakpoints and see what happens.
comment:54 follow-up: ↓ 55 Changed 14 years ago by Shareth
100% way to reproduce it:
- open Konsole
- sudo bash
- mc
- close Konsole (Quit from menu or just by pressing X on window).
Works only after sudo but I guess the problem is not superuser itself.
The exact same thing happens if you run 'top' instead of 'mc' - top eats 100% cpu after closing Konsole. On the other hand htop in the same circumstances closes gracefully.
mc --version
GNU Midnight Commander 4.7.0.3
emerge -pv mc
app-misc/mc-4.7.0.3 USE="X edit gpm nls samba -slang"
uname -a
Linux ruf-gentoo 2.6.36-gentoo-r5-1 #1 SMP PREEMPT Thu Dec 23 12:55:11 MSK 2010 x86_64 Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz GenuineIntel GNU/Linux
comment:55 in reply to: ↑ 54 Changed 14 years ago by angel_il
Replying to Shareth:
100% way to reproduce it:
- open Konsole
- sudo bash
- mc
- close Konsole (Quit from menu or just by pressing X on window).
Works only after sudo but I guess the problem is not superuser itself.
The exact same thing happens if you run 'top' instead of 'mc' - top eats 100% cpu after closing Konsole. On the other hand htop in the same circumstances closes gracefully.
mc --version
GNU Midnight Commander 4.7.0.3
emerge -pv mc
app-misc/mc-4.7.0.3 USE="X edit gpm nls samba -slang"
uname -a
Linux ruf-gentoo 2.6.36-gentoo-r5-1 #1 SMP PREEMPT Thu Dec 23 12:55:11 MSK 2010 x86_64 Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz GenuineIntel GNU/Linux
Shareth, thanks, I'll try installing Konsole to reproduce it.
comment:56 Changed 13 years ago by sorath
- Cc torohov_s_a@… added
- Branch state set to no branch
The bug is still reproducible on Gentoo Linux with the mc-4.7.5.2 (current stable), mc-4.7.5.5 and mc-4.8.0 (current masked) releases.
comment:57 Changed 13 years ago by sorath
Sorry, it seems that while adding myself to the Cc list I left the default "Branch state: no branch" option selected, and it changed the branch state from, probably, the one mentioned above ("severity changed from approved to on rework") to "no branch".
comment:59 Changed 13 years ago by petertux
- Cc petre.rodan@… added
hi
I'm another gentoo user confronted with this bug on both x86 and amd64. mc went into 100%cpu multiple times per day and today I got fed up with it and decided to investigate.
it is extremely easy to replicate: start xterm and mc inside it. close xterm. voila. nothing else is needed.
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NUL
mc was built with ncurses and without slang:
[ebuild R ] app-misc/mc-4.7.5.2 USE="X ncurses -edit -gpm -nls -samba -slang" 0 kB
[ebuild R ] sys-libs/ncurses-5.7-r7 USE="unicode -ada -cxx -debug -doc -gpm -minimal -profile -static-libs -trace" 2,388 kB
if I compile it with slang and without ncurses:
[ebuild R ] app-misc/mc-4.7.5.2 USE="X slang -edit -gpm -ncurses -nls -samba" 0 kB
the bug disappears.
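The strace pattern above (select() keeps reporting fd 0 as readable, read() keeps returning 0) is what end-of-file looks like on a descriptor. A small sketch of that behaviour, using a pipe as a stand-in for the closed terminal; `wait_and_read` and the `read_result` values are hypothetical names for illustration and are not taken from mc:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

enum read_result { READ_ERROR = -1, READ_EOF = 0, READ_BYTE = 1 };

/* Block in select() until fd is readable, then read a single byte.
 * At end of file select() still reports the fd as readable, and
 * read() returns 0; that case must be reported, not retried. */
enum read_result wait_and_read(int fd, char *out)
{
    fd_set set;
    ssize_t n;
    int rc;

    assert(out != NULL);

    do {
        FD_ZERO(&set);
        FD_SET(fd, &set);
        rc = select(fd + 1, &set, NULL, NULL, NULL);
    } while (rc < 0 && errno == EINTR);   /* restart if a signal interrupted us */

    if (rc < 0)
        return READ_ERROR;

    n = read(fd, out, 1);
    if (n < 0)
        return READ_ERROR;
    if (n == 0)
        return READ_EOF;                  /* peer closed: input will never arrive */
    return READ_BYTE;
}
```

Calling this repeatedly after `READ_EOF` returns immediately every time, which is how a loop that ignores the zero-length read ends up at 100% CPU.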
comment:60 follow-up: ↓ 61 Changed 11 years ago by ginggs
I am able to reproduce this with MC 4.8.10 on Ubuntu Raring.
Download and unpack the Debian source package for mc/3:4.8.10-2.
Change line 31 of debian/rules from:
--with-screen=slang \
to:
--with-screen=ncurses \
and line 14 of debian/control from:
,libslang2-dev
to:
,libncurses-dev
Build and install the package.
Start gnome-terminal or xterminal.
Run mc as root (sudo mc), bug does not occur as normal user.
Close the terminal window.
Mc process continues to run, consuming 100% CPU.
comment:61 in reply to: ↑ 60 ; follow-up: ↓ 62 Changed 11 years ago by slyfox
Replying to ginggs:
Start gnome-terminal or xterminal.
Run mc as root (sudo mc), bug does not occur as normal user.
Close the terminal window.
Mc process continues to run, consuming 100% CPU.
May I ask you to get a bit of strace log from the process when it's in such a state?
strace -p $pid -o log
A few lines should be enough to see where we don't handle errors on tty in/out.
And may I ask you to attach to it with gdb and get a backtrace?
This needs debugging symbols.
gdb -p $pid
bt full
Thanks!
comment:62 in reply to: ↑ 61 Changed 11 years ago by Spinal
Replying to slyfox:
Replying to ginggs:
Start gnome-terminal or xterminal.
Run mc as root (sudo mc), bug does not occur as normal user.
Close the terminal window.
Mc process continues to run, consuming 100% CPU.
May I ask you to get a bit of strace log from the process when it's in such a state?
strace -p $pid -o log
A few lines should be enough to see where we don't handle errors on tty in/out.
And may I ask you to attach to it with gdb and get a backtrace?
This needs debugging symbols.
gdb -p $pid
bt full
Thanks!
Hi, Slyfox.
I got this from mc-4.8.10.
This is gdb output:
GNU gdb (Gentoo 7.5.1 p2) 7.5.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Attaching to process 2373
Reading symbols from /usr/bin/mc...done.
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Reading symbols from /lib/libncursesw.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libncursesw.so.5
Reading symbols from /lib/libext2fs.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libext2fs.so.2
Reading symbols from /usr/lib/libgmodule-2.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgmodule-2.0.so.0
Reading symbols from /usr/lib/libglib-2.0.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libglib-2.0.so.0
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_compat.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_nis.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/lib/libX11.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libX11.so
Reading symbols from /usr/lib/libxcb.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libxcb.so.1
Reading symbols from /usr/lib/libXau.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXau.so.6
Reading symbols from /usr/lib/libXdmcp.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libXdmcp.so.6
0xb770c424 in __kernel_vsyscall ()
(gdb) bt full
#0 0xb770c424 in __kernel_vsyscall ()
No symbol table info available.
#1 0xb7480b9d in select () from /lib/libc.so.6
No symbol table info available.
#2 0x08081219 in try_channels (set_timeout=0) at key.c:621
        time_out = {tv_sec = 0, tv_usec = 99999}
        select_set = {fds_bits = {1, 0 <repeats 31 times>}}
        timeptr = <optimized out>
        v = <optimized out>
        maxfdp = <optimized out>
#3 0x080829f7 in getch_with_delay () at key.c:698
        c = <optimized out>
#4 tty_get_event (event=0xbfa582b0, redo_event=0, block=1) at key.c:2133
        c = <optimized out>
        flag = 0
        time_out = {tv_sec = 10, tv_usec = 0}
        time_addr = <optimized out>
        dirty = 1
#5 0x0806663a in frontend_dlg_run (h=0x8333280) at dialog.c:565
        d_key = <optimized out>
        event = {buttons = 0, x = -1, y = 135627276, type = (GPM_MOVE | GPM_DRAG | GPM_DOWN | GPM_TRIPLE | GPM_HARD | unknown: 134632960)}
#6 dlg_run (h=0x8333280) at dialog.c:1252
No locals.
#7 0x0808ad21 in create_panels_and_run_mc () at midnight.c:959
No locals.
#8 do_nc () at midnight.c:1774
        ret = <optimized out>
        midnight_colors = {9, 9, 9, 9, 9}
#9 0x08054890 in main (argc=1, argv=0xbfa584d4) at main.c:397
        error = 0x0
        config_migrated = 0
        config_migrate_msg = 0xbfa58438 "\250\204\245\277\227E;\267\001"
        exit_code = 1
Strace:
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99996})
comment:64 Changed 11 years ago by ginggs
My gdb backtrace and strace log look much the same as Spinal's.
comment:65 Changed 9 years ago by ginggs
This issue is still present in mc 4.8.15 (compiled with ncurses).
Tested on Ubuntu 15.10 amd64 with mc 4.8.15-2 from Debian unstable.
comment:66 follow-up: ↓ 68 Changed 9 years ago by and
Can we have a fresh strace, or is the comment:61 strace log up to date?
Looks like we are looping in getch_with_delay() all day long,
when mc tries to retrieve the next key but slang/ncurses never returns a key after resume?
comment:67 Changed 9 years ago by ginggs
backtrace:
(gdb) bt full
#0 0x00007f9b50f53723 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1 0x0000560bab8c7d7e in try_channels (set_timeout=set_timeout@entry=1) at key.c:626
        timeptr = 0x7ffc3a3002b0
        maxfdp = 1
        v = <optimized out>
        time_out = {tv_sec = 0, tv_usec = 99999}
        select_set = {fds_bits = {1, 0 <repeats 15 times>}}
#2 0x0000560bab8c971a in getch_with_delay () at key.c:722
        c = <optimized out>
#3 tty_get_event (event=event@entry=0x7ffc3a300400, redo_event=0, block=block@entry=1) at key.c:2138
        c = <optimized out>
        flag = 0
        ev = {buttons = 0 '\000', modifiers = 0 '\000', vc = 0, dx = 0, dy = 0, x = 0, y = 0, type = (unknown: 0), clicks = 0, margin = (unknown: 0), wdx = 0, wdy = 0}
        time_out = {tv_sec = 94608148554896, tv_usec = 94608148537344}
        time_addr = <optimized out>
        dirty = 1
#4 0x0000560bab8b892b in frontend_dlg_run (h=0x560bad162000) at dialog.c:568
        d_key = <optimized out>
        event = {buttons = 0 '\000', modifiers = 0 '\000', vc = 0, dx = 0, dy = 0, x = -1, y = -21621, type = (GPM_MOVE | GPM_DRAG | GPM_UP | GPM_ENTER | GPM_LEAVE | unknown: 20480), clicks = -1391003136, margin = (GPM_TOP | GPM_BOT | GPM_RGT | unknown: 22016), wdx = -28784, wdy = -21576}
#5 dlg_run (h=0x560bad162000) at dialog.c:1267
No locals.
#6 0x0000560bab8d1126 in create_panels_and_run_mc () at midnight.c:954
No locals.
#7 do_nc () at midnight.c:1757
        ret = <optimized out>
#8 0x0000560bab8aafc9 in main (argc=1, argv=0x7ffc3a300668) at main.c:418
        mcerror = 0x0
        config_migrated = <optimized out>
        config_migrate_msg = 0x0
        exit_code = 1
strace:
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99998})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99997})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
read(0, "", 1) = 0
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
select(5, [0 4], NULL, NULL, {0, 100000}) = 1 (in [0], left {0, 99999})
select(5, [0 4], NULL, NULL, NULL) = 1 (in [0])
comment:68 in reply to: ↑ 66 Changed 9 years ago by ginggs
Replying to and:
Looks like we are looping in getch_with_delay() all day long,
when mc tries to retrieve the next key but slang/ncurses never returns a key after resume?
I don't think suspending and resuming are relevant.
Steps to reproduce:
compile mc --with-screen=ncurses (does not occur with slang)
subshell must be enabled and a suitable shell available (does not occur with busybox)
open a GUI terminal (gnome-terminal, xterminal, konsole is also mentioned)
start mc as root (sudo mc)
close GUI terminal
1 CPU will continue running at 100%
comment:69 Changed 9 years ago by and
thanks ginggs for more information.
With slang, mc exits with
SLang_getkey returned SLANG_GETKEY_ERROR
Assuming EOF on stdin and exiting
but under ncurses, checking getch() for an error condition is complicated.
ncurses getch() can return ERR, which is an error in delay mode but not strictly one in no-delay mode.
So EOF on stdin may never be signaled by ncurses getch() in no-delay mode (I have no test case to check whether ncurses getch() in no-delay mode returns ERR _and_ sets errno).
The patch handles the ncurses getch() error in delay mode, which solves the looping on exit.
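To make the delay / no-delay distinction concrete, here is a rough sketch of the decision, not the attached patch itself. `classify_getch`, `getch_status` and `SKETCH_ERR` are hypothetical names; `SKETCH_ERR` only mirrors the -1 that curses uses for ERR, so the snippet builds without the ncurses headers.

```c
#include <assert.h>

/* Mirrors the curses ERR value (-1); redefined here so that the
 * sketch does not depend on <curses.h>. */
enum { SKETCH_ERR = -1 };

enum getch_status {
    GETCH_KEY,         /* a real key code was returned */
    GETCH_RETRY,       /* no-delay mode: nothing pending, polling again is fine */
    GETCH_INPUT_GONE   /* delay mode: ERR means the read itself failed */
};

/* Classify the value returned by a getch()-style call.
 * nodelay_mode is 1 if the call was made in no-delay mode, 0 otherwise. */
enum getch_status classify_getch(int c, int nodelay_mode)
{
    assert(nodelay_mode == 0 || nodelay_mode == 1);

    if (c != SKETCH_ERR)
        return GETCH_KEY;

    /* In no-delay mode ERR is the normal "no input yet" answer, so it
     * cannot be told apart from EOF by the return value alone. */
    if (nodelay_mode)
        return GETCH_RETRY;

    /* In delay mode the call should have blocked until a key arrived,
     * so ERR is taken to mean stdin is no longer usable. */
    return GETCH_INPUT_GONE;
}
```

Under this assumption the caller would exit on `GETCH_INPUT_GONE` instead of going back to select().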
comment:70 Changed 9 years ago by ginggs
Thanks, andreas!
That patch mc-2244-infinite-loop-when-stdin-fd-got-deleted.patch works for me.
comment:71 Changed 9 years ago by and
#3108 is a duplicate of this
comment:72 Changed 9 years ago by zaytsev
Ticket #3108 has been marked as a duplicate of this ticket.
show please output of
'top' and 'ps ax'