Ticket #2240 (new enhancement)

Opened 9 years ago

Last modified 17 months ago

Background jobs queue (aka: file copy queue)

Reported by: danilo.schembri Owned by:
Priority: minor Milestone: Future Releases
Component: mc-core Version: master
Keywords: Cc: sledgeas@…, mooffie@…
Blocked By: Blocking:
Branch state: no branch Votes for changeset:

Description

When slow devices are used (slow networks, old USB, ...), a feature that executes background jobs one at a time would be appreciated.

I wrote a patch which does this (only and simply this), but a full implementation could include:

  • a new state ("Queued") for a background job
  • a new button in the background jobs list which sets the "Queued" state ("Resume" is already present and works fine)
  • a new parameter which specifies the maximum number of concurrently running jobs (as a default; it can be worked around with manual "Enqueue"/"Resume"/"Suspend").

The attached patch only sets the "Stopped" state if another job is already running, and automatically starts (switches from "Stopped" to "Running" state) a job which is "Stopped".
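The behaviour proposed above can be sketched roughly as follows. This is only an illustration in Python, not the attached patch and not mc code: the names `JobQueue`, `State`, `max_running`, `submit`, `finish` and `resume`, as well as the "Done" state, are all hypothetical.

```python
from collections import deque
from enum import Enum


class State(Enum):
    QUEUED = "Queued"    # proposed new state
    RUNNING = "Running"
    DONE = "Done"        # assumption: bookkeeping state for finished jobs


class JobQueue:
    """Toy scheduler: at most `max_running` jobs are started automatically."""

    def __init__(self, max_running=1):
        self.max_running = max_running
        self.states = {}        # job name -> State
        self.waiting = deque()  # names of queued jobs, oldest first

    def _running(self):
        return sum(1 for s in self.states.values() if s is State.RUNNING)

    def submit(self, name):
        # Start immediately if a slot is free, otherwise put the job in the queue.
        if self._running() < self.max_running:
            self.states[name] = State.RUNNING
        else:
            self.states[name] = State.QUEUED
            self.waiting.append(name)

    def finish(self, name):
        # Mark the job as done, then start the oldest queued job if a slot is free.
        self.states[name] = State.DONE
        while self.waiting and self._running() < self.max_running:
            self.states[self.waiting.popleft()] = State.RUNNING

    def resume(self, name):
        # Manual override: run a queued job right away, ignoring the limit.
        if self.states.get(name) is State.QUEUED:
            self.waiting.remove(name)
            self.states[name] = State.RUNNING
```

With `max_running=1`, a second and third `submit()` leave those jobs "Queued" until `finish()` is called for the running one, while `resume()` corresponds to the manual workaround of the limit.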

Attachments

mc-4.7.0_pre3-copy-queue.patch (1.5 KB) - added by danilo.schembri 9 years ago.
mc-4.7.0_pre3.ebuild (1.8 KB) - added by danilo.schembri 9 years ago.
mc-4.7.0.9-copy-queue.patch (1.5 KB) - added by danilo.schembri 9 years ago.
mc-4.7.4-copy-queue.patch (1.6 KB) - added by danilo.schembri 9 years ago.

Change History

Changed 9 years ago by danilo.schembri

comment:1 Changed 9 years ago by danilo.schembri

For Gentoo GNU/Linux users, I have added an ebuild for Portage.

Changed 9 years ago by danilo.schembri

comment:2 Changed 9 years ago by zaytsev

Can you update your patch to the latest master?

comment:3 Changed 9 years ago by danilo.schembri

Of course I can.
I'll do it ASAP.

comment:4 Changed 9 years ago by andrew_b

Please use diff -u to create patches.
Please don't use "" comments.

comment:5 Changed 9 years ago by zaytsev

Recommended settings: diff -Naur a b > c.patch .

Changed 9 years ago by danilo.schembri

comment:6 Changed 9 years ago by danilo.schembri

Hi,
I made file copy queue patch for stable version (4.7.0.9).

Please, let me know your suggestions.

Changed 9 years ago by danilo.schembri

comment:7 Changed 9 years ago by danilo.schembri

  • Version changed from 4.7.0-pre3 to master

...and for version 4.7.4, which seems to be identical to the latest master.

comment:8 Changed 6 years ago by balta2ar

  • Branch state set to no branch

Any progress on this? This is probably the single feature I miss very much in MC.

comment:9 Changed 6 years ago by sledge

Bump.. :/

comment:10 Changed 6 years ago by sledge

  • Cc sledgeas@… added

comment:11 Changed 4 years ago by mooffie

  • Cc mooffie@… added

comment:12 Changed 4 years ago by andrew_b

  • Milestone changed from 4.8 to Future Releases

comment:13 Changed 4 years ago by zaytsev

zatricky commented on 14 Nov 2014

@slavaz, the whole point of open source is that new people come in, learn, and contribute. Try to be more encouraging.

That said, @Gilwyad, a mature project is usually the wrong type of project to take on as a "junior". Maybe things have changed for you since. I'd suggest following the development of an active project to get an idea of how things work, and to learn from that.

@slavaz mentioned (a long long time ago):
"other developers disagree with the only one active background process because background operations should be completed fast as possible"
The above makes sense with SSDs or with memory for performance. With spinning rust, parallel operations result in MUCH WORSE performance. With SSDs, it is also arguable that parallel operations result in slightly more wear (overall, probably less than 5% extra).

On the spinning-rust parallel performance issue, the simple fact is that any SATA drive (spindle or SSD) only performs operations sequentially. Additionally they are limited in the number of operations they perform per second. Spindles are typically up to 150 IOps (I/O Operations Per Second) while SSDs are typically between 40k and 1M IOps (the real reason they're subjectively faster than spindles).

From the above, if you attempt to read as many far-apart 1KB files as possible from a spindle in one second, you will typically read between 70 and 200 (depending on whether you have a low-end consumer drive or a high-end enterprise drive). If you instead try to read 150 files simultaneously, you are likely to read only a single chunk of each file per second. One chunk is typically 4KB, thus you are now reading at an abysmal 600KB/s.

That 600KB/s is SUPER slow. If you tried to copy 1GB files, 150 of them, simultaneously, it will take you two days, 21 hours, and 27 minutes. If you instead copy them sequentially (one at a time), you will be doing 1 IOps (sequentially) but at the maximum speed of the spindle. Let's say it can do 80MB/s. You will average slightly less than that, but still very close to maximum, and it will probably take ~ 31 minutes.
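The figures above can be checked with a few lines of arithmetic. This sketch assumes decimal units (1KB = 1000 bytes, 1GB = 10^9 bytes), which is what reproduces the numbers quoted; the helper names are made up for illustration.

```python
KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3


def copy_seconds(total_bytes, bytes_per_second):
    """Time needed to move total_bytes at a constant rate."""
    return total_bytes / bytes_per_second


def as_dhm(seconds):
    """Convert seconds to (days, hours, minutes), rounded to the nearest minute."""
    minutes = round(seconds / 60)
    days, minutes = divmod(minutes, 24 * 60)
    hours, minutes = divmod(minutes, 60)
    return days, hours, minutes


total = 150 * 1 * GB           # 150 files of 1GB each
parallel_rate = 150 * 4 * KB   # 150 IOps x one 4KB chunk = 600KB/s
sequential_rate = 80 * MB      # assumed maximum speed of the spindle

parallel = copy_seconds(total, parallel_rate)
sequential = copy_seconds(total, sequential_rate)

print(as_dhm(parallel))    # (2, 21, 27)
print(as_dhm(sequential))  # (0, 0, 31)
```

The sequential figure is a lower bound, since it assumes the drive sustains its full 80MB/s for the whole copy.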

You can easily test this theory within Linux or Windows:
Set up 10 copy operations that you can run both sequentially and in parallel. These copy operations must be from a spindle drive and must be copied to /dev/null, a ramdisk, or an SSD. The files should preferably be large enough that each takes more than a second to copy (200MB or larger is a good starting point).

Wipe the caches (or reboot), and execute the copies in sequence. You will find the average copy speed of each one is relatively stable and close to the drive's maximum speed.

Wipe the caches again (or reboot), and execute the copies in parallel. You will find that the average copy speed of each one is less than a tenth of the copy speed seen previously. Thus the total speed will be less, and the copy operation will take longer to complete.
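One possible way to script this test is sketched below. It only reads the files and throws the data away (the equivalent of copying to /dev/null) and uses threads for the parallel case; the function names are made up for illustration. It does not wipe the caches for you: on Linux that can be done as root with `sync; echo 3 > /proc/sys/vm/drop_caches` before each run.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # read in 1 MiB pieces


def drain(path):
    """Read a file to the end and discard the data; return the bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as src:
        while True:
            block = src.read(CHUNK)
            if not block:
                return total
            total += len(block)


def sequential(paths):
    """Read the files one after another."""
    return sum(drain(p) for p in paths)


def parallel(paths):
    """Read all files at the same time, one thread per file."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        return sum(pool.map(drain, paths))


def timed(strategy, paths):
    """Run one strategy; return (bytes read, elapsed seconds)."""
    start = time.perf_counter()
    total = strategy(paths)
    return total, time.perf_counter() - start


def report(name, paths, strategy):
    """Print the overall throughput of one strategy in MB/s."""
    total, elapsed = timed(strategy, paths)
    print(f"{name}: {total / 1e6 / max(elapsed, 1e-9):.1f} MB/s overall")
```

To use it, call `report("sequential", paths, sequential)`, wipe the caches, then call `report("parallel", paths, parallel)` with the same list of large files on the spindle drive.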

comment:14 Changed 4 years ago by zaytsev

igitur commented on 1 Feb

If you need a more concrete use case, I compared 2 scenarios:

1) Copy 100GB from internal HDD to external HDD (USB3) by queueing the data.
2) Copy the same data from same source to same destination, but batch it in 5 20GB chunks.

The first option is definitely faster. I suspect the overhead is exaggerated on an external hard drive. My gut feel is that for internal copying (internal HDD to another internal HDD) the r/w overhead of simultaneous background tasks wouldn't be that much.

comment:16 Changed 3 years ago by darkdragon-001

Any progress on this?

Commenting here, since GitHub issues seem to be disabled...

Another use case:
I want to have one queue per FTP server when working with multiple servers.
Within a queue, copy operations should run sequentially, while all queues should run simultaneously.
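As a rough illustration of that scheme (not mc code; `run_queues` and the job representation are hypothetical), each server gets its own worker that processes its jobs in order, while the workers for different servers run at the same time:

```python
import threading
from collections import defaultdict


def run_queues(jobs):
    """Run jobs given as (server, callable) pairs.

    Jobs for the same server run one at a time, in submission order;
    jobs for different servers run concurrently.
    """
    per_server = defaultdict(list)
    for server, job in jobs:
        per_server[server].append(job)

    def worker(queue):
        # One worker per server: strictly sequential within the queue.
        for job in queue:
            job()

    threads = [threading.Thread(target=worker, args=(queue,))
               for queue in per_server.values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```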

comment:17 Changed 17 months ago by wingerman

Please, implement this feature. It's more than desirable.
