Tools that know they will take a long time often come with a built-in progress indicator, but there are other utilities on Linux that often leave the user frustratedly tapping their fingers, wondering how much longer they will have to wait.
Luckily, there is a nifty little tool called pv that will donate a progress bar to any program that can read from standard input or a pipe. pv probably stands for pipe viewer.
1. Simple example: figure out how long an md5sum will take:
pv eternal.avi |md5sum
will display something like
96.5MB 0:00:05 [25.3MB/s] [=======> ] 9% ETA 0:00:48
Notes:
- pv reads from file and prints to stdout.
- md5sum reads from stdin.
- pv outputs the progress bar to stderr so as not to interfere with the piped data. See the man page for ways to customize pv’s output.
- since the bottleneck of such an operation is the media you’re reading from, not the CPU, there will be no noticeable overhead.
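One way to convince yourself that the progress bar never leaks into the piped data is to compare checksums with and without pv in the pipeline. This is only a sketch: it assumes pv and md5sum are installed, and same_checksum is a helper name made up for the illustration.

```shell
#!/bin/sh
# same_checksum FILE: succeed if piping FILE through pv leaves its md5 unchanged.
# (same_checksum is a hypothetical helper, not something pv provides.)
same_checksum() {
    direct=$(md5sum < "$1")                 # checksum without pv
    piped=$(pv "$1" 2>/dev/null | md5sum)   # bar goes to stderr, discarded here
    [ "$direct" = "$piped" ]
}
```

Calling it as same_checksum eternal.avi && echo "data unchanged" should print the message, since only stdout reaches md5sum.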
2. Complex example: add a progress bar to tar/bzip2 compression/decompression:
tar cf - mydir | pv -n -s $(du -sb mydir | awk '{print $1}') | bzip2 >mydir.tar.bz2
Notes:
- this example is adapted from the pv man page.
- the -n switch makes pv output only percentage values.
- no file is passed to pv, so it reads from stdin (fed by a pipe from the output of tar).
- on a system with good cache and enough memory, doing the extra du -sb mydir shouldn’t hurt much, since tar will go through the entire directory anyway.
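If you would rather see the usual bar with an ETA than a stream of percentages, the same pipeline should work without -n. A minimal sketch, assuming GNU du (for -b), tar, bzip2 and pv are installed; pack_with_bar is a made-up helper name:

```shell
#!/bin/sh
# pack_with_bar DIR OUT: tar+bzip2 DIR into OUT, with a pv progress bar.
# (pack_with_bar is a hypothetical name used only for this illustration.)
pack_with_bar() {
    dir=$1 out=$2
    # Apparent size of the directory in bytes. This only approximates the
    # tar stream, which adds headers and padding, so the bar is an estimate.
    size=$(du -sb "$dir" | awk '{print $1}')
    tar cf - "$dir" | pv -s "$size" | bzip2 > "$out"
}
```

Usage would be pack_with_bar mydir mydir.tar.bz2. Because the size passed to -s is an estimate, the bar may finish slightly above or below 100%.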
Now let’s decompress it:
pv mydir.tar.bz2 |tar xjf -
By now you realize how awesome this is.
3. Fun example: measure /dev/null throughput:
pv /dev/zero >/dev/null
is close to 3.3GB/s on my 3-year-old system.
Notes:
- this is not a benchmark™.
- pv can’t know the size of its input in this case (infinity), so it obviously can’t display an ETA.
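When the input is endless or its size is unknowable, you can still get a percentage and an ETA by telling pv how much data to expect. A small sketch, assuming pv is installed and head supports -c; zeros_with_eta is a name invented for this example:

```shell
#!/bin/sh
# zeros_with_eta BYTES: send exactly BYTES zero bytes through pv and pass the
# size up front, so pv can show a percentage and ETA, not just a byte counter.
# (zeros_with_eta is a hypothetical helper name.)
zeros_with_eta() {
    head -c "$1" /dev/zero | pv -s "$1"
}
```

For instance, zeros_with_eta 104857600 >/dev/null pushes 100 MiB through the pipe. It is still not a benchmark.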
pv is a brilliant example of the UNIX philosophy: simple puzzle pieces combining to create useful results. A couple of last-word remarks:

March 9, 2009 at 02:55
Interesting tool, I ask myself how it can tell how much work there is left to do.
I can understand how it can determine the performance (the MB/s part); + you answered my question partially here: “the size of its input in this case (infinity), so it obviously can’t display an ETA.”
But even if the input is known, is there a special convention that says that “my program’s progress will be the last number printed on the screen” (so that another program can parse that and use it)? How does it know the ETA?
What if my program is doing something else? An actual example: a program compresses a file and then, once it is compressed, encrypts it. Encryption and compression happen at different speeds. If the file is large, it will take a while until the data are compressed (in the meantime pv’s estimation will be based only on samples taken during compression), and then, when encryption kicks in, not only will the input be smaller (because the file is compressed now), but the processing speed will be different too.
So, how does pv deal with the cases in which it is not that easy to estimate progress?
March 9, 2009 at 20:44
It can only do two things:
1) get the size of the data that is being piped through (e.g. if it’s a file, it knows its size);
2) divide the amount of data piped through by the argument to -s.
(it’s only a pipe throughput measurement tool, no magic here.)
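To make that arithmetic concrete, here is a rough sketch of how a percentage and an ETA can be derived from nothing but bytes seen, expected total, and elapsed time. It is an illustration of the idea, not pv’s actual code: pv’s real estimator may smooth or sample the rate differently, and the function name progress is made up.

```shell
#!/bin/sh
# progress DONE TOTAL ELAPSED: print percent complete and a naive ETA (seconds),
# assuming the average rate so far (DONE/ELAPSED) holds for the rest of the data.
# (progress is a hypothetical helper; this is not how pv is implemented.)
progress() {
    awk -v d="$1" -v t="$2" -v e="$3" 'BEGIN {
        if (d == 0) { print "0% ETA unknown"; exit }
        pct = int(d * 100 / t)          # share of the expected total seen so far
        eta = int((t - d) * e / d)      # remaining bytes / average bytes-per-second
        printf "%d%% ETA %ds\n", pct, eta
    }'
}

progress 100 1000 5   # prints: 10% ETA 45s
```

Without a total (no file size and no -s), neither the percentage nor the ETA can be computed, which is why only the byte count and rate remain in that case.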
This might seem limiting at first sight, and if you have a program that compresses + encrypts, pv will probably make inaccurate predictions. But if you follow the Unix philosophy, you would have separate tools for compression and encryption, and pv supports multiple progress bars: http://ivarch.com/programs/pv.shtml
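The multiple-bar setup looks roughly like this: one named pv in front of a stage and another one behind it. The -c and -N options are the ones the pv man page uses for this purpose; the labels, the helper name gzip_two_bars and the use of gzip as the only stage are placeholders for whatever you actually chain together.

```shell
#!/bin/sh
# gzip_two_bars IN OUT: compress IN to OUT, showing one bar per side of gzip.
# -N labels each bar; -c lets several pv instances share the terminal.
# (gzip_two_bars is a hypothetical helper name.)
gzip_two_bars() {
    pv -cN raw "$1" | gzip | pv -cN gzipped > "$2"
}
```

Run as gzip_two_bars big.file big.file.gz. Only the first pv knows its input size (it reads a file), so only that bar can show a percentage and ETA; the second shows throughput of the compressed stream.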
A side-note about inaccurate progress bars: when installing Linux packages, download speed and ETA can easily be determined, but actually installing a 200MB package accounts for the same number of pixels on the progress bar as a 200KB package, even if the times vary widely. I guess a smart program could assume that the time necessary to install a package is approximately proportional to its size, but there are a lot of things for which we will never get accurate progress bars…