Hole: A program for creating sparse files

What are sparse files?

On many Unix filesystems, files containing long strings of nulls can be stored much more efficiently than other files. To be specific, if a string of nulls spans an entire allocation block, that whole block is not stored on disk at all. Files where one or more blocks are omitted in this way are called sparse files. The missing blocks are also known as holes.

Note that sparse files are not the same as compressed files. Sparse files are exactly the same as their non-sparse equivalents when they are read. The Unix kernel simply fills in nulls for the missing blocks.
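One way to see the difference between a file's length and the space it occupies is to compare the two figures the kernel reports for it. The following is a minimal sketch, not part of the Hole program. The helper names are made up for illustration, and it assumes a POSIX system where st_blocks counts 512-byte units. Some filesystems (for example, ones that compress data) can also report fewer allocated bytes than the file length, so the check is only a hint.

```python
import os

def apparent_size(path):
    """Length of the file in bytes, the figure that ls -l shows."""
    return os.stat(path).st_size

def allocated_size(path):
    """Bytes actually allocated on disk.

    Assumes POSIX semantics, where st_blocks counts 512-byte units.
    """
    return os.stat(path).st_blocks * 512

def looks_sparse(path):
    """True if the file occupies less disk space than its length suggests."""
    return allocated_size(path) < apparent_size(path)
```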

Sparse files are created by seeking beyond the end of a file and then writing data. Random-access database programs often create sparse files this way, because they write records at widely scattered offsets.
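The seek-then-write technique can be sketched in a few lines. This is an illustration only; the function names, the file name and the one-megabyte hole size are arbitrary choices, not anything taken from the Hole program. Whether the skipped region is really left unallocated depends on the filesystem, but the file's length and contents are the same either way.

```python
import os
import tempfile

def make_sparse_file(path, hole_size, payload=b"end"):
    """Create a file whose first hole_size bytes are never written."""
    with open(path, "wb") as f:
        f.seek(hole_size)   # move past the end of the (empty) file
        f.write(payload)    # writing here leaves a hole behind it
    return os.stat(path).st_size

def hole_reads_as_nulls(path, hole_size):
    """Check that the unwritten region reads back as null bytes."""
    with open(path, "rb") as f:
        return f.read(hole_size) == bytes(hole_size)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        size = make_sparse_file(path, 1024 * 1024)
        print("length:", size, "bytes")
        print("hole reads as nulls:", hole_reads_as_nulls(path, 1024 * 1024))
```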

Under Linux, one of the more common uses of sparse files is in the creation of rescue disks. Since space on a floppy is limited, the C shared library must be stored as a sparse file. Typically, one would start with a light version of the shared library, such as libc-lite.so.4.6.27, and then shrink it a further 260K or so by making it sparse. Unfortunately, this technique no longer works with the newer ELF format binaries, because that format does not contain many long strings of nulls. As a result, rescue disks are sometimes still created in the old a.out format.

A demonstration

[12/512] Tue Oct  8 00:54:55 (narf:cheah):~/hole
$ hole outfile < /lib/libc.so.4.7.2
[13/513] Tue Oct  8 00:55:12 (narf:cheah):~/hole
$ du /lib/libc.so.4.7.2
624	/lib/libc.so.4.7.2
[14/514] Tue Oct  8 00:55:20 (narf:cheah):~/hole
$ du outfile
412	outfile
[15/515] Tue Oct  8 00:55:23 (narf:cheah):~/hole
$ ls -l /lib/libc.so.4.7.2 outfile
-rwxr-xr-x   1 root     root       634880 Apr 29  1995 /lib/libc.so.4.7.2*
-rw-r--r--   1 cheah    users      634880 Oct  8 00:55 outfile
[16/516] Tue Oct  8 00:55:40 (narf:cheah):~/hole
$
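In the session above, du reports less space for outfile (412 versus 624) while ls -l shows both files with the same length, 634880 bytes. A tool with this effect could work roughly as sketched below: copy the input block by block, and seek over any block that is entirely nulls instead of writing it. This is a hedged sketch, not the actual Hole source. The name copy_sparse and the 4096-byte block size are assumptions, and the real allocation block size depends on the filesystem.

```python
import os

BLOCK_SIZE = 4096  # assumed block size; real filesystems vary

def copy_sparse(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst, seeking over all-null blocks instead of writing them.

    src is a binary file object open for reading; dst is a seekable binary
    file object open for writing. Returns the length of the output in bytes.
    """
    zero_block = bytes(block_size)
    total = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zero_block[:len(block)]:
            # All nulls: skip ahead and leave a hole in the output.
            dst.seek(len(block), os.SEEK_CUR)
        else:
            dst.write(block)
        total += len(block)
    # If the input ended with nulls, nothing was written after the last
    # seek, so set the final length explicitly.
    dst.truncate(total)
    return total
```

To mimic the command line shown in the session, one might call copy_sparse(sys.stdin.buffer, open(sys.argv[1], "wb")) after importing sys. The truncate call matters: without it, a file that ends in a run of nulls would come out shorter than the original.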

Download the Hole program. (3.4K)

Po Shan Cheah cheah@nic.com
Last modified December 4, 1996.