On many Unix filesystems, files containing long strings of nulls can be stored much more efficiently than other files. To be specific, if a string of nulls spans an entire allocation block, that whole block is not stored on disk at all. Files where one or more blocks are omitted in this way are called sparse files. The missing blocks are also known as holes.
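One rough way to see whether a file has holes is to compare its length with the space actually allocated for it. The sketch below is only an illustration: the function names `size_report` and `looks_sparse` are made up for this example, and it assumes `st_blocks` is counted in 512-byte units, as it is on Linux. The comparison is a heuristic, since filesystems that compress data or store very small files inside the inode can also report fewer allocated bytes than the file length.

```python
import os

def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for path.

    Assumes st_blocks counts 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

def looks_sparse(path):
    """Heuristic: True if less space is allocated than the file's length."""
    apparent, allocated = size_report(path)
    return allocated < apparent
```

The two numbers correspond roughly to what `ls -l` (length) and `du` (allocated space) report for the same file.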
Note that sparse files are not the same as compressed files. Sparse files are exactly the same as their non-sparse equivalents when they are read. The Unix kernel simply fills in nulls for the missing blocks.
Sparse files are created by seeking beyond the end of a file and then writing data. Random-access database programs often write records at scattered offsets in exactly this way, so they are a common source of sparse files.
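The seek-then-write technique can be sketched in a few lines. This is a minimal example, not taken from any particular program: `make_sparse_file` and the file name `sparse.bin` are hypothetical, and whether the skipped range is really left unallocated depends on the filesystem. Reading the file back returns nulls for the skipped range either way.

```python
import os
import tempfile

def make_sparse_file(path, hole_size, payload):
    """Create a file with a gap of hole_size bytes followed by payload."""
    with open(path, "wb") as f:
        f.seek(hole_size)   # move past the end of the (empty) file
        f.write(payload)    # nothing was ever written to the skipped range

if __name__ == "__main__":
    gap = 1024 * 1024
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.bin")
        make_sparse_file(path, gap, b"end")
        st = os.stat(path)
        with open(path, "rb") as f:
            data = f.read()
        print("length in bytes:", st.st_size)
        # Assumes 512-byte st_blocks units, as on Linux.
        print("allocated bytes:", st.st_blocks * 512)
        print("gap reads back as nulls:", data[:gap] == bytes(gap))
```

On a filesystem that supports holes, the allocated figure should come out far smaller than the length.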
Under Linux, one of the more common uses of sparse files is in the creation of rescue disks. Since space on a floppy is limited, the C shared library must be stored as a sparse file. Typically, one would start with a light version of the shared library, such as libc-lite.so.4.6.27, and then shrink it a further 260K or so by making it sparse. Unfortunately, such a technique is no longer possible with the new ELF format binaries, since the new format does not have many long strings of nulls. So, rescue disks are sometimes still created in the old a.out format.
[12/512] Tue Oct 8 00:54:55 (narf:cheah):~/hole $ hole outfile < /lib/libc.so.4.7.2
[13/513] Tue Oct 8 00:55:12 (narf:cheah):~/hole $ du /lib/libc.so.4.7.2
624     /lib/libc.so.4.7.2
[14/514] Tue Oct 8 00:55:20 (narf:cheah):~/hole $ du outfile
412     outfile
[15/515] Tue Oct 8 00:55:23 (narf:cheah):~/hole $ ls -l /lib/libc.so.4.7.2 outfile
-rwxr-xr-x   1 root     root       634880 Apr 29  1995 /lib/libc.so.4.7.2*
-rw-r--r--   1 cheah    users      634880 Oct  8 00:55 outfile
[16/516] Tue Oct 8 00:55:40 (narf:cheah):~/hole $
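In the session above, ls -l reports the same length (634880 bytes) for both files, while du reports 624 blocks for the original and 412 for the copy. The source of the hole program is not shown here, so the following is only a guess at how such a tool could work: it copies its input block by block and seeks over any block that consists entirely of nulls instead of writing it. The name `copy_with_holes`, the script name `hole.py`, and the 1024-byte block size are all assumptions made for this sketch; a real tool would probably match the filesystem's allocation block size.

```python
import os
import sys

BLOCK_SIZE = 1024  # assumed block size, chosen only for illustration

def copy_with_holes(src, dst, block_size=BLOCK_SIZE):
    """Copy binary stream src to the seekable file dst, skipping null blocks.

    Returns the number of bytes copied (the length of the output file).
    """
    zero_block = bytes(block_size)
    total = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zero_block:
            # A full block of nulls: seek over it so it can become a hole.
            dst.seek(len(block), os.SEEK_CUR)
        else:
            dst.write(block)
        total += len(block)
    # If the input ended with skipped blocks, extend the file to full length.
    dst.truncate(total)
    return total

if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "wb") as out:
        copy_with_holes(sys.stdin.buffer, out)
```

Saved as `hole.py`, it could be invoked the same way as in the session, for example `python3 hole.py outfile < /lib/libc.so.4.7.2`. The output must be a regular file, since pipes cannot be seeked.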