Secure deletion
From Gentoo Linux Wiki
Introduction
When you need to delete data, you normally just use mkfs or rm, depending on whether you want to delete the contents of a hard disk partition or just a bunch of files. However, sometimes you need to make sure that the data you deleted cannot be restored. For example, you may be planning to encrypt your hard disk using DM-Crypt and want to make sure that no unencrypted traces of old data are left on that disk. Or you may want to give the hard disk to someone else, or make it serve some other purpose, without giving your old data away with it.
By reading this page you should gain a basic understanding of how data is stored on a hard disk or other media and under which conditions data can be restored. After that you will learn some good and practical approaches to secure deletion, so that your data cannot be restored by anyone. Last but not least, this guide tries to do away with some common mistakes, misconceptions and paranoia that are, unfortunately, often found and spread on the internet.
Data storage
In order to learn how to delete data correctly, you first have to gain an understanding of how data is stored in the first place. Since this is a lot of dry theory, let's illustrate it with an example.
Your data, that is to say, your files and folders, is managed by filesystems. In the image on the right you can see two example filesystems, mounted in /mnt/A/ and /mnt/B/, both of which contain an arbitrary number of files and folders. Files can be of any size, as long as there is enough storage space available. Storage space is provided by partitions. In this example there is partition A in red and partition B in blue. Each partition has a specific size, which is counted in sectors. Unlike files, all sectors have the same size. For example, partition A has 35 sectors, A0-A34, and partition B has 65 sectors, B0-B64. Since the storage space itself is provided by a hard disk, the partition sectors translate to hard disk sectors. Our example hard disk has 100 sectors, S0-S99, so partition A is stored in hard disk sectors S0-S34 and partition B is stored in hard disk sectors S35-S99.
In simpler terms, your files are stored by the filesystem somehow (how exactly depends on the filesystem), and since the storage space of the filesystem is made up of partition sectors, and therefore hard disk sectors, your files will be physically stored in one or more sectors of your hard disk. The files in /mnt/A/ will end up on one or more sectors of partition A, which means they will be stored somewhere on sectors S0-S34, and files in /mnt/B/ will be stored somewhere on sectors S35-S99. Since filesystems store files wherever there just happens to be free space, it is hard to predict on which sectors exactly your files will end up. A filesystem is even allowed to move files physically as long as the name of the file remains the same, for example when defragmenting a filesystem.
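The translation from partition sectors to hard disk sectors in this example is a plain offset. A minimal sketch of that arithmetic, assuming the example layout described above (partition A on S0-S34, partition B on S35-S99); the function name disk_sector is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: translate a partition sector number into a hard disk
# sector number for the example layout (A = S0-S34, B = S35-S99).
disk_sector() {
    partition=$1
    sector=$2
    case $partition in
        A) start=0;  size=35 ;;
        B) start=35; size=65 ;;
        *) echo "unknown partition: $partition" >&2; return 1 ;;
    esac
    # Reject sector numbers that do not exist in the partition.
    if [ "$sector" -lt 0 ] || [ "$sector" -ge "$size" ]; then
        echo "sector $sector is outside partition $partition" >&2
        return 1
    fi
    echo "S$((start + sector))"
}

disk_sector A 0    # first sector of partition A
disk_sector B 0    # first sector of partition B
disk_sector B 64   # last sector of partition B
```

The sketch only covers where a partition sector lives on the disk. Which partition sectors a particular file occupies is decided by the filesystem and cannot be computed this way.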
Common misconceptions
When it comes to data storage, one of the most common misconceptions is that a hard disk can be empty or deleted. Strictly speaking, this is not true. A hard disk is never empty, and it does not understand a delete command. All a hard disk can do is read and write. For example, a filesystem can ask the hard disk to read sector X or write sector X with DATA. It cannot ask the hard disk to delete. Even if you never write anything to a brand new disk, it is not empty in the sense that a read would fail because no data has been written yet. A 1 TB disk stores 1 TB of data, always. That data may be completely uninteresting and without any meaning whatsoever, such as a disk full of (binary) zeroes. Even so, it's still data. It cannot be deleted or emptied, only overwritten.
Overwriting data sounds easy, and it is, but you have to make sure that you overwrite the data correctly. Don't forget, your files are physically stored in hard disk sectors. In order to delete this data, you have to overwrite the sectors that store the data. Filesystem operations like rm or even mkfs do NOT do this. The filesystem will not overwrite all sectors of its files, because that would make the filesystem horribly slow. Instead, the filesystem just forgets the files: it no longer knows which sectors the files were stored in, and therefore it can't access the files anymore. However, that does not change the fact that the data is still physically stored in the same sectors. It may be overwritten at a later point when the filesystem writes a file and searches for unused sectors (free space) to store it in, but you don't know when that will happen.
Thus, in order to overwrite your data, you have to overwrite the sectors themselves, and because you don't know which file was stored where, you have to overwrite all of them. A simple and powerful tool that can do read and write operations on a hard disk directly is dd. The following sections will show you several approaches to secure deletion.
Secure deletion
This section contains several approaches to secure deletion. In the following examples, the device /dev/sdx represents the device or partition that should be deleted. For the examples that use dd, you can make it print out the current progress on stderr by sending it the USR1 signal with killall -SIGUSR1 dd. The bs=1M parameter is optional, but dd uses a very small block size (512 bytes) by default, which causes the copy process to be extremely slow. A bigger block size causes less overhead, so the copy operation can run at full speed.
The fastest way
The fastest way to delete a partition or an entire hard disk is to fill it with zeroes. It will utilize the full speed of your disk.
In this approach, dd reads an unlimited number of (binary) zeroes from /dev/zero, and writes them to /dev/sdx, starting at the first sector. Eventually /dev/sdx will run out of space, which means that all of its sectors got overwritten with zeroes, and therefore deleted.
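A minimal sketch of this approach, not a definitive recipe. To keep it harmless to run, it overwrites a small scratch file created with mktemp (an assumption standing in for /dev/sdx) and therefore needs count=4 and conv=notrunc. On a real device you would point of= at the device and leave both out, so that dd keeps writing until the device is full.

```shell
#!/bin/sh
# Scratch file standing in for /dev/sdx (an assumption made so that the
# sketch can be run safely without touching a real disk).
target=$(mktemp)

# Fill it with 4 MiB of arbitrary "old data".
dd if=/dev/urandom of="$target" bs=1M count=4 2>/dev/null

# Overwrite it with zeroes. For a real device the equivalent would be
#   dd if=/dev/zero of=/dev/sdx bs=1M
# without count=, so dd stops only when the device runs out of space.
dd if=/dev/zero of="$target" bs=1M count=4 conv=notrunc 2>/dev/null

# Verify: strip all zero bytes and count what is left.
nonzero=$(tr -d '\000' < "$target" | wc -c | tr -d ' ')
size=$(wc -c < "$target" | tr -d ' ')
echo "size: $size bytes, non-zero bytes left: $nonzero"

rm -f "$target"
```

The verification step is only practical on a small target. On a full disk you would normally trust dd's exit message instead of reading everything back.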
The secure way
If you're planning to encrypt your hard disk, you may not want the disk to be full of zeroes, because those zeroes would give away the location of used and free space in your new encrypted partition. Instead you may want to fill the disk with random data. Unfortunately, the kernel does not provide a fast source for random data. However, you can convert zeroes to what looks like random numbers by using encryption with a random key. The following example uses DM-Crypt, but you can apply the same principle to your encryption method of choice. Have a look at your CPU load while you do this so you can get an idea of how CPU-intensive encryption will be on your system.
First, encrypt the hard disk using a random key.
Then write zeroes to the encrypted container. This is the fastest way with encryption, and with a good cipher it results in random data on the physical disk.
Finally, don't forget to clean up afterwards by removing the encrypted mapping.
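The three steps above can be sketched as follows. This is an illustration under assumptions, not a definitive listing: the mapping name wipe_target, the function name and the run helper are made up, and the cipher and key size are merely one explicit choice. The cryptsetup open --type plain syntax is the one used by current cryptsetup releases. The run helper only prints each command unless CONFIRM=yes is set, so nothing is overwritten by accident.

```shell
#!/bin/sh
# Placeholder names (assumptions for this sketch).
DEVICE=${DEVICE:-/dev/sdx}        # device or partition to wipe
MAPPING=${MAPPING:-wipe_target}   # made-up device-mapper name

# Safety guard: only print the commands unless CONFIRM=yes is set.
run() {
    if [ "${CONFIRM:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

wipe_with_random_key() {
    # Step 1: map the device through dm-crypt with a throwaway random key.
    run cryptsetup open --type plain --cipher aes-xts-plain64 --key-size 256 \
        --key-file /dev/urandom "$DEVICE" "$MAPPING"

    # Step 2: write zeroes into the mapping; they reach the disk encrypted.
    # dd ends with "No space left on device" once everything is written.
    run dd if=/dev/zero of="/dev/mapper/$MAPPING" bs=1M

    # Step 3: remove the mapping again; the key is discarded with it.
    run cryptsetup close "$MAPPING"
}

wipe_with_random_key
```

Run it once as shown to review the printed commands, then again as root with CONFIRM=yes and DEVICE set to the real target. Because the key is read from /dev/urandom and never stored, the data written this way cannot be decrypted again.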
The useful way
You can also run a destructive badblocks test on your hard disk. In terms of security, it's no better than the fastest way, because it does not write really random data on the disk. It's also very slow. The only advantage of this method is that badblocks will test your hard disk for errors. The hard disk getting overwritten in the process is just a side effect.
Run a destructive badblocks test with random patterns:
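A minimal sketch of such a run, assuming the badblocks tool from e2fsprogs. The run helper is made up for this example and only prints the command unless CONFIRM=yes is set, because the write-mode test destroys all data on the device.

```shell
#!/bin/sh
DEVICE=${DEVICE:-/dev/sdx}   # placeholder for the disk to test and overwrite

# Safety guard: only print the command unless CONFIRM=yes is set.
run() {
    if [ "${CONFIRM:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# -w        destructive write-mode test (overwrites every block)
# -t random use a random test pattern instead of the fixed defaults
# -s -v     show progress and report errors verbosely
run badblocks -w -t random -s -v "$DEVICE"
```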
Other ways
In other guides, you will find mention of many other ways to overwrite an entire hard disk. They usually involve /dev/urandom, shred, bcwipe and others. The downside of these methods is that they are slow. The kernel's random source is intended for applications that require just a few bytes of high quality random data, such as encryption keys. Thus, /dev/urandom tries really hard to produce really good random numbers, at a high cost to CPU. It is not suitable for writing hundreds of gigabytes of data to a disk, since you do not get more than a few MB/s of data even on modern machines. If you use /dev/urandom directly or indirectly (many other apps also use it as their random source), you will have to wait a long time for the deletion process to finish. Other apps even overwrite the disk more than once, which just multiplies the time it takes to finish.
Deleting files
If all you want to delete is a single file, overwriting an entire hard disk seems like a waste of time. You may ask yourself whether that's really necessary. The answer to that, unfortunately, is that in most cases, it is simply not possible to delete a single file and make sure that no traces of it are left on the partition, short of overwriting the partition as a whole. This is due to the way a filesystem works, or more specifically, how you work with files in your filesystem.
When a filesystem deletes a file, it does not overwrite the sectors the file's data is stored in. It does not do this because it would make the filesystem horribly slow. Instead, the filesystem just forgets the file and adds the now unused sectors to the available free space. When a filesystem overwrites a file, basically the same thing happens. For the filesystem, overwriting a file basically means deleting and therefore forgetting the old file, and creating and therefore writing a new file with the same name. However, since a filesystem writes new files wherever there just happens to be free space, there is no guarantee that the new file will be written in the same place as the old file. Therefore it is possible that by overwriting a file in a filesystem, what physically happens on the disk is that the data of the old file remains untouched, and the data of the new file gets written in a new location. In other words, it's a copy.
While there are tools like shred or bcwipe that claim to be able to overwrite files in place, they either do not work at all with modern journaling filesystems, or they only overwrite the file that the filesystem currently knows about. If the file was overwritten at one or several points (for example because you frequently 'Save' your OpenOffice document, which overwrites the document file every time), it is very likely that an older copy of your file resides somewhere on the disk. And since the filesystem just forgets where the overwritten files were originally stored, those old copies of the file you want to delete can be anywhere in free space. Thus the aforementioned tools that claim to have deleted your files securely have actually lulled you into a false sense of security. Since those tools do not serve any other purpose, that basically makes them obsolete.
Overwriting free space
If you are desperate and absolutely want to delete a file no matter what, without deleting any of your other data, you can delete the file using rm. At this point, the data of your file will be part of the free space of your filesystem. Older copies of the file will probably also be in the free space. You can overwrite the data of all deleted files by overwriting all available space of your partition with zeroes. However, doing that can be harmful to your system. Since you are filling up free space, the disk will be full temporarily. If other applications try to store data during this time, they will fail with 'no space left on device' errors and will probably lose data. Furthermore, since many filesystems cannot access all their free space at all times for all files, or may even reserve free space for special uses, there is only a 95% probability that this process will actually overwrite all the data of the files you deleted.
First, delete the files you want to get rid of.
Then overwrite all free space by filling a file with (binary) zeroes until it takes up all free space on the disk.
Wait until the write fails with 'no space left on device'.
Next, make sure that the file actually gets written to disk completely and not only to the filesystem cache in memory. Depending on the size of the cache, syncing the filesystem to the physical disk storage may take some time.
Finally, delete the zero file in order to get the free space back.
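The whole sequence can be sketched as follows. The mount point /mnt/A is borrowed from the storage example earlier on this page, while the file names secret.txt and zero.file, the function name and the run helper are made up for illustration. The helper only prints each command unless CONFIRM=yes is set, since filling a live filesystem can disturb other applications.

```shell
#!/bin/sh
# Placeholder paths (assumptions for this sketch).
MOUNTPOINT=${MOUNTPOINT:-/mnt/A}              # filesystem holding the file
SECRET=${SECRET:-$MOUNTPOINT/secret.txt}      # made-up file to get rid of
ZEROFILE=${ZEROFILE:-$MOUNTPOINT/zero.file}   # made-up name for the filler

# Safety guard: only print the commands unless CONFIRM=yes is set.
run() {
    if [ "${CONFIRM:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

wipe_free_space() {
    # 1. Delete the file; its sectors become part of the free space.
    run rm -- "$SECRET"

    # 2. Fill all free space with zeroes. dd is expected to stop with
    #    "No space left on device".
    run dd if=/dev/zero of="$ZEROFILE" bs=1M

    # 3. Flush the filesystem cache so the zeroes really reach the disk.
    run sync

    # 4. Delete the filler file to get the free space back.
    run rm -- "$ZEROFILE"
}

wipe_free_space
```

The steps run unconditionally one after another on purpose: dd exiting with an error in step 2 is the expected outcome, so it must not prevent the sync and the cleanup.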
Paranoia
Unfortunately, when it comes to deleting data, there is a lot of voodoo knowledge going around. Many people believe that you have to overwrite a hard disk with random data dozens of times in order to prevent data rescue companies or even intelligence services from getting at your data. This is untrue. It is not possible to restore the data of an overwritten sector. Data rescue companies restore data that is no longer accessible due to hardware failure or corrupt filesystem tables (when a filesystem forgot about all its data, but the data itself is still physically stored on disk). Intelligence services may get data by simply stealing it, or by reading data the user did not manage to overwrite correctly. It is also possible that a sector does not get overwritten in the first place due to hardware errors (reallocated sectors). However, in such a case doing more than one pass does not help either, as a sector that the hard disk marked as defective won't get overwritten no matter how many times you try. If your hard disk is defective, you should get a replacement instead.