Tag: sparse
Distributed Copy Util
by jfargen on Oct.29, 2012, under Work and stuff
Say you have a single 10 TB file, such as a VM image, that you need to copy between two NFS file stores that you don’t control. The problem is that a single host can only manage about 50 MB/sec over a 1 Gb/sec network connection. The good news is that you have hundreds of hosts mounting both NFS filesystems. Maybe you can lash 10-12 hosts together to greatly reduce the time it takes to copy the file? Then check out scp-tsunami: http://code.google.com/p/scp-tsunami.
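For a sense of scale: 10 TB at 50 MB/sec is about 200,000 seconds, or roughly 55 hours on one host. Ten hosts working in parallel could bring that down to around 5.5 hours, provided the file stores can keep up. The general idea of splitting one copy across hosts can be sketched with plain dd. This is not how scp-tsunami itself works; it is only a minimal illustration, assuming every host mounts both filesystems and is handed a distinct chunk index. The function name copy_chunk and its arguments are made up for this example.

```shell
#!/bin/sh
# copy_chunk SRC DST INDEX CHUNK_MB  (hypothetical helper, for illustration only)
# Copies chunk number INDEX (CHUNK_MB MiB long) from SRC to the same offset in DST.
copy_chunk() {
    src=$1; dst=$2; index=$3; chunk_mb=$4
    # skip= and seek= are counted in bs-sized blocks, so with bs=1 MiB both
    # offsets are index * chunk_mb MiB into the file.
    # conv=notrunc stops dd from truncating DST, which other hosts are writing too.
    dd if="$src" of="$dst" bs=1048576 count="$chunk_mb" \
       skip=$((index * chunk_mb)) seek=$((index * chunk_mb)) \
       conv=notrunc 2>/dev/null
}

# Example: host number 3 copies the fourth 1 GiB chunk of the image.
# copy_chunk /mnt/src/vm.img /mnt/dst/vm.img 3 1024
```

Each host runs the same function with its own index, so the chunks can be written in any order. The paths in the example are placeholders.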
How to copy sparse files faster?
by jfargen on Oct.11, 2012, under Work and stuff
A lot of people do backups with rsync, and 99% of the time it works pretty well. There is one file type that rsync handles rather poorly, though: sparse files. rsync’s documentation indicates that sparse files are handled efficiently, but in my experience that is simply not true. Sparse files are becoming more common because they are used to store VM images for KVM, Xen, and even VMware. So what is a good way to copy them? It turns out that tar is much better at handling sparse files than rsync. It has supported sparse files for over two decades and can represent the holes in them more efficiently.
In my testing, using tar instead of rsync cut the time to back up a sparse file by a factor of 8-10.
Just pipe tar over ssh and compare your results to rsync:
tar cvzSpf - sparse.file | ssh user@hostname '(cd /tmp; tar xzSpf -)'
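If you don’t have a VM image handy, a sparse test file is easy to make. The sketch below assumes GNU dd and a filesystem that supports holes; the helper names make_sparse, apparent_bytes and allocated_kb are made up for this example. It creates a file that is almost entirely hole and reports its apparent size next to the space it actually occupies on disk.

```shell
#!/bin/sh
# make_sparse FILE SIZE_BYTES: create FILE with the given apparent size but no data.
# count=0 copies nothing; seek= moves the output offset, and dd extends the file to it.
make_sparse() {
    dd if=/dev/zero of="$1" bs=1 count=0 seek="$2" 2>/dev/null
}

# apparent_bytes FILE: the size a reader sees (holes read back as zeros).
apparent_bytes() {
    wc -c < "$1" | tr -d ' '
}

# allocated_kb FILE: the space actually allocated on disk, in KiB.
allocated_kb() {
    du -k "$1" | cut -f1
}

# Example:
# make_sparse sparse.file 1073741824
# apparent_bytes sparse.file   # apparent size in bytes
# allocated_kb sparse.file     # should be far smaller on a filesystem with hole support
```

The gap between the two numbers is the data that a sparse-aware copy does not need to send.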