I've always been a block-level kinda guy, but I'm interested in hearing some real-world experiences with file-level cloning. What are some of the advantages and disadvantages, and which tools work best?
-
Well, the most obvious advantage of file-level cloning is that you don't waste time cloning unused blocks. E.g. a clone of a 40G partition with 10G of data will require 40G of reads and 40G of writes at the block level, but close to 10G of reads and 10G of writes at the file level.
One minor benefit of file-level cloning is that it effectively defragments your filesystem at the same time, whereas block-level cloning clones the fragmentation as well.
Block-level cloning is simpler, and you don't have to worry about permissions or similar issues: you know for certain the clone will be identical to the original. File-level cloning, on the other hand, can go wrong if you mess up your settings.
Mark : If you have a tool to do block-level cloning that is aware of the filesystem structure, it can skip unused blocks, and so it may not need the full 40GB of reads/writes. I'm thinking zfs send/receive here, but I'm sure there are other filesystems or tools that do something similar.
womble : If it's aware of the filesystem structure, it's a file-level cloning tool.
From davr -
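To put rough numbers on the 40G vs 10G example above, here is a minimal Python sketch. The function names are made up for illustration, and it assumes a plain block-level clone reads every block while a file-level clone reads roughly the allocated data only. It ignores filesystem metadata, and `shutil.disk_usage` reports the filesystem's view of the space, which may be slightly less than the raw partition size.

```python
import shutil


def clone_io_estimate(total_bytes, used_bytes):
    """Return (block_level_bytes, file_level_bytes) read for one clone pass.

    Assumption: block-level reads every block, allocated or not, while
    file-level reads approximately the allocated data only.
    """
    if not 0 <= used_bytes <= total_bytes:
        raise ValueError("used_bytes must be between 0 and total_bytes")
    block_level = total_bytes
    file_level = used_bytes
    return block_level, file_level


def estimate_for_mount(path):
    """Apply the same estimate to a mounted filesystem."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return clone_io_estimate(usage.total, usage.used)


if __name__ == "__main__":
    GIB = 1024 ** 3
    block, file_level = clone_io_estimate(40 * GIB, 10 * GIB)
    print(f"block-level: {block / GIB:.0f} GiB, file-level: about {file_level / GIB:.0f} GiB")
```

Writes on the target follow the same pattern. A filesystem-aware block tool of the kind Mark mentions would land closer to the file-level figure.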
My worst experience with file-level cloning was a 20GB NT4 partition with about 1.6 million tiny files. The transfer rate would have been ~8MB/sec with block-level cloning (over a 100Mbit network), and it should have taken somewhere between an hour and an hour and a half. Instead it ended up at <150KB/sec because of all the filesystem and permissions overhead, and took almost two days.
From Helvick -
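For anyone who wants to play with the numbers in the NT4 story, here is a rough Python sketch of the arithmetic. The cost model (raw streaming time plus a fixed cost per file) is my own simplifying assumption, not something measured, and the function names are hypothetical. It also assumes binary units (GiB, MiB, KiB) for the quoted figures, which the original post doesn't specify.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def transfer_seconds(size_bytes, bytes_per_second):
    """Time to move size_bytes at a steady throughput."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return size_bytes / bytes_per_second


def implied_per_file_overhead(size_bytes, raw_rate, observed_rate, file_count):
    """Back out a per-file cost in seconds.

    Assumed model: observed_time = size / raw_rate + file_count * overhead.
    """
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    raw_time = transfer_seconds(size_bytes, raw_rate)
    observed_time = transfer_seconds(size_bytes, observed_rate)
    return (observed_time - raw_time) / file_count


if __name__ == "__main__":
    size = 20 * GIB
    files = 1_600_000
    raw = transfer_seconds(size, 8 * MIB)
    slow = transfer_seconds(size, 150 * KIB)
    per_file = implied_per_file_overhead(size, 8 * MIB, 150 * KIB, files)
    print(f"streaming at 8 MiB/s:  {raw / 3600:.2f} h")
    print(f"crawling at 150 KiB/s: {slow / 3600:.2f} h")
    print(f"implied cost per file: {per_file * 1000:.0f} ms")
```

The streaming figure is the bare wire time with no protocol overhead, so real block-level runs will take longer. The point is that a per-file cost of tens of milliseconds is enough to dominate once there are millions of files.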
As people have said, use block level when the hit on the file metadata is too great. Use file based when there are not many files.
I am used to a block replication system that only replicates blocks that have changed and are allocated to files. This can work very well.
File-based replication is cheap and easy to do on an open system; however, rsync/unison scripts need more maintenance than replication on a NAS or a SAN.
If there are millions of files then block level is the only way to go. We have a number of filesystems that have 40 million files in 600GB, and file-based replication is not going to work there.
From James
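Since the advice in this thread mostly comes down to how many files there are and how small they are, a quick survey of the tree before picking a method may help. Below is a minimal Python sketch; the function names are hypothetical, and any cut-off for "too many files" is left to the reader because the thread doesn't give one. For scale, 40 million files in 600GB works out to an average of about 15KB per file.

```python
import os
import stat
import sys


def survey_tree(root):
    """Return (file_count, total_bytes) for regular files under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                count += 1
                total += st.st_size
    return count, total


def average_file_size(file_count, total_bytes):
    """Mean file size in bytes, or 0.0 for an empty tree."""
    return total_bytes / file_count if file_count else 0.0


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    n, size = survey_tree(root)
    print(f"{n} files, {size} bytes, average {average_file_size(n, size):.0f} bytes")
```

A very high file count with a small average size is the pattern where people above report file-level cloning falling over. Note that the walk itself touches every file's metadata, so on a tree with tens of millions of files it will be slow too.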