De-duplication for Speed
OpenSolaris has recently incorporated block-level de-duplication into ZFS. This saves space when many copies of the same file, or of similar files, exist in the same pool of storage. The savings apply even when the copies live in different filesystems, because ZFS uses a hierarchical file system that lets multiple filesystems share the same storage pool.
A big advantage to this: if you try to write a block that already exists, the pool only needs to add a reference to the existing block. This largely eliminates the disk writes otherwise needed to store data that matches data already stored. For those who don't know, disk writes are often a huge performance bottleneck. De-duplication also helps read speeds by keeping only one copy of each duplicated block in the read cache, so the same amount of cache holds more unique data.
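To make the write-path idea concrete, here is a minimal Python sketch of a de-duplicating block store. It is a toy model, not how ZFS is implemented: the class name DedupPool, its methods, and the sample block contents are all hypothetical, and SHA-256 is assumed as the block checksum. It only illustrates that a block whose checksum is already known costs a reference-count bump rather than a physical write.

```python
import hashlib


class DedupPool:
    """Toy model of a de-duplicating block store (hypothetical, not ZFS code)."""

    def __init__(self):
        self.blocks = {}          # checksum -> block data (stands in for the disk)
        self.refcounts = {}       # checksum -> number of references to the block
        self.physical_writes = 0  # how many blocks actually hit the "disk"

    def write_block(self, data: bytes) -> str:
        """Store a block and return its checksum, which acts as the block pointer."""
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            # Duplicate block: only record another reference, no disk write.
            self.refcounts[key] += 1
        else:
            # New block: this is the only case that costs a physical write.
            self.blocks[key] = data
            self.refcounts[key] = 1
            self.physical_writes += 1
        return key

    def read_block(self, key: str) -> bytes:
        """Fetch a block by its checksum."""
        return self.blocks[key]


# Two "filesystems" in the same pool that share two of their three blocks.
pool = DedupPool()
fs_a = [pool.write_block(b) for b in (b"kernel", b"libc", b"app-a")]
fs_b = [pool.write_block(b) for b in (b"kernel", b"libc", b"app-b")]

print("logical writes: ", len(fs_a) + len(fs_b))  # 6
print("physical writes:", pool.physical_writes)   # 4
```

Six blocks were written logically, but only four reached the disk, and both filesystems point at the same stored copy of the shared blocks. The same sharing is why a cache keyed on block checksums would need to hold each shared block only once.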
Combined with the ability to use SSDs as read and write caches, this can yield huge performance gains. Don't think you can afford cache drives for hundreds of systems? Well, consider consolidating storage across several systems using NFS, iSCSI, Fibre Channel or even InfiniBand. ZFS has direct support for all of them and more. Because blocks shared by multiple systems are in cache much more often, you may even find better performance than you get from local disks.


