Cloud Container De-duplication

Subject:

The Cloud Container feature is built on the s3ql filesystem, which uses de-duplication and compression to reduce the footprint of the data stored as cloud objects.
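
To illustrate the general technique, below is a minimal Python sketch of content-hash-based block de-duplication with compression. The block size, data structures, and function names are illustrative assumptions, not s3ql's actual implementation (s3ql keeps its block index in a metadata database and supports multiple storage backends):

import hashlib
import zlib

BLOCK_SIZE = 4096   # illustrative block size, not s3ql's actual default

block_store = {}    # block hash -> compressed payload (one cloud object each)
file_index = {}     # file name  -> ordered list of block hashes

def store_file(name: str, data: bytes) -> None:
    """Split data into blocks; upload only blocks not already stored."""
    hashes = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in block_store:           # de-duplication check
            block_store[digest] = zlib.compress(block)
        hashes.append(digest)
    file_index[name] = hashes

def read_file(name: str) -> bytes:
    """Reassemble a file by resolving each block hash to its stored,
    compressed block (the extra lookup indirection on the read path)."""
    return b"".join(zlib.decompress(block_store[h]) for h in file_index[name])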

 

Details:

For data that contains a large amount of duplicate content and is not already heavily compressed, this can substantially reduce the space consumed in the cloud object store.

Please also note that de-duplication occurs at the cloud container level, so if you create multiple cloud backup snapshots of a ZFS Storage Volume, each snapshot adds only the unique data that cannot be de-duplicated against the blocks already in the container.

This also means that if two Storage Volumes are discrete data structures on the QuantaStor system (Storage Volume Dev and Storage Volume Test, for instance) but contain duplicates of common files, and both are backed up to the same cloud container, they will share pointers to their common duplicate blocks, reducing their physical data footprint in the cloud container.
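
Continuing the hypothetical sketch above, two backups that share a common data prefix end up referencing the same stored blocks:

# Hypothetical usage: both "volumes" begin with identical data.
common = b"shared configuration data" * 500
store_file("storage-volume-dev.img",  common + b"dev-only data" * 100)
store_file("storage-volume-test.img", common + b"test-only data" * 100)

referenced = sum(len(hashes) for hashes in file_index.values())
print("blocks referenced by both backups:", referenced)
print("unique blocks actually stored:", len(block_store))
# Fewer unique blocks are stored than referenced, because the blocks
# covering the common prefix are shared between the two backups.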

The tradeoff for this type of de-duplication is reduced data access speed, as the filesystem must perform additional lookups before locating the physical data blocks. However, given that data transfers to the cloud are typically bound by available network throughput, it works very well as a cloud tier for long-term archive/tertiary backup.
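
In the sketch above, this indirection corresponds to read_file(), which must resolve each block hash in the file's index before the block data can be fetched and decompressed.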
