CAS Cleanup

When using an indexed CAS with an S3 storage backend, you can run a separate daemon to handle LRU cleanup (expiry) of blobs from S3, for example to stay within a usage quota. Non-S3 backends can generally use other mechanisms to expire old blobs.

The daemon continuously monitors the current size of the CAS contents and triggers cleanup when the contents reach a specified “high water mark” size. The cleanup deletes blobs (in configurably-sized chunks) in least recently used order until the size of the CAS contents drops to a specified “low water mark”. At that point it stops deleting and goes back to monitoring the size.
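The high/low water mark behaviour described above can be sketched as follows. This is only an illustration, not the cleanup tool's actual code: the `run_check` function and the in-memory `OrderedDict` standing in for the index are hypothetical, and batching is left out so that the sketch deletes one blob at a time.

```python
from collections import OrderedDict


def run_check(blobs, high_watermark, low_watermark):
    """Perform one size check and return the digests that were deleted.

    `blobs` is a hypothetical stand-in for the index: an OrderedDict
    mapping digest -> size in bytes, least recently used entry first.
    """
    total = sum(blobs.values())
    if total < high_watermark:
        return []  # high water mark not reached: keep monitoring

    deleted = []
    # Delete in LRU order until the contents drop to the low water mark.
    while total > low_watermark and blobs:
        digest, size = blobs.popitem(last=False)  # oldest entry first
        total -= size
        deleted.append(digest)
    return deleted


blobs = OrderedDict([("a", 40), ("b", 30), ("c", 20), ("d", 10)])
first = run_check(blobs, high_watermark=90, low_watermark=50)
second = run_check(blobs, high_watermark=90, low_watermark=50)
```

In this sketch the first check deletes the two oldest blobs, and the second check deletes nothing because the remaining contents are below the high water mark.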

Usage

To run the cleanup daemon:

bgd cleanup --high-watermark 10G --low-watermark 7.5G --batch-size 100M \
    --sleep-interval 10 deployment.yml

The batch size and high/low water mark parameters take numbers in bytes. Shorthands for kB, MB, GB, and TB are available as K, M, G, and T respectively, as seen in the example.
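To make the shorthand concrete, here is a minimal sketch of how such values could be converted to bytes. The `parse_size` helper is hypothetical and is not the parser used by `bgd`. It also assumes decimal multipliers (1K = 1000 bytes), which the text above does not confirm; the real tool may use 1024-based units.

```python
import re

# ASSUMPTION: decimal multipliers (1K = 1000 bytes). The real tool may differ.
_MULTIPLIERS = {"": 1, "K": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4}


def parse_size(text):
    """Convert a value such as '7.5G' or '100M' to a whole number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(float(number) * _MULTIPLIERS[suffix])


high = parse_size("10G")
low = parse_size("7.5G")
batch = parse_size("100M")
```

Under that assumption, the example command sets a high water mark of 10,000,000,000 bytes, a low water mark of 7,500,000,000 bytes, and a batch size of 100,000,000 bytes.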

The batch size is the minimum amount of space cleared in one go. The cleanup tool tries to stay as close as possible to the configured batch size but, depending on the sizes of the blobs in the CAS, will sometimes delete more than the specified batch size at a time.

A smaller batch size puts more load on the database and the storage backend, but space starts being freed sooner than with a large batch size.

If the batch size is larger than the difference between the current CAS size and the low water mark, then the whole set of deletions required will be done in one batch.
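The batch behaviour described above can be sketched as follows: whole blobs are selected in LRU order until at least the target amount is covered, so a batch can overshoot, and the target is capped at the amount still needed to reach the low water mark. The `select_batch` function and its arguments are hypothetical and only illustrate the idea; the real tool works against the index database.

```python
def select_batch(lru_blobs, batch_size, current_size, low_watermark):
    """Pick the blobs for one deletion batch.

    `lru_blobs` is a list of (digest, size_in_bytes) pairs ordered
    least recently used first. Returns (digests, bytes_freed).
    """
    # Never aim for more than is needed to reach the low water mark.
    target = min(batch_size, current_size - low_watermark)
    batch, freed = [], 0
    for digest, size in lru_blobs:
        if freed >= target:
            break
        # Blobs are deleted whole, so the batch may exceed the target.
        batch.append(digest)
        freed += size
    return batch, freed


blobs = [("a", 60), ("b", 60), ("c", 60)]

# Batch size 100: two 60-byte blobs are needed, so 120 bytes are freed.
overshoot = select_batch(blobs, batch_size=100, current_size=180, low_watermark=0)

# Batch size larger than the remaining 80 bytes: everything needed is
# covered by a single batch.
capped = select_batch(blobs, batch_size=1000, current_size=180, low_watermark=100)
```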

The sleep interval is the time in seconds to sleep after checking whether the CAS size has reached the configured high water mark. A lower sleep interval means a more reactive cleanup, at the cost of more database load.

The configuration file must contain the index and backend storage definitions. The easiest way to achieve this is to use the same config file that was used to deploy the indexed CAS in the first place.

If monitoring is configured in the provided config file (see Automatic job pruning), any metrics produced by the cleanup tool will be published to the configured destination. If the cleanup metrics should not go to the same place as the indexed CAS metrics, the config will need to be changed accordingly.

Note: If using a WithCache storage type and a non-distributed storage type, such as InMemoryLRU, the caches will not be cleaned up along with the backing storage. In rare cases this can cause issues. To reduce the risk, the configured cache size across all BuildGrids should be smaller than the configured low water mark.