buildgrid.server.cas.storage.index.sql module

SQLIndex

A SQL-backed index implementation. It must be pointed at a remote SQL server.

buildgrid.server.cas.storage.index.sql.read_and_rewind(read_head: IO) → AnyStr | None

Reads from an IO object and returns the data found there after rewinding the object to the beginning.

Parameters:

read_head (IO) – readable IO head

Returns:

The content read from read_head.

Return type:

AnyStr | None
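The behaviour can be sketched as follows. This is a minimal stand-in written for illustration, not BuildGrid's implementation:

```python
import io
from typing import IO, AnyStr, Optional


def read_and_rewind(read_head: IO) -> Optional[AnyStr]:
    """Read everything from the stream, then seek back to the start
    so later callers can consume the same content again."""
    if read_head is None:
        return None
    data = read_head.read()
    read_head.seek(0)
    return data


stream = io.BytesIO(b"blob-contents")
first = read_and_rewind(stream)   # b"blob-contents"
second = stream.read()            # the stream is readable again
```

Rewinding matters because the same stream is typically handed to another consumer (for example, a storage write) immediately afterwards.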

class buildgrid.server.cas.storage.index.sql.SQLIndex(sql_provider: SqlProvider, storage: StorageABC, *, sql_ro: SqlProvider | None = None, window_size: int = 1000, inclause_limit: int = -1, max_inline_blob_size: int = 0, refresh_accesstime_older_than: int = 0, **kwargs: Any)

Bases: IndexABC

TYPE: str = 'SQLIndex'
start() → None
stop() → None
has_blob(digest: Digest) → bool

Return True if the blob with the given instance/digest exists.

get_blob(digest: Digest) → IO[bytes] | None

Get a blob from the index or the backing storage, optionally falling back to storage and repairing the index.

delete_blob(digest: Digest) → None

Delete the blob from storage if it’s present.

commit_write(digest: Digest, write_session: IO[bytes]) → None

Store the contents for a digest.

The storage object is not responsible for verifying that the data written to the write_session actually matches the digest. The caller must do that.
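Because verification is the caller's responsibility, a caller would typically hash the session contents before committing. A hedged sketch, using a dataclass as a stand-in for the REAPI Digest message (which carries hash and size_bytes fields) and assuming SHA-256 digests:

```python
import hashlib
import io
from dataclasses import dataclass


@dataclass
class Digest:
    """Stand-in for the REAPI Digest message."""
    hash: str
    size_bytes: int


def matches_digest(digest: Digest, write_session: io.BytesIO) -> bool:
    """Return True if the session contents hash to the digest;
    rewinds the session so it can still be committed afterwards."""
    data = write_session.read()
    write_session.seek(0)
    return (hashlib.sha256(data).hexdigest() == digest.hash
            and len(data) == digest.size_bytes)


payload = b"hello"
digest = Digest(hash=hashlib.sha256(payload).hexdigest(), size_bytes=len(payload))
session = io.BytesIO(payload)
ok = matches_digest(digest, session)
# Only if ok would the caller go on to call commit_write(digest, session).
```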

missing_blobs(digests: list[Digest]) → list[Digest]

Return the digests from the input list whose blobs are not present in CAS.
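Conceptually this is a set difference between the requested digests and the indexed hashes, evaluated in batches bounded by the inclause_limit parameter (each batch maps to one SQL WHERE hash IN (...) query). A simplified in-memory sketch, with a namedtuple standing in for the REAPI Digest message:

```python
from collections import namedtuple

Digest = namedtuple("Digest", ["hash", "size_bytes"])  # stand-in for the REAPI message


def missing_blobs(digests, indexed_hashes, inclause_limit=1000):
    """Return the digests whose hashes are absent from the index,
    checking membership one batch at a time."""
    missing = []
    for start in range(0, len(digests), inclause_limit):
        batch = digests[start:start + inclause_limit]
        # In the real index, this batch becomes one SELECT ... WHERE hash IN (...).
        present = {d.hash for d in batch} & indexed_hashes
        missing.extend(d for d in batch if d.hash not in present)
    return missing


wanted = [Digest("aaa", 1), Digest("bbb", 2), Digest("ccc", 3)]
absent = missing_blobs(wanted, indexed_hashes={"bbb"}, inclause_limit=2)
# absent == [Digest("aaa", 1), Digest("ccc", 3)]
```

Batching keeps each generated IN clause under the database's parameter limits, which is why the class exposes inclause_limit as a constructor argument.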

bulk_update_blobs(digest_blob_pairs: list[tuple[Digest, bytes]]) → list[Status]

Implement the StorageABC’s bulk_update_blobs method.

The StorageABC interface takes in a list of digest/blob pairs and returns a list of results. The list of results MUST be ordered to correspond with the order of the input list.
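The ordering requirement can be illustrated with a toy implementation: a failed write must occupy the same position in the results as its pair did in the input. The status strings below are stand-ins for the google.rpc Status messages the real method returns:

```python
OK, INTERNAL = "OK", "INTERNAL"  # stand-ins for google.rpc Status codes


def bulk_update_blobs(digest_blob_pairs, store):
    """Write each blob to the store, returning one status per pair,
    in exactly the order of the input list."""
    results = []
    for digest, blob in digest_blob_pairs:
        try:
            store[digest] = blob
            results.append(OK)
        except ValueError:
            results.append(INTERNAL)
    return results


class RejectingStore(dict):
    """Toy store that rejects empty blobs, to demonstrate that
    failures keep their position in the results."""
    def __setitem__(self, key, value):
        if not value:
            raise ValueError("empty blob")
        super().__setitem__(key, value)


store = RejectingStore()
statuses = bulk_update_blobs([("h1", b"x"), ("h2", b""), ("h3", b"z")], store)
# statuses == ["OK", "INTERNAL", "OK"]; result i corresponds to input i.
```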

bulk_read_blobs(digests: list[Digest]) → dict[str, bytes]

Given an iterable container of digests, return a {hash: blob} dictionary mapping each digest's hash to the blob's contents as bytes.

least_recent_digests() → Iterator[Digest]

A generator that iterates through the digests in least-recently-used (LRU) order.

get_total_size() → int

Return the sum of the size of all blobs within the index.

get_blob_count() → int

Return the number of blobs within the index.

delete_n_bytes(n_bytes: int, dry_run: bool = False, protect_blobs_after: datetime | None = None, large_blob_threshold: int | None = None, large_blob_lifetime: datetime | None = None) → int

When using a SQL Index, entries are deleted from the index, and then the blobs they refer to are deleted from the backing storage. In the event that the deletion from the underlying storage fails, the blob will be left behind unindexed. To avoid this situation, the cleanup janitor can be used, which deletes unindexed blobs from backing storage.

This cleanup operates in LRU order, over the course of two separate queries. We first select a batch of blobs to be deleted, tracking which ones are inlined and which are stored in the underlying storage. In a separate query (or set of queries, depending on the in clause size limit) we then delete these index entries. Whilst being considered for deletion, index entries are selected with a FOR UPDATE lock. Already locked blobs are not considered.

We then delete the relevant blobs from the underlying storage before returning.

The workflow is roughly as follows:

- Start a SQL transaction.
- Find a set of digests to delete, tracking the total size and the split between inlined and storage blobs.
- Delete the index entries.
- Close the SQL transaction.
- Perform the storage deletes.
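The batch-selection step can be sketched in plain Python. This simplified in-memory model only illustrates the LRU ordering and the inlined/storage split; the real implementation does this in SQL with FOR UPDATE row locks:

```python
from collections import namedtuple

# Stand-in for an index row; real entries live in SQL and are
# selected FOR UPDATE while being considered for deletion.
Entry = namedtuple("Entry", ["hash", "size_bytes", "accessed", "inline"])


def select_lru_batch(entries, n_bytes):
    """Pick entries to delete in least-recently-accessed order until
    at least n_bytes would be reclaimed. Returns the inlined hashes,
    the hashes held in backing storage, and the bytes freed."""
    inlined, in_storage, freed = [], [], 0
    for entry in sorted(entries, key=lambda e: e.accessed):
        if freed >= n_bytes:
            break
        (inlined if entry.inline else in_storage).append(entry.hash)
        freed += entry.size_bytes
    return inlined, in_storage, freed


entries = [
    Entry("aaa", 10, accessed=3, inline=True),
    Entry("bbb", 20, accessed=1, inline=False),
    Entry("ccc", 5, accessed=2, inline=True),
]
inlined, in_storage, freed = select_lru_batch(entries, n_bytes=25)
# Oldest first: "bbb" (storage) then "ccc" (inlined), freeing 25 bytes.
```

Tracking the inlined/storage split up front is what allows the storage deletes to be issued only for the non-inlined blobs after the transaction closes.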

bulk_delete(digests: list[Digest]) → list[str]

Delete a list of blobs from storage.