buildgrid.server.cas.storage.index.sql module
SQLIndex
A SQL index implementation. This must be pointed to a remote SQL server.
- buildgrid.server.cas.storage.index.sql.read_and_rewind(read_head: IO) → AnyStr | None
Reads all data from an IO object, rewinds the object to the beginning, and returns the data.
- Parameters:
read_head (IO) – readable IO head
- Returns:
readable content from read_head.
- Return type:
AnyStr
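The contract is simple enough to sketch. The function below is an illustrative equivalent of the documented behaviour, not the module's actual implementation:

```python
from io import BytesIO

def read_and_rewind_sketch(read_head):
    """Illustrative equivalent: read everything, then seek back to the start."""
    data = read_head.read()
    read_head.seek(0)
    return data

buffer = BytesIO(b"example blob")
contents = read_and_rewind_sketch(buffer)
assert contents == b"example blob"
assert buffer.read() == b"example blob"  # the buffer was rewound, so a second read still works
```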
- class buildgrid.server.cas.storage.index.sql.SQLIndex(sql_provider: SqlProvider, storage: StorageABC, *, sql_ro: SqlProvider | None = None, window_size: int = 1000, inclause_limit: int = -1, max_inline_blob_size: int = 0, refresh_accesstime_older_than: int = 0, **kwargs: Any)
Bases:
IndexABC
- TYPE: str = 'SQLIndex'
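As a rough sketch of how the class is wired up (the make_sql_provider and make_storage helpers below are hypothetical stand-ins for however the surrounding application builds its SqlProvider and StorageABC instances; only the SQLIndex keyword arguments are taken from the signature above):

```python
from buildgrid.server.cas.storage.index.sql import SQLIndex

def build_index(make_sql_provider, make_storage):
    sql_provider = make_sql_provider()  # SqlProvider pointing at the index database
    storage = make_storage()            # backing StorageABC holding the blob data
    index = SQLIndex(
        sql_provider,
        storage,
        window_size=1000,           # rows fetched per batched query
        max_inline_blob_size=1024,  # store blobs up to 1 KiB directly in the index
    )
    index.start()
    return index
```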
- start() → None
- stop() → None
- has_blob(digest: Digest) → bool
Return True if the blob with the given instance/digest exists.
- get_blob(digest: Digest) → IO[bytes] | None
Get a blob from the index or the backing storage. Optionally fall back to the backing storage and repair the index.
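A hedged sketch of the read path, assuming index is a started SQLIndex and digest is a REAPI Digest message, both constructed elsewhere:

```python
def read_blob(index, digest):
    # Avoid a storage round trip if the index has no record of the blob.
    if not index.has_blob(digest):
        return None
    blob = index.get_blob(digest)  # IO[bytes], or None if the blob disappeared in the meantime
    if blob is None:
        return None
    try:
        return blob.read()
    finally:
        blob.close()
```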
- delete_blob(digest: Digest) → None
Delete the blob from storage if it’s present.
- commit_write(digest: Digest, write_session: IO[bytes]) → None
Store the contents for a digest.
The storage object is not responsible for verifying that the data written to the write_session actually matches the digest. The caller must do that.
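A minimal usage sketch, assuming the vendored REAPI protos live at the import path shown (an assumption about the wider BuildGrid package layout); the caller computes the hash itself, as required above:

```python
import hashlib
from io import BytesIO

from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2

def write_blob(index, data: bytes):
    # Build the Digest from the data itself; commit_write does not verify it.
    digest = remote_execution_pb2.Digest(
        hash=hashlib.sha256(data).hexdigest(),
        size_bytes=len(data),
    )
    index.commit_write(digest, BytesIO(data))
    return digest
```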
- missing_blobs(digests: list[Digest]) → list[Digest]
Return a list of the digests whose blobs are not present in CAS.
- bulk_update_blobs(digest_blob_pairs: list[tuple[Digest, bytes]]) → list[Status]
Implement the StorageABC’s bulk_update_blobs method.
The StorageABC interface takes in a list of digest/blob pairs and returns a list of results. The list of results MUST be ordered to correspond with the order of the input list.
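A hedged sketch combining missing_blobs and bulk_update_blobs into a find-missing-then-upload pass; index and the (Digest, bytes) pairs are assumed to be prepared by the caller:

```python
def upload_missing(index, digest_blob_pairs):
    digests = [digest for digest, _ in digest_blob_pairs]
    missing_hashes = {digest.hash for digest in index.missing_blobs(digests)}
    to_upload = [
        (digest, blob)
        for digest, blob in digest_blob_pairs
        if digest.hash in missing_hashes
    ]
    # The returned statuses are ordered to match the input list, so they can
    # be zipped straight back onto the uploaded pairs.
    statuses = index.bulk_update_blobs(to_upload)
    return [(digest.hash, status) for (digest, _), status in zip(to_upload, statuses)]
```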
- bulk_read_blobs(digests: list[Digest]) → dict[str, bytes]
Given an iterable container of digests, return a {hash: blob contents} dictionary corresponding to the blobs represented by the input digests, keyed by digest hash.
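A short usage sketch; treating digests absent from the returned dictionary as missing is a reasonable reading of the signature, not a documented guarantee:

```python
def fetch_blobs(index, digests):
    found = index.bulk_read_blobs(digests)  # {hash: bytes}
    missing = [digest for digest in digests if digest.hash not in found]
    return found, missing
```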
- least_recent_digests() → Iterator[Digest]
Generator that iterates through the indexed digests in least-recently-used (LRU) order.
- get_total_size() → int
Return the sum of the size of all blobs within the index.
- get_blob_count() → int
Return the number of blobs within the index.
- delete_n_bytes(n_bytes: int, dry_run: bool = False, protect_blobs_after: datetime | None = None, large_blob_threshold: int | None = None, large_blob_lifetime: datetime | None = None) → int
When using a SQL Index, entries are deleted from the index, and then the blobs they refer to are deleted from the backing storage. In the event that the deletion from the underlying storage fails, the blob will be left behind unindexed. To avoid this situation, the cleanup janitor can be used, which deletes unindexed blobs from backing storage.
This cleanup operates in LRU order, over the course of two separate queries. We first select a batch of blobs to be deleted, tracking which ones are inlined and which are stored in the underlying storage. In a separate query (or set of queries, depending on the in-clause size limit) we then delete these index entries. While being considered for deletion, index entries are selected with a FOR UPDATE lock; entries that are already locked are not considered. We then delete the relevant blobs from the underlying storage before returning.
The workflow is roughly as follows:
- Start a SQL transaction.
- Find a set of digests to delete, tracking the size of the set and the split between inlined and storage blobs.
- Delete the index entries.
- Close the SQL transaction.
- Perform the storage deletes.
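A hedged sketch of a cleanup pass built on top of this method. It assumes the return value is the number of bytes removed and that protect_blobs_after is an access-time cutoff below which blobs are left alone; the batch size and protection window are illustrative values, not defaults:

```python
from datetime import datetime, timedelta

def trim_to_target(index, target_bytes, batch_bytes=64 * 1024 * 1024):
    # Illustrative policy: never touch blobs accessed in the last five minutes.
    protect_after = datetime.utcnow() - timedelta(minutes=5)
    while index.get_total_size() > target_bytes:
        freed = index.delete_n_bytes(batch_bytes, protect_blobs_after=protect_after)
        if freed == 0:
            break  # nothing eligible for deletion remains
    return index.get_total_size()
```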
- bulk_delete(digests: list[Digest]) → list[str]
Delete a list of blobs from storage.
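A small usage sketch; interpreting the returned list of strings as identifiers of blobs that could not be deleted is an assumption based on the return type, not a documented guarantee:

```python
def delete_blobs(index, digests):
    failed = index.bulk_delete(digests)
    for entry in failed:
        print(f"failed to delete: {entry}")
    return len(digests) - len(failed)
```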