buildgrid.server.cas.service module
CAS services
Implements the Content Addressable Storage API and ByteStream API.
- class buildgrid.server.cas.service.ContentAddressableStorageService
Bases:
ContentAddressableStorageServicer, InstancedServicer[ContentAddressableStorageInstance]
- SERVICE_NAME = 'ContentAddressableStorage'
- REGISTER_METHOD(server)
- FULL_NAME: ClassVar[str] = 'build.bazel.remote.execution.v2.ContentAddressableStorage'
The full name of the servicer, used to match instances to the servicer and configure reflection. This value should be declared on the class of any Servicer implementations.
- FindMissingBlobs(request: FindMissingBlobsRequest, context: ServicerContext) -> FindMissingBlobsResponse
Determine if blobs are present in the CAS.
Clients can use this API before uploading blobs to determine which ones are already present in the CAS and do not need to be uploaded again.
Servers SHOULD increase the lifetimes of the referenced blobs if necessary and applicable.
There are no method-specific errors.
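The check-before-upload pattern described above can be sketched as follows. This is a minimal illustration, not the BuildGrid or gRPC API: `digest_of` and `find_missing` are hypothetical helpers, the CAS is modelled as a set of (hash, size) pairs, and SHA-256 is assumed as the digest function.

```python
import hashlib
from typing import List, Set, Tuple

Digest = Tuple[str, int]  # (hash, size_bytes); hypothetical stand-in for a Digest message


def digest_of(data: bytes) -> Digest:
    """Compute a digest pair, assuming SHA-256 is the digest function in use."""
    return hashlib.sha256(data).hexdigest(), len(data)


def find_missing(cas_index: Set[Digest], digests: List[Digest]) -> List[Digest]:
    """Hypothetical stand-in for FindMissingBlobs: list digests absent from the CAS."""
    return [d for d in digests if d not in cas_index]


blobs = [b"alpha", b"beta", b"gamma"]
cas_index = {digest_of(b"alpha")}  # pretend one blob was uploaded earlier

missing = find_missing(cas_index, [digest_of(b) for b in blobs])
# Only blobs whose digests were reported missing need to be uploaded.
to_upload = [b for b in blobs if digest_of(b) in missing]
```

In this sketch only `b"beta"` and `b"gamma"` end up in `to_upload`, because the digest of `b"alpha"` is already in the index.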
- BatchUpdateBlobs(request: BatchUpdateBlobsRequest, context: ServicerContext) -> BatchUpdateBlobsResponse
Upload many blobs at once.
The server may enforce a limit on the combined total size of blobs to be uploaded using this API. This limit may be obtained using the [Capabilities][build.bazel.remote.execution.v2.Capabilities] API. Requests exceeding the limit should either be split into smaller chunks or uploaded using the [ByteStream API][google.bytestream.ByteStream], as appropriate.
This request is equivalent to calling a ByteStream Write request on each individual blob, in parallel. The requests may succeed or fail independently.
Errors:
* INVALID_ARGUMENT: The client attempted to upload more than the server supported limit.
Individual requests may return the following errors, additionally:
* RESOURCE_EXHAUSTED: There is insufficient disk quota to store the blob.
* INVALID_ARGUMENT: The [Digest][build.bazel.remote.execution.v2.Digest] does not match the provided data.
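One way a client might stay under the batch size limit is to group blobs greedily and set aside any blob that is too large on its own for the ByteStream API. The sketch below is an assumption-laden illustration: `split_into_batches` and `max_batch_bytes` are hypothetical names, and only payload sizes are counted, whereas a real limit would also cover message overhead.

```python
from typing import List, Tuple


def split_into_batches(
    blobs: List[bytes], max_batch_bytes: int
) -> Tuple[List[List[bytes]], List[bytes]]:
    """Group blobs into batches whose combined payload size fits the limit.

    Returns (batches, oversized). Oversized blobs exceed the limit on their
    own and would have to be sent through the ByteStream API instead.
    """
    batches: List[List[bytes]] = []
    oversized: List[bytes] = []
    current: List[bytes] = []
    current_size = 0
    for blob in blobs:
        if len(blob) > max_batch_bytes:
            oversized.append(blob)
            continue
        # Start a new batch when adding this blob would exceed the limit.
        if current and current_size + len(blob) > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(blob)
        current_size += len(blob)
    if current:
        batches.append(current)
    return batches, oversized


batches, oversized = split_into_batches(
    [b"a" * 4, b"b" * 4, b"c" * 3, b"d" * 20], max_batch_bytes=8
)
```

Each resulting batch would map to one BatchUpdateBlobs request; because the per-blob results are independent, a failure in one blob does not invalidate the others in its batch.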
- BatchReadBlobs(request: BatchReadBlobsRequest, context: ServicerContext) -> BatchReadBlobsResponse
Download many blobs at once.
The server may enforce a limit on the combined total size of blobs to be downloaded using this API. This limit may be obtained using the [Capabilities][build.bazel.remote.execution.v2.Capabilities] API. Requests exceeding the limit should either be split into smaller chunks or downloaded using the [ByteStream API][google.bytestream.ByteStream], as appropriate.
This request is equivalent to calling a ByteStream Read request on each individual blob, in parallel. The requests may succeed or fail independently.
Errors:
* INVALID_ARGUMENT: The client attempted to read more than the server supported limit.
Any error on an individual read is returned in the status for the corresponding digest.
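Because each blob in a batch read carries its own status, a client typically separates successes from failures rather than treating the call as all-or-nothing. The following sketch models responses as plain `(digest, status_code, data)` tuples; the tuple layout, the digest strings, and `partition_responses` are hypothetical, while `0` and `5` are the numeric gRPC codes for OK and NOT_FOUND.

```python
from typing import Dict, List, Tuple

OK, NOT_FOUND = 0, 5  # numeric gRPC status codes


def partition_responses(
    responses: List[Tuple[str, int, bytes]]
) -> Tuple[Dict[str, bytes], Dict[str, int]]:
    """Split per-blob results into downloaded data and per-digest failures."""
    blobs: Dict[str, bytes] = {}
    failures: Dict[str, int] = {}
    for digest, code, data in responses:
        if code == OK:
            blobs[digest] = data
        else:
            failures[digest] = code
    return blobs, failures


# Hypothetical responses: one blob read succeeded, one was not found.
responses = [("aaa/3", OK, b"foo"), ("bbb/3", NOT_FOUND, b"")]
blobs, failures = partition_responses(responses)
```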
- GetTree(request: GetTreeRequest, context: ServicerContext) -> Iterator[GetTreeResponse]
Fetch the entire directory tree rooted at a node.
This request must be targeted at a [Directory][build.bazel.remote.execution.v2.Directory] stored in the [ContentAddressableStorage][build.bazel.remote.execution.v2.ContentAddressableStorage] (CAS). The server will enumerate the Directory tree recursively and return every node descended from the root.
The GetTreeRequest.page_token parameter can be used to skip ahead in the stream (e.g. when retrying a partially completed and aborted request) by setting it to the GetTreeResponse.next_page_token value of the last successfully processed GetTreeResponse.
The exact traversal order is unspecified and, unless retrieving subsequent pages from an earlier request, is not guaranteed to be stable across multiple invocations of GetTree.
If part of the tree is missing from the CAS, the server will return the portion present and omit the rest.
Errors:
NOT_FOUND: The requested tree root is not present in the CAS.
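The paging and resume behaviour can be illustrated with a small in-memory model. Everything here is a sketch under stated assumptions: the tree is a dict mapping a directory name to its child names, `get_tree_pages` is a hypothetical function, the traversal is breadth-first, and the page token is simply an offset into that traversal. Real servers may encode tokens and order nodes differently.

```python
from collections import deque
from typing import Dict, Iterator, List, Tuple


def get_tree_pages(
    tree: Dict[str, List[str]], root: str, page_size: int, page_token: str = ""
) -> Iterator[Tuple[List[str], str]]:
    """Yield (directories, next_page_token) pages for the tree under root."""
    if root not in tree:
        raise KeyError(root)  # stands in for NOT_FOUND on a missing root
    order: List[str] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node not in tree:
            continue  # a missing subtree is omitted, not an error
        order.append(node)
        queue.extend(tree[node])
    start = int(page_token) if page_token else 0
    for offset in range(start, len(order), page_size):
        page = order[offset:offset + page_size]
        end = offset + len(page)
        # An empty token marks the final page in this sketch.
        yield page, (str(end) if end < len(order) else "")


tree = {"root": ["a", "b"], "a": ["c"], "b": [], "c": []}

# Simulate a stream that is aborted after the first page...
first_page, resume_token = next(get_tree_pages(tree, "root", page_size=2))
seen = list(first_page)

# ...then resume from the last token that was processed successfully.
for page, _next_token in get_tree_pages(tree, "root", page_size=2, page_token=resume_token):
    seen.extend(page)
```

Resuming with the saved token skips the nodes already delivered, so `seen` ends up containing each directory exactly once.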
- SpliceBlob(request: SpliceBlobRequest, context: ServicerContext) -> SpliceBlobResponse
SpliceBlob tells the CAS how chunks can compose a blob.
This is the complementary operation to the [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob] function to handle the chunked upload of large blobs to save upload traffic.
When uploading a large blob using chunked upload, clients MUST first upload all chunks to the CAS, then call this RPC to tell the server how those chunks compose the original blob. The chunks referenced in the SpliceBlob call SHOULD be available in the CAS before calling this RPC.
If a client needs to upload a large blob and is able to split a blob into chunks in such a way that reusable chunks are obtained, e.g., by means of content-defined chunking, it can first determine which parts of the blob are already available in the remote CAS and upload the missing chunks, and then use this API to store information on how the chunks compose the original blob.
Servers which implement this functionality MUST declare that they support it by setting the [CacheCapabilities.splice_blob_support][build.bazel.remote.execution.v2.CacheCapabilities.splice_blob_support] field accordingly.
Clients MUST check that the server supports this capability, before using it.
In order to ensure data consistency of the CAS, the server MUST only add blobs to the CAS after verifying their digests. In particular, servers MUST NOT trust digests provided by the client. The server MAY accept a request as no-op if the client-specified blob is already in CAS or if information on how to construct the blob from chunks is available. If the client-specified blob is not already in the CAS, the server MUST verify that the digest of the newly created blob assembled from chunks matches the digest specified by the client, and reject the request if they differ. Servers MAY choose to allow overwriting existing chunk mappings or to store multiple chunk mappings for the same blob.
When blob splitting and splicing is used at the same time, the clients and the server SHOULD agree out-of-band upon a chunking algorithm used by both parties to benefit from each other’s chunk data and avoid unnecessary data duplication.
Errors:
* NOT_FOUND: At least one of the blob chunks is not present in the CAS.
* RESOURCE_EXHAUSTED: There is insufficient disk quota to store the spliced blob.
* INVALID_ARGUMENT: The digest of the spliced blob is different from the provided expected digest.
* ALREADY_EXISTS: The blob already exists in CAS and the server did not extend the lifetime of the chunks specified in the request, e.g. because it prefers a different chunking and extended those instead. Clients can call [SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob] to check what chunk mapping the server is using.
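The server-side verification rule (never trust the client's digest; check the assembled blob before storing it) might look like the sketch below. This is not BuildGrid's implementation: the CAS is a plain dict keyed by (hash, size) pairs, `splice_blob` and `digest_of` are hypothetical, SHA-256 is assumed, and Python exceptions stand in for the gRPC status codes.

```python
import hashlib
from typing import Dict, List, Tuple

Digest = Tuple[str, int]  # (hash, size_bytes); hypothetical stand-in for a Digest message


def digest_of(data: bytes) -> Digest:
    """Compute a digest pair, assuming SHA-256 is the digest function in use."""
    return hashlib.sha256(data).hexdigest(), len(data)


def splice_blob(cas: Dict[Digest, bytes], blob_digest: Digest, chunk_digests: List[Digest]) -> None:
    """Assemble a blob from stored chunks, verifying its digest before storing it."""
    if blob_digest in cas:
        return  # the server MAY treat this as a no-op
    missing = [d for d in chunk_digests if d not in cas]
    if missing:
        raise LookupError(f"missing chunks: {missing}")  # stands in for NOT_FOUND
    assembled = b"".join(cas[d] for d in chunk_digests)
    if digest_of(assembled) != blob_digest:
        raise ValueError("assembled blob does not match digest")  # INVALID_ARGUMENT
    cas[blob_digest] = assembled


chunks = [b"hello ", b"world"]
cas = {digest_of(c): c for c in chunks}  # chunks are uploaded first
chunk_digests = [digest_of(c) for c in chunks]
blob_digest = digest_of(b"hello world")
splice_blob(cas, blob_digest, chunk_digests)
```

The blob is added to the store only after the recomputed digest matches, so a client that supplies a wrong digest or a wrong chunk order cannot corrupt the CAS in this model.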
- SplitBlob(request: SplitBlobRequest, context: ServicerContext) -> SplitBlobResponse
SplitBlob retrieves information about how a blob is split into chunks.
This call returns information about how a blob is split into chunks, and returns a list of the chunk digests. Using the returned list of chunk digests, a client can check which chunks are locally available and only fetch the missing ones. The desired blob can be assembled by concatenating the fetched chunks in the order of the digests in the list. The chunks SHOULD all be available in the CAS.
This API can be used to reduce the required data to download a large blob from CAS if some chunks from similar blobs are locally available. For this procedure to work properly, blobs SHOULD be split in a content-defined way, rather than with fixed-sized chunking.
If a split request is answered successfully, a client can expect the following guarantees from the server:
1. The blob chunks are stored in CAS.
2. Concatenating the blob chunks in the order of the digest list returned by the server results in the original blob.
Servers which implement this functionality MUST declare that they support it by setting the [CacheCapabilities.split_blob_support][build.bazel.remote.execution.v2.CacheCapabilities.split_blob_support] field accordingly.
Clients MUST check that the server supports this capability, before using it.
Clients SHOULD verify that the digest of the blob assembled by the fetched chunks is equal to the requested blob digest.
The lifetimes of the generated chunk blobs MAY be independent of the lifetime of the original blob. In particular:
* A blob and any chunk derived from it MAY be evicted from the CAS at different times.
* A call to [SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob] extends the lifetime of the original blob, and sets the lifetimes of the resulting chunks (or extends the lifetimes of already-existing chunks).
* Touching a chunk extends its lifetime, but the server MAY choose not to extend the lifetime of the original blob.
* Touching the original blob extends its lifetime, but the server MAY choose not to extend the lifetimes of chunks derived from it.
When blob splitting and splicing is used at the same time, the clients and the server SHOULD agree out-of-band upon a chunking algorithm used by both parties to benefit from each other’s chunk data and avoid unnecessary data duplication.
Errors:
* NOT_FOUND: The requested blob is not present in the CAS, OR there is no split information available for the blob, OR at least one chunk needed to reconstruct the blob is missing from the CAS.
* RESOURCE_EXHAUSTED: There is insufficient disk quota to store the blob chunks.
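On the client side, the chunk list returned by a split can be used to fetch only what is missing locally, concatenate in order, and verify the result. The sketch below makes several assumptions: `assemble_blob`, `digest_of`, and the `fetch_chunk` callback are hypothetical, SHA-256 is assumed, and the remote CAS is a dict.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Digest = Tuple[str, int]  # (hash, size_bytes); hypothetical stand-in for a Digest message


def digest_of(data: bytes) -> Digest:
    """Compute a digest pair, assuming SHA-256 is the digest function in use."""
    return hashlib.sha256(data).hexdigest(), len(data)


def assemble_blob(
    blob_digest: Digest,
    chunk_digests: List[Digest],
    local_chunks: Dict[Digest, bytes],
    fetch_chunk: Callable[[Digest], bytes],
) -> Tuple[bytes, List[Digest]]:
    """Rebuild a blob from chunks, fetching only those not available locally."""
    fetched: List[Digest] = []
    parts: List[bytes] = []
    for digest in chunk_digests:
        if digest not in local_chunks:
            local_chunks[digest] = fetch_chunk(digest)
            fetched.append(digest)
        parts.append(local_chunks[digest])
    blob = b"".join(parts)  # concatenate in the order of the digest list
    if digest_of(blob) != blob_digest:
        raise ValueError("assembled blob does not match the requested digest")
    return blob, fetched


chunks = [b"shared-prefix|", b"unique-tail"]
remote = {digest_of(c): c for c in chunks}
local = {digest_of(chunks[0]): chunks[0]}  # first chunk is already cached locally

blob, fetched = assemble_blob(
    digest_of(b"".join(chunks)), [digest_of(c) for c in chunks], local, remote.__getitem__
)
```

Only the second chunk is fetched here, which is the traffic saving the API is designed for; the final digest check corresponds to the SHOULD-level client verification described above.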
- class buildgrid.server.cas.service.ResourceNameRegex
Bases:
object
- READ = '^(.*?)/?(blobs/.*/[0-9]*)$'
- WRITE = '^(.*?)/?(uploads/.*/blobs/.*/[0-9]*)'
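To see what these two patterns capture, they can be tried against sample resource names with the standard `re` module. The resource names below are made up for illustration, and reading the first group as an instance name is an interpretation of the pattern shape rather than something stated in this reference.

```python
import re

# Patterns copied from ResourceNameRegex above.
READ = r'^(.*?)/?(blobs/.*/[0-9]*)$'
WRITE = r'^(.*?)/?(uploads/.*/blobs/.*/[0-9]*)'

# Hypothetical resource names, used only to exercise the patterns.
read_match = re.match(READ, "my-instance/blobs/abc123/42")
write_match = re.match(WRITE, "my-instance/uploads/some-uuid/blobs/abc123/42")
bare_match = re.match(READ, "blobs/abc123/42")  # no leading prefix

print(read_match.groups())   # ('my-instance', 'blobs/abc123/42')
print(write_match.groups())  # ('my-instance', 'uploads/some-uuid/blobs/abc123/42')
print(bare_match.groups())   # ('', 'blobs/abc123/42')
```

The lazy first group takes everything before the `blobs/` or `uploads/` segment, and it is empty when the resource name has no prefix. Note that `WRITE` has no trailing `$`, so it also matches names with extra text after the size.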
- class buildgrid.server.cas.service.ByteStreamService
Bases:
ByteStreamServicer, InstancedServicer[ByteStreamInstance]
- SERVICE_NAME = 'ByteStream'
- REGISTER_METHOD(server)
- FULL_NAME: ClassVar[str] = 'google.bytestream.ByteStream'
The full name of the servicer, used to match instances to the servicer and configure reflection. This value should be declared on the class of any Servicer implementations.
- Read(request: ReadRequest, context: ServicerContext) -> Iterator[ReadResponse]
Read() is used to retrieve the contents of a resource as a sequence of bytes. The bytes are returned in a sequence of responses, and the responses are delivered as the results of a server-side streaming RPC.
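The "sequence of responses" shape of a server-side streaming read can be pictured as a generator that yields the resource in pieces. This is only a sketch of the idea: `read_chunks` and `chunk_size` are hypothetical, and the real RPC yields ReadResponse messages rather than raw bytes.

```python
from typing import Iterator


def read_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield the resource contents as a sequence of byte chunks, in order."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


chunks = list(read_chunks(b"abcdefghij", chunk_size=4))
# Concatenating the chunks in order reproduces the original resource.
reassembled = b"".join(chunks)
```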
- Write(request_iterator: Iterator[WriteRequest], context: ServicerContext) -> WriteResponse
Write() is used to send the contents of a resource as a sequence of bytes. The bytes are sent in a sequence of request protos of a client-side streaming RPC.
A Write() action is resumable. If there is an error or the connection is broken during the Write(), the client should check the status of the Write() by calling QueryWriteStatus() and continue writing from the returned committed_size. This may be less than the amount of data the client previously sent.
Calling Write() on a resource name that was previously written and finalized could cause an error, depending on whether the underlying service allows over-writing of previously written resources.
When the client closes the request channel, the service will respond with a WriteResponse. The service will not view the resource as complete until the client has sent a WriteRequest with finish_write set to true. Sending any requests on a stream after sending a request with finish_write set to true will cause an error. The client should check the WriteResponse it receives to determine how much data the service was able to commit and whether the service views the resource as complete or not.
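The resume protocol (write, fail part-way, query the committed size, continue from there) can be modelled in memory. The class below is a hypothetical stand-in for a single resource on the service, not the ByteStream API; `drop_after` is an invented knob that simulates a connection breaking after some bytes were committed.

```python
from typing import Optional, Tuple


class InMemoryWriteTarget:
    """Hypothetical model of one resource being written via Write()."""

    def __init__(self) -> None:
        self.committed = bytearray()
        self.complete = False

    def write(self, write_offset: int, data: bytes, finish_write: bool = False,
              drop_after: Optional[int] = None) -> int:
        if self.complete:
            raise RuntimeError("resource already finalized")
        if write_offset != len(self.committed):
            raise ValueError("write_offset must equal the committed size")
        accepted = data if drop_after is None else data[:drop_after]
        self.committed.extend(accepted)
        if drop_after is not None and drop_after < len(data):
            raise ConnectionError("simulated broken stream")
        self.complete = finish_write
        return len(self.committed)

    def query_write_status(self) -> Tuple[int, bool]:
        """Return (committed_size, complete), as QueryWriteStatus() would report."""
        return len(self.committed), self.complete


payload = b"0123456789"
target = InMemoryWriteTarget()
try:
    # The stream breaks after only 4 bytes were committed.
    target.write(0, payload, finish_write=True, drop_after=4)
except ConnectionError:
    # Ask how much was committed, then continue from that offset.
    committed_size, _complete = target.query_write_status()
    target.write(committed_size, payload[committed_size:], finish_write=True)
```

The client resumes from the size reported by the service rather than from what it believes it sent, since the committed size may be smaller than the amount of data previously transmitted.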
- QueryWriteStatus(request: QueryWriteStatusRequest, context: ServicerContext) -> QueryWriteStatusResponse
QueryWriteStatus() is used to find the committed_size for a resource that is being written, which can then be used as the write_offset for the next Write() call.
If the resource does not exist (i.e., the resource has been deleted, or the first Write() has not yet reached the service), this method returns the error NOT_FOUND.
The client may call QueryWriteStatus() at any time to determine how much data has been processed for this resource. This is useful if the client is buffering data and needs to know which data can be safely evicted. For any sequence of QueryWriteStatus() calls for a given resource name, the sequence of returned committed_size values will be non-decreasing.