Notes for Developers
Warning
Invocation of the dependency generation script must happen within the “requirements” directory.
Adding dependencies to a requirements file via the manage-deps script
Simply add the dependencies to the relevant requirements.*.in file, and run the manage-deps script from within the requirements directory:
cd requirements
./manage-deps lock requirements.*.in  # Where * is the name of a specific file
Warning
DO NOT add dependencies to requirements.txt, general dependencies should go in requirements.base.in.
Updating dependencies in the main requirements.in file and creating the frozen requirements.txt file
requirements.in contains all the dependencies specified in the requirements files, while requirements.txt contains all the frozen dependencies.
To create this file, simply run:
cd requirements
./manage-deps lock
This should be run after a new dependency has been added to one of the requirements files.
Upgrading frozen requirements
To bump up the versions of all the packages in a frozen requirements.*.txt file, run:
cd requirements
./manage-deps upgrade-all
Upgrading the gRPC protobuf files
Buildgrid’s gRPC stubs are not built as part of installation. Instead, they are precompiled and shipped with the source code. The protobufs are prone to compatibility-breaking changes, so we update them manually.
First, bring the updated proto file into the protos source tree. For example, if updating the remote execution proto, replace the old protos/src/build/bazel/remote/execution/v2/remote_execution.proto with the newer one. Then, compile the protobufs.
tox -e protos
The regenerated protobufs will be available in buildgrid/server/_protos. If adding a new protobuf file, make sure to list the file in the protos/protos.yaml configuration!
Modifying the database schema
The database models are stored in buildgrid/server/persistence/sql/models.py. This file is the source of truth for the database schema, and it is what needs to be updated in order to modify the schema.
To update the schema, make any needed changes to the models.py file. Then, generate a new revision and test that revision against a database. The easiest way to do this is with the postgres docker image (https://hub.docker.com/_/postgres).
docker pull postgres
docker run --rm --name buildgrid-postgres -e POSTGRES_PASSWORD=pass -p 5432:5432 -d postgres
Now, install alembic (with pip install alembic) and modify alembic.ini (the file in the root of this repository) to point at our dockerized postgres database by editing the sqlalchemy.url field.
sqlalchemy.url = postgresql://postgres:pass@0.0.0.0:5432/postgres
Then, upgrade the database to the latest pre-revision state. Run this from the repository root.
alembic upgrade head
Now, we can finally generate a new revision. Run this from the repository root.
alembic revision --autogenerate -m "a meaningful commit message"
This will generate a new revision file in buildgrid/server/persistence/sql/alembic/versions/
that contains the difference between your old database and the new, updated model.
Some particulars worth noting:
If you are adding a new index, please be sure to add it CONCURRENTLY in Postgres. This ensures that a migration can be performed on a database that is being locked by other processes (perhaps a running BuildGrid). This is accomplished with the postgresql_concurrently flag.
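As an illustration of the CONCURRENTLY requirement, here is a sketch of what such a migration might look like. The revision ids, index name, table, and column are all invented for this example; CREATE INDEX CONCURRENTLY cannot run inside a transaction, so the sketch uses Alembic's autocommit_block together with the postgresql_concurrently flag.

```python
"""Hypothetical Alembic revision adding an index CONCURRENTLY."""
from alembic import op

# Revision identifiers: placeholders, Alembic generates real ones.
revision = "abc123def456"
down_revision = "0123456789ab"


def upgrade():
    # CREATE INDEX CONCURRENTLY is not allowed in a transaction block,
    # so step outside the migration transaction first.
    with op.get_context().autocommit_block():
        op.create_index(
            "ix_jobs_queued_timestamp",   # hypothetical index name
            "jobs",                       # hypothetical table name
            ["queued_timestamp"],         # hypothetical column
            postgresql_concurrently=True,
        )


def downgrade():
    with op.get_context().autocommit_block():
        op.drop_index(
            "ix_jobs_queued_timestamp",
            table_name="jobs",
            postgresql_concurrently=True,
        )
```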
Implementing a data store for the Scheduler
The buildgrid.server.scheduler.Scheduler can be configured to use one of multiple backends. Currently, in-memory and SQL-based backends are available and supported.
It is possible to implement a new backend by implementing the data store interface (buildgrid.server.persistence.interface.DataStoreInterface) and adding a class to buildgrid/_app/settings/parser.py to allow the interface to be configured.
The implementation is free to decide how to persist the data, as long as all of the abstract methods of DataStoreInterface are implemented. The implementations are only required to exist as specified; the details are left up to the author. This allows implementations to make unnecessary methods no-ops, for example, as long as the expected data type is returned: for some implementations there may be nothing to do when enabling or disabling monitoring.
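A minimal sketch of this pattern, using an invented stand-in for the interface (the real abstract method names live in buildgrid.server.persistence.interface and will differ):

```python
from abc import ABC, abstractmethod


# Illustrative stand-in for DataStoreInterface; method names are assumed
# for the example, not BuildGrid's real interface.
class DataStoreLike(ABC):
    @abstractmethod
    def get_job(self, name):
        ...

    @abstractmethod
    def activate_monitoring(self):
        ...

    @abstractmethod
    def deactivate_monitoring(self):
        ...


# A backend only has to make every abstract method exist. Methods with
# nothing to do for this backend (the monitoring toggles here) can be
# no-ops, as long as callers get the expected return type.
class InMemoryStore(DataStoreLike):
    def __init__(self):
        self._jobs = {}

    def get_job(self, name):
        return self._jobs.get(name)

    def activate_monitoring(self):
        pass  # nothing to do for this backend

    def deactivate_monitoring(self):
        pass  # nothing to do for this backend
```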
Backend implementations are also required to implement some way of triggering the events that are used when streaming updates to clients. These are instances of buildgrid.utils.TypedEvent; notify_change should be used to indicate that an update message should be sent, and notify_stop should be used to indicate that the thread handling the stream should check whether the client is still connected, and stop if not. Implementations should only really need to use notify_change, as the disconnect logic is already in DataStoreInterface.
Both existing implementations start a thread which periodically checks the state of the data for jobs that are being watched and compares it with the previous state. If it detects a change, it calls notify_change on the relevant event, which is stored in the buildgrid.utils.JobWatchSpec for the affected job, held in the self.watched_jobs dictionary. Adding and removing entries from this dictionary is handled by the DataStoreInterface class, so it doesn't need to be a concern for implementations.
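The polling pattern described above can be sketched generically. WatchSpec and notify_change here are simplified stand-ins for buildgrid.utils.JobWatchSpec and TypedEvent, not BuildGrid's real API:

```python
import threading
import time


class WatchSpec:
    """Simplified stand-in for JobWatchSpec; the event plays the role
    of a TypedEvent."""

    def __init__(self):
        self.event = threading.Event()

    def notify_change(self):
        # Signal the thread streaming updates that something changed.
        self.event.set()


def poll_watched_jobs(watched_jobs, get_state, stop, interval=0.01):
    """Periodically compare each watched job's state with the last seen
    state, firing notify_change on the relevant event when it differs."""
    previous = {}
    while not stop.is_set():
        for name, spec in list(watched_jobs.items()):
            state = get_state(name)
            if state != previous.get(name):
                previous[name] = state
                spec.notify_change()
        time.sleep(interval)
```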
The class that parses a YAML tag to allow the data store implementation to be configured should inherit from YamlFactory and have a __new__ method which returns an instance of the implementation. There are no other limitations on what it should and shouldn't do. To allow the parser to understand the tag, the get_parser function at the bottom of parser.py should also be modified to add a constructor for the new tag.
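The shape of this pattern can be illustrated with plain PyYAML. MyDataStore, MyDataStoreFactory, and the !my-data-store tag are all invented for the example; BuildGrid's YamlFactory subclasses and get_parser follow the same idea but have their own details:

```python
import yaml


class MyDataStore:
    """Hypothetical data store implementation."""

    def __init__(self, connection_string):
        self.connection_string = connection_string


class MyDataStoreFactory:
    # Like a YamlFactory subclass: __new__ returns an instance of the
    # implementation, not of the factory itself.
    def __new__(cls, connection_string=""):
        return MyDataStore(connection_string)


def _my_data_store_constructor(loader, node):
    kwargs = loader.construct_mapping(node)
    return MyDataStoreFactory(**kwargs)


# Mirrors the get_parser() step: register a constructor for the new tag.
yaml.SafeLoader.add_constructor("!my-data-store", _my_data_store_constructor)
```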
Working with timestamps and timezones in Buildgrid
Currently, if the Index is enabled, BuildGrid will store timestamps in the Index database. These timestamps are stored as timezone-unaware objects in the database, meaning they do not carry any accompanying timezone information. By convention, BuildGrid treats all timestamps as UTC time.
This results in some important considerations when contributing to BuildGrid. You should always default to UTC when dealing with timestamps; not doing so can break behaviour in BuildGrid which requires proper ordering of timestamps. All timestamp objects should also be timezone-unaware: if you use timezone-aware objects, some libraries like SQLAlchemy will convert them to local time before comparing them with timezone-unaware objects, which can break systems that rely on accurate timestamps in BuildGrid.
Consequently, when contributing, please verify that any datetime objects being used are UTC and timezone-unaware. For example, to get the current time you should always use datetime.utcnow(). Variants which include timezone information can create subtle bugs!
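A short sketch of the convention, including one way to normalise an aware datetime (received from elsewhere) back to naive UTC before storing or comparing:

```python
from datetime import datetime, timezone

# BuildGrid's convention: current time as a timezone-unaware UTC datetime.
now = datetime.utcnow()
assert now.tzinfo is None  # naive, as required

# If an aware datetime arrives from another system, convert it to UTC and
# strip the tzinfo so comparisons against naive UTC timestamps stay correct.
aware = datetime.now(timezone.utc)
naive_utc = aware.astimezone(timezone.utc).replace(tzinfo=None)
assert naive_utc.tzinfo is None
```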
Additionally, when updating timestamp-sensitive code, it is always best practice to write thorough unit tests. Even if the change may seem trivial, unit tests can reveal hidden assumptions you are making.