Monitoring and Metrics

Overview

BuildGrid provides a set of tools to monitor itself and the services it provides. Monitoring is enabled in the YAML configuration file, where the metric prefixes, serialization format, endpoint type, and endpoint location can also be configured. Please refer to reference.yml for more details.

Serialization Formats Provided

BuildGrid can serialize monitoring records in one of three formats: binary Protobuf, JSON, or StatsD.

  • Binary format: BuildGrid serializes the message to a byte string using the Protocol Buffers API. The data can then be deserialized using ParseFromString.

  • JSON format: BuildGrid will serialize the message to JSON using Protobuf.

  • StatsD format: BuildGrid publishes the metric in the StatsD format and excludes any log messages. Currently, only the Gauge, Timer, and Counter record types are supported for StatsD.

Regardless of the format chosen, BuildGrid prepends the instance name to the metrics as metadata. This is currently the only key-value pair that can be prepended.
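
For orientation, the basic StatsD wire format encodes each metric as name:value|type, where the type code is c for a counter, g for a gauge, and ms for a timer. The sketch below renders the three supported record types in that form. The helper format_statsd_line and the metric names are hypothetical and not part of BuildGrid, and the way BuildGrid encodes the instance-name metadata is deliberately left out because it is not described here.

```python
# Illustrative only: format_statsd_line is a hypothetical helper, not a BuildGrid API.
STATSD_TYPE_CODES = {"counter": "c", "gauge": "g", "timer": "ms"}


def format_statsd_line(name: str, value, record_type: str) -> str:
    """Render one metric in the basic StatsD `name:value|type` form."""
    try:
        type_code = STATSD_TYPE_CODES[record_type]
    except KeyError:
        raise ValueError(f"unsupported record type: {record_type!r}") from None
    return f"{name}:{value}|{type_code}"


# One line per supported record type (metric names are made up).
lines = [
    format_statsd_line("jobs-queued", 7, "gauge"),
    format_statsd_line("inputs-fetching-time", 12.5, "timer"),
    format_statsd_line("cache-hits", 1, "counter"),
]
```

Log messages have no equivalent in this format, which is consistent with them being excluded from StatsD output.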

Endpoints Supported

BuildGrid supports publishing metrics and logs to one of four locations.

  • stdout

  • file (path to file)

  • unix domain socket (socket address)

  • udp address (address:port)
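
As a rough illustration of the UDP endpoint, the sketch below splits an address:port string and sends one datagram to it using only the Python standard library. The function names are hypothetical, and the address localhost:8125 is just the conventional StatsD port used as an example; none of this is BuildGrid's own publishing code.

```python
import socket


def parse_udp_address(location: str) -> tuple:
    """Split an 'address:port' string into a (host, port) pair."""
    host, sep, port = location.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected 'address:port', got {location!r}")
    return host, int(port)


def send_datagram(payload: bytes, location: str) -> None:
    """Send one UDP datagram to the given 'address:port' endpoint."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, parse_udp_address(location))
```

A call such as send_datagram(b"inputs-fetching-time:12.5|ms", "localhost:8125") would hand one StatsD-formatted timer to a daemon listening on that port. UDP is connectionless, so the call succeeds whether or not anything is listening.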

How Metrics Publishing Works

The BuildGrid server class has a monitoring_bus that methods can use to publish records. When the monitoring bus is started, it spins up an async event loop that pulls records off an internal queue and publishes them to the configured endpoint in the configured format.
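
The queue-and-worker pattern described above can be sketched with asyncio as follows. This is a minimal illustration under assumptions, not BuildGrid's implementation: publish_worker, the None stop sentinel, and the string records are all made up for the example, and the publish step is any callable rather than a real endpoint.

```python
import asyncio


async def publish_worker(queue: asyncio.Queue, publish) -> None:
    """Pull records off the queue and hand each one to `publish`."""
    while True:
        record = await queue.get()
        if record is None:  # hypothetical sentinel used here to stop the worker
            break
        publish(record)


async def main() -> list:
    published = []
    queue: asyncio.Queue = asyncio.Queue()
    # The worker runs in the background while producers enqueue records.
    worker = asyncio.create_task(publish_worker(queue, published.append))
    for record in ("record-1", "record-2"):
        queue.put_nowait(record)
    queue.put_nowait(None)
    await worker
    return published
```

Running asyncio.run(main()) returns the records in the order they were queued. Decoupling producers from the publisher this way keeps the code that records a metric from waiting on endpoint I/O.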

Publishing API

Once a log or metric record has been created, it can be sent to the monitoring_bus using send_record_nowait or send_record. This places the record in the monitoring_bus’s queue, from which it is later consumed for publishing.
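
The two method names suggest a non-blocking call and an awaitable one, mirroring put_nowait and put on asyncio.Queue. That mapping is an assumption, not something stated here. The hypothetical SketchBus class below shows how such a pair could differ; it is a stand-in, not BuildGrid's monitoring bus.

```python
import asyncio


class SketchBus:
    """Hypothetical stand-in for the monitoring bus; not BuildGrid's class."""

    def __init__(self, max_size: int = 0) -> None:
        # max_size=0 means unbounded, as in asyncio.Queue.
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_size)

    def send_record_nowait(self, record) -> None:
        # Returns immediately; raises asyncio.QueueFull on a full bounded queue.
        self._queue.put_nowait(record)

    async def send_record(self, record) -> None:
        # Waits for a free slot if the queue is bounded and full.
        await self._queue.put(record)

    def pending(self) -> int:
        return self._queue.qsize()


async def send_two_records() -> int:
    bus = SketchBus()
    bus.send_record_nowait("from-sync-code")
    await bus.send_record("from-a-coroutine")
    return bus.pending()
```

Under this assumption, the nowait variant suits ordinary synchronous code paths, while the awaitable variant fits code that already runs inside a coroutine.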

Supporting New Records

New record types for StatsD can be added to BuildGrid by modifying the monitoring proto and the monitoring bus class, and by adding a new utility to the metrics utilities. Adding a decorator for the new record type makes it easier for others to use.
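
To show what such a decorator might look like, the sketch below times each call of a function and passes the result to a publish callable. The name timed, its parameters, and the (name, seconds) tuple are hypothetical. In BuildGrid the publish step would presumably build a record and send it to the monitoring bus, but that wiring is not shown here.

```python
import functools
import time


def timed(metric_name: str, publish):
    """Decorator factory: time each call and pass (name, seconds) to `publish`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Publish even when the wrapped function raises.
                publish((metric_name, time.perf_counter() - start))

        return wrapper

    return decorator


published = []


@timed("inputs-fetching-time", published.append)
def fetch_inputs():
    return "fetched"
```

Calling fetch_inputs() returns its normal result and appends one (name, seconds) entry to published, so callers get the metric without touching the publishing code themselves.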

Example Usage

Here is how one can create and publish a timing metric:

# input_fetch_time and context are assumed to be defined earlier by the caller
timer_record = create_timer_record(
    MetricRecordDomain.BUILD, 'inputs-fetching-time', input_fetch_time, metadata=context)
monitoring_bus = get_monitoring_bus()  # Get the singleton monitoring bus
monitoring_bus.send_record_nowait(timer_record)  # Queue the record for publishing

Metrics in BuildGrid can be visualized with StatsD and Grafana/Graphite using this docker compose file. After bringing the services up, go to localhost:3000 to see the metrics being published to a Grafana dashboard.