Loggregator Release
Loggregator is a BOSH release deployed as a part of cf-deployment. Loggregator provides a highly-available (HA) and secure stream of logs and metrics for all applications and components on Cloud Foundry. It does so while not disrupting the behavior of the the applications and components on the platform (i.e. "backpressure").
The Loggregator Design Notes presents an overview of Loggregator components and architecture.
If you have any questions, or want to get attention for a PR or issue please reach out on the #logging-and-metrics channel in the cloudfoundry slack
Table of Contents
- Streaming Application Logs
- Consuming the Firehose
- Emitting Logs and Metrics into Loggregator
- Tools for Testing and Monitoring Loggregator
- Troubleshooting Reliability
Streaming Application Logs
Any user of Cloud Foundry can experience Loggregator by using two simple interfaces for streaming application specific logs. These do not require any special User Account and Authentication(UAA) Scope.
Using the CF CLI
The fastest way to see your logs is by running the cf logs
command using the
CF CLI. Check the Cloud Foundry CLI docs for more details.
Forwarding to a Log Drain
If you’d like to save all logs for an application in a third party or custom tool that expects the syslog format, a log drain allows you to do so. Check the Cloud Foundry docs for more details.
Log Ordering
Loggregator does not provide any guarantees around the order of delivery of logs in streams. That said, there is enough precision in the timestamp provided by diego that streaming clients can batch and order streams as they receive them. This is done by the cf cli and most other streaming clients.
Consuming the Firehose
The firehose is an aggregated stream of all application logs and component metrics on the platform. This allows operators to ensure they capture all logs within a microservice architecture as well as monitor the health of their platform. See the Firehose README.
User Account and Authentication Scope
In order to consume the firehose you’ll need the doppler.firehose
scope from
UAA. For more details see the Firehose README.
Nozzle Development
Once you have configured appropriate authentication scope you are ready to start developing a nozzle for the firehose. See our Nozzle community page for more details about existing nozzles and how to get started.
Metrics
Loggregator and other Cloud Foundry components emit regular messages through the Firehose that monitor the health, throughput, and details of a component's operations. For more detials about Loggregator’s metrics see our Loggregator Metrics README.
Emitting Logs and Metrics into Loggregator
For components of Cloud Foundry or standalone BOSH deployments, Loggregator provides a set of tools for emitting Logs and Metrics.
Reverse Log Proxy (RLP)
The RLP is the v2 implementation of the Loggregator API. This component is intended to be a replacement for traffic controller.
RLP Gateway
By default, the RLP communicates with clients via gRPC over mutual TLS. To enable HTTP access to the Reverse Log Proxy, deploy the RLP Gateway.
Loggregator API
The Loggregator API is a replacement of the Dropsonde Protocol. Loggregator API defines an envelope structure which packages logs and metrics in a common format for distribution throughout Loggregator. See the Loggregator API README for more details.
Loggregator Agents
Loggregator Agents receive logs and metrics on VMs, and forward them onto the Firehose. For more info see the loggregator-agent release.
Statsd-injector
The statsd-injector receives metrics from components in the statsd metric aggregator format. For more info see the statsd-injector README.
Syslog Release
For some components (such as UAA) it makes sense to route logs separate from the Firehose. The syslog release uses rsyslog to accomplish this. For more information see the syslog-release README.
Tools for Testing and Monitoring Loggregator
Loggregator provides a set of tools for testing the performance and reliability of your loggregator installation. See the loggregator tools repo for more details.
Troubleshooting Reliability
Scaling
In addition to the scaling recommendations above, it is important that the resources for Loggregator are dedicate VM’s with similar footprints to those used in our capacity tests. Even if you are within the bounds of the scaling recommendations it may be useful to scale Loggregator and Nozzle components aggressively to rule out scaling as a major cause log loss.
Noise
Another common reason for log loss is due to an application producing a large amount of logs that drown out the logs from other application on the cell it is running on. To identify and monitor for this behavior the Loggregator team has created a Noisy Neighbor Nozzle and CLI Tool. This tool will help operators quickly identify and take action on noise producing applications. Instruction for deploying and using this nozzle are in the repo.