Monitoring apps

Metrics

Cloud Foundry provides time series data, or metrics, for each instance of your PaaS app. You can receive, store and view this data in a monitoring system of your choice by either:

  • using the Prometheus [external link] endpoint provided by the GOV.UK PaaS team
  • deploying the paas-metric-exporter app to push metrics data in StatsD [external link] format

You can also view all metrics in a one-off snapshot by installing the Cloud Foundry CLI log cache plug-in [external link].

Use the Prometheus endpoint

Prometheus uses the https://metrics.cloud.service.gov.uk/metrics API endpoint to request metrics from Cloud Foundry. The GOV.UK PaaS team maintains this endpoint, so you can access all available metrics for free. You can configure Prometheus manually to filter out any metrics you do not need.

To set up Prometheus to request metrics from the https://metrics.cloud.service.gov.uk/metrics API endpoint:

  1. Install Prometheus [external link].

  2. You must set up a bearer token so the API endpoint can authenticate your Prometheus request. We recommend you use a bearer_token_file. Set up an automated cron job to run the following command every 5 minutes:

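    cf oauth-token | sed 's/bearer //' > /path/to/bearer_token_file.txt
    

    where:

    • cf oauth-token is the command that generates a bearer token
    • sed 's/bearer //' removes the bearer prefix from the command output, so that the file contains only the token
    • /path/to/bearer_token_file.txt is the location and name of the bearer_token_file used by the Prometheus configuration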
  3. Configure Prometheus to read the bearer token from the bearer_token_file.txt. Refer to the Prometheus configuration documentation [external link] for more information.
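
Steps 2 and 3 could be scripted along the following lines. This is a minimal sketch, not a supported tool. The function names write_token_file and write_prometheus_config are hypothetical. The sketch assumes that cf oauth-token prints the token with a leading bearer prefix (the Docker script later on this page removes the same prefix with sed) and that Prometheus expects the file to contain only the token. The scrape settings are copied from that Docker script, with bearer_token_file in place of bearer_token.

```shell
#!/usr/bin/env bash
set -eu

# Read the output of `cf oauth-token` on stdin and write only the token
# to the file named in $1. The token is written to a temporary file and
# then renamed, so Prometheus never reads a partially written file.
# Note: mktemp creates the file readable only by the current user, so
# adjust the permissions if Prometheus runs as a different user.
write_token_file() {
  target="$1"
  tmp="$(mktemp "${target}.XXXXXX")"
  sed 's/^bearer //' > "$tmp"
  mv "$tmp" "$target"
}

# Write a minimal Prometheus configuration to the file named in $2.
# The configuration reads the bearer token from the file named in $1.
write_prometheus_config() {
  token_file="$1"
  config_file="$2"
  cat > "$config_file" <<EOF
global:
  scrape_interval: 1m
  evaluation_interval: 1m
  scrape_timeout: 1m

scrape_configs:
  - job_name: PaaS
    scheme: https
    bearer_token_file: ${token_file}
    static_configs:
      - targets:
        - metrics.cloud.service.gov.uk:443
EOF
}

# Example use (run the first line from cron every 5 minutes):
#   cf oauth-token | write_token_file /path/to/bearer_token_file.txt
#   write_prometheus_config /path/to/bearer_token_file.txt prometheus.yml
```

Keeping the token in a separate file means the cron job can refresh it without rewriting the Prometheus configuration.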

Use Docker to run Prometheus locally

You can set up Prometheus to request metrics from the API endpoint by using Docker [external link] to run a local instance of Prometheus.

  1. Save the following script as test-metrics.sh:

    #!/usr/bin/env bash
    set -ue
    
    echo "
    global:
      scrape_interval: 1m
      evaluation_interval: 1m
      scrape_timeout: 1m
    
    scrape_configs:
      - job_name: PaaS
        bearer_token: $(cf oauth-token | sed 's/bearer //')
        scheme: https
        static_configs:
          - targets:
            - metrics.cloud.service.gov.uk:443
    " > prometheus.yml
    
    docker run --publish 9090:9090 \
               --volume "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" \
               prom/prometheus
    
  2. Make the script executable:

    chmod +x test-metrics.sh
    
  3. Execute the script:

    ./test-metrics.sh
    

    If the script executes successfully, you will see the message:

    msg="Server is ready to receive web requests."
    
  4. Open your web browser and go to http://localhost:9090/targets. You should see the local Prometheus instance running in a Docker container and receiving metrics.

If your local Prometheus instance is not receiving any metrics, check that the State of the PaaS job on the targets page is UP.

If the PaaS State is UP and you are still not receiving any metrics, contact us by emailing gov-uk-paas-support@digital.cabinet-office.gov.uk.
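
You can also check the target state from the command line instead of the browser. The following is a sketch that assumes the local instance from test-metrics.sh is listening on localhost:9090 and that the Prometheus HTTP API returns compact JSON from its /api/v1/targets endpoint. The function name target_health is hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Read the JSON returned by the Prometheus /api/v1/targets endpoint on
# stdin and print the health value of each target, one per line.
# This relies on simple text matching, so it assumes compact JSON
# with no spaces around the colon.
target_health() {
  grep -o '"health":"[a-z]*"' | sed 's/"health":"\(.*\)"/\1/'
}

# Example use against the local instance:
#   curl --silent http://localhost:9090/api/v1/targets | target_health
```

A value of up corresponds to the UP state shown on the targets page.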

Metrics exporter app with StatsD

To use the metrics exporter, you deploy it as an app on PaaS. The current metrics supported by this app are:

  • CPU
  • RAM
  • disk usage data
  • app crashes
  • app requests
  • app response times

Before you set up the metrics exporter app, you will need:

  • a monitoring system to store the metrics with an accompanying StatsD [external link] endpoint set up
  • a live Cloud Foundry account assigned to the spaces you want to receive metrics on

We recommend that this Cloud Foundry account:

  • uses the SpaceAuditor role as this role has the minimum permissions needed to meet the requirements of the metrics exporter app
  • is separate to your primary Cloud Foundry account

To set up the metrics exporter app:

  1. Clone the https://github.com/alphagov/paas-metric-exporter repository.
  2. Push the metrics exporter app to Cloud Foundry without starting the app by running cf push --no-start metric-exporter.
  3. Set the following mandatory environment variables in the metrics exporter app by running cf set-env metric-exporter NAME VALUE:

    Name             Value
    API_ENDPOINT     Use https://api.cloud.service.gov.uk
    STATSD_ENDPOINT  StatsD endpoint
    USERNAME         Cloud Foundry User
    PASSWORD         Cloud Foundry Password

    You should use the cf set-env command for these mandatory variables because they contain secret information. Setting them this way keeps the values out of the manifest file.

    You can also set environment variables by amending the manifest file. We recommend that you use this method for optional environment variables that do not contain secret information. Refer to the https://github.com/alphagov/paas-metric-exporter repository for more information.

  4. Start the metrics exporter app by running cf start metric-exporter.
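
Steps 2 to 4 could be collected into a script such as the following sketch. It uses only the cf commands named in the steps. The function names and the CF_EXPORTER_USERNAME and CF_EXPORTER_PASSWORD variables are hypothetical. The sketch assumes you run it from the root of your clone of the paas-metric-exporter repository and that you are already logged in with the cf CLI.

```shell
#!/usr/bin/env bash
set -eu

APP_NAME="metric-exporter"

# Set the four mandatory environment variables on the app.
# Secret values are read from the caller's environment so they are not
# written into this script. Because of `set -u`, the function stops
# with an error if any of them is missing.
set_exporter_env() {
  cf set-env "$APP_NAME" API_ENDPOINT "https://api.cloud.service.gov.uk"
  cf set-env "$APP_NAME" STATSD_ENDPOINT "$STATSD_ENDPOINT"
  cf set-env "$APP_NAME" USERNAME "$CF_EXPORTER_USERNAME"
  cf set-env "$APP_NAME" PASSWORD "$CF_EXPORTER_PASSWORD"
}

# Push the app without starting it, configure it, then start it.
deploy_exporter() {
  cf push --no-start "$APP_NAME"
  set_exporter_env
  cf start "$APP_NAME"
}

# Example use, after exporting STATSD_ENDPOINT, CF_EXPORTER_USERNAME
# and CF_EXPORTER_PASSWORD in your shell:
#   deploy_exporter
```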

You can now check your monitoring system to see if you are receiving metrics.

If you are not receiving any metrics, check the logs for the metrics exporter app. If you still need help, please contact us by emailing gov-uk-paas-support@digital.cabinet-office.gov.uk.

More about monitoring

For more information about monitoring apps, see Monitoring the status of your service on the Service Manual.

Logs

Cloud Foundry and apps running on Cloud Foundry generate logs using Loggregator [external link] and stream them to your terminal. You should consult your logs if your app is failing to deploy or crashing, and it’s not clear why.

Your app must write to stdout or stderr instead of a log file for its logs to be included in the Loggregator stream.
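
As an illustration, a shell-based worker or start script could log like this. The function names and messages are hypothetical. Apps written in other languages would configure their own logging library to write to the console in the same way.

```shell
#!/usr/bin/env bash
set -eu

# Write an informational message to stdout.
log_info() {
  printf 'INFO %s\n' "$*"
}

# Write an error message to stderr.
log_error() {
  printf 'ERROR %s\n' "$*" >&2
}

log_info "worker started"
log_error "could not reach the database"
```

Neither function writes to a file, so both messages end up in the Loggregator stream.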

Run cf logs in the command line to stream all logs from each Cloud Foundry service involved in your app deployment:

cf logs APP_NAME

Run cf logs with the --recent flag to display the most recent logs and then exit, instead of streaming new logs as they arrive:

cf logs APP_NAME --recent

You can also run cf events to see all recent app events, such as when an app starts, stops, restarts, or crashes (including error codes):

cf events APP_NAME
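
When the output is long, you can narrow it to one log source. The following sketch assumes each line carries a bracketed source tag such as [RTR/0] for the router or [APP/PROC/WEB/0] for your app, which is how the cf CLI typically prints them. The function name filter_by_source is hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Print only the lines from stdin whose source tag starts with the
# prefix given in $1, for example APP or RTR.
# `|| true` stops the function failing when no lines match.
filter_by_source() {
  grep -F "[$1/" || true
}

# Example use:
#   cf logs APP_NAME --recent | filter_by_source APP
```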

Set up the Logit log management service

By default, Cloud Foundry streams a limited amount of logs to your terminal for a defined time. You can use a commercial log management service to keep more logging information for longer. This section describes how to set up the Logit log management service [external link].

Prerequisites

Before you set up Logit, you must:

Configure logstash filters

You must set up logstash [external link] to process the Cloud Foundry logs into separate Gorouter [external link] and app log types.

  1. Go to your Logit dashboard. For the Logit ELK stack you want to use, select Settings.
  2. On the Stack options menu, select Logstash Filters.
  3. Go to the Logstash Filters page, and replace the code there with the following logstash filter code:

    filter {
        grok {
            # attempt to parse syslog lines
            match => { "message" => "%{SYSLOG5424PRI}%{NONNEGINT:syslog_ver} +(?:%{TIMESTAMP_ISO8601:syslog_timestamp}|-) +(?:%{HOSTNAME:syslog_host}|-) +(?:%{NOTSPACE:syslog_app}|-) +(?:%{NOTSPACE:syslog_proc}|-) +(?:%{WORD:syslog_msgid}|-) +(?:%{SYSLOG5424SD:syslog_sd}|-|) +%{GREEDYDATA:syslog_msg}" }
            # if successful, save original `@timestamp` and `host` fields created by logstash
            add_field => [ "received_at", "%{@timestamp}" ]
            add_field => [ "received_from", "%{host}" ]
            tag_on_failure => ["_syslogparsefailure"]
        }
    
        # parse the syslog pri field into severity/facility
        syslog_pri { syslog_pri_field_name => 'syslog5424_pri' }
    
        # replace @timestamp field with the one from syslog
        date { match => [ "syslog_timestamp", "ISO8601" ] }
    
        # Cloud Foundry passes the app name, space and organisation in the syslog_host
        # Filtering them into separate fields makes it easier to query multiple apps in a single Kibana instance
        dissect {
            mapping => { "syslog_host" => "%{[cf][org]}.%{[cf][space]}.%{[cf][app]}" }
            tag_on_failure => ["_sysloghostdissectfailure"]
        }
    
        # Cloud Foundry gorouter logs
        if [syslog_proc] =~ "RTR" {
            mutate { replace => { "type" => "gorouter" } }
            grok {
                match => { "syslog_msg" => "%{HOSTNAME:[access][host]} - \[%{TIMESTAMP_ISO8601:router_timestamp}\] \"%{WORD:[access][method]} %{NOTSPACE:[access][url]} HTTP/%{NUMBER:[access][http_version]}\" %{NONNEGINT:[access][response_code]:int} %{NONNEGINT:[access][body_received][bytes]:int} %{NONNEGINT:[access][body_sent][bytes]:int} %{QUOTEDSTRING:[access][referrer]} %{QUOTEDSTRING:[access][agent]} \"%{HOSTPORT:[access][remote_ip_and_port]}\" \"%{HOSTPORT:[access][upstream_ip_and_port]}\" %{GREEDYDATA:router_keys}" }
                tag_on_failure => ["_routerparsefailure"]
                add_tag => ["gorouter"]
            }
            # replace @timestamp field with the one from router access log
            date {
                match => [ "router_timestamp", "ISO8601" ]
            }
            kv {
                source => "router_keys"
                target => "router"
                value_split => ":"
                remove_field => "router_keys"
            }
        }
    
        # Application logs
        if [syslog_proc] =~ "APP" {
            json {
                source => "syslog_msg"
                add_tag => ["app"]
            }
        }
    
        # User agent parsing
        if [access][agent] {
            useragent {
                source => "[access][agent]"
                target => "[access][user_agent]"
            }
        }
    
        if !("_syslogparsefailure" in [tags]) {
            # if we successfully parsed syslog, replace the message and source_host fields
            mutate {
                rename => [ "syslog_host", "source_host" ]
                rename => [ "syslog_msg", "message" ]
            }
        }
    }
    
  4. Select Validate.

  5. Select Apply once the code is valid. If the code is not valid, check you have copied it correctly or contact us at gov-uk-paas-support@digital.cabinet-office.gov.uk.

  6. Go back to the Logit dashboard once the following message appears: “Filters have been applied to logstash, logstash will be restarted, this may take up to 2 minutes”.
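
To see what the dissect step in the filter produces, the following sketch performs the same split in shell. It only illustrates the mapping from the syslog host to the org, space and app fields. The function name and the sample host are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Split a syslog host of the form ORG.SPACE.APP into its three parts,
# mirroring the dissect mapping in the filter.
split_syslog_host() {
  printf '%s\n' "$1" | {
    IFS=. read -r org space app
    printf 'org=%s space=%s app=%s\n' "$org" "$space" "$app"
  }
}

split_syslog_host "my-org.my-space.my-app"
# prints: org=my-org space=my-space app=my-app
```

In Kibana, these values appear as the cf.org, cf.space and cf.app fields, which you can use to separate the logs of several apps.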

Configure app

  1. Select Settings for the stack you want to use.
  2. On the Stack options menu, select Logstash Inputs.
  3. Note your Stack Logstash endpoint and TCP-SSL port.
  4. Run the following in the command line to create the log drain service in Cloud Foundry:

    cf create-user-provided-service logit-ssl-drain -l syslog-tls://ENDPOINT:PORT
    
  5. Bind the service to your app by running:

    cf bind-service APP_NAME logit-ssl-drain
    
  6. Restage your app by running:

    cf restage APP_NAME
    
    
  7. Select Access Kibana on the Stack options menu and check that you can see the logs in Kibana.
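
If you set up log drains for several apps, steps 4 to 6 could be wrapped in a function such as the following sketch. It uses only the cf commands from the steps. The function name setup_log_drain and the example values are hypothetical, and you still need to take the endpoint and port from your own Logit stack.

```shell
#!/usr/bin/env bash
set -eu

SERVICE_NAME="logit-ssl-drain"

# Create the log drain service, bind it to the app and restage the app.
#   $1: app name
#   $2: Stack Logstash endpoint
#   $3: TCP-SSL port
setup_log_drain() {
  app_name="$1"
  endpoint="$2"
  port="$3"
  cf create-user-provided-service "$SERVICE_NAME" -l "syslog-tls://${endpoint}:${port}"
  cf bind-service "$app_name" "$SERVICE_NAME"
  cf restage "$app_name"
}

# Example use:
#   setup_log_drain APP_NAME ENDPOINT PORT
```

Because the function creates the service every time, a second app only needs the bind-service and restage commands.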

Once you confirm that the logs are draining correctly, you have successfully set up Logit.

Contact us by emailing gov-uk-paas-support@digital.cabinet-office.gov.uk if the logs are not draining correctly or if you have any questions.

Enable security for your ELK stack

By default, Logit allows anyone on the internet to send logs to your ELK stack. You can set up Logit to make sure that your ELK stack only receives logs from GOV.UK PaaS.

  1. Contact GOV.UK PaaS support at gov-uk-paas-support@digital.cabinet-office.gov.uk for a list of syslog drain egress IP addresses.
  2. Send these IP addresses to Logit support at https://logit.io/contact-us [external link] and ask that your ELK stack only receives log messages from these addresses.

Further information

Refer to the Cloud Foundry documentation for more information on: