`pyanodot.collectors.cloudwatch` - Amazon CloudWatch Collector¶

Overview¶

The pyanodot.collectors.cloudwatch collector uses two types of API calls provided by AWS CloudWatch:

ListMetrics - Provides a list of the available metrics ( names of time-series ).
GetMetricStatistics - Provides the time-series data for a given time-series and time range.

The collector provides a thin wrapper around these API calls and works as follows:

Phase 1: The collector calls the ListMetrics API and caches the list of metrics in a JSON file inside the work directory. This phase is slow, so it runs only when there is no cached metric list or when forced by the -f/--force-refresh flag.
Phase 2: The collector reads the list of metrics to collect from the cached file and, for each one, calls the GetMetricStatistics API. The resulting data is sent to Anodot and/or saved locally.
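The two phases can be sketched roughly as follows. This is a minimal illustration and not the collector's actual code: the function names, the cache file name metrics_cache.json, and the client argument (assumed to be a boto3 CloudWatch client, or any object exposing the same get_paginator and get_metric_statistics methods) are all assumptions made for the example.

```python
import json
import os


def load_or_refresh_metrics(client, work_dir, list_kwargs, force_refresh=False,
                            cache_name="metrics_cache.json"):
    """Phase 1: return the metric list, calling ListMetrics only when needed."""
    cache_path = os.path.join(work_dir, cache_name)  # hypothetical file name
    if os.path.exists(cache_path) and not force_refresh:
        with open(cache_path) as fh:
            return json.load(fh)
    metrics = []
    # ListMetrics is paginated, so walk every page of results.
    for page in client.get_paginator("list_metrics").paginate(**list_kwargs):
        metrics.extend(page["Metrics"])
    with open(cache_path, "w") as fh:
        json.dump(metrics, fh)
    return metrics


def collect(client, metrics, start, end, period=3600, statistics=("Average",)):
    """Phase 2: yield (metric, datapoints) for every cached metric."""
    for metric in metrics:
        response = client.get_metric_statistics(
            Namespace=metric["Namespace"],
            MetricName=metric["MetricName"],
            Dimensions=metric["Dimensions"],
            StartTime=start,
            EndTime=end,
            Period=period,
            Statistics=list(statistics),
        )
        yield metric, response["Datapoints"]
```

Passing the client in as an argument keeps the sketch independent of how the session is created and makes it easy to exercise with a stub object.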

Basic Setup¶

Create or choose an existing AWS user with an existing key-pair.
Grant read-only permissions to the user as described here.
On the machine that runs the collector, log in as the user who will run the collector. Create the directory ~/.aws and, inside it, the file ~/.aws/credentials with an entry for the key-pair, where ~ is the home directory of that user.

For example, the following section ( in ~/.aws/credentials ) declares a profile named anodot-collector with the key-pair details:

[anodot-collector]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = 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

Please refer to the Boto 3 Docs for a complete explanation.
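Before running the collector, it can help to confirm that the profile is actually present in the credentials file. The following is a small sketch using only the Python standard library; the helper name profile_has_keys is hypothetical and is not part of pyanodot or Boto 3.

```python
import configparser
import os


def profile_has_keys(profile_name, credentials_path=None):
    """Return True if the profile defines both key-pair fields."""
    if credentials_path is None:
        credentials_path = os.path.expanduser("~/.aws/credentials")
    # interpolation=None keeps '%' characters in secrets from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_path)  # a missing file simply yields no sections
    if not parser.has_section(profile_name):
        return False
    section = parser[profile_name]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section
```

For the example above, profile_has_keys("anodot-collector") should return True once the section has been added to the file.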

Configuration¶

Account-level properties¶

profile_name(string, required) - the name of the profile to use to connect as described above.
region_name(string, required) - the name of the AWS region for which to run this query.

Since AWS CloudWatch API queries work per-region, a separate query section is required for each region that is to be collected.

Query-level properties¶

The concept of a metric in AWS CloudWatch is a combination of the three properties Namespace, MetricName, Dimensions.

ListMetrics(ListMetrics, required) - this object specifies which metrics to fetch during Phase 1.

The valid properties of a ListMetrics object are Namespace, MetricName, Dimensions. These properties are passed as-is to the query for ListMetrics and have the same meaning as in the AWS ListMetrics API Call.
MetricNameFilter - Filters the metrics based on a regular expression. e.g.

FlexibleDimensionFilter - In some use-cases the full list of metrics is not needed, rather one would like to take only certain metrics at the highest level of granularity.

For example: one may be interested in the CPUUtilization metric aggregated for the whole region, and also in a granular level for a specific instance. In order to achieve this, one may use the following definition:

ListMetrics:
  Namespace: AWS/EC2
  MetricName: CPUUtilization
FlexibleDimensionFilter:
  InstanceId:
    - i-1234567

In general, FlexibleDimensionFilter will “allow” a metric to be used in a call to GetMetricStatistics if for every dimension name specified ( in the example - “InstanceId”), the metric’s Dimensions property either

does not contain a dimension with this name, or
contains a dimension with this name whose value appears in the list ( in the example - ["i-1234567"] ).
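The rule above can be expressed as a short predicate. This is a sketch of the described behavior, not the collector's own implementation; the function name is hypothetical, and the metric is assumed to have the shape returned by ListMetrics (a Dimensions list of Name/Value pairs).

```python
def allowed_by_flexible_filter(metric, flexible_filter):
    """Apply the FlexibleDimensionFilter rule to one ListMetrics entry."""
    dimensions = {d["Name"]: d["Value"] for d in metric.get("Dimensions", [])}
    for name, allowed_values in flexible_filter.items():
        # A metric lacking this dimension passes; one that has it must match.
        if name in dimensions and dimensions[name] not in allowed_values:
            return False
    return True


flexible_filter = {"InstanceId": ["i-1234567"]}
region_wide = {"Namespace": "AWS/EC2", "MetricName": "CPUUtilization",
               "Dimensions": []}
wanted_instance = {"Namespace": "AWS/EC2", "MetricName": "CPUUtilization",
                   "Dimensions": [{"Name": "InstanceId", "Value": "i-1234567"}]}
other_instance = {"Namespace": "AWS/EC2", "MetricName": "CPUUtilization",
                  "Dimensions": [{"Name": "InstanceId", "Value": "i-7654321"}]}
```

With these inputs, the region-wide metric and the metric for i-1234567 are allowed, while the metric for the other instance is rejected. An empty filter allows every metric.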

GetMetricStatistics(GetMetricStatistics, required) - this object may contain the properties:

Period - the granularity, in seconds, of the time series ( default: 3600 ).
Statistics - which statistics of the time series to get for each interval ( default: [Average] ).

Sample Configuration¶

anodot_api_endpoint: production
anodot_api_token: XXXXXXXXXXXXXXX
collectors:
  pyanodot.collectors.cloudwatch:
    accounts:
      AWSAnodot:
        profile_name: "profile name from ~/.aws/credentials"
        queries:
          instance_example:   # this key is arbitrary
            FlexibleDimensionFilter: {}
            GetMetricStatistics: { Period: 3600 }
            ListMetrics:
              Dimensions:
              - {Name: InstanceId, Value: i-008ff9d2739f78622}
              MetricName: CPUUtilization
              Namespace: AWS/EC2
            extra_params: {test: true}
            ver: 19
        region_name: us-east-1
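To make the nesting of the sample configuration explicit, the sketch below walks the accounts and queries of an already-parsed configuration. It assumes the YAML under pyanodot.collectors.cloudwatch has been loaded into plain Python dictionaries; the helper iter_queries is hypothetical and only illustrates the structure, not how the collector itself traverses it.

```python
def iter_queries(collector_config):
    """Yield (account, profile_name, region_name, query_name, query) tuples."""
    for account_name, account in collector_config["accounts"].items():
        for query_name, query in account.get("queries", {}).items():
            yield (account_name, account["profile_name"],
                   account["region_name"], query_name, query)


# Dictionary equivalent of the relevant part of the sample configuration.
collector_config = {
    "accounts": {
        "AWSAnodot": {
            "profile_name": "anodot-collector",
            "region_name": "us-east-1",
            "queries": {
                "instance_example": {
                    "FlexibleDimensionFilter": {},
                    "GetMetricStatistics": {"Period": 3600},
                    "ListMetrics": {
                        "Namespace": "AWS/EC2",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [{"Name": "InstanceId",
                                        "Value": "i-008ff9d2739f78622"}],
                    },
                },
            },
        },
    },
}

queries = list(iter_queries(collector_config))
```

Each account carries exactly one profile_name and region_name, so collecting a second region means adding another account entry with its own queries.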

Command-line¶

$ anodot-collect.py pyanodot.collectors.cloudwatch -h

usage: anodot-collect.py pyanodot.collectors.cloudwatch [-h] [-s START_TIME] [-e END_TIME] [-f]
                                    [-w WORK_DIR] [-E ENDPOINT] [-V VER] [-a]
                                    [-J] [-d] [-p] [-D API_DELAY]
                                    [-C API_CHUNK]
                                    [--producer-concurrency PRODUCER_CONCURRENCY]
                                    [--anodot-api-concurrency ANODOT_API_CONCURRENCY]

optional arguments:
  -h, --help            show this help message and exit
  -s START_TIME, --start-time START_TIME
                        (default: yesterday 00:00)
  -e END_TIME, --end-time END_TIME
                        End time for the query. Format is "YYYY-MM-DD
                        hh:mm:ss" (default: today 00:00)
  -f, --force-refresh   force refreshing the cached metric lists
  -w WORK_DIR, --work-dir WORK_DIR
                        working directory ( logs saved there )
  -E ENDPOINT, --endpoint ENDPOINT
                        Anodot API endpoint: `poc` or `production`
  -V VER, --ver VER     data version number to send ( unless specified here or
                        in the config, ver=1 )
  -a, --save-anodot-csv
                        write a CSV file with the Anodot format
  -J, --save-anodot-json
                        write a JSON file with the same format being sent to
                        the Anodot API
  -d, --debug           Print verbose debug output
  -p, --production
  -D API_DELAY, --api-delay API_DELAY
                        Anodot API: delay between requests
  -C API_CHUNK, --api-chunk API_CHUNK
                        Anodot API: max. number of metrics to send per request
  --producer-concurrency PRODUCER_CONCURRENCY
                        Number of concurrent processes to use for producers
  --anodot-api-concurrency ANODOT_API_CONCURRENCY
                        Number of concurrent processes to use for sending to
                        anodot

Examples:¶

To test and look into the metrics that were fetched ( using multiprocessing ):

$ anodot-collect.py pyanodot.collectors.cloudwatch -w . --producer-concurrency 20 -a -s '2017-05-01 00:00:00' -e '2017-05-02 00:00:00'

To actually send metrics to Anodot using multiprocessing, with 20 processes on the AWS API side and 5 processes on the Anodot API side:

$ anodot-collect.py pyanodot.collectors.cloudwatch -w . --producer-concurrency 20 --anodot-api-concurrency 5 -p -s '2017-05-01 00:00:00' -e '2017-05-02 00:00:00'
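When the collector is run on a schedule, the same command line can be assembled programmatically. The sketch below builds the argument list for a one-day window using only flags shown in the help output above; the helper name build_command and its default values are assumptions chosen to mirror the second example.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches "YYYY-MM-DD hh:mm:ss"


def build_command(day_start, work_dir=".", producers=20, senders=5):
    """Return the argument list for collecting one full day of data."""
    day_end = day_start + timedelta(days=1)
    return [
        "anodot-collect.py", "pyanodot.collectors.cloudwatch",
        "-w", work_dir,
        "--producer-concurrency", str(producers),
        "--anodot-api-concurrency", str(senders),
        "-p",
        "-s", day_start.strftime(TIME_FORMAT),
        "-e", day_end.strftime(TIME_FORMAT),
    ]


command = build_command(datetime(2017, 5, 1))
```

The resulting list can be handed to subprocess.run(command, check=True), which avoids shell quoting issues with the space inside the timestamps.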

Notes and recommendations¶

Though it varies with the number of managed resources, the cloudwatch collector requires a substantial number of API calls to list, collect and send metrics. It is recommended to use the --producer-concurrency and --anodot-api-concurrency parameters to reduce the run time of the collection process.
AWS CloudWatch has its own data retention policy, and the highest-level granularity may not be available for all historical data. One implication is that, currently, a query with "Period": 60 cannot request a start date more than two weeks in the past.
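A simple pre-flight check can catch a start time that is too old before any API calls are made. The sketch below encodes only the two-week figure stated in the note above as an assumption; the actual retention limits are defined by AWS and may differ, and the helper name is hypothetical. Periods other than 60 are not checked here.

```python
from datetime import datetime, timedelta

# Assumption taken from the note above: about two weeks of 60-second data.
MINUTE_DATA_WINDOW = timedelta(days=14)


def start_time_is_available(start_time, period, now=None):
    """Return False when a Period of 60 is requested too far in the past."""
    if now is None:
        now = datetime.now()
    if period == 60:
        return now - start_time <= MINUTE_DATA_WINDOW
    return True
```

For instance, with now fixed at 2017-05-20, a Period of 60 starting on 2017-05-10 passes the check, while the same query starting on 2017-05-01 does not.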
