pyanodot.collectors.cloudwatch - Amazon CloudWatch Collector¶
Overview¶
The pyanodot.collectors.cloudwatch collector uses two types of API calls provided by AWS CloudWatch:
- ListMetrics - Provides a list of the available metrics (names of time-series).
- GetMetricStatistics - Provides the time-series data for a given time-series and time range.
The collector provides a thin wrapper around these API calls and works as follows:
- Phase 1.
  The collector calls the ListMetrics API and caches the list of metrics in a JSON file inside the work directory. This phase is slow, so it runs only when there is no cached metric list or when forced by the -f/--force-refresh flag.
- Phase 2.
  The collector reads the list of metrics to collect from the cached file and calls the GetMetricStatistics API for each one. The resulting data is sent to Anodot and/or saved locally.
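The two phases can be sketched roughly as follows. This is a minimal illustration, not the collector's actual code: the function names, the cache file name `metrics_cache.json`, and the `list_metrics` / `get_statistics` callables (which stand in for the two CloudWatch API calls) are all assumptions made for the example.

```python
import json
import os


def load_metric_list(work_dir, list_metrics, force_refresh=False,
                     cache_name="metrics_cache.json"):
    """Phase 1: return the metric list, calling list_metrics() only when needed.

    list_metrics is a hypothetical callable standing in for the slow
    ListMetrics API call; cache_name is an assumed file name.
    """
    cache_path = os.path.join(work_dir, cache_name)
    if force_refresh or not os.path.exists(cache_path):
        metrics = list_metrics()
        with open(cache_path, "w") as fh:
            json.dump(metrics, fh)
        return metrics
    with open(cache_path) as fh:
        return json.load(fh)


def collect(work_dir, list_metrics, get_statistics, force_refresh=False):
    """Phase 2: fetch the time-series data for every cached metric.

    get_statistics is a hypothetical callable standing in for the
    GetMetricStatistics API call.
    """
    metrics = load_metric_list(work_dir, list_metrics, force_refresh)
    return [get_statistics(metric) for metric in metrics]
```

Passing the API calls in as callables keeps the sketch independent of any AWS client library; in practice they would wrap the real CloudWatch requests.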
Basic Setup¶
Create an AWS user, or choose an existing one, with an existing key-pair.
Grant read-only permissions to the user as described here.
On the machine that runs the collector, log in as the user that will run the collector. Create the directory ~/.aws and the file ~/.aws/credentials, and add an entry for the key-pair to that file, where ~ is the user's home directory.
For example, the following section (in ~/.aws/credentials) declares a profile named anodot-collector with the key-pair details:
[anodot-collector]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = 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
Please refer to the Boto 3 docs for a complete explanation.
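Because the credentials file is INI-style, a quick sanity check of a profile entry can be sketched with the standard library alone. The helper name `profile_has_key_pair` is hypothetical and is not part of pyanodot or Boto 3; the key values below are placeholders.

```python
import configparser


def profile_has_key_pair(credentials_text, profile_name):
    """Return True if the profile section declares both key-pair fields."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile_name):
        return False
    section = parser[profile_name]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section


# Placeholder contents of ~/.aws/credentials (not real keys).
example_credentials = """
[anodot-collector]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
"""
```

Calling `profile_has_key_pair(example_credentials, "anodot-collector")` checks only that the entry is present and complete; it does not verify that the keys are valid.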
Configuration¶
Account-level properties¶
- profile_name (string, required) - the name of the profile to use to connect, as described above.
- region_name (string, required) - the name of the AWS region for which to run this query.
Since AWS CloudWatch API queries work per-region, a separate query section is required for each region that is to be collected.
Query-level properties¶
The concept of a metric in AWS CloudWatch is a combination of the three properties Namespace, MetricName, Dimensions.
- ListMetrics (ListMetrics, required) - this object specifies which metrics to fetch during Phase 1.
  The valid properties of a ListMetrics object are Namespace, MetricName, Dimensions. These properties are passed as-is to the ListMetrics query and have the same meaning as in the AWS ListMetrics API call.
- MetricNameFilter - Filters the metrics based on a regular expression.
- FlexibleDimensionFilter - In some use cases the full list of metrics is not needed; rather, one would like to take only certain metrics at the highest level of granularity.
  For example: one may be interested in the CPUUtilization metric aggregated for the whole region, and also at a granular level for a specific instance. To achieve this, one may use the following definition:
ListMetrics:
  Namespace: AWS/EC2
  MetricName: CPUUtilization
FlexibleDimensionFilter:
  InstanceId:
    - i-1234567
In general, FlexibleDimensionFilter will “allow” a metric to be used in a call to GetMetricStatistics if, for every dimension name specified (in the example, “InstanceId”), the metric’s Dimensions property either
- does not contain a dimension with this name, or
- contains a dimension with this name whose value appears in the list (in the example, ["i-1234567"]).
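The rule above can be expressed as a small predicate. This is a sketch of the described behaviour, not the collector's own implementation: the function name `flexible_dimension_allows` is hypothetical, and the metric is assumed to be a dict shaped like an entry returned by ListMetrics, with a `Dimensions` list of `{"Name": ..., "Value": ...}` items.

```python
def flexible_dimension_allows(metric, flexible_filter):
    """Return True if the metric passes the FlexibleDimensionFilter rule.

    metric: dict with a "Dimensions" list of {"Name": ..., "Value": ...} dicts.
    flexible_filter: dict mapping a dimension name to its list of allowed values.
    """
    dimensions = {d["Name"]: d["Value"] for d in metric.get("Dimensions", [])}
    for name, allowed_values in flexible_filter.items():
        # A metric without this dimension passes; otherwise its value must be listed.
        if name in dimensions and dimensions[name] not in allowed_values:
            return False
    return True


example_filter = {"InstanceId": ["i-1234567"]}
region_wide = {"MetricName": "CPUUtilization", "Dimensions": []}
listed_instance = {
    "MetricName": "CPUUtilization",
    "Dimensions": [{"Name": "InstanceId", "Value": "i-1234567"}],
}
other_instance = {
    "MetricName": "CPUUtilization",
    "Dimensions": [{"Name": "InstanceId", "Value": "i-7654321"}],
}
```

With `example_filter`, the region-wide metric and the listed instance pass, while `other_instance` is rejected. An empty filter (`{}`) lets every metric through.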
- GetMetricStatistics (GetMetricStatistics, required) - this object may contain the properties:
  - Period - the granularity, in seconds, of the time series (default: 3600).
  - Statistics - which statistics of the time series to get for each interval (default: [Average]).
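As a rough illustration of how these defaults might combine with a cached metric, the sketch below builds the keyword arguments for a GetMetricStatistics request. The helper name `build_statistics_request` is hypothetical, and how the collector actually assembles its request is an assumption; the keys themselves (Namespace, MetricName, Dimensions, StartTime, EndTime, Period, Statistics) are the parameter names of the CloudWatch API call.

```python
from datetime import datetime


def build_statistics_request(metric, start_time, end_time, query_options=None):
    """Merge a metric description with the query's GetMetricStatistics options.

    Period and Statistics fall back to the documented defaults
    (3600 seconds and ["Average"]) when the query does not set them.
    """
    options = dict(query_options or {})
    return {
        "Namespace": metric["Namespace"],
        "MetricName": metric["MetricName"],
        "Dimensions": metric.get("Dimensions", []),
        "StartTime": start_time,
        "EndTime": end_time,
        "Period": options.get("Period", 3600),
        "Statistics": options.get("Statistics", ["Average"]),
    }


example_request = build_statistics_request(
    {"Namespace": "AWS/EC2", "MetricName": "CPUUtilization"},
    datetime(2017, 5, 1),
    datetime(2017, 5, 2),
    {"Period": 60},
)
```

With Boto 3, a dictionary like this could be passed as `client.get_metric_statistics(**example_request)` on a CloudWatch client.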
Sample Configuration¶
anodot_api_endpoint: production
anodot_api_token: XXXXXXXXXXXXXXX
collectors:
  pyanodot.collectors.cloudwatch:
    accounts:
      AWSAnodot:
        profile_name: "profile name from ~/.aws/credentials"
        queries:
          instance_example: # this key is arbitrary
            FlexibleDimensionFilter: {}
            GetMetricStatistics: { Period: 3600 }
            ListMetrics:
              Dimensions:
              - {Name: InstanceId, Value: i-008ff9d2739f78622}
              MetricName: CPUUtilization
              Namespace: AWS/EC2
            extra_params: {test: true}
            ver: 19
        region_name: us-east-1
Command-line¶
$ anodot-collect.py pyanodot.collectors.cloudwatch -h
usage: anodot-collect.py pyanodot.collectors.cloudwatch [-h] [-s START_TIME] [-e END_TIME] [-f]
                                                        [-w WORK_DIR] [-E ENDPOINT] [-V VER] [-a]
                                                        [-J] [-d] [-p] [-D API_DELAY]
                                                        [-C API_CHUNK]
                                                        [--producer-concurrency PRODUCER_CONCURRENCY]
                                                        [--anodot-api-concurrency ANODOT_API_CONCURRENCY]

optional arguments:
  -h, --help            show this help message and exit
  -s START_TIME, --start-time START_TIME
                        (default: yesterday 00:00)
  -e END_TIME, --end-time END_TIME
                        End time for the query. Format is "YYYY-MM-DD
                        hh:mm:ss" (default: today 00:00)
  -f, --force-refresh   force refreshing the cached metric lists
  -w WORK_DIR, --work-dir WORK_DIR
                        working directory ( logs saved there )
  -E ENDPOINT, --endpoint ENDPOINT
                        Anodot API endpoint: `poc` or `production`
  -V VER, --ver VER     data version number to send ( unless specified here or
                        in the config, ver=1 )
  -a, --save-anodot-csv
                        write a CSV file with the Anodot format
  -J, --save-anodot-json
                        write a JSON file with the same format being sent to
                        the Anodot API
  -d, --debug           Print verbose debug output
  -p, --production
  -D API_DELAY, --api-delay API_DELAY
                        Anodot API: delay between requests
  -C API_CHUNK, --api-chunk API_CHUNK
                        Anodot API: max. number of metrics to send per request
  --producer-concurrency PRODUCER_CONCURRENCY
                        Number of concurrent processes to use for producers
  --anodot-api-concurrency ANODOT_API_CONCURRENCY
                        Number of concurrent processes to use for sending to
                        anodot
Examples:¶
- To test and look into the metrics that were fetched ( using multiprocessing ):
$ anodot-collect.py pyanodot.collectors.cloudwatch -w . --producer-concurrency 20 -a -s '2017-05-01 00:00:00' -e '2017-05-02 00:00:00'
- To actually send metrics to Anodot using multiprocessing, with 20 processes on the AWS API side and 5 processes on the Anodot API side:
$ anodot-collect.py pyanodot.collectors.cloudwatch -w . --producer-concurrency 20 --anodot-api-concurrency 5 -p -s '2017-05-01 00:00:00' -e '2017-05-02 00:00:00'
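The -s and -e values used in these examples follow the "YYYY-MM-DD hh:mm:ss" format shown in the help text, with defaults of yesterday 00:00 and today 00:00. A minimal sketch of that resolution logic is shown below; the function name `resolve_time_range` is hypothetical, and whether the collector interprets the times as local time or UTC is not stated here, so the sketch uses naive datetimes.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # "YYYY-MM-DD hh:mm:ss"


def resolve_time_range(start=None, end=None, now=None):
    """Return (start_time, end_time), applying the documented defaults.

    start and end are optional strings in TIME_FORMAT. A missing start
    defaults to yesterday 00:00 and a missing end to today 00:00.
    """
    now = now or datetime.now()
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start_time = datetime.strptime(start, TIME_FORMAT) if start else today - timedelta(days=1)
    end_time = datetime.strptime(end, TIME_FORMAT) if end else today
    if start_time >= end_time:
        raise ValueError("start time must be earlier than end time")
    return start_time, end_time
```

For instance, `resolve_time_range('2017-05-01 00:00:00', '2017-05-02 00:00:00')` yields the one-day window used in the commands above.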
Notes and recommendations¶
- Though it varies according to the number of managed resources, the cloudwatch collector requires a substantial number of API calls to list, collect and send metrics. It is recommended to use the --producer-concurrency and --anodot-api-concurrency parameters to reduce the run time of the collection process.
- AWS CloudWatch has its own data retention policy, and the highest level of granularity may not be available for all historical data. One implication is that, currently, a query with "Period": 60 cannot request a start date more than two weeks in the past.
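A pre-flight check for the retention limit could look like the sketch below. The two-week figure is taken from the note above rather than from AWS documentation, the function name `start_time_is_available` is hypothetical, and only the 60-second case is modelled; other periods are simply treated as unrestricted.

```python
from datetime import datetime, timedelta

# Assumed limit, taken from the note above: 60-second data reaches back about two weeks.
MAX_AGE_FOR_60S_PERIOD = timedelta(weeks=2)


def start_time_is_available(start_time, period, now):
    """Return False when a 60-second query starts further back than the assumed limit."""
    if period == 60:
        return now - start_time <= MAX_AGE_FOR_60S_PERIOD
    # Coarser periods are not modelled in this sketch.
    return True
```

Running such a check before Phase 2 would let a run fail fast instead of issuing requests for a window that has no data at that granularity.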