google-webmasters - Google Search Console Collector¶
The google-webmasters collector queries two endpoints of the Google Search Console API and translates the results into metrics in a format suitable for Anodot.
- Search Console Statistics - See documentation of this API here.
  This API provides the following metrics for each site and grouping in the query result: clicks, impressions, ctr (click-through rate), and position (average position in the SERP). The grouping is determined by the dimensions parameter inside the query block. See the API documentation and the sections below.
- URL Crawl Error Counts - See documentation of this API here.
  This API provides the number of errors for each site and grouping. It currently works by default and does not require configuration.
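To make the "site and grouping" idea concrete, here is a minimal sketch of how one Search Console Statistics response could be flattened into one record per grouping and measure. The row shape (a `keys` list plus `clicks`, `impressions`, `ctr`, `position`) follows the Search Analytics response format; the function name `rows_to_metrics` and the output record layout are hypothetical and are not the collector's actual code or Anodot's wire format.

```python
MEASURES = ("clicks", "impressions", "ctr", "position")


def rows_to_metrics(site_url, dimensions, rows):
    """Flatten Search Analytics rows into one record per (grouping, measure).

    Hypothetical helper for illustration only; the record layout is assumed.
    """
    metrics = []
    for row in rows:
        # row["keys"] holds one value per requested dimension, in the same order.
        grouping = dict(zip(dimensions, row.get("keys", [])))
        for measure in MEASURES:
            if measure in row:
                metrics.append({
                    "site": site_url,
                    "dimensions": grouping,
                    "what": measure,
                    "value": row[measure],
                })
    return metrics


if __name__ == "__main__":
    sample_rows = [
        {"keys": ["usa", "MOBILE"], "clicks": 10, "impressions": 200,
         "ctr": 0.05, "position": 3.2},
    ]
    for record in rows_to_metrics("https://www.example.com/",
                                  ["country", "device"], sample_rows):
        print(record)
```

With `dimensions` set to `["country", "device"]`, each response row yields four records, one per measure, all tagged with the same country and device.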
Account-level properties¶
- credentials_uri (string) - a path to a JSON file containing credentials for a service account. See also: Setting up a Google Service Account.
- credentials_type (string) - should always be "ServiceAccountCredentials".
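Before pointing credentials_uri at a file, it can help to confirm that the file looks like a service-account key. The sketch below uses only the standard library and checks for the `type`, `client_email`, and `private_key` fields that Google service-account key files normally contain. The function name `check_service_account_key` is hypothetical; the collector is not known to perform this check itself.

```python
import json

# Fields assumed to be present in a Google service-account key file.
REQUIRED_KEYS = ("type", "client_email", "private_key")


def check_service_account_key(path):
    """Return the service account's e-mail if the key file looks valid."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)
    missing = [name for name in REQUIRED_KEYS if name not in key]
    if missing:
        raise ValueError("missing fields in key file: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("expected type 'service_account', got %r" % key["type"])
    return key["client_email"]
```

The returned e-mail address is the one that needs access to the site in Google Search Console, so printing it is a quick way to see which user to add there.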
Query-level properties¶
Each Query in this collector describes data to be fetched from the Search Console Statistics and URL Crawl Error Counts APIs, possibly from multiple sites under the Google Search Console account at once.
- site_url_patterns (string[]) - a list of regular expressions or substrings to match site URLs. The query is applied only to sites whose URLs match at least one of the patterns.
- query (object) - an object compliant with the request format of the Google Search Console API.
  - The properties startDate and endDate are overwritten by the collector based on the CLI arguments.
  - The properties rowLimit and startRow are overwritten and set to 5000 and 0 respectively.
  - Note that this API does not allow paging and returns at most 5000 rows. In some cases it may be necessary to split a single query into multiple queries.
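The two properties above can be sketched in a few lines of Python. This is an illustration under assumptions, not the collector's source: `matching_sites` and `build_request_body` are hypothetical names, and using `re.search` to cover both regular expressions and plain substrings is one plausible reading of "regular expressions or substrings".

```python
import copy
import re

ROW_LIMIT = 5000  # value the collector is documented to force
START_ROW = 0     # value the collector is documented to force


def matching_sites(site_urls, patterns):
    """Keep the sites whose URL matches at least one pattern.

    re.search matches anywhere in the string, so a plain substring without
    regex metacharacters behaves like a substring test.
    """
    compiled = [re.compile(p) for p in patterns]
    return [url for url in site_urls if any(rx.search(url) for rx in compiled)]


def build_request_body(query, start_date, end_date):
    """Copy the configured query and apply the documented overrides."""
    body = copy.deepcopy(query)
    body["startDate"] = start_date
    body["endDate"] = end_date
    body["rowLimit"] = ROW_LIMIT
    body["startRow"] = START_ROW
    return body


if __name__ == "__main__":
    sites = ["https://www.example.com/", "https://blog.example.org/"]
    print(matching_sites(sites, [r"example\.com"]))
    print(build_request_body({"dimensions": ["country", "device"]},
                             "2017-06-01", "2017-06-10"))
```

Any startDate, endDate, rowLimit, or startRow values written in the config are replaced in the copy, which mirrors the overwrite behavior described above while leaving the configured query untouched.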
Command-line params¶
(venv) $ anodot-collect.py google-webmasters -h
usage: anodot-collect.py google-webmasters [-h] [-s START_DATE] [-e END_DATE]
                                           [-w WORK_DIR] [-E ENDPOINT]
                                           [-T ANODOT_API_TOKEN] [-V VER] [-a]
                                           [-J] [-d] [-p] [-D API_DELAY]
                                           [-C API_CHUNK]
                                           [--producer-concurrency PRODUCER_CONCURRENCY]
                                           [--anodot-api-concurrency ANODOT_API_CONCURRENCY]

optional arguments:
  -h, --help            show this help message and exit
  -s START_DATE, --start-date START_DATE
                        first date to collect in format YYYY-MM-DD (default:
                        yesterday). Time zone is UTC.
  -e END_DATE, --end-date END_DATE
                        last date to collect in format YYYY-MM-DD (default:
                        yesterday). Time zone is UTC.
  -w WORK_DIR, --work-dir WORK_DIR
                        working directory ( logs saved there )
  -E ENDPOINT, --endpoint ENDPOINT
                        Anodot API endpoint: `poc` or `production`
  -T ANODOT_API_TOKEN, --anodot-api-token ANODOT_API_TOKEN
                        Anodot API endpoint: `poc` or `production`
  -V VER, --ver VER     data version number to send ( unless specified here or
                        in the config, ver=1 )
  -a, --save-anodot-csv
                        write a CSV file with the Anodot format
  -J, --save-anodot-json
                        write a JSON file with the same format being sent to
                        the Anodot API
  -d, --debug           Print verbose debug output
  -p, --production
  -D API_DELAY, --api-delay API_DELAY
                        Anodot API: delay between requests
  -C API_CHUNK, --api-chunk API_CHUNK
                        Anodot API: max. number of metrics to send per request
  --producer-concurrency PRODUCER_CONCURRENCY
                        Number of concurrent processes to use for producers
  --anodot-api-concurrency ANODOT_API_CONCURRENCY
                        Number of concurrent processes to use for sending to
                        anodot
Sample CLI Invocation¶
(venv) $ anodot-collect.py google-webmasters -w my_work_dir -s 2017-06-01 -e 2017-06-10 -a
Sample Config File¶
{
  "anodot_api_token": "XXX_ANODOT_TOKEN_HERE_XXX",
  "collectors": {
    "google-webmasters": {
      "accounts": {
        "my_account": {
          "credentials_uri": "/path/to/service-account-key.json",
          "credentials_type": "ServiceAccountCredentials",
          "queries": {
            "all_sites": {
              "description": "All Sites",
              "ver": "1",
              "extra_properties": {
                "tag": "example"
              },
              "site_url_patterns": [".*"],
              "query": {
                "dimensions": ["country", "device"]
              }
            }
          }
        }
      }
    }
  }
}
Download this sample here.
In order to use the config file:
- Replace the credentials_uri field with a path to the JSON key file you downloaded according to Setting up a Google Service Account.
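If a query such as all_sites in the sample config runs into the 5000-row limit mentioned under Query-level properties, one way to split it is to generate one query per device type, each with a device filter. The sketch below builds the extra entries for the "queries" object. It assumes the collector passes `dimensionFilterGroups` through to the Search Console API unchanged, which the text implies but does not state; the function name `split_by_device` and the generated query names are hypothetical.

```python
import copy
import json

# Device values accepted by the Search Analytics "device" dimension.
DEVICES = ("DESKTOP", "MOBILE", "TABLET")


def split_by_device(name, query_entry):
    """Return one copy of query_entry per device, each filtered to that device."""
    split = {}
    for device in DEVICES:
        entry = copy.deepcopy(query_entry)
        entry["description"] = "%s (%s)" % (
            query_entry.get("description", name), device)
        query = entry.setdefault("query", {})
        # Append a filter group so any filters already configured are kept.
        query.setdefault("dimensionFilterGroups", []).append({
            "filters": [{
                "dimension": "device",
                "operator": "equals",
                "expression": device,
            }]
        })
        split["%s_%s" % (name, device.lower())] = entry
    return split


if __name__ == "__main__":
    all_sites = {
        "description": "All Sites",
        "ver": "1",
        "site_url_patterns": [".*"],
        "query": {"dimensions": ["country", "device"]},
    }
    print(json.dumps(split_by_device("all_sites", all_sites), indent=2))
```

The printed JSON can be pasted into the "queries" object in place of the single all_sites entry, so that each of the three queries gets its own 5000-row budget.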