google-webmasters - Google Search Console Collector

The google-webmasters collector queries two endpoints of the Google Search Console API and translates the results into metrics in a format suitable for Anodot.

  • Search Console Statistics - See documentation of this API here.

    This API provides the following metrics for each site and grouping in the query result: clicks, impressions, ctr (click-through rate), and position (average position in the SERP). The grouping is determined by the dimensions parameter inside the query block, which is described in the API documentation and under Query-level properties below.

  • URL Crawl Error Counts - See documentation of this API here.

    This API provides the number of errors for each site and grouping. It is currently collected by default and does not require configuration.
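
As a rough illustration of the Search Console Statistics mapping, the sketch below flattens one API response into one record per metric and grouping. The rows, keys, clicks, impressions, ctr and position fields follow the Search Console API response; the function name rows_to_metrics and the output record layout are assumptions made for illustration and are not the collector's actual Anodot format.

```python
def rows_to_metrics(site_url, dimensions, response):
    """Flatten a Search Analytics response into one record per metric and grouping.

    The record layout here is hypothetical, not the collector's real Anodot format.
    """
    metric_names = ("clicks", "impressions", "ctr", "position")
    records = []
    for row in response.get("rows", []):
        # "keys" holds one value per requested dimension, in the same order.
        properties = dict(zip(dimensions, row.get("keys", [])))
        properties["site"] = site_url
        for name in metric_names:
            records.append({"properties": dict(properties, what=name), "value": row[name]})
    return records


# Made-up sample data, shaped like a response for dimensions ["country", "device"].
sample_response = {
    "rows": [
        {"keys": ["usa", "MOBILE"], "clicks": 10, "impressions": 200, "ctr": 0.05, "position": 3.2}
    ]
}
records = rows_to_metrics("https://www.example.com/", ["country", "device"], sample_response)
```

Each grouping (here one country and device pair) yields four records, one per metric, so the number of metrics grows with the number of distinct dimension combinations.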

Account-level properties

  • credentials_uri (string) - a path to a JSON file containing credentials for a service account.

  • credentials_type (string) - should always be "ServiceAccountCredentials".

Query-level properties

Each query in this collector describes data to be fetched from the Search Console Statistics and URL Crawl Error Counts APIs, possibly from multiple sites under the Google Search Console account at once.

  • site_url_patterns (string[]) - a list of regular expressions or substrings to match against site URLs. The query is applied only to sites whose URLs match at least one of the patterns.
  • query (object) - an object that complies with the request format of the Google Search Console API.
    • The properties startDate and endDate are overwritten by the collector based on the CLI arguments.
    • The properties rowLimit and startRow are overwritten and set to 5000 and 0 respectively.
    • Note that this API does not allow paging and limits the result to 5000 rows. In some cases it may be necessary to split a single query into multiple queries.
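
The two query-level behaviours above can be sketched as follows. This is a minimal illustration, not the collector's code: the helper names site_matches and build_request_body are hypothetical, and using re.search to cover both "regular expression" and "substring" matching is an assumption about how the patterns are applied.

```python
import re

ROW_LIMIT = 5000  # value the collector sets, per the list above
START_ROW = 0     # value the collector sets, per the list above


def site_matches(site_url, patterns):
    """Return True if the URL matches at least one pattern."""
    # Assumption: re.search looks for the pattern anywhere in the URL, so a
    # plain substring such as "example" matches as well as a full regex.
    return any(re.search(pattern, site_url) for pattern in patterns)


def build_request_body(query, start_date, end_date):
    """Copy the configured query and apply the overridden properties."""
    body = dict(query)
    body.update(startDate=start_date, endDate=end_date,
                rowLimit=ROW_LIMIT, startRow=START_ROW)
    return body


sites = ["https://www.example.com/", "https://blog.example.org/"]
selected = [s for s in sites if site_matches(s, [r"example\.com"])]
body = build_request_body({"dimensions": ["country"], "rowLimit": 10},
                          "2017-06-01", "2017-06-10")
```

Any rowLimit or startRow written in the config is discarded in this sketch, which mirrors the overriding described above.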

Command-line params

(venv) $ anodot-collect.py google-webmasters -h
usage: anodot-collect.py google-webmasters [-h] [-s START_DATE] [-e END_DATE]
                                           [-w WORK_DIR] [-E ENDPOINT]
                                           [-T ANODOT_API_TOKEN] [-V VER] [-a]
                                           [-J] [-d] [-p] [-D API_DELAY]
                                           [-C API_CHUNK]
                                           [--producer-concurrency PRODUCER_CONCURRENCY]
                                           [--anodot-api-concurrency ANODOT_API_CONCURRENCY]

optional arguments:
  -h, --help            show this help message and exit
  -s START_DATE, --start-date START_DATE
                        first date to collect in format YYYY-MM-DD (default:
                        yesterday). Time zone is UTC.
  -e END_DATE, --end-date END_DATE
                        last date to collect in format YYYY-MM-DD (default:
                        yesterday). Time zone is UTC.
  -w WORK_DIR, --work-dir WORK_DIR
                        working directory ( logs saved there )
  -E ENDPOINT, --endpoint ENDPOINT
                        Anodot API endpoint: `poc` or `production`
  -T ANODOT_API_TOKEN, --anodot-api-token ANODOT_API_TOKEN
                        Anodot API endpoint: `poc` or `production`
  -V VER, --ver VER     data version number to send ( unless specified here or
                        in the config, ver=1 )
  -a, --save-anodot-csv
                        write a CSV file with the Anodot format
  -J, --save-anodot-json
                        write a JSON file with the same format being sent to
                        the Anodot API
  -d, --debug           Print verbose debug output
  -p, --production
  -D API_DELAY, --api-delay API_DELAY
                        Anodot API: delay between requests
  -C API_CHUNK, --api-chunk API_CHUNK
                        Anodot API: max. number of metrics to send per request
  --producer-concurrency PRODUCER_CONCURRENCY
                        Number of concurrent processes to use for producers
  --anodot-api-concurrency ANODOT_API_CONCURRENCY
                        Number of concurrent processes to use for sending to
                        anodot

Sample CLI Invocation

Basic execution for generating CSV files for Anodot:
(venv) $ anodot-collect.py google-webmasters -w my_work_dir -s 2017-06-01 -e 2017-06-10 -a
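
When -s and -e are omitted, the help text says both default to yesterday in UTC. The sketch below shows one way such a default could be resolved; resolve_date_range and its now parameter are hypothetical names used for illustration and are not part of anodot-collect.py.

```python
from datetime import date, datetime, timedelta, timezone


def resolve_date_range(start=None, end=None, now=None):
    """Return (start, end) as dates, defaulting each one to yesterday in UTC."""
    # "now" can be injected for testing; it should be a timezone-aware datetime.
    now = now or datetime.now(timezone.utc)
    yesterday = now.astimezone(timezone.utc).date() - timedelta(days=1)
    start_date = date.fromisoformat(start) if start else yesterday
    end_date = date.fromisoformat(end) if end else yesterday
    return start_date, end_date


explicit = resolve_date_range("2017-06-01", "2017-06-10")
defaulted = resolve_date_range(now=datetime(2017, 6, 11, 0, 30, tzinfo=timezone.utc))
```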

Sample Config File

{
    "anodot_api_token": "XXX_ANODOT_TOKEN_HERE_XXX",
    "collectors": {
        "google-webmasters" : {
            "accounts": {
                "my_account": {
                    "credentials_uri": "/path/to/service-account-key.json",
                    "credentials_type": "ServiceAccountCredentials",
                    "queries": {
                        "all_sites": {
                            "description": "All Sites",
                            "ver": "1",
                            "extra_properties": {
                                "tag": "example"
                            },
                            "site_url_patterns": [".*"],
                            "query": {
                                "dimensions":   [ "country", "device" ]
                            }
                        }
                    }
                }
            }
        }
    }
}

Download this sample here.
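
Because at most 5000 rows are fetched per query, a broad query such as "all_sites" may need to be split. The sketch below generates one config query per device type from a base query by adding a dimensionFilterGroups filter, which is part of the Search Console API request format. The helper name split_query_by_device is hypothetical, and it is an assumption that the collector forwards dimensionFilterGroups to the API unchanged.

```python
import copy
import json


def split_query_by_device(name, base_query, devices=("DESKTOP", "MOBILE", "TABLET")):
    """Return one config query per device, each filtered to that device."""
    queries = {}
    for device in devices:
        entry = copy.deepcopy(base_query)  # leave the base query untouched
        entry["description"] = "{} ({})".format(base_query.get("description", name), device)
        entry.setdefault("query", {})["dimensionFilterGroups"] = [
            {"filters": [{"dimension": "device", "operator": "equals", "expression": device}]}
        ]
        queries["{}_{}".format(name, device.lower())] = entry
    return queries


base = {
    "description": "All Sites",
    "site_url_patterns": [".*"],
    "query": {"dimensions": ["country", "device"]},
}
split = split_query_by_device("all_sites", base)
config_fragment = json.dumps(split, indent=4)  # paste under "queries" in the config
```

Splitting by site_url_patterns instead, with one query per site or group of sites, is another option that needs no filter at all.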

In order to use the config file: