Historical measurements - Request report

POST {baseUrl}/v2/report-requests
Request a report of measurements over a time range. Returns the new report record.

Three reports are available:

Datasource measurements - by datasource id or for all datasources in the org
Node measurements - by Node id
Reference site measurements - by reference site id

Note: This is an expensive operation. There is a limit of 30 historical measurement reports per organization per day.

Example
POST https://clarity-data-api.clarity.io/v2/report-requests

Headers

x-api-key: the API key string.

Request

Common Parameters

json body parameter	type	description
`org`	string required	The organization for this request.
`report`	string required	One of: `"datasource-measurements"`, `"node-measurements"`, or `"reference-site-measurements"`.
`outputFrequency`	string required	One of `"day"`, `"hour"`, or `"minute"`.
`startTime`	string required	A date in ISO 8601 format, e.g. `"2023-01-02T03:45:56.899Z"`. Measurements returned are on or after this time.
`endTime`	string required	A date in ISO 8601 format, e.g. `"2023-01-02T03:45:56.899Z"`. Measurements returned are before this time.
`metricLabelStyle`	string	One of `"canonical"` (default), `"english"`, `"legacy"`. See below.
`fileFormat`	string	One of `"csv-wide"` (shorthand `"csv"`), which is the default, or `"parquet-wide"` (shorthand `"parquet"`). See Apache Parquet Format.
`metricSelect`	string	Limits the metrics returned in the measurement. See Metrics selection.
`qcAssessment`	boolean	When `true`, adds the field `qcAssessment` to each metric in the response. Default `false`. Values are `valid` and `invalid` but may be expanded in the future.
`qcFlags`	boolean	When `true`, adds the field `qcFlags` to each metric in the response. Default `false`. See QC Flags.
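
Because `startTime` is inclusive and `endTime` is exclusive, consecutive time windows can share a boundary without duplicating measurements. The following is a minimal sketch of building such a window in Python. The helper names `to_iso_utc` and `day_window` are illustrative only and are not part of the API.

```python
from datetime import date, datetime, timedelta, timezone


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC, millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def day_window(day: date, days: int = 1) -> dict:
    """Return startTime/endTime covering `days` whole UTC days starting at `day`."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=days)  # exclusive upper bound
    return {"startTime": to_iso_utc(start), "endTime": to_iso_utc(end)}


window = day_window(date(2023, 7, 1))
print(window)
```

The returned dictionary can be merged into a request body alongside `org`, `report`, and `outputFrequency`.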

Metric Label Style

Metrics can be identified in one of these ways:

metricLabelStyle	example / when to use
canonical	`pm2_5ConcMass1HourMean.raw` This is the recommended label style to use for your scripts and automation. See metrics dictionary.
english	`PM2.5 mass concentration, 1-hour mean raw` Human readable. Labels can have embedded spaces and commas, which make them less suitable for automation but fine for a person opening a downloaded CSV file. We reserve the possibility of improving the english representation over time, so please don't use these labels as identifiers in your code.
legacy	`pm2_5ConcMass.raw` These can be ambiguous -- the same label will mean different things depending on the requested output frequency. Please only use these if you are trying to maintain compatibility with legacy v1 endpoints.
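
To illustrate why `english` labels are awkward for automation: a label with an embedded comma breaks any code that splits a CSV header line on commas, and is only recovered intact by a real CSV parser when the field is quoted. The header row below is made up for illustration; the `time` column and the quoting are assumptions, not the documented file layout.

```python
import csv
import io

# Hypothetical header row; the real report layout may differ.
header_line = 'time,"PM2.5 mass concentration, 1-hour mean raw"\n'

# Naive splitting cuts the english label at its embedded comma.
naive_columns = header_line.strip().split(",")

# The csv module respects the quotes and keeps the label whole.
parsed_columns = next(csv.reader(io.StringIO(header_line)))

print(len(naive_columns))
print(parsed_columns)
```

With `canonical` labels such as `pm2_5ConcMass1HourMean.raw`, neither approach is affected, which is one reason that style is the safer choice for scripts.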

Datasource Parameters

json body parameter	type	description
`allDatasources`	boolean	Queries all the datasources for the `org` when `allDatasources` is `true`. Default: `false`.
`datasourceIds`	array	Required when `allDatasources` is missing or `false`. A json array of strings representing datasource ids. e.g. `["DZJSI2879", "DDRUK4341"]`

Node Parameters

json body parameter	type	description
`nodeIds`	array required	A json array of strings representing device ids. e.g. `["AFQNN3RH", "ATVS3KVL"]`

Reference Site Parameters

json body parameter	type	description
`refSiteIds`	array required	A json array of strings representing reference site ids. e.g. `["RWKQH7W4", "RR4BJSJW"]`
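
Since each report is limited per day, it can be worth checking a request body locally before sending it. The sketch below encodes only the rules stated in the parameter tables above. The function name `check_report_request` and the wording of its messages are illustrative, and the server may apply further validation that is not modeled here.

```python
# Which id list each report type requires, per the tables above.
REPORT_ID_FIELDS = {
    "datasource-measurements": "datasourceIds",
    "node-measurements": "nodeIds",
    "reference-site-measurements": "refSiteIds",
}
OUTPUT_FREQUENCIES = {"day", "hour", "minute"}
REQUIRED_FIELDS = ("org", "report", "outputFrequency", "startTime", "endTime")


def check_report_request(body: dict) -> list:
    """Return a list of problems found in a request body (empty if none)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in body]
    report = body.get("report")
    if report is not None and report not in REPORT_ID_FIELDS:
        problems.append(f"unknown report: {report}")
    frequency = body.get("outputFrequency")
    if frequency is not None and frequency not in OUTPUT_FREQUENCIES:
        problems.append(f"unknown outputFrequency: {frequency}")
    id_field = REPORT_ID_FIELDS.get(report)
    # allDatasources applies only to datasource reports and replaces the id list.
    all_datasources = (report == "datasource-measurements"
                       and body.get("allDatasources") is True)
    if id_field and not all_datasources and not body.get(id_field):
        problems.append(f"missing required field: {id_field}")
    return problems


print(check_report_request({
    "org": "MyOrg1234",
    "report": "node-measurements",
    "outputFrequency": "day",
    "startTime": "2022-06-01T00:00:00.000Z",
    "endTime": "2023-06-02T00:00:00.000Z",
}))  # reports that nodeIds is missing
```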

Example Requests

Datasource Measurements

By Id

{
    "org": "MyOrg1234",
    "report": "datasource-measurements",
    "datasourceIds": [
        "DZJSI2879", "DDRUK4341"
    ],
    "outputFrequency": "day",
    "startTime": "2022-06-01T00:00:00.000Z",
    "endTime": "2023-06-02T00:00:00.000Z",
    "qcAssessment": true,
    "qcFlags": true
}

All in Org

{
    "org": "MyOrg1234",
    "report": "datasource-measurements",
    "allDatasources": true,
    "outputFrequency": "day",
    "startTime": "2022-06-01",
    "endTime": "2023-06-02",
    "fileFormat": "parquet"
}

Node Measurements

{
    "org": "MyOrg1234",
    "report": "node-measurements",
    "nodeIds": [
        "AZ123456", "AA654321"
    ],
    "outputFrequency": "day",
    "startTime": "2022-06-01T00:00:00.000Z",
    "endTime": "2023-06-02T00:00:00.000Z",
    "metricLabelStyle": "english"
}

Reference Site Measurements

{
    "org": "MyOrg1234",
    "report": "reference-site-measurements",
    "refSiteIds": [
        "R47BCX7N", "RHGH8V3N"
    ],
    "outputFrequency": "day",
    "startTime": "2022-06-01T00:00:00.000Z",
    "endTime": "2023-06-02T00:00:00.000Z"
}

Response

If successful, returns the report request, including its status and the parameters submitted. See GET /v2/report-requests/:reportId for details.

If the request limit has been exceeded, an HTTP 429 Too Many Requests is returned. The throttle is reset at UTC midnight.
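
Because the throttle resets at UTC midnight, a client that receives a 429 can work out how long to wait before retrying. The following is a rough sketch; the helper names `seconds_until_utc_midnight` and `retry_delay` are illustrative, and the code assumes the reset happens exactly at 00:00 UTC.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now: datetime) -> float:
    """Seconds from `now` (a timezone-aware datetime) until the next UTC midnight."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return (next_midnight - now_utc).total_seconds()


def retry_delay(status_code: int, now: datetime):
    """Return how long to wait before re-sending the POST, or None if no wait applies."""
    if status_code == 429:
        return seconds_until_utc_midnight(now)
    return None


# Half an hour before UTC midnight, the wait is 1800 seconds.
print(retry_delay(429, datetime(2023, 7, 1, 23, 30, tzinfo=timezone.utc)))
```

In a real client, `now` would be `datetime.now(timezone.utc)` and `status_code` would come from the response to the POST.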

Example 200 Response

{
    "reportId": "JB78VV46E8",
    "reportStatus": "in-progress",
    "message": "in-progress",
    "report": "datasource-measurements",
    "urls": [],
    "query": {
        "datasourceIds": [
            "DAQSF4092"
        ],
        "endTime": "2023-06-02T00:00:00.000Z",
        "outputFrequency": "hour",
        "startTime": "2022-06-01T00:00:00.000Z",
        "columnLabelStyle": "canonical",
        "fileFormat": "csv"
    }
}
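
A client usually only needs `reportStatus` and `urls` from this record to decide what to do next. The sketch below shows one way to make that decision. Only the `in-progress` and `succeeded` statuses appear on this page, so treating every other status as terminal is an assumption, and `next_step` is an illustrative name.

```python
def next_step(report: dict) -> str:
    """Map a report record to 'wait', 'download', or 'stop'."""
    status = report.get("reportStatus")
    if status == "in-progress":
        return "wait"
    if status == "succeeded" and report.get("urls"):
        return "download"
    # Assumption: any other status is terminal; inspect report.get("message").
    return "stop"


example = {
    "reportId": "JB78VV46E8",
    "reportStatus": "in-progress",
    "message": "in-progress",
    "urls": [],
}
print(next_step(example))
```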

How to request and retrieve reports

Running reports is a multi-step process that may take minutes to complete. The process involves:

POST the report request to /v2/report-requests and store the reportId.
GET the report status from /v2/report-requests/:reportId until it is no longer in-progress.
If the report status is succeeded, GET each of the files listed in the urls of the report.

The following sample Python code requests a report, polls for its completion, and fetches the output files.

import os
from time import sleep

import requests

clarity_api_key = os.environ.get('MY_CLARITY_API_KEY')  # put your key in the environment or directly here
clarity_api_base_url = 'https://clarity-data-api.clarity.io'

def request_and_fetch_a_report():
    headers = {"x-api-key": clarity_api_key}

    # request the report
    result = requests.post(url=clarity_api_base_url + "/v2/report-requests",
                           headers=headers,
                           json={
                               "org": "MyOrg1234",
                               "report": "datasource-measurements",
                               "datasourceIds": ["DAQSF4092", "DQCXA3527"],
                               "outputFrequency": "hour",
                               "startTime": "2023-07-01T01:00:00.000Z",
                               "endTime": "2023-07-02T01:00:00.000Z",
                           })
    result.raise_for_status()
    result_json = result.json()
    reportId = result_json['reportId']

    # poll for its completion
    for i in range(12):
        print("sleeping 1 minute")
        sleep(60)
        print("fetching report status ... ", end="")
        statusUrl = clarity_api_base_url + f"/v2/report-requests/{reportId}"
        result = requests.get(url=statusUrl, headers=headers)
        result.raise_for_status()
        result_json = result.json()
        print(result_json.get("reportStatus"))
        if result_json.get("reportStatus") != 'in-progress':
            break

    print(result_json)

    if result_json.get("reportStatus") == 'succeeded':
        # if it succeeded, fetch the resulting files
        for i, url in enumerate(result_json['urls']):
            with requests.get(url=url, stream=True) as result:
                result.raise_for_status()
                filename = f"extract_{i}.csv"
                # stream to disk as raw bytes, so no text decoding is needed
                with open(filename, "wb") as f:
                    for chunk in result.iter_content(1024 * 1024):
                        f.write(chunk)


if __name__ == "__main__":
    request_and_fetch_a_report()
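
The sample above names every output file `extract_{i}.csv`, which is misleading when the report was requested with `"fileFormat": "parquet"`. One possible refinement is to pick the extension from the `fileFormat` echoed in the report's `query`, as in the sketch below. The helper `output_filename` and the `.bin` fallback are illustrative choices, and the code assumes `query.fileFormat` is present as in the example response.

```python
# Map each documented fileFormat value (and its shorthand) to an extension.
FILE_EXTENSIONS = {
    "csv": ".csv",
    "csv-wide": ".csv",
    "parquet": ".parquet",
    "parquet-wide": ".parquet",
}


def output_filename(report: dict, index: int) -> str:
    """Build a local filename for the index-th output file of a report."""
    # csv is the documented default when fileFormat was not specified.
    file_format = report.get("query", {}).get("fileFormat", "csv")
    # Assumption: unknown formats get a neutral extension.
    extension = FILE_EXTENSIONS.get(file_format, ".bin")
    return f"extract_{index}{extension}"


print(output_filename({"query": {"fileFormat": "parquet"}}, 0))
```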