Conversation Detail jobs
Conversation detail record jobs are an alternative means of accessing the same data behind a Conversation Detail Record query, but in bulk. The data behind both is the same, and the vast majority of the request/response payload is also the same.
Which endpoint do I want: 'query' or 'jobs'?
Depending on your needs, one endpoint might stand out as the obvious choice for you. It is also possible that you will need to employ both and stitch the results together. Keep this in mind as you read on.
Query
The Conversation Detail Record query endpoint is intended for users who need the most up-to-date data and need a response to their query right now. That speed comes with limitations, though: you can only query for data over shorter intervals, and you get fewer results per page. This endpoint is not intended for bulk export workloads.
Differences to conversation detail queries
There are small differences between conversation detail queries and jobs:
- Interval: Conversation detail queries will only include conversations that started on a day (UTC) touched by the interval. Conversation detail jobs will by default include all conversations whose lifetime overlaps with the query interval. If you want the same behavior as for queries, you should set the startOfDayIntervalMatching flag in the query to true.
- Email dimensions: Jobs can match on partial email addresses, e.g. name@domain.c will match a record for name@domain.com. Matching for queries is more restrictive: the dimension value has to match either the full email address or the full name or domain part of the email address, e.g. domain.com will match the record for name@domain.com (jobs also support this type of matching).
- Participant attributes: Job results contain participant attributes for each conversation. The keys and values of the attributes are truncated to 1024 characters, but there is no limit in analytics on the number of attributes.
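As a sketch of the interval-matching difference, the following Python builds a minimal job request body. The build_job_body helper is hypothetical; only the interval and startOfDayIntervalMatching fields come from the documentation above.

```python
# Sketch: opt a job into query-style interval matching.
# build_job_body is a hypothetical convenience helper, not part of the API.
def build_job_body(interval, start_of_day_matching=False):
    """Build a conversation detail job request body."""
    body = {"interval": interval}
    if start_of_day_matching:
        # Match only conversations that *started* on a UTC day touched by
        # the interval, mirroring the behavior of the 'query' endpoint.
        body["startOfDayIntervalMatching"] = True
    return body

body = build_job_body("2019-01-01T00:00:00.000Z/2019-07-01T00:00:00.000Z",
                      start_of_day_matching=True)
```

Omit the flag (the default) to include every conversation whose lifetime overlaps the interval.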
Jobs
The conversation detail record job endpoint is intended for users who are less latency-sensitive. This commonly applies to users building data-export-style integrations (e.g. scrolling through all detail record data to add incremental copies into your own database or warehouse).
If you need more bulk-friendly data access to conversation detail record data, this endpoint allows you to query across more data and to get more data back at a time (e.g. query for wider time intervals, retrieve larger page sizes, etc). As a bonus, you also have the same query structure so swapping workloads from one endpoint to the other should be reasonably straightforward.
An important consideration is that the jobs-backed data is not continuously updated in real time, so depending on when you query, you may not see up-to-the-moment data. While your job might execute and have results ready for you in seconds, the data it can search and return may lag real time by anywhere from a few hours to a full day.
Job lifecycle
- Submit a query to create a job (HTTP POST /api/v2/analytics/conversations/details/jobs). A jobId will be returned; hang onto this jobId.
- Armed with your jobId, periodically poll for the status of your job (HTTP GET /api/v2/analytics/conversations/details/jobs/{jobId}). Is your job still running? Did it fail? Has it successfully completed gathering all of your data? Depending on load and the volume of data being queried, it might be on the order of seconds to minutes before you see your job complete.
- If and only if your job has successfully completed, it is time to retrieve the results. At this point, you can ask for the first page of data (HTTP GET /api/v2/analytics/conversations/details/jobs/{jobId}/results). Alongside the results of your query, you will find a cursor. This is what you use to advance to the next page of data (that is, this is an iterator, not random-access/numbered-page-style access). Pass that cursor as a query parameter on the URL to advance to the next page of results. Each page will have a unique cursor to advance you forward. If there is no cursor in the page response, there is no data beyond this page.
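The submit/poll/page lifecycle can be sketched in Python. Here post_json and get_json are hypothetical stand-ins for your authenticated HTTP client (e.g. the requests library plus an OAuth token); only the endpoint paths and job states come from this article.

```python
# Sketch of the job lifecycle: submit, poll until FULFILLED, page via cursor.
# post_json(url, body) and get_json(url, params=None) are assumed wrappers
# around your HTTP client that return parsed JSON.
import time

BASE = "https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs"

def run_detail_job(query, post_json, get_json, poll_seconds=5):
    """Submit a job, wait for it to complete, then yield each page of conversations."""
    job_id = post_json(BASE, query)["jobId"]

    # Poll for completion; QUEUED and PENDING mean "keep waiting".
    while True:
        state = get_json(f"{BASE}/{job_id}")["state"]
        if state == "FULFILLED":
            break
        if state in ("FAILED", "CANCELLED", "EXPIRED"):
            raise RuntimeError(f"job {job_id} ended in state {state}")
        time.sleep(poll_seconds)

    # Iterate pages: each page's cursor fetches the next; no cursor means done.
    cursor = None
    while True:
        params = {"cursor": cursor} if cursor else None
        page = get_json(f"{BASE}/{job_id}/results", params=params)
        yield page["conversations"]
        cursor = page.get("cursor")
        if cursor is None:
            return
```

Because this is a generator, nothing is submitted until you start iterating over its result.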
Example queries
First, submit a job:
{
  "interval": "2019-01-01T00:00:00.000Z/2019-07-01T00:00:00.000Z",
  "segmentFilters": [
    {
      "type": "or",
      "predicates": [
        {
          "dimension": "purpose",
          "value": "agent"
        },
        {
          "metric": "tSegmentDuration",
          "range": {
            "gt": 2000,
            "lte": 90000
          }
        }
      ]
    }
  ]
}
The response contains the jobId of your new job:
{
  "jobId": "1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f"
}
Next, use the jobId from above to check on the status of your job execution:
GET https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f
{
  "state": "FULFILLED",
  "expirationDate": "2019-08-01T00:00:00.000Z",
  "submissionDate": "2019-07-01T15:25:33.344Z",
  "completionDate": "2019-07-01T15:26:08.209Z"
}
There are several possible values for the state of the job:
| State | Description |
| --- | --- |
| QUEUED | The job is in the queue, waiting to run |
| PENDING | The job is running |
| FAILED | The job completed with an error |
| CANCELLED | The job was manually cancelled |
| FULFILLED | The job completed successfully |
| EXPIRED | The job previously completed, but results have since expired |
Since the job successfully completed, now scroll through the results:
GET https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f/results
{
  "conversations": [
    {
      "conversationId": "8449d2d3-3ab1-45c2-acae-f24795e79148",
      "conversationStart": "2019-06-28T16:51:22.694Z",
      "conversationEnd": "2019-06-28T16:52:32.262Z",
      "mediaStatsMinConversationMos": 1,
      "originatingDirection": "outbound",
      "mediaStatsMinConversationRFactor": 0,
      "participants": [
        {
          "userId": "093e832f-aa05-46af-93b5-6d8afa648d67",
          "purpose": "user",
          "participantId": "34810f1c-d4a8-450d-95e9-dc22ba02e93b",
          "sessions": [
            {
              "sessionId": "5f36d996-9821-4f4c-a30b-efec73d3aaeb",
              "mediaType": "voice",
              "protocolCallId": "f41d1e15-e9b9-4a47-b70d-5a6bff3f4b99",
              "dnis": "tel:+1317....",
              "direction": "outbound",
              "segments": [
                {
                  "segmentStart": "2019-06-28T16:51:22.694Z",
                  "segmentEnd": "2019-06-28T16:51:23.340Z",
                  "segmentType": "contacting",
                  "conference": false
                },
                {
                  "segmentStart": "2019-06-28T16:51:23.340Z",
                  "segmentEnd": "2019-06-28T16:51:27.130Z",
                  "segmentType": "dialing",
                  "conference": false
                },
                ...
              ],
              "metrics": [
                {
                  "name": "tContacting",
                  "value": 646,
                  "emitDate": "2019-06-28T16:51:23.340Z"
                },
                {
                  "name": "tDialing",
                  "value": 3790,
                  "emitDate": "2019-06-28T16:51:27.130Z"
                },
                {
                  "name": "tTalk",
                  "value": 46903,
                  "emitDate": "2019-06-28T16:52:14.033Z"
                },
                ...
              ]
            }
          ]
        }
      ]
    }
  ],
  "dataAvailabilityDate": "2020-04-28T21:13:39.000Z",
  "cursor": "ASDZmsplafejJmPh4h53wlChCzcQnloaP2xuECPZLMFhhNY1YfNx/L/reje81J/IMcBHdQKbWOYTHSnxH/cSrT4YAiGUprY7Tg=="
}
The presence of a cursor indicates that there's more data available. Use the cursor field from the above response as a URL-encoded query parameter for the next page of data:
GET https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f/results?cursor=ASDZmsplafejJmPh4h53wlChCzcQnloaP2xuECPZLMFhhNY1YfNx%2FL%2Freje81J%2FIMcBHdQKbWOYTHSnxH%2FcSrT4YAiGUprY7Tg%3D%3D
{
  "conversations": [
    {
      "conversationId": "0131ec63-15b3-44a6-8985-aecf446cb2dd",
      "conversationStart": "2019-06-26T19:43:10.542Z",
      "conversationEnd": "2019-06-26T19:43:54.610Z",
      "mediaStatsMinConversationMos": 1,
      "originatingDirection": "inbound",
      "mediaStatsMinConversationRFactor": 0,
      "participants": [
        {
          "purpose": "customer",
          "participantName": "customer",
          "participantId": "9b0b336e-db49-4bb5-9d39-5b5e17d97641",
          "sessions": [
            {
              "sessionId": "8dcabd63-9581-4c02-8f77-48a9e3a4e47f",
              "mediaType": "voice",
              ...
}
A pageSize may be specified as a query parameter if you need more or fewer results than the default, for example:
GET https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f/results?pageSize=5000
Or with a cursor:
GET https://api.mypurecloud.com/api/v2/analytics/conversations/details/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f/results?pageSize=5000&cursor=ASDZmsplafejJmPh4h53wlChCzcQnloaP2xuECPZLMFhhNY1YfNx%2FL%2Freje81J%2FIMcBHdQKbWOYTHSnxH%2FcSrT4YAiGUprY7Tg%3D%3D
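If you build these URLs by hand, remember that the cursor must be URL-encoded: it is base64, so it can contain /, +, and = characters. A minimal sketch using the Python standard library (an HTTP client such as requests would do this for you via its params argument):

```python
# URL-encode a results cursor before putting it in a query string.
from urllib.parse import quote

cursor = ("ASDZmsplafejJmPh4h53wlChCzcQnloaP2xuECPZLMFhhNY1YfNx"
          "/L/reje81J/IMcBHdQKbWOYTHSnxH/cSrT4YAiGUprY7Tg==")

# safe="" forces '/' and '=' to be percent-encoded as %2F and %3D.
encoded = quote(cursor, safe="")

url = ("https://api.mypurecloud.com/api/v2/analytics/conversations/details"
       "/jobs/1dfb52ab-f02a-42f8-8ad7-9821edd2ee7f/results"
       "?pageSize=5000&cursor=" + encoded)
```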
Elements of the query
The query is identical to the Conversation Detail Record query with a few exceptions:
- there is no random-access paging element like there is on the 'query' endpoint. Instead, cursors are used (see the above example)
- there is no support for aggregations
- the results of your job execution will be cached and available to you for a number of days after its execution. Once this cache is purged, your job status will be 'expired'
Limitations
- the interval restrictions of the 'query' endpoint are lifted; the practical limit is instead a function of the volume of data you are querying
- page size can be no larger than 10,000 results per page
- jobs-backed data is not continuously updated in real-time
- no specific guarantees are made on the speed of job execution
- rate limits on usage of this endpoint differ from other analytics endpoints. Both the frequency and the concurrency of queries have an impact, as does the volume of data queried.
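Given the distinct rate limits, a polite retry wrapper is worth having around calls to this endpoint. This is a sketch under assumptions: RateLimited is a hypothetical exception your own HTTP layer raises on an HTTP 429 response, carrying the value of a Retry-After header when one is present.

```python
# Sketch: back off and retry when the API signals rate limiting.
import time

class RateLimited(Exception):
    """Hypothetical exception raised by your HTTP layer on an HTTP 429."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds, e.g. from a Retry-After header

def with_retry(call, max_attempts=5, default_wait=3.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait and retry up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            sleep(err.retry_after or default_wait)
```

The sleep parameter is injectable only to make the wrapper easy to test; in production the default time.sleep is fine.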
Paging
The nature of this endpoint means you are likely to encounter high-volume result sets. Keep the following in mind:
- Page size: the number of records per page. The default page size is 1,000 records, with a maximum of 10,000.
- Paging through the result set follows an iterator model. Retrieve the cursor from your results and use it to make your subsequent request for the next page of data, until you reach the end.
Data Availability
While the data returned by detail record jobs is the same as that behind the detail query endpoint, the data source is different. Jobs are served by a data lake that is updated nightly by a batch process, rather than being updated in real time. The exact time this update occurs varies day to day, but it will generally not occur during business hours.
This makes the jobs endpoint more useful for querying historical data, as the data will not be complete up to the day of the query. The dataAvailabilityDate field in the body of the job result indicates that all data archived up to that date and time is available. For example, a query run during business hours on January 2nd will generally have data available up to around 6:00 pm on January 1st in the time zone of the AWS region your organization is based in.
There are several important caveats to this:
- There may be data available after dataAvailabilityDate, so depending on your query interval there is no guarantee you will not get data from after that time.
- In the rare case that the batch update process fails, the data may not be updated again until the following night. In this case, there would be a temporary data gap in which yesterday's data is not available. The value of dataAvailabilityDate will reflect this fact, showing no change from the previous day. If this happens, use appropriate requests to the detail record query endpoint to retrieve the data for the most recent day.
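One way to act on these caveats is to split a requested interval at dataAvailabilityDate: serve the archived portion from a job and the remainder from the real-time 'query' endpoint. The split_interval helper below is a hypothetical sketch; it relies only on the fact that ISO-8601 UTC timestamps of equal precision compare correctly as strings.

```python
# Sketch: decide which portion of an interval each endpoint should serve.
def split_interval(start, end, data_availability_date):
    """Return (job_interval, query_interval); either may be None.

    All arguments are ISO-8601 UTC timestamps of equal precision,
    so plain string comparison orders them correctly.
    """
    if end <= data_availability_date:
        return f"{start}/{end}", None          # fully archived: job only
    if start >= data_availability_date:
        return None, f"{start}/{end}"          # fully recent: query only
    return (f"{start}/{data_availability_date}",   # archived head
            f"{data_availability_date}/{end}")     # recent tail
```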
Data Availability Endpoint
The dataAvailabilityDate is returned as part of query results, but you can also check it via a separate endpoint. This is useful if you want to make sure that the interval you want to query is present in the data lake before triggering a job.
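A pre-flight check along those lines might look like the following. The availability endpoint path used here (GET .../conversations/details/jobs/availability) is an assumption based on the platform's API reference and should be verified for your region; get_json stands in for your authenticated HTTP client.

```python
# Sketch: confirm the data lake covers an interval before submitting a job.
# The availability URL is assumed, not taken from this article.
AVAILABILITY_URL = ("https://api.mypurecloud.com/api/v2/analytics"
                    "/conversations/details/jobs/availability")

def interval_is_available(interval, get_json):
    """True if the interval's end is on or before dataAvailabilityDate."""
    avail = get_json(AVAILABILITY_URL)["dataAvailabilityDate"]
    end = interval.split("/")[1]
    # ISO-8601 UTC timestamps of equal precision compare correctly as strings.
    return end <= avail
```

If this returns False, either wait for the nightly batch update or fall back to the detail record query endpoint for the most recent data.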