New datalake availability endpoints

Category: API
Summary: We are adding new endpoints for checking analytics datalake availability dates.
Impacted APIs:
/api/v2/analytics/users/details/jobs/availability
/api/v2/analytics/conversations/details/jobs/availability
Impact: We are adding two new endpoints that allow retrieving the datalake availability date for user/conversation details without having to trigger a job.
Date of Change: 11/18/2020

This seems useful. We use the jobs analytics endpoints to pull data into our data lake every night.

Could you give an example of how to use these endpoints, e.g. a use case?

The use case is that you can poll the availability endpoint periodically (for example, hourly or every 30 minutes) to identify when new data has become available. Once new data is ready, you can invoke a job to export it. This saves your app from having to run an export just to find out the latest data availability date.
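A minimal sketch of that polling pattern, in Python. The endpoint path comes from the announcement; the base URL (us-east-1 host), the bearer-token header, and the helper names `fetch_availability` and `wait_for_new_data` are assumptions for illustration, not an official client.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Assumption: host for the us-east-1 region; other regions use a different host.
API_BASE = "https://api.mypurecloud.com"
AVAILABILITY_PATH = "/api/v2/analytics/conversations/details/jobs/availability"


def fetch_availability(token: str) -> str:
    """GET the availability endpoint and return dataAvailabilityDate as a string."""
    request = urllib.request.Request(
        API_BASE + AVAILABILITY_PATH,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["dataAvailabilityDate"]


def wait_for_new_data(
    fetch: Callable[[], str],
    last_seen: Optional[str],
    poll_seconds: float = 1800,
    max_polls: int = 48,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the availability date differs from last_seen.

    Returns the new date, or None if nothing changed within max_polls.
    """
    for attempt in range(max_polls):
        current = fetch()
        if current != last_seen:
            return current
        # Skip the final sleep, since no further poll follows it.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return None
```

In practice you would call something like `wait_for_new_data(lambda: fetch_availability(token), last_seen)` and start the export job only when it returns a new date. Passing `fetch` and `sleep` in as parameters keeps the loop testable without network access.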

Okay, so if I get this back, I can confidently request job analytics data up to (but not including) 2020-11-18T17:19:00.000Z?

And if I don't see the value I "want", I could hold off on starting the export, poll again a bit later, and request the export once I get a dataAvailabilityDate value that fits my use case?

{
  "dataAvailabilityDate": "2020-11-18T17:19:00.000Z"
}

By the way, the release notes said (my highlight):

Developers can now access the partial availability time stamp

Which is another reason why I asked this question. I'm not sure what "partial" refers to: that you can get data back before that point, but some data could be missing?

Hi Tim, can you review my last comment about the "partial" availability?

I don't know why the release note uses the word "partial"; partial availability isn't a concept here. The data availability timestamp indicates that data prior to that timestamp is available.
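Since data prior to the timestamp is available, one way to use it is to cap the end of an export interval at `dataAvailabilityDate`. The sketch below assumes the job request takes an ISO-8601 `start/end` interval string; `parse_utc`, `format_utc`, and `clamp_interval` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO-8601 timestamp such as 2020-11-18T17:19:00.000Z."""
    # Swap the trailing "Z" for an offset so older Python versions accept it.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def format_utc(moment: datetime) -> str:
    """Format a datetime as UTC with millisecond precision and a trailing Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def clamp_interval(start: str, wanted_end: str, availability: dict) -> Optional[str]:
    """Return "start/end" with end capped at dataAvailabilityDate.

    Returns None when no data in the requested range is available yet.
    """
    available = parse_utc(availability["dataAvailabilityDate"])
    begin = parse_utc(start)
    end = min(parse_utc(wanted_end), available)
    if end <= begin:
        return None
    return f"{format_utc(begin)}/{format_utc(end)}"
```

With the response shown earlier in the thread, a request for all of 2020-11-18 would be capped at 17:19 UTC, and a request starting after that time would return `None`, which is the signal to keep polling.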

Ok, thanks for the clarification :slight_smile:

I tried polling both endpoints, and the dataAvailabilityDate I get back in the response is almost 24 hours behind the current timestamp. At the same time, if I query the conversation/user details, I retrieve fresh data for today just fine.

Is a gap this long expected behavior?

Using developer tools to poke the new endpoints.

Yes, per https://developer.mypurecloud.com/api/rest/v2/analytics/conversation_details_job.html#data_availability

While the data returned by detail record jobs is the same as that behind the detail query endpoint, the data source is different. Jobs are served by a data lake that is updated nightly by a batch process, rather than being updated in real time. The exact time this update occurs varies day to day, but it will generally not occur during business hours.

This makes the jobs endpoint more useful for querying historical data, as the data will not be complete up to the day of the query. The dataAvailabilityDate field in the body of the job result indicates that all data archived up to that date and time is available. Typically, a query run during business hours on January 2nd will have data available up to around 6:00 pm on January 1st in the time zone of the AWS region your organization is based in.
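To tell a normal nightly-batch gap from a real delay, you could compute the lag between the current time and `dataAvailabilityDate`. This is only a sketch: the 48-hour threshold and the names `availability_lag` and `looks_stale` are assumptions for illustration, not documented values.

```python
from datetime import datetime, timedelta, timezone


def availability_lag(data_availability_date: str, now: datetime) -> timedelta:
    """Return how far dataAvailabilityDate trails the given current time."""
    available = datetime.fromisoformat(data_availability_date.replace("Z", "+00:00"))
    return now - available


def looks_stale(lag: timedelta, threshold: timedelta = timedelta(hours=48)) -> bool:
    """Flag a lag longer than the (assumed) threshold as worth investigating."""
    return lag > threshold


# Example: a check during the day after the last nightly update.
now = datetime(2020, 11, 19, 16, 0, tzinfo=timezone.utc)
lag = availability_lag("2020-11-18T17:19:00.000Z", now)
```

A lag just under 24 hours, as in this example, fits the nightly update described above and would not be flagged.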

Thanks for the prompt response. That clarifies it.
