Can we increase our rate limit, potentially using multiple oauth clients?

One of our services interacts with the Platform API to keep a fresh local copy of all of the conversation objects, as well as to synchronize our changes back up to the Platform API. There are multiple scenarios where, combined, this comes to more than 300 requests per minute.

For example, at particularly busy times there are lots of conversation updates, each change requires another request to get the full conversation details, and meanwhile other requests are being made to push local changes up to the Platform API.

We were hoping to mitigate this rate limit issue without needing to introduce delays/waits throughout our services. We were looking into using multiple OAuth clients and cycling between them for our requests, but noticed there are a few caveats, such as user accounts being flagged and/or limited in their OAuth usage.

Is it possible to just increase our rate limit for the Platform API? Will there be any issues if we round-robin multiple OAuth clients? Thanks - Daniel.

I work with Daniel, and just wanted to add an additional detail.

We currently have about 15% of our agents on the platform, and that is coming close to maxing out the rate limit on one key. Once we get 100% of the agents over, it will not be possible to live with the throttling, as we will likely need 600-900 requests/minute to process the number of ongoing conversations.

Thanks

Chris

Hi Daniel,

Rate limits are not configurable at this time but will be in the future.

You can use multiple tokens with the same client as an alternative to creating multiple clients. There is a maximum number of tokens per client, but that limit is much higher.

Thank you,
Chuck

Chris/Daniel, can you provide some background about your use case? I may be able to make some recommendations to reduce your API usage. I'm specifically interested in:

  • What you're using the conversation data for
  • How you're retrieving the conversation data (analytics APIs, conversation APIs, notifications, etc.)
  • Are you doing this in real time or in a batch process?
  • I assume this is a server-side service that's using client credentials. If not, please clarify.

Hi Tim

I will give you the info you asked for, but we will still have need of more throughput. We already have 5 different services that talk to GPC, and each has its own key. However, some of them are starting to hit the point where one key is not enough.

The service we are talking about processes changes in conversations so as to update our data about phone calls, email, and SMS (have they started, who's involved, did they end, were there callbacks, what's the wrap-up code). We need up-to-date information with as low a lag as possible so our agents can see the results of the calls and the client files can be kept up to date.

We were scanning once a minute, and during that scan using the analytics API to find all conversations changed in the previous minute. Then, because the analytics API does NOT return the full conversation object, we have to make an API call per conversation, and then emit an event into our own system indicating the conversation has changed.

The per-conversation lookup is where we are really using up the RPM.
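In rough Python, the loop looks something like this. This is a sketch, not our real code: BASE_URL, the token, and emit_event are placeholders, and the endpoint paths are the documented analytics query and conversation GET.

```python
import requests
from datetime import datetime, timedelta, timezone

BASE_URL = "https://api.mypurecloud.com"              # regional host varies
HEADERS = {"Authorization": "Bearer <access token>"}  # placeholder token

def emit_event(conversation):
    """Placeholder for handing the change to our own event system."""
    print(conversation["id"])

def iso(dt):
    return dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")

def scan_once(window_minutes=1):
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=window_minutes)

    # One analytics query for everything that changed in the window.
    resp = requests.post(
        f"{BASE_URL}/api/v2/analytics/conversations/details/query",
        headers=HEADERS,
        json={"interval": f"{iso(start)}/{iso(end)}"},
    )
    resp.raise_for_status()

    # The analytics result is not the full conversation object, so each
    # hit costs one more request -- this is where the RPM goes.
    for conv in resp.json().get("conversations", []):
        detail = requests.get(
            f"{BASE_URL}/api/v2/conversations/{conv['conversationId']}",
            headers=HEADERS,
        )
        detail.raise_for_status()
        emit_event(detail.json())
```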

We have discovered that there is actually some lag before the analytics API returns accurate results. E.g., if you query it for the window from now() - 1 minute to now(), you will actually miss conversation changes that haven't been indexed yet, due to the async nature of your back-end system.

We tried adding lag to our system, querying the window from now() - 2 minutes to now() - 1 minute. But that introduced a painful delay for our agents, and still missed conversations.

So we made the system more robust. We query every minute and scan from now() - 5 minutes to now(), and we check the returned conversations to see whether they have actually changed since we last emitted an event. That way we get the best of all worlds: fast response, and better protection against missing conversations due to the async nature of the processing.
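A sketch of that dedupe pass, with stubs standing in for the calls shown above; conv_version is whatever change marker we key on (the conversation's last update time, for instance):

```python
from typing import Dict, Iterable

# Stubs standing in for the calls sketched earlier in the thread.
def query_window(minutes: int) -> Iterable[dict]: ...
def conv_version(conv: dict) -> str: ...        # hypothetical change marker
def fetch_conversation(conv_id: str) -> dict: ...
def emit_event(detail: dict) -> None: ...

last_emitted: Dict[str, str] = {}  # conversationId -> marker last emitted for

def scan_with_overlap():
    # A 5-minute window every pass: the overlap tolerates analytics lag,
    # and the change-marker check stops duplicate events.
    for conv in query_window(minutes=5):
        conv_id = conv["conversationId"]
        marker = conv_version(conv)
        if last_emitted.get(conv_id) == marker:
            continue               # already emitted for this change
        emit_event(fetch_conversation(conv_id))
        last_emitted[conv_id] = marker
```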

Also, when there is an analytics outage (which happens often enough that we have commands for this), we can run a scan after the incident and be assured of getting the data back into the system with no duplicate events emitted on our side.

Once we got that working, the agents still weren't happy with the lag, so we increased the scan rate to once every 30 seconds.

Now things have been working pretty well, but we need to plan for future capacity.

Thanks

Chris

Hi Chuck

These are actually server-side OAuth client credentials. Does creating a token with those credentials not invalidate the previous token? What is the max number allowed, and is this a supported use?

Thanks

Chris

You really should be using notifications for this. You won't have any rate limit issues, and you'll also have data up to date in real time. There isn't a topic for all conversations in the system, but you can subscribe to each queue using v2.routing.queues.{id}.conversations to catch all ACD conversations. You can also subscribe to users' conversations if that's a better fit.
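A minimal sketch of that flow, assuming the websocket-client package and placeholder queue IDs (the endpoint paths are the documented notifications endpoints):

```python
import json
import requests
import websocket  # pip install websocket-client

BASE_URL = "https://api.mypurecloud.com"
HEADERS = {"Authorization": "Bearer <access token>"}
QUEUE_IDS = ["<queue-id-1>", "<queue-id-2>"]  # your ACD queues

# 1. Create a notification channel.
channel = requests.post(
    f"{BASE_URL}/api/v2/notifications/channels", headers=HEADERS
).json()

# 2. Subscribe the channel to each queue's conversations topic.
topics = [{"id": f"v2.routing.queues.{qid}.conversations"} for qid in QUEUE_IDS]
requests.put(
    f"{BASE_URL}/api/v2/notifications/channels/{channel['id']}/subscriptions",
    headers=HEADERS,
    json=topics,
)

# 3. Listen on the channel's WebSocket; each message is a JSON event.
def on_message(ws, message):
    event = json.loads(message)
    if event.get("topicName", "").endswith(".conversations"):
        print(event["eventBody"])  # hand off to local processing here

ws = websocket.WebSocketApp(channel["connectUri"], on_message=on_message)
ws.run_forever()
```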

Each client credentials OAuth client can have up to 50 tokens active at once. The 51st will invalidate the 1st. Keep in mind that each token individually still has the 300 requests/minute rate limit, but all tokens for the OAuth client are additionally limited to 900 requests/minute total.
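If you do spread requests across tokens, a simple round-robin wrapper is all it takes. A sketch, assuming the tokens have already been fetched:

```python
import itertools
import requests

class TokenPool:
    """Round-robin requests across pre-fetched access tokens."""

    def __init__(self, tokens):
        self._cycle = itertools.cycle(tokens)

    def get(self, url, **kwargs):
        headers = dict(kwargs.pop("headers", {}))
        headers["Authorization"] = f"Bearer {next(self._cycle)}"
        return requests.get(url, headers=headers, **kwargs)

# Three tokens already reach the 900 requests/minute client-wide ceiling,
# so extra tokens only smooth bursts; they don't add throughput.
pool = TokenPool(["<token-a>", "<token-b>", "<token-c>"])
```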


Hi Tim, I'm still digesting all of what's been said here, but looking at the notifications API: we would still have to handle the case of downtime (either PureCloud's, or our own socket connection) and catch missed changes during that downtime, correct? I'm just trying to confirm my understanding of the notifications API as fire-and-forget, and that ensuring we have a complete view means auxiliary scanning anyway. Thanks!

Hi Tim

I'm afraid we can't use notifications, as we need to get information on conversations that never make it to a queue.

E.g., inbound calls that hang up in the IVR.

For the same reason, listening to user conversation notifications wouldn't catch the ones that never get to a user.

We had abandoned the notification method early on for those reasons. Do you know whether those APIs are affected by the issues that affect the analytics query API?

Thanks

Chris

You are correct that the delivery of notifications is not guaranteed and messages are not retried. If you can't use notifications, you must work within the rate limits of the API.

I'm not sure what issues you're referring to.

Hi Tim

What do you mean by notification delivery not being guaranteed?

I was referring to times when there are back-end issues with the analytics query APIs and data is not available through them. I think they are usually listed as issues with "Reporting", since that is the part of the UI that uses those APIs. There were a few in January. Anyway, when those are happening, using the analytics API to keep track of conversation updates doesn't work, so I was wondering whether notifications might not be affected by those issues.

Okay, so we will keep in mind that each OAuth client credentials key has a limit of 300 requests/minute per token and 900 requests/minute per key.

We will have to expand to support more keys as our business grows, then.

Thanks

Chris

Notifications are transmitted over the WebSocket protocol. WebSockets, as a protocol, do not provide feedback to the sender about whether message delivery succeeded or failed; in this respect, a WebSocket message is similar to a UDP packet. If a WebSocket connection is open and there is no network failure between the server and client, there is no reason the message wouldn't be received. PureCloud processes messages and always sends them, but due to this limitation of WebSockets, there is no guarantee that a client will receive a message, nor will the message be retried if a client fails to receive it (because the sender can't know that it wasn't received).
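One way to cope with this on the client side, as a sketch rather than a prescribed pattern: reconnect with backoff when the socket drops and back-fill the gap, with run_catchup_scan standing in for the analytics scan described earlier in this thread.

```python
import time
import websocket  # pip install websocket-client

def run_catchup_scan(since: float) -> None:
    """Stub: re-run the analytics scan over the outage window."""

def listen_forever(connect_uri, on_event):
    backoff = 1
    while True:
        started = time.time()
        try:
            ws = websocket.WebSocketApp(connect_uri, on_message=on_event)
            ws.run_forever()   # returns when the connection drops
            backoff = 1        # clean disconnect: reset the backoff
        except Exception:
            backoff = min(backoff * 2, 60)  # exponential backoff, capped
        # The socket was down for some window; back-fill it, plus a small
        # margin for messages that were in flight when it dropped.
        run_catchup_scan(since=started - 60)
        time.sleep(backoff)
```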

Please report these issues via a Care case. Missing or latent data, as well as any 5xx errors, cannot be investigated via the forum.

Hi Tim

Okay good to know about the Notifications.

Yes, we file tickets when we detect analytics API issues.

Thanks

Chris

Hey Tim

You mentioned the overall rate limit of 900. Is there a way to learn how close we are to exceeding this value?

Thanks

Chris

The per-token rate limit is exposed via the response headers, but I don't believe the per-client limiter (or any of the others) is exposed anywhere prior to receiving the 429 response: https://developer.mypurecloud.com/api/rest/rate_limits.html
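A sketch of watching those headers and honoring 429s; the header names follow that rate-limits page, so treat them as assumptions if your responses don't include them:

```python
import time
import requests

def get_with_limit_awareness(url, headers):
    while True:
        resp = requests.get(url, headers=headers)
        if resp.status_code == 429:
            # The per-client (and other) limiters only surface here.
            time.sleep(int(resp.headers.get("Retry-After", "60")))
            continue
        # Per-token budget headers, per the linked rate-limits docs.
        used = resp.headers.get("inin-ratelimit-count")
        allowed = resp.headers.get("inin-ratelimit-allowed")
        if used and allowed:
            print(f"token budget this minute: {used}/{allowed}")
        return resp
```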
