TF_Export failures after provider version 1.27.0

Hello,

We started encountering "API Error: 401 - Invalid login credentials." when performing genesyscloud_tf_export operations with provider versions after 1.27.0.

As far as I can see, our OAuth client has all the necessary permissions, and there is no indication in the error message that any are missing.

With version 1.27.0, exports complete without issues, but any version after that fails.

We have seen these failures with multiple object types and multiple provider versions, across all our environments.

Object types we have noticed so far:

  • genesyscloud_architect_datatable_row
  • genesyscloud_telephony_providers_edges_phone
  • genesyscloud_user

Example with genesyscloud_architect_datatable_row object:

  + resource "genesyscloud_tf_export" "export_hcl" {
      + directory                = "/var/tmp/tf_export"
      + enable_flow_depends_on   = false
      + export_as_hcl            = true
      + id                       = (known after apply)
      + include_filter_resources = [
          + "genesyscloud_architect_datatable_row",
        ]
      + include_state_file       = true
      + log_permission_errors    = true
      + split_files_by_resource  = false
    }

Error:

│ Error: Failed to get state for genesyscloud_architect_datatable_row instance /: Failed to get state for genesyscloud_architect_datatable_row instance /: [{0 Failed to read Datatable Row /: API Error: 401 - Invalid login credentials. () []}]

│ with genesyscloud_tf_export.export_hcl,
│ on export_hcl.tf line 1, in resource "genesyscloud_tf_export" "export_hcl":
│ 1: resource "genesyscloud_tf_export" "export_hcl" {

Thank you.

Hi,

Two questions:

  • How long was the export running for before it failed?
  • What is the Token Duration set to on the OAuth client being used?

Regards,
Declan

Hello,

Last message before the error:
genesyscloud_tf_export.export_hcl: Still creating... [15m40s elapsed]

With version 1.27.0, it completes in about 3 minutes:
genesyscloud_tf_export.export_hcl: Creation complete after 3m10s [id=/var/tmp/tf_export]

Token duration in the Genesys Cloud UI is 900 seconds, which would explain the failure around the 15-minute mark, but I don't know why the export runs for so long with the later versions.

Thank you.

Hi,

It seems your access token expired during the export. CX as Code does not currently refresh the access token. We are adding that to the provider, but in the meantime you could increase your token duration so that it lasts for the whole export.
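
For reference, here is a minimal Go sketch (an illustration, not part of the provider) that reads an OAuth client's configured token duration from the Platform API. The endpoint is GET /api/v2/oauth/clients/{clientId}; the region domain, environment variable names, and the accessTokenValiditySeconds field are assumptions to verify against your org's API documentation:

  // Hedged sketch: check an OAuth client's token duration via the Platform API.
  // Region domain, env var names, and the JSON field name are assumptions.
  package main

  import (
      "encoding/json"
      "fmt"
      "net/http"
      "os"
  )

  func main() {
      clientID := os.Getenv("OAUTH_CLIENT_ID") // placeholder
      token := os.Getenv("ACCESS_TOKEN")       // placeholder

      req, err := http.NewRequest("GET",
          "https://api.mypurecloud.com/api/v2/oauth/clients/"+clientID, nil)
      if err != nil {
          panic(err)
      }
      req.Header.Set("Authorization", "Bearer "+token)

      resp, err := http.DefaultClient.Do(req)
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()

      var client struct {
          AccessTokenValiditySeconds int `json:"accessTokenValiditySeconds"`
      }
      if err := json.NewDecoder(resp.Body).Decode(&client); err != nil {
          panic(err)
      }
      // 900 seconds = 15 minutes; any export that runs longer than this will 401.
      fmt.Println("token duration (s):", client.AccessTokenValiditySeconds)
  }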

As for why the exporter is taking longer, it is most likely due to new features that have been added to the exporter, which have made it a bit slower to run.

Regards,
Declan

Thank you for your response.

I just ran some tests after extending the OAuth client token duration to 45 minutes. It seems that the latest provider is about 7-8x slower than 1.27.0.

Should we consider this "expected"? I'm asking because we may have to rewrite our processes to account for it, as we have multiple processes that assume 30-60 minute timeouts (ADO pipelines, TFE).

1.27.0
genesyscloud_tf_export.export_json: Creation complete after 4m41s [id=/var/tmp/tf_export]

1.32.1
genesyscloud_tf_export.export_json: Creation complete after 37m33s [id=/var/tmp/tf_export]

Thank you.

Hi JMacek,

Thanks for reaching out to us. At the end of December and into this year, we have been actively working on the exporter to do better dependency resolution, so that you can automatically export objects that you have not explicitly defined in your export file. This is supposed to be invoked only when you have the enable_flow_depends_on attribute set.

I am wondering if some of that code is being invoked even without that attribute set. It does seem odd that the duration suddenly jumped by 7-8x, so it is worth investigating further.

A few things:

  1. The developer who works on the exporter dependency code is on vacation until Thursday. When he gets back, I will have him dig into this and see if we can figure out what is going on.

  2. As Declan pointed out, we are currently working on providing the ability to automatically refresh the OAuth token, but that alone will not address the slowdown.

  3. We just finished a proof of concept where we are looking at introducing caching into our overall export process. The POC went well, and I am hoping to start rolling it out across the resources in the next couple of months. This does not solve the immediate problem you are facing, but when it is done it will significantly improve the exporter's performance.

  4. For now, to keep moving, I would increase your overall timeouts. It is not an ideal solution, but it should help you work around this issue while we dig into it.

Thanks,
John Carnell
Director, Developer Engagement

Hi @jmakacek

Can you raise a Care ticket for this and attach the export logs for both versions so we can investigate further?

Thanks
Hemanth

Hi JMacek,

Please post the Care ticket number here, because Care will redirect you back to the forum. Once we have the Care ticket #, I can escalate it within the Care org as a possible product defect, and we will then have a ticket to track it. CX as Code is an open-source project, so Care will always direct you back to the forum and ask for the issue to be evaluated here before we make it a Care ticket.

Thanks,
John Carnell
Director, Developer Engagement

Hello,

Are there any specific options or parameters you would like me to use to produce the logs?

As-is, the only difference between the logs is how long the export runs (and how many times "Still creating..." is printed).

Also, if you would still like me to open a case, which Component should I select to make sure it goes to the right team?

Thank you.

Hi Jmakacek,

Please set sdk_debug = true in the provider configuration and TF_LOG=debug in your environment. This will produce two logs: a record of all of the SDK calls, and the Terraform logs produced by TF_LOG.

Thanks,
John Carnell
Director, Developer Engagement

Hi John,

Case #0003458441 has been created and the log files uploaded. It is exactly the same code in the same pipeline, with the only difference being the provider version number.

Thank you again for all your help.

Hi JMakacek,

Perfect. I am in communication with Care, and the engineer who owns the export code (Hemanth) is assigned this item and is waiting for the logs. I am hoping we can spot something, because we have been unable to reproduce this against our large org, so we are interested in seeing what's going on.

Thanks,
John

Hi @jmakacek

Thanks for providing the logs and raising the Care ticket.
I have done an initial diagnosis of the problem, and I can see that most of the Platform API calls are rate limited in 1.32.1. As a result, we are seeing a bunch of retries, which in turn increases the export process time significantly.
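
For illustration only, here is a standalone Go sketch of the general retry-on-429 pattern that diagnosis describes (not the provider's actual code); each rate-limited call sleeps and retries, so sustained rate limiting multiplies the total export time:

  // Illustration: retry a request whenever the API answers 429 Too Many Requests,
  // honouring the Retry-After header when present.
  package main

  import (
      "fmt"
      "net/http"
      "strconv"
      "time"
  )

  func doWithRetry(client *http.Client, req *http.Request, maxRetries int) (*http.Response, error) {
      for attempt := 0; ; attempt++ {
          resp, err := client.Do(req)
          if err != nil {
              return nil, err
          }
          if resp.StatusCode != http.StatusTooManyRequests || attempt >= maxRetries {
              return resp, nil
          }
          resp.Body.Close()

          delay := 5 * time.Second // fallback delay; an assumption for this sketch
          if s := resp.Header.Get("Retry-After"); s != "" {
              if secs, convErr := strconv.Atoi(s); convErr == nil {
                  delay = time.Duration(secs) * time.Second
              }
          }
          fmt.Printf("rate limited, retrying in %s (attempt %d)\n", delay, attempt+1)
          time.Sleep(delay)
      }
  }

  func main() {
      // Placeholder request: any Platform API GET would behave the same way.
      req, err := http.NewRequest("GET", "https://api.mypurecloud.com/api/v2/flows/datatables", nil)
      if err != nil {
          panic(err)
      }
      req.Header.Set("Authorization", "Bearer YOUR_ACCESS_TOKEN") // placeholder
      resp, err := doWithRetry(http.DefaultClient, req, 5)
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()
      fmt.Println("final status:", resp.Status)
  }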

Can you confirm whether, in all your runs, you used 1.27.0 for the export first and then immediately did a run on a later version? That might be the cause of the rate limiting.

Also, can I request that you do an export run on 1.32.1 on its own, without running 1.27.0 first, just to rule out the above as a probable root cause?

Thanks
Hemanth

Hello,

Thank you for your response.

We just ran 1.32.1 without 1.27.0 first, and the runtime was about the same (39 minutes). The log file has been uploaded to the case.

Would it please be possible for TF to do things more efficiently? Perhaps taking advantage of internal APIs/DBs, or finding a way to accomplish the same thing with fewer API calls?

In this case, for example, taking advantage of the export job API (/api/v2/flows/datatables/{datatableId}/export/jobs)?
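
For context, here is a rough Go sketch of how that export-job endpoint could be used (create a job, then poll it until it finishes). The path is the one quoted above, but the JSON field names, status values, and request details are assumptions for illustration only:

  // Hedged sketch of the bulk export-job flow: one POST creates a server-side
  // export job for the whole table, then the job is polled until it completes.
  // Field names (id, status, downloadUri) and status values are assumptions.
  package main

  import (
      "encoding/json"
      "fmt"
      "net/http"
      "time"
  )

  const (
      apiBase     = "https://api.mypurecloud.com" // placeholder region domain
      datatableID = "YOUR_DATATABLE_ID"           // placeholder
      accessToken = "YOUR_ACCESS_TOKEN"           // placeholder
  )

  func call(method, url string, out interface{}) error {
      req, err := http.NewRequest(method, url, nil)
      if err != nil {
          return err
      }
      req.Header.Set("Authorization", "Bearer "+accessToken)
      resp, err := http.DefaultClient.Do(req)
      if err != nil {
          return err
      }
      defer resp.Body.Close()
      return json.NewDecoder(resp.Body).Decode(out)
  }

  func main() {
      jobsURL := apiBase + "/api/v2/flows/datatables/" + datatableID + "/export/jobs"

      var job struct {
          ID          string `json:"id"`
          Status      string `json:"status"`
          DownloadURI string `json:"downloadUri"`
      }
      if err := call("POST", jobsURL, &job); err != nil {
          panic(err)
      }

      // Poll the job until it reports a terminal status.
      for job.Status != "Succeeded" && job.Status != "Failed" {
          time.Sleep(2 * time.Second)
          if err := call("GET", jobsURL+"/"+job.ID, &job); err != nil {
              panic(err)
          }
      }
      fmt.Println("export job finished:", job.Status, job.DownloadURI)
  }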

Overall, perhaps the Audit APIs could be used to deal with state changes?

Thank you again for all your help.

Hi @jmakacek

Thanks for the run and sharing the logs.

I can still see the rate-limit timeouts for dataTableRows, which is the major reason for the delayed exporter run.

We are planning to expedite a solution that should significantly improve exporter performance in the coming CX as Code release. This could potentially be a week to a week and a half away. Please let us know if that works for you.
We will keep you posted on this.

Best Regards
Hemanth

Hi Jmakacek,

We are looking at options to make our API calls more efficient in the export process. Right now it is a very heavy process: we first make all of the "getAll" calls for a resource and then do an individual API call for each record we want to retrieve. Last week I worked on a proof-of-concept pattern that would use caching in the export process to eliminate the individual API calls for looking up each resource.

Hemanth and I talked today, and he is going to roll this pattern out to the datatables resource; it should be in place in the next release (barring any unforeseen problems). From there we need to roll the pattern out to each resource, so it will not be an immediate thing, but as the caching implementation rolls out to other resources, the overall number of API calls will drop pretty dramatically.

For example, if you are trying to export a 10,000-row data table, you would see the API calls go down from 10,100 to approximately 100 for this one resource.
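
As an illustration of the caching idea described above (hypothetical types, not the provider's actual implementation), the Go sketch below lists rows page by page, caches them as it goes, and then serves each per-row lookup from memory, so the API call count for a 10,000-row table stays at roughly the 100 paged listing calls:

  // Illustration of export caching: populate a cache from the paged "getAll"
  // calls, then answer per-row reads from the cache instead of the API.
  package main

  import "fmt"

  // Row stands in for a datatable row returned by the API.
  type Row struct {
      ID   string
      Body map[string]interface{}
  }

  type exporter struct {
      apiCalls int
      cache    map[string]Row
  }

  // getAllRows simulates the paged listing calls (pageSize rows per page).
  func (e *exporter) getAllRows(total, pageSize int) []string {
      ids := make([]string, 0, total)
      for page := 0; page*pageSize < total; page++ {
          e.apiCalls++ // one API call per page
          for i := 0; i < pageSize && page*pageSize+i < total; i++ {
              id := fmt.Sprintf("row-%d", page*pageSize+i)
              e.cache[id] = Row{ID: id} // fill the cache while listing
              ids = append(ids, id)
          }
      }
      return ids
  }

  // getRow serves the per-row read from the cache; only a miss costs an API call.
  func (e *exporter) getRow(id string) Row {
      if r, ok := e.cache[id]; ok {
          return r
      }
      e.apiCalls++
      return Row{ID: id}
  }

  func main() {
      e := &exporter{cache: map[string]Row{}}
      for _, id := range e.getAllRows(10000, 100) {
          _ = e.getRow(id)
      }
      // Prints 100: only the paged listing calls, instead of 10,100 without caching.
      fmt.Println("API calls:", e.apiCalls)
  }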

So the original design was not a mistake. When we wrote the original export process, we wanted to make sure we were using the same processing for exporting objects with Terraform as we were for reading them. While this guarantees consistency between your CI/CD use of Terraform and your export, it is also extremely heavy, so we are looking at how we can leverage caching in the export to improve things.

A couple of other things:

  1. We don't have private APIs that are more performant. The public APIs we call leverage the private APIs, so many of the underlying API calls would be the same.

  2. We do have a ticket out there to redesign the data tables resource to use something like a bulk API pattern; it has just been a matter of time and priority. I will bump this ticket up so that we can start looking at it in the next couple of sprints, but I can't give a firm commitment on timing.

Thanks,
John Carnell
Director, Developer Engagement

Hi @jmakacek

We have a new release, v1.34.0, which enhances the export process for DataTable and DataTable Rows (the solution @John_Carnell mentioned in his last comment).

Can you do an export run and let me know if you see an improvement in the export time?

Thanks
Hemanth

Thank you very much, the export completed in under 2 minutes using 1.34.0.

Would it please be possible to also have a look at the genesyscloud_externalcontacts_contact object type?

That's another object which takes a very long time to export.

Thanks again.

Hi Jmakacek,

Thanks for the feedback. We are going to work on the genesyscloud users object next, and then also work on the externalcontacts object. We are currently in a sprint (about a week in), but we can prioritize it for the next sprint.

Thanks,
John

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.