CX as Code for existing orgs and DR

I recently started exploring CX as Code and came up with a handful of questions, mostly focused on how best to set it up with existing orgs. Some of these questions also stem from the examples and guides provided in the various knowledge articles and blogs such as How to begin your CX as Code Journey and CX as Code, Terraform articles, self-conducted experiments and Google (of course!). :smiley:

  1. For existing orgs, what are the best recommendations on how to start using CX as Code to manage the core objects for companies who have multiple orgs (DEV, TEST, PROD)? I have concerns around how exports of existing objects are identified (a random 10-digit number is stamped at the end) and applied, how state info is kept (probably need to use remote state), etc. Instead of using exports, is it even possible to manually define existing objects and their state and use that initially to treat the config as code and reliably change it and apply it to the orgs?

  2. Speaking of that weird identifier on exported objects: in my experiments, an exported existing object such as a queue will look like "Matt_Sample_Queue_2437141386", whereas if I create it from scratch using CX as Code it's just "Matt_Sample_Queue". I'm not sure how that number gets determined, but it seems to be the same across orgs. I was able to get the same identifier doing an export in both the DEV and TEST orgs. Is it expected that exporting these objects across orgs will yield the same exact identifier?

  3. Creating source code repos: In the above-mentioned "How to begin your CX as Code Journey" blog, it is recommended to create separate repos based on logical groupings such as lines of business (LOBs) and store both the flows and config definitions. I'm all for this as it prevents creating a large monolith and reduces risk. However, what is recommended for objects shared across LOBs such as user prompts, data tables, common modules (or any shared flows), data actions, wrap codes, skills, etc.?

  4. Related to the above question, is there a best practice on applying changes? Should it be granular for specific object types and their dependencies as one apply, and then again for other object types, and so forth? For example, running the apply for queues and wrap codes as one apply, user prompts as a separate apply, data actions as another separate apply, etc. I'm assuming we would want to keep the config and state file sizes at a minimum to reduce risk if possible.

  5. Is there a best practice for getting state per org and/or the logical separation of objects such as per LOB? If I understand state correctly, it is required to be kept somewhere so Terraform knows what to change. Is it possible to get the state at runtime before attempting to apply the change, as opposed to storing the state continuously? Would this even be advised? I see we can store state remotely in places like AWS S3 or Kubernetes secrets (but these are limited to 1 MB, so a whole org's state in one secret would not be possible for large orgs), and for automation/pipeline purposes we can declare some env vars that set the org environment when the terraform init command is run to tell it which remote state to run against.

  6. Is there a procedure for restoring objects that were accidentally destroyed (asking for a friend... :sweat_smile:)? I could see a scenario where maybe the wrong state or config definition was used when applying changes (terraform apply), and for objects missing in the definition Terraform deletes them from the target org. Would this involve opening a case with customer care? Could all those objects and their associations be restored (such as queue membership and skills)? This also assumes that there isn't an existing definition config that includes all users, queues, skills, etc.

  7. Should data sources only be defined and used as dependencies for resources in a config definition if those objects themselves are not defined as resources (if that makes sense)? For example, if we have wrap codes and queues set up as resources in the config, we would not need to also define the wrap codes as data sources and refer to them in the queue resources. We would instead just refer to the wrap code resources. I'm mainly asking to confirm that we don't need to duplicate our resources as data sources if it isn't necessary.

  8. What's the long-term plan for using CX as Code with a DR org? Are there best practices currently documented for this purpose? For example, is it best to export an entire org's config for all objects on a regular basis, store it in a version control repo, and use this in an emergency against a DR org? Or perhaps do this once and set up some service or ability to keep the orgs in sync?

Thanks!

Hi Matt,

Great set of questions. Let me see if I can answer them as best as I can:

  1. For existing orgs, what are the best recommendations on how to start using CX as Code to manage the core objects for companies who have multiple orgs (DEV, TEST, PROD)? I have concerns around how exports of existing objects are identified (a random 10-digit number is stamped at the end) and applied, how state info is kept (probably need to use remote state), etc. Instead of using exports, is it even possible to manually define existing objects and their state and use that initially to treat the config as code and reliably change it and apply it to the orgs?
  • Exported objects and the 10-digit number appended to them: all objects managed within a Genesys Terraform state need to have a unique name associated with them (e.g. Matt_Sample_Queue). Unfortunately, Genesys Cloud has several entities (e.g. queues) where you can create an object with the exact same name. Under the covers, entities have unique GUIDs, but those are not human readable. In order to ensure the names are unique, we append a number to the name.

  • You definitely should use a remote backing store for managing your state. There are plenty of options in this space. While you can manage your own remote state in AWS, Azure, etc., I always recommend people look at using Terraform Cloud and managing their remote state there. Personally, I have found Terraform Cloud a great solution because it is all cloud-based and simple to implement.
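As a rough sketch, a remote backend is declared in the terraform block; the organization, workspace, bucket, and key names below are all hypothetical placeholders:

```hcl
# Minimal sketch of remote state configuration (all names hypothetical).
terraform {
  cloud {
    organization = "my-company"
    workspaces {
      name = "genesys-dev"
    }
  }

  # Alternative: an AWS S3 backend instead of Terraform Cloud.
  # backend "s3" {
  #   bucket = "my-terraform-state"
  #   key    = "genesys/dev/terraform.tfstate"
  #   region = "us-east-1"
  # }
}
```

Note the `cloud` block requires a recent Terraform version; older versions used `backend "remote"` for Terraform Cloud.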

  • There is no direct way to manually define existing objects in Terraform. However, you can pull an object down into your Terraform state file with terraform import. Once the item is imported into your Terraform state file, you can use the terraform state show <resource address> command to see its attributes. Note: The import command will only import the object into your Terraform state. You still need to generate the HCL yourself and then cut and paste it (or do some shell scripting to build it out) into the file.
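A hypothetical example of that workflow with a queue (resource and queue names are assumptions, and the GUID would come from the Genesys Cloud API or admin UI):

```hcl
# 1. Write a minimal resource stub for the existing object.
resource "genesyscloud_routing_queue" "matt_sample_queue" {
  name = "Matt_Sample_Queue"
}

# 2. Import the existing object into state at the CLI:
#      terraform import genesyscloud_routing_queue.matt_sample_queue <queue-guid>
# 3. Inspect the imported attributes:
#      terraform state show genesyscloud_routing_queue.matt_sample_queue
# 4. Copy the attributes you want to manage back into the stub above.
```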

  • I would be careful about trying to snapshot your environment on every run and using that as the basis for promotion. Terraform is built as a set of DevOps primitives, and writing good Terraform code often involves more than just generating object definitions.

  2. Speaking of that weird identifier on exported objects: in my experiments, an exported existing object such as a queue will look like "Matt_Sample_Queue_2437141386", whereas if I create it from scratch using CX as Code it's just "Matt_Sample_Queue". I'm not sure how that number gets determined, but it seems to be the same across orgs. I was able to get the same identifier doing an export in both the DEV and TEST orgs. Is it expected that exporting these objects across orgs will yield the same exact identifier?
  • I suspect the number is a hash based on the name of the object and some other property (e.g. the GUID), but I would need to dig through the code to confirm. I will look into this further.
  3. Creating source code repos: In the above-mentioned "How to begin your CX as Code Journey" blog, it is recommended to create separate repos based on logical groupings such as lines of business (LOBs) and store both the flows and config definitions. I'm all for this as it prevents creating a large monolith and reduces risk. However, what is recommended for objects shared across LOBs such as user prompts, data tables, common modules (or any shared flows), data actions, wrap codes, skills, etc.?
  • I would group those shared resources into a common repo and then deploy them as a pre-dependent job before you deploy your individual LOB objects. This is similar to how you would deploy services. If you have a set of services (or code libraries) that are shared between projects you need to make sure that those items are always deployed in your code pipeline before your downstream dependencies.
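One way a downstream LOB project can consume what the shared repo deployed is Terraform's terraform_remote_state data source; the sketch below assumes the shared repo keeps its state in S3 and publishes an output named billing_skill_id (all names hypothetical):

```hcl
# Read the shared project's published outputs from its backing state.
data "terraform_remote_state" "shared" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"           # hypothetical bucket
    key    = "genesys/shared/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "genesyscloud_routing_queue" "billing_queue" {
  name = "Billing_Queue"
  # Shared objects deployed by the upstream job are then referenced as
  # data.terraform_remote_state.shared.outputs.billing_skill_id
}
```

This keeps the shared objects owned by one repo while letting each LOB repo reference them by id rather than redefining them.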
  4. Related to the above question, is there a best practice on applying changes? Should it be granular for specific object types and their dependencies as one apply, and then again for other object types, and so forth? For example, running the apply for queues and wrap codes as one apply, user prompts as a separate apply, data actions as another separate apply, etc. I'm assuming we would want to keep the config and state file sizes at a minimum to reduce risk if possible.
  • I always try to start with what my dependencies are around a flow. Most of your dependencies should be managed in your Terraform file. For example, if you have a queue and you assign users to that queue in Terraform, Terraform will not create the queue until the users are available. You can also explicitly manage dependencies. I would avoid trying to do multiple applies, either in the same Terraform project or across multiple Terraform projects. The more you try to manually manage the order of dependencies, the harder your Terraform config is going to be to maintain in the long run.
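As an illustration of letting Terraform order things for you, here is a sketch of an implicit dependency between a wrap-up code and a queue (resource and attribute names are assumptions based on the Genesys Cloud provider):

```hcl
resource "genesyscloud_routing_wrapupcode" "resolved" {
  name = "Resolved"
}

resource "genesyscloud_routing_queue" "support_queue" {
  name = "Support_Queue"

  # Implicit dependency: referencing the wrap-up code's id means Terraform
  # creates the wrap-up code before the queue, in a single apply.
  wrapup_codes = [genesyscloud_routing_wrapupcode.resolved.id]

  # For ordering Terraform cannot infer from references, an explicit
  # dependency can be declared instead:
  # depends_on = [genesyscloud_routing_skill.billing]
}
```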
  5. Is there a best practice for getting state per org and/or the logical separation of objects such as per LOB? If I understand state correctly, it is required to be kept somewhere so Terraform knows what to change. Is it possible to get the state at runtime before attempting to apply the change, as opposed to storing the state continuously? Would this even be advised? I see we can store state remotely in places like AWS S3 or Kubernetes secrets (but these are limited to 1 MB, so a whole org's state in one secret would not be possible for large orgs), and for automation/pipeline purposes we can declare some env vars that set the org environment when the terraform init command is run to tell it which remote state to run against.
  • I have never worked with Kubernetes secrets so I can't speak to their use, but yes, you should keep your Terraform state in a remote backing store. Per my earlier comments, I like to use Terraform Cloud. What I typically do is completely segregate each of my environments into its own backing state, so there is one for dev, one for test, and one for production. For a really large development organization, I also see companies giving each development team their own workspaces within a backing state. I personally have not worked too much with workspaces.

  • Personally, even though it involves some redundant code, I keep the CI/CD automation for each environment in its own definition. For example, in my [GitHub Actions blueprint](https://github.com/GenesysCloudBlueprints/cx-as-code-cicd-gitactions-blueprint/blob/main/.github/workflows/deploy-flow.yaml) I have two separate jobs, for dev and test, defined in the workflow. I use the same Terraform script, but I point it to a different backing state for each environment (e.g. the TF_STATE variable). Any other environment-specific items I set as environment variables and leverage those in the Terraform scripts themselves.
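One common way to point the same Terraform script at a different backing state per environment is partial backend configuration, sketched below (bucket and key values hypothetical):

```hcl
# The backend block is left empty; each pipeline job fills in the details.
terraform {
  backend "s3" {}
}

# In the pipeline job for each environment, run:
#   terraform init \
#     -backend-config="bucket=my-terraform-state" \
#     -backend-config="key=genesys/dev/terraform.tfstate" \
#     -backend-config="region=us-east-1"
# varying the key per environment, with any other environment-specific
# values supplied as TF_VAR_* environment variables.
```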

  6. Is there a procedure for restoring objects that were accidentally destroyed (asking for a friend... :sweat_smile:)? I could see a scenario where maybe the wrong state or config definition was used when applying changes (terraform apply), and for objects missing in the definition Terraform deletes them from the target org. Would this involve opening a case with customer care? Could all those objects and their associations be restored (such as queue membership and skills)? This also assumes that there isn't an existing definition config that includes all users, queues, skills, etc.
  • Unfortunately, all actions in terraform apply are final. If your definition is missing an object that is present in the state, Terraform will delete it. Our support group will not have the ability to restore these items. However, all your Terraform definitions should be kept in a source control system, so you should be able to find the definition from before the destroy and re-apply it.

  • Also, remember that you can run a terraform plan before any deployment to see exactly what Terraform says it is going to do. As a safeguard, I have seen Terraform users break production deploys into a two-part workflow where the change is run with a terraform plan and must be manually approved by a human being before it is "released" to run against production with a terraform apply.

  7. Should data sources only be defined and used as dependencies for resources in a config definition if those objects themselves are not defined as resources (if that makes sense)? For example, if we have wrap codes and queues set up as resources in the config, we would not need to also define the wrap codes as data sources and refer to them in the queue resources. We would instead just refer to the wrap code resources. I'm mainly asking to confirm that we don't need to duplicate our resources as data sources if it isn't necessary.
  • If you are not managing a resource in the Terraform file (and the backing state associated with the environment), you need to use a data source to look up the object. If you are managing the resource in the Terraform file (and the backing state), you can get the id for the item by referencing its fully qualified name (e.g. genesyscloud_routing_queue.myqueuename.id).
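A sketch of the two cases side by side, assuming the Genesys Cloud provider's wrap-up code resource and data source (names hypothetical):

```hcl
# Case 1: the wrap-up code is NOT managed in this config/state,
# so look it up with a data source...
data "genesyscloud_routing_wrapupcode" "existing" {
  name = "Resolved"
}
# ...and reference data.genesyscloud_routing_wrapupcode.existing.id

# Case 2: the wrap-up code IS managed here as a resource...
resource "genesyscloud_routing_wrapupcode" "resolved" {
  name = "Resolved"
}
# ...so reference genesyscloud_routing_wrapupcode.resolved.id directly;
# no duplicate data source is needed.
```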
  8. What's the long-term plan for using CX as Code with a DR org? Are there best practices currently documented for this purpose? For example, is it best to export an entire org's config for all objects on a regular basis, store it in a version control repo, and use this in an emergency against a DR org? Or perhaps do this once and set up some service or ability to keep the orgs in sync?
  • Our position is that CX as Code is part of a DR solution, not the entire DR solution. What I mean by that is that every customer has different needs for DR, and the entire Genesys Cloud org is not managed through CX as Code (for example, transactional data like user recordings is not managed by CX as Code). Instead, CX as Code is best used for managing infrastructure-related items like queue definitions, skills, etc. You could do a regular export, but frankly, if you are already managing your infrastructure in CX as Code, your CX as Code definitions should already be defined and stored inside a source control repository. I have seen some customers keep a standby environment for DR and then use CX as Code (along with custom scripts for a number of non-CX as Code managed resources) to deploy any changes made to production immediately to their DR environment. This way your DR environment is just another environment that changes are pushed to.

  • I personally push people to apply changes to their DR environment shortly after their production deploys occur and then run a set of automated tests against the DR environment to ensure nothing is broken. The problem with exporting and storing config and then trying to recreate the environment from scratch is that it's too easy for code and config to become stale, and you do not find out about it until the world is burning down around you. Prompt deploys of changes to your DR environment after a production deploy, coupled with automated testing, tend to surface issues quicker.

Thanks!
