Prioritise SkillGroup deletion over creation

Hi Team
Yesterday we had to change the skill expression of a few SkillGroups.
The change required each SkillGroup to be deleted and then recreated.

Ideally, in this situation, the logical sequence of events is to first delete the existing SkillGroup resource and then create it as another resource. However, when the PR was applied, Terraform tried to create the SkillGroup first and then delete the existing one.
Unsurprisingly, the create request failed with the error below:

**API Error: 400 - group named [redacted] already exists**

Terraform then went on to delete the existing SkillGroup a few seconds later. The result was that the SkillGroup was deleted but never recreated. For about an hour, calls queued up on the associated queues because these skill groups no longer existed.

This incident would have been avoided had Terraform deleted the existing SkillGroup first and then created the new one.
Can you please look into whether this ordering can be implemented so that such customer-impacting incidents do not occur?

Below are extracts from the TF logs, with timestamps, showing the sequence described above.

Plan to create:
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]: Plan to create","@module":"terraform.ui","@timestamp":"2024-10-23T23:59:14.398818Z","change":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"agent_groups_with_language_and_one_not_condition","resource_key":"redacted"},"action":"create"},"type":"planned_change"}

Plan to delete
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]: Plan to delete","@module":"terraform.ui","@timestamp":"2024-10-23T23:59:14.399642Z","change":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"skillgroups_common","resource_key":"redacted"},"action":"delete","reason":"delete_because_each_key"},"type":"planned_change"}

Creating
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]: Creating...","@module":"terraform.ui","@timestamp":"2024-10-24T00:08:08.468801Z","hook":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"agent_groups_with_language_and_one_not_condition","resource_key":"redacted"},"action":"create"},"type":"apply_start"}

Creation errored
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]: Creation errored after 1s","@module":"terraform.ui","@timestamp":"2024-10-24T00:08:09.297000Z","hook":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"agent_groups_with_language_and_one_not_condition","resource_key":"redacted"},"action":"create","elapsed_seconds":1},"type":"apply_errored"}

Destroying
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]: Destroying... [id=dd86679f-aac8-4d33-a32b-ecdfe907dc9c]","@module":"terraform.ui","@timestamp":"2024-10-24T00:08:13.885206Z","hook":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"skillgroups_common","resource_key":"redacted"},"action":"delete","id_key":"id","id_value":"dd86679f-aac8-4d33-a32b-ecdfe907dc9c"},"type":"apply_start"}

Destruction complete
{"@level":"info","@message":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]: Destruction complete after 1s","@module":"terraform.ui","@timestamp":"2024-10-24T00:08:14.755553Z","hook":{"resource":{"addr":"module.skillgroups.genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","module":"module.skillgroups","resource":"genesyscloud_routing_skill_group.skillgroups_common[\"redacted\"]","implied_provider":"genesyscloud","resource_type":"genesyscloud_routing_skill_group","resource_name":"skillgroups_common","resource_key":"redacted"},"action":"delete","elapsed_seconds":1},"type":"apply_complete"}

Hi Nishant,

Can you send the routing queue definition and the conditional skill group involved to your TAM Premkumar?

Thanks,
John Carnell
Director, Developer Engagement

Hi John
I have passed the requested details onto Premkumar.

Would it also help if I send out the TF log as well through him?

Regards
Nishant Tank

Hi Nishant,

A couple of quick questions.

  1. I am curious as to why you did not update the skill groups in place?
  2. I noticed you are using modules to create your skill groups. Did you define two different module invocations,
    where the first module call removes the skill group from the array you were passing in and the second
    module call creates the skill group? Was this all done in the same Terraform run?

For example, let's say I wrote a module to create a list of queues based on an array passed into it and then tried to do this:

module "classifier_queues1" {
  source                   = "./modules/queues"
  classifier_queue_names   = ["HSA"]
  classifier_queue_members = module.classifier_users.user_ids
}

Terraform would happily go and create the HSA queue.

Now later on you decided to delete and re-create the HSA queue by doing this

module "classifier_queues1" {
  source                   = "./modules/queues"
  classifier_queue_names   = []
  classifier_queue_members = module.classifier_users.user_ids
}

module "classifier_queues2" {
  source                   = "./modules/queues"
  classifier_queue_names   = ["HSA"]
  classifier_queue_members = module.classifier_users.user_ids
}

You are now creating a race condition because Terraform has two different module declarations with no dependency between them. When Terraform executes, it builds its dependency graph and then hands all items without dependencies to a pool of goroutines, so nothing in the graph establishes the order in which the delete and the create run.

I suspect this is the case because I see two different objects with the same name being created and deleted in the plan. I looked through the schema definition, and none of the attributes on the skill group resource force a destroy and recreate.

This is not a problem in the CX as Code resource. From what I can see in the log messages, you are spinning up two different "resources" (e.g. modules) that are trying to manage the same group of objects. Terraform happily executes them because there is no dependency between the modules.

Modules using a for_each loop are a great way of enforcing consistency when you are creating, updating, or deleting objects en masse. However, it sounds like you needed to change one skill group's skill expression so that it differed from all of the others, and since it was in a module you had no easy way of updating it. In that case, you would need to manually remove the object from the Terraform state with a terraform state rm command, import it into your project as a separate resource with a terraform import command, and then make your changes.
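
As a hedged aside on the state change described above: on Terraform 1.1 or later, a `moved` block is a different technique that records the same kind of address change in configuration rather than through manual state commands. The sketch below assumes both resource blocks are declared inside the same module and reuses the resource names from the log extracts. The "redacted" key stands in for the real skill group key, and whether the provider then plans an in-place update depends on which attributes changed.

```hcl
# Sketch only: placed inside the module that declares both resources.
# It tells Terraform that the object previously tracked under
# skillgroups_common is now tracked under
# agent_groups_with_language_and_one_not_condition, so the plan can treat
# it as a move (plus any in-place update) instead of an unrelated
# create and delete.
moved {
  from = genesyscloud_routing_skill_group.skillgroups_common["redacted"]
  to   = genesyscloud_routing_skill_group.agent_groups_with_language_and_one_not_condition["redacted"]
}
```

Running terraform plan before applying is the way to confirm that no separate create or delete action remains for that key.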

Thanks,
John Carnell
Director, Developer Engagement

John
You are right in observing that, by changing the Skill Group definition, we are effectively creating it as a different resource.
Our skill groups are categorized into different arrays, each passed to a dedicated resource that uses a for_each loop to create and update the skill groups.
In this case, the definition of a few queues changed such that the skill groups were removed from one array and added to another.
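
A minimal sketch of the layout described above, to show why Terraform plans an independent create and delete. The variable names `common_groups` and `language_not_groups` and the shape of their values are hypothetical. The resource names come from the log extracts, and the `name` and `skill_conditions` arguments and the provider source are assumptions based on the provider's skill group resource.

```hcl
terraform {
  required_providers {
    genesyscloud = {
      source = "mypurecloud/genesyscloud" # assumed provider source
    }
  }
}

# Hypothetical inputs: one map per category of skill group,
# keyed by group name, with a JSON-encoded skill expression as the value.
variable "common_groups" {
  type = map(string)
}

variable "language_not_groups" {
  type = map(string)
}

resource "genesyscloud_routing_skill_group" "skillgroups_common" {
  for_each         = var.common_groups
  name             = each.key
  skill_conditions = each.value
}

resource "genesyscloud_routing_skill_group" "agent_groups_with_language_and_one_not_condition" {
  for_each         = var.language_not_groups
  name             = each.key
  skill_conditions = each.value
}

# Moving a key from common_groups to language_not_groups removes an instance
# at one resource address and adds an instance at another. The two resources
# do not reference each other, so the graph has no edge that orders the
# delete before the create.
```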

My question in this scenario was whether Terraform can be set up so that it first deletes any objects that need deleting and only then performs the creation.

Looking at your detailed analysis, it seems that may not be possible. If so, the only alternative is for us to recognize whenever such a requirement comes in and inform the business that a disruption should be expected.

Regards
Nishant Tank
