Nasty surprise with Terraform and AKS

This article is long and technical. It describes an issue we ran into while adapting the Terraform code we use to deploy Azure Kubernetes Service (AKS) clusters.

Context

We are currently setting up an AKS offering at work. Like all the services we develop, it is deployed preferably on Azure with Terraform code. We also use YAML configuration files stored in git repositories as the source for the values assigned to Terraform variables.

As we decided to use Azure Application Gateway as the Ingress Controller for our clusters, we needed to adapt the current Terraform code to add a new subnet for the Application Gateway in the AKS virtual network, create the Application Gateway itself, and enable the corresponding AKS addon.

Everything being stored in git, I created a new branch and started modifying the code. When I was ready, I ran terraform plan and … TA DAAAAA! Terraform informed me that the existing clusters must be replaced because of a change in the vnet id associated with the AKS default node pool!

… Hummm … not good … let’s investigate what’s going on there …

Investigations

I started by creating a small test case containing the minimum set of resources:

  • A resource group
  • A virtual network with one subnet

No need for an AKS cluster, because we know that the root cause is related to the addition of a subnet in the existing AKS vnet.

The YAML configuration file (config.yml) has the following structure:

k8s:
  seb:
    region: West Europe
    vnet:
      cidr:
        - 10.144.0.0/16
      subnets:
        aks_default_node_subnet: 10.144.0.0/22
        aks_worker_node_subnet: 10.144.4.0/22
        # aks_agw_subnet: 10.144.240.0/27

The Terraform code to deploy the resources is :

locals {
  var_file         = "./config.yml"
  var_file_content = fileexists(local.var_file) ? file(local.var_file) : "NoSettingsFileFound: true"
  var_config       = yamldecode(local.var_file_content)
}

variable "environment" {
  type    = string
  default = "dev"
}

resource "azurerm_resource_group" "aks" {
  for_each = local.var_config.k8s

  name     = "rg-aks-${each.key}-${var.environment}"
  location = each.value.region
}

# Create AKS vnet with dynamic block -> problem when you want to add a subnet
resource "azurerm_virtual_network" "aks-net" {
  for_each = local.var_config.k8s

  name                = "vnet-aks-${each.key}-${var.environment}"
  location            = azurerm_resource_group.aks[each.key].location
  resource_group_name = azurerm_resource_group.aks[each.key].name
  address_space       = each.value.vnet.cidr

  dynamic "subnet" {
    for_each = each.value.vnet.subnets
    content {
      name           = "${subnet.key}-${var.environment}"
      address_prefix = subnet.value
    }
  }
}

Applying that Terraform configuration produces the following result:

terraform apply

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_resource_group.aks["seb"] will be created
  + resource "azurerm_resource_group" "aks" {
      + id       = (known after apply)
      + location = "westeurope"
      + name     = "rg-aks-seb-dev"
    }

  # azurerm_virtual_network.aks-net["seb"] will be created
  + resource "azurerm_virtual_network" "aks-net" {
      + address_space         = [
          + "10.144.0.0/16",
        ]
      + guid                  = (known after apply)
      + id                    = (known after apply)
      + location              = "westeurope"
      + name                  = "vnet-aks-seb-dev"
      + resource_group_name   = "rg-aks-seb-dev"
      + subnet                = [
          + {
              + address_prefix = "10.144.0.0/22"
              + id             = (known after apply)
              + name           = "aks_default_node_subnet-dev"
              + security_group = ""
            },
          + {
              + address_prefix = "10.144.4.0/22"
              + id             = (known after apply)
              + name           = "aks_worker_node_subnet-dev"
              + security_group = ""
            },
        ]
      + vm_protection_enabled = false
    }

Plan: 2 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

azurerm_resource_group.aks["seb"]: Creating...
azurerm_resource_group.aks["seb"]: Creation complete after 1s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev]
azurerm_virtual_network.aks-net["seb"]: Creating...
azurerm_virtual_network.aks-net["seb"]: Creation complete after 5s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

OK, this is the starting situation. Now we add a new subnet to the existing vnet by uncommenting the line with aks_agw_subnet: 10.144.240.0/27 in the file config.yml.

k8s:
  seb:
    region: West Europe
    vnet:
      cidr:
        - 10.144.0.0/16
      subnets:
        aks_default_node_subnet: 10.144.0.0/22
        aks_worker_node_subnet: 10.144.4.0/22
        aks_agw_subnet: 10.144.240.0/27

Run terraform plan to assess the impact of our modification.

terraform plan

azurerm_resource_group.aks["seb"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev]
azurerm_virtual_network.aks-net["seb"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # azurerm_virtual_network.aks-net["seb"] has been changed
  ~ resource "azurerm_virtual_network" "aks-net" {
      + dns_servers           = []
        id                    = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev"
        name                  = "vnet-aks-seb-dev"
        # (6 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo
or respond to these changes.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # azurerm_virtual_network.aks-net["seb"] will be updated in-place
  ~ resource "azurerm_virtual_network" "aks-net" {
        id                    = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev"
        name                  = "vnet-aks-seb-dev"
      ~ subnet                = [
          - {
              - address_prefix = "10.144.0.0/22"
              - id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev"
              - name           = "aks_default_node_subnet-dev"
              - security_group = ""
            },
          + {
              + address_prefix = "10.144.0.0/22"
              + id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev"
              + name           = "aks_default_node_subnet-dev"
              + security_group = null
            },
          - {
              - address_prefix = "10.144.4.0/22"
              - id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev"
              - name           = "aks_worker_node_subnet-dev"
              - security_group = ""
            },
          + {
              + address_prefix = "10.144.4.0/22"
              + id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev"
              + name           = "aks_worker_node_subnet-dev"
              + security_group = null
            },
            # (1 unchanged element hidden)
        ]
        # (6 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

Here is the root of the issue. The azurerm_virtual_network resource is updated in-place, but the existing subnets are deleted and recreated identically! In the plan above, the only visible difference between the removed and the added elements is security_group going from "" to null. We think this happens because subnets created inline in an azurerm_virtual_network resource are stored in an array in the Terraform state file.

...
    {
      "mode": "managed",
      "type": "azurerm_virtual_network",
      "name": "aks-net",
      "provider": "provider[\"registry.terraform.io/hashicorp/azurerm\"]",
      "instances": [
        {
          "index_key": "seb",
          "schema_version": 0,
          "attributes": {
            "address_space": [
              "10.144.0.0/16"
            ],
            "bgp_community": "",
            "ddos_protection_plan": [],
            "dns_servers": [],
            ...
            "id": "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev",
            "location": "westeurope",
            "name": "vnet-aks-seb-dev",
            "resource_group_name": "rg-aks-seb-dev",
            "subnet": [
              {
                "address_prefix": "10.144.0.0/22",
                "id": "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev",
                "name": "aks_default_node_subnet-dev",
                "security_group": ""
              },
              {
                "address_prefix": "10.144.240.0/27",
                "id": "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_agw_subnet-dev",
                "name": "aks_agw_subnet-dev",
                "security_group": ""
              },
              {
                "address_prefix": "10.144.4.0/22",
                "id": "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev",
                "name": "aks_worker_node_subnet-dev",
                "security_group": ""
              }
            ],    
...

When you modify an array of resources in Terraform, the resources in that array are recreated. You can read a nice explanation of that expected behavior here.

The final situation is certainly correct from a networking point of view, but if you create an AKS cluster resource that uses one of the existing subnets for its default node pool, this action forces the cluster to be replaced! You will then find yourself with an empty AKS cluster … not a good idea …

Before going any further, run terraform destroy to clean up all the resources.
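To make the dependency concrete, here is a minimal sketch of how a cluster could reference one of the inline subnets. It is not part of the test case: the resource name aks, the node pool settings and the VM size are assumptions chosen for illustration, and the lookup with one() is only one possible way to find the subnet id in the vnet's subnet attribute.

```hcl
# Hypothetical cluster definition, for illustration only.
resource "azurerm_kubernetes_cluster" "aks" {
  for_each = local.var_config.k8s

  name                = "aks-${each.key}-${var.environment}"
  location            = azurerm_resource_group.aks[each.key].location
  resource_group_name = azurerm_resource_group.aks[each.key].name
  dns_prefix          = "aks-${each.key}-${var.environment}"

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2s_v3" # assumed size

    # Look up the subnet id among the subnets declared inline in the vnet.
    vnet_subnet_id = one([
      for s in azurerm_virtual_network.aks-net[each.key].subnet :
      s.id if s.name == "aks_default_node_subnet-${var.environment}"
    ])
  }

  identity {
    type = "SystemAssigned"
  }
}
```

Because vnet_subnet_id is derived from the vnet's subnet attribute, any plan that removes and re-adds the subnet elements can propagate to the node pool, which matches the replacement we observed.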

Solution

To avoid that issue, we need to define each subnet as a separate Terraform resource using the for_each meta-argument. Simple, no?

In fact, not so simple. The config.yml file describes the subnets with a nested structure that is human friendly. As nested for_each loops are not possible in Terraform, we need to transform the data structure into something else. This use case is very similar to what is described in the Terraform documentation about the flatten function.

First we need to create a flattened list of subnets that contains all the needed information to be able to define them as individual resources with a unique identifier. We do this by creating a new local variable var_snet_list.

locals {
  var_file         = "./config.yml"
  var_file_content = fileexists(local.var_file) ? file(local.var_file) : "NoSettingsFileFound: true"
  var_config       = yamldecode(local.var_file_content)

  # Flatten the nested k8s -> vnet -> subnets structure into a single list of subnet objects
  var_snet_list = flatten([
    for k8s_key, k8s in local.var_config.k8s : [
      for snet_key, snet in k8s.vnet.subnets : {
        k8s_name      = k8s_key
        snet_name     = snet_key
        snet_prefixes = [snet]
      }
    ]
  ])
}

The result of that transformation on the content of the config.yml file is:

[
      + {
          + k8s_name      = "seb"
          + snet_name     = "aks_default_node_subnet"
          + snet_prefixes = [
              + "10.144.0.0/22",
            ]
        },
      + {
          + k8s_name      = "seb"
          + snet_name     = "aks_worker_node_subnet"
          + snet_prefixes = [
              + "10.144.4.0/22",
            ]
        },
    ]

It’s a list of objects whose fields are:

  • k8s_name : the name of the corresponding AKS cluster.
  • snet_name : the name of the subnet.
  • snet_prefixes : a list of prefixes associated with the subnet.

Then we modify the azurerm_virtual_network resource so that it no longer uses the dynamic subnet block, and we add dedicated azurerm_subnet resources using a for_each loop on a transformed version of var_snet_list.

# Create AKS vnet without subnets
resource "azurerm_virtual_network" "aks-net" {
  for_each = local.var_config.k8s

  name                = "vnet-aks-${each.key}-${var.environment}"
  location            = azurerm_resource_group.aks[each.key].location
  resource_group_name = azurerm_resource_group.aks[each.key].name
  address_space       = each.value.vnet.cidr
}

# Create subnets as separate resources
resource "azurerm_subnet" "aks-subnet" {
  for_each = { for subnet in local.var_snet_list : "${subnet.k8s_name}.${subnet.snet_name}" => subnet }

  name                 = "${each.value.snet_name}-${var.environment}"
  virtual_network_name = azurerm_virtual_network.aks-net[each.value.k8s_name].name
  resource_group_name  = azurerm_resource_group.aks[each.value.k8s_name].name
  address_prefixes     = each.value.snet_prefixes
}

The for_each expression of the azurerm_subnet resource (line 13 of the listing) produces the following structure to iterate on:

{
      + seb.aks_default_node_subnet     = {
          + k8s_name      = "seb"
          + snet_name     = "aks_default_node_subnet"
          + snet_prefixes = [
              + "10.144.0.0/22",
            ]
        }
      + seb.aks_worker_node_subnet      = {
          + k8s_name      = "seb"
          + snet_name     = "aks_worker_node_subnet"
          + snet_prefixes = [
              + "10.144.4.0/22",
            ]
        }
    }

It is a map of independent subnets whose keys combine the AKS cluster name and the subnet name to identify each subnet uniquely. This is crucial to assign the right snet_prefixes to each cluster, because subnets in different clusters share the same names.
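The following standalone sketch illustrates why the composite key matters. It applies the same flatten and for expressions to a hypothetical input with two clusters that both declare a subnet named aks_default_node_subnet. The cluster bob, its prefix, and all the demo_* names are invented for this example.

```hcl
# Hypothetical data: "bob" is an invented second cluster, not part of config.yml.
locals {
  demo_k8s = {
    seb = { vnet = { subnets = { aks_default_node_subnet = "10.144.0.0/22" } } }
    bob = { vnet = { subnets = { aks_default_node_subnet = "10.145.0.0/22" } } }
  }

  demo_snet_list = flatten([
    for k8s_key, k8s in local.demo_k8s : [
      for snet_key, snet in k8s.vnet.subnets : {
        k8s_name      = k8s_key
        snet_name     = snet_key
        snet_prefixes = [snet]
      }
    ]
  ])

  # Keying on snet_name alone would produce the same key twice, which
  # Terraform rejects; prefixing with the cluster name keeps keys unique.
  demo_snet_map = {
    for subnet in local.demo_snet_list :
    "${subnet.k8s_name}.${subnet.snet_name}" => subnet
  }
}

output "demo_snet_keys" {
  # keys() returns the map keys in lexical order.
  value = keys(local.demo_snet_map)
}
```

Applied in an empty directory, this configuration should output the two keys bob.aks_default_node_subnet and seb.aks_default_node_subnet, one per cluster, each carrying its own prefix.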

Let’s create the resources with the new code.

terraform apply

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_resource_group.aks["seb"] will be created
  + resource "azurerm_resource_group" "aks" {
      + id       = (known after apply)
      + location = "westeurope"
      + name     = "rg-aks-seb-dev"
    }

  # azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"] will be created
  + resource "azurerm_subnet" "aks-subnet" {
      + address_prefix                                 = (known after apply)
      + address_prefixes                               = [
          + "10.144.0.0/22",
        ]
      + enforce_private_link_endpoint_network_policies = false
      + enforce_private_link_service_network_policies  = false
      + id                                             = (known after apply)
      + name                                           = "aks_default_node_subnet-dev"
      + resource_group_name                            = "rg-aks-seb-dev"
      + virtual_network_name                           = "vnet-aks-seb-dev"
    }

  # azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"] will be created
  + resource "azurerm_subnet" "aks-subnet" {
      + address_prefix                                 = (known after apply)
      + address_prefixes                               = [
          + "10.144.4.0/22",
        ]
      + enforce_private_link_endpoint_network_policies = false
      + enforce_private_link_service_network_policies  = false
      + id                                             = (known after apply)
      + name                                           = "aks_worker_node_subnet-dev"
      + resource_group_name                            = "rg-aks-seb-dev"
      + virtual_network_name                           = "vnet-aks-seb-dev"
    }

  # azurerm_virtual_network.aks-net["seb"] will be created
  + resource "azurerm_virtual_network" "aks-net" {
      + address_space         = [
          + "10.144.0.0/16",
        ]
      + guid                  = (known after apply)
      + id                    = (known after apply)
      + location              = "westeurope"
      + name                  = "vnet-aks-seb-dev"
      + resource_group_name   = "rg-aks-seb-dev"
      + subnet                = (known after apply)
      + vm_protection_enabled = false
    }

Plan: 4 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

azurerm_resource_group.aks["seb"]: Creating...
azurerm_resource_group.aks["seb"]: Creation complete after 1s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev]
azurerm_virtual_network.aks-net["seb"]: Creating...
azurerm_virtual_network.aks-net["seb"]: Creation complete after 5s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev]
azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"]: Creating...
azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"]: Creating...
azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"]: Creation complete after 5s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev]
azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"]: Creation complete after 8s [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Now let’s uncomment the aks_agw_subnet: 10.144.240.0/27 line in config.yml and check what terraform plan proposes …

terraform plan
azurerm_resource_group.aks["seb"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev]
azurerm_virtual_network.aks-net["seb"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev]
azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev]
azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"]: Refreshing state... [id=/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"] has been changed
  ~ resource "azurerm_subnet" "aks-subnet" {
        id                                             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev"
        name                                           = "aks_worker_node_subnet-dev"
      + service_endpoint_policy_ids                    = []
      + service_endpoints                              = []
        # (6 unchanged attributes hidden)
    }
  # azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"] has been changed
  ~ resource "azurerm_subnet" "aks-subnet" {
        id                                             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev"
        name                                           = "aks_default_node_subnet-dev"
      + service_endpoint_policy_ids                    = []
      + service_endpoints                              = []
        # (6 unchanged attributes hidden)
    }
  # azurerm_virtual_network.aks-net["seb"] has been changed
  ~ resource "azurerm_virtual_network" "aks-net" {
      + dns_servers           = []
        id                    = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev"
        name                  = "vnet-aks-seb-dev"
      ~ subnet                = [
          + {
              + address_prefix = "10.144.0.0/22"
              + id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev"
              + name           = "aks_default_node_subnet-dev"
              + security_group = ""
            },
          + {
              + address_prefix = "10.144.4.0/22"
              + id             = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev"
              + name           = "aks_worker_node_subnet-dev"
              + security_group = ""
            },
        ]
        # (5 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo
or respond to these changes.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_subnet.aks-subnet["seb.aks_agw_subnet"] will be created
  + resource "azurerm_subnet" "aks-subnet" {
      + address_prefix                                 = (known after apply)
      + address_prefixes                               = [
          + "10.144.240.0/27",
        ]
      + enforce_private_link_endpoint_network_policies = false
      + enforce_private_link_service_network_policies  = false
      + id                                             = (known after apply)
      + name                                           = "aks_agw_subnet-dev"
      + resource_group_name                            = "rg-aks-seb-dev"
      + virtual_network_name                           = "vnet-aks-seb-dev"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

This looks much better! The existing subnets are no longer deleted and recreated. Only the new one is added.

Note: you can ignore the messages stating that Terraform has detected changes made outside of Terraform. They are caused by default values of optional parameters that the Azure API returns when Terraform refreshes the state. Reporting these changes is a feature introduced in Terraform release 0.15.4.

Thanks to this refactoring of the Terraform code, we can now deploy new subnets in the AKS vnet without replacing the cluster.
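With the subnets defined as separate resources, other resources can address a given subnet directly by its composite key. The sketch below is an assumption added for illustration (the output name is invented): it exposes the id of the default node subnet of each cluster, which is the kind of value a cluster's default node pool would consume.

```hcl
# Hypothetical output: one default node subnet id per cluster.
output "aks_default_node_subnet_ids" {
  value = {
    for k8s_key, k8s in local.var_config.k8s :
    k8s_key => azurerm_subnet.aks-subnet["${k8s_key}.aks_default_node_subnet"].id
  }
}
```

The same expression, with each.key in place of k8s_key, could be assigned to the subnet argument of a node pool declared with for_each over local.var_config.k8s.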

Conclusion

Mission accomplished … well, for future clusters at least. To fix the existing ones, we would need to delete the resources from the state file and re-import them with the new Terraform code … not worth the time, as they are temporary test clusters that will soon be deleted.

We are happy to have caught this problem before going to production.
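For reference, on Terraform 1.5 or later (an assumption, as this is newer than the release used in this article), the re-import part could be declared in code with import blocks rather than run by hand. The sketch below has not been tested against the clusters discussed here, and the ids keep the sanitized placeholder used in the outputs above.

```hcl
# Hypothetical import blocks, one per existing subnet (requires Terraform >= 1.5).
import {
  to = azurerm_subnet.aks-subnet["seb.aks_default_node_subnet"]
  id = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_default_node_subnet-dev"
}

import {
  to = azurerm_subnet.aks-subnet["seb.aks_worker_node_subnet"]
  id = "/subscriptions/<sanitized>/resourceGroups/rg-aks-seb-dev/providers/Microsoft.Network/virtualNetworks/vnet-aks-seb-dev/subnets/aks_worker_node_subnet-dev"
}
```

Running terraform plan with these blocks in place would show the subnets as imported instead of created. Whether the vnet entry in the state needs any additional handling would have to be checked on a test cluster first.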

See you next time.
