Create Azure resource health alert with terraform

Greetings everyone. No, this blog is not (yet) dead. It is true that for a long time I had neither the motivation nor a nice topic to write about.

But today is the day and we will start with a short one that will save you some frustration, at least I hope.

Azure Resource Health

I’m currently working on an observability task for various components deployed in Azure. One of the first signals I want to be alerted on is the current health of the deployed resources.

In Azure, resource health is the feature that offers that functionality. Not all resource types have health checks implemented. Please refer to the documentation for more information about the support status.

Basically, resource health is evaluated at regular intervals and, when a status change is detected, an event is made available in the Azure Activity Log.

Having those events available in the activity log is nice, but it is much more interesting to be notified automatically when they occur. This is possible with the help of activity log alerts. Creating such an alert for resource health events is documented if you want to use the portal or an ARM template, but the terraform option is not, and it can be a bit frustrating if you do not know which details to fill in when creating the alert. Let’s have a look.

Create activity log alert for resource health using terraform

The terraform resource azurerm_monitor_activity_log_alert creates an activity log alert based on multiple criteria that are not well documented.

There are different types of activity log events and you should filter on the correct one. For resource health events, you need to specify category = "ResourceHealth".

You should also specify at least one scope which can be a subscription id, a resource group id or a resource id.
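
To make the three kinds of scope concrete, here is a small sketch of how each ID could be built. The resource group name example-rg and the VM name example-vm are placeholders I made up for illustration; only the data sources themselves come from the azurerm provider.

```hcl
# Subscription the provider is currently authenticated against.
data "azurerm_subscription" "current" {}

# Hypothetical resource group; replace the name with one that exists.
data "azurerm_resource_group" "example" {
  name = "example-rg"
}

locals {
  # Subscription scope, of the form /subscriptions/<subscription-guid>
  subscription_scope = data.azurerm_subscription.current.id

  # Resource group scope, of the form /subscriptions/<guid>/resourceGroups/<name>
  resource_group_scope = data.azurerm_resource_group.example.id

  # Single resource scope; "example-vm" is a made-up VM name.
  vm_scope = "${data.azurerm_resource_group.example.id}/providers/Microsoft.Compute/virtualMachines/example-vm"
}
```

Any one of these locals could then be passed in the scopes list of the alert.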

Back to the filter criteria: you had better add more filters, otherwise you’ll generate a lot of alerts and end up with alert fatigue. You can use any of the following criteria to narrow down the alerts:

  • operation_name
  • resource_provider
  • resource_type
  • resource_group
  • resource_id
  • caller
  • level
  • status
  • sub_status

As there is a limit on the number of activity log alerts per subscription, a good idea is to define resource health activity log alerts per resource type. The value expected by this filter criterion is not clearly documented and, being the donkey I am, I first created an alert for virtual machines in the portal and had a look at how the resource type was displayed there.
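
The "one alert per resource type" idea could be sketched with for_each, roughly as below. This is only an outline under my own assumptions: the resource_types map (short label to resource type string) is a hypothetical input, and the exact strings the filter expects are deliberately left to the caller here.

```hcl
variable "resource_group_name" { type = string }
variable "scope" { type = string }
variable "action_group_id" { type = string }

# Hypothetical input: short label => resource type string used in the filter.
variable "resource_types" {
  type = map(string)
}

resource "azurerm_monitor_activity_log_alert" "per_type" {
  # One alert rule per entry in the map.
  for_each = var.resource_types

  name                = "resource-health-${each.key}"
  description         = "Resource health alert for ${each.key}"
  resource_group_name = var.resource_group_name
  scopes              = [var.scope]
  # Depending on the azurerm provider version, a `location` argument
  # may also be required here; check the provider documentation.

  criteria {
    category      = "ResourceHealth"
    resource_type = each.value
  }

  action {
    action_group_id = var.action_group_id
  }
}
```

Using a map rather than a plain list keeps the alert names free of characters such as slashes that may appear in the resource type string.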

fig 1 - Azure resource health alert definition in the portal

So I tried to use the following terraform code:

resource "azurerm_monitor_activity_log_alert" "resourcehealth" {
  name                = "resource_health_alert"
  description         = "Resource Health Alerts"
  resource_group_name = var.resource_group_name
  scopes              = [var.scope]
  criteria {
    resource_id    = var.resource_id
    category       = "ResourceHealth"
    resource_type  = "Virtual machines"     # <---- display name copied from the portal
    resource_group = ""
    resource_health {
      current  = ["Degraded", "Unavailable", "Unknown"]
      previous = ["Available"]
      reason   = ["PlatformInitiated", "UserInitiated", "Unknown"]
    }
  }
  action {
    action_group_id = var.action_group_id
  }
  tags = var.tags
}

And sure, terraform apply did work and my alert rule was created and visible in the portal … and still, no alerts were triggered when I stopped and started my test VM …

I’ll spare you the details of all the dead ends I went down until I had the bright idea to check the structure of one of the health event JSON objects in the activity log.

{
    "channels": "Operation",
    "correlationId": "84f30f94-3d1e-439c-9ce2-869e4b363369",
    "description": "",
    "eventDataId": "d905d68b-8e78-4889-a7b8-1f3c516d71c3",
    "eventName": {
        "value": "",
        "localizedValue": ""
    },
    "category": {
        "value": "ResourceHealth",
        "localizedValue": "Resource Health"
    },
    "eventTimestamp": "2023-04-27T14:52:24.762Z",
    "id": "/SUBSCRIPTIONS/[REDACTED]/RESOURCEGROUPS/[REDACTED]/PROVIDERS/MICROSOFT.COMPUTE/VIRTUALMACHINES/[REDACTED]/events/d905d68b-8e78-4889-a7b8-1f3c516d71c3/ticks/638182039447620000",
    "level": "Informational",
    "operationId": "",
    "operationName": {
        "value": "Microsoft.Resourcehealth/healthevent/Updated/action",
        "localizedValue": "Health Event Updated"
    },
    "resourceGroupName": "[REDACTED]",
    "resourceProviderName": {
        "value": "MICROSOFT.COMPUTE",
        "localizedValue": "MICROSOFT.COMPUTE"
    },
    "resourceType": {
        "value": "MICROSOFT.COMPUTE/virtualmachines",
        "localizedValue": "MICROSOFT.COMPUTE/virtualmachines"
    },
    "resourceId": "/SUBSCRIPTIONS/[REDACTED]/RESOURCEGROUPS/[REDACTED]/PROVIDERS/MICROSOFT.COMPUTE/VIRTUALMACHINES/[REDACTED]",
    "status": {
        "value": "Updated",
        "localizedValue": "Updated"
    },
    "subStatus": {
        "value": "",
        "localizedValue": ""
    },
    "submissionTimestamp": "2023-04-27T14:52:24.762Z",
    "subscriptionId": "[REDACTED]",
    "tenantId": "",
    "properties": {
        "title": "Stopping and deallocating",
        "details": "This virtual machine is stopped and deallocated as requested by an authorized user or process.",
        "currentHealthStatus": "Available",
        "previousHealthStatus": "Available",
        "type": "Downtime",
        "cause": "UserInitiated"
    },
    "relatedEvents": []
}

Of course! Stupid me! The filter expects the technical name of the resource type, microsoft.compute/virtualmachines, and not the display name Virtual machines that is shown in the portal.
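
For reference, a sketch of the corrected resource could look like this. It mirrors my first attempt with only resource_type changed; the variable declarations are added here as assumptions so that the snippet stands on its own.

```hcl
# Assumed inputs, declared only to make the sketch self-contained.
variable "resource_group_name" { type = string }
variable "scope" { type = string }
variable "resource_id" { type = string }
variable "action_group_id" { type = string }
variable "tags" {
  type    = map(string)
  default = {}
}

resource "azurerm_monitor_activity_log_alert" "resourcehealth" {
  name                = "resource_health_alert"
  description         = "Resource Health Alerts"
  resource_group_name = var.resource_group_name
  scopes              = [var.scope]
  # Depending on the azurerm provider version, a `location` argument
  # may also be required here; check the provider documentation.

  criteria {
    resource_id = var.resource_id
    category    = "ResourceHealth"
    # Technical resource type as seen in the event JSON,
    # not the display name shown in the portal.
    resource_type  = "microsoft.compute/virtualmachines"
    resource_group = ""

    resource_health {
      current  = ["Degraded", "Unavailable", "Unknown"]
      previous = ["Available"]
      reason   = ["PlatformInitiated", "UserInitiated", "Unknown"]
    }
  }

  action {
    action_group_id = var.action_group_id
  }

  tags = var.tags
}
```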

Once I fixed that, all alerts triggered fine and notifications started to flow into my mailbox.
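
The mailbox part comes from the action group referenced by the alert. I did not show mine, so the following is only a minimal sketch of what such a group might look like; the names and the e-mail address are placeholders.

```hcl
variable "resource_group_name" { type = string }

# Hypothetical action group that sends an e-mail when an alert fires.
resource "azurerm_monitor_action_group" "email" {
  name                = "resource-health-email"
  resource_group_name = var.resource_group_name
  short_name          = "rh-email" # short names are limited to 12 characters

  email_receiver {
    name          = "ops-mailbox"     # placeholder receiver name
    email_address = "ops@example.com" # placeholder address
  }
}

# The resulting ID is what the alert's action block expects.
output "action_group_id" {
  value = azurerm_monitor_action_group.email.id
}
```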

Conclusion

I was really frustrated by this experience and I hope this small article will help other people find the correct information on this topic quickly.

Picture from Jair Lázaro on Unsplash

Licensed under CC BY-NC-SA 4.0
Last updated on Apr 30, 2023 12:07 UTC