Building Christmas baking factory for crispy custom images

This post was created for the Festive Tech Calendar 2023 event. A lot of great content is published there every day, and I would encourage you to check it out.

💡

Who doesn't like Christmas cookies? We will, however, be baking something else today...

In many scenarios, your organisation may be required to use custom VM images in Azure with apps and settings baked in, rather than relying on "vanilla" marketplace images coupled with last-mile configuration scripts.

The mission

In this post, we will build on top of some great content provided by other community members and Microsoft and create an automated Image Factory (like Santa's workshop) that adds enterprise applications to an image for an Azure Virtual Desktop scenario. The factory can be used for other use cases too, of course.

There are several strategies or approaches for application delivery / deployment:

  • Golden image

  • Configuration management tools: anything from traditional Microsoft SCCM, through declarative 'as code' options like Ansible or Chef to MDM/MAM solutions like Microsoft Intune

  • Last-mile configuration - running a post-deployment script once your VM is up and running, in Azure typically in the form of a custom script extension or 'cloud-init'.

  • "Attach on demand" solutions like MSIX App Attach or Azure VM Apps

Each approach has its advantages and disadvantages, of course. I am not claiming that the Golden image option is the best; I simply find it the most proven and universal (as in 'you should be able to put any type of app in the image'). You don't need to repackage your apps, learn new tools, or self-host infrastructure, as some of the other options require.

Baking tools

What would baking be without good tools, right?

  • Azure Image Builder (aka HashiCorp Packer as a service) - an essential service we will use to 'outsource' all the heavy lifting. TIP: If you search for "AVD + Image Builder" online, you will find a lot of great content from community members like Dean Cefola, Travis Roberts, and John Savill, just to name a few I personally follow.

  • GitHub - this is just my personal preference, but I'm sure Azure DevOps or any other CI/CD tool could work well here. I probably don't need to explain why we need this tool: we want to automate everything, store our code in a single place, enable collaboration, et cetera, et cetera.

The recipe

Whenever I design a new solution, I always try to look at the problem from the main user's perspective. And I don't mean the user of AVD that leverages the custom image, but the engineers that develop the image and its content.

Their objective would be to take the DevOps practices and tools they already know and apply them to automated image build and management.

Something like this:

  1. An engineer develops and tests an unattended installation of software packages in a Dev VM (the OS should match the image OS). They produce an application manifest that contains key information about the app and the installation.

  2. They push updated artefacts - mainly the manifest, but also PowerShell scripts or IaC templates - to a GitHub repo and ask for review via Pull Request. Since binary files (installation packages) are not intended for storing in a git repository, the engineer uploads them directly to the 'Assets storage account'. Azure Storage Explorer or AzCopy can serve well for this purpose.

  3. The GitHub Actions 'Image build workflow' can be triggered manually (workflow_dispatch) or by a push event to the main branch after the PR has been approved.

  4. The workflow will first copy the installation scripts and application manifests to the same 'Assets storage account', so the repo and the blob container are always in sync.

  5. Then it will deploy an Image Template using Az CLI (or Azure PowerShell). This will be used by Azure Image Builder (AIB) as a recipe.

  6. The last step in our workflow is to trigger an image build, so AIB can start baking our new image version.

  7. This new image can be referenced in an AVD deployment as a source for session hosts.

The main success factor for the entire "operation" is knowing how the different applications we want in the image can be installed in an unattended (silent) way. Why? Since AIB does the heavy lifting, there is no way for us to interact with the build.

Different packaging formats use different arguments to trigger a silent installation. Searching the vendor's FAQ (like this one), Stack Overflow, and other helpful content is how you find those arguments.

It is still very important to test them in that 'Dev Sandbox', because troubleshooting failed builds is no fun: AIB / Packer produces a very verbose log, and the image build time is quite long, so you want to test your script to shorten the debugging cycle. Trust me!

A shoutout to Travis for explaining the process so well in this video.

Code walkthrough

All the source code I am using can be found in my GitHub repo:

Workflow

We begin by looking at our workflow in .github/workflows/image-build.yml:

name: Build Custom Image

on:
#   push:
#     branches:
#       - main
#     paths:
#       - "artifacts/**"
  workflow_dispatch:

permissions:
  id-token: write
  contents: read

env:
  ARM_TENANT_ID: ${{ vars.AZ_TENANT_ID }}
  LOCATION: 'norwayeast'
  ASSETS_STORAGE_ACCOUNT: 'imgfactoryassets'
  RESOURCE_GROUP_NAME: 'imagefactory-prod-rg'
  AZURE_STORAGE_CONTAINER: 'scripts'
  IMAGE_TEMPLATE_NAME: 'AVDImageTemplate-${{ github.run_number }}'

jobs:
  build-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: azure/login@v1
        name: Sign in to Azure
        with:
          client-id: ${{ vars.AZ_CLIENT_ID }}
          tenant-id: ${{ vars.AZ_TENANT_ID }}
          subscription-id: ${{ vars.AZ_SUBSCRIPTION_ID }}

      # Uploads all PS1 files and manifests to the scripts container in Assets storage account
      - name: Upload script artefacts
        run: |
          az storage blob upload-batch \
            --account-name ${{ env.ASSETS_STORAGE_ACCOUNT }} \
            --destination ${{ env.AZURE_STORAGE_CONTAINER }} \
            --source ./artifacts \
            --overwrite true
        env:
            AZURE_STORAGE_AUTH_MODE: login

      # Deploys Image Template
      - uses: azure/arm-deploy@v1
        id: deploy
        name: Deploy Image Template
        with:
          failOnStdErr: false
          deploymentName: ImageFactory-${{ github.run_number }}
          scope: resourcegroup
          resourceGroupName: ${{ env.RESOURCE_GROUP_NAME }}
          subscriptionId: ${{ vars.AZ_SUBSCRIPTION_ID }}
          template: ./artifacts/deploy.bicep
          parameters: ./artifacts/deploy.parameters.json name=${{ env.IMAGE_TEMPLATE_NAME }}

      # Write template id output
      - name: Get Image Template output
        run: |
          echo "Image Template ID: ${{ steps.deploy.outputs.imageTemplateId }}"

      # Trigger image builder to run, will not wait for process to complete
      - name: Trigger image build with invoke-action
        run: |
         az resource invoke-action \
           --action Run \
           --name ${{ env.IMAGE_TEMPLATE_NAME }} \
           --resource-type Microsoft.VirtualMachineImages/imageTemplates \
           --resource-group ${{ env.RESOURCE_GROUP_NAME }} \
           --no-wait

The main parts of the code are commented or self-explanatory. If you choose to use this repo as a template, please note that you will need to:

  • create several repository-level variables: AZ_TENANT_ID, AZ_SUBSCRIPTION_ID, AZ_CLIENT_ID.

  • create a workload identity, e.g., by following this Microsoft Learn module or any other guidance. I am using 'federated credentials' to authenticate to Azure, so we don't need any passwords or keys. The identity needs to be "linked" with your repo / branch and assigned an RBAC role, which you can scope to the resource group referenced in the RESOURCE_GROUP_NAME variable.

  • update the LOCATION, ASSETS_STORAGE_ACCOUNT, RESOURCE_GROUP_NAME, and AZURE_STORAGE_CONTAINER environment variables to match your environment, especially the storage account name, which must be globally unique.
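
As a rough illustration of the "linking" step, here is a minimal Az CLI sketch. It assumes the app registration and its service principal already exist. The helper names, the credential name, and the two roles are my assumptions and not something the repo prescribes: Contributor for deploying the template, Storage Blob Data Contributor for the upload step that uses login-based auth. Adjust them to your own least-privilege model.

```shell
# Hypothetical helper: builds the federated-credential subject for a GitHub repo and branch.
# $1 = "owner/repo", $2 = branch name
github_subject() {
  printf 'repo:%s:ref:refs/heads/%s' "$1" "$2"
}

# Hypothetical helper: links an existing app registration to the repo and assigns roles.
# $1 = app (client) ID, $2 = "owner/repo", $3 = branch, $4 = resource ID of the resource group
link_repo_identity() {
  subject=$(github_subject "$2" "$3")
  params=$(printf '{"name":"github-%s","issuer":"https://token.actions.githubusercontent.com","subject":"%s","audiences":["api://AzureADTokenExchange"]}' "$3" "$subject")

  # Trust tokens issued by GitHub Actions for this repo and branch
  az ad app federated-credential create --id "$1" --parameters "$params" || return 1

  # Role names are assumptions; scope is limited to the given resource group
  for role in "Contributor" "Storage Blob Data Contributor"; do
    az role assignment create --assignee "$1" --role "$role" --scope "$4" || return 1
  done
}
```

You would call it along the lines of link_repo_identity "<client-id>" "<owner>/<repo>" main "<resource-group-id>", with the placeholders replaced by your own values.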

Bicep template

Every time we update our "assets" (introduce new apps, make changes in existing ones, improve or debug the scripts), we need to deploy a new Image Template to Azure (aka The recipe).

There is a simple Bicep template in the repo. The template is a slightly modified version from the AVD Landing Zone Accelerator. A great tool to spin up a new AVD environment fast, by the way.

Note: this Accelerator contains its own 'Custom image build' component but that one focuses on image optimization for AVD, not so much on 'bring your own apps' scenario.

@description('Optional. Location for all resources.')
param location string = resourceGroup().location

@description('Required. Name of the Image Template to be built by the Azure Image Builder service.')
param name string

@description('Generated. Do not provide a value! This date value is used to generate a unique image template name.')
param baseTime string = utcNow('yyyy-MM-dd-HH-mm-ss')

@description('Optional. Tags of the resource.')
param tags object = {}

@description('Optional. Size of the VM to be used to build the image.')
param vmSize string = 'Standard_B4s_v2'

@description('Optional. Image build timeout in minutes. Allowed values: 0-960. 0 means the default 240 minutes.')
@minValue(0)
@maxValue(960)
param buildTimeoutInMinutes int = 0

@description('Required. Name of the User Assigned Identity to be used to deploy Image Templates in Azure Image Builder.')
param userMsiName string

@description('Optional. Resource group of the user assigned identity.')
param userMsiResourceGroup string = resourceGroup().name

@description('Optional. Specifies the size of OS disk.')
param osDiskSizeGB int = 128

@description('''Optional. List of User-Assigned Identities associated to the Build VM for accessing Azure resources such as Key Vaults from your customizer scripts.
Be aware, the user assigned identity specified in the 'userMsiName' parameter must have the 'Managed Identity Operator' role assignment on all the user assigned identities
specified in this parameter for Azure Image Builder to be able to associate them to the build VM.
''')
param userAssignedIdentities array = []

@description('''
Optional. Resource ID of an already existing subnet, e.g.: /subscriptions/<subscriptionId>/resourceGroups/<resourceGroupName>/providers/Microsoft.Network/virtualNetworks/<vnetName>/subnets/<subnetName>.
If no value is provided, a new temporary VNET and subnet will be created in the staging resource group and will be deleted along with the remaining temporary resources.
''')
param subnetId string = ''

@description('Optional. Resource ID of Shared Image Gallery to distribute image to, e.g.: /subscriptions/<subscriptionID>/resourceGroups/<SIG resourcegroup>/providers/Microsoft.Compute/galleries/<SIG name>/images/<image definition>.')
param sigImageDefinitionId string = ''

@description('Optional. Version of the Shared Image Gallery Image. Supports the following Version Syntax: Major.Minor.Build (i.e., \'1.1.1\' or \'10.1.2\').')
param sigImageVersion string = ''

@allowed([
  'Standard_LRS'
  'Standard_ZRS'
])
@description('Optional. Storage account type to be used to store the image in the Azure Compute Gallery.')
param storageAccountType string = 'Standard_LRS'

@description('Optional. Exclude the created Azure Compute Gallery image version from the latest.')
param excludeFromLatest bool = false

@description('Optional. List of the regions the image produced by this solution should be stored in the Shared Image Gallery. When left empty, the deployment\'s location will be taken as a default value.')
param imageReplicationRegions array = []

@description('''Optional. Resource ID of the staging resource group in the same subscription and location as the image template that will be used to build the image.
If this field is empty, a resource group with a random name will be created.
If the resource group specified in this field doesn't exist, it will be created with the same name.
If the resource group specified exists, it must be empty and in the same region as the image template.
The resource group created will be deleted during template deletion if this field is empty or the resource group specified doesn't exist,
but if the resource group specified exists the resources created in the resource group will be deleted during template deletion and the resource group itself will remain.
''')
param stagingResourceGroup string = ''

var imageSource = {
  type: 'PlatformImage'
  publisher: 'microsoftwindowsdesktop'
  offer: 'windows-11'
  sku: 'win11-23h2-avd'
  version: 'latest'
}

var imageReplicationRegionsVar = empty(imageReplicationRegions) ? array(location) : imageReplicationRegions
var distribute = empty(sigImageDefinitionId) ? [] : array(sharedImage)

var sharedImage = {
  type: 'SharedImage'
  galleryImageId: empty(sigImageVersion) ? sigImageDefinitionId : '${sigImageDefinitionId}/versions/${sigImageVersion}'
  excludeFromLatest: excludeFromLatest
  replicationRegions: imageReplicationRegionsVar
  storageAccountType: storageAccountType
  runOutputName: !empty(sigImageDefinitionId) ? '${last(split(sigImageDefinitionId, '/'))}-SharedImage' : 'SharedImage'
  artifactTags: {
    sourceType: imageSource.type
    sourcePublisher: contains(imageSource, 'publisher') ? imageSource.publisher : null
    sourceOffer: contains(imageSource, 'offer') ? imageSource.offer : null
    sourceSku: contains(imageSource, 'sku') ? imageSource.sku : null
    sourceVersion: contains(imageSource, 'version') ? imageSource.version : null
    creationTime: baseTime
  }
}

var vnetConfig = {
  subnetId: subnetId
}

// Image Template
resource imageTemplate 'Microsoft.VirtualMachineImages/imageTemplates@2022-07-01' = {
  name: name
  location: location
  tags: tags
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      '${az.resourceId(userMsiResourceGroup, 'Microsoft.ManagedIdentity/userAssignedIdentities', userMsiName)}': {}
    }
  }
  properties: {
    buildTimeoutInMinutes: buildTimeoutInMinutes
    vmProfile: {
      vmSize: vmSize
      osDiskSizeGB: osDiskSizeGB
      userAssignedIdentities: userAssignedIdentities
      vnetConfig: !empty(subnetId) ? vnetConfig : null
    }
    source: imageSource
    customize: loadJsonContent('avdCustomizationSteps.json')
    distribute: distribute
    stagingResourceGroup: stagingResourceGroup
  }
}

output imageTemplateId string = imageTemplate.id

The template is coupled with two important files: deploy.parameters.json with parameter values and avdCustomizationSteps.json.

Customization steps

This file contains the customize part of the AIB Image Template.

💡
I thought it would be more practical to keep it separate, so the engineer can make edits in this file without touching the Bicep template.

The first part again comes from the AVD Accelerator. It removes apps from the image that are not so useful in a work environment (like various Xbox modern apps), changes the OS configuration, and applies other optimizations, so the image runs faster without any extra clutter.

The second part - all the blocks with imageFactory prefix - is where the magic happens:

  • we first download (and install) some essential tools (like AzCopy and the main installation script)

  • we download all application manifests - manifest-appname.json - from the Assets storage account

  • Each application we want to have in the image is represented by a separate block, e.g.,:

    {
        "name": "imageFactory_install_Firefox",
        "type": "PowerShell",
        "inline": [
            "C:\\ImageBuilder\\Install-Application.ps1 -Manifest C:\\ImageBuilder\\manifest-firefox.json"
        ],
        "runAsSystem": true,
        "runElevated": true
    },
  • It instructs AIB to run the Install-Application.ps1 script with a specific manifest as an input. My first version of the factory featured separate scripts for each app, but I soon realized there was a lot of repetition and it was tedious to make changes in all the script files. Having one universal installation script that can cater for several package types (msi, exe) was a better approach for me.

  • When the Bicep template is transpiled to JSON ARM, a simple (yet super useful) function called loadJsonContent injects the JSON file into it (and fixes escape characters, etc.). You can see it on line 122 in the template.
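
Because every app needs both a manifest and a matching customization block, it is easy to add one and forget the other. The following is a minimal sketch of a check that could run in the workflow before the deployment. The function name is hypothetical, and it assumes the manifests sit in the same artifacts folder as avdCustomizationSteps.json and follow the manifest-appname.json naming.

```shell
# Hypothetical helper: reports manifests that no customization step references.
# $1 = directory holding manifest-*.json files, $2 = path to avdCustomizationSteps.json
find_unreferenced_manifests() {
  missing=0
  for manifest in "$1"/manifest-*.json; do
    [ -e "$manifest" ] || continue   # the pattern matched no files
    name=$(basename "$manifest")
    # Fixed-string search for the manifest file name inside the steps file
    if ! grep -qF "$name" "$2"; then
      echo "Not referenced in customization steps: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, find_unreferenced_manifests ./artifacts ./artifacts/avdCustomizationSteps.json returns a non-zero exit code when at least one manifest has no block, which would fail the workflow step early instead of producing an image with a missing app.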

Installation script

This PowerShell script takes an 'application manifest' as an input, parses its content, determines the package type, authenticates AzCopy to Azure using the User-Assigned Identity, downloads the package, runs the installation, and finally validates the installation:

<#
.SYNOPSIS
    Installs Application
.DESCRIPTION
    Installs an application for Azure Image Builder
.PARAMETER ManifestFile
    The path to the manifest file for the application
.NOTES
    This script is a customizer for Azure Image Builder service.
.LINK
    https://aka.ms/installazurecliwindowsx64
.EXAMPLE
    Install-Application -ManifestFile 'manifest-app.json' -Verbose
#>

[CmdletBinding()]
param(
    [string]$ManifestFile
)

# Read the manifest file
try {
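    # -ErrorAction Stop turns a missing or unreadable file into a terminating error, so the catch block runs
    $Manifest = Get-Content $ManifestFile -Raw -ErrorAction Stop | ConvertFrom-Json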
}
catch {
    $ErrorMessage = $_.Exception.message
    Write-Error "Error reading the manifest file - $ErrorMessage"
}

# Common variables
[string]$UAMI = $Manifest.uami
[string]$PackagePath = $Manifest.packagePath
[string]$PackageLocalPath = $Manifest.packageLocalPath
[string]$AppName = $Manifest.appName
[string]$InstallationPath = $Manifest.installationPath
[string]$AppType = $Manifest.appType
[string]$Arguments = $Manifest.arguments
[bool]$Zipped = $Manifest.zipped
[string]$ArchiveLocalPath = $Manifest.archiveLocalPath

#region Common functions
$logFile = "C:\ImageBuilder\" + (Get-Date -Format 'yyyyMMdd') + '_image-build.log'
function Write-Log {
    [CmdletBinding()]
    param(
        [string]$message
    )
    process {
        Write-Output "$(Get-Date -Format 'yyyyMMdd HH:mm:ss') $message" | Out-File -Encoding utf8 $logFile -Append
    }
}
#endregion

#region Sign-in to Azure with AzCopy using UAMI
try {
    Start-Process -FilePath 'C:\ImageBuilder\azcopy.exe' -Wait -ErrorAction Stop -ArgumentList 'login', '--identity', '--identity-resource-id', $UAMI
}
catch {
    $ErrorMessage = $_.Exception.message
    Write-Log "Error signing in using UAMI - $ErrorMessage"
}
#endregion

#region Download and Extract app archive
if ($Zipped) {
    try {
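        # AZCopy to download the archive file and extract to the ImageBuilder directory
        C:\ImageBuilder\azcopy.exe copy $PackagePath $ArchiveLocalPath
        # Native commands do not throw on failure, so surface a non-zero exit code to the catch block
        if ($LASTEXITCODE -ne 0) { throw "azcopy exited with code $LASTEXITCODE" }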
        Write-Log "$AppName downloaded"
        Expand-Archive $ArchiveLocalPath 'c:\ImageBuilder'
        Write-Log "$AppName extracted"
    }
    catch {
        $ErrorMessage = $_.Exception.message
        Write-Log "Error extracting $AppName - $ErrorMessage"
    }
}
else {
    try {
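        # AZCopy to download the package file and save to the ImageBuilder directory
        C:\ImageBuilder\azcopy.exe copy $PackagePath $PackageLocalPath
        # Native commands do not throw on failure, so surface a non-zero exit code to the catch block
        if ($LASTEXITCODE -ne 0) { throw "azcopy exited with code $LASTEXITCODE" }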
        Write-Log "$AppName downloaded"
    }
    catch {
        $ErrorMessage = $_.Exception.message
        Write-Log "Error downloading $AppName - $ErrorMessage"
    }
}
#endregion

#region Install app
$ProgressPreference = 'SilentlyContinue'
if ($AppType -eq 'msi') {
    try {
        Start-Process -FilePath msiexec.exe -Wait -ErrorAction Stop -ArgumentList $Arguments
    }
    catch {
        $ErrorMessage = $_.Exception.message
        Write-Log "Error installing $AppName - $ErrorMessage"
    }
}
elseif ($AppType -eq 'appx') {
    try {
        Add-AppxPackage -AppInstallerFile $PackageLocalPath
    }
    catch {
        $ErrorMessage = $_.Exception.message
        Write-Log "Error installing $AppName - $ErrorMessage"
    }
}
elseif ($AppType -eq 'exe') {
    try {
        Start-Process -FilePath $PackageLocalPath -Wait -ErrorAction Stop -ArgumentList $Arguments
    }
    catch {
        $ErrorMessage = $_.Exception.message
        Write-Log "Error installing $AppName - $ErrorMessage"
    }
}
else {
    Write-Log "Error installing $AppName - Unsupported app type"
}
#endregion

#region Validate the installation path
try {
    if (Test-Path $InstallationPath) {
        Write-Log "$AppName installed"
    }
    else {
        Write-Log "Error locating the $AppName executable"
    }
}
catch {
    $ErrorMessage = $_.Exception.message
    Write-Log "Error installing $AppName - $ErrorMessage"
}
$ProgressPreference = 'Continue'
#endregion

Application manifest

You can see from this example, how the manifest is structured:

{
    "uami": "/subscriptions/xxxxx-xxxx-xxxxxx-xxxxx-xxxxx/resourcegroups/imagefactory-prod-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/imagefactory-prod-uai",
    "packagePath": "https://imgfactoryassets.blob.core.windows.net/archives/Firefox120.0.zip",
    "packageLocalPath": "C:\\ImageBuilder\\Firefox Setup 120.0.msi",
    "zipped": true,
    "archiveLocalPath": "C:\\ImageBuilder\\Firefox120.0.zip",
    "appName": "Firefox",
    "installationPath": "C:\\Program Files\\Mozilla Firefox\\firefox.exe",
    "appType": "msi",
    "arguments": "/I \"C:\\ImageBuilder\\Firefox Setup 120.0.msi\" DESKTOP_SHORTCUT=true INSTALL_MAINTENANCE_SERVICE=false /quiet"
}

If you want to get it working, you need to:

  • update the uami value with the User-Assigned Identity you created in imagefactory-prod-rg Resource Group (or any other name you chose based on your naming convention)

  • update the packagePath URL with your storage account name and container name

  • upload the installation package for a given app in that container

Note: some packages from vendors contain spaces in the name. I chose to ZIP them before uploading, and I have an extra step in the script to unzip them before the installation.
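
If you prepare the packages from a Linux, macOS, or WSL shell, the zip-and-upload step could look roughly like this. It is a sketch under assumptions: the function name is made up, the zip utility is installed, and the signed-in identity has a blob data role on the storage account. Azure Storage Explorer or AzCopy work just as well.

```shell
# Hypothetical helper: zips one installer and uploads the archive to the assets account.
# $1 = installer file, $2 = archive path, $3 = storage account, $4 = container
publish_package() {
  # -j stores only the file name, so the archive extracts flat into the target folder
  zip -j "$2" "$1" || return 1

  az storage blob upload \
    --account-name "$3" \
    --container-name "$4" \
    --name "$(basename "$2")" \
    --file "$2" \
    --auth-mode login \
    --overwrite true
}
```

With the example manifest above, the call would be publish_package "Firefox Setup 120.0.msi" Firefox120.0.zip imgfactoryassets archives, which matches the packagePath value.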

Before you start baking

There are a few important steps you need to complete before you start "baking" new images. This automation expects several resources to be present in your Azure subscription:

  • Azure Compute Gallery (Microsoft.Compute/galleries)

  • User-Assigned Identity (Microsoft.ManagedIdentity/userAssignedIdentities) that has been granted the required RBAC permissions

  • VM Image Definition (Microsoft.Compute/galleries/images) that serves as an "envelope" for image versions produced by AIB.

  • Storage Account (Microsoft.Storage/storageAccounts) for your assets

I will try to update the repo with a Bicep template or a script that can provision all these prerequisites later.
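
Until then, the following Az CLI sketch shows one way the four resources could be created by hand. Everything in it is an assumption made for illustration: the function name, the publisher / offer / SKU values of the image definition, and the storage SKU. It also leaves out the RBAC assignments for the identity, the blob containers, and any security features (such as Trusted Launch) that your source image may require on the image definition.

```shell
# Hypothetical helper: creates the prerequisites in one existing resource group.
# $1 = resource group, $2 = location, $3 = gallery name, $4 = image definition name,
# $5 = user-assigned identity name, $6 = storage account name
create_prerequisites() {
  # Azure Compute Gallery
  az sig create --resource-group "$1" --gallery-name "$3" --location "$2" &&

  # VM Image Definition; publisher, offer, and sku are placeholder values
  az sig image-definition create \
    --resource-group "$1" --gallery-name "$3" \
    --gallery-image-definition "$4" \
    --publisher MyCompany --offer avd --sku win11-23h2-apps \
    --os-type Windows --os-state Generalized --hyper-v-generation V2 &&

  # User-Assigned Identity used by AIB and by the build VM
  az identity create --resource-group "$1" --name "$5" --location "$2" &&

  # Storage account for scripts, manifests, and installation packages
  az storage account create --resource-group "$1" --name "$6" --location "$2" --sku Standard_LRS
}
```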

A few gotchas in the end

While building and testing this automation, I learned a few important lessons:

  • The image build can take a long time - it depends on several factors like the number of apps, the size of the Build VM, the size of installation packages, etc. For those reasons I chose to:

    • use a solid VM SKU to give AIB enough 'horse power' to speed it up

    • use the --no-wait parameter in the last step of the workflow to avoid any timeouts. The GH workflow will complete while the image build is still running. You can check the status in several ways, e.g., by using the Portal to go to the Image Template (created by the workflow) and watching the 'Build run state'

  • Every time the GH workflow runs, it needs to deploy a new Image Template because there is no way to update an existing one. I use a simple trick: a variable that includes the dynamic run_number - IMAGE_TEMPLATE_NAME: 'AVDImageTemplate-${{ github.run_number }}'

  • There are two networking options in AIB:

    • one where AIB creates a temporary VNet, a Public IP address, and NSG (restricting inbound access from a single AIB address).

    • Bring your own VNet - this one is a bit more complex, as it leverages Private Link and an extra Proxy VM with an internal Load Balancer, but it gives you more control in environments where Public IPs are a 'no go'

  • Staging environments for AIB can be either 'static' (you pre-create a separate Resource Group and point to it in the Bicep template) or 'dynamic' (AIB creates a new RG with an 'IT_' prefix). The only object that stays there after the build is finished is a storage account that contains the logs from Packer. This is very useful when you need to troubleshoot why your builds are failing. On the other hand, it leaves a bit of "mess" in your subscription, so having a runbook or a function that cleans it up would be a good idea.
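
Since the workflow finishes before the build does, the build state can also be checked from a terminal instead of the Portal. Here is a minimal sketch that uses the same generic az resource commands as the workflow. The function names are hypothetical, and the query path assumes the run state is exposed under properties.lastRunStatus.runState of the Image Template resource.

```shell
# Hypothetical helper: prints the last run state of an image template.
# $1 = image template name, $2 = resource group
build_state() {
  az resource show \
    --name "$1" \
    --resource-group "$2" \
    --resource-type Microsoft.VirtualMachineImages/imageTemplates \
    --query properties.lastRunStatus.runState \
    --output tsv
}

# Hypothetical helper: polls until the build reaches a final state; returns 0 only on success.
# $3 = seconds between polls (defaults to 60)
wait_for_build() {
  while :; do
    state=$(build_state "$1" "$2") || return 1
    echo "Build run state: $state"
    case "$state" in
      Succeeded) return 0 ;;
      Failed|Canceled|PartiallySucceeded) return 1 ;;
    esac
    # Still running (or not started yet), so wait before asking again
    sleep "${3:-60}"
  done
}
```

A call such as wait_for_build AVDImageTemplate-42 imagefactory-prod-rg 120 would poll every two minutes; the template name here is only an example of what the run_number naming produces.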

Bonus: MSIX App Attach v2

There is a new video from Dean Cefola (published very recently) about the new experience for AVD with MSIX App Attach. I must admit, it sounds very interesting, despite rumours about MSIX being dead, so I can't wait to try it out...
