2020-02-24

#docker

Paul Cowan


Feb 24, 2020 ⋅ 9 min read

Real-world Azure resource management with Terraform and Docker

Paul Cowan Contract software developer.


Before I start, I would like to thank Iain Hunter for some valuable tips on real-world Terraform.

If you are using one of the major cloud providers to host your applications and you are logging into a web portal and creating critical infrastructure by clicking buttons, then you are making a very costly mistake. Every single infrastructure item should be created from an executable code file that goes through a pull request process and gets committed into a versioned source control system such as git. Terraform takes a code-first approach to creating infrastructure resources.

Most posts that I have read about Terraform do not cover how I would use it in a real-world scenario. A lot of the posts miss some essential steps, like storing the Terraform state remotely, and do not mention Terraform modules. I would welcome any further recommendations that I am missing in the comments section at the end of the post.


Why Terraform?

Why would you use Terraform and not Chef, Puppet, Ansible, SaltStack, or CloudFormation? Terraform is excellent for managing cloud resources, whereas tools like Ansible are aimed more at provisioning software and machines. The reason I feel more at home with Terraform is that you define your infrastructure with code rather than endless YAML configuration files. You can create reusable, parameterized modules, as I am used to doing in other languages.

Do not store Terraform state on the local file system

Terraform must store state about your managed infrastructure and configuration. This state is used by Terraform to map real-world resources to your configuration, keep track of metadata, and improve performance for large infrastructures. Terraform state includes the settings for all of the resources in the configuration. By default, the Terraform state is stored on the local file system in a file named terraform.tfstate. Hardly any blog post I have read mentions the correct way to persist the Terraform state. A local state file cannot be shared between team members or build agents, and it can contain secrets in plain text, so Terraform state should be stored remotely.

Store Terraform state in Azure Blob storage

You can store the state in Terraform Cloud, which is a paid-for service, or in something like AWS S3.

In this example, I am going to persist the state to Azure Blob storage.

Our first step is to create the Azure resources to facilitate this. I am going to need to create the following resources in Azure:

Azure resource group – a container that holds related resources for an Azure solution
Azure storage account – contains all of your Azure storage data resources
Azure Blob storage container – organizes a set of blobs, similar to a directory in a file system
Azure key vault store – where we will store all the secrets that we don’t want hardcoded in our scripts and checked into source control
Azure service principal – an identity created for use with applications, hosted services, and automated tools to access Azure resources

We are going to create these initial resources using the Azure CLI tools. I know, I know we should be using Terraform. More on this later.

Terraform workspaces

In a real-world scenario, the artifacts get created in specific environments like dev, staging, production, etc. Terraform has the concept of workspaces to help with this. By default, Terraform starts with a default workspace but we will create all our infrastructure items under a dev workspace.

Terraform stores the state for each workspace in a separate state file in the remote storage:

env:/
    dev/
       state.tfs
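As a minimal sketch of how a script might switch to a workspace and create it on first use, the helper below wraps the two terraform workspace commands. The function name use_workspace and the list of allowed names are assumptions based on the environments mentioned in this post, not something Terraform enforces.

```shell
#!/bin/bash
# Hypothetical helper: select a Terraform workspace, creating it on first use.
# The allowed names are an assumption taken from the environments in this post.
use_workspace() {
  local workspace="$1"
  case "$workspace" in
    dev|staging|production) ;;
    *) echo "unknown workspace: $workspace" >&2; return 1 ;;
  esac
  # 'workspace select' fails when the workspace does not exist yet,
  # so fall back to 'workspace new', which creates and selects it.
  terraform workspace select "$workspace" || terraform workspace new "$workspace"
}
```

Calling use_workspace dev before terraform apply keeps the dev state separate from the default workspace.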

Create a storage account

The script below will create a resource group, a storage account, and a storage container.

#!/bin/bash
# $1 is the environment or terraform workspace, dev in this example
if [[ $# -eq 0 ]] ; then
    echo 'you must pass in an environment of dev,staging or production'
    exit 1
fi

RESOURCE_GROUP_NAME=tstate
STORAGE_ACCOUNT_NAME="tstate$RANDOM$1"
CONTAINER_NAME="tstate$1"

# Create resource group
az group create --name "$RESOURCE_GROUP_NAME" --location eastus

# Create storage account
az storage account create --resource-group "$RESOURCE_GROUP_NAME" --name "$STORAGE_ACCOUNT_NAME" --sku Standard_LRS --encryption-services blob

# Get storage account key (the query is quoted so the shell does not treat [0] as a glob)
ACCOUNT_KEY=$(az storage account keys list --resource-group "$RESOURCE_GROUP_NAME" --account-name "$STORAGE_ACCOUNT_NAME" --query "[0].value" -o tsv)

# Create blob container
az storage container create --name "$CONTAINER_NAME" --account-name "$STORAGE_ACCOUNT_NAME" --account-key "$ACCOUNT_KEY"

echo "storage_account_name: $STORAGE_ACCOUNT_NAME"
echo "container_name: $CONTAINER_NAME"
echo "access_key: $ACCOUNT_KEY"

When run with dev as the argument, this will echo something like the following to STDOUT:

storage_account_name: tstate6298dev
container_name: tstatedev
access_key: wp9AZRTfXPgZ6aKkP94/hTqj/rh9Tsdj8gjlng9mtRSoKm/cpPDR8vNzZExoE/xCSko3yzhcwq+8hj1hsPhlRg==
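Azure storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits, so a generated name such as tstate plus a random number plus the environment is worth checking before it is handed to az. The check below is a small sketch; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-flight check for a generated storage account name.
# Rule being checked: 3-24 characters, lowercase letters and digits only.
is_valid_storage_account_name() {
  local name="$1"
  local length=${#name}
  # Force the C locale so the a-z range below means ASCII lowercase only.
  local LC_ALL=C
  [ "$length" -ge 3 ] && [ "$length" -le 24 ] || return 1
  case "$name" in
    *[!a-z0-9]*) return 1 ;;
  esac
  return 0
}
```

In the script above it could be called as is_valid_storage_account_name "$STORAGE_ACCOUNT_NAME" || exit 1 before the storage account is created.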

An access_key is generated that allows access to the storage. As previously mentioned, we do not want to store sensitive secrets in source control, and instead, we are going to store them in an Azure key vault which can securely store and retrieve application secrets like the access_key.

Create a key vault store

The official advice from Microsoft is to create a key vault store per environment.
The script below creates the key vault store:

#!/bin/bash

if [[ $# -eq 0 ]] ; then
    echo 'you must pass in an environment of dev,staging or production'
    exit 1
fi

vault_name="my-key-vault-$1"

# tstate is the resource group created by the storage account script above
az keyvault create --name "$vault_name" --resource-group "tstate" --location germanywestcentral

We will now store the access_key, storage account name and storage container name in the key vault store:

az keyvault secret set --vault-name "my-key-vault-dev" --name "terraform-backend-key" --value "wp9AZRTfXPgZ6aKkP94/hTqj/rh9Tsdj8gjlng9mtRSoKm/cpPDR8vNzZExoE/xCSko3yzhcwq+8hj1hsPhlRg=="
az keyvault secret set --vault-name "my-key-vault-dev" --name "state-storage-account-name" --value "tstate6298dev"
az keyvault secret set --vault-name "my-key-vault-dev" --name "state-storage-container-name" --value "tstatedev"

I also store the Azure subscription ID in the key vault store for easier access:

az keyvault secret set --vault-name "my-key-vault-dev" --name "my-subscription-id" --value "79c15383-4cfc-49my-a234-d1394814ce95"
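Pasting secret values into individual commands is easy to get wrong, so the calls above could be wrapped in a small function that is fed from script variables instead. This is a sketch only: store_secret is a hypothetical name, and --output none is the standard Azure CLI output option used here so the stored value is not printed back to the terminal.

```shell
#!/bin/bash
# Hypothetical wrapper around 'az keyvault secret set'.
# --output none stops the CLI from echoing the stored secret.
store_secret() {
  local vault="$1" name="$2" value="$3"
  if [ -z "$value" ]; then
    echo "refusing to store an empty value for $name" >&2
    return 1
  fi
  az keyvault secret set --vault-name "$vault" --name "$name" --value "$value" --output none
}

# Example, using the variables from the storage account script:
#   store_secret "my-key-vault-$1" terraform-backend-key "$ACCOUNT_KEY"
#   store_secret "my-key-vault-$1" state-storage-account-name "$STORAGE_ACCOUNT_NAME"
```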

Create the service principal

The next step is to create the service principal account that Terraform will use, with the permissions it needs, to manage the application’s infrastructure.

SUBSCRIPTIONID=$(az keyvault secret show --name my-subscription-id --vault-name my-key-vault-dev --query value -o tsv)
az ad sp create-for-rbac --role contributor --scopes "/subscriptions/$SUBSCRIPTIONID" --name http://myterraform --sdk-auth

The above script will output something like the following:

{
  "clientId": "fd0e2604-c5a2-46e2-93d1-c0d77a8eca65",
  "clientSecret": "d997c921-5cde-40c8-99db-c71d4a380176",
  "subscriptionId": "79c15383-4cfc-49my-a234-d1394814ce95",
  "tenantId": "a567135e-3479-41fd-8acf-a606c8383061",
  "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
  "resourceManagerEndpointUrl": "https://management.azure.com/",
  "activeDirectoryGraphResourceId": "https://graph.windows.net/",
  "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
  "galleryEndpointUrl": "https://gallery.azure.com/",
  "managementEndpointUrl": "https://management.core.windows.net/"
}

This is the only time you will see the clientSecret, so store it in the Azure key vault store straight away. If you lose it, the only way to get a clientSecret again is to regenerate it. The clientId and clientSecret from the output above are stored like this:

az keyvault secret set --vault-name "my-key-vault-dev" --name "sp-client-id" --value "fd0e2604-c5a2-46e2-93d1-c0d77a8eca65"
az keyvault secret set --vault-name "my-key-vault-dev" --name "sp-client-secret" --value "d997c921-5cde-40c8-99db-c71d4a380176"

NOTE: An even more secure way of doing this is to use a client certificate.
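Because the clientSecret is only shown once, it can help to pipe the service principal output straight into the key vault rather than copying it by hand. The sketch below assumes the flat, pretty-printed JSON shown above; json_field and store_sp_credentials are hypothetical helpers, and the sed expression is a naive extractor, not a general JSON parser (a tool such as jq is more robust if it is available).

```shell
#!/bin/bash
# Naive field extractor for flat, pretty-printed JSON with one field per line.
json_field() {
  # $1 = field name, JSON is read from stdin
  sed -n "s/.*\"$1\": *\"\([^\"]*\)\".*/\1/p"
}

# Hypothetical helper: read the service principal JSON from stdin and store
# clientId and clientSecret in the given key vault without echoing them.
store_sp_credentials() {
  local vault="$1" sp_json client_id client_secret
  sp_json=$(cat)
  client_id=$(printf '%s\n' "$sp_json" | json_field clientId)
  client_secret=$(printf '%s\n' "$sp_json" | json_field clientSecret)
  if [ -z "$client_id" ] || [ -z "$client_secret" ]; then
    echo "could not find clientId/clientSecret in the input" >&2
    return 1
  fi
  az keyvault secret set --vault-name "$vault" --name sp-client-id --value "$client_id" --output none &&
  az keyvault secret set --vault-name "$vault" --name sp-client-secret --value "$client_secret" --output none
}
```

It would be used by piping the az ad sp create-for-rbac command above into store_sp_credentials my-key-vault-dev.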

Run Terraform through Docker

We are going to run Terraform through Docker. The first question that you should be asking is why?

Here are just a few reasons why you should be running Terraform through Docker:

The Terraform scripts should be treated as application code and should have things like a predictable OS
Encapsulate all requirements in a single image
Build once, run everywhere
If we use a container image repository, then we can version the images
Ability to deploy to different environments by parameterizing values with things like environment variables that are contextual at runtime
Consistent deployment experience when more than one developer is working on the same project

Terraform Dockerfile

The following Dockerfile will install both Terraform and the Azure CLI tools:

FROM ubuntu:19.04

ENV TERRAFORM_VERSION 0.12.19
ENV TERRAFORM_URL https://releases.hashicorp.com/terraform/$TERRAFORM_VERSION/terraform_${TERRAFORM_VERSION}_linux_amd64.zip
ENV AZURE_CLI_VERSION 2.0.77

RUN apt-get update && apt-get install -y \
    curl \
    python3-pip \
    unzip \
    zip

RUN echo 'alias python=python3' >> ~/.bashrc
RUN echo 'alias pip=pip3' >> ~/.bashrc
RUN pip3 install --upgrade pip

RUN curl -o /root/terraform.zip $TERRAFORM_URL && \
   unzip /root/terraform.zip -d /usr/local/bin/ && \
   rm /root/terraform.zip

RUN pip3 install azure-cli==${AZURE_CLI_VERSION}


WORKDIR /workspace

RUN chmod -R  +x .

ENTRYPOINT [ "./ops/help.sh", "-h" ]
CMD ["bash"]

The Dockerfile above will install both Terraform and the Azure CLI at specific versions. I also like my Docker images to have a help menu as the entry point that explains what the image does.

The ./ops/help.sh file looks like this:

#!/bin/bash

if [ "$1" == "-h" ] ; then
    cat << EndOfMessage
Usage:
./run.sh [environment] [init|destroy]
e.g.
./run.sh dev init
./run.sh dev destroy
EndOfMessage
    exit 0
fi

Building the Terraform Docker image

The script below will build the image and tag it appropriately for the workspace:

#!/bin/bash

if [[ $# -eq 0 ]] ; then
    echo 'you must pass in an environment of dev,staging or production'
    exit 1
fi

version=$(cat ./terraform/version)
tag="my-azure:${version}-$1"

echo "Building images with default parameters"
docker image build \
  --rm \
  -f ./Dockerfile \
  -t "$tag" \
  --no-cache \
  .

The workspace is passed in as an argument when running ./build.sh:

./build.sh dev

Running the Terraform Docker image

Part of the reason for using Docker when running Terraform was to allow different environments or workspaces to be created from the same Dockerfile with different environment variables.

The run.sh script below will select the correct key vault store for this workspace. This script takes two arguments, the first one being the workspace and the second a command of init or destroy.

#!/bin/bash

if [[ $# -eq 0 ]] ; then
    echo 'you must pass in an environment of dev,staging or production and a command of init, destroy or -h'
    exit 1
fi

version=$(cat ./terraform/version)
# must match the tag produced by build.sh
tag="${version}-$1"

working_directory="${PWD}/terraform"

# the key vault store and secret names match the ones created earlier in this post
vault_name="my-key-vault-$1"
container_name="tf-azure-cli-$1"

case "$2" in
    ("init") command="./ops/init.sh" ;;
    ("destroy") command="./ops/teardown.sh" ;;
    (*) docker run \
          --rm \
          -v "$working_directory":/workspace:z \
          --name "$container_name" \
          -it "my-azure:${tag}"
        exit 0;;
esac

echo "about to run $command"

echo "setting environment variables for the $1 environment"

export subscription_id=$(az keyvault secret show --name my-subscription-id --vault-name "$vault_name" --query value -o tsv)
export state_storage_account_name=$(az keyvault secret show --name state-storage-account-name --vault-name "$vault_name" --query value -o tsv)
export state_storage_container_name=$(az keyvault secret show --name state-storage-container-name --vault-name "$vault_name" --query value -o tsv)
export access_key=$(az keyvault secret show --name terraform-backend-key --vault-name "$vault_name" --query value -o tsv)
export client_id=$(az keyvault secret show --name sp-client-id --vault-name "$vault_name" --query value -o tsv)
export client_secret=$(az keyvault secret show --name sp-client-secret --vault-name "$vault_name" --query value -o tsv)
export tenant_id=$(az account show --query tenantId -o tsv)

docker run \
  --rm \
  -v "$working_directory":/workspace:z \
  -e resource_group="tstate" \
  -e subscription_id="${subscription_id}"  \
  -e state_storage_account_name="${state_storage_account_name}" \
  -e state_storage_container_name="${state_storage_container_name}" \
  -e access_key="${access_key}" \
  -e client_id="${client_id}" \
  -e client_secret="${client_secret}" \
  -e tenant_id="${tenant_id}" \
  -e workspace="$1" \
  --name "$container_name" \
  --entrypoint "$command" \
  -it "my-azure:${tag}"

Environment variables are assigned from values in the Azure key vault store and subsequently made available in the Docker container through the -e switch when calling docker run.
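The repeated az keyvault secret show calls can be folded into one helper that also fails loudly when a secret is missing, instead of silently passing an empty value into the container. This is a sketch under that assumption; get_secret is a hypothetical name, and the az arguments are the same ones used in run.sh.

```shell
#!/bin/bash
# Hypothetical helper: read one secret from a key vault and fail if the
# lookup fails or the value comes back empty.
get_secret() {
  local vault="$1" name="$2" value
  # Assign separately from 'local' so the exit status of az is not masked.
  value=$(az keyvault secret show --name "$name" --vault-name "$vault" --query value -o tsv) || return 1
  if [ -z "$value" ]; then
    echo "secret '$name' in vault '$vault' is empty" >&2
    return 1
  fi
  printf '%s\n' "$value"
}

# Example:
#   client_id=$(get_secret "$vault_name" sp-client-id) || exit 1
```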


A host volume is also mapped to our local Terraform files and scripts, so the container picks up changes instantly, which removes the need to rebuild the image after every change.

The run.sh script is executed per workspace, and the second argument of init or destroy eventually delegates to terraform init or terraform destroy.

# run.sh takes a workspace argument and a command
./run.sh dev init

The result is a call to docker run. The --entrypoint switch is used to delegate to either an init.sh script or a teardown.sh script. Below is the init.sh script that will create the Azure infrastructure:

#!/bin/bash

az login --service-principal -u $client_id -p $client_secret --tenant $tenant_id

export TF_VAR_client_id=$client_id
export TF_VAR_client_secret=$client_secret
export ARM_CLIENT_ID=$client_id
export ARM_CLIENT_SECRET=$client_secret
export ARM_ACCESS_KEY=$access_key
export ARM_SUBSCRIPTION_ID=$subscription_id
export ARM_TENANT_ID=$tenant_id
export TF_VAR_subscription_id=$subscription_id


terraform init \
    -backend-config="storage_account_name=${state_storage_account_name}" \
    -backend-config="container_name=${state_storage_container_name}" \
    -backend-config="access_key=${access_key}" \
    -backend-config="key=my.tfstate.$workspace"

terraform workspace select $workspace || terraform workspace new $workspace

terraform apply --auto-approve

In this script, the environment variables that are needed for the Terraform scripts are assigned.
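Since every value arrives through docker run -e, a missing one only shows up later as a confusing Terraform or az error. A small check at the top of init.sh can catch that early. This is a sketch; require_vars is a hypothetical helper, and it relies on the values being real environment variables, which is the case for anything passed with -e.

```shell
#!/bin/bash
# Hypothetical helper: report every named environment variable that is
# unset or empty, and return non-zero if any were missing.
require_vars() {
  local missing=0 name
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, before 'az login' in init.sh:
#   require_vars client_id client_secret tenant_id subscription_id access_key workspace || exit 1
```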

terraform init is called with the -backend-config switches instructing Terraform to store the state in the Azure Blob storage container that was created at the start of this post.
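For the -backend-config switches to have any effect, the Terraform configuration needs to declare an azurerm backend. The post does not show that block, so the following is an assumption about what it would look like: an empty backend block whose settings are filled in at init time. The helper name and the backend.tf file name are hypothetical; any .tf file in the root module would do.

```shell
#!/bin/bash
# Writes a minimal backend.tf into the given directory (default: current one).
# The backend block is left empty on purpose: storage_account_name,
# container_name, access_key and key are supplied by 'terraform init'
# through the -backend-config switches.
write_backend_stub() {
  local dir="${1:-.}"
  cat > "$dir/backend.tf" <<'EOF'
terraform {
  backend "azurerm" {}
}
EOF
}
```

Keeping the block empty means no storage account names or keys end up in source control.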

The current Terraform workspace is set before applying the configuration.

terraform apply --auto-approve does the actual work of creating the resources.

Terraform will then execute the main.tf file and behave as normal.

Destroy

The run.sh script can be called with a destroy command:

./run.sh dev destroy

The container will execute this teardown.sh script:

#!/bin/bash

echo "tearing the whole $workspace down"

az login --service-principal -u $client_id -p $client_secret --tenant $tenant_id

export TF_VAR_client_id=$client_id
export TF_VAR_client_secret=$client_secret
export ARM_CLIENT_ID=$client_id
export ARM_CLIENT_SECRET=$client_secret
export ARM_ACCESS_KEY=$access_key
export ARM_SUBSCRIPTION_ID=$subscription_id
export ARM_TENANT_ID=$tenant_id
export TF_VAR_subscription_id=$subscription_id  

terraform workspace select $workspace

terraform destroy --auto-approve

What goes up, can go down.
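Because teardown.sh runs terraform destroy with --auto-approve, nothing stands between a mistyped argument and a deleted environment. One possible safeguard, sketched below, is to require an explicit confirmation for production. The function name and the CONFIRM_DESTROY variable are invented for this example and are not part of Terraform.

```shell
#!/bin/bash
# Hypothetical guard: CONFIRM_DESTROY is an invented variable name.
# Non-production workspaces pass straight through; production must be
# confirmed by setting CONFIRM_DESTROY to the workspace name.
confirm_destroy() {
  local workspace="$1"
  if [ "$workspace" != "production" ]; then
    return 0
  fi
  if [ "${CONFIRM_DESTROY:-}" = "$workspace" ]; then
    return 0
  fi
  echo "refusing to destroy $workspace: set CONFIRM_DESTROY=$workspace to proceed" >&2
  return 1
}

# Example, near the top of teardown.sh:
#   confirm_destroy "$workspace" || exit 1
```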

Terraform modules

I don’t see enough mention of Terraform modules in most of the posts that I have read.

Terraform modules can accept parameters in the form of input variables, and they can return values, called output variables, that other Terraform modules can use.

The Terraform module below accepts two input variables, resource_group_name and resource_group_location, that are used to create the Azure resource group:

variable "resource_group_name" {
  type = string
}

variable "resource_group_location" {
  type = string
}

resource "azurerm_resource_group" "main" {
  name     = var.resource_group_name
  location = var.resource_group_location
}

output "eu_resource_group_name" {
  value = azurerm_resource_group.main.name
}

output "eu_resource_group_location" {
  value = azurerm_resource_group.main.location
}

The module also returns two output variables, eu_resource_group_name and eu_resource_group_location, that can be used in other Terraform scripts.

The above module is called like this:

module "eu_resource_group" {
  source                        = "./modules/resource_groups"

  resource_group_name           = "${var.resource_group_name}-${terraform.workspace}"
  resource_group_location       = var.location
}

The two input variables are assigned in the module block. String interpolation is used to add the current Terraform workspace name to the resource group name. All Azure resources will be created under this resource group.
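The module call reads var.resource_group_name and var.location, which have to be declared in the root configuration. The post does not show those declarations, so the sketch below is an assumption: it writes a variables.tf without defaults, which means the values would have to be supplied some other way, for example through TF_VAR_resource_group_name and TF_VAR_location environment variables exported in the same way init.sh exports TF_VAR_client_id. The helper name and file name are hypothetical.

```shell
#!/bin/bash
# Writes a variables.tf declaring the root-level inputs read by the module
# call above. The file name and the absence of defaults are assumptions.
write_root_variables() {
  local dir="${1:-.}"
  cat > "$dir/variables.tf" <<'EOF'
variable "resource_group_name" {
  type        = string
  description = "Base name; the workspace name is appended in the module call"
}

variable "location" {
  type        = string
  description = "Azure region for the resource group"
}
EOF
}
```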

The two output variables eu_resource_group_name and eu_resource_group_location can be used from other modules:

module "vault" {
  source                        = "./modules/vault"

  resource_group_name           = module.eu_resource_group.eu_resource_group_name
  resource_group_location       = module.eu_resource_group.eu_resource_group_location
}

Epilogue

I became frustrated when reading a lot of the Terraform posts that were just too basic to be used in a real, production-ready environment.

Even the Terraform docs don’t go into great detail about storing keys and secrets anywhere other than in the script files themselves, which is a big security mistake. Please do not use local Terraform state if you are using Terraform in a real-world scenario.

Terraform modules with input and output variables are far easier to reuse and maintain than one large script.

Executing Terraform in a Docker container is the right thing to do for exactly the same reasons as we put other application code in containers.
