Introduction to AWS CloudFormation

This is the first article in what I hope will be a series documenting my experience learning to use AWS CloudFormation. In this article we’ll get to know the basics of CloudFormation and set up a simple application stack. In future articles I hope to cover more advanced topics.

What is AWS CloudFormation?

CloudFormation allows you to specify all of the AWS resources your application uses in a single JSON file called a CloudFormation template.

Since the template is just a text file, it can be versioned. This means it’s easy to track changes over time, and it’s simple to roll back to a known working state. Furthermore, it’s easy to replicate an entire application setup (known as a “stack”) by using the same template to create another stack.

Unfortunately there’s a bit of a learning curve to understanding how it all fits together. It gets complicated really fast. Fortunately, AWS has great documentation when you want to get into the details of a particular element of a template, and there are lots of examples for particular use cases to be found via some basic Internet research. In teaching myself about CloudFormation, however, I found no articles that guided me from the most basic steps up to the more advanced topics. I hope this article can fill in that gap.

Disclaimer: I’m still learning this myself so if someone out there knows more about this than I do and has advice about what I’m showing here, do let me know by sending me an email (address at the bottom of this page).

High-Level View of a Template

A template is just a JSON object with some top-level properties.

The most significant of these properties are:

Parameters (optional): Specification of input values, for example the allowable source IP CIDR range for a security group defined in the Resources section. Parameters can be referenced in the Resources and Outputs sections. Think of them as input variables to your template.

Resources (required): Specification of the AWS resources to be created, for example an EC2 instance, a load balancer, or an S3 bucket. Resources can be referenced from other resources and from the Outputs section.

Outputs (optional): Values you choose to expose from the template, which are shown when you view the stack in the AWS console. For example, you could output the DNS name of an elastic load balancer.

Templates will also generally contain the following properties:

AWSTemplateFormatVersion (optional): Specifies the version string for this template. Currently the only valid value is “2010-09-09”.

Description (optional): Human readable description of the template.

Mappings (optional): Mapping of keys and values used to supply conditional values.

Conditions (optional): Conditions that control whether particular resources are created.

For more detailed information see Template Anatomy in the AWS documentation.
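To make the anatomy concrete, here is a minimal sketch of a complete template that touches the most common top-level properties. It is not part of the example we build below. The names BucketLabel, DemoBucket, and BucketName are made up for illustration, and I picked an S3 bucket only because it needs almost no configuration.

```json
{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Hypothetical minimal template showing the top-level sections.",
    "Parameters": {
        "BucketLabel": {
            "Type": "String",
            "Default": "demo",
            "Description": "Illustrative input value"
        }
    },
    "Resources": {
        "DemoBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "Tags": [{"Key": "Label", "Value": {"Ref": "BucketLabel"}}]
            }
        }
    },
    "Outputs": {
        "BucketName": {
            "Description": "Name that CloudFormation generated for the bucket",
            "Value": {"Ref": "DemoBucket"}
        }
    }
}
```

The parameter feeds into the resource, and the resource feeds into the output. That flow from Parameters to Resources to Outputs is the shape most templates follow.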

Our First Example

We’re going to spin up one of the t2.micro instances in a simple VPC configuration. Why the complication of VPC? It’s good practice for us to figure out how to provision not only the instances, but also the entire stack of resources that support the instances. One of the more frustrating aspects I encountered while trying to learn CloudFormation was figuring out how to apply examples found in documentation to a VPC setting. The documentation details everything you need to know to accomplish this, but rarely does everything come together to a tight focus as I hope to present in this article.

First the boilerplate:

{
    "AWSTemplateFormatVersion" : "2010-09-09",
    "Description": "This is our first CloudFormation example.",

Parameters

We have two parameters for this template. First, we need to tell EC2 which of our SSH keys to use for connecting to our instance. This is made much easier using the AWS::EC2::KeyPair::KeyName type. During stack creation we’re presented with a listing of our existing keys to select from. Second, rather than hard-coding the allowable source IP for connections for port 22, we provide the ability for the user to supply this information during stack creation.

    "Parameters": {
        "KeyName": {
            "Type": "AWS::EC2::KeyPair::KeyName",
            "Description": "Name of an existing EC2 KeyPair"
        },
        "SSHLocation" : {
            "Description" : "Allowable source location for SSH (default allows any IP)",
            "Type" : "String",
            "MinLength": "9",
            "MaxLength": "18",
            "Default" : "0.0.0.0/0",
            "AllowedPattern" : "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})",
            "ConstraintDescription" : "must be a valid CIDR range of the form x.x.x.x/x."
        }
    },

There are four general-purpose parameter data types and seven AWS-specific types:

General Purpose Parameter Types

AWS-Specific Parameter Types

The AWS documentation is a great resource on the details on each of these types. I’m just listing them here for reference.

You can reference these parameters within the resources section using the Ref intrinsic function. For example:

{"Ref": "KeyName"}
{"Ref": "SSHLocation"}

In addition to your own parameters, there are several so-called “pseudo parameters” which are defined by CloudFormation itself that you can reference just like your own parameters.

One of the particularly useful pseudo parameters is the AWS::Region parameter because it allows you to create templates that can be copied to any of the AWS regions. We’ll end up using this parameter in the example below to select the AMI ID for our instance based on our current region.
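As a small sketch of how pseudo parameters are referenced, the following Outputs fragment simply echoes two of them back. The output names StackRegion and StackName are hypothetical; AWS::Region and AWS::StackName are pseudo parameters supplied by CloudFormation.

```json
{
    "Outputs": {
        "StackRegion": {
            "Description": "Region in which this stack was created",
            "Value": {"Ref": "AWS::Region"}
        },
        "StackName": {
            "Description": "Name given to the stack at creation time",
            "Value": {"Ref": "AWS::StackName"}
        }
    }
}
```

Notice that nothing in the Parameters section declares these. They are always available, and the syntax is the same Ref you would use for your own parameters.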

Mappings

The mappings section, in concert with the Fn::FindInMap intrinsic function, enables you to implement conditional values. Here we’ll be using it to determine what AMI ID we supply to our EC2 resource based on our current region and selected instance type.

    "Mappings": {
        "AWSInstanceType2Arch": {
            "t2.micro": {"Arch": "HVM64"}
        },
        "AWSRegionArch2AMI": {
            "us-east-1" : {"HVM64": "ami-9a562df2"},
            "us-west-1" : {"HVM64": "ami-df6a8b9b"},
            "us-west-2" : {"HVM64": "ami-5189a661"}
        }
    },

For simplicity’s sake I’ve limited the options. A fully functional template would enumerate all EC2 instance types as well as all regions. For example, see the Mappings section in the WordPress sample template in the AWS documentation.

The full AWS documentation is again very helpful in showing example usages.
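To see how a lookup resolves, here is a sketch of two standalone Fn::FindInMap calls against the mappings above, written as outputs purely so the results would be visible. The output names are hypothetical. Fn::FindInMap takes three arguments: the map name, a top-level key, and a second-level key.

```json
{
    "Outputs": {
        "ArchForT2Micro": {
            "Value": {"Fn::FindInMap": ["AWSInstanceType2Arch", "t2.micro", "Arch"]}
        },
        "AmiForThisRegion": {
            "Value": {"Fn::FindInMap": ["AWSRegionArch2AMI", {"Ref": "AWS::Region"}, "HVM64"]}
        }
    }
}
```

With the mappings shown above, the first lookup yields HVM64. The second yields ami-5189a661 when the stack is created in us-west-2. In a region that has no entry in the map, the lookup has nothing to return and stack creation fails, which is why complete templates list every region.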

Resources

The resources section is the real workhorse of the template. It contains everything that actually does something in the stack.

First, we specify the VPC resources. This part looks like the most complicated, but that’s just because a lot of different kinds of resources are being defined here. We can break it down into four parts: the network, access to the Internet, routing, and access rules.

The Network

    "Resources": {
        "VPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": "10.0.0.0/16",
                "EnableDnsHostnames": true
            }
        },
        "Subnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "CidrBlock": "10.0.0.0/24",
                "VpcId": {"Ref": "VPC"}
            }
        },

Here we’ve defined the address space for the network as well as a single subnet in which our instances will live. The network address ranges are specified using CIDR notation. This notation is used throughout CloudFormation templates when specifying networks or other aspects of networking, so it’s good to be familiar with what it means.
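A quick worked example of the notation: the number after the slash is how many leading bits of the address are fixed. 10.0.0.0/16 fixes 16 bits and leaves 16 free, so it spans 2^16 = 65,536 addresses (10.0.0.0 through 10.0.255.255). 10.0.0.0/24 leaves 8 bits free, so it spans 256 addresses (10.0.0.0 through 10.0.0.255). As a sketch, a second subnet that fits inside the same VPC without overlapping the first could look like this. SubnetB is a hypothetical name and is not used elsewhere in this article.

```json
{
    "SubnetB": {
        "Type": "AWS::EC2::Subnet",
        "Properties": {
            "CidrBlock": "10.0.1.0/24",
            "VpcId": {"Ref": "VPC"}
        }
    }
}
```

This block covers 10.0.1.0 through 10.0.1.255, which sits inside the VPC’s /16 range and next to, not on top of, 10.0.0.0/24.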

Internet Access

        "InternetGateway": {
            "Type": "AWS::EC2::InternetGateway",
            "Properties": {
            }
        },
        "AttachGateway": {
            "Type": "AWS::EC2::VPCGatewayAttachment",
            "Properties": {
                "VpcId": {"Ref": "VPC"},
                "InternetGatewayId": {"Ref": "InternetGateway"}
            }
        },

These two resources provide us with a way for hosts on our VPC to talk to the Internet. The first resource, “InternetGateway”, specifies the actual connection to the Internet. The “AttachGateway” resource, through its “VpcId” and “InternetGatewayId” properties, tells AWS that “InternetGateway” should be attached to the VPC we created earlier. This demonstrates another usage of the Ref intrinsic function: you can use it to refer to other resources in your stack. You’ll see this throughout the template below.

Routing

        "RouteTable": {
            "Type": "AWS::EC2::RouteTable",
            "Properties": {
                "VpcId": {"Ref": "VPC"}
            }
        },
        "Route": {
            "Type": "AWS::EC2::Route",
            "DependsOn": "AttachGateway",
            "Properties": {
                "RouteTableId": {"Ref": "RouteTable"},
                "DestinationCidrBlock": "0.0.0.0/0",
                "GatewayId": {"Ref": "InternetGateway"}
            }
        },
        "SubnetRouteTableAssociation": {
            "Type": "AWS::EC2::SubnetRouteTableAssociation",
            "Properties": {
                "SubnetId": {"Ref": "Subnet"},
                "RouteTableId": {"Ref": "RouteTable"}
            }
        },

Here we specify a routing table “RouteTable”, provide it with a single default route “Route” that directs any traffic not destined for the local network (using CIDR notation “0.0.0.0/0”) towards our Internet gateway, and specify that the route table “RouteTable” is to be associated with our subnet. Again we’re making heavy use of the Ref intrinsic function to connect our resources to one another.

Network Access Rules

Network ACLs act as a firewall for subnets in AWS VPC networks just like security groups act as a firewall at the level of an EC2 instance.

        "NetworkAcl": {
            "Type": "AWS::EC2::NetworkAcl",
            "Properties": {
                "VpcId": {"Ref" : "VPC"}
            }
        },
        "InboundNetworkAclEntry": {
            "Type": "AWS::EC2::NetworkAclEntry",
            "Properties": {
                "NetworkAclId": {"Ref": "NetworkAcl"},
                "RuleNumber": "100",
                "Protocol": "-1",
                "RuleAction": "allow",
                "Egress": false,
                "CidrBlock": "0.0.0.0/0"
            }
        },
        "OutboundNetworkAclEntry": {
            "Type": "AWS::EC2::NetworkAclEntry",
            "Properties": {
                "NetworkAclId": {"Ref": "NetworkAcl"},
                "RuleNumber": "100",
                "Protocol": "-1",
                "RuleAction": "allow",
                "Egress": true,
                "CidrBlock": "0.0.0.0/0"
            }
        },
        "SubnetNetworkAclAssociation": {
            "Type": "AWS::EC2::SubnetNetworkAclAssociation",
            "Properties": {
                "SubnetId": {"Ref": "Subnet"},
                "NetworkAclId": {"Ref": "NetworkAcl"}
            }
        },

Here we have specified the ACL itself (NetworkAcl) and told CloudFormation that it is associated with our VPC. We then specified two rules: one allows traffic from any address for any protocol into our network (ingress), and the other allows traffic for any protocol to leave our network for any address (egress). Finally, we associated our 10.0.0.0/24 subnet with this ACL.

Check the AWS documentation for AWS::EC2::NetworkAclEntry for more information about the meanings of each property.
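The rules above allow everything. As a sketch of what a narrower rule could look like, here is a hypothetical inbound entry that admits only TCP port 22 from the SSHLocation range. The resource name InboundSSHNetworkAclEntry and the rule number 110 are my own choices for illustration. Protocol 6 is the IP protocol number for TCP.

```json
{
    "InboundSSHNetworkAclEntry": {
        "Type": "AWS::EC2::NetworkAclEntry",
        "Properties": {
            "NetworkAclId": {"Ref": "NetworkAcl"},
            "RuleNumber": "110",
            "Protocol": "6",
            "PortRange": {"From": "22", "To": "22"},
            "RuleAction": "allow",
            "Egress": false,
            "CidrBlock": {"Ref": "SSHLocation"}
        }
    }
}
```

This only has an effect if it replaces the allow-all inbound rule rather than sitting beside it. Keep in mind that network ACLs are stateless, so if you tighten the inbound side you also need inbound rules for the return traffic of connections the instance itself opens (package downloads, for instance). I’ve kept the ACL wide open in this article to avoid that complication.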

Our Instance (finally)

For this example we’d like our instance to have an elastic IP address (for no particular reason other than to demonstrate EIP usage within a VPC context). We specify that first.

        "IPAddress": {
            "Type": "AWS::EC2::EIP",
            "DependsOn": "AttachGateway",
            "Properties": {
                "Domain": "vpc",
                "InstanceId": {"Ref": "Instance"}
            }
        },

This resource demonstrates a helpful attribute you can specify that governs the order in which CloudFormation creates your resources: the DependsOn attribute. CloudFormation in most cases can determine the order in which resources need to be created. For example, if you specify that a network ACL is associated with a particular VPC, the VPC will be created before the ACL because otherwise the ACL would not make sense. However, there may be cases in which you must specify these dependency relationships explicitly. The DependsOn attribute provides this possibility.
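DependsOn accepts either a single logical name, as above, or a list of names. As a sketch, this variant of the same resource waits for two other resources. Whether the EIP really needs to wait for the Route resource is an assumption made only to show the list form.

```json
{
    "IPAddress": {
        "Type": "AWS::EC2::EIP",
        "DependsOn": ["AttachGateway", "Route"],
        "Properties": {
            "Domain": "vpc",
            "InstanceId": {"Ref": "Instance"}
        }
    }
}
```

The Ref to Instance inside Properties already creates an implicit dependency on the instance, so only the relationships CloudFormation cannot infer need to be listed.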

We need to specify a security group next, and ensure that AWS knows it’s going to be used in a VPC context:

        "InstanceSG": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Enable HTTP and SSH access",
                "VpcId": { "Ref": "VPC" },
                "SecurityGroupIngress": [
                    { "IpProtocol": "tcp", "FromPort": "80", "ToPort": "80", "CidrIp": "0.0.0.0/0" },
                    { "IpProtocol": "tcp", "FromPort": "22", "ToPort": "22", "CidrIp": {"Ref": "SSHLocation"} }
                ]
            }
        },

Here we’ve used our Ref intrinsic function again to refer back to one of the parameters of this template, which lets us restrict the address range from which we’re allowed to connect to our instance over port 22 (SSH). We allow any source IP to connect over port 80 (HTTP) so we can serve a public website, for example.

And finally we arrive at the specification of the actual EC2 instance. Note the use of the DependsOn attribute to indicate that the “Instance” resource should not be created until the AttachGateway resource has been created. This ensures that our instance is never launched into a network without Internet access, because the AttachGateway resource is guaranteed to have been created successfully before CloudFormation starts creating our Instance resource. This is important because we’ll need access to the Internet in order to fully set up the instance.

        "Instance": {
            "Type": "AWS::EC2::Instance",
            "DependsOn": "AttachGateway",
            "Metadata": {
                "AWS::CloudFormation::Init": {
                    "config": {
                        "packages": {
                            "apt": {
                                "apache2": []
                            }
                        }
                    }
                }
            },
            "Properties": {
                "ImageId": {"Fn::FindInMap": ["AWSRegionArch2AMI",
                    {"Ref": "AWS::Region"}, {"Fn::FindInMap": ["AWSInstanceType2Arch",
                        "t2.micro", "Arch"]}]},
                "InstanceType": "t2.micro",
                "KeyName": {"Ref": "KeyName"},
                "NetworkInterfaces": [{
                    "GroupSet": [{"Ref": "InstanceSG"}],
                    "AssociatePublicIpAddress": true,
                    "DeviceIndex": 0,
                    "DeleteOnTermination": true,
                    "SubnetId": {"Ref": "Subnet"}
                }],
                "UserData": {"Fn::Base64": {"Fn::Join": ["", [
                    "#!/bin/bash\n",
                    "apt-get update\n",
                    "apt-get -y install python-setuptools\n",
                    "easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
                    "/usr/local/bin/cfn-init -v",
                    " --region ", { "Ref" : "AWS::Region" },
                    " --stack ", { "Ref" : "AWS::StackId" },
                    " --resource Instance\n",
                    "/usr/local/bin/cfn-signal -e $?",
                    " --region ", { "Ref" : "AWS::Region" },
                    " --stack ", { "Ref" : "AWS::StackId" },
                    " --resource Instance\n" 
                ]]}}
            },
            "CreationPolicy": {
                "ResourceSignal": {
                    "Timeout": "PT15M"
                }
            }
        }
    },

Image ID & Network Interfaces

Let’s go through this in more detail.

We’ve already covered the DependsOn attribute. It guarantees that our Internet gateway is set up before our instance is created. This ensures the instance will have access to the Internet at the time it is created. Why do we need this? In our example, we’re going to need to set up some utilities and packages, and we’ll need Internet access to do this.

We’ll look at the Metadata attribute after we’ve discussed Properties in more detail.

The ImageId, InstanceType, and KeyName properties are rather straightforward. However, the specification of “ImageId” demands a bit of explanation. We’re using the Fn::FindInMap intrinsic function twice: the inner call looks up the architecture for our instance type, and the outer call uses that architecture together with our region name (obtained via Ref) to find the correct AMI ID in the Mappings section. We could have hard-coded it, but AMI IDs are region specific. We’ve hard-coded the instance type to be the t2.micro type, but we could have made this a template input parameter to give us more flexibility.
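Here is a sketch of what turning the instance type into an input parameter might look like. The parameter name InstanceType is my own choice, and only the properties that change are shown. AllowedValues is limited to t2.micro because that is the only type with an entry in our AWSInstanceType2Arch mapping.

```json
{
    "Parameters": {
        "InstanceType": {
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro"],
            "Description": "EC2 instance type (must have an entry in AWSInstanceType2Arch)"
        }
    },
    "Resources": {
        "Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceType"},
                "ImageId": {"Fn::FindInMap": ["AWSRegionArch2AMI",
                    {"Ref": "AWS::Region"},
                    {"Fn::FindInMap": ["AWSInstanceType2Arch",
                        {"Ref": "InstanceType"}, "Arch"]}]}
            }
        }
    }
}
```

Supporting another instance type would then mean adding it to both AllowedValues and the AWSInstanceType2Arch mapping, so the two stay in step.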

The NetworkInterfaces property connects our instance to the VPC subnet we created earlier. It also connects our instance to the InstanceSG security group.

User Data, cfn-init, & Metadata

The UserData property is where you can interface with the system itself via shell script or cloud-init directives. It has the same effect as if you manually launched the instance from the EC2 console, and pasted a script into the user supplied data field towards the end of that process.

In a CloudFormation template, the UserData property must be a Base64-encoded string that contains either the shell script or cloud-init directives. To achieve this we use the Fn::Base64 intrinsic function in concert with Fn::Join, which concatenates the pieces of the script into a single string.
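To show the mechanics in isolation, here is a sketch of a tiny UserData value. Fn::Join takes a delimiter (the empty string here) and a list of pieces, some of which can be intrinsic functions such as Ref. The script itself and the path /tmp/region.txt are made up for illustration.

```json
{
    "UserData": {"Fn::Base64": {"Fn::Join": ["", [
        "#!/bin/bash\n",
        "echo \"stack region: ", {"Ref": "AWS::Region"}, "\" > /tmp/region.txt\n"
    ]]}}
}
```

For a stack in us-east-1, the joined script consists of the shebang line followed by echo "stack region: us-east-1" > /tmp/region.txt. Note that each line needs its own explicit \n, since the pieces are joined with nothing in between.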

In our example we’ve chosen to use shell scripts rather than cloud-init directives. The amount of data you can supply via UserData is limited to 16 KB. For more information about what you can do with user supplied data, see the documentation here and here.

In our script, we run apt-get update to update the apt repositories, then we install the python-setuptools package. We need python-setuptools in order to have access to the easy_install command, which allows us to install the latest version of the cfn-bootstrap utilities provided by AWS. This gives us the cfn-init and cfn-signal commands used in the following two lines. It also gives us the cfn-hup daemon, which enables your instance to watch for changes to the template Metadata attribute and act on those changes to update the system. We’re not going to cover cfn-hup here, but will in another article.

Observation: In a production setup, after running apt-get update you should also run apt-get upgrade to ensure that you have a fully patched Ubuntu system. We’ve omitted that here for no particular reason other than it makes the instance creation take a bit longer.

The cfn-init invocation retrieves the Metadata attribute of the Instance resource from CloudFormation and acts on what it finds there. In Metadata we’ve specified that we want to install the apache2 package using the apt package manager. This provides us with a stock Apache2 server. With similar directives, we could set up other apt packages, create files, or even run shell scripts. All of this gets kicked off by the cfn-init invocation in our UserData script.
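As a sketch of those other directives, here is a hypothetical extension of the Metadata that also writes a file and makes sure the service is running. The packages, files, and services keys are part of the AWS::CloudFormation::Init format. The file path and its content are assumptions on my part, and the right document root depends on the Ubuntu and Apache versions of the AMI.

```json
{
    "AWS::CloudFormation::Init": {
        "config": {
            "packages": {
                "apt": {
                    "apache2": []
                }
            },
            "files": {
                "/var/www/html/index.html": {
                    "content": "<h1>Hello from CloudFormation</h1>\n",
                    "mode": "000644",
                    "owner": "root",
                    "group": "root"
                }
            },
            "services": {
                "sysvinit": {
                    "apache2": {
                        "enabled": "true",
                        "ensureRunning": "true"
                    }
                }
            }
        }
    }
}
```

I have not tested this variant against the AMIs listed in the Mappings section, so treat it as a starting point rather than a drop-in replacement.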

Observation: The command line options passed to the cfn-init invocation specify the region, the stack, and the resource we want to target. For more information, see the documentation on cfn-init.

Creation Policy & cfn-signal

The final command in our UserData script is an invocation of cfn-signal, which tells CloudFormation whether our instance setup succeeded. The -e flag takes an exit code; here we pass it $?, the exit status of the preceding cfn-init command. While I can’t find the documentation on this, it seems that anything non-zero passed to -e will be considered a failure condition, at least in a Linux environment. For more information see the documentation on cfn-signal.

The signal passed by cfn-signal then interacts with our CreationPolicy attribute. As long as AWS receives a success signal within the time specified by the policy (using the ISO 8601 duration format), CloudFormation will consider our instance to have been created successfully. At this point, the instance will have reached the CREATE_COMPLETE status. For more details, see the documentation on CreationPolicy. You can also read an excellent article describing CreationPolicy usage.
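A note on the duration format: PT15M reads as “period, time, 15 minutes”. As a sketch, here is a variant of the policy with a longer timeout and an explicit Count, which is the number of success signals CloudFormation waits for. The values are arbitrary examples.

```json
{
    "CreationPolicy": {
        "ResourceSignal": {
            "Count": 1,
            "Timeout": "PT1H30M"
        }
    }
}
```

PT1H30M is one hour and thirty minutes. Count defaults to 1 when omitted, which is why the template above gets away with specifying only Timeout.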

Outputs

A CloudFormation template allows you to define a set of outputs as well as inputs.

    "Outputs": {
        "InstanceAZ": {
            "Description": "Instance AZ",
            "Value": {"Fn::GetAtt": ["Instance", "AvailabilityZone"]}
        },
        "InstancePublicIp": {
            "Description": "Instance public IP",
            "Value": {"Fn::GetAtt": ["Instance", "PublicIp"]}
        },
        "InstancePublicDnsName": {
            "Description": "Instance public DNS name",
            "Value": {"Fn::GetAtt": ["Instance", "PublicDnsName"]}
        }
    }
}

For example, you won’t know the actual IP address that gets allocated to your instance until it’s been allocated. One way to get this information easily is via an output. The same goes for the public DNS name of the instance. Naturally, you could figure this out by looking at the AWS console, but it’s nice to be able to display that information along with the application stack that the template defines. The documentation on outputs includes a bit more information than what I’ve shown here, but I’ve not found an exhaustive list of everything you can include as an output. A rule of thumb seems to be that if you can access it elsewhere in a template, then it can be an output. For our example we’ve utilized the Fn::GetAtt intrinsic function.
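Following that rule of thumb, here is a sketch of two more outputs that are not in the template above. One combines Fn::Join with Fn::GetAtt to build a clickable URL, and the other echoes back a parameter with Ref. The output names WebsiteURL and SSHSource are hypothetical.

```json
{
    "Outputs": {
        "WebsiteURL": {
            "Description": "URL of the Apache server on the instance",
            "Value": {"Fn::Join": ["", ["http://",
                {"Fn::GetAtt": ["Instance", "PublicDnsName"]}]]}
        },
        "SSHSource": {
            "Description": "CIDR range allowed to connect over SSH",
            "Value": {"Ref": "SSHLocation"}
        }
    }
}
```

Echoing parameters back as outputs can be handy when you come back to a stack later and want to see what it was created with.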

Next Topics

We’ve only scratched the surface of what AWS CloudFormation can provide. In our next article we’ll demonstrate utilizing the cfn-hup daemon to manage updates to an instance within an application stack. There are some tricks to get this working on Ubuntu that I’ll describe in that article.

This is a complicated topic. When I set out to write about CloudFormation I don’t think I realized just how involved it would be. My main motivation has been that, as I was learning CloudFormation for myself, I couldn’t find any single introduction that covered everything I wanted to see in one place with links to documentation. My aim has been to make this the introduction I would like to have found. I’m sure there is a lot of room for improvement in my writing and in how I’ve communicated the concepts. Feedback is welcomed.

If you would like to comment on this article, please send me an email at pennedav@gmail.com.
Feedback from comments may be incorporated in the form of updates to the article text.