belodetek Ltd.

CloudFormation Generic Custom Resources

TL;DR generic Lambda to create Client VPN and Cognito demo stacks 🤓

If you have worked with AWS CloudFormation for any reasonable length of time, you will have discovered that it is a very powerful framework. You will also have quickly discovered that the native resources tend not to be on the bleeding edge of AWS development. Sometimes they lag so much that the lag is, unreasonably, measured in years (e.g. 2012-2019). At the time of writing, AWS Cognito and Client VPN are either only partially implemented or not implemented at all. Granted, Client VPN was only released in December 2018, but Cognito has been around since 2014!

And yes, we could reach for other frameworks, such as Terraform, but this is a native CloudFormation solution, requiring no additional tools or frameworks. It can be deployed with awscli, without any further dependencies.

In any case, we are not going to start rolling our configurations manually, pressing buttons, etc. and will attempt to keep all of our AWS resource management within CFN. Why? Because DevOps.

Luckily, CFN has a concept of Custom Resources, which allows you to define your own resources in the template and map them to your Lambda function. However, creating a separate function for every resource that CFN doesn't currently handle (and perhaps never will) gets a bit tedious. For example, we have over 20 such functions already and I suspect others have a lot more.

Since the CFN to Lambda bridge follows a very tight interface spec, and AWS provides an SDK for Python, among others, which tracks new AWS product development a lot better than CFN, wouldn't it be nice to have a generic Lambda function which would take a bunch of parameters, invoke the appropriate API method to create, update or delete a resource, and wait for the operation to complete? Right?

I found another project, which aims to address the same issue, but decided to create my own version anyway to implement generic wait handling.

The overall approach I've taken is to implement handlers for the Create, Update and Delete requests which CloudFormation sends to the Lambda function. Each request type takes the following mandatory parameters:

  • AgentService
  • AgentType
  • AgentCreateMethod
  • AgentDeleteMethod

AgentType can be either client or resource. Depending on the type, we create either a boto3.session.Session.client() or boto3.session.Session.resource() object. AgentService can be any of the supported services, such as ec2 and rds.
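The AgentType switch can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: make_agent and FakeSession are hypothetical names, and the session is passed in (in real use it would be a boto3.session.Session) so the sketch runs without AWS credentials.

```python
VALID_AGENT_TYPES = ("client", "resource")


def make_agent(session, agent_service, agent_type):
    """Return session.client(service) or session.resource(service).

    `session` is expected to behave like boto3.session.Session.
    """
    if agent_type not in VALID_AGENT_TYPES:
        raise ValueError(
            f"AgentType must be one of {VALID_AGENT_TYPES}, got {agent_type!r}"
        )
    # Look up session.client or session.resource by name and call it.
    factory = getattr(session, agent_type)
    return factory(agent_service)


class FakeSession:
    """Hypothetical stand-in for boto3.session.Session, for illustration only."""

    def client(self, service_name):
        return ("client", service_name)

    def resource(self, service_name):
        return ("resource", service_name)
```

With a real session, make_agent(boto3.session.Session(), 'ec2', 'client') would hand back an EC2 client whose methods can then be looked up by name.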

The following are optional:

  • AgentUpdateMethod
  • AgentWaitMethod
  • Agent(Create|Update|Delete|Wait)Args
  • AgentResourceId
  • AgentWaitResourceId
  • AgentWaitQueryExpr
  • AgentWait(Update|Create|Delete)QueryValues
  • RoleArn

AgentResourceId is technically optional, but is normally required if the resource you are creating gets a unique id assigned by the API. If RoleArn is present in ResourceProperties, we will assume that role first and pass the credentials to AgentService. If AgentWaitMethod is supplied, we will try to instantiate it as a waiter first using the get_waiter() call and if that fails, we will create it literally.
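A rough sketch of the RoleArn and AgentWaitMethod handling, under a few assumptions: the function names and the session name are made up for illustration, the STS client is passed in rather than created, and I am assuming an unknown waiter name raises ValueError, which is how botocore clients behave as far as I know. "Creating it literally" is interpreted here as looking the method up by name on the client.

```python
def session_kwargs_for_role(sts_client, role_arn, session_name="generic-custom-resource"):
    """Assume role_arn and return keyword arguments for boto3.session.Session()."""
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def resolve_wait_method(client, wait_method_name):
    """Return ("waiter", waiter) if a waiter exists, else ("method", bound_method)."""
    try:
        return "waiter", client.get_waiter(wait_method_name)
    except ValueError:
        # Assumption: no waiter by that name, so fall back to the plain
        # client method with the same name.
        return "method", getattr(client, wait_method_name)


class FakeEc2Client:
    """Hypothetical stand-in for an EC2 client, for illustration only."""

    def get_waiter(self, name):
        if name == "instance_running":
            return "<waiter instance_running>"
        raise ValueError(f"Waiter does not exist: {name}")

    def describe_client_vpn_target_networks(self, **kwargs):
        return {"ClientVpnTargetNetworks": []}
```

The returned kind tells the caller whether to call waiter.wait(...) or to poll the plain method itself.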

⚠️☣️☢️ WARNING the update operation will almost certainly delete and re-create your resource, unless a valid update method is passed with correct parameters. Beware of using this tool on stateful resources, such as databases and directories. CFN offers a mechanism to prevent inadvertent deletion and updates, which should be used for these types of resources.
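To make the warning concrete, here is a simplified sketch of how such a dispatcher might behave. It is an assumption about the shape of the logic, not the provider's real code: handle_request and RecordingAgent are hypothetical names, and injecting the resource id into the delete call is left out for brevity. The RequestType and ResourceProperties keys are the ones CloudFormation sends in a custom resource event.

```python
def handle_request(agent, event):
    """Dispatch a CloudFormation custom resource event to methods on `agent`."""
    props = event["ResourceProperties"]

    def call(kind):
        # kind is "Create", "Update" or "Delete"; args default to none.
        method = getattr(agent, props[f"Agent{kind}Method"])
        return method(**props.get(f"Agent{kind}Args", {}))

    request_type = event["RequestType"]
    if request_type in ("Create", "Delete"):
        return call(request_type)
    if request_type == "Update":
        if "AgentUpdateMethod" in props:
            return call("Update")
        # No update method supplied: the resource is deleted and re-created.
        call("Delete")
        return call("Create")
    raise ValueError(f"unsupported RequestType: {request_type!r}")


class RecordingAgent:
    """Hypothetical agent that records calls instead of talking to AWS."""

    def __init__(self):
        self.calls = []

    def create_thing(self, **kwargs):
        self.calls.append(("create_thing", kwargs))
        return {"ThingId": "thing-2"}

    def delete_thing(self, **kwargs):
        self.calls.append(("delete_thing", kwargs))
        return {}
```

Sending an Update event without AgentUpdateMethod to this sketch records a delete_thing call followed by a create_thing call, which is exactly the behaviour to watch out for on stateful resources.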

The most basic resource is one that does not require waiting for creation or deletion to complete. Just passing in the mandatory parameters is enough, for example:

Resources:
  ClientVPNEndpoint:
    Type: 'Custom::ClientVPNEndpoint'
    Properties:
      ServiceToken: !Sub '${CustomResourceLambdaArn}'
      AgentService: ec2
      AgentType: client
      AgentCreateMethod: create_client_vpn_endpoint
      AgentDeleteMethod: delete_client_vpn_endpoint
      AgentResourceId: ClientVpnEndpointId
      AgentCreateArgs: <JSON object|JSON string>

AgentCreateArgs will take either a CFN formatted JSON, with all values passed as strings, or a packed JSON-string, which the provider will unpack to preserve boolean values.
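The difference between the two forms is easiest to see in code. The helper below is a hypothetical sketch (unpack_args is not a name from the provider): a dict coming from the template is used as-is, so a boolean arrives as the string 'true', while a packed JSON string is decoded and keeps its real types.

```python
import json


def unpack_args(value):
    """Normalise Agent*Args into a dict of keyword arguments."""
    if value is None:
        return {}
    if isinstance(value, str):
        # Packed JSON string: decoding restores booleans and numbers.
        value = json.loads(value)
    if not isinstance(value, dict):
        raise ValueError("Agent*Args must be a JSON object")
    return dict(value)


# CFN-formatted JSON: every scalar has already been turned into a string.
from_template = unpack_args({"SplitTunnel": "true"})

# Packed JSON string: the boolean survives the round trip.
from_string = unpack_args('{"SplitTunnel": true}')
```

Here from_template holds the string 'true', whereas from_string holds a real boolean, which matters for API parameters that are validated as booleans.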

For a more complicated request, where resource creation takes time, we would normally want to wait for that resource to become available before starting to create dependencies. In this instance, we need to pass in some additional parameters. In the following example, we associate a subnet with a client VPN endpoint, which takes some time. Since there are no waiters implemented for this resource, we pass in a generic describe_client_vpn_target_networks method, a JSONPath expression to evaluate against the response, and the values that indicate a successful create and delete. For deletions, an empty list is usually enough, since once this resource is removed the describe_client_vpn_target_networks response will be empty. We also pass in AgentWaitResourceId, since the wait method takes a different parameter than the one specified in AgentResourceId.

  AssociateSubnet:
    Type: 'Custom::SubnetAssociation'
    Properties:
      ServiceToken: !Sub '${CustomResourceLambdaArn}'
      AgentService: ec2
      AgentType: client
      AgentCreateMethod: associate_client_vpn_target_network
      AgentDeleteMethod: disassociate_client_vpn_target_network
      AgentWaitMethod: describe_client_vpn_target_networks
      AgentWaitQueryExpr: '$.ClientVpnTargetNetworks[?(@.TargetNetworkId=="subnet-abcdef1234567890")].Status.Code'
      AgentWaitCreateQueryValues:
      - associated
      AgentWaitDeleteQueryValues: []
      AgentResourceId: AssociationId
      AgentWaitResourceId:
      - AssociationIds
      AgentCreateArgs:
        ClientVpnEndpointId: !Sub '${ClientVPNEndpoint}'
        SubnetId: subnet-abcdef1234567890
      AgentWaitArgs:
        ClientVpnEndpointId: !Sub '${ClientVPNEndpoint}'
      AgentDeleteArgs:
        ClientVpnEndpointId: !Sub '${ClientVPNEndpoint}'
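The waiting logic for the association above boils down to polling the describe call until the queried values match. The sketch below is only an approximation under stated assumptions: the function names, delay and attempt count are made up, the JSONPath query is replaced by an equivalent plain-Python filter to avoid a third-party dependency, and matching by list equality is my guess at the comparison.

```python
import time


def association_status_codes(response, subnet_id):
    """Plain-Python equivalent of the AgentWaitQueryExpr used above."""
    return [
        network["Status"]["Code"]
        for network in response.get("ClientVpnTargetNetworks", [])
        if network.get("TargetNetworkId") == subnet_id
    ]


def wait_for(describe, extract, expected, delay=15, max_attempts=40, sleep=time.sleep):
    """Poll describe() until extract(response) == expected.

    Returns the number of attempts used, or raises TimeoutError.
    """
    for attempt in range(1, max_attempts + 1):
        if extract(describe()) == expected:
            return attempt
        if attempt < max_attempts:
            sleep(delay)
    raise TimeoutError(f"no match for {expected!r} after {max_attempts} attempts")


def status_response(code):
    """Build a fake describe_client_vpn_target_networks response."""
    return {
        "ClientVpnTargetNetworks": [
            {"TargetNetworkId": "subnet-abcdef1234567890", "Status": {"Code": code}}
        ]
    }
```

For a create, expected would be ['associated']; for a delete it would be the empty list, which matches as soon as the describe call stops returning the association.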

There are many more examples, too numerous to list. This tool has only been tested on the AWS Client VPN and Directory resources, so there are almost definitely going to be edge-cases which would need to be handled correctly and hopefully generally.

A complete client-vpn-demo CFN stack is provided as a means to demonstrate the operation of this tool end-to-end. This stack will deploy a Client VPN endpoint, associate subnets, authorize ingress, add default routes and apply security group(s). This stack can be added as a nested stack within a parent template, to add client VPN (OpenVPN) connectivity to the private subnets.

Another (semi)complete cognito-demo CFN stack is provided. This stack deploys Cognito IdP resources and configures a user pool domain and SAML provider. This stack can be coupled with existing AWS ELBv2 (ALB) resource to provide authentication at the load balancer.

For a usage example with AgentType == 'resource', take a look at the mock request and adapt it for use inside your templates.

Future work will probably need to look at the DryRun parameter. If this is passed in the Agent(Create|Update|Delete)Args, it would be great if the resource creation could somehow be tested before one ends up in ROLLBACK_FAILED|DELETE_FAILED hell. Also, it wouldn't hurt to set some reasonable defaults (e.g. AgentType = 'client') and keep the CFN templates brief.

We hope this tool helps someone remove some duplication in their CFN templates and Lambda functions!

--belodetek 😬

Anton Belodedenko


I am a jack of all trades, master of none (DevOps). My wife and I are itinerant. I also ski, snowboard and rock climb. Oh, and I like Futurama, duh!
