on reducing AWS Elastic Beanstalk costs... 💰

There appears to be plenty of documentation and examples on how to deploy web applications using AWS Elastic Beanstalk (EBS). There is also a significant body of work on how to reduce AWS costs using EC2 Spot instances, RIs, etc. However, there doesn't appear to be much in the way of combining the two.

This post makes a number of assumptions. The most important one is that the reader is familiar with the specific technology and the terminology. The other one is that everything we do follows the DevOps "infrastructure-as-code" tenet. In this case, this essentially means we will be using AWS CloudFormation (CFN). So let's dive right in...

Deploying EBS through CFN produces an environment running an application, with an auto-scaling policy in the form of an EC2 Auto Scaling Group (ASG).

In CFN templates, an EBS environment is usually provisioned with the following resource definition:

Environment:
  Type: 'AWS::ElasticBeanstalk::Environment'
  Properties:
    ApplicationName: !Sub '${Application}'
    Description: !Sub '${NameTag} environment'
    TemplateName: !Sub '${ConfigurationTemplate}'
    VersionLabel: !Sub '${ApplicationVersion}'
    Tags:
    - Key: NameTag
      Value: !Sub '${NameTag}'

This effectively kicks off deployment of a nested CFN stack (e.g. awseb-e-tonvh4892d-stack), which is maintained by AWS/EBS and which we have no visibility into. This stack produces no outputs/exports we can leverage, aside from the AWSEBLoadBalancerURL output. This nested stack deploys all of the resources, including the ASG, auto-scaling policy, etc.

To provision cheaper Spot instances in our EBS environment, we just need to specify the EC2_SPOT_PRICE environment variable in our configuration template definition:

ConfigurationTemplate:
  Type: 'AWS::ElasticBeanstalk::ConfigurationTemplate'
  Properties:
    ApplicationName: !Ref 'Application'
    Description: !Sub '${NameTag} spot configuration template'
    OptionSettings:
    - Namespace: 'aws:elasticbeanstalk:application:environment'
      OptionName: EC2_SPOT_PRICE
      Value: !If [ HasSpot, !Ref 'MaxSpotPrice', !Ref 'AWS::NoValue' ]
    ...

This is covered in the following post. However, this provisions 100% Spot capacity with no On-Demand instances. This is not ideal for mission-critical applications, unless additional logic is wired in to listen for termination events and provision replacement capacity automatically.

Ideally, we would want to provision something like 50% Spot and 50% On-Demand capacity within our EBS environment. This can be achieved with a MixedInstancesPolicy set on the ASG, but it requires the ASG to launch instances using a Launch Template (LT), rather than the default Launch Configuration (LC) that EBS configures. It is possible to copy an LC to an LT using the AWS Console, but since we need to be able to do it programmatically inside our CFN template, we need to leverage some custom resources.

First, we need to obtain the ASG name via a call to our custom resource provider Lambda function:

AutoScalingGroupName:
  Type: 'Custom::AutoScalingGroupName'
  DependsOn: Environment
  Properties:
    ServiceToken: !Sub 'arn:${AWS::Partition}:lambda:${AWS::Region}:${AWS::AccountId}:function:generic-custom-resource-provider'
    AgentService: autoscaling
    AgentType: client
    AgentCreateMethod: describe_auto_scaling_groups
    AgentWaitQueryExpr: !Sub '$.AutoScalingGroups[*].Tags[?(@.Key=="elasticbeanstalk:environment-name" && @.Value=="${Environment}")].ResourceId'
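
To make the lookup above a bit more concrete, here is a rough Python sketch of what the JSONPath expression is doing: scan the describe_auto_scaling_groups response for the group tagged with our environment name. This is not the code inside generic-custom-resource-provider (its internals are not shown in this post). The function name, the EB_ENV_TAG constant and the sample data are made up for illustration.

```python
from typing import Any, Dict, Optional

# Tag key that Elastic Beanstalk puts on the ASG (taken from the query above).
EB_ENV_TAG = "elasticbeanstalk:environment-name"


def find_asg_name(response: Dict[str, Any], environment_name: str) -> Optional[str]:
    """Return the name of the first ASG tagged with the given environment name.

    `response` is a dict shaped like the output of describe_auto_scaling_groups.
    Returns None when no group carries a matching tag.
    """
    for group in response.get("AutoScalingGroups", []):
        for tag in group.get("Tags", []):
            if tag.get("Key") == EB_ENV_TAG and tag.get("Value") == environment_name:
                return group.get("AutoScalingGroupName")
    return None


# Hypothetical, trimmed-down response used only to exercise the function.
sample_response = {
    "AutoScalingGroups": [
        {"AutoScalingGroupName": "unrelated-asg", "Tags": []},
        {
            "AutoScalingGroupName": "awseb-e-example-stack-AWSEBAutoScalingGroup-EXAMPLE",
            "Tags": [
                {
                    "ResourceId": "awseb-e-example-stack-AWSEBAutoScalingGroup-EXAMPLE",
                    "ResourceType": "auto-scaling-group",
                    "Key": EB_ENV_TAG,
                    "Value": "my-env",
                    "PropagateAtLaunch": True,
                }
            ],
        },
    ]
}

print(find_asg_name(sample_response, "my-env"))
```

In a real lookup the response would come from the autoscaling API (and may be paginated), so treat this purely as an illustration of the matching logic.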

Once we have the ASG name, we can get the name of the LC:

AutoScalingLaunchConfigurationName:
  Type: 'Custom::AutoScalingLaunchConfigurationName'
  DependsOn: AutoScalingGroupName
  Properties:
    ServiceToken: !Sub 'arn:${AWS::Partition}:lambda:${AWS::Region}:${AWS::AccountId}:function:generic-custom-resource-provider'
    AgentService: autoscaling
    AgentType: client
    AgentCreateMethod: describe_auto_scaling_groups
    AgentWaitQueryExpr: !Sub '$.AutoScalingGroups[?(@.AutoScalingGroupName=="${AutoScalingGroupName}")].LaunchConfigurationName'
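
The second query is simpler: filter the same response by ASG name and read LaunchConfigurationName. A minimal Python sketch of that filter follows. The function name and sample values are hypothetical and only stand in for whatever the custom resource provider does internally.

```python
from typing import Any, Dict, Optional


def find_launch_configuration_name(
    response: Dict[str, Any], asg_name: str
) -> Optional[str]:
    """Return the LC name attached to the named ASG, or None.

    None is returned both when the ASG is not in the response and when the
    ASG has no LaunchConfigurationName key (e.g. it already uses an LT).
    """
    for group in response.get("AutoScalingGroups", []):
        if group.get("AutoScalingGroupName") == asg_name:
            return group.get("LaunchConfigurationName")
    return None


# Hypothetical response fragment for illustration only.
sample_groups = {
    "AutoScalingGroups": [
        {
            "AutoScalingGroupName": "example-asg",
            "LaunchConfigurationName": "example-launch-configuration",
        },
        {"AutoScalingGroupName": "already-on-launch-template"},
    ]
}

print(find_launch_configuration_name(sample_groups, "example-asg"))
```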

After obtaining both, we can copy the LC to an LT:

CreateLaunchTemplateFromConfiguration:
  Type: 'Custom::CreateLaunchTemplateFromConfiguration'
  DependsOn: AutoScalingLaunchConfigurationName
  Version: 1.0
  Properties:
    ServiceToken: !Sub 'arn:${AWS::Partition}:lambda:${AWS::Region}:${AWS::AccountId}:function:generic-custom-resource-provider'
    Version: 1.0
    # custom module: autoscaling.py
    AgentService: autoscaling
    AgentType: custom
    AgentResponseNode: LaunchTemplate
    AgentCreateMethod: create_launch_template_from_configuration
    AgentDeleteMethod: delete_launch_template
    AgentCreateArgs:
      LaunchConfigurationName: !Sub '${AutoScalingLaunchConfigurationName}'
      LaunchTemplateName: !Sub '${NameTag}-launch-template'
      Description: !Sub 'Created from ${AutoScalingLaunchConfigurationName}.'
      TagSpecifications:
      - ResourceType: 'launch-template'
        Tags:
        - Key: 'Name'
          Value: !Sub '${NameTag}'
    AgentDeleteArgs:
      LaunchTemplateName: !Sub '${NameTag}-launch-template'
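
The create_launch_template_from_configuration method lives in the custom autoscaling.py module, which is not reproduced here. As a rough idea of what such a copy involves, the sketch below maps a dict shaped like a describe_launch_configurations entry to a LaunchTemplateData dict. It is a partial mapping under my own assumptions (security groups are VPC group IDs, block device mappings and network settings are skipped), not the actual module.

```python
from typing import Any, Dict

# Fields assumed to carry over with the same key and value.
_COPY_AS_IS = ("ImageId", "InstanceType", "KeyName", "UserData", "EbsOptimized")


def launch_template_data_from_configuration(lc: Dict[str, Any]) -> Dict[str, Any]:
    """Build a partial LaunchTemplateData dict from a launch configuration dict."""
    # Skip keys that are absent or empty, but keep explicit False values.
    data: Dict[str, Any] = {
        key: lc[key] for key in _COPY_AS_IS if lc.get(key) not in (None, "")
    }

    # Assumption: the LC lists security group IDs (VPC), not group names.
    groups = lc.get("SecurityGroups") or []
    if groups:
        data["SecurityGroupIds"] = list(groups)

    # An LC stores the instance profile as a plain string (name or ARN);
    # an LT expects a small structure with either "Arn" or "Name".
    profile = lc.get("IamInstanceProfile")
    if profile:
        key = "Arn" if profile.startswith("arn:") else "Name"
        data["IamInstanceProfile"] = {key: profile}

    # InstanceMonitoring on the LC becomes Monitoring on the LT.
    monitoring = lc.get("InstanceMonitoring")
    if monitoring is not None:
        data["Monitoring"] = {"Enabled": bool(monitoring.get("Enabled"))}

    return data


# Hypothetical launch configuration used only to exercise the mapping.
sample_lc = {
    "LaunchConfigurationName": "example-launch-configuration",
    "ImageId": "ami-00000000000000000",
    "InstanceType": "t3.small",
    "KeyName": "",
    "SecurityGroups": ["sg-00000000000000000"],
    "IamInstanceProfile": "example-instance-profile",
    "InstanceMonitoring": {"Enabled": False},
    "EbsOptimized": False,
}

print(launch_template_data_from_configuration(sample_lc))
```

The resulting dict is the kind of structure the EC2 create_launch_template call takes as LaunchTemplateData, but a faithful copy would need to handle more fields than shown here.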

We can now reference ${CreateLaunchTemplateFromConfiguration.LaunchTemplateId} in our further CFN resources and configure our ASG to use the new LT with the MixedInstancesPolicy defined:

MixedInstancesPolicy:
  Type: 'Custom::MixedInstancesPolicy'
  DependsOn: CreateLaunchTemplateFromConfiguration
  Version: 1.0
  Properties:
    ServiceToken: !Sub 'arn:${AWS::Partition}:lambda:${AWS::Region}:${AWS::AccountId}:function:generic-custom-resource-provider'
    Version: 1.0
    # custom module: autoscaling.py
    AgentService: autoscaling
    AgentType: custom
    AgentResponseNode: ResponseMetadata
    AgentCreateMethod: update_auto_scaling_group
    AgentUpdateMethod: update_auto_scaling_group
    AgentCreateArgs: !Sub |
      {
        "AutoScalingGroupName": "${AutoScalingGroupName}",
        "MixedInstancesPolicy": {
          "LaunchTemplate": {
            "LaunchTemplateSpecification": {
              "LaunchTemplateId": "${CreateLaunchTemplateFromConfiguration.LaunchTemplateId}",
              "Version": "1"
            },
            "Overrides": [
              {
                "InstanceType": "${InstanceSize}"
              }
            ]
          },
          "InstancesDistribution": {
            "OnDemandBaseCapacity": ${OnDemandBaseCapacity},
            "OnDemandPercentageAboveBaseCapacity": ${OnDemandPercentageAboveBaseCapacity}
          }
        }
      }
    AgentUpdateArgs: !Sub |
      ...
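
Hand-written JSON inside !Sub is easy to get subtly wrong, so it can help to sanity-check the payload outside CFN. The sketch below builds the same argument structure in Python and round-trips it through the json module. The function name, its defaults and the sample values are made up for illustration. The keys mirror the AgentCreateArgs above.

```python
import json
from typing import Any, Dict, List


def build_mixed_instances_args(
    asg_name: str,
    launch_template_id: str,
    instance_types: List[str],
    on_demand_base: int = 0,
    on_demand_percentage: int = 50,
    version: str = "1",
) -> Dict[str, Any]:
    """Build arguments shaped like the update_auto_scaling_group payload above."""
    if on_demand_base < 0:
        raise ValueError("on_demand_base must not be negative")
    if not 0 <= on_demand_percentage <= 100:
        raise ValueError("on_demand_percentage must be between 0 and 100")
    if not instance_types:
        raise ValueError("at least one instance type is required")

    return {
        "AutoScalingGroupName": asg_name,
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateId": launch_template_id,
                    "Version": version,
                },
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": on_demand_base,
                "OnDemandPercentageAboveBaseCapacity": on_demand_percentage,
            },
        },
    }


# Hypothetical values; in the template these come from !Sub substitutions.
args = build_mixed_instances_args(
    asg_name="example-asg",
    launch_template_id="lt-00000000000000000",
    instance_types=["t3.small"],
)

# Serialising and parsing again confirms the payload is valid JSON.
payload = json.dumps(args, indent=2)
print(payload)
```

With a base capacity of 0 and 50% On-Demand above base, this corresponds to the 50/50 Spot and On-Demand split mentioned earlier.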

These resources effectively convert our EBS environment's ASG to use an LT instead of an LC and set a policy to deploy a percentage of Spot versus On-Demand instances, thus significantly lowering running costs. This approach survives EBS (re)deployments using CFN and maintains our "infrastructure-as-code".

Hope this helps... 👀

--belodetek 😬