Elastic Beanstalk vs. ECS vs. Kubernetes

Mar 22, 2017

This isn’t going to be a super-technical review of these three platforms, but rather a high-level overview of what to expect when working with each.

If you’re coming here with little knowledge of containerization or running container-based workloads, here’s a short gist.

Tools like Docker and rkt provide a way to run programs in containers, isolated from the rest of the system using Linux kernel features such as namespaces and control groups (cgroups). Running applications this way leads to better composability and organization, and helps build immutable infrastructure that’s more robust and easier to manage.

These tools alone are enough if you run a few containers on a single machine, link them together manually, and accept that if a container dies, you’ll need to intervene.

That pattern doesn’t work when you have multiple machines and lots of containers that need to communicate with one another, be updated reliably, and have traffic load balanced among them. For this, you need a layer of orchestration that can choose which machines to schedule containers on based on certain criteria (capacity, memory availability, CPU load, etc.), perform rolling deploys of containers while keeping a certain number running, manage network access between containers, allow containers to discover one another, and so on.
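To make the scheduling part of that concrete, here’s a minimal sketch of the placement decision an orchestrator makes. It’s a toy, not how ECS or Kubernetes actually schedule; the `Machine` class, the `schedule` function, the node names, and the "most free memory wins" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    # Hypothetical model of a cluster node and its remaining capacity.
    name: str
    free_memory_mb: int
    free_cpu_units: int
    containers: list = field(default_factory=list)


def schedule(machines, container_name, memory_mb, cpu_units):
    """Place a container on the fitting machine with the most free memory.

    Returns the chosen machine's name, or None if no machine has room.
    """
    candidates = [
        m for m in machines
        if m.free_memory_mb >= memory_mb and m.free_cpu_units >= cpu_units
    ]
    if not candidates:
        return None
    target = max(candidates, key=lambda m: m.free_memory_mb)
    # Reserve the resources so later placements see the reduced capacity.
    target.free_memory_mb -= memory_mb
    target.free_cpu_units -= cpu_units
    target.containers.append(container_name)
    return target.name


machines = [
    Machine("node-a", free_memory_mb=1024, free_cpu_units=512),
    Machine("node-b", free_memory_mb=2048, free_cpu_units=1024),
]

# Try to place four identical containers; the last one finds no room.
placements = [schedule(machines, f"web-{i}", 768, 256) for i in range(4)]
print(placements)
```

A real orchestrator layers rolling deploys, health checks, and service discovery on top of this, but the core question stays the same: which machine has room for this container right now?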

Elastic Beanstalk

In this case I’m referring to the Multi-container Elastic Beanstalk environment that can be chosen when setting up an EB environment. The original single-container environment goes against Docker best practices by forcing you to run your entire application from a single container, which is not how Docker was intended to be used. It will work, but it’s poorly suited to what most people have in mind when they want to use container technology.

Elastic Beanstalk (multi-container) is an abstraction layer on top of ECS (Elastic Container Service) with some bootstrapped features and some limitations:

Automatically interacts with ECS and ELB
Cluster health and metrics are readily available and displayed without any extra effort
Load balancer must terminate HTTPS and all backend connections are HTTP
Easily adjustable autoscaling and instance sizing
Manageable environment variables (also see Elastic Beanstalk Secrets as a Service)
Container logs are all collected in one place, but still segmented by instance – so in a cluster environment finding which instance served a request that logged some important data is a challenge
Can only set hard memory limits in container definitions
All cluster instances must run the same set of containers
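For a sense of what those container definitions look like, here’s a small sketch that builds a multi-container `Dockerrun.aws.json` (version 2) file. The container names, image names, and memory values are made up for illustration, and I’m only including the handful of fields I’m sure of; check the Elastic Beanstalk documentation for the full set.

```python
import json

# Hypothetical containers; names, images, and limits are illustrative only.
containers = [
    {"name": "web", "image": "example/web:latest", "memory": 512, "port": 80},
    {"name": "worker", "image": "example/worker:latest", "memory": 256, "port": None},
]


def container_definition(spec):
    """Build one entry of the containerDefinitions list."""
    definition = {
        "name": spec["name"],
        "image": spec["image"],
        "essential": True,
        "memory": spec["memory"],  # hard memory limit, in MiB
    }
    if spec["port"] is not None:
        definition["portMappings"] = [
            {"hostPort": spec["port"], "containerPort": spec["port"]}
        ]
    return definition


dockerrun = {
    "AWSEBDockerrunVersion": 2,
    "containerDefinitions": [container_definition(c) for c in containers],
}

# Print the document that would be saved as Dockerrun.aws.json.
print(json.dumps(dockerrun, indent=2))
```

Every instance in the environment runs everything listed in `containerDefinitions`, which is where the "same set of containers" limitation comes from.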

When you’re just getting started with Docker or container technology and your application is young, this can be a compelling solution. Docker images can still be pulled from public or private registries and, with some exceptions, running containers is fairly consistent with other platforms. That means there’s very little vendor lock-in when using Elastic Beanstalk to get things rolling.

The first problem you may encounter, though, is having to run the same set of containers on every cluster instance. It’s inconvenient not to be able to independently schedule, say, a replicated set of queue workers on the cluster; in exchange for the ease of use Elastic Beanstalk provides, you get a more primitive scheduler.

More troublesome, however, is the lack of a soft memory limit, which ECS supports but, strangely, Elastic Beanstalk does not. The problem this creates is that unless you know precisely how much memory your containers will use, you’re forced to either:

1) Err on the side of caution and give your containers more memory than they need, reducing the number of containers you can schedule onto an instance (your hard memory limits can’t exceed the memory for the instance type) or forcing you to upgrade your instance type to one with more memory.

2) Tweak the memory for each container in order to fit all the containers your app needs on a single instance (remember, each instance runs the same containers) and, when a container hits its hard limit, be prepared for it to be killed.
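The trade-off between those two options is just arithmetic, so here’s a quick sketch of it. The instance size and the per-container limits are hypothetical numbers, and `headroom_mb` is a helper I made up for this example.

```python
def headroom_mb(instance_memory_mb, hard_limits_mb):
    """Memory left on an instance after reserving every container's hard limit."""
    return instance_memory_mb - sum(hard_limits_mb.values())


# Hypothetical instance size. In practice the usable figure is lower,
# since the OS and other processes on the host need memory too.
INSTANCE_MEMORY_MB = 4096

# Option 1: pad every limit to be safe.
cautious = {"web": 1536, "worker": 3072, "proxy": 256}
# Option 2: trim limits until the whole set fits on one instance.
tight = {"web": 1024, "worker": 2048, "proxy": 256}

for label, limits in (("cautious", cautious), ("tight", tight)):
    spare = headroom_mb(INSTANCE_MEMORY_MB, limits)
    verdict = "fits" if spare >= 0 else "does not fit"
    print(f"{label}: {verdict} ({spare} MB headroom)")
```

With these numbers the cautious limits overshoot the instance by 768 MB, so you’d have to move to a larger instance type, while the tight limits fit but leave each container less room before it runs into its hard limit.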

Less problematic is dealing with log aggregation. What Elastic Beanstalk provides out of the box is usable for a short time, but it’s relatively simple to add a container that mounts read-only logs from other containers, tails the logs with Fluentd, and ships them to an Elasticsearch or InfluxDB cluster. This provides a searchable aggregate log stream that makes debugging and identifying errors a lot easier.
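To show why a single aggregate stream helps, here’s a toy sketch that merges per-instance logs into one time-ordered list and then searches it. This is plain Python standing in for the Fluentd and Elasticsearch setup described above, not an example of either; the instance IDs, log lines, and function names are invented for illustration.

```python
import heapq

# Hypothetical per-instance logs: (ISO 8601 timestamp, message), each
# list already in time order, as a tailed log file would be.
instance_logs = {
    "i-aaa": [
        ("2017-03-22T10:00:01Z", "GET /health 200"),
        ("2017-03-22T10:00:05Z", "POST /orders 500"),
    ],
    "i-bbb": [
        ("2017-03-22T10:00:03Z", "GET /orders 200"),
    ],
}


def tagged(instance_id, lines):
    """Yield log entries with the originating instance attached."""
    for timestamp, message in lines:
        yield (timestamp, instance_id, message)


def aggregate(logs_by_instance):
    """Merge the sorted per-instance streams into one time-ordered list."""
    streams = [tagged(i, lines) for i, lines in logs_by_instance.items()]
    return list(heapq.merge(*streams))


merged = aggregate(instance_logs)

# One search over the merged stream answers "which instance logged the error?"
errors = [entry for entry in merged if entry[2].endswith(" 500")]
print(errors)
```

Because every entry carries its instance ID, finding which instance served a failing request becomes a single query instead of a hunt through each instance’s log files.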

Continue to Part 2: ECS