Wow. What a lot of pain.
So let me first confirm what won’t work. Whether you use Apache 1, Apache 2, leader_only or EB_IS_COMMAND_LEADER testing, you cannot consistently get exactly one instance (not zero, and not two or more) to run a cron job over a long period of time.
The issue is that the leader information isn’t reliable. It’s only available during deployments, which is usually when you first set up the cron job. But when an instance is replaced, for whatever reason, there is no related deployment, so the test that checks whether the new instance is the leader fails, and you end up with no crons.
So for anything you do at the moment of deployment (database migrations, logging, whatever) that you only want to run once, even on multi-instance environments in an Elastic Beanstalk application, you can use and trust the leader tests available to you.
But for anything you want to run regularly, on a schedule across weeks and months, you cannot trust those options.
After a lot of research and trial and error, it comes down to two options.
Worker environments
The ‘official’ way is probably via a worker environment: essentially an instance set up specifically to schedule and run short- or long-lived processes.
You can learn more about them via the links below, but for the rest of this article we’ll be talking about the alternative option of ‘runtime leader testing’.
Runtime leader testing
This option means running the cron on every instance of an environment, but the first section of the code checks whether the id of the instance it’s running on is the first in the list of all instances. That is, only the instance whose id is returned first by the AWS Elastic Beanstalk API will continue to run the task.
The examples below are in PHP, but the method is valid for other languages too.
We’re assuming you’ve already set up a cron and know how to alter it to make it run on all instances of your environment.
The steps in this guide:
- Create an IAM user just for this process and give it limited read permissions
- Add the credentials to EB configuration
- Write code to check the instance id from the API against the current instance id
Adding the user
- Open IAM in AWS console
- Click Users
- Click Add users
- Type your User name – we used “[applicationCode]-eb-read-user”
- Click the ‘Access Key’ access type
- Click Next: Permissions
- Click the ‘Attach existing policies directly’ tab
- Type ‘AWSElasticBeanstalkReadOnly’ and select it
- NB: This permission actually gives more access than is required and you could trim it further
- Click Next: Tags
- Click Next: Review
- Click Create user
- Take a copy of the KEY and SECRET
Storing the key and secret in the environment
- Open your environment
- Click configuration
- Click ‘edit’ on the ‘Software’ section
- Add two new items, one for the key (examples use AWS_KEY) and one for the secret (examples use AWS_SECRET)
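Reading those two values back in PHP can be sketched as follows. This is only a minimal sketch, assuming the property names AWS_KEY and AWS_SECRET used above and assuming the values are visible to the process that runs your script; the helper name readEnvSetting is hypothetical. It tries getenv() first and falls back to $_ENV, because $_ENV can be empty when PHP’s variables_order setting leaves out E.

```php
<?php
// Hypothetical helper: returns the named environment value, or throws if it is missing.
function readEnvSetting(string $name): string
{
    // getenv() reads the process environment even when $_ENV is not populated.
    $value = getenv($name);
    if ($value === false || $value === '') {
        // Fall back to $_ENV in case the value was only placed there.
        $value = $_ENV[$name] ?? '';
    }
    if ($value === '') {
        // Fail loudly rather than calling AWS with empty credentials.
        throw new RuntimeException("Missing environment setting: {$name}");
    }
    return $value;
}
```

With this in place, readEnvSetting('AWS_KEY') and readEnvSetting('AWS_SECRET') could be used wherever the key and secret are needed, and a missing value shows up as an exception instead of a confusing authentication error.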
The code
So this is PHP, but the basic idea is:
- Install the AWS library – here we use Composer
- Add some code to call the library
- Compare the first returned instance id against the id of the instance running the code
Install the AWS library
$ composer require aws/aws-sdk-php
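Add the code to do the check
require 'vendor/autoload.php';

use Aws\ElasticBeanstalk\ElasticBeanstalkClient;

// Build a client with the read-only user's credentials
$client = new ElasticBeanstalkClient([
    'credentials' => [
        'key' => $_ENV['AWS_KEY'],
        'secret' => $_ENV['AWS_SECRET'],
    ],
    'region' => '[[your_region]]',
    'version' => 'latest',
]);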
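Replace [[your_region]] above with your own region. Then ask the API for the environment’s resources, replacing [[your_environment_name]] with the name of your environment.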
$result = $client->describeEnvironmentResources([
    'EnvironmentName' => '[[your_environment_name]]',
]);
Then load in the current instance’s id
$currentId = file_get_contents("http://instance-data/latest/meta-data/instance-id");
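If your instances are configured to require IMDSv2 (session tokens for the metadata service), a plain GET like the one above is rejected. Here is a hedged sketch of the token-based request. The function name fetchInstanceId and the $baseUrl parameter are illustrative and not part of the original code; 169.254.169.254 is the standard link-local address of the EC2 metadata service. It assumes allow_url_fopen is enabled.

```php
<?php
// Illustrative helper: returns this instance's id via IMDSv2, or null on failure.
function fetchInstanceId(string $baseUrl = 'http://169.254.169.254'): ?string
{
    // Step 1: request a short-lived session token with a PUT.
    $tokenContext = stream_context_create(['http' => [
        'method'  => 'PUT',
        'header'  => "X-aws-ec2-metadata-token-ttl-seconds: 60\r\n",
        'timeout' => 2,
    ]]);
    $token = @file_get_contents($baseUrl . '/latest/api/token', false, $tokenContext);
    if ($token === false || $token === '') {
        return null;
    }

    // Step 2: send the token along with the normal metadata request.
    $idContext = stream_context_create(['http' => [
        'method'  => 'GET',
        'header'  => "X-aws-ec2-metadata-token: {$token}\r\n",
        'timeout' => 2,
    ]]);
    $id = @file_get_contents($baseUrl . '/latest/meta-data/instance-id', false, $idContext);

    // Treat an empty or failed response as "unknown" so the caller can skip the task.
    return ($id === false || $id === '') ? null : trim($id);
}
```

Returning null on failure lets the cron script decide to do nothing when it cannot work out which instance it is running on, which seems safer than guessing.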
Then compare the API’s first result to the current one. If they are the same, do something; otherwise don’t.
if (trim($currentId) === $result['EnvironmentResources']['Instances'][0]['Id'])
{
    // do something. Only one instance will get here
}
else
{
    // don't do anything. All other instances will be here
}
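The check above relies on every instance seeing the same entry first in the list. If you would rather not depend on the order in which the API returns instances (whether that order is stable is an assumption here, not something verified), one variation is to sort the ids and elect the alphabetically first one. A minimal sketch, with a hypothetical helper name and made-up instance ids:

```php
<?php
// Hypothetical helper: true when $currentId is the alphabetically first id in the list.
function isElectedInstance(string $currentId, array $instances): bool
{
    // Pull the 'Id' field out of each entry, shaped like
    // $result['EnvironmentResources']['Instances'].
    $ids = array_column($instances, 'Id');
    if ($ids === []) {
        return false; // no instances reported, so nobody is elected
    }
    // Sort so that every instance picks the same winner regardless of list order.
    sort($ids, SORT_STRING);
    return trim($currentId) === $ids[0];
}

// Example with made-up instance ids:
$instances = [['Id' => 'i-0bbb'], ['Id' => 'i-0aaa']];
var_dump(isElectedInstance('i-0aaa', $instances)); // bool(true)
var_dump(isElectedInstance('i-0bbb', $instances)); // bool(false)
```

Keeping the decision in a small function that takes plain values also makes it easy to test without calling AWS.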
Other useful resources:
https://rotaready.com/blog/scheduled-tasks-elastic-beanstalk-cron