Automatic backup of log files to S3 from EC2 instances

I’ve recently been deploying a service by launching new EC2 instances running the new version and terminating the old instances running the previous version. This has been working pretty well for me, as I want the instances replaced at some point anyway. There has, however, been one big drawback to this approach – losing my service’s log files, which I might want for troubleshooting and analytics.

I decided to back up my log files by copying them to S3. It was not obvious how to do that, however, as I’m using logrotate, rotating over a set of files. The problem is that the log filenames are static when logrotate is run with default options, while the content changes with every rotation. At first I considered the dateext directive, which adds the current date to the filenames, but in the end I figured it would be easier to just add a rotate hook to logrotate.

Creating a logrotate hook is very simple: just add the postrotate/endscript directive with the script or commands you want to run. You’ll also want to add sharedscripts to make sure the script is only executed once per rotation, rather than once per rotated file.

My logrotate file (/etc/logrotate.d/my_service) for my service looks like this:
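A minimal sketch of such a file – the rotation frequency, retention count, and the `log.1` argument to the upload script are assumptions; the essential parts for this setup are sharedscripts, delaycompress, and the postrotate hook:

```
/var/log/my_service/*.log {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        /etc/my_service/upload_log_to_s3.sh log.1
    endscript
}
```

With delaycompress, the most recently rotated file (.log.1) is left uncompressed, which is why the upload script has to gzip it itself.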

But what about the current log file that has not been rotated yet? If the instance were terminated, the latest log file would not be uploaded.

The solution is to use a shutdown hook for the system. On Ubuntu it is very easy to add a shutdown hook using upstart.

The upstart job config file (/etc/init/shutdown-hook.conf) should look something like this:
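A sketch of what that job can look like – the exact stanzas are assumptions, with runlevels 0 and 6 covering halt and reboot, and the current log file’s extension (log) passed as the script’s argument:

```
description "Upload the current log file to S3 on shutdown"

start on runlevel [06]

task

exec /etc/my_service/upload_log_to_s3.sh log
```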

Finally I just needed the script (/etc/my_service/upload_log_to_s3.sh) for uploading a log file to S3:
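A sketch of such a script, pieced together from the description below and the comments; the bucket name and timestamp format are hypothetical:

```shell
#!/bin/bash
# /etc/my_service/upload_log_to_s3.sh
# Usage: upload_log_to_s3.sh <extension>
#   "log"   – upload the current log file
#   "log.1" – upload the most recently rotated log file

timestamp_name() {
    # Name the S3 object after the current UTC time so uploads never collide
    date -u +%Y-%m-%dT%H-%M-%SZ
}

if [ $# -eq 1 ]; then
    log_file_ext=$1
    # Compress the file ourselves – with delaycompress the .log.1 file
    # is still uncompressed when this script runs
    gzip -c /var/log/my_service/*."$log_file_ext" > /tmp/log.gz
    # Hypothetical bucket name; requires aws-cli and instance credentials
    aws s3 cp /tmp/log.gz "s3://my-log-bucket/my_service/$(timestamp_name).log.gz"
fi
```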

The upload script just gzips the log file (needed as I’m using delaycompress), renames it to the current timestamp, and uploads it using aws-cli. The script’s argument sets the file extension of the log file, which makes it possible to upload both the current log file (.log) and the previous one (.log.1).

This is of course not the best solution for all use cases, but I think this is really simple to implement. I hope you find it useful.

This Post Has 4 Comments

  1. How do you make sure the file is removed after you upload it?
    Also, you’re uploading the log file on shutdown (which is fine) but how do you make sure you don’t upload the same data again after the machine reboots and logs continue to accumulate?

  2. Instead of executing just the upload script on shutdown, you should execute “logrotate -f /etc/logrotate.d/my_service” to ensure the same data won’t be uploaded after the server is restarted.

1. The same server won’t be restarted – EC2 instances are ephemeral. You toss them away and deploy fresh new ones.

  3. Hey,

    what does “log_file_ext=$1” mean in “/etc/my_service/upload_log_to_s3.sh”.

    Also, if my log file is for example: /var/log/app/application.log, would this line “gzip -c /var/log/my_service/*.$log_file_ext > /tmp/log.gz” be “gzip -c /var/log/app/application.$log_file_ext > /tmp/log.gz”. ?

    We have the same use case of sending logs before a machine shuts down to s3. Please help.

    Thanks,
