How to create server-less deploys on AWS

The dream of a server-less deploy where everything scales dynamically is coming true. No more servers, no more configuration, no more updates. Just deploy your application and watch it grow with demand. This post is about a pet project of mine that is currently hosted on AWS Lambda and Amazon API Gateway. The storage is provided by Amazon S3 and DynamoDB. All of these are managed services provided by AWS. In short, it’s about server-less deploys on AWS.

I read a blog post by Nick McHardy on his experience of API Gateway & Lambda and this inspired me to share my findings after having worked with these services for a few months.

Nick points out many of the good things about API Gateway and Lambda as well as some of the pain points. It is important to remember, however, that API Gateway is in its early stages, and I’m sure that more features will be added and that many of the pain points will be addressed by Amazon in the coming months. With AWS re:Invent coming up in just a few days I kind of hope that this blog post will have to be updated soon. That said, let’s move on to my experience.

The Challenge

A few months back I was working on a project where I had to implement a SCIM client. SCIM is short for System for Cross-domain Identity Management. If you haven’t heard of SCIM you can find out more at simplecloud.info. A major blocker during development was the lack of a server to run tests against so we ended up creating a simple mock server. The server part of SCIM is basically a CRUD API so I thought, what if I could create a SCIM server and make it publicly available so that anyone could use it as a test server when implementing clients?

The challenge was not to implement the server but to run it without the need for maintenance and without fixed fees; if no one used the API I didn’t want to be stuck with a monthly bill.
AWS is quite affordable for a small project. The cheapest instances start at about $10 per month, but that’s still a monthly fixed fee.

AWS Managed Services

AWS provides an array of managed services. Managed services means little or no maintenance for you. Most of these services scale dynamically as demand increases and there are few fixed fees. Instead, you pay for usage.

In order to deploy my SCIM server I needed a public API and a static website. API Gateway, AWS Lambda, Amazon DynamoDB, AWS IAM and Amazon S3 would fit my needs, and all of them are managed services provided by Amazon.

Let’s take a quick look at these services and what they do:

  • Amazon API Gateway is a REST-oriented HTTP proxy for your backend. The backend can run on Lambda, EC2 or your own external host. API Gateway proxies requests according to the request/response mappings that you define per resource and method. It scales dynamically as the request rate grows and provides throttling, albeit with limited options.
  • AWS Lambda is compute capacity on demand. You deploy your code and pay per 100ms worth of execution time in relation to the amount of memory you allocate to the process. As an example, $1 per month with AWS Lambda gives you almost 2 invocations per second at 100ms/128 MB RAM. Each process runs in its own isolated sandbox and it doesn’t cost you a dime when no processes are running. It scales dynamically as the invocation rate grows.
  • Amazon DynamoDB is a NoSQL database with support for index-based queries. The storage allocation is dynamic. IOPS are provisioned in advance but are easy to increase and decrease. And it is fast. Single-digit millisecond fast.
  • Amazon S3, or Simple Storage Service, is a highly distributed, highly scalable key/value store. Perfect for blob storage and it has some pretty advanced features, some of which I’ll mention later.
  • AWS IAM is Identity and Access Management for your AWS resources.
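
The Lambda cost example in the list above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the two prices are assumptions based on the list prices published around the time of writing, the free tier is ignored, and every invocation is assumed to run for exactly 100 ms at 128 MB. Under those assumptions the compute charge alone allows roughly 1.85 invocations per second for $1 per month, and roughly half that once the per-request charge is included.

```javascript
// Assumed prices (not taken from the post); check the current AWS price list.
const PRICE_PER_100MS_AT_128MB = 0.000000208; // USD, compute charge only
const PRICE_PER_REQUEST = 0.0000002;          // USD, i.e. $0.20 per million requests
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60;  // 30-day month

// Sustained invocations per second that a monthly budget buys, assuming
// each invocation is billed for exactly one 100 ms unit at 128 MB.
function invocationsPerSecond(budgetUsd, includeRequestCharge) {
  const costPerInvocation =
    PRICE_PER_100MS_AT_128MB + (includeRequestCharge ? PRICE_PER_REQUEST : 0);
  return budgetUsd / costPerInvocation / SECONDS_PER_MONTH;
}

console.log(invocationsPerSecond(1, false).toFixed(2)); // compute charge only
console.log(invocationsPerSecond(1, true).toFixed(2));  // compute + request charge
```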

Provisioning the Infrastructure

I really like automation, so much so that I consider the AWS Console to be a read-only tool with very few exceptions. My goal was to provision every piece of the infrastructure with CloudFormation.

CloudFormation works great for Lambda, DynamoDB, S3 and IAM but there is no support for API Gateway yet. The only ways to interact with API Gateway are via the AWS Console or the REST API. The REST API is quite difficult to work with if you don’t have an SDK, so I decided to manage this manually. I could have created CloudFormation custom resources and used the REST API directly but I decided against that. The SDKs will hopefully provide API Gateway support soon so that this too can be automated.

API Gateway

Setting up API Gateway manually is tedious at best. There is a lot of duplication between resources and methods and not much in terms of inheritance of master configuration. You end up doing a lot of copy & paste and when you realize that you have to make one small change to 20 methods it is frustrating. It’s doable though. There are only so many configuration options you need but I would personally not use this in a real application until I can automate the whole setup.

I did find a few useful things while setting up this API:

Create a template method
Configure one method exactly the way you like it, think it over one extra time, and then create the other methods with the first one as a template. It might not fit all your use cases perfectly but it will save a lot of time if you get it right from the start.

Think about your Lambdas’ input data
Your Lambdas probably need data from the initial request. Think about this model early on so that you map the requests accordingly from the start. I rebuilt the mappings several times before I was happy and this was time-consuming.
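
To make the idea of an input model concrete, here is a minimal sketch of a helper that a Lambda could use to normalise the mapped event. The event shape (method, id, query, body) is purely hypothetical and is not something API Gateway produces by itself; it is whatever your request mapping is configured to emit.

```javascript
// Hypothetical event shape produced by the request mapping. None of these
// field names come from API Gateway itself; they are defined by your mapping.
function parseRequest(event) {
  if (!event || typeof event.method !== 'string') {
    throw new Error('Mapped event is missing "method"');
  }
  return {
    method: event.method.toUpperCase(),
    id: event.id || null,     // path parameter, if the resource has one
    query: event.query || {}, // query string parameters
    body: event.body || null  // request payload for POST/PUT
  };
}

console.log(parseRequest({ method: 'get', id: '42' }));
```

Agreeing on one shape like this for every method means the mapping can be copied between methods unchanged.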

Mapping error codes to HTTP Status
The same way your Lambdas want input data, they also want to respond.
Managing successful requests isn’t much of a problem but the errors can be.
I decided to return errors as a string with the HTTP status code enclosed in brackets, e.g. “[400] Bad request”. This allowed me to match the string against the following regexp in the Integration Response mapping and set the HTTP status to 400: “\[400\] .*”. The whole message was returned as the body, which gives the client some context.
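
The convention is easy to try out locally. The sketch below uses hypothetical helper names and a JavaScript regular expression that mirrors the pattern above; API Gateway evaluates its own pattern, so treat this as an illustration of the convention rather than of the service’s matching rules. Falling back to 500 for strings without a bracketed code is also an assumption.

```javascript
// Build an error string in the "[<status>] <message>" convention.
function formatError(status, message) {
  return '[' + status + '] ' + message;
}

// JavaScript equivalent of the pattern configured for the 400 response.
const BAD_REQUEST = /^\[400\] .*$/;

// Generic variant that extracts the status code from any such string.
// Assumption: strings without a bracketed code are treated as 500.
function statusOf(errorMessage) {
  const match = /^\[(\d{3})\] /.exec(errorMessage);
  return match ? Number(match[1]) : 500;
}

const message = formatError(400, 'Bad request');
console.log(message, BAD_REQUEST.test(message), statusOf(message));
```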

Use stages!
You can deploy your API to different stages, which are different versions available at different URLs. Use this feature. API Gateway is quite difficult to configure and it’s easy to make mistakes. Stages can save you from a production disaster.

Lambda

When creating a Lambda function you have to provide source code that it can run.
You can either prepare a package on S3 or provide inline code in your template. I chose the latter and added placeholder code to each Lambda function:
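
A placeholder of this kind only has to be valid enough for the function to be created. The sketch below is an illustration and not necessarily the code from the template; it assumes a Node.js runtime with the callback-style handler signature, and it borrows the bracketed status convention described earlier for its error message.

```javascript
// Placeholder handler: it exists only so that the function can be created.
// The real code replaces it on the first deploy.
function handler(event, context, callback) {
  callback(new Error('[501] Not implemented'));
}

exports.handler = handler;
```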

Once the infrastructure was created I could update all Lambdas with my deploy script.

Deploying updates

API Gateway

I’ve already mentioned stages in API Gateway. When API Gateway can be automated it will be very interesting to see how to manage changes but I think that will have to be a separate blog post.

Lambda

Updating Lambda via the aws-cli is pretty straightforward.
You can specify a local zip archive or upload the archive to S3:
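
With the aws-cli this is the update-function-code command, which takes either --zip-file or --s3-bucket and --s3-key (plus --s3-object-version for a versioned object). The sketch below shows the same two options as request parameters for the underlying UpdateFunctionCode operation. The helper names and the function, bucket and key names are hypothetical.

```javascript
// Request parameters for Lambda's UpdateFunctionCode operation.
// Function, bucket and key names used below are hypothetical.

// Option 1: send the zip archive itself (zipBuffer holds the archive bytes).
function paramsFromZip(functionName, zipBuffer) {
  return { FunctionName: functionName, ZipFile: zipBuffer };
}

// Option 2: point Lambda at an archive on S3, optionally at a specific
// object version if the bucket has versioning enabled.
function paramsFromS3(functionName, bucket, key, objectVersion) {
  const params = { FunctionName: functionName, S3Bucket: bucket, S3Key: key };
  if (objectVersion) {
    params.S3ObjectVersion = objectVersion;
  }
  return params;
}

console.log(paramsFromS3('scim-users-get', 'my-deploy-bucket', 'builds/package.zip'));
```

With the AWS SDK for JavaScript, an object like this would be passed to the Lambda client’s updateFunctionCode method.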

I recommend using S3 because this allows you to either create a new package for each update or version the objects directly on S3 and specify which version you want to deploy. Regardless of which versioning strategy you choose, you will be able to roll back a bad Lambda deploy if you use S3.

What about DynamoDB and S3?

I mentioned that I also use DynamoDB and S3.
DynamoDB is used as a metadata store for S3 objects. It provides the means to look up S3 object keys by specific data fields.
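
As an illustration of the metadata idea, the sketch below builds one metadata record per S3 object and the parameters for a lookup on a secondary index. The attribute names, the index name and the table name are all hypothetical; the parameter object follows the shape that the AWS SDK for JavaScript’s DynamoDB DocumentClient accepts for a query.

```javascript
// Hypothetical metadata record: one DynamoDB item per JSON document on S3.
function metadataItem(resourceId, userName, s3Key) {
  return { id: resourceId, userName: userName, s3Key: s3Key };
}

// Query parameters for a hypothetical secondary index keyed on userName.
// The result items would carry the s3Key of the matching documents.
function queryByUserName(tableName, userName) {
  return {
    TableName: tableName,
    IndexName: 'userName-index',
    KeyConditionExpression: '#u = :u',
    ExpressionAttributeNames: { '#u': 'userName' },
    ExpressionAttributeValues: { ':u': userName }
  };
}

console.log(metadataItem('2819c223', 'bjensen', 'users/2819c223.json'));
console.log(queryByUserName('scim-metadata', 'bjensen'));
```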

I used S3 both for static web hosting and cheap blob storage. All data created by the API is stored as JSON documents in S3. To further reduce cost I configured my bucket to delete anything older than 48 hours (it is a test API after all) and to store the data using Reduced Redundancy storage only. Reduced Redundancy storage gives 99.99% durability instead of the standard “eleven nines”.
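
Both settings can be expressed as plain request parameters. The sketch below assumes the shapes used by the AWS SDK for JavaScript’s S3 client (putObject and putBucketLifecycleConfiguration). The bucket name, the rule ID and the data/ prefix are hypothetical; the prefix is only there to show how expiration can be limited to the API data so that the static website files are left alone.

```javascript
// Parameters for storing one JSON document using Reduced Redundancy storage.
function putObjectParams(bucket, key, document) {
  return {
    Bucket: bucket,
    Key: key,
    Body: JSON.stringify(document),
    ContentType: 'application/json',
    StorageClass: 'REDUCED_REDUNDANCY'
  };
}

// Lifecycle configuration that expires objects under a prefix after two
// days (48 hours). S3 expresses this kind of expiration in whole days.
function lifecycleParams(bucket, prefix) {
  return {
    Bucket: bucket,
    LifecycleConfiguration: {
      Rules: [{
        ID: 'expire-test-data',
        Status: 'Enabled',
        Filter: { Prefix: prefix },
        Expiration: { Days: 2 }
      }]
    }
  };
}

console.log(putObjectParams('my-scim-bucket', 'data/users/1.json', { userName: 'bjensen' }));
console.log(JSON.stringify(lifecycleParams('my-scim-bucket', 'data/'), null, 2));
```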

Conclusion

API Gateway in combination with AWS Lambda works perfectly for hosting and managing APIs.
There are a few constraints and caveats that are good to keep in mind when deciding whether to go this route.

Lambda

  • At the time of writing, Lambda only supports Java and Node.js.
  • It doesn’t work well with RDS. Lambda cannot access VPC resources so you’ll have to expose your database to, at least, Lambda’s IP ranges. On the other hand, DynamoDB access can be managed with IAM.
  • If you run Node.js you may want to lock the aws-sdk version and bundle it with your deploy. Amazonians are people too and, although it doesn’t happen often, they sometimes release bugs. I’m not sure how this behaves with Java.

API Gateway

  • The lack of support for automation is a barrier for larger deployments. I expect this to be addressed shortly because it’s really a showstopper.
  • Throttling can only be set globally for a method. It doesn’t support key-specific settings. This feature would make for a great way to offer different subscription levels to your end users and prevent a single client from blocking your API.
  • All APIs are public. There is no way to restrict your API to an IP range. This functionality would be useful for test environments etc.

If you are interested in the scripts I use for CloudFormation and deploys to Lambda, let me know. And don’t forget to check out my project at scimify.com.

Update:
A colleague pointed out that API Gateway offers Swagger support for API management. I will look into this and come back to you with my findings.

 

This blog post is part of a series of posts on server-less deploys on AWS using API Gateway and AWS Lambda.

2 Comments

  1. Luis

    Hello , what should i include in my zip? only the lambdafunction.js ???

    • Hi Luis,

      You should include all source code including the node_modules directory.

      Assuming that your structure looks something like
      src/index.js
      src/lib/…
      node_modules/
      package.json

      I would package like this:
      zip -r package.zip src/* src/*/** node_modules/*/**

      Your lambda should invoke src/index.handler (or whatever it exports)
