AWS Elasticsearch JavaScript Client

I have spent some time working with the AWS Elasticsearch Service lately. Regrettably, I found the threshold before being productive was higher than I anticipated. One of my obstacles was to get an AWS Elasticsearch JavaScript client working inside an AWS Lambda function, so I thought I’d better make a note of my solution in case I run into a similar problem in the future.

Elasticsearch IAM Policy Document

Before looking at the client implementation, we need to make sure that the client is allowed to access the Elasticsearch domain. As always, this requires that the client is associated with an IAM Policy Document. Adhering to the AWS principle of least privilege, the policy is as strict as possible.
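A minimal sketch of such a policy is shown below, written as a JavaScript object so that it can be serialized with JSON.stringify. The region, the account ID 123456789012 and the domain name my-domain in the ARN are placeholders chosen for illustration, not values from a real account.

```javascript
// Sketch of an IAM policy document that grants HTTP access to one Elasticsearch domain.
// Assumption: region, account ID and domain name are placeholders; replace them
// with the values of your own domain.
const elasticsearchPolicy = {
    Version: '2012-10-17',
    Statement: [
        {
            Effect: 'Allow',
            // The trailing * matches every HTTP method action (ESHttpGet, ESHttpPut, ...)
            Action: 'es:ESHttp*',
            // Restricts the statement to the resources of a single domain
            Resource: 'arn:aws:es:eu-west-1:123456789012:domain/my-domain/*'
        }
    ]
};

// Print the JSON form that IAM expects
console.log(JSON.stringify(elasticsearchPolicy, null, 2));
```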

The * character at the end of the es:ESHttp* value implies that all HTTP methods are allowed. You may choose to lock down the policy even further. One example is to use "es:ESHttpGet" to permit only reading data from Elasticsearch. Another example is ["es:ESHttpPost", "es:ESHttpPut"] for clients that only add data to the domain. Finally, the Resource property tells us that the policy statement only affects the Elasticsearch domain with the specified ARN.

Elasticsearch Client

My first naive attempt was to use an HTTP client to make requests to the Elasticsearch HTTP API of my domain. It failed miserably, because AWS requires that HTTP requests are signed with Signature Version 4 to be valid. The AWS SDK handles this internally, so usually you do not need to bother. Realizing that, I took a closer look at what functionality the ES class in the AWS JavaScript SDK offers. It does indeed provide an Elasticsearch API, but it is all about domain configuration and management, and it does not provide any client features. Next, when I studied the AWS Elasticsearch developer guide, I found a JavaScript client snippet. It had some limitations in my opinion: it uses global variables for request configuration, and its response handling just logs the HTTP status code and response body. For this reason, I chose to rewrite it to a more generic elasticsearch-client.js file:
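A minimal sketch of such a file could look like the following. It assumes version 2 of the AWS SDK for JavaScript (the aws-sdk package) and that the Lambda runtime provides AWS_REGION. The factory name createClient and the parameter names httpMethod, requestPath and payload are assumptions made for illustration.

```javascript
// elasticsearch-client.js (sketch)
// Assumptions: AWS SDK for JavaScript v2 ('aws-sdk') is installed, and the
// environment provides ELASTICSEARCH_DOMAIN and AWS_REGION.

function createClient({
    AWS = require('aws-sdk'), // injectable so that the SDK can be stubbed in tests
    domain = process.env.ELASTICSEARCH_DOMAIN,
    region = process.env.AWS_REGION
} = {}) {
    const endpoint = new AWS.Endpoint(domain);
    const httpClient = new AWS.HttpClient();
    // Reads AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN,
    // which the Lambda runtime sets for the execution role of the function
    const credentials = new AWS.EnvironmentCredentials('AWS');

    // httpMethod, requestPath and payload are illustrative parameter names
    return function sendRequest({ httpMethod, requestPath, payload }) {
        const request = new AWS.HttpRequest(endpoint, region);
        request.method = httpMethod;
        // Plain string concatenation; path.join is not meant for URLs
        request.path = '/' + String(requestPath).replace(/^\/+/, '');
        request.headers['host'] = domain;
        if (payload !== undefined) {
            request.body = JSON.stringify(payload);
            request.headers['Content-Type'] = 'application/json';
            request.headers['Content-Length'] = Buffer.byteLength(request.body);
        }

        // Sign the request with Signature Version 4 for the 'es' service
        const signer = new AWS.Signers.V4(request, 'es');
        signer.addAuthorization(credentials, new Date());

        return new Promise((resolve, reject) => {
            httpClient.handleRequest(request, null, (response) => {
                const { statusCode, statusMessage, headers } = response;
                let text = '';
                response.on('data', (chunk) => { text += chunk; });
                response.on('end', () => {
                    const result = { statusCode, statusMessage, headers };
                    if (text) {
                        // Most endpoints answer with JSON; keep the raw text otherwise
                        try { result.body = JSON.parse(text); } catch (e) { result.body = text; }
                    }
                    resolve(result);
                });
            }, reject);
        });
    };
}

module.exports = createClient;
```

Passing the SDK in as an optional argument is a design choice of this sketch: it keeps the signing and response handling in one place while allowing the function to be exercised without network access.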

Example Usage

The above implementation lets you call any method in the Elasticsearch HTTP API. The only missing part is an environment variable called ELASTICSEARCH_DOMAIN that should have the value of your AWS hosted Elasticsearch domain, such as my-domain-qwertyasdf.eu-west-1.es.amazonaws.com. To create a new Elasticsearch index called my-index, you call the function with the parameters required by the corresponding Create Index API:
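As a sketch, such a call could look like the following. The request function is passed in as an argument to keep the snippet self-contained. The helper name createIndex, the parameter names httpMethod, requestPath and payload, and the shard and replica counts are assumptions chosen for illustration.

```javascript
// Assumption: sendRequest is the request function provided by elasticsearch-client.js
// and accepts an object with httpMethod, requestPath and payload (illustrative names).
function createIndex(sendRequest, indexName) {
    return sendRequest({
        httpMethod: 'PUT',       // the Create Index API is PUT /<index>
        requestPath: indexName,
        payload: {
            settings: {
                // Example values only; tune them for your own cluster
                index: { number_of_shards: 3, number_of_replicas: 2 }
            }
        }
    });
}

// Possible wiring inside a Lambda function:
// createIndex(sendRequest, 'my-index').then(console.log).catch(console.error);
```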

And the result may look something like:
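As an illustration only (this is not output captured from a real domain), a successful call could resolve with an object shaped like the one below. The header value and the body fields are assumptions; the exact body depends on the Elasticsearch version.

```javascript
// Illustrative shape of a resolved result; the values are hypothetical
const exampleResult = {
    statusCode: 200,
    statusMessage: 'OK',
    headers: { 'content-type': 'application/json; charset=UTF-8' },
    // Elasticsearch acknowledges index creation in the response body
    body: { acknowledged: true, shards_acknowledged: true }
};

// A caller would typically check the status code and the acknowledgement
const created = exampleResult.statusCode === 200 && exampleResult.body.acknowledged === true;
console.log(created);
```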

Considerations

  • The Elasticsearch client above returns a Promise. Timeouts and unknown domain URLs result in Promise.reject(), whereas any completed HTTP request/response results in Promise.resolve(). The resolved JavaScript object has three or four properties, namely the HTTP statusCode, the HTTP statusMessage, the HTTP headers, and the body in case there is an HTTP response body. Consequently, the promise is also resolved for 4XX client errors (e.g. 404 Not Found) and 5XX server errors (e.g. 503 Service Unavailable). Feel free to modify the code to reject the promise on HTTP errors if you prefer such behaviour.
  • The client uses the AWS.EnvironmentCredentials class for obtaining valid credentials since it is being deployed as part of a Lambda function. This is not the only Node.js runtime environment and for this reason this is not the only credential class in the SDK. Please study the Setting Credentials in Node.js chapter in the AWS JavaScript developer guide for other alternatives.
  • A different approach to connect to an AWS Elasticsearch domain is to use the official Elasticsearch JavaScript client. Like my HTTP client attempt, it cannot be used directly since it does not have the AWS Signature Version 4 capability. However, it has a pluggable architecture and there is a community extension called http-aws-es that solves this problem. I have not tried this method, but they are both available as npm dependencies. Please check elasticsearch and http-aws-es for more information.
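Regarding the first consideration, rejecting on HTTP errors does not require changing the client itself. A minimal sketch, assuming the request function resolves with an object that has statusCode and statusMessage properties, is a small wrapper. The name rejectOnHttpError is hypothetical.

```javascript
// Wraps a request function so that 4XX and 5XX responses reject instead of resolve.
// Assumption: sendRequest returns a Promise of { statusCode, statusMessage, headers, body }.
function rejectOnHttpError(sendRequest) {
    return async function (params) {
        const response = await sendRequest(params);
        if (response.statusCode >= 400) {
            const error = new Error(`${response.statusCode} ${response.statusMessage}`);
            error.response = response; // keep the full response for callers that need it
            throw error;
        }
        return response;
    };
}
```

Callers can then handle HTTP errors in a catch block and inspect error.response when they need the status code or the response body.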

Mattias Severson

Mattias is a senior software engineer specialized in backend architecture and development with experience of cloud based applications and scalable solutions. He is a clean code proponent who appreciates Agile methodologies and pragmatic Test Driven Development. Mattias has experience from many different environments, including everything between big international projects that last for years and solo, single day jobs. He is open-minded and curious about new technologies. Mattias believes in continuous improvement on a personal level as well as in the projects that he is working on. Additionally, Mattias is a frequent speaker at user groups, companies and conferences.

This Post Has 4 Comments

  1. Thanks a lot, you saved me a lot of time. Nice and well explained article!

  2. I really liked your implementation, it also pointed me to some valuable resources, thanks!
    Right now I’m debating whether I should query my ES domain from a Lambda function with AWS API Gateway or have the client (front-end/browser) app send the network request directly. Do you have any thoughts on the matter?

    PS: For this use case, and as you tightly control your function implementation, it is fine to use path.join to join URLs. Nonetheless, it is not recommended, as pointed out here https://stackoverflow.com/questions/16301503/can-i-use-requirepath-join-to-safely-concatenate-urls

    1. @Alejandro: Thanks for the link and for your comments.

      Regarding your question, I have only been involved in projects where Elasticsearch has been used as part of a server side application and not directly accessible by clients (cf. a database or a Redis cache). That said, it is not uncommon to see ES directly behind NGINX, which in turn receives client requests, and I can imagine a similar solution where API Gateway and Lambda are used as a reverse proxy instead of NGINX. Chances are that you already have some authentication / authorization flow in place that you can leverage if you configure a reverse proxy. Moreover, a reverse proxy typically has some kind of rate limiting or throttling that can be enabled to protect your service in case of DDoS attacks.

      With a plain ES service you need to carefully tailor the ES access management, at least for the parts that modify the stored data and the index configuration. Presumably, you would like to expose the Search API to your users, but probably neither the Indices API nor the Documents API, to name a few. Additionally, Kibana is prebuilt with the AWS Elasticsearch service, i.e. it will also be available for clients (and the rest of the Internet) by default unless you take some action. That said, you can configure IAM policies on ES resource level, see the Policy Element Reference documentation in the Amazon Elasticsearch Service Developer Guide.
