Serving a web application to a global audience requires deploying, hosting and scaling it on reliable cloud infrastructure. Heroku is a cloud platform as a service (PaaS) that supports many server-side languages (e.g., Node.js, Go, Ruby and Python), monitors application status in a beautiful, customizable dashboard and maintains an add-ons ecosystem for integrating tools/services such as databases, schedulers, search engines, document/image/video processors, etc. Although it is built on AWS, Heroku is simpler to use than AWS. It automatically provisions resources and configures low-level infrastructure, so developers can focus exclusively on their application without the additional headache of manually setting up each piece of hardware, installing an operating system, runtime environment, etc.

When you deploy to Heroku, its build system packages the application's source code and dependencies together with a language runtime, using a buildpack and slug compiler to generate a slug: a highly optimized and compressed version of your application. Heroku loads the slug onto a lightweight container called a dyno. Depending on your application's resource demands, it can be scaled horizontally across multiple concurrent dynos. These dynos run on a shared host, but the dynos responsible for running your application are isolated from dynos running other applications.

Initially, your application will run on a single web dyno, which serves your application to the world. If a single web dyno cannot sufficiently handle incoming traffic, then you can always add more web dynos. For requests that take longer than 500 ms to complete, such as uploading media content, consider delegating the expensive work to a worker dyno as a background job. Worker dynos process these jobs from a job queue and run asynchronously from the web dynos, freeing up the web dynos' resources to handle incoming requests.

Below, I'm going to show you how to deploy a Node.js and PostgreSQL application to Heroku.

Getting Started#

First, let's download the Node.js application by cloning the project from its GitHub repository:
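(The repository URL below is a placeholder, not the project's actual address; substitute the real URL.)

```bash
# Hypothetical URL — replace with the project's actual GitHub repository
git clone https://github.com/<username>/<repository>.git
cd <repository>
```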

Let's walk through the architecture of our simple Node.js application. It is a multi-container Docker application, orchestrated by Docker Compose, that consists of three services: an Express.js server, a PostgreSQL database and pgAdmin. The PostgreSQL database and pgAdmin containers are spun up from the postgres and dpage/pgadmin4 images respectively; aside from an initialization script for the database (covered below), these images do not need any additional modifications.

(docker-compose.yml)
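The manifest looks roughly like the sketch below; the service names, ports, credentials and database name are illustrative assumptions, not the project's actual values:

```yaml
version: "3.8"
services:
  api:
    build: ./api
    environment:
      PORT: 8000
      DATABASE_URL: postgres://postgres:postgres@db:5432/squirrels
      CLIENT_APP_URL: http://localhost:3000
    ports:
      - "8000:8000"
    volumes:
      # Mount the source code for live reloading during development
      - ./api:/usr/src/app
      # Keep the image's installed dependencies from being shadowed
      - /usr/src/app/node_modules
    depends_on:
      - db
  db:
    build: ./db
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: squirrels
  pgadmin:
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: admin
    ports:
      - "5050:80"
    depends_on:
      - db
```

Note the volume that mounts the api source code into its container; this detail matters when we adapt the setup for Heroku later.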

The Express.js server, which resides in the api subdirectory, connects to the PostgreSQL database via the pg PostgreSQL client. The module api/lib/db.js defines a Database class that establishes a reusable pool of clients upon instantiation for efficient memory consumption. The connection string URI follows the format postgres://[username]:[password]@[host]:[port]/[db_name], and it is accessed from the environment variable DATABASE_URL. Anytime a controller function (the callback argument of the methods app.get, app.post, etc.) calls the query method, the server connects to the PostgreSQL database via an available client from the pool. Then, the server queries the database, directly passing the arguments of the query method to the client.query method. Once the database sends the requested data back to the server, the client is released back to the pool, available for the next request to use. Additionally, there's a getAllTables method for retrieving low-level information about the tables available in our PostgreSQL database. In this case, our database only contains a single table: cp_squirrels.

(api/lib/db.js)
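A minimal sketch of such a class, assuming the pg module's Pool API; the export style and the exact getAllTables query are assumptions:

```javascript
// api/lib/db.js — a minimal sketch
const { Pool } = require('pg');

class Database {
  constructor() {
    // A reusable pool of clients, configured from DATABASE_URL
    this.pool = new Pool({ connectionString: process.env.DATABASE_URL });
  }

  // Forward the arguments to client.query, then release the client
  // back to the pool, available for the next request to use
  async query(text, params) {
    const client = await this.pool.connect();
    try {
      return await client.query(text, params);
    } finally {
      client.release();
    }
  }

  // Low-level information about the tables in our database
  getAllTables() {
    return this.query(
      "SELECT * FROM information_schema.tables WHERE table_schema = 'public'"
    );
  }
}

module.exports = new Database();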

The table cp_squirrels is seeded with records from the 2018 Central Park Squirrel Census dataset downloaded from the NYC Open Data portal. The dataset, downloaded as a CSV file, contains the fields obs_date (observation date) and lat_lng (coordinates of observation) with values that are not compatible with the PostgreSQL data types DATE and POINT respectively.

Instead of directly copying the contents of the CSV file into the cp_squirrels table, copy from the output of a GNU Awk ("gawk") script (sketched after this list). This script...

  1. Converts the obs_date field from the MMDDYYYY format to the YYYY-MM-DD format, which PostgreSQL uses for storing a DATE value.

  2. Extracts the latitude and longitude from the original lat_lng value, and re-formats the coordinates to the "(<longitude>,<latitude>)" format (equivalent to (x,y)), which PostgreSQL uses for storing a POINT value. Due to the comma between the longitude and latitude, the entire new value must be wrapped within double quotation marks to tell gawk to not separate the coordinates into two fields (keep the coordinates in one field).

  3. Outputs the CSV content with the newly formatted fields.
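A rough sketch of such a script follows. The field positions (squirrel ID in field 1, obs_date in field 2, lat_lng in field 3), the file name seed.awk and the assumption that lat_lng looks like POINT (<longitude> <latitude>) are all illustrative, not the dataset's exact layout:

```awk
# seed.awk — a rough sketch; field positions are assumptions
BEGIN {
  # Treat quoted CSV fields (which may contain commas) as single fields
  FPAT = "([^,]*)|(\"[^\"]+\")"
  OFS = ","
}
NR == 1 { print; next }  # pass the CSV header row through unchanged
{
  # 1. Convert obs_date (field 2) from MMDDYYYY to YYYY-MM-DD
  d = $2
  $2 = substr(d, 5, 4) "-" substr(d, 1, 2) "-" substr(d, 3, 2)

  # 2. Reformat lat_lng (field 3) from "POINT (<lng> <lat>)" to
  #    "(<lng>,<lat>)", wrapped in double quotation marks so the
  #    embedded comma does not split the coordinates into two fields
  if (match($3, /-?[0-9.]+ -?[0-9.]+/)) {
    split(substr($3, RSTART, RLENGTH), coord, " ")
    $3 = "\"(" coord[1] "," coord[2] ")\""
  }

  # 3. Output the record with the newly formatted fields
  print
}
```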

(db/create.sql)
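A minimal sketch of what this SQL file might contain; the column list is truncated to the fields the gawk sketch above assumes, and the in-container paths are hypothetical:

```sql
-- db/create.sql — a minimal sketch; the real table lists every CSV column
CREATE TABLE cp_squirrels (
  unique_squirrel_id TEXT,
  obs_date DATE,
  lat_lng POINT
);

-- Copy from the output of the gawk script rather than from the raw CSV
COPY cp_squirrels FROM PROGRAM
  'gawk -f /docker-entrypoint-initdb.d/seed.awk /docker-entrypoint-initdb.d/cp_squirrels.csv'
  WITH (FORMAT csv, HEADER);
```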

Upon the initialization of the PostgreSQL database container, this SQL file is run because it has been added to the docker-entrypoint-initdb.d directory.

(db/Dockerfile)
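A minimal sketch, reusing the seed script and CSV file names assumed above (installing gawk is also an assumption, since the postgres base image does not necessarily ship with it):

```dockerfile
# db/Dockerfile — a minimal sketch
FROM postgres

# Assumption: gawk is not preinstalled in the base image
RUN apt-get update && apt-get install -y gawk && rm -rf /var/lib/apt/lists/*

# Files in docker-entrypoint-initdb.d are available at initialization;
# only the .sql file is executed, the others are read by it
COPY create.sql seed.awk cp_squirrels.csv /docker-entrypoint-initdb.d/
```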

This server exposes a RESTful API with two endpoints: GET /tables and POST /api/records. The GET /tables endpoint simply calls the db.getAllTables method, and the POST /api/records endpoint retrieves data from the PostgreSQL database based on a query object sent within the incoming request.

To bypass CORS restrictions for clients hosted on a different domain (or running on a different port on the same machine), the server must set the Access-Control-Allow-Origin header to the allowable domain (process.env.CLIENT_APP_URL) and the Access-Control-Allow-Headers header to Origin, X-Requested-With, Content-Type, Accept on all of its responses.

(api/index.js)
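A minimal sketch of the server; the CORS middleware and endpoints follow the description above, while the handling of the request's query object is simplified to a plain SELECT:

```javascript
// api/index.js — a minimal sketch
const express = require('express');
const db = require('./lib/db');

const app = express();
app.use(express.json());

// Set the CORS headers on every response
app.use((req, res, next) => {
  res.set('Access-Control-Allow-Origin', process.env.CLIENT_APP_URL);
  res.set(
    'Access-Control-Allow-Headers',
    'Origin, X-Requested-With, Content-Type, Accept'
  );
  next();
});

// GET /tables — low-level information about the database's tables
app.get('/tables', async (req, res) => {
  const result = await db.getAllTables();
  res.json(result.rows);
});

// POST /api/records — in the real server, the query is built from the
// query object in req.body; simplified here to a SELECT *
app.post('/api/records', async (req, res) => {
  const result = await db.query('SELECT * FROM cp_squirrels');
  res.json(result.rows);
});

app.listen(process.env.PORT, () => {
  console.log(`Server listening on port ${process.env.PORT}`);
});
```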

Notice that the Express.js server requires three environment variables: CLIENT_APP_URL, PORT and DATABASE_URL. These environment variables must be added to Heroku, which we will do later on in this post.

The Dockerfile for the Express.js server describes how to build the server's Docker image, automating the process of setting up and running the server. Since the server must run within a Node.js environment and relies on several third-party dependencies, the image is built upon the node base image, and the project's dependencies are installed before the server is run via the npm start command.

(api/Dockerfile)
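A minimal sketch, assuming the development setup mounts the source code as a volume (per the Docker Compose sketch above):

```dockerfile
# api/Dockerfile — a minimal sketch for local development
FROM node

WORKDIR /usr/src/app

# Install dependencies first to take advantage of Docker layer caching
COPY package*.json ./
RUN npm install

# In development, the source code is mounted into the container as a
# volume by Docker Compose, so it is not copied into the image here
CMD ["npm", "start"]
```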

However, because the filesystem of a Heroku dyno is ephemeral, volume mounting is not supported. Therefore, we must create a new file named Dockerfile-heroku that is dedicated solely to deploying the application to Heroku and does not rely on a volume.

(api/Dockerfile-heroku)
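A minimal sketch; it mirrors api/Dockerfile but copies the source code into the image instead of expecting a volume:

```dockerfile
# api/Dockerfile-heroku — a minimal sketch
FROM node

WORKDIR /usr/src/app

COPY package*.json ./
RUN npm install

# Bake the source code into the image; Heroku dynos cannot mount volumes
COPY . .

CMD ["npm", "start"]
```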

Unfortunately, you cannot deploy a multi-container Docker application to Heroku via Docker Compose. Therefore, we must deploy the Express.js server to a web dyno with Docker and separately provision a PostgreSQL database via the Heroku Postgres add-on.

To deploy an application with Docker, you must either:

  • Build a Docker image and push it to the Heroku Container Registry.

  • Build a Docker image with heroku.yml. Then, deploy this image to Heroku.

For this tutorial, we will deploy the Express.js server to Heroku by building a Docker image with heroku.yml and deploying this image to Heroku.

Preparing the heroku.yml File#

Let's create a heroku.yml manifest file inside the api subdirectory.

Since the Express.js server will be deployed to a web dyno, we must specify the Docker image to build for the application's web process, which the web dyno belongs to:

(api/heroku.yml)
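In its simplest form, the build section might look like this (assuming the Dockerfile-heroku file created earlier):

```yaml
build:
  docker:
    web: Dockerfile-heroku
```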

Because our Dockerfile-heroku already has a CMD instruction, which specifies the command to run within the container, we don't need to add a run section.

Let's add a setup section, which defines the add-ons and configuration variables to create during the provisioning stage. Within this section, add the Heroku PostgreSQL add-on. Choose the free "Hobby Dev" plan and give it the unique name DATABASE. This name is optional; it distinguishes this add-on from any other Heroku PostgreSQL add-ons attached to the application.
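The completed manifest might then look like this sketch:

```yaml
setup:
  addons:
    - plan: heroku-postgresql:hobby-dev
      as: DATABASE
build:
  docker:
    web: Dockerfile-heroku
```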