Serving a web application to a global audience requires deploying, hosting and scaling it on reliable cloud infrastructure. Heroku is a cloud platform as a service (PaaS) that supports many server-side languages (e.g., Node.js, Go, Ruby and Python), monitors application status in a beautiful, customizable dashboard and maintains an add-ons ecosystem for integrating tools/services such as databases, schedulers, search engines, document/image/video processors, etc. Although it is built on AWS, Heroku is simpler to use than AWS: it automatically provisions resources and configures low-level infrastructure, so developers can focus exclusively on their application without the additional headache of manually setting up each piece of hardware and installing an operating system, runtime environment, etc.
When you deploy to Heroku, its build system packages the application's source code and dependencies together with a language runtime using a buildpack and slug compiler to generate a slug, a highly optimized and compressed version of your application. Heroku then loads the slug onto a lightweight container called a dyno. Depending on your application's resource demands, it can be scaled horizontally across multiple concurrent dynos. Dynos run on a shared host, but the dynos running your application are isolated from dynos running other applications.
Initially, your application will run on a single web dyno, which serves your application to the world. If a single web dyno cannot sufficiently handle incoming traffic, you can always add more web dynos. For requests that take longer than 500 ms to complete, such as uploading media content, consider delegating this expensive work as a background job to a worker dyno. Worker dynos process these jobs from a job queue and run asynchronously alongside web dynos, freeing up the resources of those web dynos.
Below, I'm going to show you how to deploy a Node.js and PostgreSQL application to Heroku.
First, let's download the Node.js application by cloning the project from its GitHub repository:
Let's walk through the architecture of our simple Node.js application. It is a multi-container Docker application that consists of three services: an Express.js server, a PostgreSQL database and pgAdmin. As a multi-container Docker application orchestrated by Docker Compose, the PostgreSQL database and pgAdmin containers are spun up from the postgres and
dpage/pgadmin4 images respectively. These images do not need any additional modifications.
The Express.js server, which resides in the
api subdirectory, connects to the PostgreSQL database via the
pg PostgreSQL client. The module
api/lib/db.js defines a
Database class that establishes a reusable pool of clients upon instantiation for efficient memory consumption. The connection string URI follows the format
postgres://[username]:[password]@[host]:[port]/[db_name], and it is accessed from the environment variable
DATABASE_URL. Anytime a controller function (the callback argument of the methods
app.post, etc.) calls the
query method, the server connects to the PostgreSQL database via an available client from the pool. Then, the server queries the database, directly passing the arguments of the
query method to the
client.query method. Once the database sends the requested data back to the server, the client is released back to the pool, available for the next request to use. Additionally, there's a
getAllTables method for retrieving low-level information about the tables available in our PostgreSQL database. In this case, our database only contains a single table:
cp_squirrels, which is seeded with records from the 2018 Central Park Squirrel Census dataset downloaded from the NYC Open Data portal. The dataset, downloaded as a CSV file, contains the fields
obs_date (observation date) and
lat_lng (coordinates of observation) with values that are not compatible with the PostgreSQL data types DATE and POINT, respectively.
Instead of directly copying the contents of the CSV file to the
cp_squirrels table, copy from the output of a GNU awk ("gawk") script. This script:
Reformats the
obs_date field from the
MMDDYYYY format to the
YYYY-MM-DD format, which PostgreSQL uses for storing a DATE value.
Extracts the latitude and longitude from the original
lat_lng value and re-formats the coordinates to the
"(<longitude>,<latitude>)" format (equivalent to
(x,y)), which PostgreSQL uses for storing a
POINT value. Due to the comma between the longitude and latitude, the entire new value must be wrapped in double quotation marks to tell gawk not to separate the coordinates into two fields (i.e., to keep the coordinates in one field).
Outputs the CSV content with the newly formatted fields.
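The post's gawk script itself isn't shown in this excerpt, but the three steps above can be sketched as follows. The field positions ($2 and $3 of a simplified id,obs_date,lat_lng CSV) and the input's WKT-style POINT (<longitude> <latitude>) form are assumptions for illustration, not the post's actual data layout:

```shell
# Hypothetical reformat.awk; assumes $2 = obs_date (MMDDYYYY) and
# $3 = lat_lng in the form "POINT (<longitude> <latitude>)".
cat > reformat.awk <<'AWK'
BEGIN { FS = OFS = "," }
NR == 1 { print; next }   # pass the header row through unchanged
{
  # MMDDYYYY -> YYYY-MM-DD, the format PostgreSQL uses for DATE values
  $2 = substr($2, 5, 4) "-" substr($2, 1, 2) "-" substr($2, 3, 2)

  # "POINT (lng lat)" -> "(lng,lat)"; the surrounding double quotes keep
  # the comma from splitting the coordinates into two CSV fields
  gsub(/POINT \(|\)/, "", $3)
  split($3, c, " ")
  $3 = "\"(" c[1] "," c[2] ")\""
  print
}
AWK

# Demonstrate on a one-row sample file
printf 'id,obs_date,lat_lng\n1,10142018,POINT (-73.95 40.79)\n' > squirrels.csv
awk -f reformat.awk squirrels.csv > squirrels_clean.csv
cat squirrels_clean.csv
# -> id,obs_date,lat_lng
#    1,2018-10-14,"(-73.95,40.79)"
```

The cleaned output can then be piped into `COPY ... FROM STDIN WITH CSV HEADER` instead of the raw file.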
Upon the initialization of the PostgreSQL database container, this SQL file is run by adding it to the container's docker-entrypoint-initdb.d directory, which the postgres image scans for initialization scripts on its first startup.
This server exposes a RESTful API with two endpoints:
GET /tables and
POST /api/records. The
GET /tables endpoint simply calls the
db.getAllTables method, and the
POST /api/records endpoint retrieves data from the PostgreSQL database based on a query object sent within the incoming request.
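The exact shape of that query object isn't shown in this excerpt, so here is a hypothetical helper (the name buildSelect and the column names are made up for illustration) showing how a filter object from a request body could become a parameterized query for pg:

```javascript
// Hypothetical helper: turn a client-supplied filter object into a
// parameterized query. The $1, $2, ... placeholders keep user-supplied
// values out of the SQL text itself.
function buildSelect(table, filters) {
  const keys = Object.keys(filters);
  const where = keys.map((key, i) => `${key} = $${i + 1}`).join(' AND ');
  return {
    text: `SELECT * FROM ${table}${keys.length ? ` WHERE ${where}` : ''}`,
    values: keys.map((key) => filters[key]),
  };
}

// e.g. a POST /api/records body of { shift: 'AM' } becomes:
console.log(buildSelect('cp_squirrels', { shift: 'AM' }));
// -> { text: 'SELECT * FROM cp_squirrels WHERE shift = $1', values: [ 'AM' ] }
```

The resulting { text, values } pair is exactly the argument shape that pg's client.query accepts.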
To bypass CORS restrictions for clients hosted on a different domain (or running on a different port on the same machine) that send requests to this server, all responses must have the
Access-Control-Allow-Origin header set to the allowable domain (
process.env.CLIENT_APP_URL) and the
Access-Control-Allow-Headers header set to
Origin, X-Requested-With, Content-Type, Accept.
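A framework-agnostic sketch of a middleware that sets those two headers (allowCors is a made-up name; an Express server would register it with app.use):

```javascript
// Sketch: set the CORS headers described above on every response.
// CLIENT_APP_URL is read from the environment, as in the post.
function allowCors(req, res, next) {
  res.setHeader('Access-Control-Allow-Origin', process.env.CLIENT_APP_URL);
  res.setHeader(
    'Access-Control-Allow-Headers',
    'Origin, X-Requested-With, Content-Type, Accept'
  );
  next();
}
```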
Notice that the Express.js server requires several environment variables, including
CLIENT_APP_URL and
DATABASE_URL. These environment variables must be added to Heroku, which we will do later on in this post.
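The pooled-query flow described earlier can be sketched like this. To keep the sketch runnable without a live database, the pool is passed into the constructor; the actual api/lib/db.js presumably creates one with pg's Pool and process.env.DATABASE_URL:

```javascript
// Sketch of the pooled-query pattern described above. In the real module,
// `pool` would be `new (require('pg').Pool)({ connectionString: process.env.DATABASE_URL })`.
class Database {
  constructor(pool) {
    this.pool = pool; // one reusable pool of clients per process
  }

  // Check out a client, run the query, and always release the client
  // back to the pool, even if the query throws.
  async query(text, params) {
    const client = await this.pool.connect();
    try {
      return await client.query(text, params);
    } finally {
      client.release();
    }
  }
}
```

Releasing the client in a finally block is what makes it "available for the next request to use" even when a query fails.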
The Dockerfile for the Express.js server describes how to build the server's Docker image. It automates the process of setting up and running the server. Since the server must run within a Node.js environment and relies on several third-party dependencies, the image must be built upon the
node base image and install the project's dependencies before running the server via the
npm start command.
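A typical api/Dockerfile along those lines might look like this (the Node version and file layout are assumptions, not the post's exact file):

```dockerfile
# Sketch: build on the node base image, install dependencies, run the server.
FROM node:16-alpine

WORKDIR /usr/src/app

# Copy the dependency manifests first so this layer stays cached
# until package.json or package-lock.json changes.
COPY package*.json ./
RUN npm ci

# Copy the rest of the server's source code.
COPY . .

CMD ["npm", "start"]
```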
However, because the filesystem of a Heroku dyno is ephemeral, volume mounting is not supported. Therefore, we must create a new file named
Dockerfile-heroku that is dedicated only to deploying the application to Heroku and does not rely on a volume.
Unfortunately, you cannot deploy a multi-container Docker application via Docker Compose to Heroku. Therefore, we must deploy the Express.js server to a web dyno with Docker and separately provision a PostgreSQL database via Heroku Postgres add-on.
To deploy an application with Docker, you must either:
Build a Docker image and push it to the Heroku Container Registry.
Build a Docker image with
heroku.yml. Then, deploy this image to Heroku.
For this tutorial, we will deploy the Express.js server to Heroku by building a Docker image with
heroku.yml and deploying this image to Heroku.
Let's create a
heroku.yml manifest file inside of the project's root directory.
Since the Express.js server will be deployed to a web dyno, we must specify the Docker image to build for the application's
web process, which the web dyno belongs to:
Since
api/Dockerfile already has a
CMD instruction, which specifies the command to run within the container, we don't need to add a run section to the heroku.yml manifest file.
Let's add a
setup section, which defines the environment's add-ons and configuration variables during the provisioning stage. Within this section, add the Heroku PostgreSQL add-on. Choose the free "Hobby Dev" plan and give the add-on the unique name
DATABASE. This unique name is optional; it is used to distinguish this add-on from other Heroku PostgreSQL add-ons.
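Putting the build and setup sections together, a heroku.yml along these lines (the Dockerfile-heroku path and root location are assumptions consistent with this post) might look like:

```yaml
setup:
  addons:
    # Free "Hobby Dev" PostgreSQL plan, attached under the name DATABASE
    - plan: heroku-postgresql:hobby-dev
      as: DATABASE
build:
  docker:
    # Docker image to build for the application's web process
    web: api/Dockerfile-heroku
```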
Fortunately, once the PostgreSQL database is provisioned, the
DATABASE_URL environment variable, which contains the database connection information for this newly provisioned database, will be made available to our application.
The Heroku CLI#
Check if your machine already has the Heroku CLI installed.
If not yet installed, then install the Heroku CLI. For macOS, it can be installed via Homebrew:
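Assuming Homebrew is available, the check and the install look like:

```shell
# Prints a version string if the CLI is already installed
heroku --version

# Install via Heroku's official Homebrew tap
brew tap heroku/brew && brew install heroku
```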