JSON has become the most popular way to pass data around on the modern web, with almost universal support between APIs and server applications.
However, the dominant data format in spreadsheet applications like Excel and Google Sheets is CSV, as this format is the tersest and easiest for users to understand and create.
A common function that backend apps will need to perform, therefore, is the conversion of JSON to CSV. In this tutorial, we'll create a CLI script to convert an input JSON file to an output CSV file.
What you'll learn#
How to create a CLI script with Node.js
How to load a JSON file from the filesystem
Writing a method to convert a JSON structure to CSV
Writing the CSV file to the filesystem
If you don't need to customize the conversion process, it's quicker to use a third-party package like json-2-csv. At the end of the tutorial, I'll also show you how to use this instead of a custom converter.
You'll need a version of Node.js that includes fs.promises and supports async/await, and you'll also need NPM installed. I recommend Node v12 LTS, which includes NPM 6.
Setting up the project#
Let's now get started making a Node script that we can use to convert a JSON file to CSV from the command line. First, we'll create the source code directory,
json-to-csv. Change into that and run
npm init -y so that we're ready to add some NPM packages.
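The setup steps above amount to the following commands (assuming npm is on your PATH):

```shell
mkdir json-to-csv
cd json-to-csv
npm init -y
```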
Example JSON file#
Let's now create an example JSON file that we can work with called input.json. I've created a simple data schema with three properties: name, email, and date of birth.
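Here's a sample input.json you might use. The records and the `dob` key name are made up for illustration; any array of uniform objects will work:

```json
[
  {
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "dob": "1815-12-10"
  },
  {
    "name": "Grace Hopper",
    "email": "grace@example.com",
    "dob": "1906-12-09"
  }
]
```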
Creating the CLI with yargs#
It'd be very handy to allow our utility to take an input file name and an output file name so that we can use it from the CLI. Here's the command we should be able to use from within the json-to-csv directory:
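Assuming the example file above, the invocation might look like this (the file names are placeholders):

```shell
node index.js input.json output.csv
```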
So let's now create an index.js file and install the yargs package to handle CLI input:
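The install step is the standard one:

```shell
npm install yargs
```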
Inside index.js, let's require the yargs package and assign the
argv property to a variable. This variable will effectively hold any CLI inputs captured by yargs.
Nameless CLI arguments will be in an array at the
_ property of
argv. Let's grab these and assign them to obviously-named variables. We'll also console log the values to check they're working as we expect:
Reading and parsing a JSON file#
For file operations, we're going to use the
promises API of the
fs package of Node.js. This will make handling files a lot easier than using the standard callbacks pattern.
Let's use a destructuring assignment to grab the
readFile and writeFile methods, which are all we'll need in this project.
Let's now write a function that will parse the JSON file. Since file reading is an asynchronous process, let's make it an
async function and name it
parseJSONFile. This method will take the file name as an argument.
In the method body, add a try...catch block. In the
try, we'll create a variable
file and assign to this
await readFile(fileName) which will load the raw file. Next, we'll parse the contents as JSON and return it.
In the catch block, we should console log the error so the user knows what's gone wrong. We should also exit the script by calling
process.exit(1) which indicates to the shell that the process failed.
Converting JSON data to CSV format#
Next, we need to convert the data returned by parseJSONFile to a CSV-compatible format. First, we're going to extract the values of each object in the array, discarding the keys. To do this, we'll
map a new array where each element is itself an array of the object values.
Next, we'll use the array
unshift method to insert a header row to the top of the data. We'll pass to this the object keys of any one of the objects (since we assume they all have the same keys for the sake of simplicity).
Finally, we'll convert the nested array to a string by chaining the array join method, joining each row with a newline character (\n).
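Here's a sketch of that logic; the function name arrayToCSV matches the one referenced later in the article:

```javascript
// Convert an array of uniform objects into a naive CSV string (no quoting yet)
function arrayToCSV(data) {
  // Keep only the values of each object, in key order
  const rows = data.map(row => Object.values(row));
  // Insert a header row built from the keys of the first object
  rows.unshift(Object.keys(data[0]));
  // Arrays stringify with commas, so joining rows with "\n" yields CSV lines
  return rows.join("\n");
}
```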
Adding quotes around fields#
We're not quite finished: CSV fields should be surrounded by quotes to escape any commas within the strings. There's an easy way to do this:
Print the string within a string template with surrounding quotes
Put a quote at the beginning and end of each new line
Replace each comma in the string with a quote-surrounded comma by chaining a regex replace
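Applying those three steps, arrayToCSV becomes something like this sketch:

```javascript
// arrayToCSV with quoting: wrap the whole string in quotes, put quotes
// around each newline, and put quotes around each comma
function arrayToCSV(data) {
  const rows = data.map(row => Object.values(row));
  rows.unshift(Object.keys(data[0]));
  return `"${rows.join('"\n"').replace(/,/g, '","')}"`;
}
```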
Writing the CSV file#
It's fairly trivial to write our CSV file now that we have a CSV string - we just need to call
writeFile from an async method
writeCSV. Just like in the parse method, we'll include a try...catch block and exit on error.
Putting it all together#
To run our CSV converter we'll add an IIFE to the bottom of the file. Within this function, we'll call each of our methods in the sequence we wrote them, passing data from one to the next. At the end, let's console log a message so the user knows the conversion process worked.
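The runner IIFE might look like this — it's a fragment that assumes the three functions and two file-name variables defined above:

```javascript
(async () => {
  const json = await parseJSONFile(inputFileName); // 1. load and parse
  const csv = arrayToCSV(json);                    // 2. convert
  await writeCSV(outputFileName, csv);             // 3. write to disk
  console.log("Written to CSV file successfully!");
})();
```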
Let's now try and run the CLI command using our example JSON file:
It works! Here's what the output looks like:
There's a fatal flaw in our script: if any CSV fields contain commas, they will be split into separate fields during the conversion. Note in the below example what happens to the second field of the last row, which includes a comma:
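You can see the problem in isolation with a hypothetical row containing an in-field comma — the global comma replacement can't tell field separators from data commas:

```javascript
const rows = [["Ms. Smith", "Hello, world"]];
const csv = `"${rows.join('"\n"').replace(/,/g, '","')}"`;
console.log(csv); // "Ms. Smith","Hello"," world" — the second field is split
```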
To fix this, we'll need to escape any commas before the data gets passed to the
arrayToCSV method, then unescape them afterward. We're going to create two methods to do this:
In the former, we'll use
map to create a new array where comma values are replaced by a variable
token. This token can be anything you like, so long as it doesn't occur in the CSV data. For this reason, I recommend something random like
In the unescapeCommas method, we'll replace the token with commas to restore the original content.
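Here's a sketch of the two helpers. The token value below is just a readable placeholder; in real use, pick something unlikely to ever occur in the data:

```javascript
// Placeholder token — any string that won't appear in the data will do
const token = "#comma#";

// Replace commas in every field value with the token before conversion
function escapeCommas(data) {
  return data.map(row => {
    const escaped = {};
    for (const key of Object.keys(row)) {
      escaped[key] = String(row[key]).replace(/,/g, token);
    }
    return escaped;
  });
}

// Restore the original commas once the CSV string has been assembled
function unescapeCommas(csv) {
  return csv.split(token).join(",");
}
```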
Here's how we'll modify our run function to incorporate these new methods:
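One way to wire the helpers in — this fragment assumes the functions and variables defined earlier in the script:

```javascript
(async () => {
  const json = await parseJSONFile(inputFileName);
  // Escape commas before conversion, unescape them afterward
  const csv = arrayToCSV(escapeCommas(json));
  await writeCSV(outputFileName, unescapeCommas(csv));
  console.log("Written to CSV file successfully!");
})();
```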
With that done, the converter can now handle commas in the content.
Testing in Google Sheets#
Here's the real test of our CLI tool: can we import a converted file into Google Sheets? Yes, it works perfectly! Note that I even put a comma in one of the fields to ensure the escape mechanism works.
Using the json-2-csv package#
While it's good to understand the underlying mechanism of CSV conversion in case we need to customize it, in most projects, we'd probably just use a package like json-2-csv.
Not only will this save us having to create the conversion functionality ourselves, but it also has a lot of additional features we haven't included like the ability to use different schema structures and delimiters.
Let's now update our project to use this package instead. First, go ahead and install it on the command line:
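The install command is the usual one:

```shell
npm install json-2-csv
```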
Next, let's require it in our project and assign it to a variable using destructuring:
We can now modify our run function to use this package instead of our custom
arrayToCSV method. Note we no longer need to escape our content either as the package will do this for us as well.
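As a sketch of the change — note that json-2-csv's exports have varied between major versions (callback vs. promise APIs), so check the package's README for the version you install; recent versions expose a promise-returning json2csv:

```javascript
const { json2csv } = require("json-2-csv");

// Runner fragment: parseJSONFile, writeCSV and the file-name variables
// are the ones defined earlier in the script
(async () => {
  const json = await parseJSONFile(inputFileName);
  const csv = await json2csv(json); // no manual escaping needed
  await writeCSV(outputFileName, csv);
  console.log("Written to CSV file successfully!");
})();
```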
With this change, run the CLI command again and you should get almost the same results. The key difference is that this package will only wrap fields in double-quotes if they need escaping as this still produces a valid CSV.
So now you've learned how to create a CLI script for converting JSON to CSV using Node.js. Here's the complete script for your reference or if you've skimmed the article:
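Here is a consolidated sketch of the custom-converter version, using the names from throughout the article (yargs must be installed; the token value is a placeholder):

```javascript
const { argv } = require("yargs");
const { readFile, writeFile } = require("fs").promises;

// Nameless CLI arguments: input and output file names
const [inputFileName, outputFileName] = argv._;

// Placeholder token used to escape in-field commas
const token = "#comma#";

async function parseJSONFile(fileName) {
  try {
    const file = await readFile(fileName);
    return JSON.parse(file);
  } catch (err) {
    console.log(err);
    process.exit(1);
  }
}

function escapeCommas(data) {
  return data.map(row => {
    const escaped = {};
    for (const key of Object.keys(row)) {
      escaped[key] = String(row[key]).replace(/,/g, token);
    }
    return escaped;
  });
}

function arrayToCSV(data) {
  const rows = data.map(row => Object.values(row));
  rows.unshift(Object.keys(data[0]));
  return `"${rows.join('"\n"').replace(/,/g, '","')}"`;
}

function unescapeCommas(csv) {
  return csv.split(token).join(",");
}

async function writeCSV(fileName, data) {
  try {
    await writeFile(fileName, data);
  } catch (err) {
    console.log(err);
    process.exit(1);
  }
}

(async () => {
  const json = await parseJSONFile(inputFileName);
  const csv = arrayToCSV(escapeCommas(json));
  await writeCSV(outputFileName, unescapeCommas(csv));
  console.log("Written to CSV file successfully!");
})();
```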