Merging CSV files with Node.js and D3.js

19th January, 2018 - 2 min. read - in Tutorials

Node.js is awesome because it’s an ecosystem. It’s even more awesome when used together with some popular libraries such as D3.js and Lodash.

The goal

I had to pre-process a bunch of CSV files so I could work with a single dataset file. Node.js is now my platform of choice for this type of task, and I couldn’t be more satisfied with it.

The process

Here is a walkthrough of the little script that saves me a lot of time.

Import our weapons:

const d3 = require('d3')
const fs = require('fs')
const _ = require('lodash')

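Read a folder to get the file list, then call a function for each file:

// Run these two lines after `process` has been assigned (see the next steps):
// `var` hoists the name but not the function, so calling it earlier throws.
var files = fs.readdirSync(`${__dirname}/data`)
_.each(files, filename => process(filename))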
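
The data folder may hold more than CSV files (hidden system files, for instance), so it can help to filter the listing before processing it. A minimal sketch, assuming every input file ends in `.csv`; `listCsvFiles` is a hypothetical helper, not part of the original script:

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: return the sorted names of the .csv files in `dir`.
const listCsvFiles = dir =>
  fs.readdirSync(dir)
    .filter(name => path.extname(name).toLowerCase() === '.csv')
    .sort()
```

It could stand in for the plain `readdirSync` call, for example `listCsvFiles(path.join(__dirname, 'data'))`. Sorting the names also makes the merge order predictable across runs.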

Read the CSV content and parse it with D3.js:

var process = name => {
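  // Same base folder as the readdirSync call, so the script works from any working directory
  var raw = fs.readFileSync(`${__dirname}/data/${name}`, 'utf8')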
  var csv = d3.csvParse(raw)
}

Wrangle some values before committing them to the final array:

var process = name => {
  ...
  var parse = d3.timeParse('%m/%d/%y')
  csv.forEach(d => {
    d.timestamp = parse(d.Dates)
  })
}
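
One thing to watch: `d3.timeParse` returns `null` when a string doesn’t match the specifier, so malformed dates pass through silently. The dependency-free sketch below mimics `%m/%d/%y` with a regular expression just to show how such rows could be dropped. `parseShortDate` and `withTimestamps` are hypothetical names, this is not D3’s implementation, and the century pivot (two-digit years above 68 land in the 1900s) is an assumption modeled on D3’s behaviour:

```javascript
// Hypothetical stand-in for d3.timeParse('%m/%d/%y'); not D3's implementation.
// Returns a local-time Date, or null when the string does not match.
const parseShortDate = text => {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{2})$/.exec(String(text).trim())
  if (!match) return null
  const [month, day, yy] = match.slice(1).map(Number)
  // Assumption: two-digit years above 68 belong to the 1900s.
  const year = yy + (yy > 68 ? 1900 : 2000)
  const date = new Date(year, month - 1, day)
  // Reject overflow such as 02/31/18, which Date would roll into March.
  const valid =
    date.getFullYear() === year &&
    date.getMonth() === month - 1 &&
    date.getDate() === day
  return valid ? date : null
}

// Keep only the rows whose date parsed, adding a timestamp to each.
const withTimestamps = rows =>
  rows
    .map(d => ({ ...d, timestamp: parseShortDate(d.Dates) }))
    .filter(d => d.timestamp !== null)
```

With the real parser the same filter applies: check for `null` after calling `parse(d.Dates)`.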

Create a unique array with all the CSV files merged together (thanks Lodash):

var db = []
var process = name => {
  ...
  var newdb = _.unionBy(db, csv, 'Dates')
  db = newdb
}
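
It’s worth knowing what “unique” means here: rows that share a `Dates` value are kept only once, and the row already in `db` wins, so files read earlier take precedence and two rows with the same date are not combined. The plain JavaScript sketch below illustrates that rule without Lodash; `unionByKey` is a hypothetical name and the sample rows are invented:

```javascript
// Hypothetical plain-JavaScript equivalent of _.unionBy(a, b, key) for a property name.
const unionByKey = (a, b, key) => {
  const seen = new Set()
  const merged = []
  for (const row of [...a, ...b]) {
    if (seen.has(row[key])) continue // a row with this key was already kept
    seen.add(row[key])
    merged.push(row)
  }
  return merged
}

// Invented sample rows: both "files" contain an entry for 01/19/18.
const january = [{ Dates: '01/19/18', value: 1 }]
const february = [{ Dates: '01/19/18', value: 99 }, { Dates: '02/02/18', value: 2 }]

// The 01/19/18 row from `january` is kept; only 02/02/18 is added from `february`.
const merged = unionByKey(january, february, 'Dates')
```

If the files carry different columns for the same date, a merge of the matching rows would be needed instead of a union.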

Save the final dataset as a JSON file:

var process = name => {
  ...
  // db.json is rewritten after every file; the last write holds the full dataset
  fs.writeFileSync('db.json', JSON.stringify(db))
}
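
A detail for later: `JSON.stringify` serializes `Date` objects as ISO 8601 strings, so the `timestamp` values come back as strings when the file is loaded again. A minimal sketch of the round trip, assuming a `timestamp` field as above; the sample row, the temporary file name and the `reviveDates` function are made up for illustration:

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Invented sample row; in the script, `timestamp` comes from d3.timeParse.
const db = [{ Dates: '01/19/18', timestamp: new Date(2018, 0, 19) }]

// Written to the temporary folder here so the sketch does not touch db.json.
const file = path.join(os.tmpdir(), 'db-sketch.json')
fs.writeFileSync(file, JSON.stringify(db))

// Hypothetical reviver: turn the `timestamp` strings back into Date objects.
const reviveDates = (key, value) => (key === 'timestamp' ? new Date(value) : value)
const loaded = JSON.parse(fs.readFileSync(file, 'utf8'), reviveDates)
```

Without the reviver, `timestamp` stays a string and has to be converted before handing it to a D3 time scale.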

The whole script generates a JSON file with all the entries. A perfect starting point for an exploratory session with D3.js.

Feel good.

