rolisz's site

Line charts in Javascript

Recently I wanted to take a look at some personal data that I had been collecting for several years (Quantified Self and Lifelogging ftw :D). Until now it had been sitting there write-only, with me occasionally peeking at it manually, but because it was in a pretty much raw format (actually, multiple formats, from different sources), I didn't interact with it much. However, I was recently a bit bored and wanted to code something just for fun (as much as parsing XML files can be called fun), so I cleaned up the data, and now I wanted to get some "useful" information out of it.

The data consists of multiple time series of integer values, one per label. The timestamps have second granularity, but the measurements are not taken every second; sometimes there are several days or even months between consecutive measurements.

I want to be able to plot multiple time series on the same graph, so I can compare them to each other. Because the data spans years, I want to be able to zoom and pan. The chart should also look reasonably decent. The library I use should be free and open-source. And the kicker requirement: when zooming out, the data should get aggregated. This means that when looking at the data on the yearly scale, I don't want to see individual dots for each second where I have a measurement; I want to see all the measurements for a month added up, with only one data point per month showing up. I want to do this on several levels, so that I have monthly, daily and hourly aggregations.

And after two weeks of playing with various kinds of charting libraries in my free time, I reached the conclusion that the state of the art in free and open-source Javascript libraries is quite sad. Almost all of the libraries are awesome at only one thing and customizing them is quite hard. A quick review of the main ones I looked at:

There are some other, commercial libraries too (Highcharts or amCharts), which do everything and the kitchen sink, but meh, vive la open source.

I ended up going with dc.js for two reasons. First, the API is much nicer, being in the style of d3.js, while c3.js has a more declarative syntax, including some string-to-function magic. Second, it is much easier to combine multiple kinds of charts in it and do filtering across them (thanks to the crossfilter integration). Also, out of the box, the charts in dc.js are nicer than the ones in c3.js, but I'm sure all this can be changed without (too) much hassle.

So, let's get to the fun part: coding the line chart.

I'll assume that we have an HTML file that contains a div with the id "chart" and the necessary CSS and Javascript imports (d3.js, crossfilter and dc.js).

Some code to generate some random data, in the form of lists of {"timestamp": Date, "data": number}.

function randomDate(start, end) {
    return new Date(start.getTime() + Math.random() * (end.getTime() - start.getTime()));
}
function getRandomInt(min, max) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

function generateData() {
    var data = []
    for (var i = 0; i < 1000; i++) {
        // Months are zero-based in the Date constructor: 0 is January, 11 is December
        data.push({"timestamp": randomDate(new Date(2014, 0, 1), new Date(2015, 11, 31)),
                   "data": getRandomInt(10, 1000)})
    }
    // Sort chronologically
    return data.sort(function(a,b) { return a.timestamp - b.timestamp })
}
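A small aside on the date arguments, since they are easy to get wrong: the month parameter of the Date constructor is zero-based, and out-of-range values silently roll over into the next year. The variable names below are only for illustration.

```javascript
// The month argument is zero-based: 0 is January, 11 is December.
var start = new Date(2014, 0, 1);        // 1 January 2014
var end = new Date(2015, 11, 31);        // 31 December 2015

// Month 12 does not exist, so the date rolls over into the next year.
var overflow = new Date(2015, 12, 31);   // 31 January 2016
```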

Now let's declare some d3.js formatters, generate the data, initialize the chart, and do some pre­pro­cess­ing on the data:

var dateFormat = d3.time.format.iso
var dayFormat = d3.time.format('%x')
var numberFormat = d3.format('d');
var chart = dc.compositeChart('#chart');
var data = {"label1": generateData(), "label2": generateData(),
            "label3": generateData(), "label4": generateData(),
            "label5": generateData()}
// Parse the timestamp and precompute the slots where each datapoint will fit
// when aggregating
for (var label in data) {
    if (data.hasOwnProperty(label)) {
        data[label].forEach(function(d) {
            d.dd = dateFormat.parse(d.timestamp);
            d.hour = d3.time.hour(d.dd)
            d.day = d3.time.day(d.dd)
            d.month = d3.time.month(d.dd)
        })
    }
}
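To make the idea of "slots" more concrete, here is a dependency-free sketch of what the aggregation boils down to: truncate each timestamp to the start of its month, day or hour, then add up the values that land in the same slot. The helpers `floorDate` and `aggregate` are hypothetical names made up for this illustration; in the real chart, d3.js does the truncation and crossfilter does the summing.

```javascript
// Truncate a Date to the start of its month, day or hour (local time).
function floorDate(date, level) {
    var y = date.getFullYear(), m = date.getMonth(), d = date.getDate();
    if (level === "month") return new Date(y, m, 1);
    if (level === "day") return new Date(y, m, d);
    if (level === "hour") return new Date(y, m, d, date.getHours());
    throw new Error("unknown level: " + level);
}

// Sum the `data` field of all points that fall into the same slot and
// return the slots in chronological order as {key, value} pairs.
function aggregate(points, level) {
    var sums = {};
    points.forEach(function (p) {
        var key = floorDate(p.timestamp, level).getTime();
        sums[key] = (sums[key] || 0) + p.data;
    });
    return Object.keys(sums)
        .map(Number)
        .sort(function (a, b) { return a - b; })
        .map(function (key) { return {key: new Date(key), value: sums[key]}; });
}

var sample = [
    {timestamp: new Date(2014, 0, 5, 10, 15), data: 3},
    {timestamp: new Date(2014, 0, 5, 10, 45), data: 4},
    {timestamp: new Date(2014, 0, 20, 8, 0), data: 5},
    {timestamp: new Date(2014, 1, 1, 9, 0), data: 6}
];

// Two slots: January with 3 + 4 + 5 = 12 and February with 6.
var monthly = aggregate(sample, "month");
// Three slots: 5 January (7), 20 January (5) and 1 February (6).
var daily = aggregate(sample, "day");
```

Precomputing the slot for every level up front, as the loop above does, means that switching levels later only requires picking a different field.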

We are using a composite chart from dc.js, so we have to generate each line as an individual line chart. For each line, we create the crossfilter group (which does the aggregation and filtering) and then the actual chart, setting the correct data and tooltip title.

function generateCharts(data, aggregation) {
    var charts = []
    for (var label in data) {
        if (data.hasOwnProperty(label)) {
            var ndx = crossfilter(data[label]);
            // Key each datapoint by its precomputed slot (d.hour, d.day or d.month)
            var dim = ndx.dimension(function (d) { return d[aggregation] });

            // Sum the values of all datapoints that share a slot
            var lengths = dim.group().reduceSum(function(d) { return d.data})
            charts.push(dc.lineChart(chart)
                            .group(lengths, label)
                            .dimension(dim)
                            .title((function(name) { return function (d) {
                                return name + '\n' + dayFormat(d.key) + '\n' + numberFormat(d.value);
                            }})(label))
            )
        }
    }
    return charts
}
var currentGroup = "month"
var charts = generateCharts(data, currentGroup)
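The immediately invoked function around the title callback may look odd, so here is a minimal sketch of why it is there. A variable declared with `var` inside a loop is shared by all iterations, so callbacks that run later would all see the last label. The function names `naiveTitles` and `boundTitles` are made up for this illustration.

```javascript
// Naive version: every callback closes over the same `label` variable,
// which holds the last label by the time the callbacks run.
function naiveTitles(labels) {
    var callbacks = [];
    for (var i = 0; i < labels.length; i++) {
        var label = labels[i];
        callbacks.push(function () { return label; });
    }
    return callbacks.map(function (f) { return f(); });
}

// IIFE version: each callback gets its own `name` parameter,
// the same trick used in generateCharts.
function boundTitles(labels) {
    var callbacks = [];
    for (var i = 0; i < labels.length; i++) {
        var label = labels[i];
        callbacks.push((function (name) {
            return function () { return name; };
        })(label));
    }
    return callbacks.map(function (f) { return f(); });
}

var naive = naiveTitles(["label1", "label2", "label3"]);  // every entry is "label3"
var bound = boundTitles(["label1", "label2", "label3"]);  // "label1", "label2", "label3"
```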

Now we have to set all the options for our chart, nothing special.

chart 
    .width(1000)
    .height(400)
    .zoomScale([1,800])
    .zoomOutRestrict(false)
    .transitionDuration(0)
    .margins({top: 30, right: 50, bottom: 25, left: 50})
    .mouseZoomable(true)
    .x(d3.time.scale().domain([new Date(2014, 0, 1), new Date(2015, 11, 31)]))
    .round(d3.time.month.round)
    .xUnits(d3.time.months)
    .elasticY(true)
    .renderHorizontalGridLines(true)
    .keyAccessor(function(d) {
        return d.key;
    })
    .legend(dc.legend().x(900).y(10).itemHeight(13).gap(5))
    .valueAccessor(function (d) {
        return d.value;
    })
    .compose(charts)
    .shareColors(true)
    .shareTitle(false)
    .brushOn(false)

This is where the magic happens. When we get a zoom event, we check which range is shown on screen and, based on how wide it is, we regenerate the chart if we crossed into a different aggregation level. The handler is chained onto the same chart configuration:

    .on('zoomed', function(chart, filter) {
        var range = chart.x().domain()
        var diff = range[1] - range[0]  // visible time span in milliseconds
        var aggregate
        if (diff < 1000*3600*24*15) {           // less than 15 days
            aggregate = "hour"
        } else if (diff < 1000*3600*24*30*6) {  // less than roughly 6 months
            aggregate = "day"
        } else {
            aggregate = "month"
        }
        if (aggregate != currentGroup) {
            charts = generateCharts(data, aggregate)
            chart.compose(charts).render()
            currentGroup = aggregate
        }
    })

dc.renderAll();
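The level selection inside the zoom handler can also be pulled out into a small pure function, which makes the thresholds easier to tweak and to test without a browser. This is only a sketch: `pickAggregation` and `DAY_MS` are names invented here, and the thresholds are simply the ones used in the handler above.

```javascript
var DAY_MS = 1000 * 3600 * 24;

// Map the visible x domain (a pair of Dates) to an aggregation level.
function pickAggregation(domain) {
    var diff = domain[1] - domain[0];  // visible span in milliseconds
    if (diff < 15 * DAY_MS) return "hour";
    if (diff < 6 * 30 * DAY_MS) return "day";
    return "month";
}

var oneWeek = pickAggregation([new Date(2014, 0, 1), new Date(2014, 0, 8)]);      // "hour"
var threeMonths = pickAggregation([new Date(2014, 0, 1), new Date(2014, 3, 1)]);  // "day"
var twoYears = pickAggregation([new Date(2014, 0, 1), new Date(2015, 11, 31)]);   // "month"
```

Inside the handler this would be called as pickAggregation(chart.x().domain()).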

The results can be seen here, with the source being in this GitHub repo. And now that we have pretty graphs, it's time to interpret them :))

And after the post was written, plot.ly open-sourced their own library X(. I'll have to investigate that one too.