Visualizing Pressible Blogs: Data Exploration

July 8, 2010

A big part of my residency at EdLab involves examining the data produced by Pressible, a publishing platform for students and staff at Teachers College that launched last month. I’m interested in how this data represents the human community at Teachers College. In any data-related project, my first step is always one of exploration. I try to get a feel for the scope of the data, and then look at it from a variety of angles to uncover interesting things.

Most often, this early exploration work is done as a series of short sketches in Processing. Today, I spent about two hours building a simple tool to visualize the number of blogs in the Pressible database, the number of posts for each blog, and the variation in posting patterns between the individual blogs.

Here, we see each of the Pressible blogs represented by a cluster:


If we focus on an individual cluster, we see a set of circles radiating from the central node:

Each of these circles is a post to the blog. The radius of each circle and the length of its ‘spoke’ both correspond to the length of the post: longer posts show up as bigger circles on longer spokes. In the blog above, we see a mix of short to medium-length posts, with one extra-long post standing out as an outlier.
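As a rough illustration of the layout described above, here is a minimal Python sketch (not the Processing code behind these images). It spaces posts evenly around a central node and scales both the spoke length and the circle radius with post length. The function name `layout_posts`, the two scale factors, and the linear scaling itself are assumptions made for the example.

```python
import math

def layout_posts(post_lengths, center=(0.0, 0.0),
                 spoke_scale=0.05, circle_scale=0.01):
    """Place one circle per post, evenly spaced around a central node.

    Spoke length and circle radius both grow linearly with post length.
    The linear mapping and the scale factors are illustrative assumptions.
    """
    cx, cy = center
    count = len(post_lengths)
    placed = []
    for index, length in enumerate(post_lengths):
        # Even angular spacing: post i sits at i/count of a full turn.
        angle = 2 * math.pi * index / count
        spoke = length * spoke_scale
        placed.append({
            "x": cx + spoke * math.cos(angle),
            "y": cy + spoke * math.sin(angle),
            "radius": length * circle_scale,
        })
    return placed
```

Passing a list of post lengths (in characters or words, whichever the data provides) returns one position and radius per post, ready to be drawn as a line from the center plus a circle at its end.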

This blog shows a different post pattern, with more long posts and more very short posts:

This blog, by contrast, consists of many very short posts, with only a few variations:

This simple visualization allows me to look at the entire set of Pressible blogs and quickly identify those which are most active and those with interesting posting patterns. Conversely, it lets me see how many blogs have been deleted (these have black central nodes) and which are either new or rarely used (many of the blogs share an identical pattern of 5 posts, which is the Pressible default).

The next step is to bring time into the picture. While the posts are ordered chronologically, they are currently evenly spaced around the central node. In the next version of this tool, I’ll space the posts according to real time, so that I can see posting patterns over time – and so that I can play back the growth of the entire system to see if there are any interesting patterns.
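One way the time-based spacing might work, sketched in Python under some assumptions: each post's angle is its position within a fixed time window, expressed as a fraction of a full turn. The function name `angles_by_time` and the explicit `start` and `end` window (for example, the platform launch date and the present) are hypothetical choices for this example, not part of the existing tool.

```python
import math
from datetime import datetime

def angles_by_time(timestamps, start, end):
    """Map each post timestamp to an angle in radians.

    Angles are proportional to elapsed time within [start, end), so
    posts written close together cluster around the central node,
    and long silences show up as empty arcs.
    """
    span = (end - start).total_seconds()
    if span <= 0:
        raise ValueError("end must be later than start")
    angles = []
    for stamp in timestamps:
        elapsed = (stamp - start).total_seconds()
        angles.append(2 * math.pi * elapsed / span)
    return angles
```

Using the same window for every blog would keep the clusters comparable, and sweeping `end` forward from the launch date is one possible way to play back the growth of the system.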