Amazon Book Map

Aaron Swartz, who runs theinfo.org, contacted me back in January '08 with an interesting data set. He had built a list of 735,323 books by crawling Amazon. Of course a gigantic list is pretty boring, but Aaron had also captured similarity data between books. In particular, he had amassed a whopping 10,316,775 connections (edges) between books Amazon believed were related. This allowed me to throw the data into my old wikiviz engine to spatially lay out a huge mosaic of books (I let it run for 140 hours). Items that were noted as being similar had attractive forces, bringing them together, often into large groups. Unsurprisingly, when we color-coded by Amazon book category, there was an obvious coalescence. The way various high-level categorizations mix and meet also seems fairly logical.
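For readers who want a feel for how attractive forces pull similar items together, here is a minimal Python sketch of a force-directed layout. It is not the wikiviz engine's actual code. The function name `spring_layout`, its parameters, the all-pairs repulsion term, and the per-step movement cap are all assumptions made for illustration; the only part taken from the description above is that connected (similar) items attract each other.

```python
import math
import random


def spring_layout(num_nodes, edges, iterations=200, step=0.05,
                  repulsion=0.01, max_move=0.1, seed=0):
    """Toy force-directed layout (illustrative, not the wikiviz engine).

    edges is a list of (a, b) index pairs for items noted as similar.
    Returns a list of [x, y] positions, one per node.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(num_nodes)]

    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(num_nodes)]

        # Attraction: each edge acts like a spring pulling its ends together.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += dx
            force[a][1] += dy
            force[b][0] -= dx
            force[b][1] -= dy

        # Repulsion (assumed): every pair pushes apart so groups do not
        # collapse to a point. This is O(n^2) and only viable for tiny graphs.
        for i in range(num_nodes):
            for j in range(i + 1, num_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d2 = dx * dx + dy * dy + 1e-9
                fx = repulsion * dx / d2
                fy = repulsion * dy / d2
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy

        # Move each node along its net force, capping the step for stability.
        for i in range(num_nodes):
            mx = step * force[i][0]
            my = step * force[i][1]
            m = math.hypot(mx, my)
            if m > max_move:
                mx *= max_move / m
                my *= max_move / m
            pos[i][0] += mx
            pos[i][1] += my

    return pos


if __name__ == "__main__":
    # Two made-up groups of three mutually similar "books".
    toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    for index, (x, y) in enumerate(spring_layout(6, toy_edges)):
        print(index, round(x, 3), round(y, 3))
```

On a toy graph like this, connected nodes should end up close together while unconnected groups drift apart. The all-pairs repulsion loop would be far too slow for 735,323 books, so a layout at that scale would need some kind of spatial approximation.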

I produced a few versions of what I am dubbing the Amazon Book Map. The first visualization is a huge mosaic of book covers, tinted by their respective category colors. I can't produce this in one go at full resolution because the memory requirements are enormous. The second version uses color-coded dots.

The layout (clustering-wise) is decent, but not great. I don't think my algorithm works all that well for highly unstructured graphs. For those who are curious, I've included a small graph of how the layout converged. Details below.

Book Cover Version - Download Full Resolution JPG (10,296 x 15,444)

Close-up of Book Cover Version

Super Close-up of Book Cover Version (email me if you want this level of resolution).

Dot Version - Download Full Resolution JPG (8580 x 8580)

Color Coding Key - Amazon Book Categories

© Chris Harrison