Since we recently announced our $10,001 Binary Battle to promote applications built on the Mendeley API (now including PLoS as well), I decided to take a look at the data to see what people have to work with. My analysis focused on our second largest discipline, Computer Science. Biological Sciences (my discipline) is the largest, but I started with this one so that I could look at the data with fresh eyes, and also because it’s got some really cool papers to talk about.
What I found was a fascinating list of topics, with many of the expected fundamental papers like Shannon’s information theory paper and the Google paper, a strong showing from MapReduce and machine learning, but also some interesting hints that augmented reality may soon become more of an actual reality.
The top graph summarizes the overall results of the analysis. This graph shows the Top 10 papers among those who have listed computer science as their discipline and chosen a subdiscipline. The bars are colored according to subdiscipline and the number of readers is shown on the x-axis. The bar graphs for each paper show the distribution of readership levels among subdisciplines. Seventeen of the 21 CS subdisciplines are represented, and the axis scales and color schemes remain constant throughout. Click on any graph to explore it in more detail or to grab the raw data. (NB: A minority of Computer Scientists have listed a subdiscipline. I would encourage everyone to do so.)
1. Latent Dirichlet Allocation (available full-text)
LDA is a means of classifying objects, such as documents, based on their underlying topics. I was surprised to see this paper as number one instead of Shannon’s information theory paper (#7) or the paper describing the concept that became Google (#3). It turns out that interest in this paper is very strong among those who list artificial intelligence as their subdiscipline. In fact, AI researchers contributed the majority of readership to 6 out of the top 10 papers. Presumably, those interested in popular topics such as machine learning list themselves under AI, which explains the strength of this subdiscipline, whereas papers like the MapReduce one or the Google paper appeal to a broad range of subdisciplines, giving those papers smaller numbers spread across more subdisciplines. Professor Blei is also a bit of a superstar, so that didn’t hurt. (The irony of a manually categorized list with an LDA paper at the top has not escaped us.)
2. MapReduce: Simplified Data Processing on Large Clusters (available full-text)
It’s no surprise to see this in the Top 10 either, given the huge appeal of this parallelization technique for breaking down huge computations into easily executable and recombinable chunks. The importance of the monolithic “Big Iron” supercomputer has been on the wane for decades. The interesting thing about this paper is that it had some of the lowest readership scores of the top papers within any single subdiscipline, yet folks from across the entire spectrum of computer science are reading it. This is perhaps expected for such a general-purpose technique, but given the strength of AI readership noted above, it’s strange that there are no AI readers of this paper at all.
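To make the “executable and recombinable chunks” idea concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps applied to a word count. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are illustrative assumptions, not code from the paper; a real MapReduce system would distribute each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Recombine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk is mapped independently, so this step could run in parallel.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two chunks standing in for pieces of a large data set.
chunks = ["big data big clusters", "big iron small clusters"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped without reference to any other, the work splits cleanly, and the reduce step only needs the grouped values for its own key.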
3. The Anatomy of a Large-Scale Hypertextual Web Search Engine (available full-text)
In this paper, Google founders Sergey Brin and Larry Page discuss how Google was created and how it initially worked. This is another paper that has high readership across a broad swath of subdisciplines, including AI, but wasn’t dominated by any one of them. I would expect that the largest share of readers have it in their library mostly out of curiosity rather than direct relevance to their research. It’s a fascinating piece of history related to something that has now become part of our everyday lives.
4. Distinctive Image Features from Scale-Invariant Keypoints
This paper was new to me, although I’m sure it’s not new to many of you. This paper describes...