This is the list I used 27 years ago(1) when I taught at Stanford. It's a
mix of what the previous guy (Bob Hagmann of Xerox PARC) used, what I
read at Wisconsin, with a sprinkling of Sun's stuff. Looking this
over, it's missing some of the Sun stuff, so here are some links. There are some
good papers here, especially those that start with "v" (if the formatting
in any of these looks wonky let me know, I have the troff source for
a bunch of these and had to format it just now to get pdfs, might have
screwed something up):
And to toot my own horn, some stuff I did at Sun (and after):
http://mcvoy.com/lm/papers/SunOS.ufs_clustering.pdf
http://mcvoy.com/lm/papers/SunOS.nvram.pdf (proposal, never built)
http://mcvoy.com/lm/papers/lmbench-usenix.pdf
http://mcvoy.com/lm/papers/splice.pdf (hand waving, somewhat built in Linux)
And the Stanford list:
http://mcvoy.com/lm/papers/stanford.txt
(1) Jesus, I'm old. Feels like yesterday I was teaching there.
Yeah so I did that and I ended up with the same thing as sillysaurus, it looks like a bunch of crud jammed together. So I used the code block stuff to make it look reasonable.
Are those my two choices? I googled and HN seems to have very limited formatting.
"Reflections on Trusting Trust" by Ken Thompson is one of my favorites, and fairly self contained.
Most papers by Jon Bentley (e.g. "A Sample of Brilliance") are also great reads and usually pretty short.
I'm a frequent contributor to Fermat's Library (https://fermatslibrary.com), which posts an annotated paper (CS, Math and Physics mainly) every week. In the annotations you will usually find a concise piece of knowledge that helps you understand some part of the paper without having to spend a long time in the "recursive rabbit hole". For instance, in the Bitcoin paper, there is an annotation with a succinct explanation of the essential cryptography concepts (Hash functions, Public Key Cryptography, Signatures) you need to know to understand the paper. And the nice thing is that if you feel so inclined, you can add your own annotations and make the paper easier to grasp for the next person who reads it :)
I'm a grad student and I spend a lot of my time reading CS papers.
It is amazing that you want to start reading CS papers and I highly encourage it. However, if you don't have a CS degree I think papers might be remarkably off-putting (they usually are even to early grad students). I suggest you start with textbooks instead. Papers suffer from not being rewritten once they are published, hence from a pedagogical point of view, they are usually not explained as well as they could be after digesting them for many many years. Another problem is that most papers contain multiple ideas some of which rise and shine and the other ones die (often for good reasons). It is not easy to spot which is which without knowing about the wider context, which you naturally lack as a beginner.
If you insist on reading papers, however, at least don't read them in linear order. A good section order is abstract -> intro -> conclusion -> related work (because it uses comparisons, which help) -> background -> evaluation (if it exists!) -> technical chapters (those are usually best read in linear order). If there are proofs, read them only if you must! If a proof is presented in the paper, that is a good indication that it contains multiple subtle points which are tough to understand even if you are an experienced researcher.
Could you suggest some textbooks? And if you have the time, a word on why particularly book X and not book Y in the same subject?
There’s so much noise online now about these topics that it’s very hard to figure out where to focus and which resources are worth investing time into.
Lamport's 'Time, Clocks and the Ordering of Events in a Distributed System'. Very readable and, if you are new to distributed systems, a really fast way to get a useful new mental model.
Finally got around to reading that paper after seeing your comment. What a great piece! The logic was clearly laid out and simple to follow. I'll have to read through the proof in the appendix later.
Thanks for the feedback! I've assigned that paper to hundreds of 4th year undergrads, and I always enjoy discussing it with them. I've also had my grad students read it as a model for how to structure a paper.
That is a really wonderful paper, indeed. It's very clearly written (as anything by Lamport), and provides a simple and elegant solution to a complicated problem.
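For anyone who wants to see the core idea before reading the paper, here is a minimal sketch of a logical clock in the spirit of Lamport's rules. The class and method names are my own, not from the paper, and it leaves out the total ordering and the mutual exclusion example entirely.

```python
class LamportClock:
    """Toy logical clock; names are illustrative, not from the paper."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A process increments its clock between successive local events.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the message carries the sender's timestamp.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, move the clock past both the local time and the
        # timestamp carried by the message.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                  # local event on a
stamp = a.send()          # a sends a message stamped with its clock
b_time = b.receive(stamp) # b's clock jumps past the stamp
```

The point is only that a receive is always timestamped later than the matching send, which is what gives you the "happened before" ordering without any physical clock.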
I'm surprised no one has mentioned "the morning paper" (https://blog.acolyer.org/) yet. Some papers are more advanced but a lot I found are definitely approachable. You can subscribe to the mailing list and get a new paper every day, along with the author's summary!
That aside, I'm very impressed how he can keep up with this rhythm.
I like Pugh's paper on skip lists [1], Shannon's "Mathematical Theory of Communication" [2], and these two might be a stretch but I also like Rong's "Word2vec Parameter Learning Explained" [3] and Levy & Goldberg's "Word2vec Explained" [4]. In any case, use the recommended papers to learn paper-reading skills, e.g., look at references when you don't understand a concept, find an introductory textbook to clarify a proof method, and write a summary to make your understanding concrete. Good luck!
I really like 'Hints for Computer System Design', Butler Lampson [0]. He passes on useful experience without demanding much in preparation from the reader.
I suggest Alan Turning's 1950 Computing Machinery and Intelligence. Many of the ideas first written in this paper are commonly referenced. It's good to have read the primary source. Do machines think? If you know what a theoretical Turning Machine is you'll have all the prerequisite knowledge. If not you can skip over that part. Do they still teach Turning Machines in Highschool?
It’s also a pretty common autocorrect. A fair few of us post using our phones and I’m always horrified to see what actually got posted when I look at it 61 minutes later.
Ha, I literally had 'What are Turing machines?' as my Question of the Week. Thanks for the link. Any other good links to help me find the best answer to this question?
Pixar published a lot of great, groundbreaking graphics papers in their early days. (They still do, but as the field is more mature now there's a lot more background reading required.)
For example, I think this one on their rendering pipeline REYES (Render Everything You Ever Saw) is pretty readable, and gives a great overview of how they rendered stuff like Red's Dream, in the years leading up to Toy Story: http://graphics.pixar.com/library/Reyes/paper.pdf
(Edit to add: in fact, just check out the overviews at http://graphics.pixar.com/library/ which are much better than I can describe here)
I suggest that the undergrads be encouraged to go to Arxiv and browse papers, try to work through them and see the range of papers.
The reason for this is that many papers aren't all that well written and well argued. Many are, to be sure, but it will get undergrads to understand what is clear, what is not.
It will also ensure that they are not intimidated by papers and the math on them. They should know they can dive into one and learn something and come out the other end.
But if you want specific recommendations, I find that the HCI (Human Computer Interaction) papers are very readable. Maybe it's the people drawn to the field?
Look for survey papers in the fields you are interested in. These papers are written exactly for your situation - to provide a glimpse of the field assuming little prior knowledge.
regarding:
>>Plus, you can ignore the math part and yet appreciate the beauty of Bitcoin
You can appreciate the beauty of a car and even be a mechanic without physics, chemistry, and mathematics (!), but to understand cars...
It’s hard reading random papers like that without deeply understanding the problem they are trying to solve. Sometimes understanding the true problem is more difficult than the solution in the papers.
What I would suggest instead as a learning exercise is to pick a domain you want to tackle and reinvent the wheel by implementing it. While doing so you’ll naturally find yourself digging into research papers. The advantage here is that during implementation you’ll have understood the problem and the context much better and can relate to what the authors are discussing and trying to solve, instead of moving backwards from solution to problem.
Margaret Martonosi at Princeton runs a Great Moments in Computing course where students read through a number of papers that describe, well, great moments in computing.
The papers in it are all well worth reading and are mostly accessible (with some effort) to a first-year grad student (i.e., anyone with an undergraduate degree in CS or something related).
PS. Don't make the mistake of reading easy papers just because they're easy to read.
The Byzantine Generals Problem's only prereq is familiarity with basic mathematical notation and a willingness to read carefully. The paper is seminal; it articulates a major computer security concern, the concern that rogue nodes in a network may lie in their communications.
"Congestion Avoidance and Control" by Van Jacobson looks at how TCP, the fundamental protocol behind the internet, works (or didn't work to begin with). It's a pretty easy paper to digest without too many dependencies, especially if you skip the maths (which is explained intuitively anyway). And it's a fascinating read. I highly recommend it!
Look for older papers. They are for various reasons often much easier to read. Here is one example from 1997: https://cadxfem.org/inf/Fast%20MinimumStorage%20RayTriangle%... It's only seven pages and the math is simple to understand for someone with basic linear algebra knowledge. Despite that, it has been cited thousands of times.
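To give a feel for how little math is involved, here is a rough sketch of a ray/triangle test, assuming the linked paper is Möller and Trumbore's "Fast, Minimum Storage Ray/Triangle Intersection". The function name, the tuple-based vectors, and the `eps` tolerance are my own choices, and this is the non-culling variant only, so treat it as an illustration and not as the paper's reference code.

```python
def ray_triangle(orig, direction, v0, v1, v2, eps=1e-9):
    """Return (t, u, v) if the ray's line hits the triangle, else None."""

    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is (nearly) parallel to the triangle's plane
    inv_det = 1.0 / det

    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv_det
    if u < 0.0 or u > 1.0:
        return None  # outside the triangle along the first barycentric axis

    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None  # outside along the second barycentric axis

    t = dot(edge2, qvec) * inv_det  # distance along the ray (may be negative)
    return (t, u, v)


hit = ray_triangle((0.25, 0.25, -1.0), (0.0, 0.0, 1.0),
                   (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
miss = ray_triangle((2.0, 2.0, -1.0), (0.0, 0.0, 1.0),
                    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

It is all dot and cross products: `u` and `v` are barycentric coordinates of the hit point, and `t` is how far along the ray it lies. A caller that only wants hits in front of the origin would additionally check `t > 0`.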
Kind of off-topic, but the world is really hurting for something like a distributed version of Zotero; a standardized format for collecting, distributing, and collaborating on reference lists. It's a shame that the Zotero team has no plans to make the Zotero sync protocol itself distributed or self-hostable, because in my (probably naive) opinion it would be a real killer application for both universities and private researchers. If I had enough money to fund this development myself I'd gladly do it, and if someone else kicked off the project I'd also gladly donate.
The only free-as-in-freedom-ish alternative to Zotero I know of is JabRef[0] plus a browser extension like JabFox[1] and either a manual WebDAV/Rsync synchronization solution or Bibsync[2], which hasn't been updated in 3 years. And it still lacks the collaboration utility of a Zotero public library.
The Raft paper [0] is a great read, but some context on distributed systems and the importance of consensus algorithms is probably a prerequisite. Once you understand the context, it's a nice read that is small enough to hold in your mind after one or two reads.
Why do you think best paper award articles make good reading for beginners? It's usually quite the opposite, they require a good knowledge of the field they contribute to.
Or you could start with something you already knew, but in the form of a research paper to see if you could figure that out with just reading the paper. It's reading from the source: You understand the quirks and the intent of the original author.
I would say none! Research papers are usually not written to be understood by an undergraduate student. Besides, older papers may be hard to read because they were written in a very different context. IMHO, it's better to find well-written, pedagogical and contemporary textbooks on the topic you find interesting.
A related question is "If I had the best professor in the world, what gems would they mention as examples of the beauty of computer science". These things maybe should be on the syllabus, but gems aren't necessarily seen as appropriate stepping stones in a learning process, and may only be accessible to the most capable students. Textbooks also, arguably, rarely offer charismatic enthusiasm or a glimpse of the sublime; they have to adopt a workaday attitude and play it safe.
Things that occur in books that might fit this bill of sublimity, things that the very best practitioners get excited about:
Cormen names Tarjan's analysis of the complexity of the disjoint set union-find algorithm as his favourite thing[1]. Knuth is an appreciator of Tarjan's algorithm for finding strongly connected components[2].
Tarjan emerges as a contender for the algorithmicist's algorithmicist award.
[1] "It's not an algorithm, but a data structure. I've always marveled at the simple tree-based data structure for disjoint-set union, using union by rank and path compression (Section 21.3 in the third edition of CLRS). The code is amazingly simple, the data structure operations take just barely superlinear time, and the analysis (by Bob Tarjan) blows my mind."
[2] "While I was preparing for Volume 4 of TAOCP in the 90s, I wrote several dozen short routines using what you and I know as "literate programming." Those little essays have been packaged into The Stanford GraphBase (1994), and I still enjoy using and modifying them. My favorite is the implementation of Tarjan's beautiful algorithm for strong components, which appears on pages 512–519 of that book."
and
"The data structures that he devised for this problem fit together in an amazingly beautiful way, so that the quantities you need to look at while exploring a directed graph are always magically at your fingertips. And his algorithm also does topological sorting as a byproduct."
Finding this kind of enthusiastic recommendation is always a joy.
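Since the Cormen quote says the code is amazingly simple, here is a minimal sketch of disjoint-set union with union by rank and path compression. The class and method names are mine, and the path compression is written iteratively where the textbook version is recursive, so take it as one possible rendering and not the CLRS code.

```python
class DisjointSet:
    """Disjoint-set forest over the elements 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.rank = [0] * n           # upper bound on each tree's height

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path
        # directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same set
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


ds = DisjointSet(5)
ds.union(0, 1)
ds.union(3, 4)
same = ds.find(0) == ds.find(1)
different = ds.find(1) != ds.find(3)
```

The code really is that short; the mind-blowing part the quote refers to is the analysis showing the amortized cost per operation is bounded by the inverse Ackermann function, which is not something the code itself makes obvious.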
It sounds very basic, but I highly recommend 'The Annotated Turing'[0] to any beginner in Computer Science. It's a walk through Turing's original 36-page paper on Turing Machines, and requires only high school level math to understand. I picked it up early in my CS undergrad and it blew my mind. I suddenly understood what a computer was.
http://www.sciencedirect.com/science/article/pii/01676423879... It is incredible how many concepts are clearly explained in so few pages (30). It is the foundation of the Caml Light language that gave birth to OCaml. It explains lambda calculus, the semantics of ML-family languages, the SECD machine, and how to carry out proofs in CS. IMHO, it requires at least 5 days to fully understand it if everything is new to you.
Two classics that don't require a whole lot of background are:
- An Axiomatic Basis for Computer Programming by C. A. R. Hoare
- The Next 700 Programming Languages by P. J. Landin
The most recent paper I've read is:
- Storage strategies for collections in dynamically typed languages - Bolz, Diekmann, Tratt
I blog about some papers I've read on my blog [0], and these three papers I've mentioned have their own post, where I summarize key ideas of the papers. Take a look if you're interested.
Among the Turing Award lectures, I would recommend John Backus' "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs"
What do you consider to be basic CS? e.g. would knowledge of A* or dynamic programming be considered basic CS? And are you just looking for possibly interesting papers (even if written by completely random people nobody's heard of), or more well-known ones?
What paper do you mean for Bitcoin? The original white paper from Satoshi? I loved the BitTorrent paper. I'd add the papers on HMMs and NFS. My list of favorite papers :) The oldies used to be better. Kinda sad really.
This is the seminal paper that laid the foundation for RDBMS systems.
At my university (NUS) the CS program had a research paper reading program, where sophomores were tasked to read and summarize a well known paper and this was one of the most popular.