Annotating the news

By pbrantley | 23 July, 2013

The desire to engage in a conversation – and to debate – is central to human experience. We're trying to bring that capability to the web. We realize that how and what we annotate is likely to differ considerably depending on the subject and the material. Journalism, policy, law, scholarly publishing, and classroom uses are all communities that we believe have a high affinity for annotation, and each is likely to integrate different annotation forms and functionalities into its workflows. Annotation on the web is just now reaching adolescence, and in most cases we don't know how it will be adopted – in other words, the practice of annotation is an emergent phenomenon.

That's why we are organizing a series of "tiger teams" – small groups of stakeholders within each of these communities that we will gather together for a day, holding informal conversations about how annotation and commentary might be used by both lay and professional practitioners. We just held the first of these in San Francisco, drawing together a small group of working journalists, and it was fabulous.

We were thrilled to convene Declan McCullagh, our host at CNet/CBS Interactive; Michael Coren, a journalist and entrepreneur of PubLet; Mike Masnick of TechDirt; Violet Blue, a well-known independent reporter and blogger; Dan Gillmor of Arizona State University; Jim Giles of MATTER and now, by virtue of acquisition, also of Medium; and Jon Mitchell of The Daily Portal. Our conversation ranged from theoretical issues to technical practice to matters of law, and it illuminated the opportunities and challenges faced by a read/write web.

Some of the most basic things we heard are already on our roadmap: awareness of the identity of the original author, and moderation of commentary; analytics that show which articles are being read and commented on most, and which segments of an article are generating the greatest debate; and alerts for specific actions – one hypothetical example was, “The annotation system should inform me if Snowden comments on my NSA article.”

An additional challenge, typical of journalism, where articles may change heavily as stories develop, is the need to preserve the association between an annotation and the target text it originally selected, even when that text changes significantly over time. On the flip side, annotations ensure a form of permanence for reporters, preserving the context of the comment even if the entire original article has disappeared from the web.
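One way to picture the re-attachment problem described above is a minimal sketch in Python. It assumes the annotation stored the exact quoted passage; the function name `reanchor` and the 0.8 similarity threshold are illustrative choices, not part of any existing annotation system. The sketch first looks for the quote verbatim and, failing that, falls back to an approximate match using the standard library's `difflib`.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple


def reanchor(quote: str, text: str, threshold: float = 0.8) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the best match for `quote` in `text`.

    Returns None when nothing in `text` is similar enough to the quote.
    The name and the threshold are hypothetical, chosen for illustration.
    """
    # Fast path: the quoted passage survived the edit unchanged.
    start = text.find(quote)
    if start != -1:
        return start, start + len(quote)

    # Slow path: slide a quote-sized window across the revised text and
    # keep the window that is most similar to the stored quote.
    window = len(quote)
    best_ratio, best_start = 0.0, -1
    for i in range(max(1, len(text) - window + 1)):
        ratio = SequenceMatcher(None, quote, text[i:i + window]).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, i

    if best_ratio >= threshold:
        return best_start, best_start + window
    return None  # the annotation is orphaned; the quote no longer appears


quote = "collected phone records of millions"
revised = "Officials said the agency collected telephone records of millions of customers."
print(reanchor(quote, revised))
```

The sliding window is deliberately naive and slow on long articles; a production system would more likely store surrounding context with each quote and use a dedicated approximate-matching algorithm. When the function returns None, the stored quote itself is what preserves the context of the comment.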

There were in-depth conversations about specific topics of interest. Several reporters raised the virtue of an annotation system serving as a smart highlighting service to aid in the preparation and editing of stories. Annotation structures can associate flexible and searchable metadata with highlights, enabling reporters to log what was of specific interest from their source documents. Instead of having a “dumb” cache of quotes and text, journalists can create their own databases of notes with commentary.
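As a rough illustration of the difference between a "dumb" cache of quotes and a searchable database of notes, here is a minimal Python sketch. The `Highlight` and `Notebook` classes and their fields are hypothetical, invented for this example rather than drawn from any particular annotation tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Highlight:
    """One highlighted passage plus the reporter's own metadata (hypothetical schema)."""
    source_url: str
    quote: str
    note: str = ""
    tags: List[str] = field(default_factory=list)


class Notebook:
    """An in-memory collection of highlights that can be queried."""

    def __init__(self) -> None:
        self._items: List[Highlight] = []

    def add(self, highlight: Highlight) -> None:
        self._items.append(highlight)

    def by_tag(self, tag: str) -> List[Highlight]:
        # Tag lookup is case-insensitive so "Budget" and "budget" match.
        wanted = tag.lower()
        return [h for h in self._items if wanted in (t.lower() for t in h.tags)]

    def search(self, term: str) -> List[Highlight]:
        # Full-text search over both the quoted passage and the commentary.
        wanted = term.lower()
        return [h for h in self._items
                if wanted in h.quote.lower() or wanted in h.note.lower()]


notebook = Notebook()
notebook.add(Highlight(
    source_url="https://example.com/report.pdf",  # placeholder URL
    quote="Spending rose in every department.",
    note="Check against 2012 figures.",
    tags=["Budget", "followup"],
))
for hit in notebook.by_tag("budget"):
    print(hit.quote, "|", hit.note)
```

Because each highlight carries its source, commentary, and tags together, a reporter can later ask for everything tagged for follow-up or everything mentioning a particular year, which a flat file of pasted quotes cannot do.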

Authors also noted that an annotation system would enable them to associate additional material that was not published with the final story but was relevant or instrumental to the reporting or analysis. Further, a journalist could annotate a source document and then link that to the published article, creating a new form of journalism that is more richly textured than what is commonly presented today. Of course, this same functionality exists for readers as well, and for both the author and the reader of news, annotations provide not merely a vehicle for reader-to-reader dialogue, but a robust means of fact-checking.

For publishers, as more and more of both the discovery and consumption of content moves to the web, enabling readers to engage with stories – and, in some cases, with the original journalists – is enticing. For the last few years, newspapers and magazines have “outsourced” commentary to external platforms such as Facebook and Disqus, and while that traffic often appears to be on the same web site as the news publication, it actually resides elsewhere. It is more beneficial for publishers to retain the usage data that readers generate, enabling more informed engagement and creating more attractive, data-derived profiles for advertising CPMs.

Finally, the discussions around both reader engagement and journalist workflow brought attention to one of the more nebulous but important concerns: legal issues impacting annotation. Around the table, there was a clear consensus that selecting target texts and commenting on them was an obvious Fair Use. However, it is conceivable that a user could select too much of the source material, or otherwise approach annotation in a way that might trigger copyright liability. Further, it is unclear what would happen if a source document was itself taken down as a result of a DMCA notice. What would be the resulting state of annotations based on such a document?

There was agreement that defaulting to open – in other words, assuming that a publisher supports annotation – was the best starting point, but that permitting publishers to suggest the preferred level of commenting might be desirable. This might be accomplished via the equivalent of a robots.txt, which is used to demarcate material excluded from web harvesting. A similar “notes.txt” might set forth publisher desires for the amount and extent of selection accepted by annotators using open systems. While that might not be a technically insuperable barrier, like robots.txt it might develop into the equivalent of a legally acknowledged practice.
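To make the “notes.txt” idea concrete, here is a purely speculative Python sketch. No such format exists; the directive names `Annotation` and `Max-Quote-Chars`, the sample file, and both function names are invented for illustration. The sketch reads publisher preferences and checks a selection against them, defaulting to open when nothing is stated.

```python
from typing import Dict

# Hypothetical file contents; the directive names are invented for illustration.
SAMPLE_NOTES_TXT = """\
# notes.txt for example.com (hypothetical format)
Annotation: allow
Max-Quote-Chars: 400
"""


def parse_notes_txt(body: str) -> Dict[str, str]:
    """Parse 'Key: value' lines, ignoring blank lines and '#' comments."""
    prefs: Dict[str, str] = {}
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        prefs[key.strip().lower()] = value.strip()
    return prefs


def selection_is_welcome(prefs: Dict[str, str], quote: str) -> bool:
    """Check a selected passage against the publisher's stated preferences."""
    # Default to open: a missing file or directive means annotation is assumed welcome.
    if prefs.get("annotation", "allow").lower() != "allow":
        return False
    limit = prefs.get("max-quote-chars")
    if limit is not None and limit.isdigit():
        return len(quote) <= int(limit)
    return True


prefs = parse_notes_txt(SAMPLE_NOTES_TXT)
print(selection_is_welcome(prefs, "A short quoted sentence."))
```

As with robots.txt, nothing here would technically stop an annotator; the file only expresses a preference that well-behaved tools could choose to honor.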

These conversations suggested many opportunities. Overall, we were greatly cheered by the enthusiasm for annotation and the sense that not only was it rapidly developing, but it would inevitably reshape the practice of discussion online. We look forward to taking these insights into our next tiger team in mid-August on how we should annotate the law, coordinated with our partners at the Berkman Center for Internet and Society at Harvard. We can’t wait!
