Saturday, April 17, 2010

Library of Tweets


So, have you heard the news?  The Library of Congress is going to be archiving every tweet that has ever been publicly twitted, or tweeted, or twittered, or whatever.  The announcement was first made on Twitter in an April 14th tweet that simply stated:  "Library to acquire ENTIRE Twitter archive -- ALL public tweets, ever, since March 2006! Details to follow."  The details could be found on the Library of Congress blog, in a post published the same day by Matt Raymond and entitled "How Tweet It Is!: Library Acquires Entire Twitter Archive."  Here's how it starts out:


Have you ever sent out a “tweet” on the popular Twitter social media service?  Congratulations: Your 140 characters or less will now be housed in the Library of Congress.

Someone I follow on Twitter quipped that he could now say that he has several thousand volumes in the Library of Congress.  I don't remember who it was who tweeted that, but I guess now I could look it up in the card catalog?  Er, I mean, database.  

But I have to wonder how many folks out there, who were quite content to have their impulsive thoughts and expressions broadcast to a limited number of followers and archived on their somewhat obscure Twitter profile page (where any tweet you think better of can be deleted), are now feeling a bit uneasy at the prospect of having this material permanently on file as part of our national archives.

Matt's blog post continues:

That’s right.  Every public tweet, ever, since Twitter’s inception in March 2006, will be archived digitally at the Library of Congress. That’s a LOT of tweets, by the way: Twitter processes more than 50 million tweets every day, with the total numbering in the billions.

We thought it fitting to give the initial heads-up to the Twitter community itself via our own feed @librarycongress.  (By the way, out of sheer coincidence, the announcement comes on the same day our own number of feed-followers has surpassed 50,000. I love serendipity!)

We will also be putting out a press release later with even more details and quotes.  Expect to see an emphasis on the scholarly and research implications of the acquisition.  I’m no Ph.D., but it boggles my mind to think what we might be able to learn about ourselves and the world around us from this wealth of data.  And I’m certain we’ll learn things that none of us now can even possibly conceive.

Well, as long as it's educational, right?  Sure, it's research.  Everything is potentially research material, of course.  But is any of it of any lasting value otherwise?  Here's what Matt has to say:

Just a few examples of important tweets in the past few years include the first-ever tweet from Twitter co-founder Jack Dorsey (http://twitter.com/jack/status/20), President Obama’s tweet about winning the 2008 election (http://twitter.com/barackobama/status/992176676), and a set of two tweets from a photojournalist who was arrested in Egypt and then freed because of a series of events set into motion by his use of Twitter (http://twitter.com/jamesbuck/status/786571964) and (http://twitter.com/jamesbuck/status/787167620).

I can't help but imagine Neil Armstrong stepping off of the lunar module ladder and tweeting, "That's one small step for man..."   How about Harry Truman retweeting, "RT @chicagodailytrib Dewey Wins (LOL)" or Lincoln tweeting, "Foursquare and 7 yrs ago..." (sorry, couldn't resist the pun).

So, of course, this vastly enhances the value of all of our tweets, and of Twitter itself.  And Twitter was certainly not reluctant to crow about it on its own Twitter Blog here on blogspot, in a post also dated April 14th, written by Twitter co-founder Biz Stone, aka @biz, where he goes on to state, after providing the basic background information:

It is our pleasure to donate access to the entire archive of public Tweets to the Library of Congress for preservation and research. It's very exciting that tweets are becoming part of history. It should be noted that there are some specifics regarding this arrangement. Only after a six-month delay can the Tweets be used for internal library use, for non-commercial research, public display by the library itself, and preservation.

The six-month cooling-off period is a relief, and I assume that it is also a grace period during which folks can delete tweets that they don't want to be preserved in such an official capacity, but that's not entirely clear.  Biz goes on to make a second announcement of related significance:

The open exchange of information can have a positive global impact. This is something we firmly believe and it has driven many of our decisions regarding openness. Today we are also excited to share the news that Google has created a wonderful new way to revisit tweets related to historic events. They call it Google Replay because it lets you relive a real time search from specific moments in time.

Google Replay currently only goes back a few months but eventually it will reach back to the very first Tweets ever created. Feel free to give Replay a try—if you want to understand the popular contemporaneous reaction to the retirement of Justice Stevens, the health care bill, or Justin Bieber's latest album, you can virtually time travel and replay the Tweets. The future seems bright for innovation on the Twitter platform and so it seems, does the past!

So, Google's announcement was made on the Official Google Blog here on blogspot, in a post entitled "Replay it: Google search across the Twitter archive."  Here's what its author, Dylan, has to say:

Since we first introduced real-time search last December, we’ve added content from MySpace, Facebook and Buzz, expanded to 40 languages and added a top links feature to help you find the most relevant content shared on updates services like Twitter. Today, we’re introducing a new feature to help you search and explore the public archive of tweets.

With the advent of blogs and micro-blogs, there’s a constant online conversation about breaking news, people and places — some famous and some local. Tweets and other short-form updates create a history of commentary that can provide valuable insights into what’s happened and how people have reacted. We want to give you a way to search across this information and make it useful.

Starting today, you can zoom to any point in time and “replay” what people were saying publicly about a topic on Twitter.

This is followed by specific information on how to use Google Replay.  Dylan concludes by saying, "All of us are just beginning to understand the many ways real-time information and short-form web content will be useful in the future, and we think being able to make use of historical information is an important part of that."  And this does raise an interesting question: what is history?

That's been the subject of considerable debate, discussion and disagreement, and it is perhaps best to employ the general semantics technique of turning the singular into plural, and recognizing that the term history refers to many things.  There are the actual events of the past, which cannot be retrieved.  There are individuals' perceptions of those events, and individuals' memories based on their perceptions.  There are the records, documents, and artifacts of the past that provide some indications about what occurred.  And there are the chronologies and narratives that we put together to organize historical information,  to explain what happened and why it happened.

One of the fundamental trade-offs in history is the importance of the eyewitness, of actual presence, first hand accounts, and failing that, primary evidence.  Without data derived from such sources, history writing is reduced to little more than conjecture.  But we also have to keep in mind that perceptions, however direct, can be biased or faulty, that reports can be misleading, and that evidence can be interpreted in different ways (points all covered in general semantics).  

Moreover, it is also understood that sometimes you need some distance from events, in space but especially in time, to truly put them into their appropriate context, understand their import, and identify all of the significant interrelationships.  And this is a task that is never quite completed; history remains an open book that is constantly subject to revision (which is not to suggest that it is a complete fabrication or construction with no basis in fact).  In this sense, history is both a story that is told and retold by a succession of storytellers, and a science that is updated in accordance with new discoveries and new theories.

There are a number of expressions about the relationship between journalism and history: that journalism is history in a hurry, history on the run, history on speed, history on wheels, and most prominently, that journalism is history's first draft.  So what does that make Twitter?  History's notes?

From a media ecology perspective, we would have to say that, along the lines of there being different meanings to the word history, different media environments give us different types of historical accounts, different forms of history.  Oral cultures rely on oral tradition, and therefore myths and legends.  Writing gives us history as we know it, and printing gives us history as scholarship and science, and a historical consciousness.  Audiovisual media such as the photograph, audio recording, and moving image, have drastically changed our notions of history, and opened up a gulf between the events of the last century and a half and all that came before.  Why do you think World War Two is so much more present to us than World War One?  

And now this...

Not unrelated to changing concepts of history are the changing concepts of the library, as it becomes a data center and digital archive (see my brief essay, "The Medium is the Memory").  And with that, let's return to the blog post from the Library of Congress:

So if you think the Library of Congress is “just books,” think of this: The Library has been collecting materials from the web since it began harvesting congressional and presidential campaign websites in 2000.  Today we hold more than 167 terabytes of web-based information, including legal blogs, websites of candidates for national office, and websites of Members of Congress.

We also operate the National Digital Information Infrastructure and Preservation Program www.digitalpreservation.gov, which is pursuing a national strategy to collect, preserve and make available significant digital content, especially information that is created in digital form only, for current and future generations.

In other words, if you’re looking for a place where important historical and other information in digital form should be preserved for the long haul, we’re it!

So, the library, as we once knew it, is history.  Now that is something to tweet about!   

And while we're at it, here's another Twitter testimonial for you:



Where would you put that in your Dewey Decimal system?  Good thing the Library of Congress doesn't have to worry about it!
