Heretical Librarian: Google and its Enemies...

is the title of an interesting Jonathan Last essay that was the cover story in the December 10 Weekly Standard. Last makes a good case against the Google book digitization project. His arguments, though, are ultimately unconvincing.

Last makes copyright the centerpiece of his critique. It is a lengthy essay, but the following passages give a sense of his objections:

Blake Field sued Google for copying and caching 51 works from his website. The court ruled in Google's favor, citing in particular the ease of Google's "opt out" feature, but the decision was based in part on dubious grounds. The court said that Field had "invited" Google's spiders--web robots which crawl through the Internet cataloguing and indexing pages for a search engine--by not including code on his website which discouraged them. In other words, by not telling Google to stay away, Field was asking to have his copyright violated. It's the intellectual property version of "She wore a red dress to the bar on Saturday night."

[...]

The Internet has become, like the 17th-century printing press, incapable of observing copyrights. In the same way the printing press encouraged the mass production of books and magazines and newspapers, the Internet cries out for the distribution of all information--everything from blog entries to pictures to books. And as it distributes all of this information, it exerts a leveling force that diminishes the value of everything it touches. There is no reason that the Internet, unlike the printing press before it, should be exempt from the same protections of creative value. Yet, this is what Google's defense would achieve.

As you can tell, Last is a bit of a traditionalist. His piece oozes disdain for this newfangled Internet thing. Personally, I find his account of the Field case didactic if not downright Luddite. What he finds to be "dubious grounds", I call simple common sense. If Field didn't want anyone to be able to copy his works, then why did he post them online to begin with? Plus, as the court pointed out, all he had to do was put a "robots.txt" file on his web site to make it Google-proof. It's certainly a lot cheaper and easier than filing a lawsuit.

Last also misses an important distinction: Google's policy of caching online materials is about providing access to works produced by others; it is not the same thing as republishing something without permission in a print environment. The argument that the Internet is "incapable of observing copyrights" is also spurious. Numerous publishers have built subscription firewalls into their web sites. Yes, you can find their articles via Google; but you have to pay if you want the full text and aren't already a subscriber.

Should Google or other search engines really have to send letters to the owners of every single web site asking express permission to index and/or cache their materials? No, this would be ridiculous, especially when such sites can make themselves unavailable to search engines with just a little bit of coding. Whatever its flaws as a resource and a company, the good Google does by making online information more accessible far outweighs the negative aspects that Last chooses to emphasize.
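To give a sense of how little coding is involved, here is a minimal sketch of a robots.txt file and how a well-behaved crawler would read it. The rules, the example.com addresses, and the can_crawl helper are all hypothetical and chosen for illustration; they are not taken from the Field case. The sketch uses Python's standard urllib.robotparser module to interpret the file the way a compliant spider would.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (an assumption, not
# anything from Field's actual web site). It turns Googlebot away from
# the whole site and keeps every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        for path in ("/poems/one.html", "/private/draft.html"):
            url = "http://example.com" + path
            print(agent, path, can_crawl(agent, url))
```

Keep in mind that robots.txt is a voluntary convention: it works only because the major search engines choose to honor it, which is the point the court relied on.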

Last is on a bit firmer ground in his critique of Google's library digitization project. After all, in this case Google is actively digitizing print content, not simply providing access to what others have posted:

Yet even if Google finds a way to realize its dreams, it's unclear exactly how useful the Book Search would ever be for the average user. Is there value in seeing "snippets" of this or that text? The only way the project could really achieve its goal of disseminating knowledge to the masses would be by ignoring copyrights and putting all texts into the public domain. Which is, of course, what the logic of the Internet ultimately wants. "Information wants to be free," according to one of the web's founding mantras.

If Google was a different company, with a different set of motivating principles, it might well have constructed its Library project along the lines of Apple's iTunes model--that is, it would have spent time and money not perfecting a mass scanning operation designed to gobble up as many pages as possible per hour, but in securing the rights to a large catalogue of books which it could then sell as downloads. After all, it's not as though the current delivery mechanism for books is in any way optimal.

But this concept is beyond its ken. Google's corporate philosophy is based on the model which brought them success: organizing and giving away other people's content, creating space for advertisements in the process. The enormous success Google found with that model in the search engine business spurred it to try and impose it in every arena. In the Google worldview, content is individually valueless. No one page is more important than the next; the value lies in the page view. And a page view is a page view, regardless of whether the page in question has a picture of a cat, a single link to another site, or the full text of Freakonomics. When all you're selling is ad space, the value shifts from the content to the viewer. And ultimately the content is valued at nothing. And here, finally, is the larger problem posed by Google's actions. Books are not in any important sense user-centric. Whether or not a book has readers matters little. Books stand on their own, over time, as ideas and creations. In the world of books, it is the ideas and the authors that matter most, not the readers. That is why the copyright exists in the first place, to protect the value of these created works, a value which Google is trying mightily to deny.

Again, despite a few good points, Last has oversimplified things. Yes, people prefer full text when they're online. However, his notion that "the logic of the Internet" will lead inevitably to everything being fully available online is less than realistic. In a way, it's a mirror image of the belief that the Internet will become the long-sought-after "universal library". Even if every item that Google has digitized is made full text, it would still be a small percentage of everything that has ever been published.

It's also simplistic to suggest that the "only way" Google's book search would be useful is if it's all full text. By that standard, online library catalogs are not helpful either. Google Book Search is a very useful resource for finding books on a topic, even more so than most library catalogs because Google refers you to specific pages that contain your search terms. If the book looks relevant, the user can either purchase a copy or get it from their local library.

Last complains that Google is "organizing and giving away other people's content". As I noted earlier, this is absurd. Google is a tool to find what's available on the web. Using the same standard, Last could equally accuse libraries of "giving away other people's content". In terms of Google Book Search, copyrighted content is not being given away. The user can only retrieve a few pages at most. Last argues elsewhere that Google is still making money off of scanned copyrighted books through advertising revenue. This is likely true. However, this could easily be remedied by a revenue sharing arrangement with publishers.

Last's final paragraph makes it clear that his opposition to Google and its digitization project transcends such mundane concerns as copyright infringement and revenue. I personally love books, or else I wouldn't be in the profession I'm in. Last, however, objectifies them. In his view, books "stand on their own, over time, as ideas and creations". Implicit in his essay is an attitude that the very act of digitizing a book is a form of desecration, turning it into just another valueless page view.

Last is right that the online environment transforms how users find and perceive information. This does not, however, validate his argument. In fact, it directly contradicts his case that a print standard of copyright should be applied in the digital environment. It also ignores that people are usually content to settle for what they can easily find on the web when doing research. Unfortunately, for many users, it's not a question of either using a page of Freakonomics found online or using the print version. They're going to go online and use what they find in Google regardless. Wouldn't you prefer that those search results include references to books?

The Internet has dramatically changed how people find and use information and has greatly complicated the application of copyright law. Even assuming that Last's negative depiction of Google is correct, the company is merely a symptom of these broader trends. Users need some way to find information on the web; by meeting that need Google has done far more good than harm. Yes, there is definitely a "buyer beware" quality to much of what is posted online. This is why the user needs to exercise critical thinking skills when using the web. Sadly, this requirement is honored more often in the breach than in the observance.

Halting Google's library digitization project will not change this situation. In fact, by decreasing online users' access to and awareness of books, it is likely to exacerbate it.

Monday, December 17, 2007
