Archiving online citations: are we all Americans now?

Tuesday, June 29, 2010

As an occasional academic himself, the IPKat takes great interest in the issue raised in the following request for information from his friend Susan Hall (Cobbetts), who writes to him as follows:

"As citing online sources in academic papers becomes more acceptable, one issue which is becoming more important is the fragile nature of online materials and the obvious concern, when reviewing or answering papers which cite online sources, that such sources remain accessible for verification and follow-up. A tool called WebCite, offered here, purports to offer an answer, allowing people who cite online sources to cache them on the WebCite servers, on the terms set out on the site in question.

However, this seems to produce a number of interesting IP implications on both sides of the Atlantic (incidentally, the examples of archiving of site uses are drawn from the Guardian).

The advertised service appears intended to protect people using online sources in academic research from dead links and future take-downs, but there seem to be difficulties fitting the WebCite model into a UK copyright framework. Further, although the WebCite FAQs put forward a justification based on fair use under US law, this is itself not devoid of problems:

"Caching and archiving webpages is widely done (e.g. by Google, Internet Archive etc.), and is not considered a copyright infringement, as long as the copyright owner has the ability to remove the archived material and to opt out. WebCite® honors robot exclusion standards, as well as no-cache and no-archive tags. Please contact us if you are the copyright owner of an archived webpage which you want to have removed.

"A U.S. court has recently (Jan 19th, 2006) ruled that caching does not constitute a copyright violation, because of fair use and an implied license (Field vs Google, US District Court, District of Nevada, CV-S-04-0413-RCJ-LRL, see also news article on Government Technology). Implied license refers to the industry standards mentioned above: If the copyright holder does not use any no-archive tags and robot exclusion standards to prevent caching, WebCite® can (as Google does) assume that a license to archive has been granted. Fair use is even more obvious in the case of WebCite® than for Google, as Google uses a “shotgun” approach, whereas WebCite® archives selectively only material that is relevant for scholarly work. Fair use is therefore justifiable based on the fair-use principles of purpose (caching constitutes transformative and socially valuable use for the purposes of archiving, in the case of WebCite® also specifically for academic research), the nature of the cached material (previously made available for free on the Internet, in the case of WebCite® also mainly scholarly material), amount and substantiality (in the case of WebCite® only cited webpages, rarely entire websites), and effect of the use on the potential market for or value of the copyrighted work (in the case of Google it was ruled that there is no economic effect, the same is true for WebCite®)." (FAQs)

Asks Susan, "Is anyone doing any work on archiving of online sources and the legal issues entailed?" If so, she -- and, out of sheer curiosity, the IPKat -- would love to know.