If You Harvest arXiv.org, Will They Come?
Michael L. Nelson, Johan Bollen
The NASA Technical Report Server (NTRS) is an aggregator compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) that harvests from 17 repositories. When NTRS was created, there were few scientific, technical and medical (STM) OAI-PMH repositories, so non-NASA STM repositories were included: arXiv.org, BioMed Central, the Energy Citation Database, and the Aeronautical Research Council (ARC, the UK equivalent of NASA's predecessor, NACA).
In NTRS's simple search mode, only NASA repositories are searched. Advanced searches have the option of including non-NASA repositories. Thus users never receive non-NASA results unless they explicitly request them. We examined 13 months of NTRS log data; NTRS is instrumented to record each user request to download full-text content. Despite their large number of records, the Energy Citation Database, BioMed Central and arXiv.org contributed few downloads. ARC, by contrast, accounted for a significant number of downloads. This indicates that users do select non-NASA repositories from the advanced search interface (logs show the advanced search is used twice as often as the simple search), and the prominence of both NACA and ARC suggests an interest in historical aeronautical publications. The subject matter of ARC is similar to that of the NASA repositories, suggesting NTRS remains aerospace-focused and the presence of other STM materials has yet to expand its user base. arXiv.org is the best-known OAI-PMH repository and is harvested by many OAI-PMH service providers, but its presence in NTRS did not guarantee its use there.
© Copyright 2005 Michael L. Nelson and Johan Bollen