Thursday, February 7, 2013

"Digging into Data" Challenge Data Repositories Review

On Feb. 5, 2013, Round 3 of the Digging into Data Challenge was announced. The challenge invites researchers to explore the data and gain new insights from it. Digging into Data covers a wide range of areas, such as social sciences, archival information, and libraries. The application deadline is May 15, 2013.

I reviewed the list of datasets and noted the following points:
  1. The list of datasets didn't cover any web materials. Although the repositories offer a wide variety of born-digital materials, none of them covers Web pages. Even the web archives that joined this competition preferred to contribute other materials. For example, the Internet Archive, the largest and oldest web archive, participated with its Films and Books collections.
  2. In addition to the absence of web page collections, social media didn't appear either. The Library of Congress preferred to participate with its "Chronicling America" newspaper collection instead of its valuable Twitter archive collection. I'm not sure whether a copyright limitation is the reason.
  3. Most of the datasets offer API access points, ranging from OAI-PMH and XML to REST services.
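To give a feel for what OAI-PMH access looks like, here is a minimal Python sketch. It only builds a standard `ListRecords` request URL and parses Dublin Core titles from a response. The base URL `http://example.org/oai` and the sample response are placeholders I made up for illustration; they do not belong to any repository in the list, and each repository's real endpoint has to be looked up in its own documentation.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 protocol and by Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (list of dc:title values, resumption token or None)."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iter(DC_NS + "title")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token


# Hand-written sample response (an assumption, not real repository data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample newspaper page</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_list_records_url("http://example.org/oai"))
    print(parse_titles(SAMPLE_RESPONSE))
```

In a real harvest, the returned resumption token would be sent back in a follow-up request to fetch the next batch of records, repeating until the repository stops returning a token.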
In the table below, I tried to summarize the types of data in each collection. I may revisit the table to add information about the nature of the data (e.g., biomedical, historical, technical).
 
Data objects
Images Book text videos Newspaper Code projects biography Numeric Maps Metadata
The Archaeology Data Service (ADS) x
ARTstor x
Biodiversity Heritage Library x
The Centre for Contemporary Canadian Art Canadian Art Database Project x x x x
Chronicling America Library of Congress National Digital Newspaper Program x
Data-PASS x
The Digital Archaeological Record (tDAR) x x x
Digital Library for Earth System Education (DLESE) x
Early Canadiana Online x
FLOSSmole x
English Broadside Ballad Archive (EBBA) x
Great War Primary Documents Archive x x
Harvard Time Series Center (TSC) x
HathiTrust x
The History Data Service (HDS) x x
Infochimps.org x
Internet Archive x x
Inter-university Consortium for Political and Social Research x x
JISC MediaHub x x x x
JSTOR x
Marriott Library- University of Utah x x x x
Smithsonian/NASA Astrophysics Data System (ADS) x
National Archives, London x
National Library of Medicine (NLM) x x x
The National Library of Wales x x x x x x x
National Science Digital Library (NSDL) x
National Technical Information Service (NTIS) x
Nebraska Digital Newspaper Project x
New York Public Library x
The New York Times Article Search API x
Opening History x
PhilPapers x
Project MUSE x
PSLC DataShop
Scholarly Database at the Cyberinfrastructure for Network Science Center, Indiana University
ScholarSpace at the University of Hawai'i at Manoa x
Statistical Accounts of Scotland x x
University of Florida Digital Library Center x x x x x
University of North Texas x x x x
