Find and Use Data
Researchers often store and publish their data in so-called repositories, i.e. digital archives. There are thousands of repositories that make data openly accessible so that it can be used for one's own research.
How Do I Find Suitable Data?
Finding the Right Repository
There are generic, institutional and discipline-specific repositories. The best way to find existing data for reuse is by searching in discipline-specific repositories. How to start your search:
- Ask the community or Data Stewards: Which repositories are generally used by other researchers in your discipline?
- Recommended repositories of research funders: Both the SNSF and the European Commission maintain lists of subject-specific data repositories.
- re3data.org is another good starting point for your search. It is currently the largest and most important registry of research data repositories worldwide.
Use the following filters on re3data.org:
- Subject = filter by discipline
- Data Licenses = data are licensed for reuse
- Data Access = data are openly accessible
- PID (persistent identifier) = the repository assigns persistent identifiers to its objects, e.g. DOIs.
Finding the Right Data
Once you have found a relevant repository, you can narrow your search for a suitable dataset further.
- Identify relevant keywords, search terms or categories. Keep your search as narrow as possible, e.g. look for keywords only in the abstract and title, not in the author name.
  More information: UB UZH's Good search tips
- Evaluate datasets for their technical and legal reusability. Choose data in open formats and with a license that allows reuse.
  More information on: data formats and licenses.
- Evaluate the quality of data documentation. The more you know about a dataset from the documentation, the better you can assess the suitability of the data for your own project.
How Do I Reuse Data?
To reuse the data, you must comply with the terms and conditions of its license. For further processing (and proper data management), you should also document your reuse and cite the data in your publication.
How Do I Cite Data Sources?
Sometimes the repository indicates how the data should be cited. If the dataset comes with a DOI, you can use crosscite to create a citation.
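For scripted workflows, a formatted citation can also be requested directly from the DOI resolver via content negotiation, which is the mechanism the crosscite formatter builds on. The following is a minimal sketch, not an official client: the function names `build_citation_request` and `fetch_citation` are hypothetical, the default style "apa" is an assumption, and the result depends on the metadata registered for the DOI.

```python
import urllib.request


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Prepare (but do not send) a content-negotiation request for a DOI.

    The Accept header asks the resolver for a formatted bibliography entry
    in the given citation style (assumed default: "apa").
    """
    url = f"https://doi.org/{doi}"
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (needs network access)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling `fetch_citation("10.5255/UKDA-SN-851835")` should return a citation string for the dataset used as the example in this section. Always check the result against the repository's own citation recommendation.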
In general, we recommend including the following information in a citation:
Creator (PublicationYear): Title. Publisher. (resource type). Identifier.
Example:
Pidgeon, Nicholas and Demski, Christina and Stuart, Capstick and Alexa, Spence and Sposato, Robert (2016). Public perceptions of climate change and personal experience of flooding. [Data Collection]. Colchester, Essex: UK Data Archive. 10.5255/UKDA-SN-851835
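If you manage many datasets, the recommended pattern can be assembled programmatically. The sketch below is only an illustration: the function name `format_data_citation`, the "; " separator between creators, and the omission of a final period after the identifier (so that the DOI stays unambiguous) are assumptions, not part of the recommendation.

```python
def format_data_citation(
    creators: list[str],
    year: int,
    title: str,
    publisher: str,
    resource_type: str,
    identifier: str,
) -> str:
    """Build a citation following the pattern
    Creator (PublicationYear): Title. Publisher. (resource type). Identifier
    """
    # Assumption: multiple creators are joined with "; ".
    creator_part = "; ".join(creators)
    # Strip a trailing period so the title is not followed by "..".
    clean_title = title.rstrip(".")
    return (
        f"{creator_part} ({year}): {clean_title}. "
        f"{publisher}. ({resource_type}). {identifier}"
    )


citation = format_data_citation(
    creators=["Pidgeon, Nicholas", "Demski, Christina"],
    year=2016,
    title="Public perceptions of climate change and personal experience of flooding",
    publisher="UK Data Archive",
    resource_type="Data Collection",
    identifier="https://doi.org/10.5255/UKDA-SN-851835",
)
print(citation)
```

The creator list is shortened here for readability. In a real citation, list all creators as given by the repository.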
Do You Have Technical Questions?
Science IT supports UZH researchers with technical questions about software solutions, data storage, data management and data visualization.
Do You Have Domain-Specific Questions?
Get in touch with a data steward: Data Stewards Network