Documentation & metadata
Metadata are structured information about data that help you understand your own data and help others discover, interpret, and cite your data.
Metadata provide context for data:
- Who created the data?
- What do the data contain?
- What manipulations and cleaning were applied to the data?
- Where were the data collected?
- When were the data collected?
- Why were the data collected?
- How were the data collected?
Metadata also assist with the management and use of data:
- Who can access the data?
- What license or rights are associated with the data?
- What file formats are the data in?
- What hardware or software was used to generate the data?
- How are the files organized and structured?
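The questions above can be answered in a small structured record stored next to the data. The following is a minimal sketch, not a standard: the key names, the placeholder values, and the helper names `unanswered` and `to_sidecar_json` are all assumptions made for illustration.

```python
import json

# Hypothetical record: keys mirror the questions above; values are placeholders.
record = {
    "who_created": "Example Researcher, Example University",
    "what_contents": "Hourly stream temperature readings",
    "processing_applied": "Removed readings flagged as sensor errors",
    "where_collected": "Example Creek, site A1",
    "when_collected": "2021-03-14/2021-06-30",
    "why_collected": "",  # left blank on purpose to show the check below
    "how_collected": "Submerged temperature logger",
    "access": "Public",
    "license": "CC BY 4.0",
    "file_formats": ["text/csv"],
    "software": "Logger vendor export tool",
    "organization": "One CSV file per site, named by site code",
}


def unanswered(record):
    """Return the keys whose values are empty, in insertion order."""
    return [key for key, value in record.items() if not value]


def to_sidecar_json(record):
    """Serialize the record as indented JSON, ready to save beside the data."""
    return json.dumps(record, indent=2)


print(unanswered(record))  # ['why_collected']
```

Running a check like `unanswered` before depositing data is one way to catch context that was never written down.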
When creating metadata, use a standard where possible. Types of metadata standards include:
- Schemas – sets of elements used to describe data
- Controlled vocabularies – lists of standardized terms assigned to data for labeling and indexing
- Measurement standards – guidelines for expressing values such as dates and times in a consistent format (e.g., ISO 8601)
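As an illustration of a measurement standard, dates recorded in different local formats can be normalized to a single form. This sketch assumes ISO 8601 (YYYY-MM-DD) as the target and assumes the input format of each value is known; the function name `to_iso_date` is hypothetical.

```python
from datetime import datetime


def to_iso_date(text, fmt):
    """Parse a date string in a known format and return an ISO 8601 date."""
    return datetime.strptime(text, fmt).date().isoformat()


# The same day written two ways, normalized to one representation.
print(to_iso_date("03/14/2021", "%m/%d/%Y"))  # 2021-03-14
print(to_iso_date("14.03.2021", "%d.%m.%Y"))  # 2021-03-14
```

The format string has to be supplied explicitly because a value such as 03/04/2021 is ambiguous without knowing the convention used when it was recorded.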
Some academic disciplines already have established metadata standards for datasets (e.g., Astronomy Visualization Metadata, Darwin Core, and Data Documentation Initiative). There are also general purpose metadata standards such as DataCite, Dublin Core, and Project Open Data.
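To make the idea of a schema concrete, the sketch below checks a record against the fifteen element names of the Dublin Core Metadata Element Set. The record values and the helper names `unknown_elements` and `unused_elements` are assumptions for illustration; a repository may require a different standard or additional fields.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical record with placeholder values.
record = {
    "title": "Example stream temperature readings",
    "creator": "Example Researcher",
    "date": "2021-03-14",
    "format": "text/csv",
    "rights": "CC BY 4.0",
}


def unknown_elements(record):
    """Return keys in the record that are not Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


def unused_elements(record):
    """Return Dublin Core elements the record does not yet describe."""
    return sorted(DC_ELEMENTS - set(record))


print(unknown_elements(record))           # []
print(unknown_elements({"author": "x"}))  # ['author']
```

Here `author` is reported because Dublin Core names that element `creator`; catching such mismatches early is the practical benefit of working from a schema.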
Additionally, most data repositories require that your metadata follow a specific standard.
Manage your metadata
Consider one or more of the following methods for managing your metadata throughout your research project:
- Maintain a notebook with information about your project, such as:
  - Locations, organization, file names, and formats of data files
  - File naming convention used with the data files
- In each directory, include a text file, often called a readme, that describes the contents of the data files in that directory
- Keep a codebook that explains the codes, abbreviations, or variables used in the data
- Record metadata in a fielded form, such as a spreadsheet, CSV file, or tab-delimited file
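A codebook kept in a fielded form might look like the following sketch, which uses Python's standard `csv` module. The column names in `FIELDS`, the sample variables, and the function name `codebook_to_csv` are assumptions for illustration, not a required layout.

```python
import csv
import io

# Hypothetical codebook columns; adjust them to suit the project.
FIELDS = ["variable", "description", "units", "allowed_values"]

# Hypothetical entries, one row per variable in the dataset.
codebook = [
    {"variable": "site_id", "description": "Sampling site code",
     "units": "", "allowed_values": "A1; A2; B1"},
    {"variable": "temp_c", "description": "Water temperature",
     "units": "degrees Celsius", "allowed_values": ""},
]


def codebook_to_csv(rows):
    """Return the codebook rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(codebook_to_csv(codebook))
```

The returned text can be saved as a file alongside the data, for example as `codebook.csv` (a hypothetical name), so the variable definitions travel with the files they describe.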