Data description

Ensure that everyone can understand your research data

Describing and documenting your data is essential so that you, and others who may need to use it, can make sense of the data and understand how it was collected, processed and analysed.

Describing your data: metadata

The descriptive information about your data is known as metadata. When describing your data you’ll need to create a lot of the metadata yourself, such as ‘title’, ‘description’ and ‘creator’, while your computer system will create additional technical metadata about your files. All of this metadata is useful for understanding, managing and preserving your data effectively, and it makes your data easier for other researchers to find and use if the data is published or shared. Find more information about metadata on the ANDS metadata page or the DataONE best practice guide for metadata.

You may also want to consider using a metadata schema, which is a defined set of metadata fields that ensures you always collect the same information about all of your research data. Dublin Core is the most commonly used descriptive metadata schema and is used for records in the Sydney eScholarship repository. The University’s Sydney Research Data Registry uses the RIF-CS metadata schema, enabling Research Data Australia to harvest these records.
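As a rough illustration of how a schema keeps records consistent, the Python sketch below checks a record against the fifteen elements of the Dublin Core element set and writes it out as XML. The list of required fields, the wrapper element name `metadata` and the sample values are assumptions made for this example; Dublin Core itself treats every element as optional, and a repository may expect a different layout.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements defined by the Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical local policy: Dublin Core does not make any field mandatory.
REQUIRED_FIELDS = {"title", "description", "creator"}


def check_record(record):
    """Return a list of problems found in a record (empty if none)."""
    problems = []
    for field in sorted(REQUIRED_FIELDS - set(record)):
        problems.append(f"missing required field: {field}")
    for field in sorted(set(record) - DC_ELEMENTS):
        problems.append(f"not a Dublin Core element: {field}")
    return problems


def to_xml(record):
    """Serialise a record as XML with one dc: element per field."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper name is an assumption
    for field, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample record, for illustration only.
record = {
    "title": "Example survey responses",
    "creator": "A. Researcher",
    "description": "Anonymised responses to an example survey.",
}

print(check_record(record))
print(to_xml(record))
```

Running the same check over every dataset is one way to make sure the same fields are always filled in, which is the main benefit a schema offers.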

Some fields of research have developed discipline-specific metadata schemas. The Digital Curation Centre has set up a disciplinary metadata directory so that you can search for and find a metadata schema that suits your data.

You can contact the Research Data team for advice on metadata, metadata schemas and publishing your research data.

Data documentation

You should document everything that you or another researcher would need to make sense of your data in the future. Some storage systems provide mechanisms for you to do this when you save your data. In other storage systems you may have to document your data manually in README documents or version control tables.

A README document is a plain text document that is stored alongside your data and describes it. Details to include in a README are:

  • file naming conventions
  • data definitions, e.g. definitions of variables or row and column headings
  • units of measurement
  • how different files relate to one another
  • explanation of data processing steps
  • software or tools used to create or read the files
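The details listed above can be turned into a reusable README template. The Python sketch below is one possible approach, not a prescribed format: the section headings mirror the list, while the function name `build_readme`, the placeholder text and the sample details are invented for illustration.

```python
# Section headings mirroring the details recommended for a README.
README_SECTIONS = [
    "File naming conventions",
    "Data definitions",
    "Units of measurement",
    "How the files relate to one another",
    "Data processing steps",
    "Software or tools used",
]

# Placeholder shown for any section that has not been filled in yet.
PLACEHOLDER = "TODO: not yet documented"


def build_readme(title, details):
    """Return plain-text README content with one block per section."""
    lines = [title, "=" * len(title), ""]
    for section in README_SECTIONS:
        lines.append(section)
        lines.append("-" * len(section))
        lines.append(details.get(section, PLACEHOLDER))
        lines.append("")
    return "\n".join(lines)


# Invented sample details, for illustration only.
details = {
    "File naming conventions": "site_YYYYMMDD_version.csv",
    "Units of measurement": "Temperature in degrees Celsius.",
}

readme_text = build_readme("Example dataset", details)
print(readme_text)
```

Sections left empty appear with a placeholder, so gaps in the documentation stay visible until they are filled in. The resulting text can be saved as a plain text file next to the data.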