Archive research data

Ensure that your data is safe in the long term

Archiving data that you or your research team no longer actively use ensures that it’s stored securely and protected against data loss in the long term. It also ensures continued access for you, or for others if you choose to publish or share your data.

Archiving options
Organise your files
  • Organising your files well helps to ensure that they remain findable and understandable for you and for other users. Using meaningful file names and a deliberately organised folder structure throughout your project makes your data easy to archive when the project ends. If your files don’t have meaningful file names or aren’t well organised, rename and reorganise them before archiving your data.

    File naming

    Select an appropriate naming convention for your files as early as possible and follow it throughout your research consistently. Make sure your naming convention is documented, for instance, in a README file, before archiving your data.

    • You may wish to start your file name with the date, formatted as YYYYMMDD, so that your files display in chronological order
    • Choose useful keywords that you or others might use to search for your files, separating each word or section with a hyphen or underscore. Document the keywords you choose to use, so that you can interpret your file names later. Useful keywords may include:
      • project acronym
      • location
      • data type
      • data collection methods

      Example: [Date]_[Project]_[Location]_[Method]_[Run]
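    The pattern above can be sketched in code. This is a minimal illustration, not a prescribed tool: the function name, the zero-padded "run" label, and the sample values are all assumptions made for the example.

```python
from datetime import date


def build_file_name(collected: date, project: str, location: str,
                    method: str, run: int, extension: str) -> str:
    """Assemble [Date]_[Project]_[Location]_[Method]_[Run] plus an extension."""
    parts = [
        collected.strftime("%Y%m%d"),  # YYYYMMDD sorts chronologically
        project,
        location,
        method,
        f"run{run:02d}",               # zero-padded label (an assumption)
    ]
    # Hyphens join words inside a section; underscores separate sections.
    cleaned = [p.strip().replace(" ", "-").replace("_", "-") for p in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Hypothetical values, for illustration only.
print(build_file_name(date(2021, 3, 9), "REEF", "Heron Island",
                      "transect", 2, "csv"))
# prints 20210309_REEF_Heron-Island_transect_run02.csv
```

    Whatever pattern you settle on, record it in your README so that the sections of each file name can be interpreted later.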

    Things to avoid

    • Don’t manually change or delete the file extension suffix (e.g. .docx, .pdf, .csv) which is usually generated automatically
    • Avoid the use of special characters such as \ / : * ? " < > |, apart from hyphens and underscores, in file names
    • Don’t make file names too long
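    A quick check along these lines can flag problem names before you archive. It is only a sketch: the 100-character limit is an assumed value (the guidance above does not set a number), and the function name is hypothetical.

```python
from pathlib import PurePath

# Characters listed above as ones to avoid in file names.
FORBIDDEN = set('\\/:*?"<>|')
MAX_LENGTH = 100  # assumed limit; the guidance only says "not too long"


def check_file_name(name: str) -> list:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("special characters: " + " ".join(bad))
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if not PurePath(name).suffix:
        problems.append("no file extension")
    return problems


print(check_file_name("20210309_REEF_transect_run02.csv"))
print(check_file_name("results?final"))
```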

    Folder structure

    A well organised folder structure can save you time and will help other future users to understand and find your data.

    Key considerations:

    • Keep your raw data in a separate folder from working data
    • Store consent forms separately for ethics and privacy reasons
    • Nest your folders in the order that best suits how you plan to use them, e.g. Location > Method > Date or Method > Date > Location
    • Don’t create too many empty folders ahead of time

    Example: [Project] > [Experiment] > [Instrument or Type of file] > [Location]
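    One way to follow these considerations is to compute folder paths from their components and create a folder only when a file is actually saved into it. The sketch below assumes a "raw" or "working" level directly under the project folder; that level, the function names, and the sample values are illustrative choices, not part of the guidance.

```python
from pathlib import Path


def data_folder(root: Path, project: str, experiment: str,
                file_type: str, location: str, stage: str = "raw") -> Path:
    """Build [Project] > raw|working > [Experiment] > [Type] > [Location]."""
    if stage not in ("raw", "working"):
        raise ValueError("stage must be 'raw' or 'working'")
    # Raw and working data never share a folder.
    return root / project / stage / experiment / file_type / location


def save_into(folder: Path, file_name: str, content: bytes) -> Path:
    """Create the folder on demand, then write the file into it."""
    folder.mkdir(parents=True, exist_ok=True)  # no empty folders ahead of time
    target = folder / file_name
    target.write_bytes(content)
    return target


# Hypothetical components, for illustration only.
print(data_folder(Path("archive"), "REEF", "exp01", "csv", "heron-island"))
```

    Swapping the order of the path components in `data_folder` is enough to change the nesting, for example to put location above experiment.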

Describe your data
  • Before archiving data, it’s essential to ensure that your dataset is accurately documented and described. This will ensure that you and any future users can make sense of it and understand the processes that have been followed during data collection, processing and analysis. If your datasets are published or shared, then well-described data will be more easily discoverable, verifiable, and reusable by other researchers.


    Metadata is descriptive and contextual information about your data. It may include title, creator(s), date produced/collected, location, abstract, subject, method, process, quality, format, rights, and ownership.

    To determine what metadata to keep, it’s useful to think about what information would help someone else understand and reuse your data. You may consider using a metadata standard (also called a metadata schema), a defined set of fields that can either be general or discipline-specific. Using a standard not only provides a rich description of your data, but also increases the likelihood of people finding your data.

    If you’re not sure where to start, check out Dublin Core, which is a commonly used general metadata standard, or find a metadata standard related to your discipline by searching the Digital Curation Centre’s disciplinary metadata directory.

    Creating data documentation

    Once you’ve decided what metadata you need to keep for your data, you should record this information and store it with the data. Some storage systems, like the University’s eNotebook, provide mechanisms for you to do this when you save your data. In other storage systems, like the Research Data Store and CloudStor, you may have to record your metadata manually in a README document (a text document) or a version control table.
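    Where metadata has to be recorded manually, a small script can keep README files consistent. The sketch below uses the fifteen element names of the Dublin Core Metadata Element Set as fields; the function name, the output layout, and the sample record are assumptions made for illustration.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_FIELDS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def render_readme(metadata: dict) -> str:
    """Render a metadata record as 'Field: value' lines in a fixed order."""
    unknown = sorted(set(metadata) - set(DUBLIN_CORE_FIELDS))
    if unknown:
        raise KeyError(f"not Dublin Core elements: {unknown}")
    lines = [
        f"{field.capitalize()}: {metadata[field]}"
        for field in DUBLIN_CORE_FIELDS
        if field in metadata
    ]
    return "\n".join(lines) + "\n"


# Hypothetical record, for illustration only.
record = {
    "title": "Reef transect counts",
    "creator": "A. Researcher",
    "date": "2021-03-09",
    "format": "text/csv",
}
print(render_readme(record))
```

    The resulting text can be saved as a plain-text README file in the same folder as the data it describes.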

File formats
  • When preserving and publishing data it’s essential that all datasets are saved in an appropriate file format to ensure long-term accessibility of data. The file formats you use when working with your data may not be appropriate for archiving or publishing purposes. You should think about capturing data or converting files into formats that are:

    • widely used within your discipline
    • publicly documented, i.e. the complete file specification is publicly available
    • open and non-proprietary
    • endorsed by standards agencies such as the International Organization for Standardization (ISO)
    • self-documenting, i.e. the file itself can include useful metadata
    • unencrypted
    • uncompressed, or compressed using lossless compression

    For example:

    • Quantitative research data
      • While you collect and analyse your data, you might need it in a number of different formats: an Excel spreadsheet, a database, or a file format native to the data analysis software you’re using, such as SPSS, SAS, R or MATLAB.
      • Once the data’s been collected and the analysis performed, save the data as a comma separated values (.csv) file for long-term storage. Most data software packages provide options for saving data as a comma separated values file. This format is portable across different computing and software platforms and is therefore more resilient to software updates.
    • Image files
      • The uncompressed TIFF (Tagged Image File Format) is a good choice for long-term preservation of image files. Most image creation software packages provide options for saving images as TIFF files. You should save your image files in this format right from the outset, so that you capture the highest possible quality master image files.
      • While working with your images, you may need to manipulate, share, or embed them in other documents. For these purposes it may be useful to compress your image files into JPEG format so that they’re smaller and easier to send over the internet or embed in analysis project files.
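    For the quantitative data case, exporting to CSV can be as simple as the sketch below, which uses only Python's standard `csv` module. The column names and values are hypothetical, and the function name is an assumption for the example.

```python
import csv
import io


def write_csv(rows, fieldnames, stream) -> None:
    """Write a list of dictionaries to a stream as comma separated values."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()     # the first line names each column
    writer.writerows(rows)


# Hypothetical measurements, for illustration only.
rows = [
    {"site": "A1", "depth_m": 2.5, "count": 14},
    {"site": "B2", "depth_m": 4.0, "count": 9},
]

buffer = io.StringIO()       # stands in for a file in this example
write_csv(rows, ["site", "depth_m", "count"], buffer)
print(buffer.getvalue())
```

    To write to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8")`. Column names and units are not self-explanatory in a CSV file, so describe them in your README.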

    The following table provides general suggestions for suitable file format choices for long-term preservation and for working data. For more specific recommendations, please contact for further advice on which file formats to use for long-term preservation, as well as when and how your data should be converted into these formats.

    • Archive
      Preservation format(s): ZIP File Format (.zip)
    • Audio
      Preservation format(s): Broadcast Wave Format (.wav)
      Working format(s): MPEG-1 Audio Layer 3 (.mp3)
    • Images
      Preservation format(s): Tagged Image File Format (.tif, .tiff)
      Working format(s): JPEG (.jpeg, .jpg)
    • Tabular Datasets
      Preservation format(s): Comma Separated Values (.csv); Microsoft Excel (.xlsx)
    • Text
      Preservation format(s): Plain Text (UTF-8) (.txt); Portable Document Format (.pdf)
    • Video
      Preservation format(s): Motion JPEG 2000 (.mj2); MPEG-4 (.mp4)
      Working format(s): MPEG-4 (compressed) (.mp4)
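
    The suggestions above can be turned into a rough pre-archive check on file extensions. This sketch assumes, from the sentence introducing the table, that MP3, JPEG and compressed MPEG-4 are the working formats and the remaining entries are preservation formats. The function name and status labels are hypothetical, and an extension alone cannot tell you whether a file is compressed or encrypted.

```python
from pathlib import PurePath

# Extensions taken from the table; the grouping is an assumption (see above).
PRESERVATION = {".zip", ".wav", ".tif", ".tiff", ".csv",
                ".xlsx", ".txt", ".pdf", ".mj2", ".mp4"}
WORKING = {".mp3", ".jpeg", ".jpg", ".mp4"}


def format_status(file_name: str) -> str:
    """Classify a file by extension against the suggested formats."""
    ext = PurePath(file_name).suffix.lower()
    if ext in PRESERVATION and ext in WORKING:
        return "check"          # .mp4 is listed in both groups
    if ext in PRESERVATION:
        return "preservation"
    if ext in WORKING:
        return "working"        # consider converting before archiving
    return "unlisted"           # seek advice on a suitable format


for name in ["scan.TIF", "photo.jpg", "clip.mp4", "model.sav"]:
    print(name, format_status(name))
```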

Data retention periods
  • Retention periods define the minimum amount of time that you need to keep data after you complete your research project. You must keep your data for at least the minimum retention period.

    Retention periods for research data vary depending on what the data is, and what research the data is related to. Retention periods are specified in:

    In most cases data will continue to be of value to you and to the wider research community for a long time beyond its minimum retention period. In fact, funders and publishers are increasingly encouraging, or even mandating, that researchers retain and share their data.

    Data should only be destroyed if specifically required by ethical, legal, or contractual obligations. In cases where data must be destroyed, you mustn’t do so before the end of the minimum retention period. Destruction of data must be authorised, documented, and performed using secure destruction methods. Destruction of research data must be coordinated through Archives and Records Management Services (ARMS). If you believe your data is eligible for destruction and wish to destroy it, then you must complete an Application to Dispose of University Records form and contact the University’s Disposal Officer (

    Minimum retention periods

    • Projects of regulatory or community significance. Includes data that is: part of genetic research, including gene therapy; controversial or of high public interest or has influence in the research domain; costly or impossible to reproduce or substitute if the primary data is not available; relates to the use of an innovative technique for the first time; of significant community or heritage value to the state or nation; or required by funding or other agreements to be retained permanently.
      Must be kept...
    • Clinical trials, or research with potential long-term effects on humans. Includes animal testing for human products
      Must be kept...
      For 15 years after completion of research activity or until research participant reaches or would have reached 25 years, whichever is longer
    • Patent applications
      Must be kept...
      For the life of the patent (generally 20 years)
    • All other research
      Must be kept...
      For 5 years after the project is completed
    • Short-term projects that are for assessment purposes only, such as projects completed by students
      Must be kept...
      If not returned to the student, retain at least until the end of the appeal period
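
    Two of the periods above are simple date arithmetic and can be sketched as follows. The category labels, function names, and the choice to move 29 February to 1 March in non-leap years are assumptions made for this example. Categories that are not a fixed number of years are left out, and the result is only the earliest possible date, since destruction still requires authorisation.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Add whole years to a date."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a non-leap target year: use 1 March (an assumption).
        return d.replace(year=d.year + years, month=3, day=1)


def earliest_disposal(category: str, completed: date,
                      participant_born: Optional[date] = None) -> date:
    """Return the end of the minimum retention period for a category."""
    if category == "clinical":
        after_completion = add_years(completed, 15)
        if participant_born is None:
            return after_completion
        # Whichever is longer: 15 years, or the participant turning 25.
        return max(after_completion, add_years(participant_born, 25))
    if category == "other":
        return add_years(completed, 5)
    raise ValueError(f"no fixed period defined here for {category!r}")


# Hypothetical dates, for illustration only.
print(earliest_disposal("other", date(2020, 6, 30)))
print(earliest_disposal("clinical", date(2020, 6, 30), date(2015, 1, 1)))
```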

    You can delete the following types of data at any time without authorisation or documentation, provided you no longer need them for reference or administrative purposes:

    • duplicate copies of data files
    • unused forms or templates
    • routine system logs
    • calibration or test data
    • temporary working copies of datasets used to prepare the final version
    • paper copies of data that have been captured in electronic form

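    Third party data

    Data obtained from a third party under a licence or agreement may need to be destroyed at one or more of the following points: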

    • When the lead investigator who’s authorised to use the data leaves the University, and/or
    • When the project for which you applied to use the data comes to a close, and/or
    • After a time period specified in the agreement relating to the use of the data

    Data of this type must be destroyed in accordance with the terms and conditions of use.