Archive research data

Ensure that your data is safe in the long term

Archiving data that you and your research team no longer actively use keeps it stored securely and protects it against loss in the long term. It also ensures continued access for you, and for others if you choose to publish or share your data.

Archiving options
File formats
  • When preserving and publishing data it’s essential that all datasets are saved in an appropriate file format to ensure long-term accessibility of data. The file formats you use when working with your data may not be appropriate for archiving or publishing purposes. You should think about capturing data or converting files into formats that are:

    • widely used within your discipline
    • publicly documented, ie the complete file specification is publicly available
    • open and non-proprietary
    • endorsed by standards agencies such as the International Organization for Standardization (ISO)
    • self-documenting, ie the file itself can include useful metadata
    • unencrypted
    • uncompressed, or compressed using a lossless method

    For example:

    • Quantitative research data
      • While you collect and analyse your data, you might need it in a number of different formats: an Excel spreadsheet, a database, or a file format native to the data analysis software you are using, such as SPSS, SAS, R or MATLAB.
      • Once the data has been collected and the analysis performed, save the data as a comma separated values (.csv) file for long-term storage. Most data software packages provide options for saving data as a comma separated values file. This format is portable across different computing and software platforms and is therefore more resilient to software updates.
    • Image files
      • Uncompressed TIFF (Tagged Image File Format) is a good choice for long-term preservation of image files. Most image creation software packages provide options for saving images as TIFF files. You should save your image files in this format right from the outset, so that you capture the highest possible quality master image files.
      • While working with your images, you may need to manipulate, share, or embed them in other documents. For these purposes it may be useful to compress your image files into JPEG format so that they’re smaller and easier to send over the internet or embed in analysis project files.
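
The CSV step described above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed method: the function names `export_to_csv` and `read_csv`, the column names, and the sample record are all hypothetical, and in practice you would usually export directly from your analysis package.

```python
import csv
from pathlib import Path


def export_to_csv(records, fieldnames, path):
    """Write a list of dict records to a UTF-8 CSV file with a header row."""
    path = Path(path)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return path


def read_csv(path):
    """Read the CSV back as a list of dicts to confirm it round-trips."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [{"participant_id": "P001", "score": 12.5}]
    out = export_to_csv(sample, ["participant_id", "score"], "survey_results.csv")
    print(read_csv(out))
```

Note that CSV stores every value as text, so numeric columns come back as strings when the file is read again.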

    The following table provides general suggestions for suitable file format choices for long-term preservation and for working data. For more specific recommendations, including when and how your data should be converted into these formats, please contact the Digital Curation and Data team.

    • Archive
      Preservation format(s): ZIP File Format (.zip)
    • Audio
      Preservation format(s): Broadcast Wave Format (.wav)
      Working format(s): MPEG-1 Audio Layer 3 (.mp3)
    • Images
      Preservation format(s): Tagged Image File Format (.tif, .tiff)
      Working format(s): JPEG (.jpeg, .jpg)
    • Tabular datasets
      Preservation format(s): Comma Separated Values (.csv); Microsoft Excel (.xlsx)
    • Text
      Preservation format(s): Plain Text (UTF-8) (.txt); Portable Document Format (.pdf)
    • Video
      Preservation format(s): Motion JPEG 2000 (.mj2); MPEG-4 (.mp4)
      Working format(s): MPEG-4 (compressed) (.mp4)
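
As a rough sketch of how the format suggestions above might be applied before archiving, the following Python snippet sorts filenames by extension. The function names and the category labels are hypothetical. It treats the lossy formats (MP3 and JPEG) as working formats, and flags .mp4 for manual checking because the extension alone cannot show whether a video is compressed.

```python
from pathlib import Path

# Extensions taken from the format suggestions above.
PRESERVATION_EXTENSIONS = {
    ".zip", ".wav", ".tif", ".tiff", ".csv", ".xlsx", ".txt", ".pdf", ".mj2",
}
WORKING_EXTENSIONS = {".mp3", ".jpeg", ".jpg"}
# .mp4 is suggested for both purposes, so it needs a manual check.
AMBIGUOUS_EXTENSIONS = {".mp4"}


def classify(filename):
    """Return 'preservation', 'working', 'check' or 'unlisted' for a filename."""
    extension = Path(filename).suffix.lower()
    if extension in AMBIGUOUS_EXTENSIONS:
        return "check"
    if extension in PRESERVATION_EXTENSIONS:
        return "preservation"
    if extension in WORKING_EXTENSIONS:
        return "working"
    return "unlisted"


def files_needing_review(filenames):
    """List the files that are not clearly in a suggested preservation format."""
    return [name for name in filenames if classify(name) != "preservation"]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    print(files_needing_review(["scan.TIF", "photo.jpg", "clip.mp4", "model.sav"]))
```

A file reported as 'unlisted' is not necessarily unsuitable; it only means the general suggestions above do not cover it.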

Data retention periods
  • Retention periods define the minimum amount of time that you need to keep data after you complete your research project. You must keep your data for at least the minimum retention period.

    Retention periods for research data vary depending on what the data is, and what research the data is related to. Retention periods are specified in:


    In most cases data will continue to be of value to you and to the wider research community for a long time beyond its minimum retention period. In fact, funders and publishers are increasingly encouraging, or even mandating, that researchers retain and share their data.

    Data should only be destroyed if specifically required by ethical, legal, or contractual obligations. In cases where data must be destroyed, you mustn’t do so before the end of the minimum retention period. Destruction of data must be authorised, documented, and performed using secure destruction methods. Destruction of research data must be coordinated through Archives and Records Management Services (ARMS). If you believe your data is eligible for destruction and wish to destroy it, then you must complete an Application to Dispose of University Records form and contact the University’s Disposal Officer (records.online@sydney.edu.au).


    Minimum retention periods

    • Projects of regulatory or community significance. Includes data that: is part of genetic research, including gene therapy; is controversial or of high public interest, or has influence in the research domain; is costly or impossible to reproduce or substitute if the primary data is not available; relates to the use of an innovative technique for the first time; is of significant community or heritage value to the state or nation; or is required by funding or other agreements to be retained permanently.
      Must be kept: permanently
    • Clinical trials, or research with potential long-term effects on humans. Includes animal testing for human products.
      Must be kept: for 15 years after completion of the research activity, or until the research participant reaches or would have reached 25 years of age, whichever is longer
    • Patent applications
      Must be kept: for the life of the patent (generally 20 years)
    • All other research
      Must be kept: for 5 years after the project is completed
    • Short-term projects that are for assessment purposes only, such as projects completed by students
      Must be kept: if not returned to the student, at least until the end of the appeal period
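
The date arithmetic behind some of these minimum periods can be sketched as follows. This is an illustrative calculation only, not an official tool: the function names and category labels are hypothetical, patents and student assessment projects are not modelled, and the result is only the end of the minimum period (destruction still requires authorisation).

```python
from datetime import date


def add_years(start, years):
    """Return the same calendar day `years` later (29 Feb maps to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only raised for 29 Feb when the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def minimum_retention_end(category, completed, participant_birth_dates=()):
    """Return the end of the minimum retention period, or None if permanent.

    category: 'significant', 'clinical' or 'other' (hypothetical labels).
    completed: date the research activity or project was completed.
    participant_birth_dates: only used for the 'clinical' category.
    """
    if category == "significant":
        return None  # must be kept permanently
    if category == "clinical":
        # 15 years after completion, or until each participant reaches
        # (or would have reached) 25 years of age, whichever is longer.
        candidates = [add_years(completed, 15)]
        candidates += [add_years(born, 25) for born in participant_birth_dates]
        return max(candidates)
    if category == "other":
        return add_years(completed, 5)
    raise ValueError(f"unknown category: {category!r}")


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(minimum_retention_end("clinical", date(2020, 6, 30), [date(2015, 1, 10)]))
```

In the usage example, the participant's 25th birthday falls later than 15 years after completion, so the birthday determines the result.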

    You can delete the following types of data at any time without authorisation or documentation, provided you no longer need them for reference or administrative purposes:

    • duplicate copies of data files
    • unused forms or templates
    • routine system logs
    • calibration or test data
    • temporary working copies of datasets used to prepare the final version
    • paper copies of data that have been captured in electronic form

    Third party data

    Agreements covering data obtained from a third party may require the data to be destroyed:

    • When the lead investigator who’s authorised to use the data leaves the University, and/or
    • When the project for which you applied to use the data comes to a close, and/or
    • After a time period specified in the agreement relating to the use of the data

    Data of this type must be destroyed in accordance with the terms and conditions of use.