RDIM Step 3 - Archive

Step 3 - Archive

This step requires you to create a Data Record in Research Data JCU. The Data Record documents information about the data used to reach your findings and where your completed data is located. In addition to creating a Data Record, any ‘physical’ research assets will also need to be archived via TRIM using the Process for Archiving Physical Research Assets.

Need to archive or publish your research data and just want to get it done?

The Archiving and Publishing Guide brings everything together in one place—with practical steps to help you identify, prepare, archive, and publish your data with confidence.

It’s structured around HDR milestones at JCU but useful for all researchers, including advisors and ECRs.

About Data Records

A Data Record:

Is documented in Research Data JCU.
Is a non-public metadata record of the data and information associated with your research project.
Includes data attachments or the storage location of completed (not active) data.
Should also include any documentation necessary to understand or reproduce the research (such as survey questions, data dictionary, codebooks and R scripts).
- Research Data JCU can be used to store data up to 100 MB. However, if your data is larger than 100 MB or is sensitive, contact researchdata@jcu.edu.au to organise appropriate storage options.
Can apply to a large funded research project, a smaller or less formal project, a thesis, data chapters in a thesis or a dataset that will later be made available with a paper.
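
As a rough illustration of the 100 MB guideline above, the following Python sketch totals the size of a folder of completed data and suggests whether it can be attached to a Data Record. It assumes that 100 MB means 100 × 1,000,000 bytes and that the limit applies to the combined size of the attachments; this guide confirms neither, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: 100 MB is read as 100 * 1,000,000 bytes, applied to the combined size.
ATTACHMENT_LIMIT_BYTES = 100 * 1_000_000


def total_size_bytes(folder):
    """Sum the sizes of all regular files under `folder`, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def storage_advice(size_bytes, is_sensitive=False):
    """Return a short recommendation based on the guidance above (hypothetical helper)."""
    if is_sensitive or size_bytes > ATTACHMENT_LIMIT_BYTES:
        return "contact researchdata@jcu.edu.au"
    return "attach to Data Record"
```

For example, `storage_advice(total_size_bytes("final_data"))` would check a hypothetical `final_data` folder. If in doubt, contact researchdata@jcu.edu.au.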

Completing a Data Record will:

Enable you to find data even after long periods of time have elapsed.
Ensure the integrity of your research methods and findings.
Satisfy requirements under the Australian Code for the Responsible Conduct of Research.

The Research Data JCU platform includes extensive help text (click on the ? icon) and prompts for each metadata field.

If you require any assistance completing your Data Record, please contact researchdata@jcu.edu.au.

To create a Data Record:

Log into Research Data JCU
Use one of two methods to create a Data Record:
- Directly from your RDMP
  - Click on ‘Plan’
  - Click on ‘Create a data record from this plan’
- Using the Manage menu
  - Click on ‘Manage’
  - Click ‘Create Data Record’
- At this stage, you’ll be prompted to link the Data Record to your RDMP if you have one.
Ensure your RDMP is up to date, as metadata from the RDMP will auto-fill your Data Record.
Click on each tab to complete and/or edit each metadata field.
Fields marked with an asterisk (*) are required.
Remember to save regularly.

Method 1 - Create a Data Record directly from your RDMP

First find the RDMP

Select the RDMP

Create a data record from the RDMP

Method 2 - Create a Data Record from the menu

How to view and update a Data Record

Log in to Research Data JCU
Select View & Update RDMP from the Manage tile, OR
Select MANAGE and then VIEW & UPDATE DATA RECORD from the menu.

Processes for Archiving Research (Data and Information) Assets

This process relates to:

The archiving of physical research (data and information) assets and should run concurrently with the archiving of digital research (data and information) assets via Research Data JCU, and
Research projects that have been finalised from 2021 onwards.

When deciding what completed data needs to be archived, the Researcher/HDR Candidate will need to consider:

What would be required for validation and re-use?
How long would it take to collect the data again or is this even possible?
What is the significance and value of the data?
What are the required retention periods for the data (refer to University Sector Retention and Disposal Schedule)?
Are there any contractual data terms that need to be followed?

Archiving of digital research (data and information) assets process

For all campuses, the Researcher/HDR Candidate responsible must:

E-mail researchdata@jcu.edu.au to identify the most appropriate storage option(s) for both digital and physical research (data and information) assets
Complete the Data Record in Research Data JCU as appropriate

Archiving of physical research (data and information) assets process

Townsville campus

The following process is relevant to all Townsville based persons.

Other Australian campuses

Whilst the following process is relevant to all Australian-based campuses, the Records Team will need to advise how the archive box(es) will be transferred from other Australian Campuses to the Townsville Campus for final storage (transfer charges will apply).

Singapore campus

Process to be finalised.

**Archiving of physical research (data and information) assets**
Process steps by role:

Researcher/HDR Candidate
- Liaise with the Records Team via corporateinformation@jcu.edu.au to:
  - Request a TRIM number (TRIM is JCU’s electronic document and records management system); and
  - Seek advice on how best to prepare the physical research assets for archiving; and
  - Identify what metadata the Records Team will need to record in TRIM; and
  - Request advice as to where the archive box(es) will be stored (this location will need to be recorded on each archive box).
  - Note: It is likely the above will take several communications to finalise.
- Collect empty archive box(es) from the Academic Administrator.
- Prepare and box the physical research assets and record the TRIM number and intended location on each box.
- Deliver the archive boxes to the Academic Administrator.
- Add the TRIM number to the Data Record in Research Data JCU. This will ensure a link is established between the digital and physical data as well as the researcher and the research project.

Academic Administrator
- On receipt of the archive box(es), create a maintenance request via MEX (JCU’s online maintenance request system) to trigger collection of the box(es) by the Estates Team.

Estates Team
- On receipt of the MEX request, collect the box(es) and liaise with the Records Team to meet and accept delivery of the boxes at the relevant location.

Records Team
- Securely store the box(es) and update TRIM to indicate the process has been completed.

Costs

There are significant costs associated with retaining data; however, the majority of these are borne at the corporate (University) level. Each College/Centre will be responsible for the following:

Purchase of archive boxes
- The Academic Administrator will be responsible for ordering archive boxes (T1 and T4) for their areas of responsibility.
Transfer and delivery costs of research (data and information) assets that are being archived (this is an internal cost charged by the Estates Office).

Air conditioned and cold room facilities

Townsville archive rooms are air conditioned to 20 degrees Celsius. However, there are currently no fridges/freezers available at the JCU Records (corporate) level.

Should fridges or freezers be required, the Records Team will work with individuals to determine the most appropriate storage location. Some Colleges/Centres have their own facilities, and these will usually be the first preference.

Where College/Centre facilities are used, it is important that the archiving process for physical research assets is followed so that these archives are included in TRIM and that this information is then available to the Records Team to negotiate the purchase of future facilities.

Training

Watch Module 3: Data Record of the Management of Data and Information in Research series of training videos to learn more about completing your Data Record.

This module is part of the RD7003 Compulsory Workshops for HDR Candidates and can also be accessed via the Higher Degree by Research Students Organisation on LearnJCU. Completion of a short quiz on LearnJCU is required.

When developing your Data Record you will need to consider

Consent

Consent is required from human participants before data can be collected or published. Obtaining informed consent to facilitate data sharing and publication involves:

Developing an information sheet about maintaining confidentiality, data sharing and publication so participants can make an informed decision before consenting to participate. Your information sheet should be approved by JCU's Human Research Ethics Committee (HREC) as part of the ethical clearance process (refer below).
Stating the possibility of future data publication and sharing, de-identification processes and conditions for access (refer below)
Seeking prior approval from HREC for consent forms and information sheets.

The Australian Research Data Commons (ARDC) provides some example sentences in their guide (p. 14). The examples listed below are appropriate in different contexts (e.g. open and conditional access respectively):

The information in this study will only be used in ways that will not reveal who you are. You will not be identified in any publication from this study or in any data files shared with other researchers. Your participation in this study is confidential.

I agree that research data gathered for the study may be published provided my name and other identifying information is not used. Other genuine researchers [may] have access to this data only if they agree to preserve the confidentiality of the information as requested in this form.

If explicit consent for sharing is not obtained at the time of the study, it may be possible to seek a waiver from reviewers or to go back to participants for additional consent.

Secondary use of data or information:

The National Statement on Ethical Conduct in Human Research (p. 36) raises the ethical issue of obtaining consent for secondary use of data or information. It is, for example, usually impractical to obtain consent for secondary use of data routinely collected during delivery of a service and respect for participants needs to be demonstrated in other ways.

Sharing existing data without explicit consent is a possibility if all of the following conditions for a waiver of consent, as outlined in Section 2.3.10 of the National Statement on Ethical Conduct in Human Research, are met:

involvement in the research carries no more than low risk to participants,
the benefits from the research justify any risks of harm associated with not seeking consent,
it is impracticable to obtain consent (for example, due to the quantity, age or accessibility of records),
there is no known or likely reason for thinking that participants would not have consented if they had been asked,
there is sufficient protection of their privacy,
there is an adequate plan to protect the confidentiality of data,
in case the results have significance for the participants’ welfare there is, where practicable, a plan for making information arising from the research available to them (for example, via a disease-specific website or regional news media),
the possibility of commercial exploitation of derivatives of the data or tissue will not deprive the participants of any financial benefits to which they would be entitled, and
the waiver is not prohibited by State, federal, or international law.

JCU researchers and HDR candidates should always consult their College / Centre Human Ethics Advisor, and the Research and Innovation Services Ethics and Research Integrity team for specific advice.

Contracts

A contract is a legally binding agreement that defines and governs the rights and duties between or among its parties. A number of specific issues need to be addressed prior to the commencement of a research project relating to the use, management, sharing and ownership of research data and information.

Some examples of agreements include:

With research funders:

The research funding agreement may stipulate that the funding organisation has a claim to, or ownership of, the intellectual property (copyrights and other rights) created through the funded research;
Alternatively, the agreement may grant licence rights to the funding organisation with respect to the use of the data. Even if the funding organisation does not acquire full ownership, it may be granted specific rights to use, share, or commercialise the data as outlined in the agreement.

With collaborators:

Research Collaboration Agreement (RCA): This agreement should specify whether data ownership is joint, shared, or retained by the originating party and clearly articulate the rights granted to each collaborator for using and disseminating the data. It should also address Intellectual Property (IP) rights including the ownership and potential commercialization of IP resulting from collaborative effort;
Indigenous Culture and IP (ICIP) Agreements: Agreements related to Indigenous culture and intellectual property often prioritize community rights and control over the data. Ownership may remain with the Indigenous community, and researchers might be granted specific, limited rights for their research purposes.
ICIP may also be relevant in the context of data providers (below)

With data providers:

Several types of agreements with data providers may be relevant, including:

Confidentiality Agreement: Ownership usually remains with the data provider, and the receiving party (your research team) is obligated to keep the information confidential and not disclose it to third parties;
Data Transfer Agreement: Ownership and rights are often specified in these agreements. They may grant the recipient certain rights to use the data for the intended purpose but restrict further dissemination or commercialization without explicit permission;
Application under the Public Health Act 2005 (PHA): Ensure that you understand the terms related to data ownership, as some PHAs might allow specific uses for public health research while preserving certain rights for the data provider.

Make sure that any agreement you enter into makes the conditions for storing and sharing any derived data clear.

Always consult Research and Innovation Services at contractsconnect@jcu.edu.au for specific advice.

Copyright

In most jurisdictions, including Australia, copyright protection applies to the expression of ideas rather than the ideas or facts themselves. Datasets will be protected by copyright if they meet certain threshold criteria of human authorship, originality, or creativity. Compiling and presenting raw data (e.g. adding labels, units, performing calculations etc.) is often sufficient to attract copyright protection.

Here's an example: If a data logging machine placed in a creek generates 'raw data' about (for example) water quality, that data may not attract copyright protection, despite the fact that researchers may have used considerable skill, effort and expertise in siting the machine. Since the 'raw' data itself would have no human authorship and originality, it would not satisfy the legal basis of copyright. However, if a researcher were to examine the data from the data logging machine, note certain errors and make corrections to the data, or reform the selection and arrangement of the dataset, that (sometimes relatively minor) act of human authorship, originality, and application of skill and judgment may be sufficient for the resulting dataset to attract copyright protection.

Even if your data does not meet the threshold for copyright there is no harm in applying a Creative Commons licence when publishing the data. It lets others know how you would like to be attributed and applies a limitation of liability and warranty clause to the data.

Refer also to Moral Rights and the Library website for further information about Copyright.

Diagram indicating when copyright applies to data

Data Repositories

This section lists some of the repository options available for publishing research data or finding existing data for reuse at JCU.

The two key data repositories are:

Research Data JCU (via Research Data Australia): JCU datasets registered in Research Data Australia (most recent first).
Research Data Australia (RDA): Australia's research data commons helps you find, access and reuse data for research from 100 Australian research organisations, government agencies and cultural institutions. RDA harvests data descriptions and links to data held with their data publishing partners. JCU has over 2,500 datasets in RDA.

Data repositories - whether institutional, national, international, generalist, or discipline-specific - exist to support and facilitate long-term access to research data.

Research funders or journals may mandate data deposition in a particular repository. For example, ‘Most journals require DNA and amino acid sequences that are cited in articles be submitted to a public sequence repository (DDBJ/ENA/Genbank - INSDC) as part of the publication process’ (https://www.ncbi.nlm.nih.gov/genbank/submit/).

Many journals integrate data deposition in a generalist repository (e.g. Dryad) with the submission of the manuscript for the related research publication.

Some researchers may also choose to publish a data paper - these are published research outputs in a specialist data journal (or section of a more generalist journal) with the primary purpose of exploring the research potential of a particular data set and deriving new research findings. While a published data paper would usually constitute a significant investment of time and effort above the deposit and publication of data through a repository, it may be an avenue worth exploring depending on the field of research.

Information about specific important repositories ...

Data Retention

Preserving the data and information after your research project has been finalised is critical to:

Prevent data loss;
Enable long-term access, discovery and reuse; and
Ensure researchers and institutions can defend their research outcomes if they are challenged.

Preservation activities need to be planned and should take into account file formats and data quality, data ownership (refer to Copyright, Intellectual Property and Moral Rights), retention periods, preferred data repositories and ways to share data safely.

Retention rules are defined by the research funding body or the university. Key documents for JCU researchers include the guide 'Management of Data and Information in Research' which supports the 2018 Code and the University Sector Retention and Disposal Schedule for Queensland universities.

In general, the minimum period for retention of data is five years from the end of the year of publication of the last refereed publication or other form of public release to an audience outside of the University that is based on the data. However, in any particular case the period for which data should be retained should be determined by the specific type of research e.g. for areas such as gene therapy, research data must be retained permanently.
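
The general five-year rule above is simple date arithmetic, sketched below in Python. This reads the rule as "five full years counted from 31 December of the year of the last public release". The function name is hypothetical, and longer or permanent retention periods for specific research types are not modelled, so always check the University Sector Retention and Disposal Schedule.

```python
from datetime import date

# General minimum stated above; some research types require longer or permanent retention.
MINIMUM_RETENTION_YEARS = 5


def earliest_disposal_date(last_release, retention_years=MINIMUM_RETENTION_YEARS):
    """Return the last day of the minimum retention period.

    The period is counted from the end of the calendar year in which the last
    refereed publication (or other public release) based on the data appeared.
    """
    return date(last_release.year + retention_years, 12, 31)
```

Under this reading, data supporting a paper published on 15 March 2021 would need to be kept until at least 31 December 2026.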

For more information refer to the retention rules for specific data types.

De-identifying Data

De-identifying data is the process used to prevent someone’s personal identity from being revealed. Data that has been de-identified no longer triggers the Privacy Act.

For example, data from the PALS (Pregnancy and Lifestyle Study) has been de-identified and is available for download. The risk of re-identification via triangulation has also been considered and managed.

Although the study contains highly sensitive data, several techniques have been used to de-identify the dataset: identifiers and dates of birth have been removed, ages have been aggregated into bands, and postcodes have been excluded. Postcodes matter because it would otherwise be possible to re-identify (triangulate) participants by combining, for example, a rural postcode with an occupation.
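
The techniques named above (removing identifiers, banding ages, excluding postcodes) can be sketched in a few lines of Python. This is an illustration only, not the procedure used for the PALS dataset. The field names and the ten-year band width are assumptions, and a real dataset will need its own list of identifying fields.

```python
# Hypothetical field names; adjust to the columns in your own dataset.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "postcode"}


def age_band(age, width=10):
    """Map an exact age to a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record):
    """Return a copy of `record` without direct identifiers and with age banded."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        # Replace the exact age with its band so individuals are harder to single out.
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned
```

The original record is left unchanged, which fits the advice to retain unedited versions of the data for use within the research team.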

Think about de-identifying your data early as it can be time consuming and difficult later. The Australian Research Data Commons (ARDC) has some tips on de-identification, listed below and in their Identifiable Data guide. You should also seek discipline-specific advice as required.

plan de-identification early in the research as part of your data management planning
make sure the consent process includes the accepted level of anonymity required and clearly states what may and may not be recorded, transcribed, or shared
retain original unedited versions of data for use within the research team and for preservation
create a de-identification log of all replacements, aggregations or removals made
store the log separately from the de-identified data files
identify replacements in text in a meaningful way, e.g. in transcribed interviews indicate replaced text with [brackets] or use XML markup tags
for qualitative data (such as transcribed interviews or survey textual answers), use pseudonyms or generic descriptors rather than blanking out information
digitally manipulate audio and image files to remove identifying information
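
Two of the tips above, marking replaced text with [brackets] and keeping a de-identification log, might look like this in Python for a transcribed interview. The function name and the log fields are hypothetical. The returned log contains the original identifying terms, so it should be stored separately from the de-identified files, as recommended above.

```python
import re


def pseudonymise(text, replacements):
    """Replace identifying terms with bracketed pseudonyms and log each change.

    `replacements` maps an identifying string to a pseudonym or generic descriptor.
    Returns the edited text and a log (a list of dicts) to be stored separately.
    """
    log = []
    for original, substitute in replacements.items():
        marker = f"[{substitute}]"
        # Match whole words only, so short names do not match inside longer words.
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        # A function replacement avoids special handling of backslashes in the marker.
        text, count = pattern.subn(lambda _match: marker, text)
        if count:
            log.append({"original": original, "replacement": marker, "occurrences": count})
    return text, log
```

Automated replacement only finds the terms you list, so a manual read-through of each transcript is still needed.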

Intellectual Property

'Intellectual property' is a broad term that describes the laws which protect products of people’s imagination and creativity.

The main forms of intellectual property include:

Under the Australian Copyright Act 1968, the owners of literary, dramatic, musical and artistic works, sound recordings, films and broadcasts have exclusive rights which allow them to:

Reproduce/copy a work
Publish
Perform in public
Communicate the work to the public via electronic means – including making it available online or sending via email.

Intellectual property rights at JCU are managed in accordance with the Intellectual Property Policy and Procedure; see the Research and Innovation Services website for further information about intellectual property in research.

Moral Rights

Moral rights are personal legal rights belonging to the creator of copyright works and cannot be transferred, assigned or sold. They ensure that the creators of works are correctly attributed and the works are not treated in a derogatory way and that the integrity of the work is upheld.

By assigning ownership of the copyright of a work to someone else the author transfers control over its future publication or reproduction to the new owner. But the author almost always retains moral rights to his/her work regardless of the copyright owner. This provides the creator of a work the right to be identified as the author and the right to take legal action against any change to the title of or derogatory treatment of the work itself.

Sensitive Data

Data is considered sensitive if it can be used to identify an individual, species, object, or location in a way that introduces a risk of discrimination, harm, or unwanted attention.

Examples of sensitive data include identifiable or re-identifiable personal and health/medical data, Indigenous data, ecological data (e.g., location of rare or endangered species), commercial-in-confidence and defence-related data.

Sensitive data is commonly subject to legal, ethical and/or regulatory requirements that restrict how it can be accessed, handled and shared.

Personal information is sensitive if it directly identifies a person and includes one or more pieces of information from Table 1 (Part I, Division I, Section 6) of the Privacy Act 1988. This information includes:

Racial or ethnic origin
Political opinions
Membership of a political association
Religious beliefs or affiliations
Philosophical beliefs
Membership of a professional or trade association
Membership of a trade union
Sexual orientation or practices
Criminal record
Health information (see section 6FA for definition)
Genetic information
Biometric information.

While sensitive data cannot be published in its original form, in the majority of cases, it can be shared using a combination of:

See Sensitivity levels in Research Data JCU for guidelines on applying research data classification tags in your RDMP.

Triangulation, Data Linkage and Integrating Authorities

It's important that researchers are aware that data that is not obviously sensitive (for example, data with no names or dates of birth), or that has been de-identified, can become sensitive through triangulation or data linkage.

Triangulation in this context is the process of combining several pieces of non-sensitive information (in the same dataset) to determine the identity or sensitivity of a participant or subject.
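
One simple way to look for triangulation risk is to count how many records share each combination of indirectly identifying values. The Python sketch below is a basic k-anonymity-style check, a technique not named in this guide; the field names and the threshold `k` are assumptions. A combination held by very few records (for example, one rural postcode with one occupation) may point to a single participant.

```python
from collections import Counter


def risky_combinations(records, quasi_identifiers, k=2):
    """Return value combinations of `quasi_identifiers` shared by fewer than `k` records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}
```

An empty result does not prove that a dataset is safe; it only shows that no listed combination falls below the chosen threshold.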

Data linkage combines two or more datasets that include the same participant or subject, an activity that carries the risk of re-identification and may place subjects at risk. Data linkage is highly useful (it increases understanding without having to collect new data and derives greater value from existing datasets) and is increasingly common in epidemiology, medical, social and ecological sciences. Researchers should treat the new, linked dataset as an identifiable dataset and assess the risks involved.
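
To show why a linked dataset needs to be treated as identifiable, here is a minimal Python sketch of linkage on a shared key. It assumes both datasets carry the same participant key and that the key is unique in the second dataset. The names are hypothetical, and real linkage projects generally use more careful matching than this.

```python
def link_datasets(left, right, key):
    """Inner-join two lists of dicts on `key`, merging the fields of matching records."""
    # Assumption: each key value appears at most once in `right`.
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]
```

Each linked record now holds fields from both sources, such as a health condition alongside a postcode and an occupation, which is the kind of combination that makes re-identification easier.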

High risk data integration projects involving information from Australian, state or territory governments will need to be managed by an accredited integrating authority, such as the Australian Institute of Health and Welfare (AIHW), the Australian Institute of Family Studies (AIFS) or the Australian Bureau of Statistics (ABS), to ensure security. Once data is linked, researchers will access it through a secure data lab in Canberra, a mobile data lab, a remote access computing environment or another secure arrangement, and the output and use of data will be monitored. The AIHW has useful information on data linkage on their website.
