There’s a lot to consider in planning and designing your research project and it’s important to do this as early as possible.
Before you begin, you will need to create a Research Data Management Plan (RDMP) in Research Data JCU.
Although you may not have all the information available at this time, the Research Data JCU platform will assist and prompt you to plan and manage the data and information components of your project. As your project progresses, these details can and should be updated as changes occur.
Having an RDMP also helps satisfy requirements of the Australian Code for the Responsible Conduct of Research and is often required by funders, including the Australian Research Council (ARC).
This step will also require you to consider:
Whether you will be collaborating with (external) partners and require a contract or agreement to be developed, e.g. confidentiality agreements.
Whether you will be dealing with personal information and/or sensitive data which will require ethical review.
How you will ensure your data meets the FAIR Principles.
About Research Data Management Plans (RDMPs)
A Research Data Management Plan (RDMP) is:
Documented in Research Data JCU.
A non-public metadata record that describes:
what data will be created;
what policies apply to the data;
who will own and have access to the data;
what data management practices will be used;
what facilities and equipment will be required; and
who will be responsible for each of these activities.
A living document that is added to and refined throughout the research project.
Completing an RDMP:
Establishes a link between the research project and the research data and information.
Is increasingly becoming a requirement of research funding.
Mitigates the risks of data loss and unauthorised use of your data and information by addressing storage and ethical/legal issues.
Supports best practice in research.
Makes it easier to store and publish data in future.
Satisfies requirements under the Australian Code for the Responsible Conduct of Research.
The Research Data JCU platform includes extensive help text (click on the ? icon) and prompts for each metadata field.
This module is part of the RD7003 Compulsory Workshops for HDR Candidates and can also be accessed via the Higher Degree by Research Students Organisation on LearnJCU. Completion of a short quiz on LearnJCU is required.
When developing your RDMP you will need to consider
The following access levels can be applied to your data via the Research Data JCU platform:
1. Restricted access: Most data can be published via open or conditional access, but this option is useful for sensitive datasets that cannot be de-identified and for highly confidential data. Making metadata available via a Data Publication ensures your work is more visible and facilitates discussion/collaboration with other researchers.
2. Conditional access: This can be a good option for sharing sensitive data that has been de-identified. By making access conditional you can ensure requestors are genuine researchers and that they will maintain confidentiality and keep data files secure.
3. Open access: Data can be downloaded via a link in Research Data JCU. Open data can be freely used, reused and redistributed by anyone, subject at most to the requirement to attribute and share alike. The ideal route is to ensure data is in a machine-readable format on an easily accessible platform with an open licence applied to it. Some licences are more ‘open’ than others. Look at the Know Your Rights: Understanding CC Licences Poster for a good visualisation/comparison of the licences. This option maximises the visibility and potential impact of your data and may be required by your funder or publisher.
You can make data files (in the same dataset) available under different conditions. Using survey data as an example:
raw data with direct identifiers would need to be stored in a secure location (option 1),
de-identified data might be made available via negotiation (2), and
the survey questions and codebook describing data variables could be public (3).
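As a rough illustration of the survey example above, the sketch below records an access option for each file in a dataset. The file names and the `Access` labels are hypothetical and are not taken from Research Data JCU; the numbers simply mirror options 1 to 3 in the example.

```python
from enum import Enum


class Access(Enum):
    """Access options, numbered as in the survey example (labels are assumed)."""
    RESTRICTED = 1   # stored in a secure location, not shared
    CONDITIONAL = 2  # made available via negotiation
    OPEN = 3         # public


# Hypothetical file names for a single survey dataset.
dataset_files = {
    "survey_raw_identified.csv": Access.RESTRICTED,
    "survey_deidentified.csv": Access.CONDITIONAL,
    "survey_questions.pdf": Access.OPEN,
    "codebook.csv": Access.OPEN,
}


def files_with_access(files, level):
    """Return the sorted file names assigned to the given access option."""
    return sorted(name for name, access in files.items() if access is level)


public_files = files_with_access(dataset_files, Access.OPEN)
print(public_files)
```

Keeping a simple listing like this alongside your RDMP can make it easier to check that each file's access option matches the consent and ethics approval that apply to it.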
You can change options even after the dataset has been published. Under certain circumstances you may wish to have restricted or conditional access to your data and then open it up after a nominated period. In this example, you should choose a licence for your data, as a licence will govern use of the data.
It’s important to remember that whatever access conditions you apply to your data and information need to align completely with any ethics approval and/or consent that has been given. For example:
If the consent form and related information sheet indicate that the data and information is for the ‘specific’ purpose of this research project, the data and information CANNOT be used for ANY other purpose, even by the primary researcher.
However, if the consent form and related information sheet indicate that the data and information can be used for ‘extended’ purposes such as related research, the data and information CAN be used for whatever secondary purpose is stated. Where appropriate, further conditional access may be applied as per above.
Confidentiality of data may be a requirement for projects being developed with industry partners, and/or with commercial applications. Such projects may require the negotiation and enactment of non-disclosure or confidentiality agreements that generally will affect the handling and use of data.
Consent is required from human participants before data can be collected or published. Obtaining informed consent to facilitate data sharing and publication involves:
Developing an information sheet about maintaining confidentiality, data publication and sharing so participants can make an informed decision before consenting to participate. Your information sheet should be approved by JCU's Human Research Ethics Committee (HREC) as part of the ethical clearance process (refer below).
Stating the possibility of future data publication and sharing, conditions for access (refer below) and de-identification processes (refer below)
Seeking prior approval from HREC for consent forms and information sheets.
The Australian National Data Service provides some example sentences in their guide (pp. 14-15) - the examples listed below are appropriate in different contexts (e.g. open and conditional access):
I agree that research data gathered for the study may be published provided my name and other identifying information is not used.
Other genuine researchers [may] have access to this data only if they agree to preserve the confidentiality of the information as requested in this form.
If explicit consent for sharing is not obtained at the time of the study, it may be possible to seek a waiver from reviewers or to go back to participants for additional consent.
The National Statement on Ethical Conduct in Human Research (p. 36) raises the ethical issue of obtaining consent for secondary use of data or information. It is, for example, usually impractical to obtain consent for secondary use of data routinely collected during delivery of a service and respect for participants needs to be demonstrated in other ways.
Sharing existing data without explicit consent is a possibility if all of the following conditions are met:
It is no longer possible or practical to gain consent; and
Data has been de-identified; and
The process of de-identification matches the definition in the Privacy Act; and
There is no risk that publishing or sharing the data will cause harm or discrimination; and
Information sheets and consent forms from the original data collection didn't preclude sharing.
JCU researchers and HDR candidates should always consult their College / Centre Human Ethics Advisor, and the JCU Connect Ethics and Research Integrity team for specific advice.
Will you be collaborating with (external) partners and require a contract or agreement to be developed e.g., for data transfer or data processing?
A contract is a legally binding agreement that defines and governs the rights and duties between or among its parties. A number of specific issues relating to the use, management, sharing and ownership of research data and information need to be addressed before a research project commences. Contact JCU Connect for further information.
De-identifying data is the process used to prevent someone’s personal identity from being revealed. Data that has been de-identified no longer triggers the Privacy Act.
For example, in one study containing highly sensitive data, several techniques were used to de-identify the dataset: identifiers and dates of birth were removed, ages were aggregated into bands, and postcodes were excluded. Otherwise, it would be possible to re-identify (triangulate) participants by combining, for example, a rural postcode with a rare occupation.
Think about de-identifying your data early, as it can be time-consuming and difficult later. Consult the relevant ANDS guides and seek discipline-specific advice as required.
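The techniques mentioned above (removing identifiers and dates of birth, aggregating ages into bands, excluding postcodes) could be sketched roughly as follows. The field names and the ten-year band width are assumptions made for illustration only. A script like this does not by itself guarantee that data meets the Privacy Act definition of de-identified, so discipline-specific advice still applies.

```python
# Hypothetical field names; adjust these to your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}
EXCLUDED_FIELDS = {"postcode"}


def age_band(age, width=10):
    """Aggregate an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record):
    """Return a copy without direct identifiers or postcode, with age banded."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key not in EXCLUDED_FIELDS
    }
    if "age" in cleaned:
        # Replace the exact age with its band.
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


# A made-up record, not real participant data.
raw = {
    "name": "Example Person",
    "email": "person@example.org",
    "date_of_birth": "1988-02-14",
    "age": 36,
    "postcode": "4871",
    "occupation": "nurse",
    "response": "agree",
}
print(deidentify(raw))
```

The function returns a new record and leaves the raw record unchanged, so the identified original can stay in secure storage while only the cleaned copy is considered for sharing.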
Research Data Management within Australia should comply with the Australian Code for the Responsible Conduct of Research, and needs to take into account any relevant ethical obligations, privacy protocols, and intellectual property rights with respect to the storage and security of research data and associated information. The level of detail in which data can be shared may also be limited by factors such as research ethics, and/or by intellectual property rights and other legal restrictions.
Many research projects will involve collecting data or information about human or animal subjects in a way that might impact on their rights. Before collecting or using the personal information of others in research (e.g. health or social science research), or planning or conducting experimental or other research involving animals or animal populations, JCU researchers must obtain ethical clearance for their projects through a formal application process managed by the Ethics and Integrity Office within JCU Connect.
Under Australia’s FAIR Access Policy Statement, all publicly funded research outputs must follow the FAIR principles.
The FAIR Principles have been developed to make research more visible and to allow researchers to more easily collaborate and maximise the return on investment in research and innovation. The acronym stands for Findable, Accessible, Interoperable and Reusable:
Data can be more findable by: properly describing what the data is; putting it in a permanent and easily searchable place; and making it easy for humans and computers to search for it.
Data can be more accessible by: using non-proprietary, standardised and automated methods to supply the data to those who want or need it; letting others know how they can get the data; and letting others know if the data is no longer available.
Data can be more interoperable by: storing and providing the data in widely-used and accessible file formats; describing the data using standard terms (vocabularies) that are relevant and widely known; and describing if it relates to other data and what exactly that relationship is.
Data can be more reusable by: making it clear how the data was collected or if there are validity concerns; making any conditions of reuse clear in a licence readable by humans and machines; and meeting the standards used within the relevant research community.
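One way to picture the four points above is as a machine-readable metadata record that can be checked for gaps. The sketch below is illustrative only: the field names, the grouping of fields under each principle, and the example values (including the placeholder identifier and URL) are assumptions, not the Research Data JCU schema or any formal FAIR standard.

```python
import json

# Assumed grouping of metadata fields under each FAIR heading.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["access_url", "access_conditions"],
    "interoperable": ["file_format", "vocabulary"],
    "reusable": ["licence", "collection_method"],
}


def missing_fields(record):
    """Map each FAIR heading to the fields that are absent or empty."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        absent = [field for field in fields if not record.get(field)]
        if absent:
            gaps[principle] = absent
    return gaps


# Hypothetical record; the identifier and URL are placeholders.
record = {
    "identifier": "doi:10.0000/example",
    "title": "Example survey dataset",
    "description": "De-identified responses to an example survey.",
    "access_url": "https://example.org/dataset/1",
    "access_conditions": "open",
    "file_format": "text/csv",
    "vocabulary": "discipline keyword list",
    "licence": "",  # left empty on purpose to show a gap
    "collection_method": "online questionnaire",
}

print(json.dumps(record, indent=2))  # machine-readable form of the record
print(missing_fields(record))
```

Here the check reports the empty licence field under the reusable heading, which corresponds to the point that conditions of reuse should be stated clearly.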
Research involving the use of personal information (information or an opinion about an identified individual, or an individual who is reasonably identifiable) must comply with applicable privacy legislation.
Information Privacy Act 2009 (Queensland) - In the state of Queensland, the right to privacy with regard to the use of personal information is enshrined by the Information Privacy Act 2009. With regard to research projects, JCU's Right to Information and Privacy webpages state that 'It is the responsibility of the researcher and the relevant Division/College to ensure compliance with the information privacy principles in the IP [Information Privacy] Act. The personal information should be securely held and access to it should be limited to members of the research team, the funding body, if appropriate, and staff providing assistance to or supervising the research team. Researchers should seek the informed written consent of individuals who will provide personal information for research purposes and keep a record of that consent.'
Privacy Act 1988 (Commonwealth) - The federal Privacy Act does not apply of its own force to universities and research institutions created under state legislation; these are governed by state privacy laws. Universities are, however, contractually bound to follow the National Privacy Principles (NPPs) contained in the Act under funding agreements with Commonwealth agencies, including the NHMRC and ARC.
Broadly, sensitive data is information that could potentially impact on the rights of others. The Australian National Data Service includes the following definition of sensitive data in their Publishing and Sharing Sensitive Data guide: ‘Sensitive data identifies individuals, species, objects or locations, and carries a risk of causing discrimination, harm or unwanted attention.’
Sensitive data is often about people (i.e. personal information) but ecological data can also be sensitive if it reveals, for example, the location of rare or endangered species. Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in this form. The Legal and Ethical Framework section outlines the applicable legislation and guidelines.
Personal information is sensitive if it directly identifies a person and includes one or more pieces of information from Table 1 (Part I, Division I, Section 6) of the Privacy Act 1988. This information includes:
Racial or ethnic origin
Membership of a political association
Religious beliefs or affiliations
Membership of a professional or trade association
Membership of a trade union
Sexual orientation or practices
Health information (see section 6FA for definition)
While sensitive data cannot be published in its original form, in almost all cases, it can be shared using a combination of:
It's important that researchers are aware that data that is not obviously sensitive (no names or dates of birth, for example), or that has been de-identified, can become sensitive through triangulation or data linkage.
Triangulation in this context is the process of combining several pieces of non-sensitive information (in the same dataset) to determine the identity or sensitivity of a participant or subject.
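A simple way to look for triangulation risk is to count how many records share each combination of seemingly non-sensitive values; a combination held by very few records may point to an individual. The sketch below assumes hypothetical field names and made-up records, and the threshold `k` is an arbitrary choice for illustration rather than a recommended standard.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=2):
    """Return value combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up records; none of these fields is a direct identifier.
records = [
    {"postcode": "4810", "occupation": "teacher", "age_band": "30-39"},
    {"postcode": "4810", "occupation": "teacher", "age_band": "40-49"},
    {"postcode": "4810", "occupation": "teacher", "age_band": "30-39"},
    {"postcode": "4871", "occupation": "farrier", "age_band": "30-39"},
]

risky = small_groups(records, ["postcode", "occupation"])
print(risky)
```

In this made-up data only one record combines that postcode with that occupation, so those two fields together could single out a participant even though neither is sensitive on its own.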
Data linkage combines two or more datasets that include the same participant or subject, an activity that carries the risk of re-identification and may place subjects at risk. Data linkage is highly useful (it increases understanding without having to collect new data and derives greater value from existing datasets) and is increasingly common in epidemiology and in the medical, social and ecological sciences. Researchers should treat the new, linked dataset as an identifiable dataset and assess the risks involved.
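The sketch below shows in miniature why a linked dataset should be treated as identifiable: joining two made-up datasets on a shared key produces records that carry more attributes than either source did alone. The `study_id` key, the field names and the values are all hypothetical.

```python
def link(left, right, key):
    """Inner-join two lists of records on a shared key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Two made-up datasets that include some of the same participants.
health = [
    {"study_id": "P01", "age_band": "30-39", "condition": "asthma"},
    {"study_id": "P02", "age_band": "40-49", "condition": "diabetes"},
]
employment = [
    {"study_id": "P01", "postcode": "4871", "occupation": "farrier"},
    {"study_id": "P03", "postcode": "4810", "occupation": "teacher"},
]

linked = link(health, employment, "study_id")
print(linked)
```

The single linked record now places a health condition next to a postcode and an occupation, a combination that neither source dataset exposed by itself.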
High-risk data integration projects involving information from Australian, state or territory governments will need to be managed by an accredited integrating authority, such as the Australian Institute of Health and Welfare (AIHW), the Australian Institute of Family Studies (AIFS) or the Australian Bureau of Statistics (ABS), to ensure security. Once data is linked, researchers will access it through a secure data lab in Canberra, a mobile data lab, a remote access computing environment or another secure arrangement, and output and use of data will be monitored. The AIHW has useful information on data linkage on their website, including an overview of a typical data integration project.