This section provides an alphabetical listing of some of the terminology used in managing research data and information along with the meaning and/or application of these terms.
Data citation refers to the practice of providing a reference to data in the same way as researchers routinely provide a bibliographic reference to other research outputs such as journal articles, reports and conference papers.
Data citation is important because:
- It leads to recognition of data as a primary research output.
- It facilitates reproducible and transparent research.
- Only cited data can be counted and tracked to measure impact.
- Citations for your published data can be included in researcher profiles (e.g. ORCID), curricula vitae, etc.
- It increases the citation rate of the publications associated with the data.
DataCite provides a recommended minimum format for citing data:
- Required elements: Creator | Publication Year | Title | Publisher | Identifier (a URL, DOI or other persistent identifier)
- Optional elements: Version | Resource Type (e.g. ‘dataset’)
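The required and optional elements above can be assembled into a citation string along the following lines. This is a minimal sketch only: the `DataCitation` class, its field names, the punctuation and the position of the optional elements are illustrative assumptions, not a format prescribed by DataCite or by any style manual, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements listed above."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str                      # a URL, DOI or other persistent identifier
    version: Optional[str] = None        # optional element
    resource_type: Optional[str] = None  # optional element, e.g. "dataset"

    def to_text(self) -> str:
        # Assumed layout: Creator (Year): Title. [Version.] Publisher. [Type.] Identifier
        parts = [f"{self.creator} ({self.publication_year}): {self.title}"]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.append(self.publisher)
        if self.resource_type:
            parts.append(self.resource_type)
        return ". ".join(parts) + ". " + self.identifier


# Invented example values, for illustration only.
citation = DataCitation(
    creator="Smith, J.",
    publication_year=2020,
    title="Reef survey counts",
    publisher="Example University",
    identifier="https://doi.org/10.1234/example",
)
print(citation.to_text())
```

If your style manual or publisher specifies a different order or punctuation, adjust `to_text` to match it.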
Follow your style manual or publisher's advice for citing data. If no format is suggested for datasets, take a standard data citation style and adapt it to match the style for textual publications.
The DataCite DOI Citation Formatter is an online tool for formatting citations (just paste in the DOI) in hundreds of different styles.
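Formatted citations can also be requested programmatically through DOI content negotiation, which is a separate route from the online formatter described above. The sketch below assumes the `text/x-bibliography` media type with `style` and `locale` parameters is accepted by `https://doi.org/` for the DOI in question; the function names and the sample DOI are hypothetical, and the network call is left for you to run against a real DOI.

```python
from urllib.request import Request, urlopen


def build_citation_request(doi: str, style: str = "apa", locale: str = "en-US") -> Request:
    """Build (but do not send) a content-negotiation request for a formatted citation."""
    url = f"https://doi.org/{doi}"
    # Assumption: the resolver honours this Accept header for the given DOI.
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    return Request(url, headers={"Accept": accept})


def fetch_citation(doi: str, style: str = "apa") -> str:
    """Send the request and return the formatted citation as text."""
    with urlopen(build_citation_request(doi, style), timeout=10) as response:
        return response.read().decode("utf-8").strip()


# "10.1234/example" is a placeholder, not a real DOI.
request = build_citation_request("10.1234/example", style="apa")
print(request.full_url)
print(request.get_header("Accept"))
```

Calling `fetch_citation` with a real dataset DOI and a style name should return a single formatted reference, provided the DOI's registration agency supports this media type.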
Due to current system limitations, internal and external collaborators are dealt with in two ways:
- Internal collaborators: a JCU staff member or HDR candidate who has contributed to the research data in some way but is NOT a data creator.
- External collaborators: someone who has contributed to the research data in some way; this group may INCLUDE external data creators.
Refer to Data Storage - Completed Data
Confidentiality of data may be a requirement for projects being developed with industry partners, and/or with commercial applications. Such projects may require the negotiation and enactment of non-disclosure or confidentiality agreements that generally will affect the handling and use of data.
See Research and Innovation Services' guide to data and confidentiality for some general guidance. Specific enquiries about data confidentiality in the context of protection and commercialisation of intellectual property should be directed to JCU's commercialisation or contracts teams.
Consent is required from human participants before data can be collected or published. Obtaining informed consent to facilitate data sharing and publication involves:
- Developing an information sheet about maintaining confidentiality, data publication and sharing so participants can make an informed decision before consenting to participate. Your information sheet should be approved by JCU's Human Research Ethics Committee (HREC) as part of the ethical clearance process (refer below).
- Stating the possibility of future data publication and sharing, conditions for access (refer below) and de-identification processes (refer below)
- Seeking prior approval from HREC for consent forms and information sheets.
The Australian National Data Service provides some example sentences in their guide (pp. 14-15) - the examples listed below are appropriate in different contexts (e.g. open and conditional access):
- I agree that research data gathered for the study may be published provided my name and other identifying information is not used.
- Other genuine researchers [may] have access to this data only if they agree to preserve the confidentiality of the information as requested in this form.
If explicit consent for sharing is not obtained at the time of the study, it may be possible to seek a waiver from reviewers or to go back to participants for additional consent.
The National Statement on Ethical Conduct in Human Research (p. 36) raises the ethical issue of obtaining consent for secondary use of data or information. It is, for example, usually impractical to obtain consent for secondary use of data routinely collected during delivery of a service and respect for participants needs to be demonstrated in other ways.
Sharing existing data without explicit consent is a possibility if all of the following conditions for a waiver of consent, as outlined in Section 2.3.10 of the National Statement on Ethical Conduct in Human Research, are met:
- involvement in the research carries no more than low risk to participants,
- the benefits from the research justify any risks of harm associated with not seeking consent,
- it is impracticable to obtain consent (for example, due to the quantity, age or accessibility of records),
- there is no known or likely reason for thinking that participants would not have consented if they had been asked,
- there is sufficient protection of their privacy,
- there is an adequate plan to protect the confidentiality of data,
- in case the results have significance for the participants’ welfare there is, where practicable, a plan for making information arising from the research available to them (for example, via a disease-specific website or regional news media),
- the possibility of commercial exploitation of derivatives of the data or tissue will not deprive the participants of any financial benefits to which they would be entitled, and
- the waiver is not prohibited by State, federal, or international law.
JCU researchers and HDR candidates should always consult their College / Centre Human Ethics Advisor, and the Research and Innovation Services Ethics and Research Integrity team for specific advice.
A contract is a legally binding agreement that defines and governs the rights and duties between or among its parties. A number of specific issues need to be addressed prior to the commencement of a research project relating to the use, management, sharing and ownership of research data and information. See Research and Innovation Services for further information at email@example.com
Datasets will be protected by copyright (and defined as ‘literary works’ under the Australian Copyright Act) if they meet certain threshold criteria of human authorship, originality, or creativity. Basically, compiling and presenting raw data (e.g. adding labels, units, performing calculations etc) is often sufficient to attract copyright protection. Copyright in research data produced at JCU is governed by JCU's Intellectual Property Policy and Procedure.
The Australian Research Data Commons Research Data Rights Management Guide provides the following example of the relationship between copyright and research data.
If a data logging machine placed in a creek were to generate 'raw data' about (for example) water quality, that data would not attract copyright protection, despite the fact that researchers may have used considerable skill, effort and expertise in siting the machine. Since the 'raw' data itself would have no human authorship and originality, it would not satisfy the legal basis of copyright. However, if a researcher were to examine the data from the data logging machine, notes certain errors and makes corrections to the data, or reforms the selection and arrangement of the dataset; that (sometimes relatively minor) act of human authorship, originality, and application of skill and judgment may be sufficient for the resulting dataset to attract copyright protection.
Even if your data does not meet the threshold for copyright there is no harm in applying a Creative Commons licence when publishing the data. It lets others know how you would like to be attributed and applies a limitation of liability and warranty clause to the data.
Applying a Creative Commons licence to your data is an easy way to ensure correct attribution and enable reuse. The following link provides a detailed explanation of the benefits, conditions and restrictions associated with the six CC licences:
You can also access a summary of the licence deeds from the links below:
The current generation of Creative Commons licences is the International 4.0 suite. Creative Commons recommends you take advantage of the improvements in the 4.0 suite unless there are particular considerations that would require a ported (e.g. Australian) licence. The Australian Creative Commons licence chooser redirects to the international site. Older, ported licences can be selected using the drop-downs in the Data Publication section of Research Data JCU (but this is not usually required).
Offering your data under a CC licence does not mean you are giving up your copyright. Rather, you are allowing users to make use of your work in various ways, but only on certain conditions. The core conditions are outlined in the following table and can be combined to produce the six CC licences.
| Condition | What it means |
|---|---|
| Attribution (BY) | Applies to every Creative Commons work except Creative Commons Zero (CC0). Users are expected to give you appropriate credit, provide a link to the licence and indicate if changes have been made. |
| Non-Commercial (NC) | Users may copy, distribute, display or perform your work but only for non-commercial purposes. |
| No Derivative Works (ND) | Users may not adapt or change your work in any way. |
| Share Alike (SA) | Users may remix, adapt and build on your work, but only if they distribute the derivative works under the same licence terms that govern the original work. |
Watch out for:
It is possible to dedicate your work to the Public Domain by using Creative Commons Zero (CC0).
You may prefer to use one of the CC licences listed to ensure any re-use is counted towards your research impact.
Proponents of CC0 would argue that community norms are sufficient to ensure citation.
Watch out for:
The Non-Commercial condition has the potential to stifle engagement and innovation. Only some datasets will have commercialisation potential but you should check with Research and Innovation Services if you're not sure.
The ‘preferred’ licence at JCU is CC BY-NC but your funder or journal may require you to make your data more open.
Permitting commercial use enables reuse such as sharing content on Wikipedia (which uses CC BY-SA) and allows commercial organisations to preserve content if publishers go bust!
Watch out for:
The No Derivative Works condition severely restricts reuse, including aggregating data and meta-analyses. Open Access journals such as PLOS will not allow you to use this condition. CC BY-NC-ND is often referred to as a ‘free advertising’ licence.
Journals may not permit you to use the ND clause as it limits the ability to do meta-analyses.
Watch out for:
The Share Alike condition can reduce interoperability, which is one of the aims of the FAIR Principles.
A licence can't feature both the Share Alike and No Derivative Works options. The Share Alike condition only applies to derivative works.
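The way the core conditions combine into six licences can be sketched in a few lines of code. The function name `cc_licences` and the short codes used as strings are illustrative choices; the rules encoded are the ones stated above: Attribution is always present, Non-Commercial is optional, and Share Alike and No Derivative Works cannot appear together.

```python
from itertools import product


def cc_licences() -> list[str]:
    """Enumerate the licence names that the combination rules allow."""
    licences = []
    # NC is either absent or present; at most one of SA / ND may be added.
    for non_commercial, derivative_rule in product((False, True), (None, "SA", "ND")):
        parts = ["BY"]  # Attribution applies to every CC licence (CC0 is not a licence of this kind)
        if non_commercial:
            parts.append("NC")
        if derivative_rule:
            parts.append(derivative_rule)
        licences.append("CC " + "-".join(parts))
    return licences


for name in cc_licences():
    print(name)
```

Because Share Alike and No Derivative Works are treated as alternatives for the same slot, the enumeration yields exactly six names and never produces a licence carrying both.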
Creative Commons Zero (CC0) is for dedicating works to the public domain and is used by Dryad and other data repositories.
CC0 works on two levels: as a waiver of a person's rights to the work and, in case that is not effective, as an irrevocable, royalty-free and unconditional licence for anyone to use the work for any purpose. In Australia we always have moral rights (which include the right to attribution), so the waiver is ‘ineffective’, i.e. CC0 waives all copyright and related rights to the fullest extent allowed by the law of the land. There are pros and cons for this approach and researchers need to decide what best meets their needs.
As the Digital Curation Centre suggests, this can be an ‘unattractive option for data whose creators have yet to fully exploit them, either academically or commercially. Nevertheless, it does resolve many of the ambiguities surrounding data use and reuse ... and greatly simplifies integration with other data.’
Dryad also argues that CC0 reduces the legal and technical impediments to data re-use. Imagine, for example, the difficulties you would encounter if you were mining multiple sources for data and were legally required to formally attribute all of the data owners. Dryad's position is that community norms for scholarly communication are a more effective way of encouraging positive behaviour, such as data citation, than applying licences, and that ‘Any publication that makes substantive reuse of the data is expected to cite both the data package and the original publication from which it was derived.’
The Open Data Commons Public Domain Dedication and Licence (PDDL) is similar to CC0, but is worded specifically in database terms. There is also the Open Data Commons Database Contents Licence (ODC-DbCL), which waives copyright for the contents of the database without affecting the copyright or database right of the database itself.
Attribution: Shaddim; original CC license symbols by Creative Commons, CC BY 4.0, via Wikimedia Commons
See Data Custodian.
Refer to Custodianship Model for Research Data and Information and Data Custodian.