Version Control

Version control is the process of managing revisions for files, records or datasets. It is particularly important for files that undergo numerous revisions, where there are multiple members of a research team, or when files are shared across multiple locations. Version control ensures you are working with current versions and that you are not wasting valuable research time or putting data at risk.

Basic version control can be achieved by assigning unique file names and keeping a version control table to record changes. Including details such as initials, date modified, and status (e.g., draft, revised, final) in file names alongside version numbers aids identification, but the names can quickly become unwieldy. Much of this information is better captured in a version control table like this one:

Title:
Description:
Created By:
Date Created:
Maintained By:

Version Number | Modified By | Modifications Made | Date Modified | Status
               |             |                    |               |
               |             |                    |               |
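
For teams that prefer to keep this table as a file next to the data, the change-log portion can be maintained as a simple CSV. The following is a minimal sketch, not a prescribed tool: the column names come from the table above, while the function names (record_version, read_versions), the CSV format, and the ISO date format are assumptions made for illustration.

```python
import csv
from datetime import date
from pathlib import Path

# Column names taken from the change-log part of the table above.
COLUMNS = [
    "Version Number",
    "Modified By",
    "Modifications Made",
    "Date Modified",
    "Status",
]


def record_version(log_path, version, modified_by, changes, status, modified_on=None):
    """Append one row to the version control table (hypothetical helper).

    The file is created with a header row if it does not exist yet.
    `modified_on` defaults to today's date and is stored as YYYY-MM-DD.
    """
    log_path = Path(log_path)
    is_new = not log_path.exists()
    row = {
        "Version Number": version,
        "Modified By": modified_by,
        "Modifications Made": changes,
        "Date Modified": (modified_on or date.today()).isoformat(),
        "Status": status,
    }
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return row


def read_versions(log_path):
    """Return every recorded version as a list of dictionaries, oldest first."""
    with Path(log_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

A call such as record_version("version_log.csv", "01-01", "AB", "Corrected column headings", "revised") adds one row; the descriptive fields (Title, Description, Created By, and so on) would still need to be kept separately, for example in a README.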

Best practice:

Planning for best practice involves recognizing that there is no one-size-fits-all solution. Instead, it is essential to make thoughtful decisions regarding:

  • Retention Policy: Determine how many versions of a file to keep, which versions to retain, the duration of retention, and the folder structures for organizing versions.
  • Milestone Identification: Identify significant milestone versions, prioritizing major versions over minor ones. For instance, consider keeping version 02-00 but not 02-01.
  • Naming Conventions: Establish a systematic naming convention to uniquely identify different versions of files.
  • Documentation: Record changes made to a file when creating a new version and establish clear documentation for tracking those changes.
  • Relationship Mapping: Record relationships between items as needed, such as between code and the data file it operates on, the data file and related documentation or metadata, or multiple files.
  • Location Tracking: Track the location of files, especially if stored in various locations.
  • Synchronization: Regularly synchronize files in different locations to maintain consistency.
  • Centralized Storage: Identify a single location for storing milestone and master versions.

(Source: Adapted from the UK Data Archive Guide)
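
The naming convention and milestone points above can be made concrete with a small script. This is only a sketch under stated assumptions: the pattern name_vMM-mm.ext (for example, survey_v02-01.csv) is a hypothetical convention built around the 02-00 / 02-01 numbering used above, and treating a minor number of 00 as a milestone is one possible rule, not a standard.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical convention: <base>_v<major>-<minor><extension>,
# with two-digit major and minor numbers, e.g. "survey_v02-01.csv".
VERSIONED_NAME = re.compile(
    r"^(?P<base>.+)_v(?P<major>\d{2})-(?P<minor>\d{2})(?P<ext>\.[^.]+)$"
)


class Version(NamedTuple):
    base: str
    major: int
    minor: int
    ext: str


def parse_version(file_name: str) -> Optional[Version]:
    """Return the parsed version, or None if the name does not follow the convention."""
    match = VERSIONED_NAME.match(file_name)
    if match is None:
        return None
    return Version(
        match.group("base"),
        int(match.group("major")),
        int(match.group("minor")),
        match.group("ext"),
    )


def versioned_name(base: str, major: int, minor: int, ext: str) -> str:
    """Build a file name that follows the assumed convention."""
    return f"{base}_v{major:02d}-{minor:02d}{ext}"


def is_milestone(version: Version) -> bool:
    """Assumed rule: a major version (minor number 00) counts as a milestone."""
    return version.minor == 0


def milestones(file_names: List[str]) -> List[str]:
    """Keep only the names that parse and are milestone versions, in input order."""
    kept = []
    for name in file_names:
        version = parse_version(name)
        if version is not None and is_milestone(version):
            kept.append(name)
    return kept
```

Given the names survey_v02-00.csv and survey_v02-01.csv, milestones keeps only the first, matching the retention example above. Names that do not follow the convention are skipped rather than guessed at, so they can be reviewed by hand.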


Version control systems:

While platforms like OneDrive, Google Docs and Dropbox offer built-in version history and the ability to restore previous versions, these features do not substitute for a planned and systematic approach to version control as outlined above. It is also critical to understand how long previous versions are retained when using these services. Always consult each service's documentation or support resources to confirm that its retention behaviour meets your project's needs.

For more complex research projects, especially those involving extensive collaboration or code development, a dedicated version control system may offer a more robust solution. These systems include sophisticated branching, collaboration, and tracking capabilities.

Git (used with the GitHub or GitLab platforms) and Mercurial are well-known solutions. The Bitbucket platform is also a popular choice for hosting Git repositories; it previously hosted Mercurial repositories as well, but ended that support in 2020. Subversion (SVN) with TortoiseSVN (a Windows-based client for Subversion) is a user-friendly option, especially for those less familiar with command-line interfaces.