Analyse Data

Analysing your data is the most crucial part of your research project. During this stage you will summarise and interpret large amounts of data to identify meaningful and useful patterns, relationships and trends.

Data visualisation uses statistical graphs, plots, information graphics and other tools to create visual representations of data. The goal is to summarise and communicate data clearly, precisely and efficiently so that it might promote new insights.

There are many types of visualisation and thousands of tools available. They range greatly in complexity (e.g. from bar graphs to heat maps, networks and 3D models) and specificity.

The following are some of the more popular tools and training resources. Keep in mind that data visualisation tools are often used for exploratory data analysis and not just for displaying results. Some of these tools are designed to do both.

  • From Data to Viz leads you to the most appropriate graph for your data. It links to the code (R, Python, D3.js) to build it and lists common caveats you should avoid.
  • Selected Tools is a curated collection of tools recommended by its creators. View the entire list or filter by function (maps, charts, data or colour) and by whether you are willing to write any code.
  • Data Visualisation Catalogue is a library of different visualisation types. It can be searched by function (e.g. comparisons, hierarchy, processes & methods, analysing text etc.) or viewed as a list. Each entry includes an example, explains how the visualisation is used and links to tools.
  • Visualising Data - Resources List can be filtered by category, including data handling, charting, programming, multivariate, mapping, web-based, specialist and colour.

Data Wrangling or Data Cleaning is the process of identifying and correcting errors and/or making formatting more consistent. It’s often required to prepare data for analysis and/or visualisation, and (where appropriate) when publishing and sharing data. Data also needs to be cleaned before archiving. This helps ensure that it’s preserved correctly, is not misinterpreted by other users, and is interoperable (interoperability is one of the FAIR Principles).
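
As a rough illustration of what cleaning can involve, the Python sketch below tidies a small made-up CSV export using only the standard library. The column names, the two date formats and the use of -999 as a "missing" marker are all assumptions chosen for the example, not features of any particular dataset or tool.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export: stray spaces, mixed case, mixed date formats
# and a -999 sentinel standing in for a missing measurement.
RAW = """Site , Date, Temp_C
 A1,2021-03-04,12.5
a1 ,04/03/2021,12.7
B2,2021-03-05,-999
"""

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed formats present in the file


def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def clean_rows(raw_text):
    """Yield cleaned rows as dicts with consistent keys and formats."""
    reader = csv.reader(io.StringIO(raw_text))
    # Make column names consistent: no padding, lower case.
    header = [name.strip().lower() for name in next(reader)]
    for values in reader:
        row = dict(zip(header, (v.strip() for v in values)))
        row["site"] = row["site"].upper()
        row["date"] = to_iso_date(row["date"])
        temp = float(row["temp_c"])
        # Treat -999 as "missing" (an assumption) and store an empty value instead.
        row["temp_c"] = "" if temp == -999 else temp
        yield row


cleaned = list(clean_rows(RAW))
```

Each cleaning rule is kept in one place, so the unprocessed file can stay untouched and the cleaning can be rerun or adjusted later.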

White et al. (2013) published an excellent paper, ‘Nine simple ways to make it easier to (re)use your data’, in Ideas in Ecology and Evolution. The authors noted that much of the shared data in ecology and evolutionary biology is not easily reused because it doesn't follow best practices in terms of data structure, metadata and licences.

Their nine specific recommendations are:

  • Share your data.
  • Provide metadata.
  • Provide an unprocessed form of the data.
  • Use standard data formats.
  • Use good null values.
  • Make it easy to combine your data with other datasets.
  • Perform basic quality control.
  • Use an established repository.
  • Use an established and liberal license.
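
To make "perform basic quality control" a little more concrete, here is a minimal Python sketch that scans a small made-up CSV for duplicate rows, missing values and implausible values. It is not taken from the paper: the column names, the sample data and the plausible temperature range are assumptions for illustration only, and real checks would depend on your own data.

```python
import csv
import io

# Hypothetical dataset; missing values are left blank.
DATA = """site,date,temp_c
A1,2021-03-04,12.5
A1,2021-03-04,12.5
B2,2021-03-05,
C3,2021-03-06,81.0
"""

TEMP_RANGE = (-60.0, 60.0)  # assumed plausible range for air temperature in deg C


def quality_report(csv_text):
    """Return a list of (line_number, message) pairs describing potential problems."""
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        key = tuple(row.values())
        if key in seen:
            problems.append((line_number, "duplicate row"))
        seen.add(key)
        value = row["temp_c"]
        if value == "":
            problems.append((line_number, "missing temp_c"))
        elif not TEMP_RANGE[0] <= float(value) <= TEMP_RANGE[1]:
            problems.append((line_number, "temp_c outside plausible range"))
    return problems


report = quality_report(DATA)
for line_number, message in report:
    print(f"line {line_number}: {message}")
```

The script only reports potential problems and does not change the data, so each flagged row can be checked against the original records before anything is corrected.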