How do we analyse historical climate data?

Zak Baillie is the Research Assistant working with Climate History Australia to piece together Australia’s longest daily weather record. 

He’s currently analysing the dataset that Zooniverse citizen scientists have produced for Adelaide to span a critical gap from April 1843 to 1856 in the record.

Zak Baillie is a Research Assistant at the ANU’s Climate History Australia project.

What makes these old Adelaide Survey Department journals so important to understanding what Australia’s past climate was like?

“Right now our understanding of Australia’s climate is largely based on ‘modern’ climate records which the Australian Bureau of Meteorology maintains. These stretch back to around 1900. However, as we know from journals like those from Adelaide that we’re currently transcribing through Zooniverse, the colonists were observing the weather right from the moment they arrived in Australia. If we can locate and digitise these observations, it means that, for some areas of Australia, we can add an extra 100 years of daily data to what we already have from the Bureau of Meteorology. That is an amazing resource and one that has largely been untapped in Australia.”

“A lot of interesting weather happened in Australia in the 1800s and having daily observations throughout this period gives us the ability to better understand the variability of the country’s climate. Just how hot and dry can it get? How wet and cold can it get? Answers to these questions will be instrumental in understanding how our climate might respond to climate change.”

“Once analysed, we will also submit this data to global databases. The information in these databases is used to improve the overall understanding of the global climate system.”

For the Adelaide weather rescue project on Zooniverse, each observation is entered eight times by different volunteers. What is the reason behind this? How will you analyse eight versions of each datapoint?

“As many of our volunteers will have noticed, this record can be hard to read at times. This means, for some pages, there is going to be disagreement between volunteers. For example, one volunteer might interpret a number as a ‘7’, while another might interpret the same number as a ‘2’. The challenge for us in compiling this record is knowing whether that number is a seven or a two.”

“Our colleagues from New Zealand at Southern Weather Discovery, during their most recent Zooniverse campaign, set out to understand just how many times a single observation needs to be digitised for the ‘true’ value to be recorded. The ‘true’ value is the value that appears in the weather journal. The magic number was eight.”

“Therefore we have each page of the record digitised eight times. To find the ‘true’ value we search for what we call the ‘consensus’ value. This is the value on which at least six of the eight transcriptions agree. We can then build our whole dataset with these ‘consensus’ values. For the workflows we’ve analysed so far, 92% of digitised data has satisfied this criterion, which is fantastic.”

“When fewer than six transcriptions agree, we flag the observation, take the most commonly transcribed value, and run further checks.”
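The consensus rule Zak describes can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name `consensus`, the `threshold` parameter, and the choice to store transcriptions as strings are all assumptions made for the example.

```python
from collections import Counter


def consensus(transcriptions, threshold=6):
    """Pick a value for one observation from its volunteer transcriptions.

    Returns (value, flagged). `value` is the most common transcription.
    `flagged` is True when fewer than `threshold` volunteers agreed on it,
    meaning the observation needs further checks.
    Hypothetical helper for illustration only.
    """
    if not transcriptions:
        raise ValueError("at least one transcription is required")
    counts = Counter(transcriptions)
    # most_common(1) returns the value with the highest count; if two values
    # tie, the one that was entered first wins.
    value, votes = counts.most_common(1)[0]
    return value, votes < threshold


# Six of eight volunteers read '57': consensus reached, no flag.
clear = consensus(["57", "57", "57", "52", "57", "57", "52", "57"])

# Only five of eight agree: keep the most common value but flag it.
unclear = consensus(["57", "52", "57", "52", "57", "57", "52", "57"])
```

In this sketch, `clear` is `("57", False)` and `unclear` is `("57", True)`. The further checks applied to flagged values are not described in the interview, so they are left out here.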

Climate scientists sometimes use the word “homogenisation” when referring to analysing historical weather observations. What does that mean? How is this Adelaide record going to be homogenised?

“When we analyse climate data it’s critical that any changes observed in the dataset – for example a warming trend or a cooling trend – reflect changes in the climate, rather than changes in other non-climate factors. If these changes are due to non-climate factors it will undermine our interpretation of this dataset and its usefulness. Removing non-climate influences from historical weather observations is known as ‘homogenisation’. It’s important that we do all we can to minimise or remove inhomogeneities before analysis.” 

“Sometimes, the location of the instruments might have changed, or the instruments might become damaged over time. If we’re lucky, there will be a note about this in the comments, so we know to look out for how this might have biased the data. For example, if a thermometer was relocated, the observed temperatures might be cooler or warmer than before the thermometer was moved. This would be an example of a dataset that is not homogeneous. That is, there were non-climate factors which produced a variation in the observations. In preparing datasets for analysis we must ‘homogenise’ a record so that all non-climate influences are minimised or removed.”

“Other common non-climate influences on climate records include changes in the type of instrument used to measure the weather, changes in the exposure of the site (for example, if a building was constructed adjacent to the weather station), changes throughout the record in the time at which observations were taken, and biases associated with different weather observers.”

“There’s a lot of work in preparing historical weather observations for analysis. There are a number of statistical procedures we use to ‘homogenise’ a record and this is a complex task. You can find more information about this on the Bureau of Meteorology’s website.”
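To make the thermometer relocation example concrete, here is a deliberately simplified Python sketch of one kind of adjustment. It is not the procedure the project or the Bureau of Meteorology uses. It assumes the date of the move is known from the journal comments, and the function name `adjust_for_step_change`, the `break_index` parameter, and the sample temperatures are all invented for illustration.

```python
from statistics import mean


def adjust_for_step_change(values, break_index):
    """Shift the readings before a documented break to match those after it.

    `values` is a list of readings in time order and `break_index` is the
    position of the first reading taken after the instrument was moved.
    Returns (adjusted_values, shift).
    Hypothetical helper for illustration only.
    """
    before = values[:break_index]
    after = values[break_index:]
    if not before or not after:
        raise ValueError("break_index must leave readings on both sides")
    # Assumption: the whole difference in means is caused by the move.
    # Real homogenisation methods avoid this assumption, for example by
    # comparing against nearby stations, so genuine climate changes survive.
    shift = mean(after) - mean(before)
    adjusted = [v + shift for v in before] + list(after)
    return adjusted, shift


# Made-up daily maximum temperatures (degrees C); the thermometer is moved
# after the third reading and starts reading about 2 degrees cooler.
readings = [20.0, 21.0, 19.0, 18.0, 19.0, 17.0]
adjusted, shift = adjust_for_step_change(readings, break_index=3)
```

Here `shift` is `-2.0` and the first three readings are lowered to `18.0`, `19.0`, and `17.0`, so the two halves of the record line up. The comment in the code marks the main weakness: a simple mean shift cannot tell a relocated thermometer from a real change in the weather, which is one reason the statistical procedures used in practice are more involved.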

What have you found most interesting since starting this project collecting 19th century weather data from Adelaide?

“I’ve been most struck by the generosity of our volunteers who have taken the time to digitise this record. We really couldn’t do any of this work without their help – we are so appreciative!” 

“We’ve got a lot of really exciting and important research planned for this Adelaide Survey Office data. Once complete, it will be the longest daily weather record in Australia! We are really looking forward to sharing the results with our volunteers.” 

The digitisation of the Adelaide Survey Department journals is now halfway complete. Read our progress update on the project, including our first results: One month in: Our preliminary results

We’ve made great progress, but we need your help to complete the last few variables! Anyone with access to a computer and the internet can help recover these important details about Australia’s climate history. To get involved, visit: