Analysis of parallel temperature data using t-tests

Part 2. Brisbane Airport

Dr Bill Johnston

Using paired and un-paired t-tests to compare long timeseries of data observed in parallel by instruments housed in the same or different Stevenson screens at one site, or in screens located at different sites, is problematic. Part of the problem is that both tests treat the air being monitored as the control variable; that is, they assume the air inside the screen is spatially and temporally homogeneous, which for a changeable, turbulent medium is not the case.

Irrespective of whether data are measured on the same day, paired t-tests require the same parcels of air to be monitored by both instruments 100% of the time. As instruments co-located in the same Stevenson screen are in different positions their data cannot be considered ‘paired’ in the sense required by the test. Likewise for instruments in separate screens, and especially if temperature at one site is compared with daily values measured some distance away at another.

As paired t-tests ascribe all variation to subjects (the instruments), and none to the response variable (the air), test outcomes are seriously biased compared to un-paired tests, where variation is ascribed more generally to both the subjects and the response.

The paired t-test compares the mean of the differences between subjects with zero, whereas the un-paired test compares subject means with each other. If the tests find a low probability (P) that the mean difference is zero, or that subject means are the same, typically less than 0.05 (P<0.05, 5% or 1 in 20), it can be concluded that subjects differ in their response (i.e., the difference is significant). Should probability be less than 0.01 (P<0.01 = 1% or 1 in 100) the between-subject difference is highly significant. However, significance itself does not ensure that the size of difference is meaningful in the overall scheme of things.
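The distinction between the two tests can be sketched in a few lines of Python. This is an illustration only, using the standard textbook formulas; the function names and the eight daily values are hypothetical and are not taken from the report.

```python
import math
import statistics


def paired_t(control, target):
    """t-statistic for H0: the mean of the pairwise differences is zero."""
    diffs = [t - c for c, t in zip(control, target)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def unpaired_t(control, target):
    """Pooled-variance t-statistic for H0: the two group means are equal."""
    n_c, n_t = len(control), len(target)
    pooled_var = ((n_c - 1) * statistics.variance(control)
                  + (n_t - 1) * statistics.variance(target)) / (n_c + n_t - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_c + 1 / n_t))
    return (statistics.mean(target) - statistics.mean(control)) / standard_error


# Hypothetical daily maxima (degrees C) from two instruments on the same days.
control = [30.1, 31.4, 29.8, 33.0, 32.2, 30.9, 31.7, 34.1]
target = [30.3, 31.5, 30.1, 33.1, 32.5, 31.0, 32.0, 34.2]

print(f"paired t   = {paired_t(control, target):.2f}")
print(f"unpaired t = {unpaired_t(control, target):.2f}")
```

Because the paired test removes day-to-day variation that both instruments share, the same small offset yields a far larger t-statistic than the unpaired test does.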

Assumptions

All statistical tests are based on underlying assumptions that ensure results are trustworthy and unbiased. The main assumption is independence: for paired tests the differences, and for unpaired tests the data sequenced within treatment groups, must not be serially correlated, meaning that values at one time are unrelated to values at other times. As timeseries embed seasonal cycles and in some cases trends, steps must be taken to identify and mitigate autocorrelation prior to undertaking either test.
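As a rough check on independence, the lag-1 autocorrelation of a series can be computed directly. The sketch below uses synthetic data (an assumed annual sine cycle plus independent noise, not observations from any site) to show how a seasonal cycle alone produces strong serial correlation.

```python
import math
import random
import statistics


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mean = statistics.mean(series)
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[i] - mean) * (series[i + lag] - mean)
                    for i in range(len(series) - lag))
    return numerator / denominator


random.seed(1)
days = range(3 * 365)
# Synthetic daily maxima: an annual cycle plus independent day-to-day noise.
noise = [random.gauss(0, 2) for _ in days]
raw = [30 + 5 * math.sin(2 * math.pi * d / 365) + e for d, e in zip(days, noise)]

print(f"lag-1 ACF, raw data with seasonal cycle: {autocorrelation(raw, 1):.2f}")
print(f"lag-1 ACF, noise with cycle removed:     {autocorrelation(noise, 1):.2f}")
```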

A second, but less important assumption for large datasets, is that data are distributed within a bell-shaped normal distribution envelope with most observations clustered around the mean and the remainder diminishing in number towards the tails.

Finally, a problem unique to large datasets is that the denominator in the t-test equation becomes diminishingly small as the number of daily samples increases. Consequently, the t-statistic becomes increasingly large, together with the likelihood of finding significant differences that are too small to be meaningful. In statistical parlance this is known as Type 1 error, the fallacy of declaring significance for differences that do not matter. Such differences could be due to single aberrations or outliers, for instance.

A protocol

Using a parallel dataset related to a site move at Townsville airport in December 1994, a protocol has been developed to help avoid pitfalls in applying t-tests to timeseries of parallel data. At the outset, an estimate of effect size, determined as the raw data difference divided by the standard deviation (Cohen’s d), assesses whether the difference between instruments/sites is likely to be meaningful. An Excel workbook was provided with step-by-step instructions for calculating day-of-year (1-366) averages that define the annual cycle, constructing a look-up table, and deducting respective values from data, thereby producing de-seasoned anomalies. Anomalies are differenced as an additional variable (Site2 minus Site1, which is the control).
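For readers who prefer code to a spreadsheet, the de-seasoning step might look like the following Python sketch. It is not the workbook itself: the function names and the two-year synthetic record are hypothetical, and building the look-up table from Site1 (the control) and applying it to both series is an assumption, since the summary does not say which series defines the annual cycle.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def day_of_year_means(records):
    """Look-up table: day-of-year (1-366) -> average of all values on that day."""
    by_day = defaultdict(list)
    for day, value in records:
        by_day[day.timetuple().tm_yday].append(value)
    return {doy: mean(values) for doy, values in by_day.items()}


def deseason(records, lookup):
    """Deduct the day-of-year average from each daily value."""
    return [value - lookup[day.timetuple().tm_yday] for day, value in records]


# Hypothetical record for 2001 and 2002 (no leap day): a linear within-year
# rise, a year-to-year offset, and Site2 reading 0.3 degrees C above Site1.
start = date(2001, 1, 1)
days = [start + timedelta(days=i) for i in range(730)]
site1 = [(d, 25.0 + 0.02 * d.timetuple().tm_yday + (0.5 if d.year == 2002 else -0.5))
         for d in days]
site2 = [(d, v + 0.3) for d, v in site1]

lookup = day_of_year_means(site1)          # assumption: cycle defined by the control
anomalies1 = deseason(site1, lookup)
anomalies2 = deseason(site2, lookup)
differences = [b - a for a, b in zip(anomalies1, anomalies2)]  # Site2 minus Site1

print(f"mean Site1 anomaly: {mean(anomalies1):.3f}")
print(f"mean difference:    {mean(differences):.3f}")
```

Using a single look-up table for both series preserves the between-site offset in the differenced anomalies; de-seasoning each series against its own averages would remove it.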

Having prepared the data, graphical analysis of their properties, including autocorrelation function (ACF) plots, daily data distributions, probability density function (PDF) plots, and inspection of anomaly differences, assists in determining which data to compare (raw data or anomaly data). The dataset that most closely matches the underlying assumptions of independence and normality should be chosen and, where autocorrelation is unavoidable, randomised data subsets offer a way forward. (Randomisation may be done in Excel and subsets of increasing size used in the analysis.)
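The randomised-subset idea can also be sketched outside Excel. The Python below draws random subsets of increasing size from two synthetic series and reports the mean difference and unpaired t-statistic for each; the data, seed, subset sizes and function names are all assumptions made for illustration.

```python
import math
import random
import statistics


def unpaired_t(a, b):
    """Pooled-variance t-statistic for the difference mean(b) - mean(a)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(
        pooled_var * (1 / n_a + 1 / n_b))


def subset_statistics(site1, site2, size, rng):
    """Mean difference and unpaired t for one random subset of day-pairs."""
    chosen = rng.sample(range(len(site1)), size)
    a = [site1[i] for i in chosen]
    b = [site2[i] for i in chosen]
    return statistics.mean(b) - statistics.mean(a), unpaired_t(a, b)


rng = random.Random(42)
# Synthetic daily values: Site2 averages 0.25 degrees C above Site1.
site1 = [rng.gauss(0.0, 3.5) for _ in range(5000)]
site2 = [rng.gauss(0.25, 3.5) for _ in range(5000)]

for size in (100, 500, 1000, 2000, 5000):
    difference, t = subset_statistics(site1, site2, size, rng)
    print(f"n = {size:>4}  difference = {difference:6.3f}  t = {t:5.2f}")
```

Watching where the difference settles and where the t-statistic first crosses the chosen critical value separates the effect of sample size from the size of the difference itself.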

Most analyses can be undertaken using the freely available statistical application PAST from the University of Oslo: https://www.nhm.uio.no/english/research/resources/past/ Specific stages of the analysis have been referenced to pages in the PAST manual.

The Brisbane Study

The Brisbane study replicates the previous Townsville study, with the aim of showing that protocols are robust. While the Townsville study compared thermometer and automatic weather station maxima measured in 60-litre screens located 172m apart, the Brisbane study compared Tmax for two AWS each with 60-litre screens, 3.2 km apart, increasing the likelihood that site-related differences would be significant.

While the effect size for Brisbane was triflingly small (Cohen’s d = 0.07), and the difference between data-pairs stabilised at about 940 sub-samples, a significant difference between sites of 0.25°C was found when the number of random sample-pairs exceeded about 1,600. Illustrating the statistical fallacy of excessive sample numbers, differences became significant because the denominator in the test equation (the pooled standard error) declined as sample size increased, not because the difference widened. PDF plots suggested it was not until the effect size exceeded 0.2 that simulated distributions showed a clear separation, such that the difference between Series1 and Series2 of 0.62°C could be regarded as both significant and meaningful in the overall scheme of things.
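A back-of-envelope calculation shows why a threshold of this order is expected. For two equal-sized groups with a common standard deviation, the unpaired t-statistic is approximately d × √(n/2), where d is the effect size, so the sample size at which any fixed d becomes significant can be solved for directly. The Python sketch below assumes equal group sizes, a common standard deviation, and the normal approximation to the t-distribution; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def min_pairs_for_significance(cohens_d, alpha=0.05):
    """Smallest n per group at which d * sqrt(n / 2) exceeds the two-sided
    critical value (normal approximation, equal group sizes and SDs)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(2 * (z / cohens_d) ** 2)


print(min_pairs_for_significance(0.07))        # effect size reported for Brisbane
print(min_pairs_for_significance(0.07, 0.01))  # the stricter 1% level
```

For d = 0.07 at the 5% level the result falls a little under 1,600 sample-pairs, in line with the figure found by random sub-sampling.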

Importantly, the trade-off between significance and effect size is central to avoiding the trap of drawing conclusions based on statistical tests alone.

Dr Bill Johnston

4 June 2023

Two important links – find out more

First Link: The page you have just read is the basic cover story for the full paper. If you are stimulated to find out more, please link through to the full paper – a scientific Report in downloadable pdf format. This Report contains far more detail including photographs, diagrams, graphs and data and will make compelling reading for those truly interested in the issue.

Click here to access a full pdf report containing detailed analysis and graphs

Second Link: This link will take you to a downloadable Excel spreadsheet containing a vast number of data used in researching this paper. The data supports the Full Report.

Click here to download a full Excel data pack containing the data used in this research

Why statistical tests matter

Fake-news, flash-bangs and why statistical tests matter

Dr Bill Johnston

www.bomwatch.com.au

Main Points

Comparing instruments using paired t-tests, versus unpaired tests, on daily data is inappropriate. Failing to verify assumptions, particularly that data are independent (not autocorrelated), and not considering the effect of sample size on significance levels creates illusions that differences between instruments are significant or highly significant when they are not. Using the wrong test and naïvely or bullishly disregarding test assumptions plays to tribalism, not trust.

Investigators must justify the tests they use, validate that assumptions are not violated, that differences are meaningful and thereby show their conclusions are sound. 

Discussion

Paired or repeated-measures t-tests are commonly used to determine the effect of an intervention by observing the same subjects before and after (e.g., 10 subjects before and after a treatment). As within-subjects variation is controlled, differences are attributable to the treatment. In contrast, un-paired or independent t‑tests compare the means of two groups of subjects, each having received one of two interventions (10 subjects that received one or no treatment vs. 10 that were treated). As variation between subjects contributes variation to the response, un-paired t-tests are less sensitive than paired tests.

Extended to a timeseries of sequential observations by different instruments (Figure 1), the paired t-test evaluates the probability that the mean of the difference between data-pairs (calculated as the target series minus the control) is zero. If the t‑statistic indicates the mean of the differences is not zero, the alternative hypothesis that the two instruments are different prevails. In this usage, significant means there is a low likelihood, typically less than 0.05, 5% or one in 20, that the mean of the difference equals zero. Should the P-value be less than 0.01, 0.001, or smaller, the difference is regarded as highly significant. Importantly, significant and highly significant are statistical terms that reflect the probability of an effect, not whether the size of an effect is meaningful.

To reiterate, paired tests compare the mean of the difference between instruments with zero, while un-paired t‑tests evaluate whether Tmax measured by each instrument is the same.

While sounding pedantic, the two tests applied to the same data result in strikingly different outcomes, with the paired test more likely to show significance. Close attention to detail and applying the right test is therefore vitally important.

Figure 1. Inside the current 60-litre Stevenson screen at Townsville airport. At the front are dry and wet-bulb thermometers, behind are maximum (mercury) and minimum (alcohol) thermometers, held horizontally to minimise “wind-shake” which can cause them to re-set, and at the rear, which faces north, are dry and wet-bulb AWS sensors. The wet bulb is cooled by a small patch of muslin tied by a cotton wick that dips into the water reservoir; wet-bulb depression is used to estimate relative humidity and dew point temperature. (BoM photograph).

Thermometers Vs PRT Probes

Comparisons of thermometers and PRT probes co-located in the same screen, or in different screens, rely on the air being measured each day as the test or control variable, thereby presuming that differences are attributable to instruments. However, visualise conditions in a laboratory versus those in a screen where the response medium is constantly circulating and changing throughout the day at different rates. While differences in the lab are strictly attributable, in a screen, a portion of the instrument response is due to the air being monitored. As shown in Figure 1, instruments that are not accessed each day are more conveniently located behind those that are, thereby resulting in spatial bias. The paired t-test, which apportions all variation to instruments, is the wrong test under the circumstances.

Test assumptions are important

The validity of statistical tests depends on assumptions, the most important of which for paired t-tests is that differences at one time are not influenced by differences at previous times. Similarly, for unpaired tests, observations within groups cannot be correlated with previous observations. Although data should ideally be distributed within a bell-shaped normal-distribution envelope, normality is less important if data are random and numbers of paired observations exceed about 60. Serial dependence or autocorrelation reduces the denominator in the t-test equation, which increases the likelihood of significant outcomes (false positives) and fatally compromises the test.

Autocorrelation is primarily caused by seasonal cycles. The appropriate adjustment for daily timeseries is to deduct day-of-year averages from respective day-of-year data and conduct the right test on seasonally adjusted anomalies.

Covariables on which the response variable depends are also problematic. These include heating of the landscape over previous days to weeks, and the effects of rainfall and evaporation that may linger for months and seasons. Removing cycles, understanding the data, using sampling strategies and P-level adjustments so outcomes are not biased may offer solutions.

Significance of differences vs. meaningful differences

A problem of using t-tests on long time series is that as numbers of data-pairs increase, the denominator in the t-test equation, which measures variation in the data, becomes increasingly small. Thus, the ratio of signal (the instrument difference) to noise (the standard error, pooled in the case of un-paired tests) increases. The t-value consequently becomes very large, the P-level declines to the millionth decimal place and the test finds trifling differences to be highly significant, when they are not meaningful. So, the significance level needs to be considered relative to the size of the effect.
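For reference, the standard textbook forms of the two statistics make the role of sample size explicit. The symbols are introduced here for illustration and are not from the report: d̄ and s_d are the mean and standard deviation of the paired differences, x̄₁ and x̄₂ are the group means, and s_p is the pooled standard deviation.

```latex
t_{\text{paired}} = \frac{\bar{d}}{s_d / \sqrt{n}},
\qquad
t_{\text{unpaired}} = \frac{\bar{x}_2 - \bar{x}_1}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
```

With the difference and the standard deviation held fixed, each denominator shrinks in proportion to 1/√n, so the t-value grows in proportion to √n even though the difference itself has not changed.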

For instance, a highly significant difference that is less than the uncertainty of comparing two observations (±0.6°C) could be an aberration caused by averaging beyond the precision of the experiment (i.e., averaging imprecise data to two, three or more decimal places).

The ratio of the difference to the average variation in the data [i.e., (PRT average minus thermometer average) divided by the average standard deviation], which is known as Cohen’s d, or the effect size, also provides a first-cut empirical measure that can be calculated from data summaries to guide subsequent analysis.

Cohen’s d indicates whether a difference is likely to be negligible (less than 0.2 SD units), small (>0.2), medium (>0.5) or large (>0.8), which identifies traps to avoid, particularly the trap of unduly weighting significance levels that are unimportant in the overall scheme of things.
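The effect-size calculation and its interpretation can be written as two small Python functions. This is a sketch using the definition given above (difference in means divided by the average of the two standard deviations) and Cohen's conventional cut-offs of 0.2, 0.5 and 0.8; the function names, and the treatment of values falling exactly on a cut-off, are assumptions.

```python
import statistics


def cohens_d(control, target):
    """(target mean - control mean) divided by the average of the two SDs."""
    average_sd = (statistics.stdev(control) + statistics.stdev(target)) / 2
    return (statistics.mean(target) - statistics.mean(control)) / average_sd


def describe_effect(d):
    """Conventional label for an effect size, ignoring its sign."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Effect sizes quoted for Brisbane (0.07) and Townsville (0.12).
for d in (0.07, 0.12):
    print(f"d = {d:.2f}: {describe_effect(d)}")
```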

The Townsville case study

T-tests of raw data were invalidated by autocorrelation while those involving seasonally adjusted anomalies showed no difference. Randomly sampled raw data showed significance levels depended on sample size not the difference itself, thus exposing the fallacy of using t‑tests on excessively large numbers of data-pairs. Irrespective of the tests, the effect size calculated from the data summary of 0.12 SD units is trivial and not important.     

Conclusions

Using paired versus unpaired t-tests on timeseries of daily data inappropriately, not verifying assumptions, and not assessing the effect size of the outcome creates division and undermines trust. As illustrated by Townsville, it also distracts from real issues. Using the wrong test and naïvely or bullishly disregarding test assumptions plays to tribalism, not trust.

A protocol is advanced whereby autocorrelation and effect size are examined at the outset. It is imperative that this be carried out before undertaking t-tests of daily temperatures measured in-parallel by different instruments.

The overarching fatal error is using invalid tests to create headlines and ruckus about thin-things that make no difference, while ignoring thick-things that would impact markedly on the global warming debate.

Two important links – find out more

Click here to download the full paper Statistical_Tests_TownsvilleCaseStudy_03June23

Second Link: This link will take you to a downloadable Excel spreadsheet containing a vast number of data points related to the Townsville Case Study and which were used in the analysis of the Full Report.

Click here to access the full data used in this post Statistical tests Townsville_DataPackage

Part 6. Halls Creek, Western Australia

Is homogenisation of Australian temperature data any good?

Dr Bill Johnston[1]

scientist@bomwatch.com.au

Background

Homogenisation of Australian temperature data commenced in the late 1980s. By 1996 the first Australian high-quality homogenised temperature dataset (HQ1) had been produced by Simon Torok under the watchful eye of Bureau of Meteorology (BoM) scientist Neville Nicholls, who at that time was heavily involved with the World Meteorological Organisation (WMO) and the fledgling Intergovernmental Panel on Climate Change (IPCC). This was followed in succession by an updated version in 2004 (HQ2) that finished in 2011, then the Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) dataset, with version 1 (AcV1) released in 2012 and updated until 2017. AcV2 replaced AcV1 from 2018, with the most recent iteration, AcV2.3, updated to December 2021.

Why is homogenisation important?

Data homogenisation represents the pinnacle of policy-driven science, meaning that following release of the First Assessment Report by the Intergovernmental Panel on Climate Change (IPCC) in 1990, for which Neville Nicholls was a substantial contributor, the Australian government set in place a ‘climate’ agenda. Although initially rejected by Cabinet, in 1989 Labor Senator Graham Richardson proposed a 20% reduction in 1988 Australian greenhouse gas emission levels by 2005. The target was adopted in October 1990 as a bipartisan policy (i.e., by both major Australian political parties) and endorsed by a Special Premiers’ Conference in Brisbane as the InterGovernmental Agreement on the Environment in February 1992 (https://faolex.fao.org/docs/pdf/aus13006.pdf). Following that meeting in February, the Council of Australian Governments (COAG) was set up by Labor Prime Minister Paul Keating in December 1992.

As outlined by the Parliamentary Library Service (https://www.aph.gov.au/About_Parliament/Parliamentary_Departments/Parliamentary_Library/pubs/rp/rp1516/Climate2015), this was the mechanism whereby the most important and far-reaching policy agenda since Federation in 1901 was ushered into place without a vote being cast by the unsuspecting electorate. However, in order to support the policy:

  • Land-surface temperatures had to be shown to be warming year-on-year, particularly since 1950.
  • Models were needed that predicted climate calamities into the future.
  • Natural resources-related science, which was previously the prerogative of the States, required reorganisation under a funding model that guided outcomes in the direction of the policy agenda.
  • Particular attention was also paid to messaging climate alarm regularly and insistently by all levels of government.  

As it provides the most tangible evidence of climate warming, trend in maximum temperature (Tmax) is of overarching importance. It is also the weakest link in the chain that binds Australians of every creed and occupation to the tyranny of climate action. If homogenisation of Tmax data is unequivocally shown to be a sham, other elements of the policy, including evidence relied on by the IPCC are on shaky ground. This is the subject of the most recent series of reports published by www.bomwatch.com.au.

The question in this paper is whether trend and changes in the combined Tmax dataset for Halls Creek, reflect site and instrument changes or changes in weather and climate.

Halls Creek maximum temperature data

Detailed analyses of Halls Creek Tmax using objective, replicable physically-based BomWatch protocols found data were affected by a change at the old post office in 1917, another in 1952 after the site moved to the Aeradio office at the airport in 1950, and another in 2013 due to houses being built within 30m of the Stevenson screen two years before it relocated about 500m southeast to its present position in September 2015. Three step-changes resulted in four data segments; however, mean Tmax for the first and third segments were not different.

While the quality of data observed at the old post office was inferior to that of sites at the airport, taking site changes and rainfall into account simultaneously left no trend or change in Tmax data that could be attributed to climate change, CO2, coal mining, electricity generation or anything else. Furthermore, step-changes in the ratio of counts of data less than the 5th and greater than the 95th day-of-year dataset percentiles (low and high extremes respectively) were attributable to site changes and not the climate. Nothing in the data therefore suggests the climate of the region typified by Halls Creek Tmax has warmed or changed.        
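One way to tally low and high extremes of this kind is sketched below in Python. It is a reading of the description above, not the BomWatch protocol itself: the function name, the record layout and the use of `statistics.quantiles` with its default interpolation are assumptions, and the demonstration data are synthetic.

```python
from collections import defaultdict
from statistics import quantiles


def extreme_counts_by_year(records):
    """records: list of (year, day_of_year, tmax) tuples.

    Returns {year: (low_count, high_count)}, where low and high are counts of
    values below the 5th and above the 95th percentile for that day-of-year,
    with percentiles calculated across the whole dataset."""
    by_day = defaultdict(list)
    for _, doy, value in records:
        by_day[doy].append(value)

    limits = {}
    for doy, values in by_day.items():
        cuts = quantiles(values, n=20)      # 19 cut points at 5% steps
        limits[doy] = (cuts[0], cuts[-1])   # 5th and 95th percentiles

    counts = defaultdict(lambda: [0, 0])
    for year, doy, value in records:
        tally = counts[year]                # ensures every year appears
        low, high = limits[doy]
        if value < low:
            tally[0] += 1
        elif value > high:
            tally[1] += 1
    return {year: tuple(tally) for year, tally in counts.items()}


# Synthetic example: three days-of-year over 40 years, warming steadily,
# so low extremes fall in the earliest years and high extremes in the latest.
records = [(year, doy, 30 + 0.1 * (year - 1980))
           for year in range(1980, 2020) for doy in (1, 2, 3)]
counts = extreme_counts_by_year(records)
print(counts[1980], counts[2000], counts[2019])
```

The ratio of the two counts for each year, or for each data segment, can then be inspected for step-changes that coincide with documented site changes.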

 Homogenisation of Halls Creek data

As it was originally conceived, homogenisation aimed to remove the effects of non-climate impacts on data, chief amongst those being weather station relocations and instrument changes, so homogenised data reflected trends and changes in the climate alone. In addition, ACORN-SAT sought to align extremes of data distributions so, in the words of Blair Trewin, data “would be more homogeneous for extremes as well as for means”. To achieve this, Trewin used first-differenced correlated reference series and complex methods to skew data distributions at identified changepoints based on transfer functions. This was found to result in an unrealistic exponential increase in upper-range extremes since 1985.

Reanalysis using BomWatch protocols and post hoc tests and scatterplots showed that in order to achieve statistically significant trends, homogenisation cooled past temperatures unrealistically and too aggressively. For instance, cool-bias increased as observed Tmax increased. It was also found that the First Law of Thermodynamics on which BomWatch protocols are based, did not apply to homogenised Tmax data. This shows that the Bureau’s homogenisation methods produce trends in homogenised Tmax data that are unrelated to the weather, and therefore cannot reflect the climate.

As it was consensus-driven and designed to serve the political ends of WWF and Australia’s climate industrial elites, and it has no statistical, scientific or climatological merit, the ACORN-SAT project is a disgrace and should be abandoned in its entirety.   


[1] Former NSW Department of Natural Resources research scientist and weather observer.

Two important links – find out more

Click here to download the full paper with photos graphs and data.

Second Link: This link will take you to a downloadable Excel spreadsheet containing a vast number of supporting data points for the Potshot (Learmonth) paper.

Click here to download the Excel spreadsheet.

Part 5. Potshot and ACORN-SAT

Is homogenisation of Australian temperature data any good?

Dr Bill Johnston

Former NSW Department of Natural Resources research scientist and weather observer.

scientist@bomwatch.com.au

Careful analysis using BomWatch protocols showed that ACORN-SAT failed in their aim to “produce a dataset which is more homogeneous for extremes as well as for means”. Their failure to adjust the data for a step-change in 2002 shows unequivocally that methodology developed by Blair Trewin lacks rigour, is unscientific and should be abandoned.

Read on …

Potshot was a top-secret long-shot – a WWII collaboration between the United States Navy, the Royal Australian Air Force and Australian Army, that aimed to deter invasion along the lightly defended north-west coast of Western Australia and take the fight to the home islands of Japan. It was also the staging point for the 27- to 33-hour Double-Sunrise Catalina flying-boat service to Ceylon (now Sri Lanka) that was vital for maintaining contact with London during the dark years of WWII. It was also the base for Operation Jaywick, the daring commando raid on Singapore Harbour by Z-Force commandos in September 1943.

The key element of Potshot was the stationing of USS Pelias in the Exmouth Gulf to provide sustainment to US submarines operating in waters to the north and west. To provide protection, the RAAF established No. 76 OBU (Operational Base Unit) at Potshot in 1943/44 and, at the conclusion of hostilities, OBU Potshot was developed as RAAF Base Learmonth, a ‘bare-base’ that can be activated as needed on short-notice. Meteorological observations commenced at the met-office in 1975. Learmonth is one of the 112 ACORN-SAT sites (Australian Climate Observations Reference Network – Surface Air Temperature) used to monitor Australia’s warming. Importantly, it is one of only three sites in the ACORN-SAT network where data has not been homogenised.

Potshot was the top-secret WWII base that transitioned to RAAF Base Learmonth at the conclusion of WWII.  

  • By not adjusting for the highly significant maximum temperature (Tmax) step-change in 2002 detected by BomWatch, ACORN-SAT failed its primary objective which is to “produce a dataset which is more homogeneous for extremes as well as for means”.
  • Either Blair Trewin assumed the 12-year overlap would be sufficient to hide the effect of transitioning from the former 230-litre Stevenson screen to the current 60-litre one; or his statistical methods that relied on reference series were incapable of objectively detecting and adjusting changes in the data.
  • In either case it is another body-blow to Trewin’s homogenisation approach. Conflating the up-step in Tmax caused by the automatic weather station and 60-litre screen with “the climate” and lack of validation within the ACORN-SAT project generally, unethically undermines the science on which global warming depends.
    • Due to their much-reduced size, and lack of internal buffering, 60-litre Stevenson screens are especially sensitive to warm eddies that arise from surfaces, buildings etc. that are not representative of the airmass being monitored.
    • Increased numbers of daily observations/year ≥95th day-of-year dataset percentiles relative to those ≤5th day-of-year percentiles, which explains the up-step, is a measurement issue, not a climatological one.

By omission in the case of Learmonth, as ACORN-SAT produces trends and changes in homogenised data that do not reflect the climate, the project and its peers including others run under the guise of the World Meteorological Organisation of which Trewin is a major player, should be abandoned.   

Dr. Bill Johnston

12 January 2023     

Two important links – find out more

Click here to download the full paper with photos graphs and data.

Second Link: This link will take you to a downloadable Excel spreadsheet containing a vast number of supporting data points for the Carnarvon paper.

Click here to download the Excel spreadsheet

Postscript

Dr Trewin and his peers, including those who verify models using ACORN-SAT data, scientists at the University of NSW including Sarah Perkins-Kilpatrick, those who subscribe to The Conversation or The Climate Council are welcome to fact-check or debate the outcome of our research in the open by commenting at www.bomwatch.com.au. The Datapack relating to the Potshot Report is available here Learmonth_DataPack

Part 4. Carnarvon, Western Australia

Is homogenisation of Australian temperature data any good?

Dr Bill Johnston[1]

scientist@bomwatch.com.au

The ACORN-SAT project is deeply flawed, unscientific and should be abandoned.

Read on …

Maximum temperature data for Carnarvon, Western Australia are, surprisingly, of no use at all for tracking climatic trend and change. With Learmonth a close second at Longitude 114.0967°, Carnarvon (113.6700°) is the western-most of the 112 stations that comprise the ACORN-SAT network (Australian Climate Observations Reference Network – Surface Air Temperature) used to monitor Australia’s warming (Figure 1). Carnarvon is also just 1.4° Latitude south of the Tropic of Capricorn but unlike Learmonth, which receives moderate rain in February, Carnarvon receives practically none from the end of August through to March. Due to its vast underground aquifer, the lower Gascoyne is the most productive irrigation area in WA and most of the vegetables, melons and fruits sold in Perth are grown around Carnarvon.

Figure 1. The distribution of ACORN-SAT sites in WA, as well as all the neighbouring sites used by ACORN-SAT v.3 to homogenise Carnarvon data. 

Located about 900 km north of Perth, Carnarvon post office, built in 1882, was an important link in the expanding WA telegraph network and the aerodrome was an important refuelling stop. Runways were lengthened by the Royal Australian Air Force and during WWII it was used as a forward operating base.

Servicing the main 1930s west-coast air corridor from Perth to Darwin, and Java and on to Europe, Aeradio was established by the Air Board in Geraldton, Carnarvon, Port Hedland, and at aerodromes every 300 km or so further north, and around Australia in 1939/40. A close-up view of Carnarvon Aeradio snipped from a 1947 aerial photograph is shown in Figure 2. While weather observers trained by the Bureau of Meteorology (BoM) in Melbourne undertook regular weather observations, prepared forecasts and provided pilot briefings, radio operators maintained contact with aircraft and advised of inclement conditions. As units were well-distributed across the continent, combined with post officers that reported weather observations by telegraph, the aeradio network formed the backbone of ACORN-SAT. Homogenisation of their data with that of post offices, lighthouses and pilot stations ultimately determines the apparent rate of warming in Australia’s climate.

Figure 2. A snip of the Carnarvon aerodrome operations precinct in 1949 showing the location of facilities and that the area in the vicinity of the met-enclosure (met) was watered. (Three additional radio masts were located within the watered area and watering was probably necessary to ensure grounding of the earth-mat buried in the dry soil.)

Despite several rounds of peer review, the question of whether trends and changes in raw and homogenised Tmax data reflect site changes or the true climate has not been independently assessed.

The research is important. Alteration of data under the guise of data homogenisation, flows through to BoM’s annual climate statements, CSIRO’s State of the Environment, and State of the Climate reports, reports put out by IPCC, COP, and the World Economic Forum, and ultimately government supported scare campaigns run by WWF, the Climate Council and other green groups, that underpin the unattainable goal of net-zero.

Commencing around 1990, Australia’s homogenisation campaign has been overseen by Dr Neville Nicholls, Emeritus Professor at Monash University, a loud and vocal supporter of the warming hypotheses. Changing the data on which it is based, then claiming in collegially peer-reviewed publications that the climate is warming, risks considerable economic, strategic and political harm. Australia is being weakened from within by Nicholls and his protégés, including those within CSIRO, and every aspect of the economy, from de-industrialisation, to attacks on agriculture, to looming conflict with China is predicated on temperature data homogenisation.

Summary findings

  • Good site control is fundamental to any experiment investigating long-term trend and change. However, the site at the post office was shaded, watered and generally poor, the Aeradio site was also watered, while the site at the meteorological office was subject to multiple changes after 1990.
  • Homogenisation failed to undertake basic quality assurance and frequency analysis, so could not objectively adjust for the effect of extraneous factors such as watering, site changes etc. Consequently, as ACORN-SAT data for Carnarvon were not homogeneous either for extremes or annual means, it failed its primary objective.

While changing data to agree with hypotheses is unscientific and dishonest, the most obvious homogenisation subterfuge is the adjustment of changes that made no difference to the data, while ignoring those that did. The second is the use of reference series comprised of neighbouring datasets without ensuring they are homogeneous.

The use of correlated data that likely embed parallel faults to disproportionately correct faults in ACORN-SAT data, and thereby embed trends in homogenised data, has no statistical or scientific merit. As the ACORN-SAT project is misleading, irredeemably flawed and a clear danger to Australia’s prosperity and security, it should be abandoned.

A reoriented and rescaled overlay of an aerial photograph showing the watered airport precinct relative to the location of the post office in 1947, and the November 2021 Google Earth Pro satellite image, locating the same sites. Runways were unsealed in 1947 and there were still several splinter-proof aircraft shelters visible (marked ‘s’). By 2021 they had moved the non-directional beacon (NDB) from behind the power house (ph), and the site to the right of the meteorological office (MO) had been moved out of the way of the access road. The MO closed in 2016.

3 January 2023

Two important links – find out more

First Link: The page you have just read is the basic cover story for the full paper. If you are stimulated to find out more, please link through to the full paper – a scientific Report in downloadable pdf format. This Report contains far more detail including photographs, diagrams, graphs and data and will make compelling reading for those truly interested in the issue.

Click here to download the full paper with photos graphs and data

Second Link: This link will take you to a downloadable Excel spreadsheet containing a vast number of supporting data points for the Carnarvon paper.

Click here to download the Excel spreadsheet for Carnarvon

Note: Line numbers are provided in the linked Report for the convenience of fact checkers and others wishing to provide comment. If these comments are of a highly technical nature, relating to precise Bomwatch protocols and statistical procedures, it is requested that you email Dr Bill Johnston directly at scientist@bomwatch.com.au referring to the line number relevant to your comment.   


[1] Former NSW Department of Natural Resources research scientist and weather observer.

Part 3. Meekatharra, Western Australia

Is homogenisation of Australian temperature data any good?

Dr Bill Johnston[1]

scientist@bomwatch.com.au

The ACORN-SAT project is deeply flawed, unscientific and should be abandoned.

Read on …   

Situated 500 km east from Shark Bay, south of the Gibson Desert and adjacent to the Great Victoria Desert, Meekatharra is a hot, dry, isolated outback town in mid-west Western Australia. Famously referred to as the end of the earth by Australia’s former Prime Minister, Malcolm Fraser, when his aircraft was diverted from Perth in 1977 due to inclement weather, Meekatharra is now the epicentre of a mining boom and the airport serves as a hub for fly-in fly-out workers and a base for the Royal Flying Doctor Service (RFDS).

Constructed as an all-weather ‘bare-base’ aerodrome with long, sealed runways in 1943, linking Perth, the secret bomber base at Corunna Downs near Marble Bar, and Darwin, Meekatharra was one of only a few aerodromes in outback WA capable of handling heavy bombers. It was relinquished by the RAAF to the Department of Civil Aviation as a Commonwealth airport after 1946, and ownership transferred to the Shire of Meekatharra in 1993.

Weather observations commenced at the post office on the corner of Main and High streets Meekatharra in January 1926, having previously been reported from Peak Hill, about 110 km to the NW from 1898. Observations transferred to the former RAAF Aeradio office in 1950, and according to ACORN-SAT metadata, the site moved to a new meteorological office (MO) in about 1975 (Figure 1). However, files held by the National Archives of Australia (NAA) show that before the office was built in 1974, an instrument enclosure, instruments, a theodolite post and wind shield used in launching weather balloons were installed near the proposed new office in 1972 (Figure 2). The overlap with data from the previous Aeradio site, which continued to be used at least until staff relocated, probably in 1975 (Figure 3), was apparently used to smooth the transition to the new site.

ACORN-SAT

Meekatharra is one of 112 Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) sites used by the Bureau of Meteorology, CSIRO, state governments, WWF and the Climate Council, to convince themselves, kiddies for climate action, and everyone else that the climate is warming irrevocably due to CO2.

Combined with dodgy measurement practices, data homogenisation is used at Meekatharra to create warming in maximum temperature (Tmax) data that is unrelated to the climate. Adjusting for a change in 1934 that was not significant, ignoring that the Aeradio site was watered, and that a period of overlap from 1972 was used to smooth the move to the MO site, allegedly in about 1975, for which no adjustment was made, created trends in homogenised data that were unrelated to the climate. Furthermore, data for the total of 18 sites used to homogenise Meekatharra Tmax, were not homogeneous.

The assertion that ACORN-SAT sites have been carefully and thoroughly researched, and that comparator reference sites selected on the basis of inter-site correlations would be broadly homogeneous around the time site changes occurred is demonstrably untrue. From multiple perspectives, the underlying proposition that series derived from up to 10 reference stations could provide a “high level of robustness against undetected inhomogeneities” is not supported.

As no change in the climate is detectable across the nineteen datasets examined, including Meekatharra, and the methodology is unscientific and deeply flawed, the ACORN-SAT project should be abandoned.

Figure 1. The Meekatharra meteorological office in August 2010 (from the ACORN-SAT Catalogue).

Figure 2. A screenshot of files held by the National Archives of Australia relating to the new 1972 instrument enclosure at Meekatharra (Search term Meekatharra meteorological).

Figure 3. Building plan in 1971 showing the RFDS hangar (108), Aeradio and met-office (101), the fenced enclosure southwest of the office including met (H2) and seismograph huts, towers suspending the aerial array and earth-mat, workshop (102), fuel bowser (107), power plant (106), and workshop and equipment buildings (120 and 124).


An important link – find out more

The page you have just read is the basic cover story for the full paper. If you are stimulated to find out more, please link through to the full paper – a scientific Report in downloadable pdf format. This Report contains far more detail including photographs, diagrams, graphs and data and will make compelling reading for those truly interested in the issue.

Click here to download the full paper with photos graphs and data.

Note: Line numbers are provided in the linked Report for the convenience of fact checkers and others wishing to provide comment. If these comments are of a highly technical nature, relating to precise Bomwatch protocols and statistical procedures, it is requested that you email Dr Bill Johnston directly at scientist@bomwatch.com.au referring to the line number relevant to your comment.   

[1] Former NSW Department of Natural Resources research scientist and weather observer.

About

Welcome to BomWatch.com.au, a site dedicated to examining Australia’s Bureau of Meteorology, climate science and the climate of Australia. The site presents a straight-down-the-line understanding of climate (and sea level) data and objective and dispassionate analysis of claims and counter-claims about trend and change.

BomWatch delves deeply into the way in which data has been collected, the equipment that has been used, the standard of site maintenance and the effect of site changes and moves.

Dr. Bill Johnston is a former senior research scientist with the NSW Department of Natural Resources (abolished in April 2007), which in previous guises included the Soil Conservation Service of NSW, the NSW Water Conservation and Irrigation Commission, the NSW Department of Planning and the Department of Lands. As with other NSW natural resource agencies that conducted research as a core activity, including NSW Agriculture and the National Parks and Wildlife Service, research services were mostly disbanded or dispersed to the university sector from about 2005.

BomWatch.com.au is dedicated to analysing climate statistics to the highest standard of statistical analysis

Daily weather observations undertaken by staff at the Soil Conservation Service’s six research centres at Wagga Wagga, Cowra, Wellington, Scone, Gunnedah and Inverell were reported to the Bureau of Meteorology. Bill’s main fields of interest have been agronomy, soil science, hydrology (catchment processes) and descriptive climatology, and he has maintained a keen interest in the history of weather stations and climate data. Bill gained a Bachelor of Science in Agriculture from the University of New England in 1971, a Master of Science from Macquarie University in 1985 and a Doctor of Philosophy from the University of Western Sydney in 2002, and he is a member of the Australian Meteorological and Oceanographic Society (AMOS).

Bill receives no grants or financial support or incentives from any source.

BomWatch accesses raw data from archives in Australia so that the most authentic original source-information can be used in our analysis.

How BomWatch operates

BomWatch is not intended to be a blog per se, but rather a repository for analyses and downloadable reports relating to specific datasets or issues, which will be posted irregularly so they are available in the public domain and can be referenced to the site. Issues of clarification, suggestions or additional insights will be welcome.   

The areas of greatest concern are:

  • Questions about data quality and data homogenisation (is data fit for purpose?)
  • Issues related to metadata (is metadata accurate?)
  • Whether stories about datasets are consistent and justified (are previous claims and analyses replicable?)

Some basic principles

Much is said about the so-called scientific method of acquiring knowledge by experimentation, deduction and testing hypotheses using empirical data. According to Wikipedia the scientific method involves careful observation, rigorous scepticism about what is observed … formulating hypotheses … testing and refinement etc. (see https://en.wikipedia.org/wiki/Scientific_method).

The problem for climate scientists is that data were not collected at the outset for measuring trends and changes, but rather to satisfy other needs and interests of the time. For instance, temperature, rainfall and relative humidity were initially observed to describe and classify local weather. The state of the tide was important for avoiding in-port hazards and risks and for navigation – ships would leave port on a falling tide for example. Surface air-pressure forecasted wind strength and direction and warned of atmospheric disturbances; while at airports, temperature and relative humidity critically affected aircraft performance on takeoff and landing.

Commencing in the early 1990s the ‘experiment’, which aimed to detect trends and changes in the climate, has been bolted on to datasets that may not be fit for purpose. Further, many scientists have no first-hand experience of how data were observed and other nuances that might affect their interpretation. Also, since about 2015, various data arrive every 10 or 30 minutes on spreadsheets, to newsrooms and television feeds largely without human intervention – there is no backup paper record and no way to certify those numbers accurately portray what is going on.

For historic datasets, present-day climate scientists had no input into the design of the experiment from which their data are drawn and in most cases information about the state of the instruments and conditions that affected observations is obscure.

Finally, climate time-series represent a special class of data for which usual statistical routines may not be valid. For instance, if data are not free of effects such as site and instrument changes, naïvely determined trend might be spuriously attributed to the climate when in fact it results from inadequate control of the data-generating process: the site may have deteriorated for example or ‘trend’ may be due to construction of a road or building nearby. It is a significant problem that site-change impacts are confounded with the variable of interest (i.e. there are potentially two signals, one overlaid on the other).
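The confounding described above can be illustrated with a minimal sketch. It uses synthetic data only: the years, the 1980 move date, the 0.8 degC step and the noise level are all hypothetical values chosen for illustration, not observations from any site. A series with no underlying trend, but with a step change at a site move, yields an apparent warming trend when a single line is naïvely fitted through all the data, whereas trends fitted either side of the step are close to zero.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


rng = random.Random(42)           # fixed seed so the sketch is repeatable
years = list(range(1950, 2010))   # hypothetical 60-year record
move_year = 1980                  # hypothetical site move
step = 0.8                        # hypothetical step change (degC)

# Synthetic annual Tmax: constant climate plus random noise,
# with a step added from the year of the move onwards.
tmax = [
    30.0 + (step if yr >= move_year else 0.0) + rng.gauss(0.0, 0.1)
    for yr in years
]

split = years.index(move_year)

# Slopes expressed in degC per decade.
naive_trend = ols_slope(years, tmax) * 10
trend_before = ols_slope(years[:split], tmax[:split]) * 10
trend_after = ols_slope(years[split:], tmax[split:]) * 10

print(f"Naive trend, whole record: {naive_trend:+.3f} degC/decade")
print(f"Trend before the move:     {trend_before:+.3f} degC/decade")
print(f"Trend after the move:      {trend_after:+.3f} degC/decade")
```

In this constructed case all of the apparent whole-record trend is due to the step. In real data the two signals are overlaid, which is why site changes need to be identified before trend is attributed to the climate.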

What is an investigation and what constitutes proof?

The objective approach to investigating a problem is to challenge the straw-horse argument that there is NO change, NO link between variables, NO trend; everything is the same. In other words, test the hypothesis that data consist of random numbers or as is the case in a court of law, the person in the dock is unrelated to the crime. The task of an investigator is to open-handedly test that case. Statistically called a NULL hypothesis, the question is evaluated using probability theory, essentially: if the NULL hypothesis were true, what is the probability of obtaining data at least as extreme as those observed?

In law a person is innocent until proven guilty and a jury holding a majority view of the available evidence decides ‘proof’. However, as evidence may be incomplete, contaminated or contested the person is not necessarily totally innocent – he or she is simply not guilty.

In a similar vein, statistical proof is based on the probability that data don’t fit a mathematical construct that would be the case if the NULL hypothesis were true. As a rule-of-thumb if there is less than (<) a 5% probability (stated as P < 0.05) that a NULL hypothesis is supported, it is rejected in favour of the alternative. Where the NULL is rejected the alternative is referred to as significant. Thus in most cases ‘significant’ refers to a low P level. For example, if the test for zero-slope finds P is less than 0.05, the NULL is rejected at that probability level, and trend is ‘significant’. In contrast, if P > 0.05, trend is not different to zero-trend, implying there is more than a 1 in 20 chance that the apparent trend (which measures the association between variables) is due to chance.
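As a minimal sketch of the zero-slope test, the example below estimates P by permutation: values are shuffled repeatedly to mimic the NULL hypothesis of no association with time, and P is the proportion of shuffles giving a slope at least as steep as the one observed. This is a stand-in chosen to keep the example self-contained, not the conventional t-test for slope and not necessarily the procedure used in BomWatch reports. It also assumes observations are independent, which for time series may not hold. The function names, the synthetic data and the number of shuffles are hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=1):
    """Two-sided permutation P for the NULL hypothesis of zero slope.

    Shuffling y breaks any link with x, so the shuffled slopes show
    what could be expected by chance alone.
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(y)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ols_slope(x, shuffled)) >= observed:
            as_extreme += 1
    # Add one to numerator and denominator so P is never exactly zero.
    return (as_extreme + 1) / (n_perm + 1)


# Synthetic example: one series with a built-in trend, one without.
noise = random.Random(7)
x = list(range(40))
y_trend = [0.05 * xi + noise.gauss(0.0, 0.3) for xi in x]
y_flat = [noise.gauss(0.0, 0.3) for _ in x]

p_trend = permutation_p_value(x, y_trend)
p_flat = permutation_p_value(x, y_flat)

print(f"Series with trend: P = {p_trend:.4f}")
print(f"Series without:    P = {p_flat:.4f}")
```

Where P is less than 0.05 the NULL of zero slope is rejected and trend is ‘significant’; otherwise the data are consistent with zero trend.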

Combined with an independent investigative approach BomWatch relies on statistical inference to draw conclusions about data. Thus the concepts briefly outlined above are an important part of the overall theme. 

Using the air photo archives available in Australia, Dr Bill Johnston has compiled accurate and revealing information about how site changes have been made and how these have affected the integrity of the data record.