Data analysis of climate stations from the Australian Bureau of Meteorology (BOM) shows that:

1-- Most country station records before 2000 are so riddled with errors, fabrication and manipulation as to be worthless as evidence of long-term climate trends.

This is backed up by performing digit tests such as Benford's Law and Simonsohn's Number Bunching Test, both used in fraud analytics, as well as Frequency Histogram analysis and pattern exploration within SAS's JMP software, and by using Bootstrap Resampling, a computational technique that estimates statistics for a population by sampling a data set with replacement.
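For readers who want to try the digit tests themselves, here is a minimal Python sketch of the Benford first-digit check (the function names and the chi-square comparison are my own illustration, not BOM's or JMP's code); you would feed it a station's raw temperature readings:

```python
import math
from collections import Counter

def benford_expected(d):
    # Expected first-digit proportion under Benford's Law: log10(1 + 1/d)
    return math.log10(1 + 1 / d)

def first_digit(x):
    # Leading non-zero digit of a non-zero number; scientific notation
    # guarantees the first character is that digit
    return int(f"{abs(x):.10e}"[0])

def benford_chi2(values):
    # Pearson chi-square statistic of observed first digits vs Benford
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )
```

A statistic above the 5% chi-square critical value for 8 degrees of freedom (15.51) indicates non-compliance with Benford's Law.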

Also, data mining with Decision Trees and Gradient Boosted Trees reveals patterns in the data.

Further, First Differences of temperature show that after adjustments, *many more pairs of days with identical temperatures are created*, and dramatic **changes** in temperature from day to day are also created. These distributions are visually startling when graphed and show the propensity for BOM to use the 1950's and the 1980's as tipping points, where there are dramatic changes in the temperature time series after adjustments.
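First differencing is the simplest of these diagnostics. A sketch, with hypothetical helper names, of how the differences and the identical-day pairs are computed:

```python
def first_differences(temps):
    # Day-to-day change in the series: t[i+1] - t[i]
    return [b - a for a, b in zip(temps, temps[1:])]

def identical_pairs(temps):
    # Number of consecutive-day pairs with exactly the same recorded value
    return sum(1 for d in first_differences(temps) if d == 0)
```

Comparing the histogram of `first_differences(raw)` with `first_differences(adjusted)` is what makes the before/after distributions so visually startling.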

2-- The BOM claim that raw data has only had basic data cleaning and spatial adjustments, and then they say, *"The often quoted “raw” series is in fact a composite series from different locations."* Raw is not Raw; as I showed in my last blog post, it is a modelled output which fails digit analysis tests. On top of that, a series of further adjustments is done, which biases the data.

3-- Missing values in the data can either be left missing or *carefully* imputed (infilled). We show that BOM are not averse to making up values and copy/pasting blocks of temperatures into different dates; sometimes they delete data, and sometimes they impute data, including outliers. This is bad practice and biases the data.

BUT the worst scenario is having data that is *Missing NOT At Random*... in other words, much of the BOM missing data has structure, meaning it is missing for a reason, thus biasing the data further. State of the art data mining applications such as Treenet find structure in the missing data.
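A crude first check for structured missingness (full MNAR detection needs tools like Treenet, but comparing missing rates across groups of your own choosing, such as decades, is the obvious starting point; the record format here is illustrative):

```python
from collections import defaultdict

def missing_rate_by_group(records):
    # records: list of (group_label, value_or_None) pairs.
    # Under Missing Completely At Random, rates should be similar
    # across groups; large differences hint at structured missingness.
    totals = defaultdict(int)
    missing = defaultdict(int)
    for group, value in records:
        totals[group] += 1
        if value is None:
            missing[group] += 1
    return {g: missing[g] / totals[g] for g in totals}
```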

4-- BOM claim homogeneity adjustments are needed because of changing conditions such as station moves, infrastructure changes etc. They also say:

*"The differences between ‘raw’ and ‘homogenised’ datasets are small, and capture the uncertainty in temperature estimates for Australia." -BOM*

No *uncertainty* is captured with adjustments; the data is not made more accurate, it is only "aligned" with the rest of the temperature time series.

In fact, the medicine is worse than the disease here, with **adjustments larger than any biases,** real or imagined (false positives), that they are attempting to correct. Some stations have adjustments of *over 10 °C* in specific years.

Every station gets adjustments: most data before 1957 is cooled, and data after that is warmed. Every few years the adjustments are tweaked and warmed further. The magnitude of the manipulation is easily shown with First Difference Frequency Distributions and digit analysis.

Looking at the Australia averaged warming trends, the current consensus is 1.5 C warming per 100 years; 2 years ago it was 0.9 C, and 10 years ago it was even lower, around 0.47 C. The internet's Wayback Machine can capture the BOM's previous handiwork:

**Above: Australia Averaged Temperature Anomaly of 1.5 C for 2021**

**Below: Australia Averaged Temperature Anomaly of 0.47 C for 2011**

A study compares 100 rural agricultural climate stations with *minimal adjustments* to *BOM's "high quality" temperature series:*

# Biases in the Australian High Quality Temperature Network

**Additional problems with the HQN include failure of homogenization procedures to properly identify errors, individual sites adjusted more than the magnitude of putative warming last century, and some sites of such poor quality they should not be used, especially under a “High Quality” banner.**

This is the *High Quality* banner that BOM attaches to its network.

5-- We show that after the adjustments, there are tweaks involving 10-15 day blocks that have *linear relationships with raw data*. It appears linear regression is being used to "tweak" blocks of days after adjustments.
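That kind of block-wise tweak is detectable with a sliding-window fit: if the adjusted series is an almost perfect linear function of the raw series inside a window, the window gets flagged. A simplified illustration of the check, not BOM's actual procedure:

```python
def rsq(xs, ys):
    # Coefficient of determination for a simple least-squares fit of ys on xs
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 1.0  # degenerate: a constant series is fit exactly
    return sxy * sxy / (sxx * syy)

def flag_linear_blocks(raw, adjusted, window=12, threshold=0.999):
    # Flag start indices of windows where adjusted is an almost perfect
    # linear function of raw -- a signature of block-wise regression tweaks
    return [
        i for i in range(len(raw) - window + 1)
        if rsq(raw[i:i + window], adjusted[i:i + window]) >= threshold
    ]
```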

6-- We show most country stations are worthless for climate analysis before 2000, but some stations in particular are *rotten, even after that time.* Stations such as Bourke, Mildura, Moree, Nhill and even Port Macquarie are so riddled with fabricated and error-ridden data that they are worthless for trend prediction. Digit analysis, bootstrap resampling, first differencing and data mining show that these stations should not be used for analysis.

7-- The dates 1953-1957 and 1963-1967 are significant in many time series. These are the tipping points at which the earlier data is cooled and the later data is warmed--this occurs whether the data has an inhomogeneity problem or not. These dates are frequent tipping points, and data mining applications such as Treenet show these trigger points again and again.

Data cannot be made more accurate with homogenisation software; it can only bring the time series "into step" if biases exist. The data can be no more accurate than the initial reading, which, as we show, has very large error bounds.

The myth created by the industry is that homogenisation is required for station changes, yet it is *this* homogenisation process that is the principal driver of warming. *In fact, adjusted data is in most cases less compliant with Benford's Law (used for fraud detection) than raw data.* The data processing is inherently flawed because:

a-- Biases are created with imputation/infilling and/or selective deletions.

b-- Biases are created by sequentially running software over and over again without multiplicity correction. The software uses a critical value of 5%, the accepted false positive rate, and that rate increases with sequential runs.

c-- Biases are created by p-hacking, HARKing and publication bias. (link)
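Point b is easy to quantify. With a 5% per-test false positive rate, the chance of at least one false positive across repeated independent runs grows quickly; a sketch of the standard arithmetic:

```python
def familywise_error(alpha, n_tests):
    # Probability of at least one false positive across n independent
    # tests, each run at significance level alpha
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    # The simplest multiplicity correction: shrink the per-test threshold
    return alpha / n_tests
```

`familywise_error(0.05, 10)` is roughly 0.40: ten sequential runs at the 5% level give about a 40% chance of at least one spurious "inhomogeneity" being detected.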

Google Ngrams show that "Climate Change" surfaced around 1992, after "Global Warming". It was Republican pollster Frank Luntz who pushed the switch from "Global Warming" to "Climate Change", because climate change was thought to sound less severe at the time. (link)

Language has had a large part to play in the climate scenario, and an analysis of the BOM language will follow at the end.

### A Quick Look At Australia Averaged Anomaly Trends

Since the main weapon in the arsenal of BOM is the Global or Average Australia trend temps, it all comes down to one number. And since BOM have no interest in providing *boundaries of error* or *confidence intervals*, we have no way of knowing whether any warming is statistically significant or not. But what we can do is compare the BOM 2011 Global temps with the 2021 Global temps. This will give us a handle on the BOM methodology. The charts are already shown above, but now we download the actual numbers to compare:

This histogram shows the 2011 warming data, and the blue is the 'updated' 2021 warming data. You can see the standard methodology -- the past is cooled even more, and the dates after the late 60's are warmed even more. The original 2011 data has been deleted by BOM from its website, leaving you with history re-written.

Another histogram, with binned values to smooth it out:

So this looks bad. But how bad is it?

Looking at the red dotted curve as our baseline, you can clearly see that the 2021 data has fewer 1's (lower blue bar on graph) than 2011, and more 4's, 5's, 6's, 7's and 9's.

**The 2011 curve is roughly compliant, with only a few too many 1's and 4's in the first digit position!**

So in the 2021 curve, the first digit position has fewer 1's and 2's and far more higher numbers than there should be. This shows more extreme numbers being used in the second graph, and it is clearly non-compliant.

This is typical throughout the whole time series -- the more versions of temperature we get, the more extreme they are, and the more non-compliant with Benford's Law they are.

Would its own mother recognise what 2011 has become?

Nope--

The distributions, and therefore the temp trends, are different enough at the 95% level to reject the null of similarity. They are different, and any claim of small changes between distributions is obviously not true.

Bootstrapping is a modern, powerful statistical computation technique that overcomes many of the problems and assumptions of older methods of estimating statistics on a population. I will be using it on all comparisons, especially since the assumption of normality (used by older methods) does not hold in climate data (more later).
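A percentile-bootstrap sketch of the kind of comparison I am describing (function and parameter names are my own; you would plug in the 2011 and 2021 anomaly series):

```python
import random

def bootstrap_mean_diff_ci(a, b, n_boot=5000, alpha=0.05, seed=0):
    # Percentile bootstrap confidence interval for mean(a) - mean(b),
    # resampling each group with replacement
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        ra = [rng.choice(a) for _ in a]
        rb = [rng.choice(b) for _ in b]
        diffs.append(sum(ra) / len(ra) - sum(rb) / len(rb))
    diffs.sort()
    lo = diffs[int(n_boot * alpha / 2)]
    hi = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi
```

If zero falls outside the returned interval, the two series' means differ at roughly the 95% level, with no normality assumption required.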

OK, so, maybe the old global data is bogged down with dodgy numbers in the first 90 years--maybe the most recent temperatures are similar between 2011 and 2021?

The difference of means between the 2 groups is not contained in either Confidence Interval at the 95% level, meaning the groups are different enough to reject the null hypothesis at p = 0.05. They are very different; it was not a small tweak by BOM. They obviously "got the memo" to increase temps to 1.5 C per century, and it is highly significant.

I turned the 2011 and 2021 distributions into a time series model with a 10 year forecast at the end of it to compare models. There are vastly different results, but the important bit to note is the green Confidence Intervals at the 95% level--they are huge! This means the warming variation is massive, and you would be hard pressed to form a conclusion based on this series.

Below: 2011 and 2021 distributions turned into time series forecast model.

### Notes On Benford's Law

**Now, however, I have been using a better, quicker and more accurate method of applying Benford's Law to temperatures.**

Body temperatures have an even lower range than climate temperatures, so this is a good test for temperatures overall.

### Do Different Years Get Different Warming Treatment?

The red values at the Pearson Test number show rejection of the null hypothesis at the 95% level, with a p value threshold of 0.05--anything less than this is non-compliant and the null is rejected. The 20 year block of 1920-1940 scrapes through. Overall, 3 out of 5 binned blocks of years are non-compliant, and the other 2 are only just compliant.

This is very evident in the raw data, and exists but is weaker in the adjusted data. The histogram has highly repeated temps interspersed with low-frequency temps; see the raw pic above. There is a consistent high-low pattern. I have been told it may be conversion from Fahrenheit to Celsius during metrication in the 70's, but this appears in some stations after 1980 and even after 2000.


### Simonsohn Number Bunching Test For Sydney Histograms, Month By Month.

**It can be seen that the temperature frequency histograms are extremely heavily manipulated and/or fabricated.** We know some data has been fabricated in the Sydney time series, with my last post showing some of the incidences of weeks and months being copy/pasted into subsequent years.

**Now we can see conclusively that the temperatures have been heavily tampered with, by changing the frequency of occurrence, where some temperatures are overused and appear far too much, some far too little.**

*September, June and May are the only months where the null is not rejected at the 95% level of significance, in other words, they have temperature frequencies as we would expect.*

*The other 9 months have heavily manipulated temperature frequencies*, with many temperatures appearing far more often than they should.
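For those curious, the idea behind the bunching statistic can be sketched as follows. The null model here, uniform draws from the recording grid, is a deliberate simplification of the smoother expected distribution the actual test uses, and the names are mine:

```python
import random

def avg_frequency(values):
    # Mean number of observations per distinct recorded value;
    # heavy "bunching" pushes this number up
    return len(values) / len(set(values))

def bunching_pvalue(values, low, high, step=0.1, n_sim=2000, seed=0):
    # Null model: readings drawn uniformly from the recording grid on
    # [low, high] at precision `step`. The p-value is the fraction of
    # simulated data sets at least as bunched as the observed one.
    rng = random.Random(seed)
    obs = avg_frequency(values)
    grid = [round(low + i * step, 1)
            for i in range(int(round((high - low) / step)) + 1)]
    hits = sum(
        1 for _ in range(n_sim)
        if avg_frequency([rng.choice(grid) for _ in values]) >= obs
    )
    return hits / n_sim  # small p-value: more bunching than the null predicts
```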

**This further backs up Benford's Law that the temperature time series is not observational data.**

*Thus large scale tampering is a virtual certainty.*

