Deception Detection In Non-Verbals, Linguistics And Data.

Australian Climate Data Before 2000 Has No Evidential Value - Digit Analysis Proof.

Data analysis of climate stations from the Australian Bureau of Meteorology (BOM) shows that:

1-- Most country stations before 2000 are so riddled with errors, fabrication and manipulation as to be worthless for any evidential purpose relating to long-term climate trends.

This is backed up by performing digit tests such as Benford's Law and Simonsohn's Number Bunching Test, both used in fraud analytics, as well as frequency histogram analysis and pattern exploration within SAS's JMP software, and by Bootstrap Resampling, a computational technique that estimates statistics on a population by sampling a data set with replacement.
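A first-digit screen of this kind can be sketched in a few lines. This is a hypothetical Python illustration, not the JMP scripts used in the post; the 15.507 cut-off is the standard chi-square critical value for 8 degrees of freedom at the 5% level.

```python
import math
from collections import Counter

def first_digit(x: float) -> int:
    """Return the leading non-zero digit of |x|."""
    s = f"{abs(x):.10e}"          # e.g. 25.0 -> '2.5000000000e+01'
    return int(s[0])

def benford_chisq(values):
    """Chi-square statistic of observed first digits vs the Benford expectation."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    chisq = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)   # Benford: P(d) = log10(1 + 1/d)
        observed = counts.get(d, 0)
        chisq += (observed - expected) ** 2 / expected
    return chisq

# chisq > 15.507 (df = 8, 5% level) -> reject Benford compliance (red flag).
```

A geometric series (which is known to follow Benford's Law) passes this screen; data with too few 1s and 2s in the lead position fails it.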

Also, data mining with Decision Trees and Gradient Boosted Trees shows patterns in the data.

Further, First Differences of temperature show that the adjustments create many more pairs of days with identical temperatures, as well as dramatic day-to-day temperature swings. These distributions are visually startling when graphed and show the propensity of BOM to use the 1950s and the 1980s as tipping points, where the temperature time series changes dramatically after adjustments.
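The two symptoms described above can be counted directly from the first differences: identical-day pairs show up as zero differences, and dramatic swings as large absolute differences. A minimal sketch (the 10-degree jump threshold is an illustrative assumption, not a BOM parameter):

```python
def first_differences(temps):
    """Difference each day from the one before it."""
    return [b - a for a, b in zip(temps, temps[1:])]

def diff_summary(temps, jump_threshold=10.0):
    """Count identical consecutive-day pairs and large day-to-day swings."""
    diffs = first_differences(temps)
    identical_pairs = sum(1 for d in diffs if d == 0)
    big_jumps = sum(1 for d in diffs if abs(d) >= jump_threshold)
    return identical_pairs, big_jumps

# Example: 18.5 repeats (one identical pair) and 29.0 -> 15.0 is a 14-degree swing.
temps = [18.5, 18.5, 21.0, 29.0, 15.0]
print(diff_summary(temps))  # (1, 1)
```

Comparing these two counts before and after adjustment is one way to make the "visually startling" distributions quantitative.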

2-- The BOM claims that raw data has only had basic data cleaning and spatial adjustments, yet also says, "The often quoted “raw” series is in fact a composite series from different locations." Raw is not raw: as I showed in my last blog post, it is a modelled output which fails digit analysis tests. On top of that, a series of further adjustments is applied, which biases the data.

3-- Missing values in the data can either be left missing or carefully imputed (infilled). We show that BOM is not averse to making up values and copy/pasting blocks of temperatures into different dates; sometimes they delete data and sometimes they impute data, including outliers. This is bad practice and biases the data.

BUT the worst scenario is having data that is Missing Not At Random (MNAR). In other words, much of the BOM's missing data has structure, meaning it is missing for a reason, biasing the data further. State-of-the-art data mining applications such as TreeNet find structure in the missing data.

4-- BOM claims homogeneity adjustments are needed because of changing conditions such as station moves, infrastructure changes and so on. They also say:

"The differences between ‘raw’ and ‘homogenised’ datasets are small, and capture the uncertainty in temperature estimates for Australia." -BOM

No uncertainty is captured by the adjustments; the data is not made more accurate, it is only "aligned" with the rest of the temperature time series.

In fact, the medicine is worse than the disease here, with adjustments larger than any biases, real or imagined (false positives), that they attempt to correct. Some stations have adjustments of over 10 °C in specific years.

Every station gets adjustments: most data before 1957 is cooled, and data after that is warmed. Every few years the adjustments are tweaked and warmed further. The magnitude of the manipulation is easily shown with First Difference frequency distributions and digit analysis.

Looking at the Australia-averaged warming trends: at the moment the consensus is 1.5 °C warming per 100 years; 2 years ago it was 0.9 °C, and 10 years ago it was even lower, around 0.47 °C. The Wayback Machine can capture the BOM's previous handiwork:

Above: Australia Averaged Temperature Anomaly of 1.5 C for 2021

Below: Australia Averaged Temperature Anomaly of 0.47 C for 2011

A study compares 100 rural agricultural climate stations with minimal adjustments to BOM's "high quality" temperature series:

Biases in the Australian High Quality Temperature Network

"Various reports identify global warming over the last century as around 0.7°C, but warming in Australia at around 0.9°C, suggesting Australia may be warming faster than the rest of the world. This study evaluates potential biases in the High Quality Network (HQN) compiled from 100 rural surface temperature series from 1910 due to: (1) homogeneity adjustments used to correct for changes in location and instrumentation, and (2) the discrimination of urban and rural sites.

The approach was to compare the HQN with a new network compiled from raw data using the minimal adjustments necessary to produce contiguous series, called the Minimal Adjustment Network (MAN). The average temperature trend of the MAN stations was 31% lower than the HQN, and by a number of measures, the trend of the Australian MAN is consistent with the global trend. This suggests that biases from these sources have exaggerated apparent Australian warming. Additional problems with the HQN include failure of homogenization procedures to properly identify errors, individual sites adjusted more than the magnitude of putative warming last century, and some sites of such poor quality they should not be used, especially under a “High Quality” banner."

As the above study shows, the biases in the Australian warming record are larger than those in Europe and the US. I have downloaded temperatures for Berlin, Marseilles, Frankfurt, Amsterdam and various US areas to compare, and the level of bias in the Australian data is much larger. As mentioned above, there is no way to justify the "High Quality" banner that BOM attaches to its network.

Every station gets adjustments over every ten-year block before the year 2000, and they are large, especially during specific months like March, August and October. The BOM takes averages of averages, which produces mostly incorrect results and hides large effects.

5-- We show that after the adjustments, there are tweaks involving 10-15 day blocks that have linear relationships with the raw data. It appears linear regression is being used to "tweak" blocks of days after adjustments.
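One way such blocks could be screened for is to slide a window along the series and regress adjusted on raw: a near-perfect linear fit whose slope or intercept differs from the identity suggests a linear tweak. This is a hypothetical sketch of that screen (the window size and R² cut-off are illustrative assumptions):

```python
def linfit_r2(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:                       # constant x: no fit possible
        return 0.0, my, 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy * sxy / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r2

def suspect_blocks(raw, adjusted, window=12, r2_cut=0.999):
    """Yield (start, slope, intercept) for windows where adjusted is
    almost exactly linear in raw but not the identity transform."""
    for i in range(len(raw) - window + 1):
        slope, intercept, r2 = linfit_r2(raw[i:i + window], adjusted[i:i + window])
        if r2 >= r2_cut and (abs(slope - 1) > 1e-6 or abs(intercept) > 1e-6):
            yield i, round(slope, 3), round(intercept, 3)
```

Running this over a series where a 12-day block was rescaled as `0.9 * raw + 1.2` flags the block and recovers the coefficients; untouched data yields nothing.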

6-- We show most country stations are worthless for climate analysis before 2000, but some stations are rotten even after that time. Stations such as Bourke, Mildura, Moree, Nhill and even Port Macquarie are so riddled with fabricated and error-ridden data that they are worthless for trend prediction. Digit analytics, bootstrap resampling, first differencing and data mining show that these stations should not be used for analysis.

7-- The dates 1953-1957 and 1963-1967 are significant in many time series. These are the tipping points at which the earlier data is cooled and the later data is warmed--this occurs whether or not the data has an inhomogeneity problem. These dates are frequent tipping points, and data mining applications such as TreeNet show these trigger points again and again.

Data cannot be made more accurate with homogenisation software; it can only bring the time series "into step" if biases exist. The accuracy can only be as good as the initial reading, which, as we show, has very large error bounds.

The myth created by the industry is that homogenisation is required for station changes, yet it is this homogenisation process that is the principal driver of warming. In fact, adjusted data is in most cases less compliant with Benford's Law (used for fraud detection) than raw data. The data processing is inherently flawed because:

a-- Biases are created with imputation/infilling and/or selective deletions.
b-- Biases are created by sequentially running software over and over again without multiplicity correction. The software uses a critical value of 5%, the accepted false positive rate, which inflates with sequential runs.
c-- Biases are created by p-hacking, HARKing and publication bias. (link)
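The multiplicity problem in point b can be made concrete with a one-line calculation: under independent repeated tests, the chance of at least one false positive is 1 - (1 - 0.05)^k, which climbs rapidly with the number of runs k.

```python
def familywise_error(alpha=0.05, runs=1):
    """P(at least one false positive) over independent sequential tests."""
    return 1 - (1 - alpha) ** runs

for k in (1, 5, 10, 20):
    print(k, round(familywise_error(runs=k), 3))
# A single run keeps the 5% rate; 20 independent runs push the
# chance of at least one spurious "inhomogeneity" to about 64%.
```

This is why repeatedly re-running detection software over the same series without correction manufactures adjustment points.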

Google Ngrams show that "Climate Change" surfaced around 1992, after "Global Warming", and it was Republican pollster Frank Luntz who pushed the term "climate change" over "global warming", because climate change was thought to sound less severe at the time. (link)

Language has had a large part to play in the climate scenario, and an analysis of the BOM language will follow at the end.

Quick Look At Australia Averaged Anomaly Trends

Since the main weapon in the arsenal of BOM is the global or Australia-averaged trend temperature, it all comes down to one number. And since BOM has no interest in providing error bounds or confidence intervals, we have no way of knowing whether any warming is statistically significant or not. But what we can do is compare the BOM 2011 global temperatures with the 2021 global temperatures. This will give us a handle on the BOM methodology. The charts are already shown above, but now we download the actual numbers to compare:

This histogram shows the 2011 warming data, and the blue is the 'updated' 2021 warming data. You can see the standard methodology -- the past is cooled even more, and the dates after the late 60s are warmed even more. The original 2011 data has been deleted from the BOM website, leaving you with history re-written.

Another histogram, with binned values to smooth it out:

So this looks bad. But how bad is it?

Above are 2 Benford's Law curves. I will go into more detail about Benford's Law later on; suffice it to say, for the quick comparison, that Benford's Law provides a test for the value of the first digit. Most observational data complies with this log family of curves.

Taking the red dotted curve as our baseline, you can clearly see that the 2021 data has fewer 1s (lower blue bar on the graph) than 2011, and more 4s, 5s, 6s, 7s and 9s.

The 2011 curve is roughly compliant, with only a few too many 1s and 4s in the first digit position!

So in the 2021 curve, the first digit position has fewer 1s and 2s and far more high digits than there should be. This shows more extreme numbers being used in the second graph, and it is clearly non-compliant.

This is typical throughout the whole time series -- the more versions of temperature we get, the more extreme they are, and the less compliant with Benford's Law they are.

Would its own mother recognise what 2011 has become?

The distributions, and therefore the temperature trends, are different enough at the 95% level to reject the null hypothesis of similarity. They are different, and any claim of small changes between distributions is obviously not true.

"Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. This process allows you to calculate standard errors, construct confidence intervals, and perform hypothesis testing for numerous types of sample statistics. Bootstrap methods are alternative approaches to traditional hypothesis testing and are notable for being easier to understand and valid for more conditions." (link)

Bootstrapping is a modern, powerful statistical computation technique that overcomes many problems and assumptions of the older methods of estimating statistics on a population. I will be using it on all comparisons, especially since the assumption of normality (used by older methods) does not hold for climate data (more later).
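The comparison described here can be sketched with nothing more than the standard library: resample each series with replacement, recompute the difference of means each time, and read off a 95% percentile interval. A minimal illustration, assuming the two anomaly series are plain lists of numbers:

```python
import random

def bootstrap_diff_ci(a, b, reps=10_000, seed=1):
    """95% percentile CI for mean(b) - mean(a), via resampling with replacement."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        ra = [rng.choice(a) for _ in a]   # resample each group ...
        rb = [rng.choice(b) for _ in b]   # ... with replacement
        diffs.append(sum(rb) / len(rb) - sum(ra) / len(ra))
    diffs.sort()
    return diffs[int(0.025 * reps)], diffs[int(0.975 * reps)]

# If the interval excludes zero, the two versions differ at roughly the 5% level.
```

No normality assumption is needed, which is the point of preferring the bootstrap for skewed climate distributions.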

OK, so maybe the old global data is bogged down with dodgy numbers in the first 90 years -- maybe the most recent temperatures are similar between 2011 and 2021?

Zero is not contained in the 95% confidence interval for the difference of means between the 2 groups, meaning they are different enough to reject the null hypothesis at a p value of 0.05. They are very different; it was not a small tweak by BOM, they obviously "got the memo" to increase temperatures to 1.5 °C per century, and the difference is highly significant.

I turned the 2011 and 2021 distributions into a time series model with a 10-year forecast at the end to compare models. The results are vastly different, but the important bit to note is the green 95% Confidence Intervals--they are huge! This means the warming variation is massive, and you would be hard pressed to form a conclusion based on this series.

Below: 2011 and 2021 distributions turned into time series forecast model.

Notes On Benford's Law

I did go into depth on Benford's Law in my recent posts, but there I was using the climate industry's offsets, called temperature anomalies, to convert temperatures into a log-type curve suitable for Benford's Law.

Now, however, I have been using a better, quicker and more accurate method to apply Benford's Law to temperatures.

Briefly, Benford's Law is the most widely used technique in the world for red-flagging suspect data; there are tens of thousands of pages describing it on the internet. It is used by tax offices to red-flag potential tax fraud, by law enforcement to detect money laundering, and so on.

Most observational data has enough range to be testable automatically--the first digit of your house number conforms to Benford's Law because house numbers are observational and their range is large enough.

Now, the problem with temperature is that observed temperatures in climate have a very narrow range, let's say between -10 °C and +50 °C, give or take a few degrees. This range is not 200:1 or more, so it won't fit a log-type curve.

A simple technique I came up with, and baseline-checked for accuracy, involves First Differences, or First Differencing:

It is used in climate science, economics and statistics, and involves subtracting from each value the one before it. If yesterday was 25 °C and today is 20 °C, the difference is -5...this is done for all the sequential values in a time series.

So Sydney's temperatures (left highlighted column) run down in sequential order as they happened, and the differences are in the right column.

This de-trends the time series--any trend in the time series is removed--and it is also used to identify patterns and detect linearity. (link)

But it also has the happy result of conforming to Benford's Law. Of course, we need to test this on placebo data, so let's do that.
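The differencing-plus-Benford pipeline can be sketched as follows. This shows the mechanics only (a hypothetical illustration, not the JMP script); whether any real station's differences comply is exactly the empirical question the cases below test. Differences are rounded to 0.1 °C, matching the recording precision, to avoid floating-point artefacts.

```python
import math
from collections import Counter

def benford_table(series, precision=1):
    """First-difference a series, then tabulate observed vs expected
    leading-digit proportions under Benford's Law."""
    diffs = [round(b - a, precision) for a, b in zip(series, series[1:])]
    digits = [int(f"{abs(d):e}"[0]) for d in diffs if d != 0]  # leading digit
    n = len(digits)
    counts = Counter(digits)
    rows = []
    for d in range(1, 10):
        observed = counts.get(d, 0) / n
        expected = math.log10(1 + 1 / d)
        rows.append((d, round(observed, 3), round(expected, 3)))
    return rows

for row in benford_table([20.0, 21.0, 23.0, 20.5, 24.5]):
    print(row)   # (digit, observed proportion, Benford expectation)
```

A chi-square or similar goodness-of-fit test on the observed and expected columns then gives the compliance verdict.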

Case 1:
Kinsa in the U.S. has several million thermometers connected to the internet via smart phones. People buy the $30 body-temperature thermometers and volunteer to add their readings to a real-time database. This has been the most effective predictor of Covid spread, because fever is one of the Covid symptoms. The accuracy of the 'fever' maps produced by Kinsa has helped develop strategies in the U.S. to combat spread. We know the temperatures are on average accurate because of the effectiveness of the maps.

I downloaded 300,000 readings and selected a subset from February and March covering the counties with the highest number of readings.

A first difference was taken on the Kinsa temperatures for March and February.

Benford's Law first-digit analysis was performed on the February data, which resulted in a tight fit--the temperatures are Benford compliant.

Case 2:
This was duplicated for March (below) with the same results.

Body temperatures have an even narrower range than climate temperatures, so this is a good test for temperatures overall.

Case 3:
But most observational data complies with Benford's Law, so the weights of 30,000 greyhounds that raced in Ballarat were downloaded, differenced and tested:

Cases 4 + 5:
This was also done for Covid hospital admissions. I randomly picked Belgium from the European data and got a good fit. I also checked PM2.5, one of the most widely analysed air pollution datasets in the U.S., and this complied with Benford's Law too. However, I have decided not to show every single study and analysis I did because of space constraints.

There is little doubt that First Differencing of temperatures creates a curve that is Benford's Law compliant.

Case 6:
As a comparison, let's look at Nhill in Victoria for the month of October. Instead of a graph, I will use the actual numbers output so that you can see the size of the variation:

The first digit position has 9 possible values, of which 1 is the most prevalent; it should occur about 30% of the time, but here it occurs a bit over 20% of the time. The values 1 and 2 come in far less often than they should, with 3, 4 and 5 coming in more often than they should. The p value is <0.0001; the data is non-compliant. The data is red-flagged and very suspicious--this is an extreme deviation from expected observational data.

Do Different Years Get Different Warming Treatment?

I binned the temperature time series into 20-year blocks to test how different years are treated. I used minv2.1 for Sydney.

The red values at the Pearson test number show rejection of the null hypothesis at the 95% level; the critical p value is 0.05--anything less than this is non-compliant and the null is rejected. The 1920-1940 block scrapes through. Overall, 3 out of the 5 binned blocks are non-compliant, and the other 2 are only just compliant.

Other Anomalies:

In my last blog I highlighted the large amount of copy/pasting in the Sydney data--large blocks of temperatures, full months in fact, were copy/pasted into other years. I won't be repeating that analysis; instead I want to look at a few new directions.

Decimal values are a serious problem in the BOM data. I have already looked at the missing decimal values, where months and months of bare integers exist. But there is another problem with decimals--they are correlated with the integers.
There shouldn't be any correlation between the integer part and the decimal part: if the integer goes up, the decimal value shouldn't systematically decrease, but that is what is happening.
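A check of this kind could be scripted as below: split each reading into its integer and decimal parts and compute their Pearson correlation, which for honest readings should hover near zero. This is a hypothetical sketch (it assumes non-negative temperatures recorded to one decimal place), not the JMP analysis used in the post.

```python
def int_decimal_corr(temps):
    """Pearson correlation between integer parts and decimal parts
    of a list of non-negative temperatures recorded to 0.1 degrees."""
    ints = [int(t) for t in temps]
    decs = [round(t - int(t), 1) for t in temps]   # e.g. 20.9 -> 0.9
    n = len(temps)
    mi, md = sum(ints) / n, sum(decs) / n
    cov = sum((a - mi) * (b - md) for a, b in zip(ints, decs))
    vi = sum((a - mi) ** 2 for a in ints)
    vd = sum((b - md) ** 2 for b in decs)
    return cov / (vi * vd) ** 0.5

# A series whose decimals fall as the integers rise scores near -1.
```

Any correlation far from zero over a long series would be the red flag described above.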

These are not minor correlations; they are large, among the largest in the time series:

As a matter of interest, it's evident that the minimum temperatures are the most manipulated--they are the most copy/pasted, have the largest integer-to-decimal anomalies and the largest warming or cooling adjustments, and they fail Benford's Law and Number Bunching tests the most.

While looking at tests, let's get into Uri Simonsohn's Number Bunching Test. (link)
This has a very novel premise: test all the numbers at once, instead of just the first digit as in Benford's Law. I managed to script Uri's test in JMP and was able to replicate all the tests and results on his blog.

This allows me to get consistent output with the SAS JMP software. The test checks for bunching or clumping in histograms. We know the BOM histograms are dodgy; I had dozens of examples in my last blog.

A quick look at Nhill in Victoria shows a fabricated frequency distribution in the histogram:

This is very evident in the raw data, and exists but is weaker in the adjusted data. The histogram has highly repeated temperatures interspersed with low-frequency temperatures; see the raw pic above. There is a consistent high-low pattern. I have been told it may be an artefact of the Fahrenheit-to-Celsius conversion during metrication in the '70s, but it appears in some stations after 1980 and even after 2000.

Now this is extreme and easy to see, but the Melbourne August temperatures below are more subtle and a bit different:

And Sydney August follows:

In both cases, a few temperatures occur a lot (the high bars indicate high frequency), while some temperatures, say between 20-25 °C, never occur or occur only a few times in a time series that has 3300 days of August in this case. So certain temperatures dominate, while others have large gaps, revealing that they never occur or occur only a few times. These temperatures are "bunched", as per Simonsohn's terminology, or "clumped" together, with too few occurrences in some places and too many in others, beyond what is expected.

So we need a test to quantify this, and that's what the number bunching bootstrap resampling test does. Simonsohn has much experience exposing fabrication in scientific studies and is an expert in digit analysis.

To test this on placebo data, I ran it on various data sets: notably the PM2.5 air pollution data set from the US, subject of over 40 studies, as well as the Kinsa body temperature data and greyhound starting prices... all checked out as expected, with no excessive frequencies of certain numbers outside the 95% level of expectation. I am not showing the results here because I am having trouble keeping posts to a manageable size.

Simonsohn Number Bunching Test For Sydney Histograms, Month By Month.

Testing each month: a distribution of expected average frequencies is created with a bootstrap resample, and the monthly BOM frequency histogram is compared against it to see how common or uncommon the observed frequencies are. See Datacolada for more info.
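The logic of that comparison can be sketched as follows. The statistic is the "average frequency" of values (for each observation, how often its exact value appears), and the bootstrap builds the distribution of that statistic under an expected distribution. This is a loose illustration in the spirit of Simonsohn's test, with a uniform expected distribution assumed for simplicity; the real test resamples from the study-specific expectation.

```python
import random
from collections import Counter

def average_frequency(values):
    """Mean, over observations, of the count of that observation's value."""
    counts = Counter(values)
    return sum(counts[v] for v in values) / len(values)

def bunching_pvalue(values, support, reps=2000, seed=7):
    """Bootstrap p-value: how often does data drawn from `support`
    bunch at least as much as the observed values?"""
    rng = random.Random(seed)
    observed = average_frequency(values)
    hits = 0
    for _ in range(reps):
        sim = [rng.choice(support) for _ in values]
        if average_frequency(sim) >= observed:
            hits += 1
    return hits / reps   # small p -> more bunching than chance predicts
```

A month where a handful of temperatures dominate the histogram returns a p-value near zero; an unremarkable month returns something comfortably above 0.05.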

It can be seen that the temperature frequency histograms are extremely heavily manipulated and/or fabricated. We know some data has been fabricated in the Sydney time series, with my last post showing incidences of weeks and months being copy/pasted into subsequent years.

Now we can see conclusively that the temperatures have been heavily tampered with, by changing the frequency of occurrence: some temperatures are overused and appear far too often, others far too little.

Each version of the adjusted data changes the frequency of occurrence of temperatures, which can easily be seen in histograms with a bin width of 1.
September, June and May are the only months where the null is not rejected at the 95% level of significance; in other words, they have temperature frequencies as we would expect.

The other 9 months have heavily manipulated temperature frequencies, with many temperatures appearing far more often than they should.

Each updated adjusted data set further increases the frequency of a range of temperatures. Increasing the frequency of occurrence is a tool to warm the temperatures; reducing it cools them down.

This further backs up the Benford's Law finding that the temperature time series is not observational data.

One more thing to check. Keeping with the Sydney data for consistency, let's look at the month of November in binned blocks of years--1920-1940, 1940-1960, 1960-1980, 1980-2000 and finally 2000-2020.

Every block of years for Sydney November temperatures fails horribly at the 95% level of significance. 2000-2020 is the only block with a p value as high as 1.2%--still lower than 5% (0.05), so we reject the null, but not by far.

The rest of the years are hopeless: the chance of seeing such extreme temperature frequencies at random is over a hundred million to one--in other words, virtually zero. Thus the probability of large-scale tampering is a virtual certainty.

As mentioned at the beginning, there is no evidential value in nearly any of the 1910-2000 time series, and even after 2000 it is a lucky dip as to which series still have value.

I was going to post some more number bunching tests for Bourke and Mildura, but they are even worse than Sydney, if that is possible. I may still do a few more in another post.

Summary: If a major capital city has dirty data, the country stations have no hope, and this is borne out by subsequent tests. The data is dirty, manipulated and fabricated, and this can be proven with universal digit tests.

Averaging can hide a multitude of sins; that's why I am using daily time series with around 38,000 days. The BOM technique of producing a single averaged number, without confidence intervals, that is supposed to encompass countrywide warming is meaningless, even if the data were clean!

The Flaw of Averages states: "If you use averages, on average you will be wrong." (link)


© ElasticTruth
