Deception Detection In Non-Verbals, Linguistics And Data.



An Investigation Into The Australian Bureau Of Meteorology -- Large-Scale Data Tampering.

“Findings from a multitude of scientific literatures converge on a single point: People are credulous creatures who find it very easy to believe and very difficult to doubt.” 
(How Mental Systems Believe, Dan Gilbert, psychologist)

The concept of garbage in, garbage out means that no meaningful output can come from 'dirty' input data, and every adjustment that follows is moot. Raw data files, as records of observation, are therefore critical to an accurate temperature record. The question is: are they raw? Is this unadjusted observational data?

"The Bureau does not alter the original temperature data measured at individual stations." 
             -- BOM, 2014

Summary Of Results.
My Analysis Shows Heavily Tampered Raw Data. 
There are whole months that have been copy/pasted into other months, impossible sequences and duplications, and complete nonconformance to Benford's Law, indicating heavy data tampering. Number-bunching tests return standard errors of 30 or more, indicating abnormally repeated temperature frequencies. There is strategic rounding, with no decimal values for years at a time. There are temperatures where Raw is missing but the Adjusted data has been infilled/imputed to create yearly temperature records and/or upward trends. This infilling is strategically selective: only specific cases are infilled, thousands are left empty, while extreme outliers are added into the data.

In most cases the data is even worse after adjustments, according to Benford's Law and control charts.

And there is a smoking gun: nearly all raw data has man-made fingerprints showing engineered highs and lows in repeated data frequencies, which can only be due to large-scale tampering -- all beautifully visual when exposed by zero-bin histograms. Raw is Not Raw, It Is Rotten And Overcooked.

Update: Below, in the More Dodgy Adjustments section, new tables in Minimum Temps show that Bourke, the most manipulated and fabricated station of them all, has had large cooling adjustments of -2.7C or more applied ONLY to May and August for 84 years! From 1911-1995, May and August Minimum received an average of 23 adjustments per year. Whether the station moved up the hill or down the hill, whether the vegetation engulfed it or the thermometer drifted, or even if it was spot-on, it got a large cooling adjustment of -2.7C or more, right up to 1995, only in May and August!

Data from the 112 ACORN stations that I used:
BOM supplies data for minimum and maximum temperatures:
maximum raw = maxraw
minimum raw = minraw
maximum adjusted = maxv1, maxv2, maxv2.1 (different updates)
minimum adjusted = minv1, minv2, minv2.1 (different updates)
I have compared ACORN Raw with AWAP Raw and they are identical.
"There has been no statistically significant warming over the last 15 years." -- Dr. Phil Jones, 13 February 2010

Billion Dollar Data That Has Never Had An Audit.
Incredibly, the BOM temperature series data has never had an independent audit and has never been tested with fraud analytic software, despite the vast amounts of money involved in the industry and the flourishing consultancies that have popped up.

BOM makes much of the independent review 10 years ago that compared their methodology and results to other climate data and found them robust because they were similar -- but the data itself has never been audited.
As we will see later on, the other climate agencies' data shows a complete lack of conformance to Benford's Law too, indicating data problems.

You don't even need Benford's Law to see multiple red flags with online queries of the US GHCN data network (link). The US GHCN data stops reporting cooling stations in south-east Australia after 1990!

The BOM say adjustments don't make any difference, and their evidence is a graph of averages of averages of averages -- days averaged to months, months averaged to years, then averaged across 112 stations, all without published error bounds.

It is well known that pooled data can hide individual fraud and manipulation signatures, and can even begin to conform to Benford's Law due to multiplication and/or division in the data (Diekmann, 2007).

Pooled data can also exhibit Simpson's Paradox, where a trend present in different groups can reverse when the groups are combined.
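Simpson's paradox is easy to demonstrate with a few invented numbers. In this minimal Python sketch (all values hypothetical; the article's own analyses used R and JMP), two stations each cool individually, yet the pooled series warms because the warmer station contributes its observations in the later years:

```python
# Hypothetical illustration of Simpson's paradox with temperature trends.
# Numbers are invented for illustration only.

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Station A (cool site): cooling trend, observed only in early years
years_a = [1, 2, 3, 4]
temps_a = [10.0, 9.8, 9.6, 9.4]

# Station B (warm site): cooling trend, observed only in later years
years_b = [5, 6, 7, 8]
temps_b = [20.0, 19.8, 19.6, 19.4]

print(slope(years_a, temps_a))                          # negative: station A cools
print(slope(years_b, temps_b))                          # negative: station B cools
print(slope(years_a + years_b, temps_a + temps_b))      # positive when pooled
```

Each station has a slope of -0.2 per year, but the pooled slope is strongly positive, which is why trends computed on merged series need to be checked group by group.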

Below: The annual mean temperature averages the BOM uses to show that individual adjustments don't matter... without error bounds or confidence levels.

What is also amazing is that Benford's Law -- which has a proven history of fraud detection in many fields and is admissible as evidence in US courts -- has never been run on any climate data of significance.

Much natural and man-made data follows Benford's Law if it spans several orders of magnitude.

Temperature has an upper and lower limit to what you would normally observe, so it doesn't follow Benford's Law per se; but if you convert it to a temperature anomaly (which is simply an offset used by the climate industry), then it does follow Benford's Law (Sambridge et al, 2010).
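As a sketch of that conversion step (the article's analyses used R code for this), an anomaly is just the observation minus the long-term mean for its calendar month. The readings and the climatology below are invented for illustration:

```python
# Minimal sketch: convert raw temperatures to anomalies relative to each
# calendar month's long-term mean (the "climatology"). Invented data.
from collections import defaultdict

readings = [  # (month, temperature in deg C) -- invented values
    (1, 35.2), (1, 34.8), (1, 36.1),
    (7, 12.4), (7, 11.9), (7, 13.0),
]

# 1. Long-term mean for each calendar month
by_month = defaultdict(list)
for month, t in readings:
    by_month[month].append(t)
climatology = {m: sum(ts) / len(ts) for m, ts in by_month.items()}

# 2. Anomaly = observation minus that month's climatology
anomalies = [round(t - climatology[m], 2) for m, t in readings]
print(anomalies)
```

Because the seasonal offset is removed, anomalies cluster around zero and span more relative orders of magnitude than the raw readings, which is what makes a Benford analysis applicable.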

You would think that the first digit of the numbers in your tax return would take each value with equal probability, but this isn't so -- one appears about 30% of the time and nine appears less than 5% of the time. This is why the tax man is interested in this law too; it has helped find, and even convict for, tax and cheque fraud.

It turns out that human beings are not very good at fabricating numbers.
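The distribution behind those figures can be computed directly: Benford's Law gives the probability of each leading digit d as log10(1 + 1/d). A short Python check:

```python
import math

# Benford's Law first-digit probabilities: P(d) = log10(1 + 1/d).
# Prints the expected percentages quoted above (~30% for 1, <5% for 9).
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
for d, p in benford.items():
    print(d, f"{100 * p:.1f}%")
```

The nine probabilities sum to exactly 1, since the product of (d+1)/d for d = 1..9 telescopes to 10.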

Peer Review Is No Guarantee
"Scientific fraud, particularly data fabrication, is increasing."
 -- (Data Fraud In Clinical Trials, Stephen George and Marc Buyse, 2015). The field now counts over 2,000 retractions.

Fujii has the world record with 183 retractions, after which he was fired from Tokyo University (link). Uri Simonsohn has a website where he replicates studies and has been responsible for many retractions.

Smeesters, Stapel and Sanna were three very high-profile professors in peer-reviewed journals. All were found guilty of data fabrication. All resigned and retracted their papers.

Peer review is no protection against fabrication. Uri Simonsohn was responsible for exposing the three professors on data alone; he argues the way forward is to supply all raw data and code with studies for replication. (link)

Below: One of the 53 studies from Stapel that were retracted due to fabrication. Note the duplicated entries.

Number duplication -- repeating the frequency of numbers -- is one of the most common forms of fabrication, and even the BOM uses low-level copy/paste duplication of temperatures.

John Carlisle is an anaesthetist who is also a part-time data detective. He has uncovered scientific misconduct in hundreds of papers and helped expose some of the world's leading scientific frauds. (link)

Reproducibility -
"No Raw Data, No Science"

Reproducibility is a major principle of the scientific method. (wiki).

"Reproduction in climate science tends to have a broader meaning, and relates to the robustness of results." -BOM

"Robustness checks involve reporting alternative specifications that test the same hypothesis.  Because the problem is with the hypothesis, the problem is not addressed with robustness checks."

Tsuyoshi Miyakawa is the editor of Molecular Brain; he estimates that a quarter of the manuscripts he has handled contain fabricated data, and he is leading the push for "No Raw Data, No Science" -- to publish only reproducible studies that supply raw data.

The BOM supplies little documentation regarding metadata, adjustments, the neighbouring stations used, or the correlations used for adjustments. It is not transparent and so cannot be replicated.

Excel plug-ins such as XLSTAT have the Alexandersson algorithms used by the BOM already built in, so it would be possible to replicate the adjustments if sufficient documentation were available.

"The removal of false detected inhomogeneities and the acceptance of inhomogeneous series affect each subsequent analysis." (A. Toreti, F. G. Kuglitsch, E. Xoplaki, P. M. Della-Marta, E. Aguilar, M. Prohom and J. Luterbacher, 2010)

The adjustment software used by the BOM runs at a 95% significance level, so 1 in 20 normal sequences will be flagged as "breaks" or anomalous; the number of stations selected affects this as well, and it in turn affects each subsequent analysis.

"Homogenization does not increase the accuracy of the data - it can be no higher than the accuracy of the observations. The aim of adjustments is to put different parts of a series in accordance with each other as if the measurements had not been taken under different conditions." (M. Syrakova, V. Mateev, 2009)

Fraud Analytics
The principle in Fraud Analytics is that data that has been fabricated or tampered with looks different from naturally occurring data.

Tools to help in the search:

(1) SAS JMP - powerful statistical software designed for data exploration and anomalous pattern detection. This detects patterns such as copy/paste, unlikely duplications, sequences etc.

(2) R code for Benford's Law - industrial-strength code to run most of the tests advocated by Mark Nigrini in his fraud analytics books. Benford's Law points to digits that are used too much or too little. Duplication of exact numbers is a major marker of fraud. (Uri Simonsohn)

(3) R code from "Measuring Strategic Data Manipulation: Evidence from a World Bank Project"    --  By Jean Ensminger and Jetson Leder-Luis

 (4) R code to replicate BOM methodology to create temperature anomalies to use with Benford's law.

(5) R code from University of Edinburgh -- "Technological improvements or climate change? Bayesian modeling of time-varying conformance to Benford’s Law"  -- Junho Lee + Miguel de Carvalho.

(6) R code - "NPC: An R package for performing nonparametric combination of multiple dependent hypothesis tests" 
-- Devin Caughey from MIT

(7) R code - Number-Bunching: A New Tool for Forensic Data Analysis (datacolada). Used to analyse the frequency with which values get repeated within a dataset, a major marker of data fraud.

(8) CART decision trees from Salford Systems + K-means clustering from JMP.


"The Bureau's ACORN-SAT dataset and methods have been thoroughly peer-reviewed and found to be world-leading." - BOM

Unlocking Data Manipulation With Temperature REPEATS -
The Humble Histogram Reveals Tampering Visually.

Data Detective Uri Simonsohn's Number Bunching R code is used in forensic auditing to determine how extreme number bunching is in a distribution.

I initially used the code on fairly subtle distribution discrepancies, before realising that the BOM Raw temperature data has been so heavily engineered that it isn't needed -- the visual display from any stats program shows this specific residue of extreme tampering.
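The idea behind the number-bunching test can be sketched in a few lines. The original datacolada R code uses a bootstrap null; the toy null below (uniform draws over a fixed value grid) is a simplification, and all data are invented:

```python
import random
from collections import Counter

# Sketch of Simonsohn's "number bunching" statistic: the average frequency
# (AF). For each observation, take the count of its exact value in the data;
# AF is the mean of those counts. Bunched data yields a much higher AF than
# the same number of values drawn smoothly.

def average_frequency(values):
    counts = Counter(values)
    return sum(counts[v] for v in values) / len(values)

random.seed(1)
grid = [round(10 + 0.1 * i, 1) for i in range(100)]   # values 10.0 .. 19.9
smooth = [random.choice(grid) for _ in range(2000)]   # no bunching
bunched = smooth[:1500] + [12.5] * 500                # one over-repeated value

print(average_frequency(smooth))
print(average_frequency(bunched))   # far larger: heavy repeats inflate AF
```

The real test then compares the observed AF against the distribution of AF under a resampling null to get a standard error; the sketch just shows why an over-repeated value drives the statistic up.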

This visual display is a fingerprint of manipulated data. It involves a specific structure: one temperature that is highly repeated in the data, then 4 low-repeat temps, then 1 high repeater, then 5 lower repeats. This 4-5 alternating sequence is methodical, consistent and man-made, and it leaves gaps between the highest repeated temperatures.

And this is immediately visible in virtually all Raw Data and it proves that the data is not observational temperature readings.

The way to see this is with a particular histogram that can be created with any stats program. Let's talk about the histogram.

A histogram is an approximate representation of the distribution of numerical data. Let's look at Tennant Creek Maximum Raw temps as an example. This gives you a rough idea of what the distribution looks like, with the temperatures along the horizontal X axis and the frequency (repeats) on the vertical Y axis. It lets you see which temps appeared most often; it shows you the shape of the distribution.

But the data is binned, many observations are put in each bin, so you can't tell exactly how many times a specific temperature appeared in the data. 

Looking at binned histograms, though, won't show you anything unusual on cursory inspection, because the BOM uses quantile-matching algorithms to match the distributions of Adjusted data with Raw.

We want to know how many times each individual temperature appears, so we need a histogram that doesn't bin its data.

Above: We are looking at the exact same distribution, but now each and every temp has a value that shows exactly how often it appears in the data.

The highest spike is 37.8C, repeated 758 times in the Maxraw data. The higher the spike, the more often the temperature appeared.
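An unbinned frequency table like this is trivial to build in any stats package or language. A minimal Python sketch with invented readings (real use would load a station's daily CSV; these counts are not BOM data):

```python
from collections import Counter

# A zero-bin "histogram" is just an exact frequency table: count how many
# times each individual reading appears. Invented sample values.
temps = [37.8, 37.8, 21.4, 37.8, 18.9, 21.4, 37.8]
freq = Counter(temps)

# Sort by temperature so spikes and gaps line up left to right
for t in sorted(freq):
    print(f"{t:5.1f}  {'#' * freq[t]}  ({freq[t]})")
```

On genuine observational data the resulting counts are noisy but unstructured; the article's claim is that the BOM series instead show regularly spaced high-count values.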

Note -- this is NOT related to Australia going metric and changing to Celsius in 1972; these graphs show the same thing in the 1940's and the 1990's too.

And here is the problem for the BOM -- you can see straight away that this is dodgy data. The reason data is binned in normal histograms is that at a granular level things become very noisy and it's difficult to see the shape of the distribution, a bit like this (below: the US NOAA NW region climate data).

Not so with the BOM data -- things become clearer because this is not observational data, it has been engineered to have specific high repeated temperatures (high spikes), followed by gaps where there are lower frequency (repeated) temps, then a high one again and so on.

What this is saying is that the highest frequency temperatures are neatly ordered between lower repeated temperatures.

Let's look at Deniliquin Minraw.
These are the numbers that go into the created histogram.
The Minraw temp of 8.9C (top line) repeated 833 times in the data, creating the highest spike because it appeared the most often -- it had the highest frequency.

But look what happens -- there is a gap of 4 LOW repeated temps then another HIGH repeated temp (next one is 9.4C at 739 repeats).
Then there is a gap of 5 LOW then 1 HIGH, then 4 LOW and so on and so on.

This is RAW data, and it is engineered so that there are consistently alternating gaps of 4 and 5 low numbers between the extreme high spikes. And this occurs with most Raw data (at least 80%). It is a major mistake by the BOM; it is a residue, a leftover from tampering.

Recap -- the very high spikes you see in the graph come from a simple histogram without binning, available in most stats programs. It shows that so-called RAW data, which is supposed to be observational data, actually has an artificial, man-made structure!

You don't see, in a dataset from the natural world, a high-frequency temperature (high repeat), then 4 very low-frequency temps, then a high frequency, then 5 low temps, and so on. This is not random; it is engineered, and it is a mistake from one of the BOM algorithms!

Let's look at Bourke Minraw:

Above: Bourke Minraw temperatures, exact same signature.

Let's look at Charters Towers:

Above: The same fraud signature, showing Raw data is not raw but overcooked. These alternating high repeated temps between the low ones are unnatural, and there is no explanation for this except large-scale tampering of Raw.

This tampering is so extreme it doesn't exist to this level in other climate data from other agencies -- the BOM is the most heavy handed and brazen.

Let's do a quick tour around various stations with just the visual histogram:

All Raw, all have extreme spikes showing extreme repeats, all have the same 4-5 alternating gaps! All are unnatural.

Now what happens when the Raw get Adjusted?
The spikes get turned down, their frequency and rate of repeats is reduced, but the vast body of lower repeated temps are increased.

Look at Deniliquin Minraw -- 7.2C is repeated 799 times in Raw but only 362 times in Minv2 Adjusted.
But the low repeats in the gaps are increased!

Let's look at Bourke repeated temps and compare Raw to Adjusted.

Same thing, the highest repeated temps, the spikes in the graph are reduced by reducing the frequency with which they appear. The low frequency temps are increased.

Tennant Creek, same thing:

What is the net result of doing adjustments on raw data? The high spikes are reduced, getting rid of the evidence.

The low frequency temps (of which there are many more) are increased in frequency giving a net warming effect.

Temperatures are controlled by reducing or increasing the frequency with which they are repeated in the data!

Tennant Creek Maxv2 Adjusted -- spikes reduced, they now merge with the gaps, so everything appears more kosher on a cursory inspection.

Below: A different look at how temperature frequency is manipulated up or down.

You can see that in Raw, 15C repeated 827 times, while in Minv2.1 it appears 401 times. This reduces the large spikes in the Adjusted histograms.

Summary Of Histograms Showing Patterns In Raw:
Histograms with zero binning, at a granular level, expose systematic tampering: the RAW data has been engineered with a specific layout -- the highest repeated temperatures are separated by alternating gaps of 4 and 5 low-frequency repeated temperatures, followed by a single high-frequency repeated temp, and so on.

Normally, for data a bit more subtle than the BOM Raw, Uri Simonsohn's Number Bunching R code is required to detect extreme number bunching or repeating. But the BOM data is so extremely heavy-handed that they have left an obvious visual residue from their tampering algorithms. This is proven when comparing their temps to other agencies': none display the extreme spikes and gaps. The BOM really is world-leading with its data tampering.

This is a visual residue of large-scale tampering. All adjustments from RAW are moot. Raw Is Very Cooked.


"Carefully curating and correcting records is global best practice for analysing temperature data." 
      -- BOM.


Looking for strange patterns and duplication of sequences.
SAS JMP is responsible for this pattern exploration section; see the video link for how this works on pharma data, and how JMP finds anomalous or suspicious values.

JMP computes probabilities of finding a sequence by random, depending on number of unique values, sample size and so on.

I have only listed sequences here that have less than a 1 in 100,000 chance of occurring at random, as calculated by JMP; a full month copy/pasted gets 100% certainty of fabrication.

Copy/pasting exact data into another month or year is the ultimate in lazy tampering. It's incredible the BOM didn't think anyone would ever notice!

Having a run of days ALL with the exact same temperature, to 1/10 of a degree, is dodgy too. This proves Raw is not raw.
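A check for copy/pasted months can be sketched directly. The data layout below (a mapping of (year, month) to daily values) and the values themselves are invented for illustration -- the article's pattern exploration used SAS JMP, which also attaches probabilities to each match:

```python
# Sketch: flag month-long runs whose daily values are exactly identical.
# Invented data; a real check would load a station's full daily series.

def find_duplicate_months(months):
    """Return pairs of (year, month) keys whose daily values match exactly."""
    pairs = []
    keys = sorted(months)
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            if len(months[a]) > 1 and months[a] == months[b]:
                pairs.append((a, b))
    return pairs

data = {
    (1913, 12): [11.2, 12.4, 10.9, 13.1],   # invented daily minima
    (1914, 12): [11.2, 12.4, 10.9, 13.1],   # identical: a copy/paste flag
    (1915, 12): [10.8, 12.0, 11.5, 12.9],
}
print(find_duplicate_months(data))  # [((1913, 12), (1914, 12))]
```

With readings recorded to 0.1C, the chance of two full months matching day for day by coincidence is essentially nil, which is why an exact-match scan is a reasonable first screen.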

Below: Sydney Min Raw - A full 31 days copy/pasted into the following year.

If this is possible, then anything is possible. And at a major capital city too. It's not as if they didn't have the data; they leave thousands of entries blank. The correct procedure is to use proper imputation methods or leave the data missing.

And it's not a one-off. Another full Sydney month copy/pasted into the following year.

More Sydney, another month copy/pasted. Notice: this is Raw data -- and they thought no one would ever notice!

Below: Richmond duplicate sequences. Raw as well.

Below: Georgetown duplicated sequences = dodgy Raw data.

Palmerville - over 2 weeks with the exact same temps, to 1/10 of a degree.

Below: Camooweal - I love how they are unsure what to put into 2002-03-05 in Maxv1!

Below: Cairns -- Full month copy/pasted in Max Raw.

Below: Tennant Creek -- paste January temps into March, that'll warm it!

Below: Tennant Creek. 

Below: Port Macquarie
Look at the top week - a change of week on the second day, pasted into another year but the change of week on the second day is mimicked!

At least they are fabricating consistently.

 "The data available via Climate Data Online is generally considered ‘raw’, by convention, since it has not been analysed, transformed or adjusted apart from through basic quality control. "  -BOM

Below: Bourke, copy/paste July Into June, that'll cool it down! 

Below: Charleville
Here's a great way to cool down a month -- copy/paste the entire August temperatures into September.
I kid you not.

I left the best for last. I could go on and on with these sequences, but this is the last one for now; it's hard to beat.

Let's copy the full month of December into December of the following year.
And let's do this for ALL the Raw and Adjusted temperature series.
BUT let's not make it too obvious -- we'll hide this by changing ONE value and DELETING two values.
You've got to love the subtlety here.

"Producing analyses such as ACORN-SAT involves much work, and typically takes scientists at the Bureau of Meteorology several years to complete. " -BOM

Summary of Pattern Exploration
I deleted quite a few sequences in a rewrite of the blog, because I could go on and on. There are hundreds of sequences ranging from very suspicious to confirmed fabrication. This is a sampling of what is out there.

Charleville is my favourite. Changing 1 value out of 31 in the Minv2 adjustment data above was a masterstroke... they must have found a 'break' through neighbouring stations!

Overall, the sequences show definite data tampering and fabrication on a large scale, and a complete lack of data integrity. A forensic audit is long overdue.


"ACORN-SAT data has its own quality control and analysis..." -BOM 


Benford's Law Fraud Analytics
Benford's Law has been used widely and with great success for many years, from money laundering and financial scams to tracking hurricane distances travelled and predicting times between earthquakes, and it is accepted into evidence in US courts.

Benford's Law can also be applied to ratio- or count-scale measures that have sufficient digits and that are not truncated (Hill & Schürger, 2005).

It describes the distribution of digits in many naturally occurring circumstances, including temperature anomalies (Sambridge et al, 2010). 

Some novel innovations to increase the accuracy of Benford's Law have been developed in this paper, which has been correlated and validated against an actual forensic audit done at the World Bank.

If a data distribution should follow Benford's Law and it doesn't, it means something is going on with the data. It is a red flag for an audit, and the data is likely to have been tampered with.
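A minimal version of a first-digit conformance check can be written in a few lines. The article used R packages; this Python sketch computes the chi-square statistic and Nigrini's Mean Absolute Deviation (MAD) on deliberately fabricated data, where the threshold interpretation (MAD above roughly 0.015 indicating nonconformity for first digits) follows Nigrini's books:

```python
import math

# First-digit Benford check: chi-square statistic plus Nigrini's MAD.
# The input data below is fabricated to overuse the digit 5.

def first_digit(x):
    s = f"{abs(x):.10f}".lstrip("0.")   # drop any leading zeros and points
    return int(s[0])

def benford_first_digit_stats(values):
    n = len(values)
    observed = [0] * 9
    for v in values:
        observed[first_digit(v) - 1] += 1
    chi2, mad = 0.0, 0.0
    for d in range(1, 10):
        p = math.log10(1 + 1 / d)       # Benford probability of digit d
        exp = n * p
        obs = observed[d - 1]
        chi2 += (obs - exp) ** 2 / exp
        mad += abs(obs / n - p)
    return chi2, mad / 9

fabricated = [5.1] * 500 + [1.2] * 100  # digit 5 massively overused
chi2, mad = benford_first_digit_stats(fabricated)
print(chi2, mad)   # both far above conformance levels: a red flag
```

A p-value would come from comparing chi2 against a chi-square distribution with 8 degrees of freedom; genuine Benford-conforming data gives a small chi2 and a MAD near zero.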

The first graph below shows Hobart, Sydney, Melbourne, Darwin and Mildura Maxv2 combined, for 200,000 data points. Running a Benford's Law analysis on the first two digits produces weak conformance, based on Nigrini's Mean Absolute Deviation parameter.
This supports the hypothesis that Benford's Law is the appropriate theoretical distribution for this dataset. Importantly, it does not indicate that the data is legitimate, as pooled data may cancel out the individual signatures of manipulation and replicate Benford's Law (Diekmann 2007; Ensminger and Leder-Luis 2020).

Below: 5 cities aggregated, with the Benford's Law curve for the first 2 digits (red dotted line). The individual spikes/gaps indicate excessive overuse/underuse of specific numbers.

The curve above, with all the aggregated data, still has a bias: the low numbers 10-15 appear too few times, 17-45 appear too often, then specific high numbers appear with too low a frequency, with a few high numbers popping up slightly.


Benford's Law on Individual Stations.
Below: Deniliquin, Raw + Adj

Looking at the entire temperature series from 1910-2018 and using the first two digit values in a Benford analysis shows extreme non-conformance and a tiny p-value in the Max Raw and Max Adjusted data. 

You can see the high systematic spikes in the graph indicating excessive specific digit use in temperature anomalies.

The tiny p-value, less than 2.2e-16, indicates rejection of the null hypothesis that this dataset follows Benford's Law. In other words, there is something wrong with the data.

Below: Min Raw and Adjusted Minv2 Temps for Deniliquin.
These are extreme biases in a 39,000-point dataset, and they suggest tampering. The high-frequency "spikes" are temps that are repeated a lot; they are also evident in the histograms.

Specific Months

Some months are much more tampered with than other months. Not all months are treated equally.

Below: Deniliquin Max Raw for January, all the days of January 1910-2019 are combined for a total of about 3300 days. This graph is screaming out, "audit me, audit me."

Below is Deniliquin Max Raw for July; all the days of July from 1910-2018 were combined to give about 3300 days. These are astounding graphs that show extreme tampering of RAW data.

Below: Deniliquin Min Raw for July.
This is max and min RAW data we have been looking at. 
You are unlikely to find worse less conforming Benford's Law graphs anywhere on the internet. This is as bad as it gets.
This is a massive red flag for a forensic audit.


Below: Mackay Min Raw For July

Below: Amberley Max Raw, All Data. Systematic tampering.

Below: Amberley January Min Raw. All the days of January. 


Amberley Month By Month.- 
Stratified months for Amberley show which months have the most tampering. The results are p-values.

Keeping the same significance level as the BOM, any result less than 0.05 indicates rejection of the null hypothesis of conformance to Benford's Law. In other words, it should follow Benford's Law but it doesn't... tampering is likely.

Minraw 1 digit test, p-values
All 2.267687e-147
jan  1.106444e-17
feb  1.884201e-17
mar  7.136804e-11
apr  1.171959e-06
may  5.280244e-21
jun  5.561890e-28
jul  3.042741e-24
aug  1.439602e-32
sep  3.522860e-19
oct  9.930470e-25
nov  2.039136e-14
dec  4.546736e-23

This shows all the months aggregated for Minraw, as well as individual months. August and June are the worst offenders, followed by October. April is the 'best' month. As with Bourke, August gets major cooling.

Amberley Minv2 Adj 1 digit
All 7.701986e-192
jan  5.367620e-47
feb  1.269502e-25
mar  3.116875e-30
apr  8.924123e-24
may  9.250971e-26
jun  2.388032e-20
jul  2.889563e-38
aug  2.039597e-22
sep  1.678454e-19
oct  4.009116e-26
nov  6.251654e-15
dec  1.563074e-28

This compares the minv2 adjusted data and shows that adjustments overall (All at the top of the list) are worse than raw, which are pretty bad by themselves. 

January and July are the most heavily manipulated months.

 Amberley Maxraw 1 digit
All 6.528697e-217
jan  4.243928e-74
feb  3.451515e-48
mar  1.279319e-52
apr  1.141334e-69
may  4.425933e-58
jun  1.069427e-58
jul  3.903140e-49
aug  9.602354e-70
sep  2.312850e-53
oct  3.374468e-63
nov  5.669760e-48
dec 5.804254e-100

Overall, Maxraw data is worse than Minraw data. 

Amberley Maxv2 Adj 1 digit
All 2.701983e-234
jan  2.309923e-83
feb 2.012154e-103
mar  1.492867e-56
apr  8.215013e-52
may  2.721058e-35
jun  9.487054e-40
jul  2.774663e-59
aug  7.915751e-47
sep  2.796343e-69
oct  1.096688e-39
nov  6.902012e-48
dec  1.814576e-68

Once again, adjustments are worse than Raw. February takes over from January with extreme values.

These results from Benford's Law first-digit tests show that the Adjusted data is worse -- less compliant with the Benford distribution -- than Raw.


Tracking Benford's Law For First Digit Value Over Years 

Amberley Minv2 Adj Data 1942-2017
The University of Edinburgh has created a smooth Bayesian model that tracks the performance of first-digit values against Benford's Law over time, so that you can see exactly when first-digit probabilities increase or decrease. In effect, this lets you see how the first digit of a temperature anomaly changes over time.

Running this model with temperature anomalies from Minv2, with all the data, took 15 minutes on a laptop and produced the graph below:

Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline to what should occur for the digits to conform to Benfords law. 1980 would be just after 40 on the X axis.

This graph shows that the first digit with value 1 has always been underused. Too few ones are used in Minv2 temp anomalies.

It became slightly more compliant in the 1950's then worsened again. There are far too few 1's in first digit position of temperature anomaly Minv2.

There are too many 2's and 3's right from the beginning at 1942, but use of 2's lessens (thus improving) slightly from the 1980's onwards. But use of 4's increases from the 1980's.

The values 8 and 9 in first-digit position have always been underused; the 9 especially from the 1990's onwards.

These digit values indicate less conformance after the 1980 adjustments.

Below: Amberley 2 digit test for all data indicates large scale tampering.

Amberley testing using the first-two-digits test. On the left is the raw data. Already the Min Raw is noncompliant with Benford's Law, giving a tiny p-value, so we reject the null of conformity. This is not natural data. It has already been heavily manipulated: there is a big shortfall of values 10-15, there are too many numbers around 24-38 and then in the 40's, and then methodical spikes in the 50-90 range with big gaps signifying shortfalls.

But look what happens AFTER the adjustments are made, on the right -- the Minv2 data is far less compliant: digit values from 22-47 have become greatly increased in frequency, with a tapering off around 87-97.

Below: Bourke Max Raw and Maxv2.1.
Adjustments make the data 'worse', if that is possible.
Extreme deviations in both Raw and Adjusted data, with consistent underuse of lower digits and overuse of higher digits.

Below: Bourke Min Raw Temp Anomalies vs. Minv2.1.

Below: Mackay January, first digit Benford's test, Raw Data

This shows how much tampering has gone into January and July.

This is a wall hanger--the beauty of 'naturally' occurring observational data in an outstanding pattern that is shouting, "audit me, audit me!"
Almost as if the BOM are getting into fractals.

Below: Sydney data indicates engineered specific number use; certain numbers are repeated consistently.

The BOM has obviously never heard of Benford's Law. This shows engineered specific numbers at specific distances that are over and under used. These are man-made fingerprints showing patterns in RAW data that do not occur in natural observations.



The global temperature graphs shown by various climate agencies generally have no significance levels or error bounds, and are the result of averaging daily temp anomalies into months, which are averaged into years, which are then averaged across the 112 stations in Australia or many more worldwide. Here is an example of NASA GISS data (link).

These are the primary graphs that are used by BOM in media releases.

And these are the graphs they use to argue that 15C adjustments (and more at some stations) don't matter.

How reliable are they?

This is what BOM Global anomalies look like when analysed with Benford's Law:

An underuse of 1's, and a large overuse of 2's, 3's, 4's and 5's.
Nonconformance, with a p-value less than 2.2e-16.
This means the data has likely been heavily tampered with.

BOM talk about their data being robust because it matches other agencies:

Below: NASA GISS Global anomalies.
Overuse of 3,4,5,6,7,+8's, terrible graph, nonconforming to Benford's Law.

US agency NOAA Global anomalies are weakly conforming.
Still an overuse of 6,7,8's.

Where we really go into la-la land is the land/ocean global anomalies -- it's apparent that this is just modelled data. This is not real data.

Below: NOAA global land/ocean with 2 digit test for Benford's Law.

None of the global temperature anomalies can be taken seriously. This is obviously (badly) modelled data, for entertainment purposes only. The global temperature anomalies fail conformance too.



Strategically Rounding/Truncating Temperatures To Create Extreme Biases.

Correct rounding can add 0.1-0.2C to the mean; incorrect rounding, such as truncating, can add up to 0.5C to the mean.

This is NOT related to Australia going metric in 1972. Some stations have whole blocks of years rounded, with almost no decimal values in the later years. For example, Deniliquin from 1998-2002 has only 30 days with any decimal values -- everything else has been rounded for four years.

Some stations have 25 years where rounding increases from 10% to 70%. Looking at the graphs you can see which years get most treatment.
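The per-year rounding fraction described above is easy to quantify. A minimal sketch, with a hypothetical mini-series standing in for real station data:

```python
from collections import defaultdict

def rounding_fraction_by_year(readings):
    """Fraction of readings per year that carry no decimal part.

    `readings` is an iterable of (year, temperature) pairs; honest
    tenths-precision data should sit near 10% whole numbers, not 70-100%.
    """
    whole = defaultdict(int)
    total = defaultdict(int)
    for year, t in readings:
        if t is None:
            continue                         # skip missing observations
        total[year] += 1
        whole[year] += (round(t, 1) == int(t))   # e.g. 31.0 counts as rounded
    return {y: whole[y] / total[y] for y in sorted(total)}

# Hypothetical mini-series: 1997 keeps decimals, 1998 is almost all whole numbers
data = [(1997, 31.4), (1997, 28.9), (1997, 30.0),
        (1998, 31.0), (1998, 29.0), (1998, 27.0), (1998, 30.5)]
print(rounding_fraction_by_year(data))   # {1997: 0.333..., 1998: 0.75}
```

Plotting this fraction by year is the numeric equivalent of the black-dot graphs below: a block of years jumping from ~10% to near 100% whole numbers is exactly the 1998-2002 Deniliquin pattern.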

Below: Deniliquin Maxv2, 1998-2002 all rounded!  

A graphical view of rounding of Max Raw temperatures by years. Notice the high density black dots in the 1998-2002 area which creates a bias. Obviously special attention is given in those years.

This comes on the heels of the review panel advising BOM that their thermometers/readings needed to meet world standards by tightening the tolerance from ±0.5C to ±0.1C. Rounding away the decimal digits in specific blocks of years ensures they won't be meeting world standards any time soon.

"However, throughout the last 100 years, Bureau of Meteorology guidance has allowed for a tolerance of ±0.5 °C for field checks of either in-glass or resistance thermometers. This is the primary reason the Panel did not rate the observing practices amongst international best practices." - BOM

A more visual way to see the rounding is to view the graphs with actual black data dots which shows all the rounded temperatures. 

Below: Deniliquin. - there are strategic patterns to rounding in RAW data. The black dots are rounded temperatures!

Below: Bourke Minraw--strategically rounded up or down.

Below: Bourke Maxraw. The years of strategic rounding/truncating are clearly visible in a 20 year block.

Below: Rutherglen Maxraw rounding/truncating.
Black dots are rounded temperatures! 

Below: Mackay with strategic rounding visible.

Below: Sydney Maxv2 Adj has rounding bias concentrated on the lower part (lower temps) of the graph.

The BOM is using rounding/truncating particularly in the years 1975-2002, and the rounding becomes denser in the 90's. This is on Raw data; it is strategic, and varies from station to station and from year to year. Rounding or truncating can add around 0.5C to the mean, and is likely used to add warming/cooling to the temperature series.

It shows the RAW data has been fiddled with and lacks integrity.



Selective Infilling/Imputing Of Missing Data To Create Extreme Outliers And Biases.

Temperatures which are not collected because there is no instrument available, or because an instrument has failed, cannot be replicated. They are forever unavailable. Similarly, data which is inaccurate because of changes in the site metadata or instrument drift is forever inaccurate, and its accuracy cannot be improved with homogenisation adjustment algorithms.

What we are concerned with here are computer generated temperatures -- called imputation, infilling or interpolation -- inserted into Adjusted data where there is no Raw.

This process cannot be as accurate as actual temperature readings but becomes worse when only specific missing values are selected for infilling, leaving thousands blank. Selecting only some values to infill creates a bias!

So computer generated temperatures can dominate selective parts of the temperature series with specific warming and cooling segments.

In fact, this is exactly what is happening -- BOM is creating computer generated outliers in Adjusted data where Raw is missing. BUT only some of these missing variables are infilled, creating very biased data. 


The missing variables pattern in JMP is flagged with a "1" when data is missing. The above pic is the missing variable report for minv2.1 and Minraw. What we are interested in is all the missing temps in Raw that have infilled/imputed values in Minv2.1. 

So in the above there are 512 values with NO Raw but FULL Minv2.1 data. There are also 622 values where both Raw and Minv2.1 are missing.
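The same cross-tabulation of missing patterns can be reproduced with a few lines of code. The toy series below (with `None` marking a missing reading) stands in for the real Raw and Minv2.1 columns:

```python
def infill_report(raw, adjusted):
    """Classify each position by which series has a value (None = missing).

    Returns indices that are infilled (no Raw, but Adjusted present),
    missing in both, and dropped (Raw present, but Adjusted missing).
    """
    infilled = [i for i, (r, a) in enumerate(zip(raw, adjusted))
                if r is None and a is not None]
    both_missing = [i for i, (r, a) in enumerate(zip(raw, adjusted))
                    if r is None and a is None]
    dropped = [i for i, (r, a) in enumerate(zip(raw, adjusted))
               if r is not None and a is None]
    return infilled, both_missing, dropped

# Hypothetical six-day snippet
raw      = [12.3, None, None, 14.1, None, 13.0]
adjusted = [12.1, 9.8,  None, 13.9, 10.2, None]
print(infill_report(raw, adjusted))   # ([1, 4], [2], [5])
```

Run over a full station, the lengths of these three lists give exactly the 512 / 622-style counts reported by JMP's missing-pattern table, and the `infilled` indices are the black dots plotted in the graphs below.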

BOM infilled outliers into the data, see below.

Melbourne, below epitomises the selective infilling of missing data and bias creation.

48 values with no Raw are infilled into Minv2 BUT 47 values are still left missing! Black dots are the actual infilled values.

The infilled values are all extreme cooling (lowest values in past) that help cool down the earlier part of the temperature series.

Just to be clear what we are seeing -- these are missing Raw temperatures that have been selectively infilled/imputed/interpolated with computer generated values on the lowest boundaries of cool, in a position which 'helpfully' increases the BOM trendline in the more recent years. This is selective, biased data.

Palmerville (below) has had a computer generated 'record warmest Minv2 ever' created. Black dots are the actual infilled values.

Richmond (below) gets the treatment.
Black dots are the actual infilled values.

Let's look at Sydney and the infilled values in detail (below), in relation to creating a trend.

These infilled values are tested for a trend. Yep, you guessed it--the infilled values by themselves trend UPWARDS (below).

Let's look at EXTREME warm and cool infilling at Port Macquarie (below). Keep in mind, these are computer generated values, all 12768 values in Minv2.1! Black dots are the actual infilled values.

Similar story to the Maxv2.1. They infilled 13224 values complete with extreme values, yet still left some blank.

Lastly, let's look at Mt. Gambier. The missing patterns show 10941 values where Raw is missing but values have been infilled into Minv2.1. There are still 77 values missing in both Raw and Minv2.1, and 17 values ignored in Minv2.1 that exist in Raw (another interesting concept!)

The box plot below tells us that there are a lot of outliers in Minv2.1 and Raw.

Likewise, the scatterplot shows another view of the outliers with the 'bulge' in the plot and all the little dots scattered around by themselves. This tells us to prepare for outliers.

Indeed, over 10000 values have been infilled, many with extreme values creating outliers. These are not true observational readings, because Raw is missing. Records have been set by computer -- look at the outlier dot in 1934, way above the others.
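Box-plot outliers like the 1934 dot follow from Tukey's fences, which are easy to compute directly. A minimal sketch with hypothetical values (k = 1.5 is the conventional whisker multiplier):

```python
def iqr_outliers(values, k=1.5):
    """Indices of values outside the Tukey fences (Q1 - k*IQR, Q3 + k*IQR)."""
    s = sorted(values)
    n = len(s)

    def quantile(p):
        # Simple linear-interpolation quantile on the sorted values
        idx = p * (n - 1)
        lo, hi = int(idx), min(int(idx) + 1, n - 1)
        return s[lo] + (s[hi] - s[lo]) * (idx - lo)

    q1, q3 = quantile(0.25), quantile(0.75)
    fence = k * (q3 - q1)
    return [i for i, v in enumerate(values)
            if v < q1 - fence or v > q3 + fence]

# Hypothetical minimum temps with one injected extreme value
temps = [12.1, 12.4, 11.9, 12.3, 12.0, 12.2, 11.8, 12.5, 3.0]
print(iqr_outliers(temps))   # [8]
```

In normal quality control these flagged points are investigated or removed; the complaint here is the opposite -- the infilling process is manufacturing them.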

As well as Minv2.1, we have Maxv2.1 (below). The second lowest maximum temperature ever is a computer generated infilled value!

Infilling/interpolating/imputing values needs to be done carefully to be statistically valid, and the values will never be as accurate as an actual reading.

This is not being done by the BOM, because they are selecting the specific values they want, leaving others blank, AND they are creating extreme outliers in positions they want! Outliers are normally removed; here they are added in. This data is completely without integrity.




 "The primary purpose of an adjusted station dataset it to provide quality station level data for users, with areal [sic] averages being a secondary product." -BOM

The complete Amberley temperature series both raw and adjusted is below. The orange graph is the raw temperature, the blue is the cooled down adjusted version. By cooling the past, a warming trend is created. Notice cooling in Adj stopped around 1998.


A Summary Of The Amberley Problem:

(1) A dip in the temperature series in 1980 (an 'inhomogeneity detected') made them realise the station was running warm, because now it didn't match its neighbours -- therefore it was cooled down significantly from 1942 to 1998. ???

(2) The unspecified 'neighbour stations' were totalled as 310 by NASA and several dozen by BOM.

The stations involved were vague and non transparent, and so unable to be tested.

(3) Conveniently a warming trend had been created where there was none before.

(4) In 1998 the station mysteriously returned to normal and no more significant adjustments were required. 

(5) The following iteration of the temperature series, from Minv1 to Minv2, resulted in them now warming the cooled station after 1998.


No evidence supplied, no documentation on the 'neighbours' involved, no meta data. 

The review panel from 10 years ago had problems with the methodology too -- 

"C7 Before public release of the ACORN-SAT dataset the Bureau should determine and document the reasons why the new data-set shows a lower average temperature in the period prior to 1940 than is shown by data derived from the whole network, and by previous international analyses of Australian temperature data." 

"C5 The Bureau is encouraged to calculate the adjustments using only the best correlated neighbour station record and compare the results with the adjustments calculated using several neighbouring stations. This would better justify one estimate or the other and quantify impacts arising from such choices."

Using only the 'best correlated neighbour stations' has obviously confused Gavin Schmidt from NASA -- he used 310 neighbours (see Jennifer Marohasy's blog). Dr. Jennifer Marohasy documents the whole dubious adjustment saga in detail.

The BOM were eventually forced to defend their procedures in a statement:

"Amberley: the major 
adjustment is to minimum temperatures in 1980. There is very little 
available documentation for Amberley before the 1990s (possibly, as an 
RAAF base, earlier documentation may be contained in classified 
material) and this adjustment was identified through neighbour 
comparisons. The level of confidence in this adjustment is very high because of the size of the inhomogeneity and the large number of other stations in the region (high network density), which can be used as a reference. The most likely cause is a site move within the RAAF base."

Obviously their level of confidence wasn't that high, because they warmed up their cooled temperatures somewhat in the next iteration of the temperature series dataset (from Minv1 to Minv2).

Update minv2.1
Warming continues from iteration minv2 to minv2.1 by increasing the frequency of temperature repeats slightly. Every iteration gets warmer.


This whole situation is ludicrous, and you get the feeling that the BOM has been caught in a lie. There are several ways to check the impact of the adjustments, though.

(1) Benford's Law before and after adjustments
(2) Control Charts, before and after adjustments
(3) Tracking first digit values from 1942-2018 to see if we can spot digit values changing, using a smooth Bayesian model from the University of Edinburgh.



Benford's Law
Below: Raw and adjusted data is compared from 1942-1980 using Benford's law of first digit analysis. 

Using Benford's law for first digit analysis we can see the adjustments make the data worse with lower conformance and a smaller p-value.

Below: Benford's law, first 2 digits, for January and July.
The graphs are as bad as anything you are likely to see and would trigger an automatic audit in any financial situation.



Basic Quality Control - The Control Chart 
Besides Benford's Law, let's use Control Charts to get a handle on the Amberley data and get a second opinion.

I put the Min Raw and Minv2.1 temperature data into a Control Chart, one of the seven basic tools of quality control. 

The temperature series was already 'out of control' in the raw sequence -- but not in 1980. There are 11 warning nodes where the chart is over or under the 3 sigma limit; after adjustments, this nearly doubles. There are many more warning nodes and the temperature sequence is more unstable.
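A basic Shewhart-style check is simple to sketch. Note this version uses the overall standard deviation for the limits, whereas a textbook individuals chart estimates sigma from the average moving range, so treat it as an approximation on hypothetical data:

```python
import statistics

def control_limits(series, sigmas=3):
    """Mean +/- k-sigma limits, plus the indices of points breaching them."""
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)
    upper, lower = mean + sigmas * sd, mean - sigmas * sd
    breaches = [i for i, x in enumerate(series) if x > upper or x < lower]
    return upper, lower, breaches

# Stable hypothetical monthly means with one injected warning node at the end
temps = [10.0, 10.2, 9.8, 10.1, 9.9] * 8 + [14.0]
ucl, lcl, nodes = control_limits(temps)
print(nodes)   # [40] -- only the injected point is 'out of control'
```

Counting how the `breaches` list grows between the Raw and Adjusted series is the "11 warning nodes nearly doubling" comparison in numeric form.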



Tracking Benford's Law For First Digit Value Over Time 
Amberley Minv2 Adj Data 1942-2017

The University of Edinburgh has created a smooth Bayesian model that tracks the performance of first digit values for Benford's Law over time, so that you can see exactly when first digit probabilities increase or decrease.

In effect, this allows you to see how the values of the first digit in a temperature anomaly changes over time. 

Running this model with temperature anomalies from Minv2 took 15 minutes on a laptop and produced the graph below:

Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline of what should occur for the digits to conform to Benford's law. 1980 would be just after 40 on the X axis.

This graph shows that the first digit with value 1 has always been underused. It became slightly more compliant in the 1950's then worsened again. There are far too few 1's in first digit position of temperature anomaly Minv2.

There are too many 2's and 3's right from the beginning at 1942, but use of 2's lessens (thus improving) slightly from the 1980's onwards. But use of 4's increases from the 1980's.

The values 8 and 9 in first digit position have always been underused, the 9 especially from the 1990's onwards.
These digit values indicate less conformance after the 1980 adjustments.
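The Edinburgh model is a full Bayesian tracker, but a crude frequentist stand-in -- the proportion of leading 1's in a sliding window, compared against the Benford baseline log10(2) ≈ 0.301 -- already shows when a digit drifts over time. A sketch on a hypothetical repeating series:

```python
import math

def leading_digit(x):
    """First significant digit of a non-zero number."""
    return int(10 ** (math.log10(abs(x)) % 1))

def rolling_digit1(series, width=36):
    """Proportion of leading-1 digits in each sliding window (e.g. 36 months)."""
    out = []
    for i in range(len(series) - width + 1):
        window = [leading_digit(x) for x in series[i:i + width] if x != 0]
        out.append(sum(d == 1 for d in window) / len(window))
    return out

BENFORD_1 = math.log10(2)   # ~0.301, the conforming baseline for digit 1

# Hypothetical anomaly stream: every window holds two full cycles of the pattern
track = rolling_digit1([1.5, 2.1, 1.2, 3.4] * 20, width=8)
print(min(track), max(track))   # 0.5 0.5 -- constant, well above the baseline
```

Plotting `track` against `BENFORD_1` gives the same kind of picture as the Edinburgh output: a curve that should hug the dotted baseline, and for Amberley's digit 1 sits persistently below it.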



Bourke Adjustments Of 0.5C = Less Than Sampling Variation
BOM released a statement about the adjustments made at Bourke:

"Bourke: the major adjustments (none of them more than 0.5 
degrees Celsius) relate to site moves in 1994 (the instrument was moved  from the town to the airport), 1999 (moved within the airport grounds) and 1938 (moved within the town), as well as 1950s inhomogeneities that were detected by neighbour comparisons which, based on station photos 
before and after, may be related to changes in vegetation (and therefore exposure of the instrument) around the site."

Looking at Bourke, below:

This is strange, because there are lots of adjustments in 1994, 1999 and 1938 that are far more than 0.5 degrees -- some are over 3C.

But maybe the vague language about the 1950's is where the low adjustments are -- well, it depends on the year, which is not specified.

Here are the biggest adjustments in the time series for Bourke:

So the 'none of them more than 0.5 degrees Celsius' refers to some unknown years in the 1950's; it seems to be misdirection, to distract from the bigger adjustments all along the time series.

But look at the months column -- so many August entries I had to look at it more closely.

So there were more than 4400 adjustments of over 2C in the time series. That's the subset we'll look at--

Look at August -- half of all the adjustments over 2 degrees for 1911-2019 in the time series were in August! 

August is getting special attention by the BOM in Bourke with a major cooling of Minimum temperatures. 

Getting back to the '0.5 degree adjustments in the 1950's' -- this is nonsense because:

These are the statistics for 1950-1959:

The mean Min Raw temp is 13.56 degrees. 
A single mean figure conceals sampling variation and does not give a true picture.

Putting Bourke Min into a Control Chart (below) shows what the real problems are. The upper red line is the upper 3 sigma limit, the lower one is the lower 3 sigma limit; temps will vary between the red lines 99.7% of the time unless the process is 'out of control.'
You can tell something is wrong with Bourke from the number of nodes that have breached the upper and lower limits at the beginning and end of the series -- the 1950's doesn't even register. How can it, when we are talking 0.5C?

Look at 2010 -- it is off the chart; there is an extremely remote chance of seeing this event at random.

In Control Chart language, this temp series is 'Out Of Control', there is something very wrong with it.

Above: Control Chart for Bourke showing the system is 'out of control' from the beginning, but the 1950's are not the problem.

UPDATE: **********************************************
MORE on the specific months Bourke is manipulated/adjusted.
Looking at Minv2.1, all manipulations, um, adjustments of -2.7C or more (i.e. the biggest cooling adjustments) are shown below. May gets 414 adjustments, August gets 1330.

What this shows is that in Minimum Temps, 97% of ALL adjustments of -2.7C or more (cooling down) over 84 years were in May and August! Whether or not May and August needed it, every year for 84 years, adjustments were made to May and August.

Below, all the years the adjustments were done as well as how many.

What this means is that the largest cooling adjustments were all done on May and August, every single year -- whether the station moved, vegetation grew or thermometers drifted, it matters not: every year May and August were cooled by -2.7C or more, with an average of 23 adjustments per year.

This makes a mockery of reasons concocted by BOM in hindsight such as vegetation growing, station moving up the hill then down the hill etc.


“There has been no statistically significant warming over the last 15 years.” -- 13 February 2010, Dr. Phil Jones

Statistical Significance With The NPC Test From MIT
Has Every Decade Since The 1980's Been Warmer? 

Very often the BOM display graphs and charts without boundaries of error or confidence intervals. The statistical significance is implied.

Given that there are problems with the past historic temperature series, what if we could test just the best, most recent results -- from modern, fail-safe equipment -- for statistical significance?

A hypothesis like this is easy to test:

Dr Colin Morice from the Met Office Hadley Centre:
"Each decade from the 1980s has been successively warmer than all the decades that came before." 

We can use the Non Parametric Combination (NPC) Test, with R code from Devin Caughey at MIT.

This technique is common in brain mapping labs because no assumptions are made about the distribution, inter-dependencies are handled, and multiple tests are exactly combined into a single p value. A great signal to noise ratio and the ability to handle very small sample sizes make this the ideal candidate to test the hypothesis.

The null hypothesis = all decades after 1980 are NOT getting warmer
The alternate hypothesis = all decades since 1980 have become warmer.

We'll use the temps from Berkeley Earth.
The data will be:

The output of NPC is a p value after exactly combining the sub-hypotheses. In keeping with the BOM, we use the 95% confidence level, so anything with a p value LESS than 0.05 has the null hypothesis rejected.
The results using Berkeley Earth temps (except NOAA, which is from NOAA) are:

Berkeley Earth temps:
h0 = !(1>2>3) -- null hypothesis
h1 = 1>2>3 (each decade warmer) -- alternate hypothesis
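Caughey's NPC implementation is in R, but the spirit of the test -- no distributional assumptions, dependencies handled by permuting the observations themselves -- can be sketched in a few lines. This toy version (my own simplification, not the MIT code) uses the smallest consecutive decade-mean increase as the statistic, so it is only large when EVERY decade beats its predecessor:

```python
import random
import statistics

def monotone_perm_test(decades, n_perm=2000, seed=1):
    """One-sided permutation test that each decade is warmer than the last.

    `decades` is a list of observation lists, oldest first. Observations
    are pooled and reshuffled across decades under the null.
    """
    rng = random.Random(seed)
    sizes = [len(d) for d in decades]

    def stat(groups):
        means = [statistics.fmean(g) for g in groups]
        return min(b - a for a, b in zip(means, means[1:]))

    observed = stat(decades)
    pooled = [x for d in decades for x in d]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        groups, i = [], 0
        for s in sizes:
            groups.append(pooled[i:i + s])
            i += s
        hits += stat(groups) >= observed
    return (hits + 1) / (n_perm + 1)   # add-one smoothing, standard practice

# Hypothetical decades: one clearly warming, one perfectly flat
warming = [[10.0 + 0.1 * j for j in range(10)],
           [12.0 + 0.1 * j for j in range(10)],
           [14.0 + 0.1 * j for j in range(10)]]
flat = [[10.0] * 10, [10.0] * 10, [10.0] * 10]
print(monotone_perm_test(warming), monotone_perm_test(flat))
```

A small p rejects the null (decades really are successively warmer); a large p, as in most of the station results below, means the ordering is indistinguishable from chance.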

Don't Reject The Null - each decade NOT getting warmer.
Alice Springs p-value = 0.4188
Amberley p-value = 0.3326
Tennant Creek p-value = 0.7159
Benalla p-value = 0.4085
Bering p-value = 0.1651
Capetown p-value = 0.2872
Corowa p-value = 0.1776
Darwin p-value = 0.5984
DeBilt, Netherlands p-value = 0.146
Deniliquin p-value = 0.4067
Echuca p-value = 0.3645
Launceston p-value = 0.3331
Mawson p-value = 0.3043
Mildura p-value = 0.2888
Mt. Isa p-value = 0.5782
NOAA Southern Region p-value = 0.2539 
Nowra p-value = 0.2141
Rutherglen p-value = 0.2283
Sale p-value = 0.3685
Tamworth p-value = 0.2407
Wangaratta p-value = 0.277

Reject The Null - each decade is getting warmer
Beechworth p-value = 2e-04
Hobart p-value = 3e-04

Beechworth is less than 40 km from Wangaratta, yet decisively rejects the null while Wangaratta does not! Similarly for Hobart and Launceston, which are 2 hours apart.

This shows that the premise from Met Office Hadley is wrong for our sample. Using Berkeley Earth temps, a random sampling of stations using the NPC test to calculate significance -- without assuming a normal distribution -- has failed to reject the null hypothesis in most cases.

Going over the results again, I found most country stations fail to reject the null, while the capital cities, being Urban Heat Islands, decisively reject the null and agree with Met Office Hadley.

This shows 2 things:

(1) Statistical significance/confidence intervals/boundaries of error are mostly ignored in climate presentations.

(2) Don't trust everything you hear - test, test, test!

As An Aside:
Here are 40 000 coin tosses documented at Berkeley University, where heads are +1 and tails -1:

I took the first 1000 tosses from their supplied spreadsheet, graphed it and plotted a trend.

There's even a 95% boundary of error, which is more than the BOM supply on most of their trends. 

Moral of the story: Even a sequence of coin tosses can show a trend. 
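That exercise is easy to replicate without the spreadsheet. The sketch below uses a simulated fair coin (not the Berkeley data itself), walks 1000 ±1 tosses, and fits an ordinary least-squares slope -- the true trend of a fair coin is zero, yet the fitted slope almost never is:

```python
import random

def toss_walk(n, seed=7):
    """Cumulative sum of n simulated fair +/-1 coin tosses."""
    rng = random.Random(seed)
    walk, s = [], 0
    for _ in range(n):
        s += 1 if rng.random() < 0.5 else -1
        walk.append(s)
    return walk

def ols_slope(y):
    """Ordinary least-squares slope of y against x = 0..n-1."""
    n = len(y)
    xbar, ybar = (n - 1) / 2, sum(y) / n
    num = sum((i - xbar) * (v - ybar) for i, v in enumerate(y))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

walk = toss_walk(1000)
print(round(ols_slope(walk), 4))   # a nonzero 'trend' from pure chance
```

Change the seed and the "trend" flips sign or size at random -- which is precisely why a trendline without a significance test means nothing.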


19 Jan 2021
Over The Next Few Days/Weeks

NEW evidence on missing raw data being imputed/estimated/fabricated into Adjusted data that breaks records. In one case, 10 000 missing raw values are imputed/estimated/fabricated into Adjusted data, complete with yearly records. DONE

Count or ratio data conforms to Benford's Law. A NEW way to use temperatures in the histogram graphs that test for Benford's Law, without the need to convert them to temperature anomalies. This tests the histogram of repeated temperatures for Benford's Law conformance!

Statistical significance tests using climate data from Berkeley Earth. These show that data from many/most stations does not even yield statistically significant results when testing this hypothesis--

Dr Colin Morice from the Met Office Hadley Centre:
"Each decade from the 1980s has been successively warmer than all
the decades that came before." 

We do this using the Non Parametric Combination Test from MIT, which makes no assumptions about the distribution and automatically accounts for inter-dependencies. DONE


This blog has taken quite a few months to research and write. The more I dug into the data, the more rotten it was. And I am still digging. It is a shocking case of extreme data tampering and fabrication. 

It is on a larger scale than Enron, if it were financial (check the Enron Benford curves from my first post); the fabrication/duplication is larger than that of Prof. Stapel, who retracted his studies and was fired from Tilburg University (and who said his techniques were 'commonly in use' in the research labs). 

This has to be a wake-up call for the Government to launch a forensic audit. The BOM cannot be trusted with the temperature record; it should be handed over to a reputable organisation like the Bureau of Statistics. 

It's obvious the BOM either don't know or don't want to know about data integrity. This isn't science. The Brits have a term for this - Noddy Science.


More to follow in other posts, there is much to write about in relation to climate data. It's making the tulip frenzy of the 1600's look like a hiccup.

© ElasticTruth
