Deception Detection In Non Verbals, Linguistics And Data.

BOM Raw Climate Data - EVIDENCE OF LARGE SCALE TAMPERING.



An Investigation Into Australian Bureau Of Meteorology - Large Scale Data Tampering.


“Findings from a multitude of scientific literatures converge on a single point: People are credulous creatures who find it very easy to believe and very difficult to doubt.” 
(How Mental Systems Believe, Dan Gilbert, psychologist)


The concept of garbage in, garbage out means that no meaningful output can result from 'dirty' input data, and any adjustments that follow are moot. Raw data files, as records of observation, are therefore critical to an accurate temperature record. The question is: are they raw? Is this unadjusted observational data?

"The Bureau does not alter the original temperature data measured at individual stations." 
             -- BOM, 2014


Summary Of Results.
My Analysis Shows Heavily Tampered Raw Data. 
There are whole months that have been copy/pasted into other months, impossible sequences and duplications, and complete nonconformance to Benford's Law indicating heavy data tampering. Number bunching tests give standard errors of 30 or more, indicating abnormally repeated temperature frequencies. There are 4 years where all temperatures are rounded up with no decimal data, and several 25-year blocks where the use of rounding increases by 300%. Temperatures missing from Raw are being estimated or imputed into the Adjusted data, where they break or nearly break records.

In most cases the data is even worse after adjustments, according to Benford's Law and control charts. There are cases where extreme adjusted outliers are inserted where data was missing.

And there is a smoking gun: nearly all raw data carries man-made fingerprints, engineered highs and lows in repeated temperature frequencies, which can only be due to large scale tampering. All of it is beautifully visible when exposed by zero-bin histograms. Raw Is Not Raw, It Is Rotten And Overcooked.


Preliminary.
Data from the 112 ACORN stations that I used:
BOM supply data for minimum and maximum temperatures:
maximum raw = maxraw
minimum raw = minraw
maximum adjusted = maxv1, maxv2, maxv2.1 (different updates)
minimum adjusted = minv1, minv2, minv2.1 (different updates)
 

Billion Dollar Data That Has Never Had An Audit.
Incredibly, the BOM temperature series data has never had an independent audit and has never been tested with fraud analytic software, despite the vast amounts of money involved in the industry and the flourishing consultancies that have popped up.

BOM make much of an independent review from 10 years ago that compared their methodology and results to other climate data and found them robust because they were similar; but the data itself has never been audited.
As we will see later on, the other climate agencies show a complete lack of conformance to Benford's Law too, indicating data problems.

You don't even need Benford's Law to see multiple red flags with online queries of the U.S. GHCN data network (link). The U.S. GHCN data stops reporting cooling stations in south-east Australia after 1990... figure that one out.

The BOM say adjustments don't make any difference, and their evidence is a graph of averages of averages of averages: days averaged to months, averaged to years, averaged across 112 stations, all without published error bounds.

It is well known that pooled data can hide individual fraud and manipulation signatures, and can even begin to conform with Benford's Law due to multiplication and/or division in the data (Diekmann, 2007).

Pooled data can also exhibit Simpson's Paradox, where a trend present in different groups can reverse when they are combined.

Below: the annual mean temperature averages BOM use to show that individual adjustments don't matter... without error bounds or confidence levels.


What is also amazing is that Benford's Law, which has a proven history in fraud detection in many fields and is admissible as evidence in a court of law in the U.S., has never been run on any climate data of significance.

Much natural and man-made data follows Benford's Law if there are several orders of magnitude.

Temperature has upper and lower limits to what you would normally observe, so it doesn't follow Benford's Law per se; but if you convert it to a temperature anomaly (which is simply an offset from a baseline, as used by the climate industry), then it does follow Benford's Law (Sambridge et al, 2010).
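To make the anomaly conversion concrete, here is a minimal Python sketch. The station values and the use of a simple per-month mean as the baseline are assumptions for illustration only (agencies typically use a fixed base period such as 1961-1990):

```python
from statistics import mean

# Hypothetical daily maximum temperatures keyed by (month, day index).
# A real run would load a station's series from the BOM files instead.
temps = {(1, d): 30.0 + 0.1 * d for d in range(1, 6)}
temps.update({(7, d): 12.0 + 0.2 * d for d in range(1, 6)})

# Baseline: the mean for each month (a fixed base period is assumed away here).
baseline = {}
for (month, _), t in temps.items():
    baseline.setdefault(month, []).append(t)
baseline = {m: mean(v) for m, v in baseline.items()}

# Anomaly = observation minus its month's baseline mean.
anomalies = {k: round(t - baseline[k[0]], 2) for k, t in temps.items()}
```

Unlike the raw readings, the anomalies span positive and negative values across several orders of magnitude, which is what makes a Benford analysis applicable.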

You would think the first digit of the numbers in your tax return would take each value with equal probability, but this isn't so: one appears about 30% of the time and nine appears less than 5% of the time. This is why the tax man is interested in this law too; it has helped find, and even convict on, tax and cheque fraud.

It turns out that human beings are not very good at fabricating numbers.
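Those percentages are easy to verify yourself. A short Python snippet showing where the 30% and 5% figures come from:

```python
import math

# Benford's Law: P(first digit = d) = log10(1 + 1/d).
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
# The digit 1 leads about 30.1% of the time; 9 leads only about 4.6%.

def first_digit(x: float) -> int:
    """Leading significant digit of a nonzero number."""
    s = f"{abs(x):e}"   # scientific notation, e.g. '2.750000e-02'
    return int(s[0])
```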


Peer Review Is No Guarantee
"Scientific fraud, particularly data fabrication is increasing."
 -- (Data Fraud In Clinical Trials, Stephen George and Marc Buyse, 2015)

Retractionwatch.com has over 2000 retractions.

Fujii has the world record with 183 retractions, after which he was fired from Toho University (here). Uri Simonsohn has a website where he replicates studies and has been responsible for many retractions.

Smeesters, Stapel and Sanna were three very high profile professors publishing in peer reviewed journals. All were found guilty of data fabrication; all resigned and retracted their papers.

Peer review is no protection against fabrication. Uri Simonsohn was responsible for exposing the three professors on the data alone; he argues the way forward is to supply all raw data and code with studies, for replication. (link)

Below: one of the 53 studies from Stapel that was retracted due to fabrication. Note the duplicated entries.

Number duplication, repeating the frequency of numbers, is one of the most common signatures of fabrication, and even the BOM uses low level copy/paste duplication of temperatures.


John Carlisle is an anaesthetist and part time data detective; he has uncovered scientific misconduct in hundreds of papers and helped expose some of the world's leading scientific frauds (link).


Reproducibility -
"No Raw Data, No Science"

Reproducibility is a major principle of the scientific method. (wiki).


"Reproduction in climate science tends to have a
broader meaning, and relates to the robustness
of results." -BOM

"Robustness checks involve reporting alternative specifications that test the same hypothesis.  Because the problem is with the hypothesis, the problem is not addressed with robustness checks."

Tsuyoshi Miyakawa is the editor of Molecular Brain. He estimates that a quarter of the submissions he has handled contain fabricated data, and he is leading the push of "No Raw Data, No Science": only publish reproducible studies that supply their raw data.

The BOM supply little documentation regarding metadata, adjustments, the neighbouring stations used, or the correlations used for adjustments. The process is not transparent and so cannot be replicated.

Excel plug-ins such as XLSTAT already have the Alexandersson algorithms used by BOM built in, so it would be possible to replicate the adjustments if sufficient documentation were available.

"The removal of false detected inhomogeneities and the acceptance of inhomogeneous series affect each subsequent analysis." (Toreti, Kuglitsch, Xoplaki, Della-Marta, Aguilar, Prohom and Luterbacher, 2010)

The adjustment software used by the BOM runs at a 95% significance level, so 1 in 20 perfectly normal sequences will be flagged as "breaks" or anomalous; this choice, like the number of stations selected, in turn affects each subsequent analysis.
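A quick back-of-envelope shows the scale of that false-flag problem (the number of breakpoint tests here is an assumed figure, purely for illustration):

```python
# At a 95% significance level, each test of a genuinely homogeneous
# segment still has a 5% chance of flagging a spurious "break".
alpha = 0.05
n_tests = 2000  # hypothetical number of breakpoint tests run network-wide
expected_false_breaks = alpha * n_tests
print(expected_false_breaks)  # 100.0 spurious breaks expected by chance alone
```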




"Homogenization does not increase the accuracy of the data - it can be no higher than the accuracy of the observations. The aim of adjustments is to put different parts of a series in accordance with each other as if the measurements had not been taken under different conditions." (M. Syrakova, V. Mateev, 2009)

"The primary purpose of an adjusted station dataset is to provide quality station level data for users, with areal [sic] averages being a secondary product." -BOM


Fraud Analytics
The principle in Fraud Analytics is that data that has been fabricated or tampered with looks different from naturally occurring data.

Tools to help in the search:

(1) SAS JMP - powerful statistical software designed for data exploration and anomalous pattern detection. This detects patterns such as copy/paste, unlikely duplications, sequences etc.

(2) R code for Benford's Law - industrial strength code to run most of the tests advocated by Mark Nigrini in his fraud analytics books. Benford's Law points to digits that are used too much or too little. Duplication of exact numbers is a major marker of fraud. (Uri Simonsohn)

(3) R code from "Measuring Strategic Data Manipulation: Evidence from a World Bank Project"    --  By Jean Ensminger and Jetson Leder-Luis

 (4) R code to replicate BOM methodology to create temperature anomalies to use with Benford's law.

(5) R code from University of Edinburgh -- "Technological improvements or climate change? Bayesian modeling of time-varying conformance to Benford’s Law"  -- Junho Lee + Miguel de Carvalho.

(6) R code - "NPC: An R package for performing nonparametric combination of multiple dependent hypothesis tests" 
-- Devin Caughey from MIT

(7) R code - Number-Bunching: A New Tool for Forensic Data Analysis (datacolada).  Used to analyze the frequency with which values get repeated within a dataset, a major source of data fraud .

(8) CART decision trees from Salford Systems + K-means clustering from JMP.


_________________________________________________________


"The Bureau's ACORN-SAT dataset and methods have been thoroughly
peer-reviewed and found to be world-leading." - BOM


Unlocking Fraud With Temperature REPEATS -
The Humble Histogram Reveals Tampering Visually.

Data Detective Uri Simonsohn's Number Bunching R code is used in forensic auditing to determine how extreme number bunching is in a distribution.

I used to use the code for fairly subtle distribution discrepancies before realising that the BOM Raw temperature data has been so heavily engineered that the code isn't needed: the visual display from any stats program shows this specific residue of extreme tampering. This isn't obvious in climate data from any other agency; the BOM's is truly world-leading data manipulation.
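For the record, the core of the number-bunching idea can still be sketched in a few lines of Python. This is a simplified stand-in for Simonsohn's R code, not a replication of it: the statistic is his average frequency, but the null here is simulated by simply redrawing each value's first decimal uniformly, which is my own assumption:

```python
import random
from collections import Counter

def average_frequency(values):
    """Number-bunching statistic: the average number of times the
    observations in a sample appear within that sample."""
    counts = Counter(values)
    return sum(c * c for c in counts.values()) / len(values)

def bunching_p_value(values, n_sims=2000, seed=1):
    """Simplified sketch: keep each value's integer part, redraw the first
    decimal uniformly, and ask how often a simulated sample bunches at
    least as much as the observed one."""
    rng = random.Random(seed)
    observed = average_frequency(values)
    hits = 0
    for _ in range(n_sims):
        sim = [int(v) + rng.randrange(10) / 10 for v in values]
        if average_frequency(sim) >= observed:
            hits += 1
    return hits / n_sims
```

A tiny p-value means the observed repeats are far more extreme than chance under the assumed null.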

This visual display becomes a smoking gun that immediately shows the structure put into the Raw data. It is a fingerprint to fraud.

It involves a specific structure: one temperature that is highly repeated in the data, then 4 temperatures with low repeats, then 1 high repeater, then 5 lower repeats. This 4-5 alternating sequence is methodical, consistent and man-made, and it leaves gaps between the highest repeated temperatures.

And this is immediately visible in virtually all Raw Data and it proves that the data is not observational temperature readings.

The way to see this is with a particular histogram that can be created with any stats program. Let's talk about the histogram.

A histogram is an approximate representation of the distribution of numerical data. Let's look at Tennant Creek Maximum Raw temps as an example. This gives you a rough idea of what the distribution looks like, with the temperatures along the X axis at the bottom, and how often they appear in the data on the vertical Y axis. It lets you see which temps appeared most often; this shows you the shape of the distribution.

Looking at binned histograms won't show you anything unusual on cursory inspection, because the BOM use quantile matching algorithms to match distributions.
 
But the data is binned: many observations are put in each bin, so you can't tell exactly how many times a specific temperature appears in the data.




We want to know how many times each individual temperature appears, so we need a histogram that doesn't bin its data.
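Any stats language can produce this zero-bin view, because it is nothing more than an exact frequency count. A minimal Python sketch, with made-up temperatures standing in for a station's raw series:

```python
from collections import Counter

# Hypothetical daily temperatures to one decimal place; a real run would
# load a station's full raw series instead.
temps = [37.8] * 7 + [37.9, 38.0, 38.1, 38.2] + [38.3] * 6 + [36.0, 36.5]

# A "zero bin" histogram is just the exact count of every distinct value.
freq = Counter(temps)

# The spikes: values sorted by how often they repeat.
spikes = freq.most_common()
print(spikes[0])  # the most-repeated temperature and its count
```

Plotting `freq` with one bar per distinct value reproduces the spike-and-gap charts shown below.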



Above: We are looking at the exact same distribution, but now each and every temp has a value that shows exactly how often it appears in the data.

The highest spike is 37.8C, repeated 758 times in the Maxraw data. The higher the spike, the more often the temperature appeared.

And here is the problem for the BOM: you can see straight away that this is dodgy data. The reason data is binned in normal histograms is that at a granular level things become very noisy and it's difficult to see the shape of the distribution.

Not so with the BOM data -- things become clearer because this is not observational data, it has been engineered to have specific high repeated temperatures (high spikes), followed by gaps where there are lower frequency (repeated) temps, then a high one again and so on.

What this is saying is that the highest frequency temperatures are neatly ordered between lower repeated temperatures.

Let's look at Deniliquin Minraw.
These are the numbers that go into the histogram.
The Minraw temp of 8.9C (top line) repeated 833 times in the data, creating the highest spike because it appeared the most often: it had the highest frequency.




But look what happens: there is a gap of 4 LOW repeated temps, then another HIGH repeated temp (the next one is 9.4C at 739 repeats).
Then there is a gap of 5 LOW then 1 HIGH, then 4 LOW, and so on and so on.

This is RAW data, and it is engineered so that there are consistently alternating gaps of 4 and 5 low numbers between the extreme high spikes. This occurs in most Raw data (at least 80%). It is a major mistake by the BOM; it is a residue, a leftover from tampering.

Recap -- the very high spikes you see in the graph are from a simple histogram without binning, available in most stats programs. It shows us that so-called RAW data, which is supposed to be observational data, actually has an artificial structure that is man-made!

In a dataset from the natural world you don't see a high frequency temperature (high repeat), then 4 very low frequency temps, then a high frequency temp, then 5 low ones, on and on. This is not random; it is engineered, and it is a mistake from one of the BOM algorithms!

Let's look at Bourke Minraw:



Above: Bourke Minraw temperatures, exact same signature.



Let's look at Charters Towers:




Above: the same fraud signature, showing Raw data is not raw but overcooked. These alternating high repeated temps between the low ones are unnatural, and there is no explanation for this except large scale tampering of the Raw data.

This tampering is so extreme it doesn't exist to this level in other climate data from other agencies -- the BOM is the most heavy handed and brazen.

Let's do a quick tour around various stations with just the fraud-busting histogram:






All Raw, all have extreme spikes showing extreme repeats, all have the same 4-5 alternating gaps! All are unnatural.

Now what happens when the Raw get Adjusted?
The spikes get turned down (their frequency, the rate of repeats, is reduced), but the vast body of lower repeated temps is increased.

Look at Deniliquin Minraw: 7.2C is repeated 799 times in Raw but only 362 times in Minv2 Adjusted.
But the low repeats in the gaps are increased!





Let's look at Bourke repeated temps and compare Raw to Adjusted.


Same thing: the highest repeated temps, the spikes in the graph, are reduced by reducing the frequency with which they appear. The low frequency temps are increased.

Tennant Creek, same thing:



What is the net result of doing adjustments on raw data? The high spikes are reduced, getting rid of the evidence.

The low frequency temps (of which there are many more) are increased in frequency giving a net warming effect.

Temperatures are controlled by reducing or increasing the frequency with which they are repeated in the data!

Tennant Creek Maxv2 Adjusted: spikes reduced, they now merge with the gaps, so everything appears more kosher on a cursory inspection.




Below: A different look at how temperature frequency is manipulated up or down.

You can see that in Raw, 15C repeated 827 times, while in Minv2.1 it appears 401 times. This reduces the large spikes in the Adjusted histograms.




Summary Of Histograms Showing Patterns In Raw:
Histograms with zero binning, at a granular level, expose systematic tampering: the RAW data has been engineered with a specific layout in which the highest repeated temperatures are followed by alternating gaps of 4 and 5 low frequency repeated temperatures, followed by a single high frequency repeated temp, and so on.

Normally, for data a bit more subtle than the BOM Raw, Uri Simonsohn's Number Bunching R code is required to detect extreme number bunching or repeating. But the BOM data is so extremely heavy handed that they have left an obvious visual residue from their tampering algorithms. This is proven when comparing their temps to other agencies': none display the extreme spikes and gaps. The BOM really is world-leading with its data tampering.

This is a visual residue of large scale tampering. All adjustments from RAW are moot. Raw Is Very Cooked.


_________________________________________________________________________



Basic Data Visualisation Tool #2

Scatterplots
Histograms are a basic way of visualising a distribution, and the scatterplot is another basic tool. It shows the correlation between two variables and can indicate outliers or big differences between Raw and Adjusted.

It is very handy to see the magnitude of differences for a clue to potential problems.


Above:
This is Richmond, comparing Max Raw and Max Adj. It has a perfect correlation of 1 between the variables; you expect Adjusted to simply follow Raw in this case, as there is no difference between the variables.

But then we get to Minimum temperatures:


Above: Richmond Min Raw and Minv2.1. Look at this... an extremely low correlation between Raw and Adjusted, which means there are extreme differences between Raw and Adjusted.

Look how 29 degrees is turned down to 11.2, and how 25 degrees is cooled to 10.9 degrees. The adjusted temps have been turned into outliers, extreme values at the edge of the data (the outer dots), that should be removed, not inserted!

So if there is perfect correlation of 1 or close to it between Raw and Adjusted, there is little to no difference between Raw and Adjusted.

A lower correlation, say around 0.5 to 0.7, means you can expect big differences between Raw and Adjusted.

Port Macquarie, below, tells a similar story. A 20 year subset was tested where the use of rounding suddenly tripled. Normally, when you find one dodgy procedure in the data, there are many more waiting to be uncovered.

Look at the massive differences between Maxraw and Maxv2.1: the first line shows that 24.1 degrees in Maxraw was turned up to 41.6 in Maxv2.1!

The clue here was the weaker correlation, but also the outliers (circled): values far removed from the main data.

In our case these were adjusted temperatures that were turned into outliers. This is the exact opposite of data preprocessing goals: extreme outliers are normally removed because they can affect data integrity; here they are put in.




The scatterplot works just as well on the full temperature series data. Here is Rutherglen Max Raw and Maxv2.1 with the full data set:



Above: the Rutherglen scatterplot shows lots of temps with extreme differences between Raw and Adjusted. In this case there is an adjustment of 17.9C made on top of Raw. The circled dots are outliers: expect extreme adjustments.

_________________________________________________________________________


"Carefully curating and correcting records is global best practice for analysing temperature data." 
      -- BOM.


PATTERN EXPLORATION A.K.A Copy/Paste

Looking for strange patterns and duplicated sequences.
SAS JMP is responsible for this pattern exploration section; see the video (link) for how this works on pharma data, and how JMP finds anomalous or suspicious values.

JMP computes the probability of finding a sequence at random, depending on the number of unique values, the sample size and so on.

I have only listed sequences here that have less than a 1 in 100 000 chance of occurring at random; a full month copy/pasted gets 100% certainty of fabrication.
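A rough back-of-envelope shows why a fully duplicated month amounts to certainty. Assume, very generously, that each day's reading is an independent draw from k equally likely one-decimal values (both the independence and the value of k are my assumptions, for illustration):

```python
# Odds of a full month duplicating another month by chance, assuming each
# day's temperature is an independent draw from k equally likely values.
k = 50      # plausible distinct daily values for the season (assumed)
days = 31
p_match = (1 / k) ** days
print(f"{p_match:.1e}")  # on the order of 1e-53: effectively impossible
```

Any realistic correlation between consecutive days raises this probability somewhat, but nowhere near enough to make a 31-day exact match plausible.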

One of the most common data fraud modus operandi is duplicating exact number sequences, as well as runs of duplicate numbers and good old fashioned copy/paste. Surely that is not possible in the BOM dataset?

Below: Sydney Min Raw - a full 31 days copy/pasted into the following year.
If this is possible, then anything is possible. And at a major capital city too. It's not as if they didn't have the data; they leave thousands of entries blank sometimes. The correct procedure is to use proper imputation methods or leave the data missing.



And it's not a one-off. Another full Sydney month copy/pasted into the following year.



More Sydney, another month copy/pasted. Notice, this is Raw Data.




Below: Richmond duplicate sequences. Raw as well.




Below: Georgetown duplicated sequences = dodgy Raw data.



Palmerville - over 2 weeks with the exact same temps, to 1/10 of a degree.



Below: Camooweal - I love how they are unsure what to put into 2002-03-05 in Maxv1!




Below: Cairns -- Full month copy/pasted in Max Raw.




Below: Tennant Creek -- paste January temps into March, that'll warm it!




Below: Tennant Creek. 


Below: Port Macquarie
Look at the top week - a change of week on the second day, pasted into another year but the change of week on the second day is mimicked!

At least they are fabricating consistently.



 "The data available via Climate Data Online is generally considered ‘raw’, by convention, since it has not been analysed, transformed or adjusted apart from through basic quality control. "  -BOM


Below: Bourke, copy/paste July Into June, that'll cool it down! 




Below: Charleville
Here's a great way to cool down a month -- copy/paste the entire August temperatures into September.
I kid you not.



I left the best for last. I could go on and on with these sequences, but this is the last one for now; it's hard to beat.

Charleville:
Let's copy the full month of December into December of the following year.
And let's do this for ALL the Raw and Adjusted temperature series.
BUT let's not make it so obvious -- we'll hide it by changing ONE value and DELETING two values.
You've got to love the subtlety here.



"Producing analyses such as ACORN-SAT involves much work, and typically takes scientists at the Bureau of Meteorology several years to complete. " -BOM


Summary of Pattern Exploration
I deleted quite a few sequences in a re-write of the blog because I could go on and on. There are hundreds of sequences ranging from very suspicious to confirmed fabrication. This is a sampling of what is out there.

Charleville is my favourite. Changing 1 value out of 31 in the Minv2 adjusted data above was a masterstroke... they must have found a 'break' through neighbouring stations!

Overall, the sequences show 100% definite data tampering and fabrication on a large scale. What this shows is a complete lack of data integrity and an intent to deceive. A forensic audit is long overdue.

Calling the people that produced this data 'scientists' is a stretch by any standards.


_________________________________________________________________________


"ACORN-SAT data has its own quality control and analysis..." -BOM 


BENFORD'S LAW INDICATES EXCESSIVE DIGIT FREQUENCY

Benford's Law Fraud Analytics
Benford's Law has been used with great success for many years, from money laundering and financial scams to tracking hurricanes and predicting the times between earthquakes, and it has been accepted into evidence in a court of law in the USA.

Benford's Law can be applied to ratio- or count-scale measures that have sufficient digits and that are not truncated (Hill & Schürger, 2005).

It describes the distribution of digits in many naturally occurring circumstances, including temperature anomalies (Sambridge et al, 2010). 

Some novel innovations to increase the accuracy of Benford's Law have been developed in this paper, and have been correlated and validated against an actual forensic audit done at the World Bank.

If a data distribution should follow Benford's Law and it doesn't, something is going on with the data. It is a red flag for an audit, and the data is likely to have been tampered with.
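The two standard checks used throughout this section, a chi-squared goodness-of-fit test and Nigrini's Mean Absolute Deviation, are straightforward to sketch. This is a first-digit Python version for illustration; Nigrini's conformity cutoffs differ by test and are not hard-coded here:

```python
import math
from collections import Counter

# Expected Benford first-digit proportions.
BENFORD_1 = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def _digits(values):
    """First significant digit of each nonzero value."""
    return [int(f"{abs(v):e}"[0]) for v in values if v != 0]

def benford_mad(values):
    """Nigrini's Mean Absolute Deviation between observed and expected
    first-digit proportions (larger = worse conformance)."""
    d = _digits(values)
    n = len(d)
    counts = Counter(d)
    return sum(abs(counts[k] / n - p) for k, p in BENFORD_1.items()) / 9

def benford_chi2(values):
    """Chi-squared statistic against the Benford expectation; compare to
    a chi-squared distribution with 8 degrees of freedom for a p-value."""
    d = _digits(values)
    n = len(d)
    counts = Counter(d)
    return sum((counts[k] - n * p) ** 2 / (n * p) for k, p in BENFORD_1.items())
```

Powers of 2 are a classic Benford-conforming series and give a small MAD; a series whose first digits are all the same fails both measures spectacularly.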

The first graph below shows Hobart, Sydney, Melbourne, Darwin and Mildura Maxv2 combined, for 200 000 data points. Running a Benford's Law analysis on the first two digits produces weak conformance based on Nigrini's Mean Absolute Deviation parameter.
 
This supports the hypothesis that Benford's Law is the appropriate theoretical distribution for our dataset. Importantly, this does not indicate that the data is legitimate, as pooled data may cancel out different individual signatures of manipulation and replicate Benford's Law (Diekmann 2007; Ensminger and Leder-Luis 2020).

Below: 5 cities aggregated, with the Benford's Law curve (red dotted line).
The individual spikes/gaps indicate excessive overuse/underuse of specific numbers.


The curve above, with all the data aggregated, still shows bias: the low numbers 10-15 appear too few times, 17-45 appear too often, then specific high numbers appear with too low a frequency, with a few high numbers popping up slightly.


___________________________________________________________________________________



Benford's Law on Individual Stations.
Below: Deniliquin, Raw + Adj

Looking at the entire temperature series from 1910-2018 and using the first two digit values in a Benford analysis shows extreme non-conformance and a tiny p-value in the Max Raw and Max Adjusted data. 

You can see the high systematic spikes in the graph indicating excessive specific digit use in temperature anomalies.

The tiny p-value, less than 2.2e-16, indicates rejection of the null hypothesis that this data set follows Benford's Law. In other words, there is something wrong with the data.



Below: Min Raw and Adjusted Minv2 Temps for Deniliquin.
These are extreme biases in a 39 000 point data set, and they suggest tampering. The high frequency "spikes" are temps that are repeated a lot; they are also evident in the histograms.





Specific Months

Some months are much more tampered with than others. Not all months are treated equally.

Below: Deniliquin Max Raw for January, all the days of January 1910-2019 are combined for a total of about 3300 days. This graph is screaming out, "audit me, audit me."


Below is Deniliquin Max Raw for July; all the days of July from 1910-2018 were combined to give about 3300 days. These are astounding graphs that show extreme tampering of RAW data.


Below: Deniliquin Min Raw for July.
This is max and min RAW data we have been looking at. 
You are unlikely to find worse conforming Benford's Law graphs anywhere on the internet. This is as bad as it gets.
This is a massive red flag for a forensic audit.


SOME RANDOM BENFORD'S LAW GRAPHS

Below: Mackay Min Raw For July




Below: Amberley Max Raw, All Data. Systematic tampering.



Below: Amberley January Min Raw. All the days of January. 




_________________________________________________________________________



Amberley Month By Month.

Stratifying the months for Amberley shows which months have the most tampering. The results are p-values.

Keeping the same significance level as the BOM, any result less than 0.05 indicates rejection of the null hypothesis of conformance to Benford's Law. In other words: it should follow Benford's Law, it doesn't... tampering is likely.
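The month-by-month stratification is just a group-by over the daily anomalies. A hedged Python sketch of the shape of that loop, reporting the chi-squared statistic per month (a p-value would then come from comparing each statistic against a chi-squared distribution with 8 degrees of freedom for the 1-digit test):

```python
import math
from collections import Counter, defaultdict

BENFORD_1 = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def chi2_stat(digits):
    """Chi-squared statistic of first digits against Benford."""
    n = len(digits)
    counts = Counter(digits)
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in BENFORD_1.items())

def by_month(records):
    """records: iterable of (month, anomaly) pairs. Returns month ->
    chi-squared statistic of the anomalies' first digits."""
    groups = defaultdict(list)
    for month, anom in records:
        if anom != 0:
            groups[month].append(int(f"{abs(anom):e}"[0]))
    return {m: round(chi2_stat(d), 1) for m, d in sorted(groups.items())}
```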

Minraw 1 digit test, p-values
All 2.267687e-147
jan  1.106444e-17
feb  1.884201e-17
mar  7.136804e-11
apr  1.171959e-06
may  5.280244e-21
jun  5.561890e-28
jul  3.042741e-24
aug  1.439602e-32
sep  3.522860e-19
oct  9.930470e-25
nov  2.039136e-14
dec  4.546736e-23

This shows all the months aggregated for Minraw, as well as the individual months. It shows August and June as the worst offenders, followed by October. April is the 'best' month. As with Bourke, August gets major cooling.

Amberley Minv2 Adj 1 digit
All 7.701986e-192
jan  5.367620e-47
feb  1.269502e-25
mar  3.116875e-30
apr  8.924123e-24
may  9.250971e-26
jun  2.388032e-20
jul  2.889563e-38
aug  2.039597e-22
sep  1.678454e-19
oct  4.009116e-26
nov  6.251654e-15
dec  1.563074e-28

This compares the Minv2 adjusted data and shows that the adjustments overall ("All" at the top of the list) are worse than Raw, which is pretty bad by itself.

January and July are the most heavily manipulated months.

 Amberley Maxraw 1 digit
All 6.528697e-217
jan  4.243928e-74
feb  3.451515e-48
mar  1.279319e-52
apr  1.141334e-69
may  4.425933e-58
jun  1.069427e-58
jul  3.903140e-49
aug  9.602354e-70
sep  2.312850e-53
oct  3.374468e-63
nov  5.669760e-48
dec 5.804254e-100

Overall, Maxraw data is worse than Minraw data. 

Amberley Maxv2 Adj 1 digit
All 2.701983e-234
jan  2.309923e-83
feb 2.012154e-103
mar  1.492867e-56
apr  8.215013e-52
may  2.721058e-35
jun  9.487054e-40
jul  2.774663e-59
aug  7.915751e-47
sep  2.796343e-69
oct  1.096688e-39
nov  6.902012e-48
dec  1.814576e-68

Once again, adjustments are worse than Raw. February takes over from January with extreme values.

These results from the Benford's Law first digit test show that adjusted data is worse, i.e. less compliant with the Benford distribution, than Raw.



_________________________________________________________________________



Tracking Benford's Law For First Digit Value Over Years 

Amberley Minv2 Adj Data 1942-2017
The University of Edinburgh have created a smooth Bayesian model that tracks the performance of first digit values against Benford's Law over time, so you can see exactly when first digit probabilities increase or decrease. In effect, this lets you see how the first digit values of the temperature anomalies change over time.

Running this model with the temperature anomalies from Minv2, with all the data, took 15 minutes on a laptop and produced the graph below:

Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline the digits should follow to conform to Benford's Law. 1980 would be just after 40 on the X axis.

This graph shows that the first digit with value 1 has always been underused. Too few ones are used in Minv2 temp anomalies.

It became slightly more compliant in the 1950's then worsened again. There are far too few 1's in first digit position of temperature anomaly Minv2.

There are too many 2's and 3's right from the beginning at 1942, but use of 2's lessens (thus improving) slightly from the 1980's onwards. But use of 4's increases from the 1980's.

The values 8 and 9 in the first digit position have always been underused. The value 9 is underused from the 1990's onwards.

These digit values indicate less conformance after the 1980 adjustments.





Below: Amberley 2 digit test for all data indicates large scale tampering.




Above:
Amberley testing using the first 2 digit test. On the left is the raw data. Already the Min Raw is noncompliant with Benford's Law, giving a tiny p-value, so we reject the null hypothesis of conformance. This is not natural data; it has been heavily manipulated already. There is a big shortfall of the values 10-15, too many numbers around 24-38 and in the 40's, then methodical spikes in the 50-90 range, with big gaps signifying shortfalls.

But look what happens AFTER the adjustments are made, on the right: the Minv2 data is far less compliant, with the digit values from 22-47 greatly increased in frequency and a tapering off around 87-97.

Below: Bourke Max Raw and Maxv2.1. 
Adjustments make the data 'worse', if that is possible.
Extreme adjustments in both Raw and Adj data, with consistent underuse of lower digits and overuse of higher digits.




Below: Bourke Min Raw temperature anomalies vs. Minv2.1.
Nonconformance.




Below: Mackay January, first-digit Benford's test, raw data.

This shows how much tampering has gone into January and July.



FINALLY, BELOW:
This is a wall hanger -- the beauty of 'naturally' occurring observational data in an outstanding pattern that is shouting, "audit me, audit me!"
Almost as if the BOM are getting into fractals.

Below: Sydney data indicates engineered specific number use; certain numbers are repeated consistently.




The BOM has obviously never heard of Benford's Law. This shows engineered specific numbers, at specific distances, that are over- and under-used. These are man-made fingerprints showing patterns in RAW data that do not occur in natural observations.


_________________________________________________________________________



BENFORD'S LAW ANALYSIS ON GLOBAL  TEMPERATURE ANOMALIES
-- THE END GAME IN CLIMATE CHARTS

The global temperature graphs shown by various climate agencies generally carry no significance levels, error bounds or confidence intervals, and are the result of averaging daily temperature anomalies into months, which are averaged into years, which are then averaged across the 112 stations in Australia, or many more worldwide. Here is an example of NASA GISS data. (link).

These are the primary graphs that are used by BOM in media releases.

And these are the graphs they use to argue that 15C Adjustments ( and more at some stations) don't matter.

How reliable are they?

This is what BOM Global anomalies look like when analysed with Benford's Law:



An underuse of 1s, and a large overuse of 2s, 3s, 4s and 5s.
Nonconformance with a p-value less than 2.16e-16.
This means the data has likely been heavily tampered with.

The BOM talks about its data being robust because it matches other agencies:


Below: NASA GISS global anomalies.
Overuse of 3s, 4s, 5s, 6s, 7s and 8s -- even worse than BOM, if that is possible.



US agency NOAA's global anomalies are weakly conforming.
Still an overuse of 6s, 7s and 8s.





Where we really begin to go into la-la land is the land/ocean global anomalies: it's apparent that this is just modeled data, not real data.



Below: NOAA global land/ocean with 2 digit test for Benford's Law.


RESULTS
None of the global temperature anomalies can be taken seriously. This is obviously (badly) modeled data, fit for entertainment purposes only. The global temperature anomalies fail conformance too.


__________________________________________________________



EXTREME ROUNDING OF TEMPERATURES


The Case Of The Disappearing Decimal Digits.
a.k.a. Rounding To 1C As Required

This is the curious case of the disappearing decimal digits due to rounding up and down. Some stations such as Deniliquin show 4 years of ONLY rounded temperatures! 

Some stations have 25 years where rounding increases from 10% to 70% -- rounded up or down depending on whether warming or cooling is required. 

Below: Deniliquin. This is the adjusted data, Maxv2 and Minv2 -- and has it been adjusted. From 1999-2002, that's 1461 days, there were fewer than 30 days with decimal values.

That's 98% rounding for Maxv2 and Minv2!
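The rounding check itself is simple to reproduce: count, per year, the share of readings that carry no decimal part. A minimal sketch with hypothetical readings (not the Deniliquin files):

```python
from collections import defaultdict

def rounding_fraction_by_year(records, tol=1e-9):
    """records: iterable of (year, temp) pairs. Returns, per year, the
    fraction of temperatures that are whole degrees (no decimal part)."""
    whole, total = defaultdict(int), defaultdict(int)
    for year, temp in records:
        total[year] += 1
        if abs(temp - round(temp)) < tol:  # no decimal part
            whole[year] += 1
    return {y: whole[y] / total[y] for y in total}

# hypothetical mini-sample: one mixed year, one fully rounded year
sample = [(1998, 21.4), (1998, 19.0), (1998, 23.7), (1998, 18.2),
          (1999, 21.0), (1999, 19.0), (1999, 23.0), (1999, 18.0)]
print(rounding_fraction_by_year(sample))
```

A sudden jump in this fraction from roughly 10% to near 100% for a block of years is exactly the pattern described at Deniliquin and Bourke.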




Below: a graphical view of rounding of Max Raw temperatures by year. Notice the spike where the rounding kicks in. Obviously special attention was given in those years.



This comes on the heels of the review panel advising the BOM that its thermometers/readings needed to meet world standards, tightening the tolerance from 0.5C to 0.1C.

"However, throughout the last 100 years, Bureau of Meteorology guidance has allowed for a tolerance of ±0.5 °C for field checks of either in-glass or resistance thermometers. This is the primary reason the Panel did not rate the observing practices amongst international best practices." - BOM


Below: I bring into evidence Bourke, with 92% rounding in the exact same years.



Below: this time, for variety, the Min Raw and Max Raw temperatures are all rounded up and down depending on the required outcome, warming or cooling. We already know the Raw data has been tampered with/fabricated on a large scale; this is an extra bit of tweaking.





Below: A graphical view of rounding at Bourke.



Unlike Deniliquin, with its single large spike in rounding activity, Bourke gets two stages: a big block of twenty years or so where the rounding triples, then the exact same years as Deniliquin, 1999-2002, where extreme rounding occurs.


Below: even Sydney gets 3 years of complete rounding except for 2 days, from 1999-2001 -- the same time period as Deniliquin and Bourke.




Below: Rutherglen Increases In Rounding Per Year. 




Below: More Rutherglen.


There are more cases of extreme rounding, but I haven't listed them all because it becomes unmanageable.

This is yet another technique the BOM uses to engineer its data toward a particular result.


__________________________________________________________________________________



EXTREME OUTLIERS ADDED IN


The Power Of Subset Data.
The key to analysing much of the BOM temperature series is to subset the data -- to break it into key chunks which show interesting things.

The Raw and Adjusted data sets have been engineered to look roughly similar on a cursory look at the histogram. This is because the BOM uses quantile-matching algorithms to ensure the data distributions stay roughly similar.

A method that has been used with success in anomaly detection is unsupervised learning with the K-Means clustering algorithm, which lumps similar data together into clusters; Decision Trees are then run on the clusters to get an explanation for the clustering.

In effect, this reverse engineers the BOM's more subtle changes, if there is such a thing...
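The clustering step can be sketched without any libraries (in practice scikit-learn's KMeans and DecisionTreeClassifier would handle both steps). A toy 1-D version of Lloyd's algorithm, run on hypothetical adjustment values where a few extreme adjustments separate cleanly from the ordinary ones:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm for 1-D data: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda j: (p - centroids[j]) ** 2)
                  for p in points]
        # update step: each centroid moves to the mean of its cluster
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels

# hypothetical daily adjustments: mostly small, a few extreme
adjustments = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 3.1, 2.9, 3.3, -0.3]
centroids, labels = kmeans(adjustments, k=2)
print(sorted(round(c, 2) for c in centroids))
```

With k=2 the extreme adjustments fall into their own cluster; a decision tree fitted on the cluster labels would then express the split as a readable threshold rule.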

Bourke Minv2 -- extreme outliers revealed by subsetting data.
First, the missing-data table below shows the missing patterns. There are 188 cases where there is no data in Min Raw but values are present in Minv2.1.
So where do these numbers come from? Are they carefully imputed from the data or from neighbour stations?

Don't be silly -- why waste empty space... the BOM used it to create near-record-breaking outliers!



Normal data preprocessing removes extreme outliers, but in this case they have been put into the Adjusted data where there is no Raw!

Outliers are extreme values that don't match the rest of the values and can cause problems in a dataset.

As well as enhancing warming/cooling.





The Box plot below confirms the high number of outliers added into Minv2.




Below: Min Raw is missing, but the 7th-highest minimum temperature on record has been put in. There are 188 variables that are missing in Raw but have been filled into Minv2.1.

No argument can be made that this is routine data imputation, because these are large outliers, head and shoulders above the rest of the temperatures.

Nearly record-breaking outliers are added into Minv2.1 where Raw is missing.
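The box-plot test applies Tukey's rule: anything beyond 1.5 times the interquartile range outside the quartiles is flagged as an outlier. A minimal sketch with hypothetical minima, where two imputed extremes stand clear of the rest:

```python
def iqr_outliers(values):
    """Tukey's box-plot rule: flag points beyond 1.5*IQR of the quartiles."""
    s = sorted(values)
    def quantile(q):
        # linear interpolation between order statistics
        pos = q * (len(s) - 1)
        lo = int(pos)
        frac = pos - lo
        return s[lo] + (s[min(lo + 1, len(s) - 1)] - s[lo]) * frac
    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo_fence or v > hi_fence]

# hypothetical minima: typical values plus two imputed extremes
temps = [11.2, 12.0, 12.4, 11.8, 12.1, 11.5, 12.3, 11.9, 24.8, 3.1]
print(iqr_outliers(temps))
```

On a tightly bunched series like this, even one injected near-record value lands far outside the fences, which is exactly what the box plot above makes visible.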






By having empty variables in Raw, the BOM has the option of imputing or interpolating any values it wants into the Adjusted temperature series, tweaking cooling/warming as required. No wonder their ad hoc imputation is undocumented.

Creating extreme outliers by imputation = deceit.


__________________________________________________________________________________


DUBIOUS ADJUSTMENTS

___________________________________________________________________________________


 "The primary purpose of an adjusted station dataset it to provide quality station level data for users, with areal [sic] averages being a secondary product." -BOM


The complete Amberley temperature series, both raw and adjusted, is below. The orange graph is the raw temperature; the blue is the cooled-down adjusted version. By cooling the past, a warming trend is created. Notice that cooling in the adjusted data stopped around 1998.




___________________________________________________________________________________


A Summary Of The Amberley Problem:

(1) A dip in the temperature series in 1980 (an 'inhomogeneity detected') made them decide the station had been running warm because it no longer matched its neighbours -- therefore it was cooled down significantly from 1942 to 1998. ???

(2) The unspecified 'neighbour stations' were totalled at 310 by NASA and several dozen by the BOM.

The stations involved were vague and non-transparent, and so unable to be tested.

(3) Conveniently, a warming trend had been created where there was none before.

(4) In 1998 the station mysteriously returned to normal and no more significant adjustments were required.

(5) The following iteration of the temperature series, from Minv1 to Minv2, resulted in them now warming the cooled station after 1998.


__________________________________________________________________________

No evidence supplied, no documentation on the 'neighbours' involved, no metadata. If you think about this, you realise it's nonsense.


The review panel from 10 years ago had problems with the methodology too -- 

"C7 Before public release of the ACORN-SAT dataset the Bureau should determine and document the reasons why the new data-set shows a lower average temperature in the period prior to 1940 than is shown by data derived from the whole network, and by previous international analyses of Australian temperature data." 

Also:

"C5 The Bureau is encouraged to calculate the adjustments using only the best correlated neighbour station record and compare the results with the adjustments calculated using several neighbouring stations. This would better justify one estimate or the other and quantify impacts arising from such choices."

Using only the 'best correlated neighbour stations' has obviously confused Gavin Schmidt from NASA; he used 310 neighbours (see Jennifer Marohasy's blog). Dr. Jennifer Marohasy documents the whole dubious adjustment saga in detail.

The BOM were eventually forced to defend their procedures in a statement:

"Amberley: the major adjustment is to minimum temperatures in 1980. There is very little available documentation for Amberley before the 1990s (possibly, as an RAAF base, earlier documentation may be contained in classified material) and this adjustment was identified through neighbour comparisons. The level of confidence in this adjustment is very high because of the size of the inhomogeneity and the large number of other stations in the region (high network density), which can be used as a reference. The most likely cause is a site move within the RAAF base."

Obviously their level of confidence wasn't that high, because they warmed their cooled temperatures back up somewhat in the next iteration of the temperature series data set (from Minv1 to Minv2).

Update: Minv2.1
Warming continues from iteration Minv2 to Minv2.1 by slightly increasing the frequency of temperature repeats. Every iteration gets warmer.




_______________________________________________________________________________



This whole situation is ludicrous, and you get the feeling that the BOM has been caught in a lie. There are several ways to check the impact of the adjustments, though.

(1) Benford's Law before and after adjustments
(2) Control Charts before and after adjustments
(3) Tracking first-digit values from 1942-2018 to see if we can spot digit values changing, using a smooth Bayesian model from the University of Edinburgh.


__________________________________________________________


AMBERLEY TEST 1


Benford's Law
Below: raw and adjusted data are compared from 1942-1980 using Benford's law first-digit analysis.


Using Benford's law for first-digit analysis, we can see the adjustments make the data worse, with lower conformance and a smaller p-value.

Below: Benford's law, first 2 digits, for January and July.
The graphs are as bad as anything you are likely to see and would trigger an automatic audit in any financial setting.




___________________________________________________________________________________


AMBERLEY TEST 2

Basic Quality Control - The Control Chart
Besides Benford's Law, let's use Control Charts to get a handle on the Amberley data and get a second opinion.

I put the Min Raw and Minv2.1 temperature data into a Control Chart, one of the seven basic tools of quality control.

The temperature series was already 'out of control' in the raw sequence, but not in 1980. There are 11 warning nodes where the chart is over or under the 3-sigma limit, but after adjustments this nearly doubles: there are many more warning nodes and the temperature sequence is more unstable.
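For an individuals control chart, sigma is conventionally estimated from the mean moving range (MR-bar divided by the constant 1.128), and any point outside mean ± 3 sigma is flagged. A minimal sketch with hypothetical values, not the Amberley series:

```python
def control_chart_breaches(series):
    """Individuals control chart: flag indices outside mean +/- 3 sigma,
    with sigma estimated as (mean moving range) / 1.128."""
    n = len(series)
    mean = sum(series) / n
    moving_ranges = [abs(series[i] - series[i - 1]) for i in range(1, n)]
    sigma = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    ucl, lcl = mean + 3 * sigma, mean - 3 * sigma
    return [i for i, v in enumerate(series) if v > ucl or v < lcl]

# hypothetical stable sequence with one injected shift at index 8
series = [20.1, 19.9, 20.2, 20.0, 19.8, 20.1, 20.3, 19.9, 26.0, 20.0]
print(control_chart_breaches(series))
```

A stable ('in control') process should breach the 3-sigma limits only about 0.3% of the time, so a doubling of warning nodes after adjustment is the opposite of what a cleanup should produce.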




__________________________________________________________________________


AMBERLEY TEST 3


Tracking Benford's Law First-Digit Values Over Time
Amberley Minv2 Adj Data 1942-2017

The University of Edinburgh has created a smooth Bayesian model that tracks the performance of first-digit values for Benford's Law over time, so that you can see exactly when first-digit probabilities increase or decrease.

In effect, this allows you to see how the value of the first digit in a temperature anomaly changes over time.

Running this model on the Minv2 temperature anomalies took 15 minutes on a laptop and produced the graph below:

Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline the digits should follow to conform to Benford's law. 1980 would be just after 40 on the X axis.

This graph shows that the first digit with value 1 has always been underused. Conformance improved slightly in the 1950s, then worsened again. There are far too few 1s in first-digit position of the Minv2 temperature anomalies.

There are too many 2s and 3s right from the beginning in 1942, but the use of 2s lessens (thus improving) slightly from the 1980s onwards, while the use of 4s increases from the 1980s.

The 8s and 9s in first-digit position have always been underused, with 9 underused from the 1990s onwards.
These digit values indicate less conformance after the 1980 adjustments.





___________________________________________



MORE DODGY ADJUSTMENTS


Bourke Adjustments Of 0.5C = Less Than Sampling Variation
The BOM released a statement about the adjustments made at Bourke:

"Bourke: the major adjustments (none of them more than 0.5 degrees Celsius) relate to site moves in 1994 (the instrument was moved from the town to the airport), 1999 (moved within the airport grounds) and 1938 (moved within the town), as well as 1950s inhomogeneities that were detected by neighbour comparisons which, based on station photos before and after, may be related to changes in vegetation (and therefore exposure of the instrument) around the site."


Looking at Bourke, below:


This is strange, because there are lots of adjustments in 1994, 1999 and 1938 that are far more than 0.5 degrees -- some are over 3C.

But maybe the vague language about the 1950s is where the low adjustments are -- well, that depends on the year, which is not specified.


Here are the biggest adjustments in the time series for Bourke:



So the 'none of them more than 0.5 degrees Celsius' is meant for some unknown years in the 1950s; it seems to be misdirection to distract from the bigger adjustments all along the time series.

But look at the months column -- so many August entries that I had to look more closely.

So there were more than 4400 adjustments over 2C in the time series. That's the subset we'll look at --



Look at August -- half of all the adjustments over 2 degrees for 1911-2019 were in August!

August is getting special attention by the BOM in Bourke with a major cooling of Minimum temperatures. 

Getting back to the '0.5 degree adjustments in the 1950s' -- this is nonsense because:

These are the statistics for 1950-1959:

The mean Min Raw temp is 13.56 degrees.
A single mean figure contains sampling variation and does not give a true picture.

 
Putting Bourke Min into a Control Chart (below) shows what the real problems are. The upper red line is the upper 3-sigma limit, the lower one is the lower 3-sigma limit; temps will vary between the red lines 99.7% of the time unless the process is 'out of control.'

You can tell something is wrong with Bourke from the number of nodes breaching the upper and lower limits at the beginning and end of the series; the 1950s don't even register. How could they? We are talking 0.5C.

Look at 2010 -- it is off the chart, literally... there is an extremely remote chance of seeing this event at random.

In Control Chart language, this temperature series is 'Out Of Control'; there is something very wrong with it.



Above: Control Chart for Bourke showing the system is 'out of control' from the beginning, but the 1950's are not the problem.



___________________________________________________________________________________




Statistical Significance 1980-2009
Every Decade Warmer?

Very often the BOM displays graphs and charts without error bounds or confidence intervals. The statistical significance is implied.

Given that there are problems with past historic temperature series, what if we could test just the best, most recent results -- from modern fail-safe equipment -- for statistical significance?

A hypothesis like this is easy to test. Dr Colin Morice from the Met Office Hadley Centre:

"Each decade from the 1980s has been successively warmer than all the decades that came before."

We can use the Non-Parametric Combination (NPC) test, with R code from Devin Caughey at MIT.

This technique is common in brain-mapping labs because no assumptions are made about the distribution, inter-dependencies are handled, and multiple tests are exactly combined into a single p-value. A great signal-to-noise ratio and the ability to handle very small sample sizes make this the ideal candidate for testing the hypothesis.

The null hypothesis: the decades after 1980 are NOT successively warmer.
The alternate hypothesis: each decade since 1980 has been warmer than the last.

We'll use the temps from Berkeley Earth.
The data will be:
1980-1989
1990-1999
2000-2009

The output of NPC is a p-value after exactly combining the sub-hypotheses. In keeping with the BOM, we use the 95% significance level, so anything with a p-value LESS than 0.05 has the null hypothesis rejected.
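The NPC machinery itself is Caughey's R package; as a rough stand-in for readers who want the flavour of it, a simplified one-sided permutation test of "later decades are warmer" can be sketched in a few lines. The decade values below are hypothetical, not Berkeley Earth data:

```python
import random

def permutation_trend_test(decades, n_perm=2000, seed=0):
    """One-sided permutation test of 'later decades are warmer'.
    decades: list of lists of annual temperatures, in time order.
    Statistic: sum over ordered decade pairs of (later mean - earlier mean).
    This is a simplified stand-in for the full NPC combination."""
    rng = random.Random(seed)
    def stat(groups):
        means = [sum(g) / len(g) for g in groups]
        return sum(means[j] - means[i]
                   for i in range(len(means))
                   for j in range(i + 1, len(means)))
    observed = stat(decades)
    pooled = [x for g in decades for x in g]
    sizes = [len(g) for g in decades]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break any real time ordering
        groups, pos = [], 0
        for s in sizes:
            groups.append(pooled[pos:pos + s])
            pos += s
        if stat(groups) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # permutation p-value

# hypothetical anomalies with no real trend across three 'decades'
flat = [[0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 0.1, -0.3, 0.2, 0.0],
        [0.0, 0.2, -0.1, 0.1, 0.3, -0.2, 0.0, 0.1, -0.1, 0.2],
        [0.2, -0.1, 0.0, 0.1, -0.2, 0.3, 0.1, 0.0, -0.1, 0.2]]
print(permutation_trend_test(flat))
```

A trend-free series returns a large p-value (fail to reject the null), while a genuinely warming sequence of decades drives the p-value toward zero; no normality assumption is needed, which is the same selling point as NPC.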
 
The results using Berkeley Earth temps (except NOAA, which is from NOAA) are:

Berkeley Earth temps:
h0 = !(1>2>3) -- null hypothesis
h1 = 1>2>3, each decade warmer -- alternate hypothesis

Don't Reject The Null -- each decade is NOT getting warmer:
Alice Springs p-value = 0.4188
Amberley p-value = 0.3326
Tennant Creek p-value = 0.7159
Benalla p-value = 0.4085
Bering p-value = 0.1651
Capetown p-value = 0.2872
Corowa p-value = 0.1776
Darwin p-value = 0.5984
DeBilt, Netherlands p-value = 0.146
Deniliquin p-value = 0.4067
Echuca p-value = 0.3645
Launceston p-value = 0.3331
Mawson p-value = 0.3043
Mildura p-value = 0.2888
Mt. Isa p-value = 0.5782
NOAA Southern Region p-value = 0.2539 
Nowra p-value = 0.2141
Rutherglen p-value = 0.2283
Sale p-value = 0.3685
Tamworth p-value = 0.2407
Wangaratta p-value = 0.277

Reject The Null -- each decade is getting warmer:
Beechworth p-value = 2e-04
Hobart p-value = 3e-04


Beechworth is less than 40 km from Wangaratta, yet it decisively rejects the null while Wangaratta does not! Similarly with Hobart and Launceston, which are 2 hours apart.

This shows that the premise from the Met Office Hadley Centre is wrong for our sample. Using Berkeley Earth temps, a random sample of stations, and the NPC test to calculate significance without assuming a normal distribution, we fail to support the alternate hypothesis in most cases.

Going over the results again, I found most country stations fail to reject the null, while the capital cities, being urban heat islands, decisively reject it and agree with the Met Office Hadley Centre.

This shows two things:
Statistical significance, confidence intervals and error bounds are mostly ignored in climate presentations.

Don't trust everything you hear -- test, test, test!

As an aside:
Here are 40,000 coin tosses documented at Berkeley, with heads counted as +1 and tails as -1:




I took the first 1000 tosses from their supplied spreadsheet, graphed them and plotted a trend.

There's even a 95% error boundary, which is more than the BOM supplies on most of its trends.

Moral of the story: even a sequence of coin tosses can show a trend.
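The exercise is easy to reproduce without the spreadsheet: simulate 1000 fair tosses, accumulate the running total, and fit an ordinary least-squares line through it (simulated tosses here, not the Berkeley data):

```python
import random

# simulate 1000 fair tosses: heads = +1, tails = -1
random.seed(42)
tosses = [random.choice((1, -1)) for _ in range(1000)]

# running total -- the 'graph' of the random walk
cumulative = []
total = 0
for t in tosses:
    total += t
    cumulative.append(total)

# ordinary least-squares slope through the running total
n = len(cumulative)
x_mean = (n - 1) / 2
y_mean = sum(cumulative) / n
slope = (sum((x - x_mean) * (y - y_mean) for x, y in enumerate(cumulative))
         / sum((x - x_mean) ** 2 for x in range(n)))
print(round(slope, 4))
```

The fitted slope is essentially never exactly zero, so a trend line through pure noise still "shows a trend" -- which is exactly why a trend without a significance test means nothing.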



__________________________________________________________




***UPDATES COMING ***
19 Jan 2021
Over The Next Few Days/Weeks

coming-
NEW evidence on missing raw data being imputed/estimated/fabricated into Adjusted data that breaks records. In one case, 10,000 missing raw data points are imputed/estimated/fabricated into Adjusted data with yearly records.


Count or ratio data conforms to Benford's Law. A NEW way to use temperatures in the histogram graphs that test for Benford's Law, without the need to convert them to temperature anomalies. This tests the histogram of repeated temperatures for Benford's Law conformance!


Statistical significance tests using climate data from Berkeley Earth. These show that data from many/most stations does not even reach statistical significance when testing this hypothesis --

Dr Colin Morice from the Met Office Hadley Centre:
"Each decade from the 1980s has been successively warmer than all
the decades that came before." 

We do this using the Non-Parametric Combination test from MIT, which makes no assumptions about the distribution and automatically accounts for inter-dependencies.


__________________________________________________________



This blog has taken quite a few months to research and write. The more I dug into the data, the more rotten it was. And I am still digging. It is a shocking case of extreme data tampering and fabrication. 

If it were financial data, it would be on a larger scale than Enron (check the Enron Benford curves in my first post); the fabrication/duplication is larger than that of Prof. Stapel, who retracted his studies and was fired by Tilburg University (and who said his techniques were 'commonly in use' in research labs).

This has to be a wake-up call for the Government to launch a forensic audit. The BOM cannot be trusted with the temperature record; it should be handed over to a reputable organisation like the Bureau of Statistics.

It's obvious the BOM either doesn't know or doesn't want to know about data integrity. This isn't science. The Brits have a term for this -- Noddy Science.

__________________________________________________________________________


More to follow in other posts; there is much to write about in relation to climate data. It's making the tulip frenzy of the 1600s look like a hiccup.






© ElasticTruth
