**Summary:**

**Prelude:**

*Garbage In, Garbage Out.* The phrase expresses the idea that flawed or incorrect input data will always produce faulty output. It has been claimed that 90% of the world's data has been created in the last two years (Horton, 2015), making it even more critical to check data integrity.

The Australian Government announced in 2016 that it had committed $2.55 billion to carbon reduction and another $1 billion to help developing countries reduce their carbon dioxide emissions. (Link)

The premise behind the spending is that worldwide temperatures have risen to dangerous levels and that the rise is caused by man-made emissions. The most cited dataset used to prove this is HadCRUT4 from the Met Office Hadley Centre, UK, and before 2017 this data had never had an independent audit.

John McLean's dissertation (McLean, John D. (2017) *An audit of uncertainties in the HadCRUT4 temperature anomaly dataset plus the investigation of three other contemporary climate issues.* PhD thesis, James Cook University) showed comprehensively how error-ridden and unreliable this dataset actually was. (Link)

In Australia, the Bureau of Meteorology (BOM) created and maintains the Australian Climate Observations Reference Network-Surface Air Temperature (ACORN-SAT) dataset, which *"provides the best possible dataset for analyses variability and change of temperature in Australia."*

This dataset has also never had an independent audit despite claims that *"The Bureau's ACORN-SAT dataset and methods have been thoroughly peer-reviewed and found to be world-leading."* (Link)

The review panel in 2011 assessed the **data analysis methodology** and compared the temperature trends to *"several global datasets"*, finding they *"exhibited essentially the same long term climate variability"*. This *"strengthened the panel's view"* that the dataset was *"robust"*. (Link)

**Benford's Law**

**It has been shown that temperature anomalies conform to Benford's law**, as do a large number of natural phenomena and man-made data sets. (Benford's Law In The Natural Sciences, M.Sambridge et al 2010)

Benford's law has been widely applied to many varied data sets for statistical fraud and data integrity analysis, yet surprisingly has never been used to analyse climate data.

Some examples of Benford's Law applications: *(Hill, 1995a; Nigrini, 1996; Leemis, Schmeiser, and Evans, 2000; Bolton and Hand, 2002; Applying Benford’s law to detect fraudulent practices in the banking industry, Theoharry Grammatikos and Nikolaos I. Papanikolaou, 2015; Benford’s Law in Time Series Analysis of Seismic Clusters, Gianluca Sottili, 2015; Schräpler, Jörg-Peter (2010): Benford's Law As an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel, 2019; Using Benford’s law to investigate Natural Hazard dataset homogeneity, Renaud Joannes-Boyau et al, 2015; Identifying Falsified Clinical Data, Joanne Lee, George Judge, 2008; self-reported toxic emissions data (de Marchi and Hamilton, 2006), numerical analysis (Berger and Hill, 2007), scientific fraud detection (Diekmann, 2007), quality of survey data (Judge and Schechter, 2009), election fraud analysis (Mebane, 2011))*

**Benford's Law states that the leading digit 1 will occur with a probability of 30.1%** for many naturally occurring datasets such as the length of rivers, the distance travelled by hurricanes and street addresses, and also for man-made data such as tax returns and invoices, making this a very useful tool for accounting forensics.

The **leading digit** takes the value of 1 about 30.1% of the time, the value of 2 about 17.6% of the time, and so on, see table below. So the probability that a street address begins with a 1 or a 2 is 47.7%, meaning nearly half the population live at such an address. Essentially this means that in the universe there are more ones than twos, more twos than threes and so on.
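The percentages quoted above follow from the standard Benford formula, P(d) = log10(1 + 1/d). A minimal sketch (the function name is chosen here for illustration only) that reproduces them:

```python
import math


def benford_first_digit_probs():
    """Expected probability of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


probs = benford_first_digit_probs()
for digit, p in probs.items():
    print(f"leading digit {digit}: {p:.1%}")

# Probability that the leading digit is a 1 or a 2
print(f"1 or 2: {probs[1] + probs[2]:.1%}")
```

The nine probabilities sum to one, and the first two add up to the 47.7% figure used in the street address example.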

Conformance to Benford's Law can be checked visually by graphing the actual versus the expected frequencies, and confirmed statistically with a chi-square test comparing expected with actual frequencies. The Kolmogorov-Smirnov test was used as a backup confirmation. These tests were validated for accuracy in this application using Monte Carlo simulations. (*Two Digit Testing for Benford’s Law, Dieter W. Joenssen, 2013*)
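To make the chi-square check concrete, here is a minimal sketch in plain Python. It is not the R or JMP code used for the graphs in this article; the function name and the toy digit samples are assumptions for illustration. It only computes the test statistic and compares it with the tabulated chi-square critical value for 8 degrees of freedom at the 0.005 level (21.955), rather than computing an exact p value.

```python
import math
from collections import Counter

# Tabulated chi-square critical value for 8 degrees of freedom, alpha = 0.005
CRITICAL_0_005_DF8 = 21.955


def benford_chi_square(first_digits):
    """Chi-square statistic of observed leading digits (1-9) against Benford."""
    n = len(first_digits)
    counts = Counter(first_digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Toy data: one sample built to follow Benford closely, one uniform sample
benford_like = [d for d, c in zip(range(1, 10),
                                  [301, 176, 125, 97, 79, 67, 58, 51, 46])
                for _ in range(c)]
uniform = [d for d in range(1, 10) for _ in range(100)]

for name, sample in [("benford-like", benford_like), ("uniform", uniform)]:
    stat = benford_chi_square(sample)
    verdict = "fails" if stat > CRITICAL_0_005_DF8 else "passes"
    print(f"{name}: chi-square = {stat:.2f}, {verdict} at the 0.005 level")
```

With large samples such as 40 000 daily values, the chi-square test becomes very sensitive, which is one reason a second measure such as the MAD index is often reported alongside it.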

Using the *first digit and the second digit* adds more power to Benford's Law. (*Two Digit Testing for Benford’s Law, Dieter W. Joenssen, University of Technology Ilmenau, Ilmenau, Germany, 2013*)
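For the first-two-digits test the same logarithmic formula applies, with the two leading digits read as a number n from 10 to 99: P(n) = log10(1 + 1/n). A short sketch, with an illustrative function name:

```python
import math


def benford_first_two_digit_probs():
    """Expected probability of each leading two-digit value 10-99."""
    return {n: math.log10(1 + 1 / n) for n in range(10, 100)}


two_digit = benford_first_two_digit_probs()
print(f"P(10) = {two_digit[10]:.4f}")
print(f"P(99) = {two_digit[99]:.4f}")
```

Because there are 90 bins instead of 9, blocks of over-used or under-used values show up that a first-digit test averages away.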

*and* second leading digits in the analysis. (*S. Miller, Benford's Law: Theory and Applications, 2015*)

*digits are constrained and therefore don't conform to Benford's Law.*

**However, temperature anomalies DO conform to Benford's Law**. Malcolm Sambridge from the Australian National University in Canberra showed this (*Benford’s law in the natural sciences, M. Sambridge, H. Tkalčić, and A. Jackson, 2010*).

**What Are Temperature Anomalies?**

"A temperature anomaly is the difference from an average, or baseline, temperature. A positive anomaly indicates the observed temperature was warmer than the baseline, while a negative anomaly indicates the observed temperature was cooler than the baseline."

“Anomalies more accurately describe climate variability over larger areas than absolute temperatures do, and they give a frame of reference that allows more meaningful comparisons between locations and more accurate calculations of temperature trends.”
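As a small illustration of the definition quoted above, an anomaly is simply the observed temperature minus a baseline average. The sketch below assumes a single baseline mean taken from a reference set of temperatures; the choice of baseline period and the sample numbers are made up for the example and are not the ones used by BOM.

```python
from statistics import mean


def temperature_anomalies(temps, baseline_temps):
    """Return each temperature minus the mean of the baseline temperatures."""
    baseline = mean(baseline_temps)
    # Round to one decimal place, matching the precision of the readings
    return [round(t - baseline, 1) for t in temps]


baseline_period = [21.0, 23.0, 22.0]   # hypothetical reference readings
observed = [22.4, 25.1, 19.8]          # hypothetical daily maxima

print(temperature_anomalies(observed, baseline_period))
```

Positive values are days warmer than the baseline and negative values are days cooler than it.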

Anomalies **are in most cases less accurate** *than temperatures in spatial grids.* (*New Systematic Errors In Anomalies Of Global Mean Temperature Time Series, Michael Limburg, Germany, 2019*)

**Anomalies are widely used in climate analysis and do conform to Benford's Law, which gives us a very useful and powerful tool for auditing climate data.**

**Benford's Law Analysis:**

**Data Integrity Audit Of BOM Climate Data Using Benford's Law**

**And Statistical Pattern Exploration using R and JMP**

**Raw Data** *"is quality controlled for basic data errors"*.

**Adjusted Data** *"has been developed specifically to account for various changes in the network over time, including changes in coverage of stations and observational practices."*

(*Investigation of methods for hydroclimatic data homogenization, E. Steirou and D. Koutsoyiannis, 2012*)

**Sydney Daily Max And Min Temperature Time Series**

**NOTE:** The adjusted minimum and maximum data sets are called Minv2 and Maxv2 respectively, and Min Raw and Max Raw are the raw daily minimum and maximum data from 1910-2018 as supplied by BOM.

**Benford's Law NOTE:** Temperature anomalies are used for the Benford's Law first digit and first two digits tests. In the first digit test, *only the leading digit is used after the - + or 0 are stripped away.* In other words, under Benford's Law a leading 0 is thrown away, as are the - and + signs. Only digits 1-9 are used.

*In the Benford's Law two digit test, only the leading two digit values (10-99) are used after stripping out - or + or leading 0.*
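The stripping rule in the note above can be sketched as a small helper. This is an illustration only, not the code used to produce the graphs; the function name is hypothetical, and padding a value such as 0.4 to "40" in the two digit test is an assumption made here for simplicity.

```python
def leading_digits(value, n_digits=1):
    """Return the first n significant digits of a number as an int.

    The sign, the decimal point and any leading zeros are discarded.
    Returns None for zero, which has no significant digits.
    """
    digits = f"{abs(value):.10f}".replace(".", "").lstrip("0")
    if not digits:
        return None
    return int(digits[:n_digits])


anomalies = [-0.37, 1.5, 0.052, 12.3, 0.0]   # made-up anomaly values
print([leading_digits(a) for a in anomalies])      # first digit test
print([leading_digits(a, 2) for a in anomalies])   # first two digits test
```

Values that return None (exact zeros) would simply be dropped before counting frequencies.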

**Below: All Sydney days for Maxv2 data with first digit Benford's law test, expected (dotted red line) versus actual frequency.**

**Above**: The first (leading) digit for the *complete* **Daily Sydney Maximum Adjusted Temps (Maxv2)** from 1910-2018, nearly 40 000 days. *The red dotted line is the expected, the bars are the actual.*

**over the full data set**, but with too few ones and too many threes and fours overall. This curve fails the chi-square test with a very small p value but is "weak" according to the Nigrini MAD index. To gain more power, the first two digits are used in the next Benford test below.
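The Nigrini MAD index mentioned above is the mean absolute deviation between the observed and expected digit proportions. A minimal first-digit sketch follows; the function name and toy samples are assumptions, and the conformity cutoffs should be taken from Nigrini's published tables rather than from this example, so none are hard-coded here.

```python
import math
from collections import Counter


def benford_mad(first_digits):
    """Mean absolute deviation of first-digit proportions from Benford."""
    n = len(first_digits)
    counts = Counter(first_digits)
    deviations = [abs(counts.get(d, 0) / n - math.log10(1 + 1 / d))
                  for d in range(1, 10)]
    return sum(deviations) / 9


benford_like = [d for d, c in zip(range(1, 10),
                                  [301, 176, 125, 97, 79, 67, 58, 51, 46])
                for _ in range(c)]
uniform = [d for d in range(1, 10) for _ in range(100)]

print(f"benford-like MAD: {benford_mad(benford_like):.4f}")
print(f"uniform MAD:      {benford_mad(uniform):.4f}")
```

Unlike the chi-square statistic, the MAD does not grow with sample size, which is why a very large data set can fail chi-square while still showing only a modest MAD.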

**Above:** This gives a better picture of why the data fails conformance. The first two digits test is more complete and more powerful. The data set also fails the conformance tests with two digits. You can clearly see that some digits are used too much and some too little.

There are far too many 17's, 37's, 38's, 42's and 85's. Looking at the curve, you can see a systematic increase in "blocks" of numbers. There are too few 10's as well. The numbers seem to come in blocks of twos and threes, either too many or too few. Overall, as seen in both graphs, the mid-range and larger numbers are overused. The Maxv2 data does not conform to Benford's distribution.

**Minv2**

**Below: The Daily Minimum Temperatures Adjusted (Minv2)**

**Above**: The minimum adjusted temps (Minv2) fail the first digit chi-square conformance test with a small p value below our 0.005 cutoff. The result is worse than the maximum temperatures graph for both the single and double digit tests using the complete data.

**Lower numbers occur more frequently than expected in Minv2, hence the upward warming trend.**

**Below: The Daily Minimum Temperatures Adjusted (Minv2), first 2 digits Benford's test.**

**Above**: This Minv2 graph for the 2 digit test is more dramatic. It clearly shows how *Peter has been robbed to pay Paul*: the higher numbers from 40-90 or so have been reduced in frequency, while the lower numbers around 15-38 have been increased in frequency.

*This has the potential to be more extreme when looking at specific months.*

*This will show us if the anomalies in the above average or below average Maxv2 groups change.*

**Below: Sydney Maxv2 Data, ONLY Positive Temp Anomalies Tested.**

With only the positive anomalies tested, there is a **greater lack of conformance to Benford's Law**.

*Particular numbers have increased and decreased with regularity*; there is nothing "natural" in this number distribution. This appears to be data tampering in the **resultant** above-average temp anomalies.

*increased in frequency* in the Maxv2 data.

**Above: ONLY POSITIVE temp anomalies for Maxv2.**

**What About Above-Average Minimum Temps?**

The **resultant above-average** temperature in the Minv2 data appears to have been tampered with quite dramatically.

**Below: ONLY POSITIVE temp anomalies for Minv2.**

**Results Of Min Max Temp Anoms + Benford's Law**

**Neither the Maxv2 nor the Minv2 temperature anomalies data conform to Benford's Law.** There are very large deviations from the expected Benford's curve, particularly when looking at only the positive anomalies for Minv2 and Maxv2.

The adjustments, which *are to "remove" biases of non climatic effects*, are doing the opposite. In fact very large biases are added, because **normal observational data with occasional corrections/adjustments would not look like this on data known to conform to Benford's Law.** With nearly 40 000 observations in the sample, the "adjustments" have to be *very large* to look like this.

**In any financial situation, this data would be flagged for a forensic audit; it suggests tampering.**

**But What About RAW Data?**

The raw data are described as "*unadjusted*" and are only subject to *"pre-processing"* and "*quality control*." (Link) This consists of:

*"To identify possible errors, weather observations received by the Bureau of Meteorology are run through a series of automated tests which include:*

- *‘common sense’ checks (e.g. wind direction must be between 0 and 360 degrees)*
- *climatology checks (e.g. is this observation plausible at this time of year for this site?)*
- *consistency with nearby sites (e.g. is this observation vastly different from nearby sites?)*
- *consistency over time (e.g. is a sudden or brief temperature spike realistic?)"*

**Maximum Raw Data**:

The **Maximum Raw Temp** Anomalies reveal extremely biased data, about as "unnatural" a distribution as you can get, with periodic spikes and dips. There is a man-made fingerprint here in the regularity. This **RAW** data fails a chi-square test for Benford conformance.

**Below: This is the Maximum Raw Temperature Anomalies with a two digit Benford test.**

**Below:** This is the **Minimum Raw Temperature** Anomalies with a two digit Benford test. Again, this is biased data and definitely not raw observational data with minor preprocessing. Very cooked. Too many 15-47's, too few 10-13's and too few higher numbers, such as 59, 69, 79 and 89.

**The RAW data Min and Max is not raw, it is cooked. It is very cooked.**

**Comparison With Other Climate Data Sets**

*"Berkeley Earth is a source of reliable, independent, non-governmental, and unbiased scientific data and analysis of the highest quality." (Link)*

**Below: Berkeley Earth Global Temp Anomalies, First 2 Digits.**

**Above: Benford's 2 digit test shows increased frequencies of digits 40-90 and reduced frequencies of digits 10-35 in Berkeley Earth Global Anomalies.**

**Below: NASA GISS Global Temp Anomalies.**

**Results Of Comparison**

**Difference Between RAW data and Adjusted data?**

*differences* between raw and adjusted.

When the difference is a **positive number, the adj temp is warmer than raw**, and when it's a **negative number, it is being cooled compared to raw.**
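The sign convention above (adjusted minus raw) can be illustrated with a few lines of code. The function name and the sample temperatures are made up; missing days are assumed to be represented as None.

```python
def adjustment_differences(raw, adjusted):
    """Return adjusted minus raw for each day; None where either is missing."""
    return [None if r is None or a is None else round(a - r, 1)
            for r, a in zip(raw, adjusted)]


raw_temps = [20.1, 18.4, 25.0, None]   # hypothetical Max Raw values
adj_temps = [20.6, 18.4, 24.2, 19.9]   # hypothetical Maxv2 values

for diff in adjustment_differences(raw_temps, adj_temps):
    if diff is None:
        print("missing")
    elif diff > 0:
        print(f"{diff:+.1f} C: adjusted is warmer than raw")
    elif diff < 0:
        print(f"{diff:+.1f} C: adjusted is cooler than raw")
    else:
        print("no change")
```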

**Above: This is the extra warming done by software on top of Raw.**

*The adjustments are regularly updated and tweaked by BOM as the "science changes" and "network changes" are detected.*

**average** temperature values were used on the left vertical axis.

**The actual values of how much the temps were modified is below**.

**The below graph shows data points with actual temp degrees of added warming.**

**Above:** **it was cooled by -7.6 C degrees** at that point.

**Which months are getting most of the warming from the Raw to the Adjusted data set?**

(*Two Digit Testing for Benford’s Law, Dieter W. Joenssen, 2013*)

**Specific Months Using Benford's Law.**

**Below: JANUARY Maxv2 Temp Anomalies - First 2 digits Benford's Law**

**Below: FEBRUARY Maxv2 Temp Anomalies - First 2 digits Benford's Law**

**Below: JANUARY Minv2 Temp Anomalies - First 2 digits Benford's Law**

**Below: FEBRUARY Minv2 Temp Anomalies - First 2 digits Benford's Law**

**Benford's Law Results For Individual Months:**

**University Of Edinburgh Bayesian R Code**

**Tracking Data Conformance to Benfords Law Over Time.**

*over time*. This would tell us exactly at what point the data was modified (what year) and by how much or how little.

**more** accurate than empirical methods of evaluation because of the discretisation effect. (Link)

*data in recent years was less homogeneous!*

*that tracks periods at which conformance to Benford’s Law is lower. Our methods are motivated by recent attempts to assess how the quality and homogeneity of large datasets may change over time by using the First-Digit Rule."*

**The above output shows the net effect of non conformance to Benford's Law with the leading digit.** The smooth SSD (smooth sum of squared deviations) statistic assesses overall conformance with the First-Digit Rule over the nine digits in each year, *"which avoids overestimation of the misfit due to a discretization effect, whereas a naive empirical SSD as in can be shown to be biased."* (Link)
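For orientation only, the sketch below computes the naive empirical SSD per year, that is, the version the quoted authors describe as biased. It is not the Bayesian smooth SSD produced by the University of Edinburgh R code; it is shown because it makes clear what quantity is being tracked over time. The function name and the input format, a list of (year, first digit) pairs, are assumptions.

```python
import math
from collections import Counter, defaultdict

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def naive_ssd_by_year(records):
    """Sum of squared deviations from Benford's first-digit law, per year.

    records: iterable of (year, first_digit) pairs, first_digit in 1-9.
    """
    digits_by_year = defaultdict(Counter)
    for year, digit in records:
        digits_by_year[year][digit] += 1

    ssd = {}
    for year in sorted(digits_by_year):
        counts = digits_by_year[year]
        n = sum(counts.values())
        ssd[year] = sum((counts[d] / n - BENFORD[d]) ** 2
                        for d in range(1, 10))
    return ssd


# Made-up example: 1910 follows Benford closely, 2010 uses only the digit 1
records = [(1910, d) for d, c in zip(range(1, 10),
                                     [301, 176, 125, 97, 79, 67, 58, 51, 46])
           for _ in range(c)]
records += [(2010, 1)] * 100

for year, value in naive_ssd_by_year(records).items():
    print(year, round(value, 4))
```

A higher value means worse conformance in that year. With only a few hundred observations per year the naive version is noisy, which is the motivation for the smoothed statistic.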

*lack of conformance to Benford's Law*. In other words, certain digits have been used excessively and some too sparsely in the leading digit values of the temperature anomalies in the Berkeley Earth Daily Global Anomalies. The trend increases dramatically at 2008, suggesting much more data tampering in the latter years.

**Sydney JUNE Maxv2 Bayesian Tracking 1910-2018**

**Above:** The first digit is now equal to 2. Its use was excessive around 1910, declined in the 1920's, was excessive again in the 1980's, and has been reducing in the last 5 years or so.

**Above:** Leading digit = 3. Use declined greatly from the 1960's, although there was a leveling out in the 1990's before it dropped to around the normal expected level.

**Above:** Leading digit = 5 shows under use throughout the years, with an increase in the 1990's but still below expected.

**Above:** Leading digit = 6 shows under use and then a sharp increase in the 1950's-1980's. It has been under used from the early 90's.

**Above**: Leading digit = 7 shows excessive use in the 1920's, then a gradually declining use.

**Above:** Leading digit = 8. The magical date of 1980, where so much happens in the climate world, comes into play again, with excessive use in the 1980's and the 90's.

**higher** than Berkeley Earth; there is a higher overall lack of conformance to Benford's Law. This can be seen on the left-hand vertical axis. The lack of conformance to Benford's Law is relatively flat, with slight cyclic variations around 1910, the 1930's and the 1950's, gradually increasing from the 1970's, with an accelerated increase in the last 5 years. That signals the worst lack of conformance, suggesting Benford's Law conformance has been getting worse in the last 5 years or so.

**Summary:** The use of leading digit value = 1 increases dramatically from the 2000's, causing negative anomalies in the June data set to be warmed. The overall lack of conformance to Benford's first-digit rule is worse than in the Berkeley Earth global data set, as shown by the left axis values of the SSD graph.

**Statistical Analysis Of BOM Data Sets Without Benford's Law:**

**Pattern Exploration, Trailing Digits And Repeated Numbers.**

The pharmaceutical industry is also actively involved in replication of studies.

*"Proxy Temperature Reconstruction" data from “Global Surface Temperatures Over the Past Two Millennia" (Phil D. Jones, Michael E. Mann),* the infamous "climategate" dataset.

**Trailing Digit Analysis With Sydney BOM Daily Data Sets**

(*Durtschi et al., 2004*), the trailing digit is typically uniformly distributed (*Preece, 1981*)
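A uniformity check on the trailing digit can be sketched as follows. This is an illustration in plain Python, not the code from the World Bank audit; the function name and toy data are assumptions, temperatures are assumed to be recorded to one decimal place, and the statistic is compared with the tabulated chi-square critical value for 9 degrees of freedom at the 0.005 level (23.589).

```python
from collections import Counter

# Tabulated chi-square critical value for 9 degrees of freedom, alpha = 0.005
CRITICAL_0_005_DF9 = 23.589


def trailing_digit_chi_square(temps):
    """Chi-square statistic of the tenths digit against a uniform 0-9 spread."""
    digits = [int(f"{abs(t):.1f}"[-1]) for t in temps]
    expected = len(digits) / 10
    counts = Counter(digits)
    return sum((counts.get(d, 0) - expected) ** 2 / expected
               for d in range(10))


# Made-up data: an even spread of tenths versus every reading ending in .0
even_spread = [20 + d / 10 for d in range(10)] * 10
all_zero = [21.0] * 100

print(trailing_digit_chi_square(even_spread))
print(trailing_digit_chi_square(all_zero) > CRITICAL_0_005_DF9)
```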

*The Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) Data-set Report of the Independent Peer Review Panel 4 September 2011)*

The method from the *Jean Ensminger and Jetson Leder-Luis* World Bank audit is used here to test various months from the Sydney Minimum and Maximum temperature data sets, both raw and adjusted. (*Measuring Strategic Data Manipulation: Evidence from a World Bank Project*)

**Above: All the days for October,** about 3300 of them from 1910-2018. This is from the **Sydney Max Raw** data set, the unadjusted data from the BOM. We are looking at the 3rd digit in all the raw temperatures (not anomalies) because this test does not depend on Benford's Law and can thus be used directly on temperature data.

**October Maxv2**, the adjusted data set.

**Maxv2** adjusted data set. This also fails the test for uniform distribution. The frequency of the digit 5 has increased dramatically from Raw.

**Above: Sydney Min Raw** data from 1910-2018, the 3300 October days. This data is supposed to be unadjusted but fails the chi-square test for uniform distribution. The digit 5 again has too low a probability in the 3rd position.

**Next:** **October Minv2**.

**Above: Sydney October Minv2** adjusted dataset fails to comply with a uniform distribution as well, with an equally low p value compared to the raw data.

**Note:** The digit 5 is often low, and this occurs in other months too.

**This could be indicative of a double-rounding error, where the majority of temperature readings were taken in Fahrenheit and rounded to 1/10 of a degree, then later converted to Celsius and rounded to 1/10 of a degree again.**
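The double-rounding effect is easy to simulate. The sketch below assumes, purely for illustration, that every Fahrenheit reading from 32.0 F to 121.9 F in steps of 0.1 F occurs exactly once; real readings are obviously not spread this evenly. It converts each one to Celsius, rounds to 0.1 C, and counts the trailing digit. Integer arithmetic in tenths of a degree is used to avoid floating point rounding surprises.

```python
from collections import Counter


def double_rounded_last_digit(f_tenths):
    """Trailing digit of a Celsius value converted from Fahrenheit.

    f_tenths is the Fahrenheit reading in tenths of a degree (725 = 72.5 F).
    The result is the tenths digit after converting to Celsius and rounding
    to the nearest 0.1 C.
    """
    n = (f_tenths - 320) * 5          # Celsius in tenths, multiplied by 9
    c_tenths = (2 * n + 9) // 18      # n / 9 rounded to the nearest integer
    return abs(c_tenths) % 10


counts = Counter(double_rounded_last_digit(f) for f in range(320, 1220))
for digit in range(10):
    print(digit, counts[digit])
```

Under this idealised assumption the digits 0 and 5 each turn up half as often as the other eight digits, because nine steps of 0.1 F have to be squeezed into five steps of 0.1 C. A shortage of 5's on its own is therefore consistent with unit conversion, although the same mechanism would be expected to thin out the 0's as well.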

**daily time-scales**, are sensitive to rounding, double-rounding, and precision or unit changes. Application of precision-decoding to the GHCND database shows that *63% of all temperature observations are misaligned* due to unit conversion and double-rounding, and that many time series *(Decoding the precision of historical temperature observations, Andrew Rhines et al)*

**Result Of Trailing Digits Analysis For Sydney Daily**

**Min Raw, Minv2, Max Raw and Maxv2 Data Sets**

**Lack of uniformity in trailing digits is a classic marker of data tampering** (*Uri Simonsohn, http://datacolada.org/74*)

**Pattern Exploration Of Sydney Daily Min Max Data Sets: Looking for Duplication and Repeated Sequences Beyond Chance.**

**Above:** Daily Min Raw Data Set For Sydney for December, about 3300 days.

**zero**.

**July daily temps for Sydney 1910-2018.**

**Above:** Both **Min Raw and Minv2 for July** have 31 days, a complete month, "copy pasted" into another year. The probability of this happening by chance is zero again.

**Above:** A snapshot of a full month being copy pasted into another year in both the Sydney Minv2 and Min Raw data. Again, the Raw is supposed to be relatively untouched according to BOM. Yet this copied sequence gets carried over to the adjusted Minv2 set.
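A search for whole months that are identical across years can be sketched in a few lines. This is not the JMP pattern exploration used for the screenshots; the function name and the input layout, a dictionary keyed by (year, month) holding that month's daily temperatures in order, are assumptions for illustration, and the sample values are made up.

```python
from collections import defaultdict


def find_duplicated_months(monthly_series, min_days=28):
    """Group (year, month) keys whose daily temperature sequences are identical.

    monthly_series: dict mapping (year, month) to a list of daily temps.
    Months shorter than min_days are ignored.
    """
    seen = defaultdict(list)
    for key, temps in monthly_series.items():
        if len(temps) >= min_days:
            seen[tuple(temps)].append(key)
    return [sorted(keys) for keys in seen.values() if len(keys) > 1]


# Made-up three-day "months" to keep the example short
sample = {
    (1914, 7): [8.1, 7.4, 9.0],
    (1915, 7): [8.1, 7.4, 9.0],
    (1916, 7): [6.5, 7.2, 8.8],
}
print(find_duplicated_months(sample, min_days=3))
```

Running the same search on raw and adjusted series separately shows whether a duplicated block in the raw data is carried through into the adjusted data.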

**June Minv2 + Min Raw also have a full month of 30 days copy pasted into another year.**

**Above:** Sydney June Daily Temps, Minv2 + Min Raw duplicated 30 day sequence.

**Above:** A duplicated 30 day sequence for June Minv2 and Min Raw.

**Above:** Direct linear relationships between minimum and maximum adjusted daily temperatures in March.

**Above:** A shorter yet still improbable sequence in March Maxv2 dailies. Only sequences with less than a 1 in 40 000 probability of being chance are shown here; there are many, many shorter sequences in the BOM data that are unusual. For comparison, 2 cities in the Netherlands (De Kooy and Amsterdam) and 2 regions in the U.S. (the nw and sw regions from NOAA) were checked against the Sydney sequences, and none came close to the large number of rare events.

**Results Of Pattern Exploration:**

**Sydney has sequences copied between months and years that have zero probability of being a chance occurrence. The large number of shorter duplicated series is also improbable.**

**There are multiple linear relationships between raw and adjusted data suggesting linear regression adjustments between raw and adj.**

**Copy/pasting months into different years, or worse,** *into different months* as has happened in other data sets, is data tampering. This should not happen with time series data. See below:

*Weather Data: Cleaning and Enhancement, Auguste C. Boissonnade; Lawrence J. Heitkemper and David Whitehead, Risk Management Solutions; Earth Satellite Corporation*

**An Analysis Of Repeating Numbers In Climate Data.**

**Problem:**

**Above**: This example is from all the days in March in the Sydney daily Min Raw and Minv2 temp time series.

**A massive increase in repeated numbers from raw to minv2!**

*adjusted* temperatures for Max temps of March.

*a lot.*

**Above:** Repeated temps in Dec, Minv2 in blue and Min Raw in orange.

**The most repeated temps have the longest spikes.**

**Above:** This is the Min Raw in orange and Minv2 in blue for December. The number bunching analysis for repeated numbers will be run again with this data to assess the bunching of repeats.

**Above:** Results of number bunching analysis for Max Raw Sydney temps.

The *red line is the observed* average frequency for the Max Raw data. The red line is within the distribution; it is 2.02 Std errors from the mean, about a 1 in 20 occurrence. This is well within expectation.
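To show roughly what a number bunching test measures, here is a simplified sketch. It is not the R code supplied by Uri Simonsohn and may differ from his procedure in how the expected distribution is generated. The assumptions made here: "average frequency" is how often each observation's value occurs in the sample, averaged over all observations, and the null distribution is built by shuffling the decimal digit among the integer parts. All names and data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def average_frequency(values):
    """Average, over all observations, of how often that value occurs."""
    counts = Counter(values)
    return sum(counts[v] for v in values) / len(values)


def bunching_z_score(temps, n_sims=500, seed=0):
    """Std errors between observed and simulated average frequency.

    Null model (an assumption of this sketch): the tenths digit is
    unrelated to the integer part, so tenths digits are shuffled.
    """
    rng = random.Random(seed)
    tenths = [round(t * 10) for t in temps]      # work in integer tenths
    int_parts = [t // 10 for t in tenths]
    dec_parts = [t % 10 for t in tenths]

    observed = average_frequency(tenths)
    simulated = []
    for _ in range(n_sims):
        shuffled = dec_parts[:]
        rng.shuffle(shuffled)
        simulated.append(average_frequency(
            [10 * i + d for i, d in zip(int_parts, shuffled)]))
    return (observed - mean(simulated)) / pstdev(simulated)


# Made-up, heavily bunched data: only ten distinct values, 20 copies each
bunched = [10 + k + k / 10 for k in range(10) for _ in range(20)]
print(f"z = {bunching_z_score(bunched):.1f} Std errors")
```

A result within about 2 Std errors would be unremarkable, while values in the tens indicate far more repeated temperatures than the null model can produce.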

**Above:** Maxv2. The expected and the observed average frequencies are separated by a massive 27.9 Std errors. We are seeing far too many observed repeated numbers against what is expected for this sample. We would expect to see this bunching fewer than 1 in 100 million times.

**Above:** The Min Raw data tells a similar story: there are too many repeated numbers. The observed repeats are 7.6 Std errors from the mean. This is rarer than a 1 in a million occurrence.

**Above:** Minv2. The observed average of repeated numbers (red line) is so far out of expectation, at 41.5 Std errors, that we would never expect to see this. The numbers become too tiny for any meaningful computation. The data has an extremely high rate of bunching, that is, an extremely high number of repeated temps.

**June below:**

**Above:** June Max Raw data is nearly 12 Std errors from the mean, a very high level of bunching we would virtually never expect to see.

**Above:** The Maxv2 adjusted data for June... and is it adjusted! It was bad in Raw; it is a whopper in the adjusted Maxv2 data. The 49 Std errors is massive, and the chance of seeing this in this sample is nil. This is a high level of manipulation in repeated numbers (temps).

**October below:**

**Above:** The October Max Raw data has observed average repeated numbers 4.8 Std errors past the mean of the expected average repeated numbers, highly unusual but not beyond expectation. This is rarer than a 1 in 150 000 event.

**Above:** The October Maxv2 adjusted data set has far too many observed repeats against expected repeated temps, over 24 Std errors. The probability is too tiny to calculate. We would not expect to see this.

**Results:**

*A much more extreme outcome exists here* than in the study Uri Simonsohn highlights on this website, where he supplies the R code to test this. The suspect study he used was retracted for suspected fabrication. The BOM data is extremely suspicious.

**Wrapping Up:**

**Preliminary work shows even worse results for the smaller towns compared to the Sydney data.**

**A Government forensic audit should be performed on the BOM climate data.**

**It is extremely suspicious and would have been flagged for an audit in any financial database.**

**Errors that Increase Uncertainty Even More**

*averages.*

"*While the Panel is broadly satisfied with the ACORN-SAT network coverage, it is concerned that network coverage in some of the more remote areas of Australia is sparse*." (Report of the Independent Peer Review Panel, 4 September 2011)