<p><span style="font-family: verdana;"><b>Australian Climate Data -- Big, Dirty, Biased and Manipulated</b><br />ElasticTruth: Deception Detection In Non Verbals, Linguistics And Data. Tom Berger. Published 2021-11-25, updated 2022-11-08.</span></p><p><span style="font-family: verdana;"><b>The Australian climate data that BOM uses to create trends is analysed and dissected. The results show the data to be biased and dirty, even up to 2010 in some stations, making it unfit for predictions or trends.<br /><br />In many cases the temperature <i>sequences are </i>strings of duplicates and duplicate sequences which bear no resemblance to observational temperatures.</b></span></p><p><span style="font-family: verdana;"><b>This data would have been thrown out in many industries such as pharmaceuticals and industrial control, and many of the BOM data handling methodologies are unfit for most industries. </b></span></p><p><span style="font-family: verdana;">Dirty data stations appear to have been used in the network to counter the <i>scarcity of climate stations </i>argument made against the Australian climate network. (Modeling And Pricing Weather-Related Risk, Antonis K. Alexandridis et al)</span></p><p><span style="font-family: verdana;">We use forensic exploratory software (SAS JMP) to identify fake sequences, and also develop a technique, shown at the end of the blog, that spotlights clusters of these sequences in time series data. 
This technique, together with data mining using Bayesian and decision tree analysis, <b>proves that BOM adjustments cause fake, unnatural temperature sequences</b> that no longer function as observational data, making the data unfit for trend or prediction analysis.</span></p><p><span style="font-family: verdana;"><br /></span></p><p><i><span style="font-family: verdana;">"These (Climate) research findings </span><span style="font-family: verdana;">contain circular reasoning because in the end the hypothesis is proven with data from which the </span><span style="font-family: verdana;">hypothesis was derived."</span></i></p><p><span style="font-family: verdana;">Circular Reasoning in Climate Change Research - Jamal Munshi<br /></span><br /></p><p><span style="font-family: verdana;"><b><br /></b></span></p><p><span style="font-family: verdana;"><b>Before We Start -- The Anomaly Of An Anomaly:<br />One of the persistent myths in climatology is:</b></span></p><p><span style="font-family: verdana;"><i>"Note that temperature timeseries are presented as anomalies or departures from the 1961–1990 average because <b>temperature anomalies tend to be more consistent throughout wide areas than actual temperatures.</b>" --BOM</i></span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">This is complete nonsense. Notice the weasel word </span><b style="font-family: verdana;">"tend"</b><span style="font-family: verdana;">, which isn't on the NASA web site. 
Where BOM uses weasel words such as "perhaps", "may", "could", "might" or "tend", they are red flags and point to useful areas to investigate.</span></p><p><span style="font-family: verdana;">Using an <b>arbitrarily</b> chosen offset, a 30 year block of average temperatures, does not make those temperatures "normal", nor does it give you any more data than you already have.</span></p><p><span style="font-family: verdana;">Plotting deviations from an </span><b style="font-family: verdana;">arbitrarily</b><span style="font-family: verdana;"> chosen offset for a limited network of stations gives you no more insight. It most definitely does not mean you can extend the analysis to areas without stations, or make extrapolation any more legitimate, if you haven't taken measurements there.</span></p><p><span style="font-family: verdana;">Averaging temperature anomalies <i>"throughout wide areas"</i> when you only have a few station readings doesn't give you any more accurate a picture than averaging straight temperatures. </span></p><p><br /></p><h3 style="text-align: left;"><span style="font-family: verdana;">Think Big, Think Global:</span></h3><p><span style="font-family: verdana;">Let's look at Annual Global Temperature Anomalies. This is the weapon of choice when creating scare campaigns. It consists of averaging nearly a million temperature anomalies into a single number. (<a href="https://scied.ucar.edu/image/measure-global-average-temperature-five-easy-steps" target="_blank">link</a>)</span></p><p><span style="font-family: verdana;">Here it is from the BOM site for 2022. 
</span></p><p><span style="font-family: verdana;"> </span></p><p><span style="font-family: verdana;"></span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO5S0soHQeysjnjs7eMpgnyG68JAhEjXduSokeRdlWW83kfaJkrbu2mCxZgjx3xEAdxM5wY915yT_dXUO8qh6xSEISALqQN8qfEAHR7zSR9dKWhFruatxrn3YXS-Asvsv7gjXOCbSKDqIqlxan7sILnLoHvC2pw3Zh3EiCjh-ujkINumrjEGcSjhxz/s1044/global2022.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="778" data-original-width="1044" height="475" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO5S0soHQeysjnjs7eMpgnyG68JAhEjXduSokeRdlWW83kfaJkrbu2mCxZgjx3xEAdxM5wY915yT_dXUO8qh6xSEISALqQN8qfEAHR7zSR9dKWhFruatxrn3YXS-Asvsv7gjXOCbSKDqIqlxan7sILnLoHvC2pw3Zh3EiCjh-ujkINumrjEGcSjhxz/w640-h475/global2022.jpg" width="640" /></a></div><br /><p></p><p><span style="font-family: verdana;">Data retrieved using the Wayback website consists of the years 2014 and 2010 and 2022 from BOM site (actual data is only to 2020). Nothing is available earlier. 
</span></p><p><span style="font-family: verdana;">Below is 2010.</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1Zr37nNZyg0r5KDcL3fQqJ0wMI2Yu3-sWk8r1bug7L8i9IjPRgE49l_q9CUq6V9wQ3Q4JZMAu6bDR-1JhD6HpFmUBGb0NgsNv2yc-XYWNg9Xyl6wU_xXm00pfd-mmGsgt5fuoK3ZD_Yao8nZlkFA8zZqZuMskqIXUcYgJB3EBmA9yTvSTK9wXH7kS/s907/global2010.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="763" data-original-width="907" height="538" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1Zr37nNZyg0r5KDcL3fQqJ0wMI2Yu3-sWk8r1bug7L8i9IjPRgE49l_q9CUq6V9wQ3Q4JZMAu6bDR-1JhD6HpFmUBGb0NgsNv2yc-XYWNg9Xyl6wU_xXm00pfd-mmGsgt5fuoK3ZD_Yao8nZlkFA8zZqZuMskqIXUcYgJB3EBmA9yTvSTK9wXH7kS/w640-h538/global2010.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span><p></p><p><span style="font-family: verdana;">Looking at the two graphs you can see differences. There has been warming but by how much?</span></p><p><span style="font-family: verdana;">Overlaying the temperature anomalies for 2010 and 2020 helps. 
</span></p><p><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaAWOCP1fNhsAV8FKBv9PsYaSd4_xeOEJ6ti9olFsl_0LmIz7eKCtLyfvmYLjwdA-tNEog37I4f6ETZQVbOEOMG-14Tx2ZZjgCeJYFWxkZAEa7eADIKm06nuWwOFHQ1BoqlDB6u85o9_GvAwwneAK6ha7bVEp6jYF-nzCq0FJGz-R1G1YubcWVv8oG/s866/globalcompare.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="723" data-original-width="866" height="534" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaAWOCP1fNhsAV8FKBv9PsYaSd4_xeOEJ6ti9olFsl_0LmIz7eKCtLyfvmYLjwdA-tNEog37I4f6ETZQVbOEOMG-14Tx2ZZjgCeJYFWxkZAEa7eADIKm06nuWwOFHQ1BoqlDB6u85o9_GvAwwneAK6ha7bVEp6jYF-nzCq0FJGz-R1G1YubcWVv8oG/w640-h534/globalcompare.jpg" width="640" /></a></div><br /><p><span style="font-family: verdana;">BOM always state that their adjustments and changes are small, for example:</span></p><p><i style="font-family: verdana;">"The differences between ‘raw’ and ‘homogenised’ datasets are <b>small,</b> and capture the uncertainty in temperature estimates for Australia." <a href="http://www.bom.gov.au/climate/data/acorn-sat/documents/5-ACORN-SAT-TAF-TOR3.pdf">-BOM</a></i></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">Let's create a hypothesis: every few years the temperature is warmed up significantly, at the 95% level (using BOM critical percentages).</span></p><p><span style="font-family: verdana;">Therefore, 2010 &lt; 2014 &lt; 2020.<br /><b>The null hypothesis is that the data come from the same distribution and are therefore not significantly different.</b></span></p><p><span style="font-family: verdana;">To test this we use:</span></p><p><span style="font-family: verdana;"><br /></span></p><h3 style="text-align: left;"><span style="font-family: verdana;">Nonparametric Combination Test</span></h3><p><span style="font-family: verdana;">For this we use the Nonparametric Combination test, or NPC. 
This is a permutation test framework that allows accurate combining of different hypotheses.</span></p><p><span style="font-family: verdana;">Pesarin popularised NPC, but Devin Caughey of MIT has the most up to date and flexible version of the algorithm, written in R. (<a href="https://caughey.mit.edu/software" target="_blank">link</a>).</span></p><p><span style="font-family: verdana;">Devin's paper on this is <a href="https://dspace.mit.edu/bitstream/handle/1721.1/119234/NPC170223.pdf?sequence=1&isAllowed=y" target="_blank">here</a>.</span></p><p><span style="font-family: verdana;">"Being based on permutation inference, NPC does not require modeling </span><span style="font-family: verdana;">assumptions or asymptotic justifications, <i>only that observations be exchangeable</i> (e.g., randomly assigned) under the global null hypothesis that treatment has no effect. </span><span style="font-family: verdana;">It is possible to combine p-values parametrically, typically under the assumption that the </span><span style="font-family: verdana;">component tests are independent, but nonparametric combination provides a much more </span><span style="font-family: verdana;">general approach that is valid under arbitrary dependence structures.</span><span style="font-family: verdana;">" --Devin Caughey, MIT</span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">As mentioned above, </span><i style="font-family: verdana;">the only assumption we make for NPC is that the observations are exchangeable</i><span style="font-family: verdana;">. NPC allows us to combine two or more hypotheses, while accounting for multiplicity, and to get an accurate overall p value. </span></p><p><span style="font-family: verdana;">NPC is also used where a large number of contrasts are being investigated, such as in brain scan labs. 
(<a href="https://brainder.org/2016/02/08/npc/" target="_blank">link</a>)</span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">The results after running NPC in R, starting with our main result:</span></p><p><span style="font-family: verdana;"><b>2010 &lt; 2014 results in a p value = 0.0444</b><br /><br /></span></p><p><span style="font-family: verdana;">This is less than our cutoff of p value = 0.05, <i>so we reject the null </i>and can say that warming in the Global Temp. Anomalies increased <b>significantly in the data between the 2010 and 2014 versions, and that the distributions are different.</b></span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">The result of 2014 &lt; 2020 has a p value = 0.1975</span></p><p><span style="font-family: verdana;">We do not reject the null here, so 2014 is not significantly different from 2020. </span></p><p><span style="font-family: verdana;">If we combine p values with NPC using the joint hypothesis (2010 &lt; 2014 &lt; 2020, i.e. increases in warming in every version), we get a p value of 0.0686. This just falls short of our 5% level of significance, so we don't reject the null, although there is considerable evidence supporting the hypothesis.</span></p><p><i style="font-family: verdana;">The takeaway here is that Global Temperature Anomalies were significantly warmed up between the 2010 and 2014 versions, after which they stayed essentially similar.</i></p><div class="separator" style="clear: both; text-align: center;"><br /></div><p><span style="font-family: verdana;"></span></p><h3><span style="font-family: verdana;">I See It But I Don't Believe It....</span></h3><p><span style="font-family: verdana;"><i>"If you are using averages, on average you will be wrong."</i> <b>(</b><a href="https://www.amazon.com/Flaw-Averages-Underestimate-Risk-Uncertainty/dp/1118073754" style="font-weight: bold;">link</a><b>)<br /></b><span> -- </span>Dr. 
Sam Savage on The Flaw Of Averages</span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">In earlier posts I showed the propensity of the BOM to copy/paste or alter temperature sequences, creating blocks of duplicate temperatures and sequences lasting a few days or weeks or even a full month. They surely wouldn't have done this with Global Temperature Anomalies, a really tiny data set, would they?</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8JLyVcOyqWqGl0BLMriS8ORvR5cUiTTlLDdq6nfhgpooPfOyMBUe_pC0F7CXJ-wmpFq0fS_eHVkK_SFs2PgJkVYvoSskEQ7lURTq_iqNkAuf5i4gaxPxxANCnSWPuNXKjrqGiiGyVcNJG6uEtFRMHvMLvDrTLn68FilBXhQKs7Lp1246RrCgexf1C/s1295/dupsequence.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1003" data-original-width="1295" height="496" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8JLyVcOyqWqGl0BLMriS8ORvR5cUiTTlLDdq6nfhgpooPfOyMBUe_pC0F7CXJ-wmpFq0fS_eHVkK_SFs2PgJkVYvoSskEQ7lURTq_iqNkAuf5i4gaxPxxANCnSWPuNXKjrqGiiGyVcNJG6uEtFRMHvMLvDrTLn68FilBXhQKs7Lp1246RrCgexf1C/w640-h496/dupsequence.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span><p></p><p><span style="font-family: verdana;">As incredible as it seems, we have a duplicate sequence even in this <i>small sample</i>. SAS JMP calculates that the probability of seeing this at random, given this sample size and number of unique values, is equal to that of seeing 10 heads in a row in a coin flip sequence (about 1 in 1,024). In other words, unlikely. 
More likely is the dodgy data hypothesis.</span></p><p><span style="font-family: verdana;"><br /></span></p><h3 style="text-align: left;"><span style="font-family: verdana;">The Case Of The Dog That Did Not Bark</span></h3><p><span style="font-family: verdana;">Just as the dog not barking on a specific night was highly relevant to Sherlock Holmes in solving a case, so it is <b>important with us knowing what is not there.</b></span></p><p><span style="font-family: verdana;">We need to know what variables disappear and also which ones suddenly reappear.</span></p><p><span style="font-family: verdana;"><br /></span></p><p><i><span style="font-family: verdana;">"A study that leaves out data is waving a big red flag. A<br /></span></i><i><span style="font-family: verdana;">decision to include or exclude data sometimes makes all the difference in<br /></span></i><i style="font-family: verdana;">the world." </i><span style="font-family: verdana;">-- </span><span style="font-family: verdana;">Standard Deviations, Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics, Gary Smith.</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQb_TKurvMJAtPoY0EuR_Oi4lA_Tzun0i0MciEG8YPQbERDWamAp19UtDXLfMCV2jQJFpKoT9SLoHD6e63H5Eh6uv7noJhxSNfMIL9KkNXSvWcL1T5G8owfKdV5tr-5fNQxm3JA9GF7bt23p4FDnE5QRVxAFX9wxujYejgeeeBPJC4PZ-BVX8R2Myv/s838/palmervillemissing.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="321" data-original-width="838" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQb_TKurvMJAtPoY0EuR_Oi4lA_Tzun0i0MciEG8YPQbERDWamAp19UtDXLfMCV2jQJFpKoT9SLoHD6e63H5Eh6uv7noJhxSNfMIL9KkNXSvWcL1T5G8owfKdV5tr-5fNQxm3JA9GF7bt23p4FDnE5QRVxAFX9wxujYejgeeeBPJC4PZ-BVX8R2Myv/w640-h246/palmervillemissing.jpg" width="640" /></a></div><br /><span style="font-family: 
verdana;"><br /></span><p></p><p><span style="font-family: verdana;">This is a summary including missing data from Palmerville as an example. Looking at minimum temps first, the initial data the BOM works with is raw, and minraw has 4301 missing temps. Then minv1 followed as the first set of adjustments, and now we have 4479 temps missing. So 178 temps went missing.</span></p><p><span style="font-family: verdana;">A few years later more tweaks are on the way, thousands of them, and in version minv2 we now have 3908 temps missing, so 571 temps have been imputed or infilled. </span></p><p><span style="font-family: verdana;">A few more years later technology has sufficiently advanced for BOM to bring out a newfangled version, minv2.1, and now we have 3546 temps missing -- a net gain of 362 temps that have been imputed. By version minv22 there are 3571 missing values, so 25 more go missing.</span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">Max values tell similar stories, as do other temperature time series. Sometimes temps get added in with data imputation, sometimes they are taken out. You would think that if you are going to use advanced techniques for data imputation you would do all the missing values, so why only do some? Likewise, why delete specific values from version to version?</span></p><p><span style="font-family: verdana;">It's almost as if the missing/added in values <i>help</i> the hypothesis.</span></p><p><span style="font-family: verdana;"><br /></span></p><p><br /></p><p><span style="font-family: verdana;">Below -- Let's stay with Palmerville for August. All the Augusts from 1910 to 2020. For this we will use the most basic of all data analysis graphs, the good old scatterplot. 
This is a data display that shows the relationship between two numerical variables.</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHybJCDuLsBR8sfhcZXGC3wHTQoCanqaUaYoDJZevhrUOf3dZ-lt9us8ssoEEEGhVQBnOgff79w-4SbbFUHpWy9FrHk4DsrV03OWomBMOvp6v19rWlBfuCVxXgREKZQ5viVY5-weWrO8BXgPhKNqnWOiMmef5GpJpHRhCH9t1hxAoOQxeoZlU5pqHd/s1201/palmerAUGminv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="976" data-original-width="1201" height="520" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHybJCDuLsBR8sfhcZXGC3wHTQoCanqaUaYoDJZevhrUOf3dZ-lt9us8ssoEEEGhVQBnOgff79w-4SbbFUHpWy9FrHk4DsrV03OWomBMOvp6v19rWlBfuCVxXgREKZQ5viVY5-weWrO8BXgPhKNqnWOiMmef5GpJpHRhCH9t1hxAoOQxeoZlU5pqHd/w640-h520/palmerAUGminv22.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span><p></p><p><span style="font-family: verdana;">Above -- This is a complete data view of the entire time series, minraw and minv22. Raw came first (bottom, in red) so this is our reference. There is clustering at the ends of the raw graph as well as missing values around 1936 or so, and even at 2000 you see horizontal gaps where decimal values have disappeared, so you only get whole integer temps such as 15C, 16C and so on. </span></p><p><span style="font-family: verdana;">But minv22 is incredibly bad -- look at the long horizontal "gutters" or corridors that exist from the 1940s to 2000 or so. There are complete <b>temperature ranges</b> that are missing, so 14.1, 14.2 and 14.3, for example, might be missing for 60 years or so. It turns out that these "gutters" or missing temperature ranges were added in! 
Raw has been adjusted 4 times with 4 versions of state of the art BOM software and this is the result - a worse outcome.</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkJyR_CxVNclLFns6GX-kkOQMXiMcjfj5amhuYUZN0zDJleY2IJYzAViMf61JZAxqJC584TB2v49h1WARo12yQxXFCGbJIkJ221_OqCHBMFNDajjCBwP6X53ssZttDC60hqox35UasLnBsBypLs9DLkumRQ1m16aENReXJNkU1dDLTaTa7by6Wby4v/s1201/palmerJANminv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="976" data-original-width="1201" height="520" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkJyR_CxVNclLFns6GX-kkOQMXiMcjfj5amhuYUZN0zDJleY2IJYzAViMf61JZAxqJC584TB2v49h1WARo12yQxXFCGbJIkJ221_OqCHBMFNDajjCBwP6X53ssZttDC60hqox35UasLnBsBypLs9DLkumRQ1m16aENReXJNkU1dDLTaTa7by6Wby4v/w640-h520/palmerJANminv22.jpg" width="640" /></a></div><br /><span style="font-family: verdana;">January has no clean data, massive "corridors" of missing temperature ranges until 2005 or so. No predictive value here. Raw has a couple of clusters at the ends, but this is useless for the stated BOM goal of observing trends. 
Again, the data is worse after the adjustments.</span><p></p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRDGSl3ZuguiYqqUMNgpzSOf3F88WKIR7bX-i0CIpCtAb9l3IDoXXu-u4U28KdLuGMHeZMCbMuLUhIRatA372Z9UytEKzw_OMZVqJOrIxdVB9gJYvRygwko0tPfj_O37cA_6l13bJZmXX4CDdFptmfT9ALzmYBk0AZSMurhtfltREeKAmlAC5HUJBQ/s987/palmerMARmaxv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="875" data-original-width="987" height="568" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRDGSl3ZuguiYqqUMNgpzSOf3F88WKIR7bX-i0CIpCtAb9l3IDoXXu-u4U28KdLuGMHeZMCbMuLUhIRatA372Z9UytEKzw_OMZVqJOrIxdVB9gJYvRygwko0tPfj_O37cA_6l13bJZmXX4CDdFptmfT9ALzmYBk0AZSMurhtfltREeKAmlAC5HUJBQ/w640-h568/palmerMARmaxv22.jpg" width="640" /></a></div><br /><p><span style="font-family: verdana;">March data is worse after adjustments too. They had a real problem with temperature from around 1998-2005.</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><br /></div><p></p><p><span style="font-family: verdana;">Below -- Look at <i>before and after</i> adjustments. 
This is very bad data handling procedures and it's not random, so don't expect this kind of manipulation to cancel out.</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqhzUGPb3-w4pus7IAq9t97jS-5KdoLyjUvvhNDsHMzkrqPishm002X1IH6_Xu_K3oz2Hav1JWH7wKPcsJqp0tYuHztV7E5NC58Y6juXlDO05p5C1sJas5Lv6EKjqwmpFFFzYT0BDy2dPCbTpYnX49ZlXOaF-r0Ofgls9qJkTjTF8ao4qCwTkFSqyA/s1261/palmerSEPTmaxv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="896" data-original-width="1261" height="454" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqhzUGPb3-w4pus7IAq9t97jS-5KdoLyjUvvhNDsHMzkrqPishm002X1IH6_Xu_K3oz2Hav1JWH7wKPcsJqp0tYuHztV7E5NC58Y6juXlDO05p5C1sJas5Lv6EKjqwmpFFFzYT0BDy2dPCbTpYnX49ZlXOaF-r0Ofgls9qJkTjTF8ao4qCwTkFSqyA/w640-h454/palmerSEPTmaxv22.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /><br /></span><h3 style="text-align: left;"><span style="font-family: verdana;">More Decimal Drama:<br /></span><span style="font-weight: normal;"><span style="font-family: verdana; font-size: small;">You can clearly see decimals problems in this histogram. The highest dots represent the most frequently occurring temperatures and they all end in decimal zero. 
This is from 2000-2020.</span></span></h3><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirstnxXmORN0qVOc2uuyazaz3NQ6DK2xPWwgvB2j9E5yJkRg_gxS9sjJ0EvnYV94dyrdndLiKYsYT3iIKtH2PlSpdWTciLOR6wf7iztsIHiYGQvTW6X5o39SS4edbDhMjN48vmG752PWSm9T2RWayfQ49oC45tMiLDfL2J6SHtEDfATZHYZqjZfhPk/s1326/spikedecimal2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="770" data-original-width="1326" height="372" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirstnxXmORN0qVOc2uuyazaz3NQ6DK2xPWwgvB2j9E5yJkRg_gxS9sjJ0EvnYV94dyrdndLiKYsYT3iIKtH2PlSpdWTciLOR6wf7iztsIHiYGQvTW6X5o39SS4edbDhMjN48vmG752PWSm9T2RWayfQ49oC45tMiLDfL2J6SHtEDfATZHYZqjZfhPk/w640-h372/spikedecimal2.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div><br /></div><h3 style="text-align: left;"><span style="font-family: verdana;">Should BOM Adjustments Cause Missing Temperature Ranges?</span></h3></div><div><span style="font-family: verdana;">Data that disappears when it is being purportedly "corrected" or "aligned" to the network is called biased data. When complete temperature ranges disappear for 60 years or more, it is biased. Data that is missing NOT at random is biased. Data that has different mean values depending whether it is Tuesday or Friday or Sunday is biased. 
Data that is infilled or imputed and creates outliers is biased.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Here is the University Of Stockholm's adjustment statement:</span></div><div><div><span style="font-family: verdana;"><br /></span></div><div><i><span style="font-family: verdana;">"18700111-20121231<span style="white-space: pre;"> </span>Correction for urban heat island trend and other inhomogeneities. This gives an average adjustment by -0.3 C both May and August and -0.7 C for </span><span style="font-family: verdana;">June and July. This adjustment is in agreement with conclusions drawn by Moberg </span><span style="font-family: verdana;">et al. (2003), but have been determined on an ad hoc basis rather than from a </span><span style="font-family: verdana;">strict statistical analysis."</span></i></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">And here is what the Swedish Scatterplot with Raw and Homogenised data looks like:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiubmsHcIIgwUTZj-Dby3EmazOpxxuSgiRfeiNY_WNO_xeUS8NTFK8SIK2fDCtbKkvyNeVp1HGn3-0v-6kUGlGoPE5yZFyzdo5VW8livGA9zb3ts_bmEAKa9CjzT0wphcchuKPNqV_WxheBuohGPkoC9E4ecWKcI3TbzgqbzO3x-_DpuLYB1sa9K-ZR/s776/swedenhomog.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="776" height="604" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiubmsHcIIgwUTZj-Dby3EmazOpxxuSgiRfeiNY_WNO_xeUS8NTFK8SIK2fDCtbKkvyNeVp1HGn3-0v-6kUGlGoPE5yZFyzdo5VW8livGA9zb3ts_bmEAKa9CjzT0wphcchuKPNqV_WxheBuohGPkoC9E4ecWKcI3TbzgqbzO3x-_DpuLYB1sa9K-ZR/w640-h604/swedenhomog.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><b>The data is all still there after adjustment,</b> just the adjusted months were "slightly lowered" indicating a cooler temperature adjustment.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><b>Decimals have not gone missing in action and complete temperature ranges have not been altered or deleted.</b></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Below -- How about decimals after adjustment: BOM has such a problem with decimals, surely there has been an effect in Sweden? Five temperatures went up a bit, five went down. This is exactly what you would expect. 
</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0ET2I_pACDKAnOi0lfeR-HuaIcyavN2Krm3QDvhBgnXxoIs8f2UdGjwNjIHz9HocHeIt20B_mP69lY6mAvQsC9Vc4aSyHsxkAgC2sN9MVtHpQD_PBGKrWOa3j7zCrewy1CvMwe94mUp8FMHHot5rQ_mMNvP4FW8bN21Wj2MNgAvDgSQ8KfJ5cIrR3/s818/rawhomogdecimals.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="818" height="572" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0ET2I_pACDKAnOi0lfeR-HuaIcyavN2Krm3QDvhBgnXxoIs8f2UdGjwNjIHz9HocHeIt20B_mP69lY6mAvQsC9Vc4aSyHsxkAgC2sN9MVtHpQD_PBGKrWOa3j7zCrewy1CvMwe94mUp8FMHHot5rQ_mMNvP4FW8bN21Wj2MNgAvDgSQ8KfJ5cIrR3/w640-h572/rawhomogdecimals.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><br /></div><span style="font-family: verdana;"><br /><br /></span><p></p><p></p><h3 style="text-align: left;"><span style="font-family: verdana;">Sunday at Nhill = Missing Data NOT At Random<br /></span><span style="font-weight: normal;"><span style="font-family: verdana; font-size: small;"><i>A bias is created with missing data not at random.</i></span></span>(<a href="https://sites.google.com/site/drhuiliew/missing-data/missing-not-at-random" style="font-family: verdana;">link</a><span style="font-family: verdana;">).</span></h3><p></p><p><span style="font-family: verdana;">Below - Nhill on a <b>Saturday</b> has a big chunk of data missing in both raw and adjusted.</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIhpjQ3SXlGfFySpM8_VbwZtJPspoOzBOLfRemRHp0OgWOtLOqYc6ONFzMZfqZHehJWxoy2hWA3NTkApExkLXqwxWUj_xl4BXZX82rlQgQVJv0BVeOKgu_eLUHZ63kB1MX1KoAFgABlEv7-L18znzJ-tp74zcVZBKnI_3wjE1tlkVbf00-HH5iMkuR/s1447/nhillmaxv22SAT.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="850" data-original-width="1447" height="376" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIhpjQ3SXlGfFySpM8_VbwZtJPspoOzBOLfRemRHp0OgWOtLOqYc6ONFzMZfqZHehJWxoy2hWA3NTkApExkLXqwxWUj_xl4BXZX82rlQgQVJv0BVeOKgu_eLUHZ63kB1MX1KoAFgABlEv7-L18znzJ-tp74zcVZBKnI_3wjE1tlkVbf00-HH5iMkuR/w640-h376/nhillmaxv22SAT.jpg" width="640" /></a></div><span style="font-family: verdana;"><p><br /></p><p>Below: Now watch this trick -- my hands dont leave my arms -- it becomes <b>Sunday</b>, and voila -- thousands of raw temperatures now exist, but adjusted data is still missing.</p><p><br /></p></span><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoNlWW97fGrmeWHTqEgk_iH79GRSW633TwA-rsAipXK3gKcwgT4c9HuR-kWumd582P7p7gWaiCgEgAx6T7ZpxwfQPkjlfT8NDFJgAkzk9UdOy2VX6IAPFymmMr_7MXvTo2jj7Fj-u90ogAwLFV3NCdnL6rfiC-w4fMrTMEmkoKRH0i8ZEk0WryplhR/s1447/nhillSUNDAY.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="850" data-original-width="1447" height="376" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoNlWW97fGrmeWHTqEgk_iH79GRSW633TwA-rsAipXK3gKcwgT4c9HuR-kWumd582P7p7gWaiCgEgAx6T7ZpxwfQPkjlfT8NDFJgAkzk9UdOy2VX6IAPFymmMr_7MXvTo2jj7Fj-u90ogAwLFV3NCdnL6rfiC-w4fMrTMEmkoKRH0i8ZEk0WryplhR/w640-h376/nhillSUNDAY.jpg" width="640" /></a></div><br /><br /><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Below -- You want more, I hear. 
Now it's <b>Monday</b>, and voila -- now thousands of adjusted temperatures appear!</span></div><div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP6VrPGW8JlrZCYGSSAMoKALEPy8pRGECyidAQZP3xIeZweAbBqJYkwb3Esjhey3zo_btJrZTm-Vac4bwtGfXVqWgCjrZE4SMe0e_BkrEYZvu_LmW14k4zoX-YkUK-v-0wWohecsi2cG_MtO_bnq7eWsgyi9-q0CSB5per_6seYzaqvheQmqD5WQNG/s785/monday2friday.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="785" data-original-width="778" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP6VrPGW8JlrZCYGSSAMoKALEPy8pRGECyidAQZP3xIeZweAbBqJYkwb3Esjhey3zo_btJrZTm-Vac4bwtGfXVqWgCjrZE4SMe0e_BkrEYZvu_LmW14k4zoX-YkUK-v-0wWohecsi2cG_MtO_bnq7eWsgyi9-q0CSB5per_6seYzaqvheQmqD5WQNG/w634-h640/monday2friday.jpg" width="634" /></a></div><br /><span style="font-family: verdana;">The temperatures all reappear!<br /></span><p><br /></p><p><span style="font-family: verdana;">I know, you want to see more:</span></p><p><span style="font-family: verdana;">Below -- Mildura data Missing NOT At Random</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHOFnz2YrbYCLT9wbMbjCciNwl9M7FwM3zxcfaxm-4ybNQPPmBblvSMXBt6skLhhXpGARLWU0PBuyHwEmL1T8P0XMNhFtM9grs46NLuT0C5iN6sgPJKnaTE4hqV3eZJDdYXkxMD2W4HcX2DWrTfkEGbcaNwDGiMS-HbAGwWv6CU_ttw5ePRA2rhLzl/s820/milduraFRIDAY.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="820" data-original-width="781" height="640" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHOFnz2YrbYCLT9wbMbjCciNwl9M7FwM3zxcfaxm-4ybNQPPmBblvSMXBt6skLhhXpGARLWU0PBuyHwEmL1T8P0XMNhFtM9grs46NLuT0C5iN6sgPJKnaTE4hqV3eZJDdYXkxMD2W4HcX2DWrTfkEGbcaNwDGiMS-HbAGwWv6CU_ttw5ePRA2rhLzl/w610-h640/milduraFRIDAY.jpg" width="610" /></a></div><br /><span style="font-family: verdana;">Above -- Mildura on Friday with raw has a slice of missing data at around 1947, which is imputed in the adjusted data.</span><p></p><p><i style="font-family: verdana;"><br /></i></p><p><span style="font-family: verdana;">Below -- Mildura on a Sunday:</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRj01_IsDC5p8L9LgUXaZOLwtw7piUnN2IRE-_TCD8-F0J5cAlnDDSl5zJ_q7fOGOsJaa9-3fNeY2SB4dyF3W7UisCa36Y-XJzpw5TmcY775jOY7WY050tQyWbG7zcO8mU01rIK4xoFXT4lB4-rVJO_JFQT0jLO4QTDTcIihE3iJ5fvQ1JK9VUZxve/s819/milduraSUNDAY.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="819" data-original-width="776" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRj01_IsDC5p8L9LgUXaZOLwtw7piUnN2IRE-_TCD8-F0J5cAlnDDSl5zJ_q7fOGOsJaa9-3fNeY2SB4dyF3W7UisCa36Y-XJzpw5TmcY775jOY7WY050tQyWbG7zcO8mU01rIK4xoFXT4lB4-rVJO_JFQT0jLO4QTDTcIihE3iJ5fvQ1JK9VUZxve/w606-h640/milduraSUNDAY.jpg" width="606" /></a></div><br /><p><span style="font-family: verdana;">Above - The case of the disappearing temperatures, raw and adjusted, around twenty years of data.</span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">Below -- Mildura on a Monday:</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkkv0jyfFJHtTl0JBUdluMZ_-OVwW8Eb2yOaP9NzAXMdSb4B4zZjJ3EYvIcpAXlnW8A9XIIzRfWolZwMWEFq-ks5U91S2eRh3qB0xdL63iedMrN1wmaZyNchXP7nKC1IDO4YNql6h8wKLUXSLKWljL-_g6lJS4q9DGQmDI5vOy3Pz14yGGRkGFS4BL/s823/milduraMONDAY.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="823" data-original-width="743" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkkv0jyfFJHtTl0JBUdluMZ_-OVwW8Eb2yOaP9NzAXMdSb4B4zZjJ3EYvIcpAXlnW8A9XIIzRfWolZwMWEFq-ks5U91S2eRh3qB0xdL63iedMrN1wmaZyNchXP7nKC1IDO4YNql6h8wKLUXSLKWljL-_g6lJS4q9DGQmDI5vOy3Pz14yGGRkGFS4BL/w578-h640/milduraMONDAY.jpg" width="578" /></a></div><br /><p><span style="font-family: verdana;">Above - On Monday, a big chunk disappears in adjusted data, <b>but strangely the thin stripe at 1947 missing data in raw is filled in at the same location at minv22.</b></span></p><div class="separator" style="clear: both; text-align: center;"><br /></div><p><span style="font-family: verdana;">Even major centres like Sydney get affected with missing temperature ranges over virtually the entire time series up to around 2000:</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhS_qdYFSxDdCItSXtC_n9wjuIMTlPw6632xZp3f-S9AmWIKEGyu_gZuBxkejaTpsDN1yQQbeJksib6IOcnegtRitn0DweN7GSsESI3Bl8WU1NR_A6PlOrW4O7nYHgVlC0WQLPbd8LxqLfIh2QTodqCmua52i6HkBC_vXLEspzDeFAbUd9ECV3qtj5E/s1038/sydneyJANmissingranges.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="1038" height="558" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhS_qdYFSxDdCItSXtC_n9wjuIMTlPw6632xZp3f-S9AmWIKEGyu_gZuBxkejaTpsDN1yQQbeJksib6IOcnegtRitn0DweN7GSsESI3Bl8WU1NR_A6PlOrW4O7nYHgVlC0WQLPbd8LxqLfIh2QTodqCmua52i6HkBC_vXLEspzDeFAbUd9ECV3qtj5E/w640-h558/sydneyJANmissingranges.jpg" width="640" /></a></div><br /><i 
style="font-family: verdana;"><br /></i><p></p><p><span style="font-family: verdana;">Below -- The missing data forming these gashes is easily seen in a histogram too. Below is November in Sydney with a histogram and scatterplot showing that you can get 60-100 years with some temps virtually <b>never</b> appearing!</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk5lTMcr8zUydU965Bhtoz92L-Uy_zM0YUhRPxPPTG6sEZdKwVEFnlZ8JyUaO_JC-4BQZJoO_kAbH6YH8WCj3Uww7J8_t6gQzSoIGeC_YLZSsazBMz93-HBaZISsXL-tt1TflFxV2UG0oh0R0sRBUYjKre85grR2JOAhxmWPkmXidEdCwfhTjRIT1k/s794/sydneyNOVmissingRangesHISTOGRAM.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="752" data-original-width="794" height="606" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk5lTMcr8zUydU965Bhtoz92L-Uy_zM0YUhRPxPPTG6sEZdKwVEFnlZ8JyUaO_JC-4BQZJoO_kAbH6YH8WCj3Uww7J8_t6gQzSoIGeC_YLZSsazBMz93-HBaZISsXL-tt1TflFxV2UG0oh0R0sRBUYjKre85grR2JOAhxmWPkmXidEdCwfhTjRIT1k/w640-h606/sydneyNOVmissingRangesHISTOGRAM.jpg" width="640" /></a></div><br /><i style="font-family: verdana;"><br /></i><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCcyIqFoHH_eL4a8LuZNIJirvApAedmh8nSDMHoVsUXfcBe1AAMwuqasCABXYtmzQgfN1i-NrtPG5JCfDwsYB4s-a2NVpHXG26EwKwId7Q87qFNCvMDwMSJF1GSgNzdZyRkPP1ZktSXwcrm3ab6vifKKjtV0A-Wb4vdcZeb0ickc4sLgFJvJXpYmRN/s1434/sydneyNOVscatterplot.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="966" data-original-width="1434" height="432" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCcyIqFoHH_eL4a8LuZNIJirvApAedmh8nSDMHoVsUXfcBe1AAMwuqasCABXYtmzQgfN1i-NrtPG5JCfDwsYB4s-a2NVpHXG26EwKwId7Q87qFNCvMDwMSJF1GSgNzdZyRkPP1ZktSXwcrm3ab6vifKKjtV0A-Wb4vdcZeb0ickc4sLgFJvJXpYmRN/w640-h432/sydneyNOVscatterplot.jpg" width="640" /></a></div><br 
/><p><span style="font-family: verdana;">More problems with Sydney data. My last posts showed two and a half months of data that were copied and pasted into different years.</span></p><p><span style="font-family: verdana;">This kind of data handling is indicative of many other problems of bias. </span></p><p><span style="font-family: verdana;"><br /></span></p><p><span style="font-family: verdana;">Sydney Day-Of-Week effect</span></p><p><span style="font-family: verdana;">Taking all the September months in the Sydney time series from 1910 to 2020 shows that Friday temperatures differ significantly from those on Sunday and Monday.</span></p><p><span style="font-family: verdana;"> The odds against seeing this by chance are more than 1000 to 1:</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAKsM5s9QL2SXGUTXjXJRmNgB9Qxsx7fQdx2P7SzLGB96wg1whkMMRkjQjR4ccUcJUjMLkMX1b8cE-87b1XZ-_PFK66dD5O2qy1qhwL6DqstnTi5ch-leEsBKWNQ7c_6-Yl9ttOBLCMBwfxJAANTRDZYiRjdaD0dqhT-31w0eq4wrCDjMoA0L-HiS7/s1008/sydneySEPTdowminv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1008" data-original-width="900" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAKsM5s9QL2SXGUTXjXJRmNgB9Qxsx7fQdx2P7SzLGB96wg1whkMMRkjQjR4ccUcJUjMLkMX1b8cE-87b1XZ-_PFK66dD5O2qy1qhwL6DqstnTi5ch-leEsBKWNQ7c_6-Yl9ttOBLCMBwfxJAANTRDZYiRjdaD0dqhT-31w0eq4wrCDjMoA0L-HiS7/w572-h640/sydneySEPTdowminv22.jpg" width="572" /></a></div><br /><span style="font-family: verdana;"><br /></span><p></p><p><span style="font-family: verdana;">Saturday is also warmer than Thursday in December, and the difference is highly significant.</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidr6UnH3EGaTUn_5O5fOYu9quqmEH5yqe9wR7wBIx6G9o6SGclh0OBHqUI8UNaRmL7Yftod2g3o_Itp9sC8r88FZujKQbrnL-5oyHIvtTojW0lg92CTXaCTUxKWbTMTRSJEp8N_94kxfkp6EAinPF7XctFM_fle3FnkfEKSah8PC0cXeyxOQVM1SvW/s1008/sydneyDECdowmaxv22.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1008" data-original-width="900" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidr6UnH3EGaTUn_5O5fOYu9quqmEH5yqe9wR7wBIx6G9o6SGclh0OBHqUI8UNaRmL7Yftod2g3o_Itp9sC8r88FZujKQbrnL-5oyHIvtTojW0lg92CTXaCTUxKWbTMTRSJEp8N_94kxfkp6EAinPF7XctFM_fle3FnkfEKSah8PC0cXeyxOQVM1SvW/w572-h640/sydneyDECdowmaxv22.jpg" width="572" /></a></div><br /><i style="font-family: verdana;"><br /></i><p></p><div class="separator" style="clear: both; text-align: center;"><br /><br /></div><h3 style="text-align: left;"><span style="font-family: verdana;">Never On A Sunday.</span></h3><p><span style="font-family: verdana;">Moree is one of the best <i>worst</i> stations. <b>It doesn't disappoint with a third of the time series disappearing on a Sunday!</b> But first Monday to Saturday:</span></p><p><span style="font-family: verdana;">Below -- Moree on a Monday to Saturday looks like this. Forty odd years of data is deleted going from raw to minv1, then it reappears again in versions minv2, minv21 and minv22. 
</span></p><p><span style="font-family: verdana;"><br /></span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmW4XGRaJkkOjD_Q8TITrso9f33aBwmrtgWF9Gv74StUS6SqhEVmnaCF5nVc19ZZi7H4b8k00wQoaDycXyWnuK8KDtEbg8BdfIpfEmNXbnCDcV__UeDCkvEuIBsyUR5e3B3q4qTXVy80eLZZmmXkKzX2Px-McmjNiw8ez5HXrmyYKTvVyDVDzM7DZz/s675/moreeMonday.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="632" data-original-width="675" height="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmW4XGRaJkkOjD_Q8TITrso9f33aBwmrtgWF9Gv74StUS6SqhEVmnaCF5nVc19ZZi7H4b8k00wQoaDycXyWnuK8KDtEbg8BdfIpfEmNXbnCDcV__UeDCkvEuIBsyUR5e3B3q4qTXVy80eLZZmmXkKzX2Px-McmjNiw8ez5HXrmyYKTvVyDVDzM7DZz/w640-h600/moreeMonday.jpg" width="640" /></a></div><p></p><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><span style="font-family: verdana;">Below -- But then Sunday in Moree happens, and a third of the data disappears! (except for a few odd values). 
</span></div><div><span style="font-family: verdana;"><br /></span><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTWF0lOs5tu8s7ZWfAQoYaNWB0v6B8iHhAZEgLewhs6HCj4qeZnIVWQTkVwFUZKxot8rcjtBJAUcE04UThDGIFUKzWfB44V-5lU_N0M1weelB9kU12ZJuUK42Ad5X2TWfznFv5hZte6BJhlRQyFFqZQlRgHVIVJgCB4v-GHNxJ-mRfOUWccXfoj90O/s672/moreesunday2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="670" data-original-width="672" height="638" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTWF0lOs5tu8s7ZWfAQoYaNWB0v6B8iHhAZEgLewhs6HCj4qeZnIVWQTkVwFUZKxot8rcjtBJAUcE04UThDGIFUKzWfB44V-5lU_N0M1weelB9kU12ZJuUK42Ad5X2TWfznFv5hZte6BJhlRQyFFqZQlRgHVIVJgCB4v-GHNxJ-mRfOUWccXfoj90O/w640-h638/moreesunday2.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><b><br /></b></span></div><div><span style="font-family: verdana;"><b>A third of the time series goes missing on Sunday!</b> It seems the Greek comedy film <i>Never On A Sunday</i>, in which the prostitute Ilya attempts to relax Homer (but never on a Sunday), has rubbed off on Moree.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">Adjustments create duplicate sequences of data</span></h3><div><span style="font-family: verdana;">Below -- Sydney shows how duplicates are created with adjustments:</span></div><div><span style="font-family: verdana;"><br />The duplicated data is created by the BOM with their state-of-the-art adjustment software; they seem to forget that <i>this is supposed to be observational data</i>. 
Different raw values turn into a sequence of duplicated values in maxv22!</span></div><div><span style="font-family: verdana;"><br /></span></div><div><br /></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihQJEd8B8DlkB7XuSaI6yy5A95XcKq3aGdJW5jI8Qva6nY0nnXLh3V5z_x_-OT2IZ-HD0A_WSJcuyVZ4vGGtTOTlaKFERQuU84b1p9LgDuINkwdrGOJF2D7EEFQ2V64oi66kVuK5AAet-k88ZpwsQXJ6nJuKecledz7qx2kg2bFIoBleRzhVdzDinP/s894/ssydmaxadjust.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="601" data-original-width="894" height="430" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihQJEd8B8DlkB7XuSaI6yy5A95XcKq3aGdJW5jI8Qva6nY0nnXLh3V5z_x_-OT2IZ-HD0A_WSJcuyVZ4vGGtTOTlaKFERQuU84b1p9LgDuINkwdrGOJF2D7EEFQ2V64oi66kVuK5AAet-k88ZpwsQXJ6nJuKecledz7qx2kg2bFIoBleRzhVdzDinP/w640-h430/ssydmaxadjust.png" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><br /></div><div><br /></div><h3 style="text-align: left;"><span style="font-family: verdana;">Real Time Data Fiddling In Action:</span></h3><div><span style="font-family: verdana;">Duplicate sequences go up and down in value... then a single value disappears!</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQpRjG180x8ZDW_2aZw8OGp7_ghS0103XrYc_q3o4reqYJcac0bgswhAeMII9EQv4BdRfvgk7Clrq-75ONBGF5U9IRh0o3lV1ihTAy1mo-vBz7FlUwi-y5cL4UFEcpzeaeiW6PWJnB6IQPjquwbJM58XNWBbXIhXoBpGMgMq7kRKu8rsjlAdRo6rCn/s1490/mildurafakenumbers.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1020" data-original-width="1490" height="438" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQpRjG180x8ZDW_2aZw8OGp7_ghS0103XrYc_q3o4reqYJcac0bgswhAeMII9EQv4BdRfvgk7Clrq-75ONBGF5U9IRh0o3lV1ihTAy1mo-vBz7FlUwi-y5cL4UFEcpzeaeiW6PWJnB6IQPjquwbJM58XNWBbXIhXoBpGMgMq7kRKu8rsjlAdRo6rCn/w640-h438/mildurafakenumbers.png" width="640" /></a></div><br /><span style="font-family: verdana;">Maxraw (above) has a run of 6 temperatures at 14.4 (there are other runs above it, but for now we look at this one). At version minv1 the sequence is faithfully copied. At version minv2 the duplicate sequence changes by 0.2 (still duplicates, though) <b>and a value is dropped on Sunday 18</b>. By version minv21 the "lost value" is still lost and the duplicate sequence goes down in value by 0.1, then it goes up by 0.3 in version minv22. So that single solitary value on Sunday 18 becomes a missing value. </span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Duplicate sequences abound: looking up the series (above) you see more duplicate runs, and indeed this carries on above the snapshot. 
Many of the sequences have what appear to be made-up or fabricated numbers with short runs and an odd value appearing or disappearing in between.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">A Sly Way Of Warming:</span></h3><div><span style="font-family: verdana;">Last two examples from Palmerville, one showing a devious way of warming by copying from March and pasting into May!</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjneqTCoSMGsBkl5VqK75qi_cldmmsQ0-j8a_cscK0oxgAp_97kfRh88PNXK1v95EwMKYQMFRKOYOiymwZMOq4nZXCUSw0pt6eLZx4rhZoFCUxxEEYvmYE04zE32YIR-OlLyU0KeDdcrg8MctJLlBiPW0Il7jDwqgR3LIPBvmx6X20UtEKAqLIkMOza/s832/palmervilledupseq.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="819" data-original-width="832" height="630" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjneqTCoSMGsBkl5VqK75qi_cldmmsQ0-j8a_cscK0oxgAp_97kfRh88PNXK1v95EwMKYQMFRKOYOiymwZMOq4nZXCUSw0pt6eLZx4rhZoFCUxxEEYvmYE04zE32YIR-OlLyU0KeDdcrg8MctJLlBiPW0Il7jDwqgR3LIPBvmx6X20UtEKAqLIkMOza/w640-h630/palmervilledupseq.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgIxjpYoBqxMPMEZVJAFL3ueyWfN-8N-awVJuXVj9JVIOBFqE68MzYvNeTaPLiHqX00a8ZW10J4I1yJULxCYxN_IjnOekKsDyl5ro1T11fX6ZpZD4KQ2ZRXkjTq4_Y1MfUDtVZEAHToEdWw6KNna9eP_Ld_R7ye0P2okDQZjKCIyucR_ovkEBw7tTq/s704/palmervilledupseq17.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="704" data-original-width="668" height="640" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgIxjpYoBqxMPMEZVJAFL3ueyWfN-8N-awVJuXVj9JVIOBFqE68MzYvNeTaPLiHqX00a8ZW10J4I1yJULxCYxN_IjnOekKsDyl5ro1T11fX6ZpZD4KQ2ZRXkjTq4_Y1MfUDtVZEAHToEdWw6KNna9eP_Ld_R7ye0P2okDQZjKCIyucR_ovkEBw7tTq/w608-h640/palmervilledupseq17.jpg" width="608" /></a></div><br /></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i><div>"Watch out for unnatural groupings of data.</div><div>In a fervent quest for publishable theories -- no</div><div>matter how implausible -- it is tempting to tweak the data to provide</div><div>more support for the theory and it is natural to not look too closely if</div><div>a statistical test gives the hoped-for answer."</div><div><span><br /> -- </span>Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics, Gary Smith.</div></i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i><div>"In biased research of this kind, researchers do not objectively seek the</div><div>truth, whatever it may turn out to be, but rather seek to prove the truth of what they already know to be true or what needs to be true to support activism for a noble cause (Nickerson, 1998)."<span> -- </span>Circular Reasoning In Climate Change Research, <i>Jamal Munshi</i></div></i></span></div><div><br /></div><div><br /></div><div><br /></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">The Quality Of BOM Raw Data</span></h3><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">We shouldn't be talking about raw data, because it's a misleading concept...</span></div><div><span style="font-family: verdana;"><br 
/></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><div><i>"Reference to Raw is in itself a misleading concept as it often implies</i></div><div><i>some pre-adjustment dataset which might be taken as a pure</i></div><div><i>recording at a single station location. For two thirds of the ACORN SAT</i></div><div><i>there is no raw temperature series but rather a composited series<br />taken from two or more stations."<span> -- the BOM</span></i></div><div><i><span><br /></span></i></div></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i>"Homogenization does not increase the accuracy of the data - it can be no higher than the accuracy of the observations."</i> </span><span style="font-family: verdana;">(M. Syrakova, V. Mateev, 2009)</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">So what do these composites look like? 
A simple thing most data analysts do at the data exploratory stages is look at the distribution with a histogram.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Here's Inverell:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCWX57BZDHwArejl_6hzujHbbMmFPDIQWVaMBZrx_jtLyYflwH50_Xy4FEiqGnk6Vdc1hJW-5yMYhz6kerACORGkZ9BFFkl4xM8eU59oMaUMfyk7qsMyTUxpKiI04GD9u4QBNlco2s8gymjBKGOlW51pljMK0LRDQ6bnNnT_7TdVOlwGGrf_9HHITh/s794/inverellhisto.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="794" height="590" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCWX57BZDHwArejl_6hzujHbbMmFPDIQWVaMBZrx_jtLyYflwH50_Xy4FEiqGnk6Vdc1hJW-5yMYhz6kerACORGkZ9BFFkl4xM8eU59oMaUMfyk7qsMyTUxpKiI04GD9u4QBNlco2s8gymjBKGOlW51pljMK0LRDQ6bnNnT_7TdVOlwGGrf_9HHITh/w640-h590/inverellhisto.jpg" width="640" /></a></div><br /><span style="font-family: verdana;">You don't have to be a data scientist to see that this is a problem, there appear to be two histograms superimposed. We are looking at maxraw on the X axis and frequency or occurrences on the Y axis.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">This tells us how often each temperature appeared. The gaps show problems in the decimal use of data, some temps appear a lot, then 5 don't appear often, then one appears a lot, then four don't appear often. We have a high, 5 low, high, 4 low sequence. 
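<br /><br />As a minimal sketch of how such a rhythm could arise, assume (and this is only an assumption for illustration, not something taken from BOM documentation) that readings were first recorded in whole degrees Fahrenheit and later converted to Celsius at 0.1 precision. The Python below uses made-up function names and a hypothetical temperature range; it lists how many tenths of a degree go unused between consecutive converted values, and which final decimals can occur at all.

```python
from collections import Counter


def whole_fahrenheit_to_celsius_tenths(f_whole):
    """Convert a whole-degree Fahrenheit reading to Celsius, in integer tenths of a degree."""
    # (F - 32) * 5/9 is Celsius; scaling by 10 first lets us round straight to tenths.
    return round((f_whole - 32) * 50 / 9)


def unused_tenths_between(f_start, f_end):
    """Count the 0.1 C steps skipped between consecutive converted readings."""
    tenths = [whole_fahrenheit_to_celsius_tenths(f) for f in range(f_start, f_end + 1)]
    return [later - earlier - 1 for earlier, later in zip(tenths, tenths[1:])]


def final_decimal_counts(f_start, f_end):
    """Tally the digit after the decimal point across the converted readings."""
    return Counter(
        whole_fahrenheit_to_celsius_tenths(f) % 10 for f in range(f_start, f_end + 1)
    )


if __name__ == "__main__":
    # Hypothetical range: 32 F to 50 F, i.e. 0.0 C to 10.0 C.
    print(unused_tenths_between(32, 50))
    print(sorted(final_decimal_counts(32, 50).items()))
```

Under that assumption, consecutive converted values sit 0.6 or 0.5 degrees apart, so five or four tenths are skipped each time, and a final decimal of .5 never occurs. That does not show this is what happened at Inverell; it only shows that one simple conversion route produces the same high, 5 low, high, 4 low rhythm.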
</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKHfH4Rg5ElLS8Yt6XV51Ln17wqiaqtdGp-K9l2PDoxp2p9bdrgXwGTfRWCb0fHg9gTTeMmMN3b7Snr9fF9_xGAFbsvOiBzhKdOcIqpHBSrYxG8MIkV7zNtQ1IU-GGzApO6EbrCU-HLBglkD8fQl8aCYQlr8hf60IuibgErOtEkN6KRHgzi-VT7e8v/s721/tibbohisto.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="630" data-original-width="721" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKHfH4Rg5ElLS8Yt6XV51Ln17wqiaqtdGp-K9l2PDoxp2p9bdrgXwGTfRWCb0fHg9gTTeMmMN3b7Snr9fF9_xGAFbsvOiBzhKdOcIqpHBSrYxG8MIkV7zNtQ1IU-GGzApO6EbrCU-HLBglkD8fQl8aCYQlr8hf60IuibgErOtEkN6KRHgzi-VT7e8v/w640-h560/tibbohisto.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjYKTy317iiobDxYvqCGCkW4qoE2KkdNj93yPpIDBW65aUkua0_NX51xa9BaF9Osjerg6kQnNTgzgbNDT1KIJvmJnYprmoqH9bPgFi4LAn9XEGszOOmAh6bV8bZXdErp5z2HG8TakRqscKkaBKPrDGSp-dcPLja-8YGiQWadNPswqrZWJQ1UrE8i32/s675/walgethisto.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="568" data-original-width="675" height="538" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjYKTy317iiobDxYvqCGCkW4qoE2KkdNj93yPpIDBW65aUkua0_NX51xa9BaF9Osjerg6kQnNTgzgbNDT1KIJvmJnYprmoqH9bPgFi4LAn9XEGszOOmAh6bV8bZXdErp5z2HG8TakRqscKkaBKPrDGSp-dcPLja-8YGiQWadNPswqrZWJQ1UrE8i32/w640-h538/walgethisto.jpg" width="640" /></a></div><br /><span style="font-family: verdana;">These histograms cover the entire time series. 
Now we know decimalisation came in around the 70's, so this shouldn't happen with the more recent decades, correct?</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBL2NRjKqQMu95LU36tGYTJE7sDjKf17dvp9Tk4oVaE1oj_NRWf7vo3npJoaE2VvoFZ-y1mCepQFK7f8i5Uvi0Xrq5CiiZH9npYc-DMaBK93Lb5PTJqI6AGdC7yK0CUeAtRBNeAtYv5_LMilrNkwfsG4sr09yJizYAnp5bwkdBpei25VRBEFBKSPXC/s787/bathursthisto.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="787" height="596" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBL2NRjKqQMu95LU36tGYTJE7sDjKf17dvp9Tk4oVaE1oj_NRWf7vo3npJoaE2VvoFZ-y1mCepQFK7f8i5Uvi0Xrq5CiiZH9npYc-DMaBK93Lb5PTJqI6AGdC7yK0CUeAtRBNeAtYv5_LMilrNkwfsG4sr09yJizYAnp5bwkdBpei25VRBEFBKSPXC/w640-h596/bathursthisto.jpg" width="640" /></a></div><br /><span style="font-family: verdana;">Here we have what appears to be three histograms merged into one, and that is in the decade 2010-2020. 
Even at that late stage in the game, BOM is struggling to get clean data.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">In fact, the problem here is more than decimalisation: it is Double Rounding Imprecision, where Fahrenheit is rounded to the nearest 1 degree, then converted to Celsius and rounded to 0.1 precision, creating an excess of decimal 0.0's and a scarcity of 0.5's (in this example); <b>different rounding scenarios exist where different decimal scarcities and excesses were created in the same time series!</b></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The paper is:</span></div><div><span style="font-family: verdana;"><i>"Decoding The Precision Of Historical Temperature Observations"</i> -- Andrew Rhines, Martin P. Tingley, Karen A. McKinnon, Peter Huybers.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">These different double rounding scenarios put records in doubt in some cases (see the paper above).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">This is what it looks like, plotting decimal use per year by frequency:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidgVJXFgvJieBHhcbPUD5rxuY5GbjYperg552J-AwELSDSxEXGA7BAORZW-d6rqF5yNZ6FPaFlyfnEX6_wXXaWPRStpDS-Tfov-84UK7RlFydVZ_uxd4dJE312KwzHDxtUGKeNFPH1pBDou94qeXtuPBGzqiZHugsJljxL8oKG-8d7FRZWgTU8vERg/s782/yambadecimalsminraw.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="782" height="600" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidgVJXFgvJieBHhcbPUD5rxuY5GbjYperg552J-AwELSDSxEXGA7BAORZW-d6rqF5yNZ6FPaFlyfnEX6_wXXaWPRStpDS-Tfov-84UK7RlFydVZ_uxd4dJE312KwzHDxtUGKeNFPH1pBDou94qeXtuPBGzqiZHugsJljxL8oKG-8d7FRZWgTU8vERg/w640-h600/yambadecimalsminraw.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Certain decimals are used more (or less) in certain years and decades.</span></div><div><span style="font-family: verdana;">It's obvious that most stations have neither raw nor clean data, and that what they do have doesn't even look like observational data.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">Adjustments, Or Tweaking Temperatures To Increase Trends.</span></h3><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"> </span><i style="font-family: verdana;">"For example, imagine if a weather station in your suburb or town had to be moved because of a building development. There's a good chance the new location may be slightly warmer or colder than the previous. If we are to provide the community with the best estimate of the true long-term temperature trend at that location, it's important that we account for such changes. To do this, the Bureau and other major meteorological organisations such as NASA, the National Oceanic and Atmospheric Administration and the UK Met Office use a scientific process called homogenisation." 
-- </i><span style="font-family: verdana;">BOM</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">First of all, how are climate adjustments done in other countries? The University Of Stockholm has records going back nearly 300 years.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Here are their adjustments:</span></div><div><span style="font-family: verdana; font-style: italic;">"18700111-20121231</span><span style="font-family: verdana; font-style: italic;"> -- </span><span style="font-family: verdana; font-style: italic;">Correction for urban heat island trend </span><span style="font-family: verdana; font-style: italic;">and other inhomogeneities."</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><div style="font-style: italic;">"This gives an average adjustment by -0.3 C both May and August and -0.7 C for</div><div style="font-style: italic;">June and July. This adjustment is in agreement with conclusions drawn by Moberg</div><div style="font-style: italic;">et al. (2003), but have been determined on an ad hoc basis rather than from a</div><div style="font-style: italic;">strict statistical analysis."</div><div style="font-style: italic;"><br /></div><div>The scatterplot of adjustments over raw looks like below. 
The adjustments were all done to combat the Urban Heat Island effect.</div></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1zPi_A9Nhq_iJTB26zuu_1mpnaQZAvH0mapvQkUQQMz4bPfmJPa4A7c52X7Gc1XmJzuznqk3_4KXlhf2z_kxpAf4N7tUx8e5-xyLYartipC8_NVh4MrZ48jgjaa72T_KD7QjTo503OM1j8wJYzuPERdr9uQBM3SgmuwlIaq1typHlk0awSh7mS9b0/s804/stockholmadjusted.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="804" height="582" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1zPi_A9Nhq_iJTB26zuu_1mpnaQZAvH0mapvQkUQQMz4bPfmJPa4A7c52X7Gc1XmJzuznqk3_4KXlhf2z_kxpAf4N7tUx8e5-xyLYartipC8_NVh4MrZ48jgjaa72T_KD7QjTo503OM1j8wJYzuPERdr9uQBM3SgmuwlIaq1typHlk0awSh7mS9b0/w640-h582/stockholmadjusted.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">What this shows is a "step change" in the early years--say 1915, you can see a single continuous adjustment, then looking at 1980 we can see a few temperature ranges are adjusted differently, as they mention in their Read Me above where they talk about May, August, June and July.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">How does the BOM deal with adjustments of bias? 
The condescending quote under the header prepares you for what is coming:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgAJl4pjFfmWrjNPhYTBeQsMiKIu6xHD1QflAYajTfoTjDF6XbrWq-Z2wY66E5TniMsb0PSORiyJ73y8oGTqsPLqkhS5QUnG31Q4vNCm03EtrJw-9PjlJZwZM7O3YoOvsA7O9ojTBR6nJkwByID5_D8X775Dxvzwjot1RcobJ3FBLRjhRBRVH1A3VL/s734/moreeadj.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="633" data-original-width="734" height="552" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgAJl4pjFfmWrjNPhYTBeQsMiKIu6xHD1QflAYajTfoTjDF6XbrWq-Z2wY66E5TniMsb0PSORiyJ73y8oGTqsPLqkhS5QUnG31Q4vNCm03EtrJw-9PjlJZwZM7O3YoOvsA7O9ojTBR6nJkwByID5_D8X775Dxvzwjot1RcobJ3FBLRjhRBRVH1A3VL/w640-h552/moreeadj.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">A "step change" would look like the arrow pointing at 1940. But look at around 1960--there is a mass of adjustments covering a massive temperature range; it's a large hodge-podge of specific adjustments for specific ranges. 
If we only look at 1960 in Moree:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhafqdKYF-AeSaUGWME8qS45u4meqH0mj21I4t31qEAdXEppi4cuptvzmui4xntrQGCYInfURnM8k9DTNsVD8d-29FjNWzDVXXv2ADa09H3R6FO8IjQAv_9NQRrUCZtf4uU-afCUAOKpeUfl-un1Bc9s7_KfsDwBxWNFubm6QTs1iBAyd8717T6iVC/s803/moreeadjust2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="724" data-original-width="803" height="578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhafqdKYF-AeSaUGWME8qS45u4meqH0mj21I4t31qEAdXEppi4cuptvzmui4xntrQGCYInfURnM8k9DTNsVD8d-29FjNWzDVXXv2ADa09H3R6FO8IjQAv_9NQRrUCZtf4uU-afCUAOKpeUfl-un1Bc9s7_KfsDwBxWNFubm6QTs1iBAyd8717T6iVC/w640-h578/moreeadjust2.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><span style="font-family: verdana;">Look at the intricate overlapping adjustments- the different colours signify different sizes of adjustments in degrees Celsius (see table on right side of graph).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i>BOM would have us believe that these chaotic adjustments for just 1960 in this example, are exact and precise adjustments needed to correct biases.</i></span></div><div><span style="font-family: verdana;"><i><br /></i></span></div><div><span style="font-family: verdana;">A more likely explanation based on the Modus Operandi of the BOM is that specific months and years get specific warming and cooling to increase the desired trend. 
Months like August and April are "boundary months" and are consistently warmed or cooled more depending on the year.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">These are average monthly adjustments, getting down to a weekly or daily view really brings out the chaotic nature of the adjustments, as the scatterplot shows.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Nhill maxv22 adjustments over raw below:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivhI5N3_gLYgfXxxWuFOQpCnyYx5BZ0yo8NpS0iNffOxSANWmXSdNMytaiMr_Cv1RAx6wJMgT1UUJe_6UaibCeJ_FLg1RjIBoCGqBatxqe-XRz-QX8Ltc89ksF-dOoxNJr0RegZHKtWUb_3m23lwq-82xtVeLrtELF0uR8SomLdKNZTazGQEO4474o/s844/nhillmax.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="671" data-original-width="844" height="508" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivhI5N3_gLYgfXxxWuFOQpCnyYx5BZ0yo8NpS0iNffOxSANWmXSdNMytaiMr_Cv1RAx6wJMgT1UUJe_6UaibCeJ_FLg1RjIBoCGqBatxqe-XRz-QX8Ltc89ksF-dOoxNJr0RegZHKtWUb_3m23lwq-82xtVeLrtELF0uR8SomLdKNZTazGQEO4474o/w640-h508/nhillmax.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Nhill minv22 adjustments over raw is below:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvYXa1xnGxFu8UuLhv9WFm0TJY-k7QNDkY0pG9kzCdek6eQELv8d5ByN3sCOWo1mI2wbHHS5f3LObqWO5ZDkMUsn1HYYsD9rQiCgHVHnF1xyCCM1CWY8kWFwWsz00p9hA0Dah1KB9YSRqXQTrcjLc51palKwoeHsdE3soHlDl7BLdXizsFM02bBER-/s845/nhillminadj.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="676" data-original-width="845" height="512" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvYXa1xnGxFu8UuLhv9WFm0TJY-k7QNDkY0pG9kzCdek6eQELv8d5ByN3sCOWo1mI2wbHHS5f3LObqWO5ZDkMUsn1HYYsD9rQiCgHVHnF1xyCCM1CWY8kWFwWsz00p9hA0Dah1KB9YSRqXQTrcjLc51palKwoeHsdE3soHlDl7BLdXizsFM02bBER-/w640-h512/nhillminadj.jpg" width="640" /></a></div><div><br /></div><div><br /></div><span style="font-family: verdana;">Palmerville is below, the arrow and label points to a tiny dot with a specific adjustment for a specific week/month.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFybxpfZJ92KtBRjoY14kO3CPt_9oV95NNGWM9EJXkbhXZLq0xgwKBF83LKOkbFeeeFC_NZQJqIUzbX039O5jGnZ7n4OPzAbQQdBkmuU1K20nVrnWbv0UHmL_0J8de5U2U_q19gvNVWFEQNDeXDnA68QZ_WKfvdGuEi_XYALlRc8g6sgxJdnIUavvg/s1179/palmerminadjut.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="826" data-original-width="1179" height="448" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFybxpfZJ92KtBRjoY14kO3CPt_9oV95NNGWM9EJXkbhXZLq0xgwKBF83LKOkbFeeeFC_NZQJqIUzbX039O5jGnZ7n4OPzAbQQdBkmuU1K20nVrnWbv0UHmL_0J8de5U2U_q19gvNVWFEQNDeXDnA68QZ_WKfvdGuEi_XYALlRc8g6sgxJdnIUavvg/w640-h448/palmerminadjut.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The problem here is that:<br />1-- most of the warming trends are created by adjustments, and this is 
easy to see.<br />2--BOM tell us that the adjustments are crucial because there are so many cases of vegetation growth, moves to airports, observer bias, unknown causes, and cases where they <b>think</b> a record should be adjusted because it doesn't look right, and so on. </span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Now looking at the scatterplots you can clearly see that adjustments are not about correcting "step-changes" and biases. Recall, as we saw above, in most cases the adjustments make the data worse by adding duplicate sequences and adding other biases. Benford's Law will also be used later to show <i>less compliance with adjustments indicating data problems.</i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Biases that are consistent are easily dealt with:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i>"Systematic bias as long as it does not change will not affect the changes in temperature. Thus improper placement of the measuring stations result in a bias but as long as it does not change it is unimportant. But any changes in the number and location of measuring stations could create the appearance of a spurious trend."<span> --</span></i></span><span style="font-family: verdana;"><i>Prof Thayer Watkins, San Jose State University.</i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">The Trend Of The Trend</span></h3><div><br /></div><div><span style="font-family: verdana;"><i>"Analysis has shown the newly applied adjustments in ACORN-SAT version 2.2 have not altered the estimated long-term warming trend in Australia." 
-- BOM</i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><div style="font-family: "Times New Roman";"><span style="font-family: verdana;">Long term trends are pooled data and:</span></div><div style="font-family: "Times New Roman";"><i style="font-family: verdana;">".....pooled data may cancel out different individual signatures of manipulation."</i></div><div style="font-family: "Times New Roman";"><span style="font-family: verdana;">-- (Diekmann, 2007)</span></div></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">But version 2.2 does change trends on some individual stations, though, see below. Version 2.2 changed the trend on version 2.1.<br /><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmGbOO7s8gVh-QVzMl5QBZh9c_pZCiZP1Sh-bG2ukOFv9rxBB7hi3xrTHkPTiV9uwPM02GpP6WlX70h1N4oFT-MkK6fHZqMeyZrUktQue0BrxcnwgeS2zLpnfYRXhvcIQCrExAnAldgBMJ0J_6yis1vHCvAWyKAdtoFRIOx0v0mLHz3XB3KgeAoWm6/s1191/bourkespringtrends.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="830" data-original-width="1191" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmGbOO7s8gVh-QVzMl5QBZh9c_pZCiZP1Sh-bG2ukOFv9rxBB7hi3xrTHkPTiV9uwPM02GpP6WlX70h1N4oFT-MkK6fHZqMeyZrUktQue0BrxcnwgeS2zLpnfYRXhvcIQCrExAnAldgBMJ0J_6yis1vHCvAWyKAdtoFRIOx0v0mLHz3XB3KgeAoWm6/w640-h446/bourkespringtrends.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><span style="font-family: verdana;">And version 2 can change the 
trend of version 1 as well:</span></div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz9TowZboN0Q3EaT0NbDZpDQXpouxeoWKSGp58Dr7y2IKi0j4bSr1dFJh8jfKbjR9XRxZwQ19w3ox0O3p2jTVGgvLpge1-zlDoH1NlHWfEDD39tvTPl1ZwekKrmZgGL7g1JgWVBn6vDKkvPZAIBtUNny1lUulJ7zSN9sTiguXjsphqpaXKcUIPjX16/s1191/bourkeautumn.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="830" data-original-width="1191" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz9TowZboN0Q3EaT0NbDZpDQXpouxeoWKSGp58Dr7y2IKi0j4bSr1dFJh8jfKbjR9XRxZwQ19w3ox0O3p2jTVGgvLpge1-zlDoH1NlHWfEDD39tvTPl1ZwekKrmZgGL7g1JgWVBn6vDKkvPZAIBtUNny1lUulJ7zSN9sTiguXjsphqpaXKcUIPjX16/w640-h446/bourkeautumn.jpg" width="640" /></a></div><div><br /></div><div><br /></div><div><span style="font-family: verdana;">And version 1,2,2.1,2.2 can also change the trend of raw:</span></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEJzrnRJINPb9-Kl1lI2K0IXivlrbYudFnhXwNEp2ER9r7loN9tWmkkd_nXORpg8rZ_a_Yn_p5I0wmih9Z5YyMmdL3QbNQjxcCdvuNJxFs8hAqi2L0KLYUNTr0KtCumm0l82X8gnI8hGjRQJRGwtp-_eNEml-9EMdUbngG6nlPepQ1xEnvH8ync4lm/s1191/bourkewintertrends.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="830" data-original-width="1191" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEJzrnRJINPb9-Kl1lI2K0IXivlrbYudFnhXwNEp2ER9r7loN9tWmkkd_nXORpg8rZ_a_Yn_p5I0wmih9Z5YyMmdL3QbNQjxcCdvuNJxFs8hAqi2L0KLYUNTr0KtCumm0l82X8gnI8hGjRQJRGwtp-_eNEml-9EMdUbngG6nlPepQ1xEnvH8ync4lm/w640-h446/bourkewintertrends.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div><br /></div><div><br /></div><h3 style="text-align: left;"><span style="font-family: verdana;">Adjustments: Month Specific And<br /> </span><span 
style="font-family: verdana;">Add Outliers + Trends</span></h3><div><br /></div><div><span style="font-family: verdana;">August and April are "boundary" months on the edge of summer and winter and often get special attention, warming or cooling depending on whether the date is before or after around 1967. <i>Columns are months and frequencies (occurrences).</i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG6n2SIAqraZyKZfjEWMug3sfRVX-vIMoaNsl-3z7CO-Q2Sllgngh5fQc4TPWUUxh30dAllQ4bHPA-VTawnudRT88uVwAeY_kLMt9rBtkK2OjE3D9IUDrD8GM1qkVBfG7_3SWTQ39axlqqVBFgWGKbAEsCqGtSmoA6Eu6uOjcjusNEgwfiyG0YBpP0/s527/bourkeminnegadj.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="407" data-original-width="527" height="247" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG6n2SIAqraZyKZfjEWMug3sfRVX-vIMoaNsl-3z7CO-Q2Sllgngh5fQc4TPWUUxh30dAllQ4bHPA-VTawnudRT88uVwAeY_kLMt9rBtkK2OjE3D9IUDrD8GM1qkVBfG7_3SWTQ39axlqqVBFgWGKbAEsCqGtSmoA6Eu6uOjcjusNEgwfiyG0YBpP0/s320/bourkeminnegadj.jpg" width="320" /></a></div><br /><span style="font-family: verdana;"><b>The above shows how the largest cooling adjustments at Bourke get <i>hammered</i> into a couple of months.</b> It plots months against frequencies, that is, how often adjustments of this size were made. 
It makes the bias adjustments look like what they are - warming or cooling enhancements.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">More of the same:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjijh1F3y9p1hIU2JjUEb9LkKWdObivqydZVLPJsT69H63Y_igB0od-jOKxY8CEGHlLU7SB32l7eKdb_wThur7Su1WY7E2VGSci8K-rfuT2Giy6X927_blk2LLBsjIFSpPt8ipUH2Mg5FsVBDTKjdOumjcBFwptGYKHSEPABZCLsqnlKLrq0NbxsGUe/s411/bourkeadjapril.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="411" data-original-width="331" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjijh1F3y9p1hIU2JjUEb9LkKWdObivqydZVLPJsT69H63Y_igB0od-jOKxY8CEGHlLU7SB32l7eKdb_wThur7Su1WY7E2VGSci8K-rfuT2Giy6X927_blk2LLBsjIFSpPt8ipUH2Mg5FsVBDTKjdOumjcBFwptGYKHSEPABZCLsqnlKLrq0NbxsGUe/s320/bourkeadjapril.jpg" width="258" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Strange how so many stations all have problems in April and August.</span></div><div><span style="font-family: verdana;">Below is a violin plot, which shows graphically where most adjustments go. Ignore the typo saying "May"; as you can see, it is August.</span></div><div><span style="font-family: verdana;">The horizontal axis is months; the vertical Y axis is the amount of adjustment in degrees Celsius.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">If the distribution is thick, as at the top of April, a lot of the distribution resides there, meaning many occurrences of warming adjustments. 
The long tails (spikes) indicate outliers, so August has many larger adjustments (up to -4 C).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfwEaNil-7hS2RbwyH2YxJ2OGH1xHgI68yI8KcOqKORx7jeVFXoQHQVwggYD0aGKsIJmax0YMS3rMGAwd5oZte8ts1jlp_iYkweOfJtvX_rVMYOP0vU_RPUB4EpxQWIrOmeHM6rOczPVY0m6qfV_BIrWYbOcuLg7lJaK4l-qCDFUmKITH78rfwLUns/s904/bourkeviolin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="842" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfwEaNil-7hS2RbwyH2YxJ2OGH1xHgI68yI8KcOqKORx7jeVFXoQHQVwggYD0aGKsIJmax0YMS3rMGAwd5oZte8ts1jlp_iYkweOfJtvX_rVMYOP0vU_RPUB4EpxQWIrOmeHM6rOczPVY0m6qfV_BIrWYbOcuLg7lJaK4l-qCDFUmKITH78rfwLUns/s320/bourkeviolin.jpg" width="298" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">We can prove the causality of adjustments with Bayesian and Decision Tree data mining (later).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Below we look at values that are missing in Raw but appear in version 2.1 or 2.2.</span></div><div><span style="font-family: verdana;">This tells us they are created or imputed values. 
In this case the black dots are missing in raw, but now appear along with some outliers.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3Po92uogUpqJ4i1mn_6C9zAJrfjRjflPLveTd7kPgvutt4PzMsB7ESSMk_Gno2iPzUhJC2pjiN1BlHNNPcp6WZVIwcMxk5Q51jRN65M3OKSx-YUEdLR4D4p1VHp3cpHJBVaBX9u8Df5je560Cmo5y2VugxEYwDe_tyvWEB8-WIhl8WDlsqnL9nk8_/s777/bourkeimpute.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3Po92uogUpqJ4i1mn_6C9zAJrfjRjflPLveTd7kPgvutt4PzMsB7ESSMk_Gno2iPzUhJC2pjiN1BlHNNPcp6WZVIwcMxk5Q51jRN65M3OKSx-YUEdLR4D4p1VHp3cpHJBVaBX9u8Df5je560Cmo5y2VugxEYwDe_tyvWEB8-WIhl8WDlsqnL9nk8_/w640-h560/bourkeimpute.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><i>These outliers and values, by themselves, have an upward trend.</i> In other words, the imputed/created data has a warming trend (below).</span></div><div><span style="font-family: verdana;"><br />Adding outliers is a no-no in any data analysis. Only some values are created, and these seem to suit the purpose of warming, while other values are still missing from the time series. 
As we progress to different versions of tweaking software, it is possible <i>new missing values</i> will be imputed, or other values disappear.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiwQ5O0qOrKi4o2tdNavy0ZjJNCkhm1kzSHaziAHvcEvtN90vfDoXKbe4Ov1cSbj5OPgqgTEugkly5Q33cs6gRSKtDHzOKlEylYb6lYQcMwm-IQRl1UpxYYTIXXWTkKo5CLLh72DURXG13ZHPBSKvKt-pGWf80jRlqjH8BoTcEhQV0vHdhpddnXC_H/s777/bourkeimpute2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiwQ5O0qOrKi4o2tdNavy0ZjJNCkhm1kzSHaziAHvcEvtN90vfDoXKbe4Ov1cSbj5OPgqgTEugkly5Q33cs6gRSKtDHzOKlEylYb6lYQcMwm-IQRl1UpxYYTIXXWTkKo5CLLh72DURXG13ZHPBSKvKt-pGWf80jRlqjH8BoTcEhQV0vHdhpddnXC_H/w640-h560/bourkeimpute2.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">First Digit Of Temperature Anomalies Tracked For 120 years</span></h3><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">This paper from Scotland along with their customised R software, tracks the distances tropical cyclones travel over time:</span></div><div><h1 class="content-title" style="background-color: white; box-sizing: inherit; clear: initial; font-weight: 400; letter-spacing: -0.01em; line-height: 22.5pt; margin: 20pt 0px 10pt;"><span style="font-family: verdana; font-size: small;"><i>Technological improvements or climate change? 
Bayesian modeling of time-varying<br />conformance to Benford’s Law, Junho Lee and Miquel de Carvalho (<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6461246/" target="_blank">link</a>)</i></span></h1></div><div><span style="font-family: verdana;">It uses Benford's Law to check for compliance or deviation of the first digit of the distance a tropical cyclone travels, and combines it with a Bayesian model. I wrote to the authors and they slightly modified the R code to run on Acorn anomalies.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">This allowed me to check scarcity or excess use of the first digit in a timeseries over 120 years. Later on I will show how the use of Benford's Law is justified (a university group in Canberra has already shown temperature anomalies are Benford compliant, see Sambridge), but to keep the picture clearer we will minimise the data.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMgL9wN7cBIelBHB1a7oZVyDgp3HzZiFmvhz6YrqN7V68vuO-hu750eGkDklKn11vjcEDhchM3G3uCuSbOw77F6vO3rF9nNW5u7hSdOrEsi8Ne3vrYegAzjlnIekuqyeoaMOBQbfoOeX8YX5WkhosPK578TvFkjR8V3I7w0cCeKB8jTRPxkL4vIWX-/s1920/bourkebayes.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMgL9wN7cBIelBHB1a7oZVyDgp3HzZiFmvhz6YrqN7V68vuO-hu750eGkDklKn11vjcEDhchM3G3uCuSbOw77F6vO3rF9nNW5u7hSdOrEsi8Ne3vrYegAzjlnIekuqyeoaMOBQbfoOeX8YX5WkhosPK578TvFkjR8V3I7w0cCeKB8jTRPxkL4vIWX-/w640-h348/bourkebayes.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The X axis is years, and since the series begins in 1911, that year becomes one. 
The arrows show where the years 1911-2019 are.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">This shows the use over time of each digit in the first (leading) position of the temperature </span><span style="font-family: verdana;">anomaly</span><span style="font-family: verdana;">. The digit 1 is underused until 2010, where it briefly becomes overused, then drops once again. This shows that the adjusted minv21 data has too many leading digits with a value of 2-4, and not enough with a value of one. This is consistent with other studies using Benford's Law showing that small values have been decreased in first digit position, thereby warming or cooling that part of the time series (depending on whether the temperature anomaly has a + or - in front of it).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><h3 style="text-align: left;"><span style="font-family: verdana;">The German Tank Problem</span></h3><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><span style="background-color: white;">In World War II, each manufactured German tank or piece of weaponry was printed with a serial number. 
Using serial numbers from damaged or captured German tanks, the Allies were able to estimate the total number of tanks and other machinery in the German arsenal.</span></span></div><div><span style="font-family: verdana;"><span style="background-color: white;"><br /></span></span></div><div><span style="font-family: verdana;"><span style="background-color: white;">The serial numbers revealed extra information, in this case an estimate of the entire population based on a limited sample.</span></span></div><div><span style="font-family: verdana;"><span style="background-color: white;"><br /></span></span></div><div><span style="font-family: verdana;"><span style="background-color: white;">This is an example of what David Hand calls <b>Dark Data</b>. This is data that many industries hold but never use, yet it leaks interesting information that can be put to use.</span></span></div><div><br /></div><div><span style="font-family: verdana;">Now, <b>Dark Data</b> in the context of Australian Climate data would give us extra insight into what the BOM is doing behind the scenes with the data, insight that they are not aware of. So if dodgy work was being done, they would not be aware of any information "leakage."</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">A simple Dark Data scenario can be built here by taking the first difference of a time series:</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Get the difference between temperature 1 and temperature 2, then the difference between temperature 2 and temperature 3 and so on. 
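</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The first-difference check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names <i>first_differences</i> and <i>paired_day_share</i>, the sample readings, and the rounding to one decimal place are all hypothetical and are not taken from the JMP workflow used for the graphs in this post.</span></div>

```python
from typing import List, Optional


def first_differences(temps: List[Optional[float]]) -> List[Optional[float]]:
    """Return temps[i+1] - temps[i] for each consecutive pair of days.

    A pair gives None when either day is missing, so gaps are never
    mistaken for identical temperatures.
    """
    diffs = []
    for today, tomorrow in zip(temps, temps[1:]):
        if today is None or tomorrow is None:
            diffs.append(None)
        else:
            # Assumption: readings are recorded to 0.1 C, so round the
            # difference to one decimal to avoid floating-point noise.
            diffs.append(round(tomorrow - today, 1))
    return diffs


def paired_day_share(temps: List[Optional[float]]) -> float:
    """Fraction of valid consecutive-day pairs whose difference is zero."""
    diffs = [d for d in first_differences(temps) if d is not None]
    if not diffs:
        return 0.0
    return sum(1 for d in diffs if d == 0) / len(diffs)


# Hypothetical daily readings in degrees C, with one missing day (None).
sample = [21.4, 21.4, 19.8, None, 23.1, 23.1, 23.1, 20.0]
print(first_differences(sample))
print(paired_day_share(sample))
```

<div><span style="font-family: verdana;">In the made-up sample, three of the five valid day pairs have a difference of zero, so the paired share is 0.6. Missing days are skipped rather than counted as pairs. A worked table of the same calculation follows 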
(below)</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdH4IpX7PQAqHZygzGk1UquJLD7Pd2j_oIY6NMCMSJ7L-dET9NAxAYVg5Xer-k_MPu36xpHa8CweN-HJYo6UK7poTiyCuRmnOlks9xF9wz4myafCgIAajSBj18HHIAOOkp_jjkeEFPOU3LGiTNdGsgXhmzyfrgmV-p0tBWj7uRuduIuL-dJmWVsmZ9/s481/zzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="481" data-original-width="203" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdH4IpX7PQAqHZygzGk1UquJLD7Pd2j_oIY6NMCMSJ7L-dET9NAxAYVg5Xer-k_MPu36xpHa8CweN-HJYo6UK7poTiyCuRmnOlks9xF9wz4myafCgIAajSBj18HHIAOOkp_jjkeEFPOU3LGiTNdGsgXhmzyfrgmV-p0tBWj7uRuduIuL-dJmWVsmZ9/s320/zzzz.jpg" width="135" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">An example is above.<i> If the difference between two days is zero, then the two paired days have the same temperature.</i> So this is a quick and easy way to spot paired days that have the same temperature.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Intuition would expect a random distribution with no obvious clumps.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIjCk-y278XA9u1Y-a71wUN8KvrbF-VqO_JI_-cl36L4GcBXItIblugS4-78kCHYw6iGbZEqJD8Xz5VXuhOKxKe-bFKSERviFaL4QH3qkynEhqxUEbtm_X3qPgIyjjwYgSpJNqH7Xf8eeoIzSDl-eTDnfbV94D_831t5SdD4Wih5qFh6COtGtHjpnM/s1191/dekooydiff0.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="830" data-original-width="1191" height="446" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIjCk-y278XA9u1Y-a71wUN8KvrbF-VqO_JI_-cl36L4GcBXItIblugS4-78kCHYw6iGbZEqJD8Xz5VXuhOKxKe-bFKSERviFaL4QH3qkynEhqxUEbtm_X3qPgIyjjwYgSpJNqH7Xf8eeoIzSDl-eTDnfbV94D_831t5SdD4Wih5qFh6COtGtHjpnM/w640-h446/dekooydiff0.jpg" width="640" /></a></div><div><br /></div><span style="font-family: verdana;">Above is deKooy in the Netherlands with a fairly even distribution. Sweden is very similar. Diff0 in the graph refers to the fact that there is zero difference between a pair of temps when using the First Difference technique above, <i>meaning that the 2 days have identical temperatures.</i></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Let's look at Melbourne, below:</span></div><div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjesHyKEKllNOQf45PqH25MhdZfYqXsdDI7KJQDQY0WZMZANy4dypBIAO4YzDIVJnu1tH-63Z6pxtu_YuBKIa2oSUCo6RYRJHsRIXttVm1vYjdAzeyVDlLMEsf0R9KaYSugJ1rS1pzMal0B3NJw0MeDUijVfZujJSnupnicHBTVcYobf_Ck4Xqi57zm/s781/melbournemaxscatter.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="781" height="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjesHyKEKllNOQf45PqH25MhdZfYqXsdDI7KJQDQY0WZMZANy4dypBIAO4YzDIVJnu1tH-63Z6pxtu_YuBKIa2oSUCo6RYRJHsRIXttVm1vYjdAzeyVDlLMEsf0R9KaYSugJ1rS1pzMal0B3NJw0MeDUijVfZujJSnupnicHBTVcYobf_Ck4Xqi57zm/w640-h600/melbournemaxscatter.png" width="640" /></a></div><br /><span style="font-family: verdana;">The paired days with the same temperature are clustered in the cooler part of the graph, and taper out after 2010 or so.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br 
/></span></div><div><span style="font-family: verdana;">Below is Bourke, and again you can see clustered data.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitz2fPlX3vfMVRJbC94jytDCZ7tX9mBgw9zQHXxHgiKVrG63Ti7q76eEGOrI401CDnmipXbBE_FUxJhNbmusn8Rf411l0h4RAA2UmtLjDXfo1tOrl0ExVOT_dnynk0cuIOWUgKZuhg00uC7NwNBP9hZTO-bN8zV0OT5aeCOJchBkOKsutrCaBoCxvo/s683/bourkemindiff0.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="635" data-original-width="683" height="596" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitz2fPlX3vfMVRJbC94jytDCZ7tX9mBgw9zQHXxHgiKVrG63Ti7q76eEGOrI401CDnmipXbBE_FUxJhNbmusn8Rf411l0h4RAA2UmtLjDXfo1tOrl0ExVOT_dnynk0cuIOWUgKZuhg00uC7NwNBP9hZTO-bN8zV0OT5aeCOJchBkOKsutrCaBoCxvo/w640-h596/bourkemindiff0.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Below is Port Macquarie, and there is extremely tight clustering from around 1940-1970.</span></div><div><br /></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitmBeTko_FXy2emLSctq6DIRt1GWn3nMkiCKunSlzazY7F1lthDpGL0ucrtXbRgUyZiDXhdwTHh7r0jZSk2A6_GB1U0RHb1A_A6NyJQR1yyS7rxREsDj4AA0lCNr-UUOp3nEm_FY2xc5wZ9RijB1aMWVXt6w3My8bYqQHvQp3gB0hDB-IsPwKG-cOD/s924/portdiff0.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="752" data-original-width="924" height="520" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitmBeTko_FXy2emLSctq6DIRt1GWn3nMkiCKunSlzazY7F1lthDpGL0ucrtXbRgUyZiDXhdwTHh7r0jZSk2A6_GB1U0RHb1A_A6NyJQR1yyS7rxREsDj4AA0lCNr-UUOp3nEm_FY2xc5wZ9RijB1aMWVXt6w3My8bYqQHvQp3gB0hDB-IsPwKG-cOD/w640-h520/portdiff0.jpg" width="640" /></a></div><br /><span style="font-family: verdana;">This data varies with adjustments; in many cases there are very large differences before and after adjustments.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">In the capital cities, around 3-4% of the data is paired. Country stations can go up to 20% for some niche groups.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The hypothesis is this: The most heavily clustered data points are the most heavily manipulated data areas.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The paired clusters depend less on temperature than on adjustments; this was identified as the causal link in the data mining analysis (more on this later).</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Let's grab a random spot in the most heavily clustered area at Port Macquarie, 1940-1970.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj33V3AmB3cg87Z9GiFis1EKLdF3TBiShWg7rv9n17lb9pGknrf1fNwfa3l4Cobipp7dz_oS1xDkJovpm1bI8puM1OOTOAEPZ7-In2V--K_RKYY9ypIIzQYg2sjLtPbsrAR8pfEIi0BzRTaW-QlUk_lQxO3hzCQ_awQEveVUbIz5XhXa2rkdJd64e9l/s1665/portultraseq.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1665" height="400" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj33V3AmB3cg87Z9GiFis1EKLdF3TBiShWg7rv9n17lb9pGknrf1fNwfa3l4Cobipp7dz_oS1xDkJovpm1bI8puM1OOTOAEPZ7-In2V--K_RKYY9ypIIzQYg2sjLtPbsrAR8pfEIi0BzRTaW-QlUk_lQxO3hzCQ_awQEveVUbIz5XhXa2rkdJd64e9l/w640-h400/portultraseq.jpg" width="640" /></a></div><br /><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Above is a temperature segment, and it is immediately apparent that many days have duplicated sequences.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">From this we know:</span></div><div><span style="font-family: verdana;">1--the data is not observational.</span></div><div><span style="font-family: verdana;">2--it is heavily manipulated.</span></div><div><span style="font-family: verdana;">3--it is probably not "real".</span></div><div><span style="font-family: verdana;">4--it certainly is not climate data readings taken from a thermometer.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">More from Port:<br /><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8SUBIbe6d6-udSzZC6tOw7H4y8dvmssvzBBh7yZzG8QAWcuLQnfm8azRl3t2u2GEBR1gPgIIkSSyq11dWezg8QZit_QQQ1uAI3CxtmlPjn_5aNwLYcii1s__aI9fPT8npOm7oeLt5axcEiCEOQb7c9IoYPFJBeo1IwD8zYPMribYRs3QEALjTsmJk/s1356/portseqaaa.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1356" height="492" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8SUBIbe6d6-udSzZC6tOw7H4y8dvmssvzBBh7yZzG8QAWcuLQnfm8azRl3t2u2GEBR1gPgIIkSSyq11dWezg8QZit_QQQ1uAI3CxtmlPjn_5aNwLYcii1s__aI9fPT8npOm7oeLt5axcEiCEOQb7c9IoYPFJBeo1IwD8zYPMribYRs3QEALjTsmJk/w640-h492/portseqaaa.jpg" width="640" 
/></a></div><br /><div><span style="font-family: verdana;">Here we have gaps of 1 and 3 between the sequences.</span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">Below we have gaps of 8 between sequences!</span></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpV_wsSc8XaGGBqHnH7BExIgWhwUnmLoDJui0kdetKe6HCGkA3DTUhN_IgKq8iMBahcw9am6nQOOWA0oJ35qao5xB-TgKQJuAuKJy_7puUL2xnOeMjfvrkZX72PgggVTbqiA1rvRKGPmnNChuFluWOTFJjSY_SOb3T0LMt9hHMcMHIfrXmOoZUfrWN/s1356/portseqbbb.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1356" height="492" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpV_wsSc8XaGGBqHnH7BExIgWhwUnmLoDJui0kdetKe6HCGkA3DTUhN_IgKq8iMBahcw9am6nQOOWA0oJ35qao5xB-TgKQJuAuKJy_7puUL2xnOeMjfvrkZX72PgggVTbqiA1rvRKGPmnNChuFluWOTFJjSY_SOb3T0LMt9hHMcMHIfrXmOoZUfrWN/w640-h492/portseqbbb.jpg" width="640" /></a></div><div><br /></div><div><br /></div><div><br /></div><span style="font-family: verdana;">Below, we now have gaps of 2, then 3, then 4, then 5. You couldn't make this stuff up. 
Bear in mind that every time series has <b>hundreds</b> of these dodgy sequences!</span><br /><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW3Rkg0bVOdby593TuABPJm40zbNEvVeYyWbNVtXSJlzMLrN53MhDQh64Op3WpxcGNGf2D-4OdJWQk_WnTJ5k4NWuUqc_c-10aCjx1OpawL16p8Ym0c31yw9Jt6Kg8L3LCsyVcqyJJV17-TFPsQFFjofonn43PBK1biIiiY2uAfuw0HDC3i3-O_UgE/s1356/portseqeeee.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1356" height="492" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW3Rkg0bVOdby593TuABPJm40zbNEvVeYyWbNVtXSJlzMLrN53MhDQh64Op3WpxcGNGf2D-4OdJWQk_WnTJ5k4NWuUqc_c-10aCjx1OpawL16p8Ym0c31yw9Jt6Kg8L3LCsyVcqyJJV17-TFPsQFFjofonn43PBK1biIiiY2uAfuw0HDC3i3-O_UgE/w640-h492/portseqeeee.jpg" width="640" /></a></div><br /><div><br /></div><div><span style="font-family: verdana;">It's pretty clear you couldn't use this data to model or predict anything. </span></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The climate data presented by the BOM has no integrity; it is heavily modified with each iteration of the software, under the umbrella of "homogenisation." </span></div></div><div><span style="font-family: verdana;"><br /></span></div><div><span style="font-family: verdana;">The data cleaning procedures are non-existent: sometimes data is imputed along with outliers, sometimes data is deleted when strategically advantageous, and along the way the data is continuously modified, creating duplicated runs and sequences such that the data no longer has any characteristics of "naturally occurring numbers." 
The next post uses Benford's Law and other Digit Tests to show that <b>climate data is modelled output and not observational data.</b></span></div><div><span style="font-family: verdana;"><br /></span></div><div><p><i style="font-family: verdana;"><br /></i></p><p><br /></p><p><i style="font-family: verdana;"><br /></i></p><p><i style="font-family: verdana;"><br /></i></p><p><br /></p><p><br /></p><p><br /></p><p> </p><p><br /></p></div></div></div></div></div>Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-78595354431230126952020-12-08T15:42:07.538-08:002021-02-11T22:26:19.524-08:00BOM Raw Climate Data - EVIDENCE LARGE SCALE TAMPERING.<div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUPAGSnK033_Slsfo1lVvQGN-A2B-IEKnlGrjcczUbSYdn6bxzVxwz8jpuDySO_3JM6fioC8Z0NjG1wNcRvlIKKM9CWOFxUGay2DCo6er2tS-fqb9DmRnTNMTjJQNSvENv-T8l-2T9NXE/s1200/big_data_analytics_analysis_thinkstock_673266772-100749739-large.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="800" data-original-width="1200" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUPAGSnK033_Slsfo1lVvQGN-A2B-IEKnlGrjcczUbSYdn6bxzVxwz8jpuDySO_3JM6fioC8Z0NjG1wNcRvlIKKM9CWOFxUGay2DCo6er2tS-fqb9DmRnTNMTjJQNSvENv-T8l-2T9NXE/w640-h426/big_data_analytics_analysis_thinkstock_673266772-100749739-large.jpg" width="640" /></a></div><div style="text-align: right;"><span style="font-family: verdana; font-size: xx-small;"> </span></div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><span 
style="font-family: verdana; font-size: large;">An Investigation Into Australian Bureau Of Meteorology - Large Scale Data Tampering.</span></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b><i>“Findings from a multitude of scientific literatures converge on a single point: People are credulous creatures who find it very easy to believe and very difficult to doubt.”</i></b> </span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">(<a href="wjh-www.harvard.edu/~dtg/Gillbert (How Mental Systems Believe).PDF">How Mental Systems Believe</a>, Dan Gilbert, psychologist)</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">The concept of <i>garbage in, garbage out</i> means that no meaningful output can result from 'dirty' data being input, and all adjustments that follow are moot. So raw data files as records of observation are critical as an accurate temperature record. The question is, are they raw? Is this unadjusted observational data?</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><i><span style="font-family: verdana; font-size: medium;">"The Bureau does not alter the original temperature data measured at individual stations." 
<br /> -- BOM, 2014</span></i></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Summary Of Results.</span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">My Analysis Shows Heavily Tampered Raw Data. </span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">There are <i>whole months that have been copy/pasted into other months,</i> impossible sequences and duplications, complete <i>nonconformance to Benford's law indicating heavy data tampering,</i> standard errors of 30 or more in number bunching tests indicating abnormally repeated temperature frequencies, <i>strategic rounding where no decimal numbers exist for years</i>, there are temperatures where Raw is missing but the Adjusted data has been infilled/imputed to create yearly temp records and or upward trends. This infilling is <i>strategically selective</i>, only specific cases are infilled, thousands are left empty while extreme outliers are <i>added </i>into the data.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>In most cases data is even worse after adjustments according to Benford's Law and Control Charts. 
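</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">For readers who want to see what a Benford's Law first-digit check looks like, here is a minimal sketch in Python. It is not the software used for this analysis. The expected share of first digit d is log10(1 + 1/d); the function names and the sample values are hypothetical and made up for illustration, not BOM data.</span></div>

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading non-zero digit of a number, or None when the value is zero."""
    if value == 0:
        return None
    # Scientific notation puts the leading digit first, e.g. '1.9000000000e-01'
    text = f"{abs(value):.10e}"
    return int(text[0])


def first_digit_proportions(values):
    """Observed share of each first digit 1..9 in a list of numbers."""
    digits = [first_digit(v) for v in values]
    digits = [d for d in digits if d is not None]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


expected = benford_expected()

# Hypothetical temperature anomalies, invented for this sketch (not BOM data)
sample = [0.3, -1.2, 0.14, 2.5, -0.19, 1.1, 0.9, -0.11, 1.7, 0.12]
observed = first_digit_proportions(sample)

for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

<div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>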
</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>And there is a smoking gun: nearly all raw data has man-made fingerprints showing engineered highs and lows in repeated data frequencies which can only be due to large scale tampering</i> -- all beautifully visual when exposed by zero bin histograms. <i>Raw is Not Raw, It Is Rotten And Overcooked.</i></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Update:</b> Below in <i>More</i> <i>Dodgy Adjustments</i> section, new tables in Minimum Temps show Bourke, the most manipulated fabricated station of them all, has had large cooling adjustments of -2.7C or more put onto <i>ONLY May and Aug for 84 years!</i> From 1911-1995, May and August Minimum received an average of 23 adjustments per year, whether the station moved up the hill or down the hill, whether the vegetation engulfed it or the thermometer drifted, or even if it was spot-on, it got a large cooling adj of -2.7C or more, right up to 1995, only for May and Aug!</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Preliminary.</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Data from the 112 ACORN stations that I used:</span></div><div class="separator" 
style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Bom supply <a href="http://www.waclimate.net/acorn2/index.html">data</a> for minimum and maximum temperatures:</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">maximum raw = maxraw</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">minimum raw = minraw</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">maximum adjusted = maxv1, maxv2, maxv2.1<span> different updates</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">minimum adjusted = minv1, minv2, minv2.1 <span> </span><span> </span>different updates</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">I have compared ACORN Raw with AWAP Raw and it is identical. </span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: large;"> </span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: large;"><span style="font-size: large;">“</span><i style="font-size: large;">There has been no statistically significant warming over the last 15 years.” -- 13 February 2010, <a href="https://www.cato.org/publications/commentary/climategate-beyond-inquiry-panels">Dr. 
Phil Jones</a></i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: large;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Billion Dollar Data That Has Never Had An Audit.</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Incredibly, the BOM temperature series data has never had an independent audit and has never been tested with fraud analytic software, despite the vast amounts of money involved in the industry and the flourishing consultancies that have popped up.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">BOM makes a lot of the independent review of 10 years ago that compared their <i>methodology</i> <i>and results</i> to other climate data and found it <i>robust</i> because it was similar, but the <i>data has never been audited.</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">As we will see later on, the other climate agencies also show a complete lack of conformance to Benford's Law, indicating data problems.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">You don't even need Benford's Law to see multiple red flags with online queries from the GHCN U.S data network. 
(<a href="http://mc-computing.com/Science_Facts/Annual_Temperature_Plots/Histograms_Oceans.html">link</a>) <span style="color: #2e2e2e;">The U.S GHCN data <i>stops reporting cooling stations</i> in south-east Australia after 1990!</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><span style="font-family: verdana;">The BOM says adjustments don't make any difference and their evidence is a graph of </span><i style="font-family: verdana;">averages of averages of averages -</i><span style="font-family: verdana;"> days averaged to months averaged to years, averaged across 112 stations, all without published boundaries of error.</span></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><i><span style="font-size: medium;">It is well known that pooled data can hide individual fraud and manipulation signatures</span><span style="font-size: medium;"><span>, </span><span>and even begin to conform with Benford's Law due to multiplication and/or division in data </span></span><span style="font-size: x-small;">(Diekmann, 2007).</span></i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Pooled data can also exhibit Simpson's Paradox, where a trend that appears in different groups can reverse when the groups are combined. 
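</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">To make the reversal concrete, here is a minimal sketch in Python using made-up numbers (not BOM data). Two hypothetical station groups each cool by 1 degree per time step, yet the pooled slope comes out positive because the warmer group appears later in the record. The function name <i>slope</i> and all values are assumptions for illustration only.</span></div>

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical groups: x is a time index, y is a temperature (invented values)
group_a_x, group_a_y = [1, 2, 3], [10.0, 9.0, 8.0]
group_b_x, group_b_y = [6, 7, 8], [20.0, 19.0, 18.0]

slope_a = slope(group_a_x, group_a_y)            # trend within group A
slope_b = slope(group_b_x, group_b_y)            # trend within group B
slope_pooled = slope(group_a_x + group_b_x,      # trend after pooling both
                     group_a_y + group_b_y)

print("group A slope:", slope_a)
print("group B slope:", slope_b)
print("pooled slope:", slope_pooled)
```

<div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">In this toy case both group slopes are negative while the pooled slope is positive, which is the kind of reversal that averaging across stations can hide.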
</span></div><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Below: Annual Mean temperature averages BOM use to show that individual adjustments don't matter....without boundaries of error or confidence levels.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0gpSuD58FRvsYbq8b7lUbay3pJJztDlbrQoxikd3AYbcPA1Qox92mN2COtLNQnUIFoLBWmuvVegW1hIimtga9YmrzQj0F7mPCpSC_7CmtqxiErrRk5mxwcmoJQPauZ6_qTODqnr-QrEQ/s1180/averagesofaverages.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="794" data-original-width="1180" height="430" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0gpSuD58FRvsYbq8b7lUbay3pJJztDlbrQoxikd3AYbcPA1Qox92mN2COtLNQnUIFoLBWmuvVegW1hIimtga9YmrzQj0F7mPCpSC_7CmtqxiErrRk5mxwcmoJQPauZ6_qTODqnr-QrEQ/w640-h430/averagesofaverages.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">What is also amazing is that Benford's Law which has a proven history in fraud detection in many fields and is admissible as evidence in a court of law in the U.S has not been run on any climate data of significance.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><span style="font-family: verdana;">Much natural and man-made data follows Benford's Law if there are</span><i style="font-family: 
verdana;"> several orders of magnitude</i><span style="font-family: verdana;">.</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Temperature has an upper and lower limit to what you would normally observe so it doesn't follow Benford's Law per se, but if you convert it to a temperature anomaly (which is simply an offset used by the climate industry), then it <i>does</i> follow Benford's Law. (<a href="https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2010GL044830">Sambridge et al, 2010</a>).</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">You would think that each first digit in your tax return would appear with equal probability, but this isn't so: one appears about 30% of the time and nine appears less than 5% of the time, which is why the tax man is interested in this law too; it has helped detect, and even convict on, tax or cheque fraud.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">It turns out that human beings are not very good at fabricating numbers.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Peer 
Review Is No Guarantee</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>"Scientific fraud, particularly data fabrication is increasing."</b></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium; font-weight: bold;"> </span><span style="font-size: x-small;">-- </span><span style="font-size: x-small; font-style: italic;">(Data Fraud In Clinical Trials, Stephen George and Marc Buyse, 2015)</span><span style="font-size: medium; font-style: italic;">. </span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium;"><br /></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium;">Retractionwatch.com has over 2000 retractions.</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Fujii holds the world record with 183 retractions and was dismissed from Toho University (<a href="https://www.nature.com/news/retraction-record-rocks-community-1.11434">here</a>). <a href="http://datacolada.org">Uri Simonsohn</a> has a website where he replicates studies and has been responsible for many retractions.</span></div><p><span style="font-family: verdana; font-size: medium;">Smeesters, Stapel and Sanna were three very high profile professors published in peer reviewed journals. All were found guilty of data fabrication. All resigned and retracted their papers. 
</span></p><p><span style="font-family: verdana;"><i><span style="font-size: medium;">Peer reviews are no protection against fabrication</span></i><span><span style="font-size: medium;">.</span><span style="font-size: large;"> </span><span style="font-size: medium;">Uri Simonsohn was responsible for exposing the 3 professors on data alone; he argues the way forward is to supply all raw data and code with studies for replication</span><span style="font-size: large;">.</span></span><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2114571" style="font-size: large;">(link</a><span style="font-size: large;">)</span></span></p><p><span style="font-family: verdana; font-size: medium;"><b>Below:</b> One of the 53 studies from Stapel that was retracted due to fabrication. Note the duplicated entries. </span></p><p><span style="font-family: verdana; font-size: medium;">Duplicated numbers, or abnormally repeated number frequencies, are among the most common signs of fabrication, and even the BOM uses low level copy/paste duplications of temperatures.</span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6myGhG8gbqimoNUiMI_oNA0FFfWAD0j4x70o38eZHKaQRXskqgumyHJljgBcO4hNIABhoPvXEL8nbvqAPD9FqGnOEGgcyWzDEKGoCii3Cd0RUV7PmhC2m2z5pesCmgK7ToTKbfL-XR-Q/s1200/oo_85087.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="620" data-original-width="1200" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6myGhG8gbqimoNUiMI_oNA0FFfWAD0j4x70o38eZHKaQRXskqgumyHJljgBcO4hNIABhoPvXEL8nbvqAPD9FqGnOEGgcyWzDEKGoCii3Cd0RUV7PmhC2m2z5pesCmgK7ToTKbfL-XR-Q/w640-h330/oo_85087.jpg" width="640" /></a></div><br /><p><span style="font-size: medium;"><span style="font-family: verdana;">John Carlisle is an anaesthetist</span><b style="font-family: verdana; font-style: italic;"> </b><span style="font-family: verdana;">who is also a part-time data 
detective, he has uncovered scientific misconduct in hundreds of papers and helped expose some of the world's leading scientific frauds</span></span><span style="font-family: verdana; font-size: large;">. (</span><a href="https://www.enago.com/academy/investigation-of-clinical-trials-unveils-data-fabrication/" style="font-family: verdana; font-size: large;">link</a><span style="font-family: verdana; font-size: large;">).</span></p><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><b><span style="font-family: verdana; font-size: medium;">Reproducibility -</span></b></div></span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">"No Raw Data, No Science"</span></b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><span style="font-family: verdana;">Reproducibility is a major principle of the scientific method</span><i style="font-family: verdana;">.</i></span><i style="font-family: verdana; font-size: large;"> </i><span style="font-family: verdana; font-size: large;">(</span><a href="https://en.wikipedia.org/wiki/Reproducibility#:~:text=Reproducibility%20is%20a%20major%20principle,same%20methodology%20by%20different%20researchers." 
style="font-family: verdana; font-size: large;">wiki</a><span style="font-family: verdana; font-size: large;">).</span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">"Reproduction in climate science tends to have a</span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">broader meaning, and relates to the robustness</span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">of results." -BOM</span></i></div></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><span style="font-size: medium;"><span style="background-color: white; color: #444444; font-family: verdana; text-align: left;"><i><span>"Robustness checks involve reporting alternative specifications that test the same hypothesis. 
</span></i></span><i style="color: #444444; font-family: verdana; text-align: left;"><span> Because the problem is with the hypothesis, the problem is not addressed with robustness checks."</span></i></span></div><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana; font-size: medium;"><i style="color: #444444; text-align: left;"><span>-- </span></i></span><a href="https://statmodeling.stat.columbia.edu/2016/09/18/another-item-for-uris-comment-section/" style="font-family: verdana; font-size: large; outline-width: 0px; text-align: left; user-select: auto;">Uri Simonsohn</a></div><div><span style="font-family: verdana; font-size: medium;"><i style="color: #444444; text-align: left;"><span><b><br /></b></span></i></span></div></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><span style="font-family: verdana;">Tsuyoshi Miyakawa is editor of </span><i style="font-family: verdana;">Molecular Brain</i><span style="font-family: verdana;">; he estimates that a quarter of the manuscripts he has handled contain fabricated data, and is leading the push </span><i style="font-family: verdana;">"No Raw Data, No Science"</i><span style="font-family: verdana;">, to only publish reproducible studies that supply raw data.</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><span style="font-family: verdana;">The BOM supplies little documentation regarding metadata, adjustments, neighbouring stations used, or correlations used for adjustments. 
</span><span style="font-family: verdana;">It is not transparent and so cannot be replicated.</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Excel plug-ins such as XLSTAT already include the Alexandersson algorithms used by the BOM, so it would be possible to replicate the adjustments if sufficient documentation were available. </span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium;"><i>"The removal of false detected inhomogeneities and the acceptance of inhomogeneous series affect <b>each subsequent analysis</b>."</i><span> </span></span><span style="font-size: x-small;">(<i>A. Toreti, F. G. Kuglitsch, E. Xoplaki, P. M. Della-Marta, E. Aguilar, M. Prohom and J. 
Luterbacher, 2010</i>)</span></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">The adjustment software used by the BOM runs at the 95% confidence level, so about 1 in 20 normal sequences will be wrongly flagged as "breaks" or anomalous, and the selection of stations is subject to the same error rate; this in turn affects <i>each subsequent analysis.</i></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5SiZ8rrePy0aCPmsDpiBc7FAMGP9OnM6w0qwgl8D4_eUB471GjCvMKnS4onfdy2X_uXf9HBRo25e74_UCnRu29Jh2RSNFQLSkq5mQrra_5wKP6MmRF7FoT0VNOuJNCom-KFNOENDjtMc/s1120/xxxx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="432" data-original-width="1120" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5SiZ8rrePy0aCPmsDpiBc7FAMGP9OnM6w0qwgl8D4_eUB471GjCvMKnS4onfdy2X_uXf9HBRo25e74_UCnRu29Jh2RSNFQLSkq5mQrra_5wKP6MmRF7FoT0VNOuJNCom-KFNOENDjtMc/w640-h246/xxxx.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>"Homogenization does not increase the accuracy of the data - it can be no higher than the accuracy of the observations. 
The aim of adjustments is to put different parts of a series in accordance with each other as if the measurements had not been taken under different conditions."</i> </span><span style="font-family: verdana; font-size: x-small;">(M. Syrakova, V. Mateev, 2009)</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Fraud Analytics</b><br />The principle in Fraud Analytics is that <i>data that is fabricated or tampered with looks different from naturally occurring data.</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Tools to help in the search:</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">(1) SAS JMP - powerful statistical software designed for data exploration and anomalous pattern detection. It detects patterns such as copy/paste, unlikely duplications and sequences.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">(2) R code for Benford's Law - industrial-strength code to run most of the tests advocated by Mark Nigrini in his fraud analytics books. Benford's law points to digits that are used too much or too little. 
Duplication of exact numbers is a major indicator of fraud. (<a href="http://datacolada.org">Uri Simonsohn</a>)</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span><span style="font-size: medium;">(3) R code from </span><i><span style="font-size: medium;">"Measuring Strategic Data Manipulation: Evidence from a World Bank Project"</span><span style="font-size: large;"> </span><span style="font-size: x-small;"> --</span></i></span><span style="font-size: x-small;"><span> </span><span>By Jean Ensminger and Jetson Leder-Luis</span></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"> (4) R code to replicate the BOM methodology for creating temperature anomalies, for use with Benford's law.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium;">(5) R code from the University of Edinburgh -- </span><i><span style="font-size: medium;">"Technological improvements or climate change? 
Bayesian modeling of time-varying conformance to Benford’s Law" </span><span style="font-size: x-small;">--</span></i><span style="font-size: x-small;"> Junho Lee + Miguel de Carvalho.</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: large;">(6) </span><span style="font-size: medium;"><span style="font-family: verdana;">R code -</span><i style="font-family: verdana;"> </i><i><span style="font-family: verdana;">"<code style="background-color: white; color: #2e2e2e;">NPC</code></span></i><span style="background-color: white; color: #2e2e2e;"><i><span style="font-family: verdana;"><span>: An R package for performing nonparametric combination of multiple dependent hypothesis tests" </span></span></i></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: white; color: #2e2e2e;"><i><span style="font-family: verdana;"><span style="font-size: medium;">--</span></span></i><span style="font-family: verdana; font-size: medium;"> </span><a href="https://caughey.mit.edu/software" style="font-family: verdana; font-size: large;">Devin Caughey</a><span style="font-family: verdana; font-size: medium;"> from MIT</span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: white; color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: white; color: #2e2e2e; font-family: verdana; font-size: large; text-align: justify;">(7) </span><span style="font-size: medium;"><span style="background-color: white; color: #2e2e2e; font-family: verdana; text-align: justify;">R code - </span><span style="color: #2e2e2e; font-family: verdana; text-align: 
justify;"><span>Number-Bunching: A New Tool for Forensic Data Analysis (<a href="http://datacolada.org/77">datacolada</a>). </span><span face="lato, lato, helvetica neue, helvetica, arial, sans-serif"><span style="background-color: white;"> </span></span></span><span style="color: #2e2e2e; font-family: verdana; text-align: justify;">Used to </span></span><span style="font-family: verdana;"><span style="color: #2e2e2e;"><span style="font-size: medium;">analyze the frequency with which values get repeated within a dataset, a common sign of data fraud.</span></span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">(8) CART decision trees from Salford Systems + K-means clustering from JMP.</span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: justify;"><br /></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">_________________________________________________________</span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: 
left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></span></div><div class="separator" style="clear: both; text-align: justify;"><span style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><div style="color: black; font-family: "Times New Roman";"><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">"The Bureau's ACORN-SAT dataset and methods have been thoroughly</span></i></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">peer-reviewed and found to be world-leading." - BOM</span></i></div><div style="font-size: 14.0877px;"><i><span style="font-family: verdana; font-size: medium;"><br /></span></i></div><div style="font-size: 14.0877px;"><i><span style="font-family: verdana; font-size: medium;"><br /></span></i></div><div><span style="font-family: verdana;"><b>Unlocking Data Manipulation With Temperature REPEATS -<br />The Humble Histogram Reveals Tampering Visually.</b></span></div><div><span style="font-family: verdana;"><br /></span></div></span></div></span></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><span style="font-size: medium;">Data Detective Uri Simonsohn's <a href="http://datacolada.org/77">Number Bunching</a> R code is used in forensic auditing to determine how extreme number bunching is in a distribution.</span></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><span style="font-size: medium;"><br 
/></span></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><span style="font-size: medium;">I used to run the code to find fairly subtle distribution discrepancies, before realising that the BOM Raw temperature data has been so heavily engineered that it isn't needed -- the visual display from any stats program shows this specific residue of extreme tampering. </span></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><span style="font-size: medium;"><br /></span></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><span style="font-size: medium;">This visual display is a fingerprint of manipulated data. </span></span><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">It involves a specific structure: 1 temperature that is </span><i style="color: #2e2e2e;">highly repeated</i><span style="color: #2e2e2e;"> in the data, then 4 </span><i style="color: #2e2e2e;">low repeated temps</i><span style="color: #2e2e2e;">, then 1 high repeater, then 5 lower repeats. 
This 4-5 alternating sequence is methodical, consistent and man-made, and </span><i style="color: #2e2e2e;">it leaves gaps between the highest repeated temperatures.</i></span></div><div style="clear: both; text-align: left;"><span style="font-size: medium;"><br /></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">This is immediately visible in virtually all Raw data, and it proves that the data is not observational temperature readings.</span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div style="clear: both; text-align: left;"><span style="color: #2e2e2e; font-family: verdana;"><span style="font-size: medium;">The way to see this is with a particular histogram that can be created with any stats program. Let's talk about the histogram.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #202122;"><span style="background-color: white; font-family: verdana; font-size: medium;">A histogram is an approximate representation of the distribution of numerical data. Let's look at Tennant Creek Maximum Raw temps as an example. The histogram gives you a rough idea of what the distribution looks like, with the temperatures on the horizontal X axis and the frequency (repeats) on the vertical Y axis. This lets you <i>see what temps appeared the most often</i> and shows you the shape of the distribution.</span></span></div><div><br /></div><div><i><span style="font-size: medium;"><span style="color: #202122; font-family: verdana;">But the data is binned</span><span style="color: #202122; font-family: verdana;">: many observations are put in each bin, so you can't tell exactly </span><span style="color: #202122; font-family: verdana;">how many times a specific temperature appeared in the data. 
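</span></span></i></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Exact counts per temperature, which binning hides, can also be tallied directly. What follows is only a minimal Python sketch under stated assumptions, not the JMP workflow behind the graphs in this post: the name <code>value_counts</code> and the sample readings are hypothetical, and readings are assumed to be recorded to 0.1 of a degree.</span></div>

```python
from collections import Counter


def value_counts(temps):
    """Count how often each exact reading occurs (no binning)."""
    # Round to one decimal so float noise does not split identical readings.
    return Counter(round(t, 1) for t in temps)


# Hypothetical sample readings, for illustration only (not BOM data).
sample = [37.8, 37.8, 37.2, 37.8, 36.7, 37.2, 38.3]
counts = value_counts(sample)

# Crude text histogram: one row per distinct temperature.
for temp in sorted(counts):
    print(f"{temp:5.1f}  {'#' * counts[temp]}  ({counts[temp]})")
```

<div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Each row is one distinct temperature and the bar length is its number of repeats, so a heavily repeated value stands out as a long bar.</span></div><div><i><span style="font-size: medium;"><span style="color: #202122; font-family: verdana;">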
</span></span></i></div><div><br /></div><div><span style="color: #202122;"><span style="background-color: white; font-family: verdana; font-size: medium;">Looking at binned histograms, though, won't show you anything unusual on cursory inspection, because the BOM uses <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4647694/">Quantile Matching</a> algorithms to match the distributions of Adjusted data with Raw.</span></span></div><div><span style="color: #202122;"><span style="background-color: white; font-family: verdana; font-size: medium;"> </span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmcYQAlAJRlZ86s2O9BnsNDBrU0WvXJ9BAVR1cXjLAR3YR6TYhR50NkI6dmEHGreEXzZLHE5ttAoTrSN1ILT7i1SVc3cbJeFHxmv8bUWN4g3coVRuD1rPjcV6oLLUxW9LZrLfFio-3USI/s883/tennanthisto.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="474" data-original-width="883" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmcYQAlAJRlZ86s2O9BnsNDBrU0WvXJ9BAVR1cXjLAR3YR6TYhR50NkI6dmEHGreEXzZLHE5ttAoTrSN1ILT7i1SVc3cbJeFHxmv8bUWN4g3coVRuD1rPjcV6oLLUxW9LZrLfFio-3USI/w640-h344/tennanthisto.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>We want to know <i>how many times each individual temperature </i>appears, so we need a <i>histogram that doesn't bin its data.</i></b></span></div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5nflp9w6vUjdgZ_mdtDz_oZH5pn-fJXwCgOv9ATeAIPM3HhrjPwCIYiBY-GovK98_Gg0Z1WMmCbud4RKGtFNGzPL3_ZXU4v80ucB_xisJB1Oyo2Scm0Bghc5m8GdGuEaYJ_sVmGF4Cy0/s784/tennantmaxspikes.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="784" height="554" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5nflp9w6vUjdgZ_mdtDz_oZH5pn-fJXwCgOv9ATeAIPM3HhrjPwCIYiBY-GovK98_Gg0Z1WMmCbud4RKGtFNGzPL3_ZXU4v80ucB_xisJB1Oyo2Scm0Bghc5m8GdGuEaYJ_sVmGF4Cy0/w640-h554/tennantmaxspikes.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Above: We are looking at the exact same distribution, but now <i>each and every temp has a value that shows exactly how often it appears in the data.</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">The highest spike is at 37.8 degrees C, which is repeated 758 times in the Maxraw data. The higher the spike, the more often that temperature appeared.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i>Note -- this is NOT related to Australia going metric and changing to Celsius in 1972; these graphs show the same thing in the 1940s and the 1990s too.</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">And here is the problem for the BOM -- you can see straight away that this is dodgy data. 
The reason data is binned in normal histograms is that if you get down to a granular data level things become very noisy and it's difficult to see the shape of the distribution, a bit like this (below the US NOAA NW region climate data).</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihx7YSqR_Lu1aLvGbTxAvB9XT0_T1PCKPtB8mBuaah59uw-Ujy9bC4sTAsq-JtmrDj-UZ6eNCj2lRvm6xNQ4fME2Kn5FAPbHIH9_IaZEw9ZFmKzq7uUBDXajghfUNliW-axSh4aZa3SNc/s784/noaa.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="784" height="346" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihx7YSqR_Lu1aLvGbTxAvB9XT0_T1PCKPtB8mBuaah59uw-Ujy9bC4sTAsq-JtmrDj-UZ6eNCj2lRvm6xNQ4fME2Kn5FAPbHIH9_IaZEw9ZFmKzq7uUBDXajghfUNliW-axSh4aZa3SNc/w400-h346/noaa.jpg" width="400" /></a></div><br /><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Not so with the BOM data -- things become clearer because this is not observational data, it has been engineered to have specific high repeated temperatures (high spikes), followed by gaps where there are lower frequency (repeated) temps, then a high one again and so on.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><i>What this is saying is that the highest frequency temperatures are neatly ordered between lower repeated temperatures.</i></span></div><div><span 
style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Let's look at Deniliquin Minraw.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">These are the numbers that go into the created histogram.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">The Minraw temp of 8.9 degrees C (top line) is repeated 833 times in the data, creating the highest spike: it appeared the most often, so it has the highest frequency.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiB1x0uIlVoMWmXzgLzSpOLg19TWNDCJsd6j9KUDt6FmPvhmZxDT2TIHn_5KcdIwalevYIXPM7RzgLo2VLTdj9r7mqn6Uvqobz24DHPXuI5OdYRKZijkdnkAGYsQC-i8xtJtUwrmSUfle8/s988/deniminrawspike.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="844" data-original-width="988" height="546" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiB1x0uIlVoMWmXzgLzSpOLg19TWNDCJsd6j9KUDt6FmPvhmZxDT2TIHn_5KcdIwalevYIXPM7RzgLo2VLTdj9r7mqn6Uvqobz24DHPXuI5OdYRKZijkdnkAGYsQC-i8xtJtUwrmSUfle8/w640-h546/deniminrawspike.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">But look what happens --<b> </b>there is a gap of 4 LOW repeated temps then another HIGH repeated temp (next one is 9.4C at 739 
repeats).</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">Then there is a gap of 5 LOW then 1 HIGH, then 4 LOW and so on and so on.</span></span></div><div><br /></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana;">This is RAW data and it is engineered so that</span><span style="color: #2e2e2e; font-family: verdana;"> </span><span style="color: #2e2e2e; font-family: verdana;">there are consistently alternating gaps of <i>4 and 5 low numbers between the extreme high spikes</i>. </span><span style="color: #2e2e2e; font-family: verdana;">And this occurs with most Raw data (at least 80%). It is a major mistake by the BOM: a residue, a leftover from tampering.</span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Recap -- the very high spikes you see in the graph come from a simple histogram <i>without binning</i>, available in most stats programs. It shows us that the so-called RAW data, which is supposed to be observational data,<i> actually has an artificial structure that is man-made!</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">You don't see a high-frequency temperature (high repeat), then 4 very low-frequency temps, then a high-frequency temp, then 5 low temps, and so on, in a dataset from the natural world. 
<i>This is not random, it is engineered, and it is a mistake from one of the BOM algorithms!</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i><br /></i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Let's look at Bourke Minraw:</span></div><div><br /></div><div><span style="color: #2e2e2e;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn37xPiBr2yPoVwDjJVD2_aobwcx7dmdwQ611J5NXyC_2LTRtmhFveAASJE-YeQSZsaOIEoUMzJ7o4EWXqzfZnW3DilMm0_iRv2sZOO-TILEa3KjCne_jaDOftx7sfN_4IY9-1ViS114w/s953/bourkeminspikes.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="811" data-original-width="953" height="544" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn37xPiBr2yPoVwDjJVD2_aobwcx7dmdwQ611J5NXyC_2LTRtmhFveAASJE-YeQSZsaOIEoUMzJ7o4EWXqzfZnW3DilMm0_iRv2sZOO-TILEa3KjCne_jaDOftx7sfN_4IY9-1ViS114w/w640-h544/bourkeminspikes.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; text-align: center;"><br /></div><span style="font-family: verdana; font-size: medium;"><b>Above</b>: Bourke Minraw temperatures, exact same signature.</span></span></div><div><br /></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></span></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></span></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">Let's look at Charters Towers:</span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><span style="font-size: 14.0877px;"><br /></span></span></div><div><span 
style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><span style="font-size: 14.0877px;"><br /></span></span></div><div><span style="color: #2e2e2e;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-wr6olx7jjmYuv6JHva9enhWBgGKVcaRjRHM25KCN4YKV85NrMBgi0WtltAFhISVpKSZYGN69Wie5U6hM01TkEhu2s6LUlgr4s1GRNSBVYtzpKY6vfccihhlcl3zS6ZjUrDMb5YEM3pk/s944/chartersmaxspikes2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="812" data-original-width="944" height="550" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-wr6olx7jjmYuv6JHva9enhWBgGKVcaRjRHM25KCN4YKV85NrMBgi0WtltAFhISVpKSZYGN69Wie5U6hM01TkEhu2s6LUlgr4s1GRNSBVYtzpKY6vfccihhlcl3zS6ZjUrDMb5YEM3pk/w640-h550/chartersmaxspikes2.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; text-align: center;"><br /></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><b style="font-family: verdana; font-size: large;">Above:</b><span style="font-family: verdana; font-size: large;"> </span><span style="font-family: verdana; font-size: medium;">The same fraud signature showing Raw data is not Raw but overcooked. These high alternating repeated temps between the low ones are unnatural and there is no explanation for this except large scale tampering of Raw. 
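</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">The spacing between spikes can be measured rather than judged by eye. The following is a minimal Python sketch, assuming a spike is any temperature whose count is at least <code>factor</code> times the median count. That threshold, the function name <code>spike_gaps</code> and the counts used here are all hypothetical choices made for illustration; they are not BOM data or BOM code.</span></div>

```python
from statistics import median


def spike_gaps(counts, factor=3.0):
    """Find spikes and count the distinct values lying between them.

    A spike is any value repeated at least `factor` times the median
    count. The threshold is an assumption made for this sketch.
    """
    values = sorted(counts)
    cutoff = factor * median(counts.values())
    spike_idx = [i for i, v in enumerate(values) if counts[v] >= cutoff]
    spikes = [values[i] for i in spike_idx]
    # Number of lower-count values strictly between neighbouring spikes.
    gaps = [b - a - 1 for a, b in zip(spike_idx, spike_idx[1:])]
    return spikes, gaps


# Hypothetical counts (temperature: repeats), invented for illustration.
counts = {8.9: 800, 9.0: 60, 9.1: 55, 9.2: 58, 9.3: 61, 9.4: 750,
          9.5: 57, 9.6: 59, 9.7: 62, 9.8: 56, 9.9: 60, 10.0: 700}
spikes, gaps = spike_gaps(counts)
print(spikes)  # [8.9, 9.4, 10.0]
print(gaps)    # [4, 5]
```

<div><span style="font-family: verdana; font-size: medium;">Run over a station's exact-value counts, the returned list of gaps shows whether the spacing between spikes really alternates between 4 and 5. The result depends on the chosen <code>factor</code>, so it is worth trying more than one value.</span></div><div><span style="font-family: verdana; font-size: medium;">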
</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">This tampering is so extreme it doesn't exist to this level in other climate data from other agencies -- the BOM is the most heavy handed and brazen.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">Let's do a quick tour around various stations with just the visual histogram:</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjd7T3R57d7JrfN0CfABUVTgXsa9hyphenhyphenxSFfK7UzS2T6SOtCX302cDhYAlI7_tmKXCoNZNxbgI18MrfSBRsxUIEojAk9EtVnbE5Tq86w6RATtnBF2L0xcTYyjOPMaaRuOFJEVnaoCl2PFs0U/s1301/bbb.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="586" data-original-width="1301" height="288" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjd7T3R57d7JrfN0CfABUVTgXsa9hyphenhyphenxSFfK7UzS2T6SOtCX302cDhYAlI7_tmKXCoNZNxbgI18MrfSBRsxUIEojAk9EtVnbE5Tq86w6RATtnBF2L0xcTYyjOPMaaRuOFJEVnaoCl2PFs0U/w640-h288/bbb.jpg" width="640" 
/></a></div><br /><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDrEnJQ2jzxnrNmQL_txXyyX_YuE9sGhoWPSCqoKyfZ15jau3Bs696ssrb5Fpb6FdxhbUhtnKeAqi_8MPjiILFHLYRmpBC-f9CA9jNqkv2yD_qc1UYSO7ingJgZV_O0zDyxOAqUTqYRnU/s1329/aaa.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="593" data-original-width="1329" height="286" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDrEnJQ2jzxnrNmQL_txXyyX_YuE9sGhoWPSCqoKyfZ15jau3Bs696ssrb5Fpb6FdxhbUhtnKeAqi_8MPjiILFHLYRmpBC-f9CA9jNqkv2yD_qc1UYSO7ingJgZV_O0zDyxOAqUTqYRnU/w640-h286/aaa.jpg" width="640" /></a></div><br /><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUo_P8tHk9ox3ZFE44pNu3SHpixtKTBky3MBDP7PlX1zBYTVzN5uxBglkMJZkmQJlWKyWDAFet-rp6MNyVed-lQAU3ZBSESU32xUoBKyzPo7N0BnGrVMfxhdzZbkGgNQVkgqhCx6btfuE/s1336/cccc.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="646" data-original-width="1336" height="310" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUo_P8tHk9ox3ZFE44pNu3SHpixtKTBky3MBDP7PlX1zBYTVzN5uxBglkMJZkmQJlWKyWDAFet-rp6MNyVed-lQAU3ZBSESU32xUoBKyzPo7N0BnGrVMfxhdzZbkGgNQVkgqhCx6btfuE/w640-h310/cccc.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMZanRBtfT3jh_gDuM9fhMGDwbwzgYYZt4ACxADPQ6UKvlnqAy4feciYe0KUWl59ZA3vojpEXasXUtJFhXqeQ180OjIsMmWOrVJ9Qipva1OkM0KAV4geK-9DUR84GNS6x7mkLSRDJ2aus/s1322/dddd.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="598" data-original-width="1322" height="290" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMZanRBtfT3jh_gDuM9fhMGDwbwzgYYZt4ACxADPQ6UKvlnqAy4feciYe0KUWl59ZA3vojpEXasXUtJFhXqeQ180OjIsMmWOrVJ9Qipva1OkM0KAV4geK-9DUR84GNS6x7mkLSRDJ2aus/w640-h290/dddd.jpg" width="640" /></a></div><br /><span style="font-family: verdana; font-size: medium;">All Raw, all have extreme spikes showing extreme repeats, all have the same 4-5 alternating gaps! All are unnatural.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">Now what happens when the Raw get Adjusted?</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">The spikes get turned down, their frequency and rate of repeats is reduced, but the vast body of <i>lower repeated temps </i>are increased.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">Look at Deniliquin Minraw-- 7.2C degrees is repeated 799 times in Raw but only 362 times in Minv2 adjusted.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">But the low repeats in the gaps are increased!</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", 
"Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmRJJ6LSVFyTUF7k6_GXh_nNLc2qDBK5QDVSw_EhRc6gntShACji1dLqH0eMSIXoKP_cqFXxPRDkThCS9tHvw_3qdqbgWhXNDNQyZBJBFEyRR9n5c33aI7_vmC80B2vcOYSmF8pLi_H_M/s894/denirepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="894" data-original-width="731" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmRJJ6LSVFyTUF7k6_GXh_nNLc2qDBK5QDVSw_EhRc6gntShACji1dLqH0eMSIXoKP_cqFXxPRDkThCS9tHvw_3qdqbgWhXNDNQyZBJBFEyRR9n5c33aI7_vmC80B2vcOYSmF8pLi_H_M/w524-h640/denirepeats.jpg" width="524" /></a></div><br /><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; text-align: center;"><br /></div><span style="font-family: verdana; font-size: medium;">Let's look at Bourke repeated temps and compare Raw to Adjusted.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjESyOG8dpsVLJnu1NsNx5i2SeTN7Pr0LsUdddl_cZNhp7ycG2X08EJOwVIzSWRDQJ0MDqrq1dpHp4guo7Gvdq1W5pk43-ZO3wjvSoHiSUkSArYiaB_I6QPXlclE8o54h6fY_KyR1hnesw/s841/bourke+repeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img 
border="0" data-original-height="841" data-original-width="621" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjESyOG8dpsVLJnu1NsNx5i2SeTN7Pr0LsUdddl_cZNhp7ycG2X08EJOwVIzSWRDQJ0MDqrq1dpHp4guo7Gvdq1W5pk43-ZO3wjvSoHiSUkSArYiaB_I6QPXlclE8o54h6fY_KyR1hnesw/w472-h640/bourke+repeats.jpg" width="472" /></a></div><br /><div><span style="font-family: verdana; font-size: medium;">Same thing, the highest repeated temps, the spikes in the graph are reduced by reducing the frequency with which they appear. The low frequency temps are increased.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">Tennant Creek, same thing:</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvTsiEfouNtMUbcftF1aKrcBfeyhV-B0zgSAn-rCApc4U6v5-naoSxotPIGh1Crz-f6dSHbnMddFEkV6_t23y65t4ekMy7YQXq7fqUtD_8ICn1bF5IWGIHmD9yQj9J0OsBolcsuwHKa3s/s857/tennatrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="857" data-original-width="661" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvTsiEfouNtMUbcftF1aKrcBfeyhV-B0zgSAn-rCApc4U6v5-naoSxotPIGh1Crz-f6dSHbnMddFEkV6_t23y65t4ekMy7YQXq7fqUtD_8ICn1bF5IWGIHmD9yQj9J0OsBolcsuwHKa3s/w494-h640/tennatrepeats.jpg" width="494" /></a></div><br /><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><br /><span style="font-family: verdana; font-size: medium;">What is the net result of doing adjustments on raw data? 
The high spikes are reduced, getting rid of the evidence.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">The low-frequency temps (of which there are many more) are increased in frequency, giving a net warming effect.</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><i>Temperatures are controlled by reducing or increasing the frequency with which they are repeated in the data!</i></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="font-family: verdana; font-size: medium;">Tennant Creek Maxv2 Adjusted -- spikes reduced; they now merge with the gaps, so everything appears more kosher on a cursory inspection. 
</span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZB5f4x7PfuVcX_7D6y_-I51ACh7q8CmEci3iDWLlY_mLmSQsZVIhEVICyX8ukVOyl1dPOrxiqJakPWKkEUFTbqA_Kq6DML4TG7l97X30FqjUUZ4YD6f7shtMLXCYaeMvZRCPW2prOdug/s784/tennantlowspikes.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="784" height="554" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZB5f4x7PfuVcX_7D6y_-I51ACh7q8CmEci3iDWLlY_mLmSQsZVIhEVICyX8ukVOyl1dPOrxiqJakPWKkEUFTbqA_Kq6DML4TG7l97X30FqjUUZ4YD6f7shtMLXCYaeMvZRCPW2prOdug/w640-h554/tennantlowspikes.jpg" width="640" /></a></div><br /></span></div><div><span style="color: #2e2e2e;"><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; text-align: center;"><br /></div></span></div><div><div><br /></div><span style="font-family: verdana; font-size: medium;"><b>Below:</b> A different look at how temperature frequency is manipulated up or down.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">You can see that in Raw, 15C repeated 827 times, while in Minv2.1 it appears 401 times. 
This reduces the large spikes in the Adjusted histograms.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2G4rd3v-_ryG4HN4Qpovpm7Vpz-2ilSUIO61RroKR_moLSxpZN3l-bRrYlKG3mm8j0pDzEYW-hcVTy9TdPC8WNACFFj4jyMJpFhjUo6TihjX9qONvhpkk29aq6QueqbcALPAjqtpWO0w/s1231/moruyafreqcomapre.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="504" data-original-width="1231" height="262" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2G4rd3v-_ryG4HN4Qpovpm7Vpz-2ilSUIO61RroKR_moLSxpZN3l-bRrYlKG3mm8j0pDzEYW-hcVTy9TdPC8WNACFFj4jyMJpFhjUo6TihjX9qONvhpkk29aq6QueqbcALPAjqtpWO0w/w640-h262/moruyafreqcomapre.jpg" width="640" /></a></div><div><br /></div><div><br /></div></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><div><b style="color: #2e2e2e; font-size: large;"><span style="font-family: verdana;">Summary Of Histograms Showing Patterns In Raw:</span></b></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><i>Histograms with zero binning</i> at a granular level expose systematic tampering, with the RAW data engineered to a specific layout: the most frequently repeated temperatures are followed by alternating gaps of 4 and 5 low-frequency temperatures, then a single high-frequency temp, and so on.</span></span></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><br /></span></span></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; 
font-family: verdana; font-weight: 400;">Normally, for data where the bunching is more subtle than in the BOM Raw, Uri Simonsohn's Number Bunching R code is required to detect extreme number bunching or repeating. But the BOM data is so heavy-handed that it has left a visually obvious residue from the tampering algorithms. This is confirmed when comparing BOM temps to other agencies: none display the extreme spikes and gaps. The BOM really is world-leading with its data tampering.</span></span></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-weight: 400;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e; font-weight: 400;">This is a visual residue of large-scale tampering. All Adjustments from RAW are moot. </span><span style="color: #2e2e2e;"><b>Raw</b></span><b style="color: #2e2e2e;"> Is Very Cooked.</b></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;">_________________________________________________________________________</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 
400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><br /></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><i style="background-color: white; font-family: verdana; font-size: large;">"Carefully curating and correcting records is global best practice for analysing temperature data." </i></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-weight: 400;"><span style="background-color: white; color: black; font-family: verdana; font-size: large;"><span> </span> </span><span style="background-color: white; color: black; font-family: verdana; font-size: medium;"> -- BOM.</span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: medium; font-weight: 400;"><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; text-align: center;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; text-align: center;"><br /></div></span></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><b>PATTERN EXPLORATION A.K.A Copy/Paste</b></span></span></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Looking for strange patterns and duplication of sequences.</b></span></div><div><span 
style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i>SAS JMP</i><b> </b>was used for this <a href="https://community.jmp.com/t5/Tutorials/Identifying-Unusual-Patterns-that-Might-Indicate-Data-Integrity/ta-p/273248">pattern exploration</a> section; see the video link for how this works on Pharma data and how JMP finds anomalous or suspicious values.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">JMP computes the probability of a sequence occurring at random, depending on the number of unique values, the sample size and so on.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;"><i>I have only listed sequences here whose chance of occurring at random, as calculated by JMP, is less than 1 in 100 000; a full month copy/pasted gets 100% certainty for fabrication. </i></span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Copy/pasting exact data into another month or year is the ultimate in lazy tampering. It's incredible the BOM didn't think anyone would ever notice!</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Having a run of days ALL with the exact same temperature to 1/10 of a degree is dodgy too. 
This proves raw is not raw.</span></div><div><br /></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana;"><b>Below:</b></span><span style="color: #2e2e2e; font-family: verdana;"> Sydney Min Raw - A full 31 days copy/pasted into the following year.</span></span></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana;"><br /></span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">If this is possible, then anything is possible. And in a major capital city too. It's not as if they didn't have the data; they leave thousands of entries blank. The correct procedure is to use proper imputation methods or leave the data missing.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7FRZe3ksXbAWyiflUfNvF2Dxvln6xA3-21E2sQDndHgURGVSSEoJA4GvWGvFdE3d_9_-lhtlAkEtCoXsJC_2wjojDMUQG8IcmdarUMta3SfhrzqrEooboIccicgLiBYmsdYav3r5LMd4/s835/aaazzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="822" data-original-width="835" height="630" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7FRZe3ksXbAWyiflUfNvF2Dxvln6xA3-21E2sQDndHgURGVSSEoJA4GvWGvFdE3d_9_-lhtlAkEtCoXsJC_2wjojDMUQG8IcmdarUMta3SfhrzqrEooboIccicgLiBYmsdYav3r5LMd4/w640-h630/aaazzzz.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /></div><div><span style="color: #2e2e2e; font-family: 
verdana; font-size: medium; font-weight: 400;">And it's not one-off. Another full Sydney month copy/pasted into the following year.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTHzJ8DcCg-t_7vifFdWItG6ugIlpSNBE7rGoTj4HuuEuyoC7bU5qYrSMrUoP6AwqLfwDY8B3X0ixW3TUbJ8-VaW2tZfvVbjItOmLsE9CCPnE_4jstLT-mN31nYJvcej46UqZysQ3ftsw/s787/bbbbzzzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="787" data-original-width="684" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTHzJ8DcCg-t_7vifFdWItG6ugIlpSNBE7rGoTj4HuuEuyoC7bU5qYrSMrUoP6AwqLfwDY8B3X0ixW3TUbJ8-VaW2tZfvVbjItOmLsE9CCPnE_4jstLT-mN31nYJvcej46UqZysQ3ftsw/w556-h640/bbbbzzzzz.jpg" width="556" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">More Sydney, another month copy/pasted. 
Notice, this is Raw Data and Adjusted....no-one will ever notice!</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEictfLZIoQLs-01FZcHIkW8onxfwjdYtqAgWHpeKUkakSTsORc1epJOIWVkbnuKVo8__7zFegmGOjxgBNqvTrT9hapRVyoJELH6w-NLZIAJMImUTTkHHsmpVJgMYG2FrKnQSHrXX0Gr71Q/s783/ccczzzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="783" data-original-width="389" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEictfLZIoQLs-01FZcHIkW8onxfwjdYtqAgWHpeKUkakSTsORc1epJOIWVkbnuKVo8__7zFegmGOjxgBNqvTrT9hapRVyoJELH6w-NLZIAJMImUTTkHHsmpVJgMYG2FrKnQSHrXX0Gr71Q/w318-h640/ccczzzzz.jpg" width="318" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbhIuB6sc95zoFn5in0R8n9DD04ibwMaHuCJoRnig4siHawaB2skw7rdQCUcImW2b5dFnfdleVRQF63IP6ciG2h-d5AvLf9S-jwG2jMSQmBVz8AOO2WMQINnXmX-xm6XeYRekbBTe8ZtM/s733/dddzzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="733" data-original-width="448" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbhIuB6sc95zoFn5in0R8n9DD04ibwMaHuCJoRnig4siHawaB2skw7rdQCUcImW2b5dFnfdleVRQF63IP6ciG2h-d5AvLf9S-jwG2jMSQmBVz8AOO2WMQINnXmX-xm6XeYRekbBTe8ZtM/w392-h640/dddzzzz.jpg" width="392" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div><br /></div></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: 
#2e2e2e;"><b>Below:</b></span><span style="color: #2e2e2e; font-weight: 400;"> Richmond duplicate sequences. Raw as well.</span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKFmNTfkdIfnPJtFCsfCTdqLwbc12Ta1EkI_hF8KU-aaOI342FfX8UVWNSe35NHq-sk5hAkMgPnr6RKb-MVGgagLrk8SRpInzV9yNA1Vr1U8XV3rjXooZG_wp9TQ1n_dwY2EnI_0kY6ng/s882/2020-12-18_175927.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="882" data-original-width="697" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKFmNTfkdIfnPJtFCsfCTdqLwbc12Ta1EkI_hF8KU-aaOI342FfX8UVWNSe35NHq-sk5hAkMgPnr6RKb-MVGgagLrk8SRpInzV9yNA1Vr1U8XV3rjXooZG_wp9TQ1n_dwY2EnI_0kY6ng/w506-h640/2020-12-18_175927.jpg" width="506" /></a></div><br /><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><b>Below:</b></span><span style="color: #2e2e2e; font-weight: 400;"> Georgetown duplicated sequences = dodgy Raw data.</span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRuN8rYaHA58uravMtxVPvqGIQCI7gxoZ-4NEQ9OEOO9vs-8T4zCl-WifZRd8H7tU2DL1fZwgS3aJ4TcX4FFm4kEL3kjWXiDTbQdmsFjK5Ek-C2IaRF1JtNn5j9JfwVfA7wCsrIXfy5AM/s814/2020-12-18_172753.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="729" data-original-width="814" height="574" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRuN8rYaHA58uravMtxVPvqGIQCI7gxoZ-4NEQ9OEOO9vs-8T4zCl-WifZRd8H7tU2DL1fZwgS3aJ4TcX4FFm4kEL3kjWXiDTbQdmsFjK5Ek-C2IaRF1JtNn5j9JfwVfA7wCsrIXfy5AM/w640-h574/2020-12-18_172753.jpg" width="640" /></a></div><div><br /></div></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><div><span style="font-family: verdana; font-size: medium;">Palmerville - over 2 weeks with the exact same temps, to 1/10 of a degree.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikS5gH9cPjf0yQNze3LA9wpH1D_YKKps8YSDgcDqU2b9xcLFQJo_PziyTEm5TPW9DvG8GW8HqxImGIfwd297aH70CERr-Ipkfj4NI6G-_EDdX3oIwJTQNVPZoqp67n3GViTX7oIskNcGo/s702/2020-12-18_175153.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="614" data-original-width="702" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikS5gH9cPjf0yQNze3LA9wpH1D_YKKps8YSDgcDqU2b9xcLFQJo_PziyTEm5TPW9DvG8GW8HqxImGIfwd297aH70CERr-Ipkfj4NI6G-_EDdX3oIwJTQNVPZoqp67n3GViTX7oIskNcGo/w640-h560/2020-12-18_175153.jpg" width="640" /></a></div></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><b>Below:</b></span><span style="color: #2e2e2e; font-weight: 400;"> Comooweal - 
</span></span><i style="color: #2e2e2e; font-family: verdana; font-size: large;">I love how they are unsure what to put into</i></div><div><i style="color: #2e2e2e; font-family: verdana; font-size: large;">2002-03-05 in Maxv1!</i></div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZYhEkcQ1utRNOI-DbH2peYYGjyEKzNNlGNwxC7fHBa-UM4wqe19KBxeAkQRGMshfd1X9v3ER6CgcmA67HWn8LCObrYcxot9kMVTlKiVhcb0T7MrVr9oGETj-UUANzYXDKrMF3geOoFbk/s468/comooweal33.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="468" data-original-width="453" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZYhEkcQ1utRNOI-DbH2peYYGjyEKzNNlGNwxC7fHBa-UM4wqe19KBxeAkQRGMshfd1X9v3ER6CgcmA67HWn8LCObrYcxot9kMVTlKiVhcb0T7MrVr9oGETj-UUANzYXDKrMF3geOoFbk/w620-h640/comooweal33.jpg" width="620" /></a></div><div><br /></div><div style="font-weight: bold;"><br /></div><span style="font-family: verdana; font-size: medium;"><b>Below: </b>Cairns -- Full month copy/pasted in Max Raw.<br /></span><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><b><br /></b></span></div><div><span style="color: #2e2e2e;"><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2WzOxM9hh8rr0Wyy9jJNBazl7_PJS_A1Fh9flA0oh6i2fZKY_QFQAEbSK8qn1sHIDHRGbqj6jHlDsKzjaEc4iXLp0RptSCtQWxzRSm_KF95gE2dtHHG9jlsteG-yIZJHxmAlzrwh59zw/s955/cairns1month.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="903" data-original-width="955" height="606" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2WzOxM9hh8rr0Wyy9jJNBazl7_PJS_A1Fh9flA0oh6i2fZKY_QFQAEbSK8qn1sHIDHRGbqj6jHlDsKzjaEc4iXLp0RptSCtQWxzRSm_KF95gE2dtHHG9jlsteG-yIZJHxmAlzrwh59zw/w640-h606/cairns1month.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; text-align: center;"><br /></div></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><div class="separator" style="clear: both; text-align: center;"><br /></div></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Tennant Creek -- paste January temps into March, that'll warm it!</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><b><br /></b></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><b><br /></b></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf_Hwf0_wOss4DdTQE3eSFar29oRVN8aRJ9TaCqMRKKfI-AExbEre2MM5j6gPNq7s85WpWUr2Sqs3gm7gQJINFM2znT-7PGLeq3ic5G6WtOQ4zxu-RWo1DvsaNd4ORLFqRon1SiuZFiaw/s506/zzztenant.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="437" data-original-width="506" height="552" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf_Hwf0_wOss4DdTQE3eSFar29oRVN8aRJ9TaCqMRKKfI-AExbEre2MM5j6gPNq7s85WpWUr2Sqs3gm7gQJINFM2znT-7PGLeq3ic5G6WtOQ4zxu-RWo1DvsaNd4ORLFqRon1SiuZFiaw/w640-h552/zzztenant.jpg" width="640" /></a></div><br /><b><br /></b></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Tennant Creek. </span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><b><br /></b></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC2tvFh0t4ZQNzWvrQ8p2_wL8Ap_OJaC2CnQ5g3u9dnma35WAwf3T93Mt6aSvak7JMhuF6gyCYEBZQ23jVIaXjf7zHanJ-BSW_EXwljGBEDbGoEXfsx8L_SI7MtUMC7vAm5F1Fi0EN08Y/s627/tennant.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="627" data-original-width="509" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC2tvFh0t4ZQNzWvrQ8p2_wL8Ap_OJaC2CnQ5g3u9dnma35WAwf3T93Mt6aSvak7JMhuF6gyCYEBZQ23jVIaXjf7zHanJ-BSW_EXwljGBEDbGoEXfsx8L_SI7MtUMC7vAm5F1Fi0EN08Y/w520-h640/tennant.jpg" width="520" /></a></div></span></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Port Macquarie</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Look at the top week - a change of week on the second day, pasted into another year but the change of week on the second day is mimicked!</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br 
/></span></div><div><span style="font-family: verdana; font-size: medium;">At least they are fabricating consistently.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGXyJgGciv3hAHcWp9YiP3RcPWF_Jah3l-Qg2hhZh2rgS0izo_7mZ7q45_JuazmPz6GyepMzdB6S1uNefoa4gotz1AzQGbEUCvHCmdrTKIZV-adOaKPKGCAjxP2b9HtirmtN6s3Ro5uog/s785/port333.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="785" data-original-width="464" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGXyJgGciv3hAHcWp9YiP3RcPWF_Jah3l-Qg2hhZh2rgS0izo_7mZ7q45_JuazmPz6GyepMzdB6S1uNefoa4gotz1AzQGbEUCvHCmdrTKIZV-adOaKPKGCAjxP2b9HtirmtN6s3Ro5uog/w378-h640/port333.jpg" width="378" /></a></div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><br /></div><div><span style="color: #2e2e2e;"><i><span style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: bold;"> </span><span style="font-family: verdana; font-size: medium;">"The data available via
Climate Data Online is generally considered ‘raw’,
by convention, since it has not been analysed,
transformed or adjusted apart from through basic
quality control. " -BOM</span></i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><i><br /></i></span></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><b>Below:</b> Bourke, copy/paste July Into June, that'll cool it down! </span></div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHaagJ7epdIWvOh4Xkyiqjpfan-gQSky5EfVGwy3083CA-l7_Mn6LZ0r6ig-L6eQvQD5AHEg2zgBE-QuTfdVUz7NevfcN01TVyR189fHoI_px8xymOA5cUvwBMN5CmzD2d4ddBszMz4VQ/s515/bourkemaxRAWseqRare35.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="478" data-original-width="515" height="594" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHaagJ7epdIWvOh4Xkyiqjpfan-gQSky5EfVGwy3083CA-l7_Mn6LZ0r6ig-L6eQvQD5AHEg2zgBE-QuTfdVUz7NevfcN01TVyR189fHoI_px8xymOA5cUvwBMN5CmzD2d4ddBszMz4VQ/w640-h594/bourkemaxRAWseqRare35.jpg" width="640" /></a></div><br /><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><b>Below:</b></span><span style="color: #2e2e2e; font-weight: 400;"> </span><span style="color: #2e2e2e;">Charleville</span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i>Here's a great way to cool down a month -- copy/paste the entire August temperatures into September.</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">I kid you not.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdGE7zysWeNE4HlCFzDknHNEvsPZhDHc0B96ce5GVWxmrDcUmfB_DhVI1MyGk-h5RDuV2bQj7sJpNMKNiCae7D1hAMSZrdNdsmqdkHCYRHHKuvi4cB-TI14eGXXwWUZnkW-C5kIJY_tGE/s808/char1.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="808" data-original-width="569" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdGE7zysWeNE4HlCFzDknHNEvsPZhDHc0B96ce5GVWxmrDcUmfB_DhVI1MyGk-h5RDuV2bQj7sJpNMKNiCae7D1hAMSZrdNdsmqdkHCYRHHKuvi4cB-TI14eGXXwWUZnkW-C5kIJY_tGE/w450-h640/char1.jpg" width="450" /></a></div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWNwBsJigo_07X2WqWc3KX4qG4Nk65by3Tfyx_a4EFVOvbrFSRjIXr3JNbK2q0E_xxl-p8aL6yBAdpDI4Hgc6OLVcZ4H0tz-r7Pp49Lf2AhqKuA8EEJ3O69d7YIdZ20LTdfwdu0xFkQNA/s805/char2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="805" data-original-width="580" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWNwBsJigo_07X2WqWc3KX4qG4Nk65by3Tfyx_a4EFVOvbrFSRjIXr3JNbK2q0E_xxl-p8aL6yBAdpDI4Hgc6OLVcZ4H0tz-r7Pp49Lf2AhqKuA8EEJ3O69d7YIdZ20LTdfwdu0xFkQNA/w462-h640/char2.jpg" width="462" /></a></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">I left the best for last. 
I could go on and on with these sequences, but this is the last one for now; it's hard to beat.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Charleville:</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Let's copy the full month of December into December of the following year.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">And let's do this for ALL the Raw and Adjusted temperature series.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">BUT let's not make it so obvious -- we'll hide this by changing ONE value and DELETING two values.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">You've got to love the subtlety here.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4vnT5NdnB7pyM6waw1burZK-cHn3FkG6mrkANPe0JWt2Ked80KpIVkjgzrUNMuUsnPvPJfhfvoT0ZXiy2Sn3spLpS6FZZjx-Sh19gIaV6Q9jCYkY4oYa1Aa381Smk93B6ZWKAn_RmuN8/s1093/charlevilefinale.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="894" data-original-width="1093" height="524" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4vnT5NdnB7pyM6waw1burZK-cHn3FkG6mrkANPe0JWt2Ked80KpIVkjgzrUNMuUsnPvPJfhfvoT0ZXiy2Sn3spLpS6FZZjx-Sh19gIaV6Q9jCYkY4oYa1Aa381Smk93B6ZWKAn_RmuN8/w640-h524/charlevilefinale.jpg" width="640" /></a></div><br /><span style="color: #2e2e2e;"><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br 
/></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i>"Producing analyses such as ACORN-SAT involves much work, and typically takes <b>scientists</b> at the Bureau of Meteorology several years to complete. " -BOM</i></span></div></span></div><div><br /></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Summary of Pattern Exploration</b></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">I deleted quite a few sequences in a re-write of the blog because I could go on and on. There are hundreds of sequences, ranging from very suspicious to confirmed fabrication. This is a sampling of what is out there.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Charleville is my favourite. Changing 1 value out of 31 in the Minv2 adjustment data above was a masterstroke....they must have found a 'break' through neighbouring stations!</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Overall, the sequences show 100% definite data tampering and fabrication on a large scale. What this shows is a complete lack of data integrity. A forensic audit is long overdue. </span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><br /></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;">_________________________________________________________________________</span></div><div><br /></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><i>"ACORN-SAT data has its own quality control
and analysis..." -BOM </i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px;"><b><br /></b></span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>BENFORD'S LAW INDICATES EXCESSIVE DIGIT FREQUENCY</b></span></div><div><br /></div><div><span><div><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;">Benford's Law Fraud Analytics</span></b></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">Benford's Law has been used with great success for many years, in applications ranging from detecting money laundering and financial scams to tracking hurricane distances travelled and predicting times between earthquakes, and it is accepted into evidence in courts of law in the USA. </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;"><div>Benford's Law can also be applied to ratio- or count-scale measures that have sufficient digits and that are not truncated (Hill & Schürger, 2005).</div></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">It describes the distribution of leading digits in many naturally occurring data sets,<i> including temperature anomalies</i> (<a href="https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2010GL044830">Sambridge et al, 2010</a>). 
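<br /><br />For readers who want to see the reference distribution itself, here is a minimal Python sketch of the expected Benford proportions for the first digit and for the first two digits. The function names are illustrative assumptions, not part of any tool used in this post; the formula log10(1 + 1/d) is the standard statement of the law.

```python
import math


def benford_first_digit() -> dict:
    """Expected share of each leading digit 1..9: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_first_two_digits() -> dict:
    """Expected share of each leading digit pair 10..99: log10(1 + 1/n)."""
    return {n: math.log10(1 + 1 / n) for n in range(10, 100)}


if __name__ == "__main__":
    # Print the nine first-digit shares; digit 1 should lead about 30% of values.
    for digit, share in benford_first_digit().items():
        print(f"{digit}: {share:.4f}")
```

The first-two-digits table is the red dotted reference curve that the graphs in this section are compared against.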
</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">Some novel innovations to increase the accuracy of Benford's Law analysis have been developed <a href="https://jee.caltech.edu/documents/1255/Ensminger_and_Leder-Luis.pdf">in this paper</a>, and they have been correlated and validated against an actual forensic audit done at the World Bank. </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-size: medium;"><span style="font-family: verdana;">If a data distribution </span><i style="font-family: verdana;">should</i><span style="font-family: verdana;"> follow a Benford's Law distribution and it doesn't, something is going on with the data. It is a red flag for an audit, and the data is likely to have been tampered with.</span></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">The first graph below shows <i>Hobart, Sydney, Melbourne, Darwin and Mildura Maxv2 combined, for 200 000 data points.</i> Running a Benford's Law analysis on the first two digits produces weak conformance based on Nigrini's Mean Absolute Deviation parameter. </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: large;"> </span></div><div><span style="font-family: verdana; font-size: medium;"><i><div>This supports the hypothesis that Benford’s Law is the appropriate theoretical distribution for our dataset. 
Importantly, this does not indicate that the data is legitimate, as pooled data may cancel out</div><div>different individual signatures of manipulation and replicate Benford’s Law (Diekmann 2007, Ensminger+LederLuis 2020).</div></i></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: large;"><br /></span></div></div></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;">Below: 5 cities aggregated and Benfords Law curve of first 2 digits (red dotted line). </span><span style="font-family: verdana; font-size: medium;">The individual spikes/gaps indicate excessive overuse/underuse of specific numbers.</span></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX9Rs1EpCF5LZV8uhfO7uTqYz9QjXT4q8FyuLGcN-ae7djO-ffKkFdZ95inY0yEajvGzabiUi_nqhe4cs3EC0re7rUY5OCHVIBepDs6h-Fh01Zpjpp2Sqju0wGqRUY7njjAoxbg6MQ_Zg/s980/2digits5cities.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="936" data-original-width="980" height="612" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX9Rs1EpCF5LZV8uhfO7uTqYz9QjXT4q8FyuLGcN-ae7djO-ffKkFdZ95inY0yEajvGzabiUi_nqhe4cs3EC0re7rUY5OCHVIBepDs6h-Fh01Zpjpp2Sqju0wGqRUY7njjAoxbg6MQ_Zg/w640-h612/2digits5cities.jpg" width="640" /></a></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div></span></div><div><span style="color: #2e2e2e;"><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">The above curve with all the aggregated data still has a bias with low numbers 10-15 appearing too few times, 17-45 appearing too often, then specific high numbers 
appearing with too low a frequency and a few high numbers popping up slightly. </span></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black;"><br /></div><div style="color: black;">___________________________________________________________________________________</div><div style="color: black; font-family: "Times New Roman"; font-size: medium;"><br /></div></span></div><div><br /></div><div><br /></div><div><span style="color: #2e2e2e;"><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;">Benford's Law on Individual Stations.</span></b></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;"><b style="font-weight: 400;">Below: </b>Deniliquin, Raw + Adj</span></div><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">A Benford analysis of the first two digits across the entire temperature series from 1910-2018 shows extreme non-conformance and a tiny p-value in both the Max Raw and Max Adjusted data. </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">You can see the high systematic spikes in the graph, indicating excessive use of specific digits in temperature anomalies.</span></div><div style="color: black; font-weight: 400;"><br /></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">The tiny p-value, less than 2.16e-16, indicates rejection of the null hypothesis that this data set follows Benford's Law. In other words, there is something wrong with the data. 
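<br /><br />The first-two-digits test described above can be sketched in Python as follows. This is an illustration only, not the software used to produce the graphs: the function names are hypothetical, and the digit extraction assumes readings are plain numbers with zero and missing values skipped. It returns Nigrini's Mean Absolute Deviation and the chi-square statistic for the 90 possible digit pairs.

```python
import math
from collections import Counter


def leading_digits(value: float, n: int = 2):
    """Return the first n significant digits as an int, or None for zero/missing."""
    if value == 0 or not math.isfinite(value):
        return None
    # Scientific notation puts the significand first, e.g. 23.9 -> '2.390000e+01'.
    significand = f"{abs(value):.6e}".split("e")[0].replace(".", "")
    return int(significand[:n])


def two_digit_conformance(values):
    """Return (MAD, chi-square statistic, sample size) for the first-two-digits test."""
    digits = [d for d in (leading_digits(v, 2) for v in values) if d is not None]
    total = len(digits)
    if total == 0:
        raise ValueError("no usable readings")
    counts = Counter(digits)
    mad = 0.0
    chi_square = 0.0
    for pair in range(10, 100):
        expected = math.log10(1 + 1 / pair)   # Benford share for this digit pair
        observed = counts.get(pair, 0) / total
        mad += abs(observed - expected) / 90  # average over the 90 pairs
        chi_square += total * (observed - expected) ** 2 / expected
    return mad, chi_square, total
```

A p-value would then come from comparing the statistic with a chi-square distribution with 89 degrees of freedom, using whatever statistics package is at hand.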
</span></div><div style="color: black; font-weight: 400;"><br /></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv5xVKg8FOmlnD7-r00J2cN36WGp3z9cziF8zWjn6fW9IZCNPzeOXcojI_ifGtzYi6xwNKzDxNApECyIPr1gAUfAP55dgOuMLZNTS_nWOHjfXWUlgh9lBqUbsbgtxp2L2zIEXJW09BPaA/s976/dmaxraw2digits.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="823" data-original-width="976" height="540" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv5xVKg8FOmlnD7-r00J2cN36WGp3z9cziF8zWjn6fW9IZCNPzeOXcojI_ifGtzYi6xwNKzDxNApECyIPr1gAUfAP55dgOuMLZNTS_nWOHjfXWUlgh9lBqUbsbgtxp2L2zIEXJW09BPaA/w640-h540/dmaxraw2digits.jpg" width="640" /></a></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><br /></div><div style="color: black;"><span style="font-family: verdana;"><b style="font-weight: 400;">B</b><span style="font-size: medium;"><b style="font-weight: 400;">elow: </b>Min Raw and Adjusted Minv2 Temps for Deniliquin.</span></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">These are extreme biases in a data set of 39 000 points, and they suggest tampering. 
The high frequency "spikes" are temps that are repeated a lot and are also evident in the histograms.</span></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGw_hhoDhl7PKfH1Mh7DGTJs9ysdElm7pGB5f-m22qzSCrRaxBWFLL0QnZn-VsR6ymaSBJ4ugXlQ_m4WoMePzMXAUUE2uushVmt6iurU_xsbIwaAtI6eItaQBeEkOSrDFTTAGRJeUYq-g/s1000/dminraw2digit.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="821" data-original-width="1000" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGw_hhoDhl7PKfH1Mh7DGTJs9ysdElm7pGB5f-m22qzSCrRaxBWFLL0QnZn-VsR6ymaSBJ4ugXlQ_m4WoMePzMXAUUE2uushVmt6iurU_xsbIwaAtI6eItaQBeEkOSrDFTTAGRJeUYq-g/w640-h526/dminraw2digit.jpg" width="640" /></a></div><br style="color: black; font-family: "Times New Roman"; font-size: medium;" /><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidZFv9yDILb1IRuRjl8qHMXckJRPGtWyGefjA9_q0FwJKykyiGHjMdmjPmrKXPRCLmgdTmN5P9O0-ci0ES-q17M-Fo0oJRthA0LJkZeKY1th05Jx1ojRY2GNgzOllgvJK4m3L4DteXLJI/s1007/dminv2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="829" data-original-width="1007" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidZFv9yDILb1IRuRjl8qHMXckJRPGtWyGefjA9_q0FwJKykyiGHjMdmjPmrKXPRCLmgdTmN5P9O0-ci0ES-q17M-Fo0oJRthA0LJkZeKY1th05Jx1ojRY2GNgzOllgvJK4m3L4DteXLJI/w640-h526/dminv2.jpg" width="640" /></a></div><div class="separator" style="clear: both; color: black; 
font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><br /></div><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;">Specific Months</span></b></div><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><span style="color: black; font-family: verdana; font-size: medium; font-weight: 400;">Some months are much more tampered with than other months. Not all months are treated equally.</span><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;"><b style="font-weight: 400;">Below: </b>Deniliquin Max Raw for January, all the days of January 1910-2019 are combined for a total of about 3300 days. 
This graph is screaming out, <i>"audit me, audit me."</i></span></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div style="color: black;"><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfBdypvb7D6irYPjCGrZn8NWTRm59c4x5BriBz5VWjmUJF1KKtjvWi-sQGYKu-eEo251vuabcmSPWEkMcIA-cqlj59vZeopn_8qrBnmVY_a_lFEeiJ7SHfe5JQ86yHjKR8QfvsFspPQYQ/s956/maxrawJAN.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="851" data-original-width="956" height="570" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfBdypvb7D6irYPjCGrZn8NWTRm59c4x5BriBz5VWjmUJF1KKtjvWi-sQGYKu-eEo251vuabcmSPWEkMcIA-cqlj59vZeopn_8qrBnmVY_a_lFEeiJ7SHfe5JQ86yHjKR8QfvsFspPQYQ/w640-h570/maxrawJAN.jpg" width="640" /></a></div><div style="font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><span style="font-family: verdana; font-size: medium;"><b style="font-weight: 400;">Below </b>is <span style="font-weight: 400;">Deniliquin </span>Max Raw for July<b style="font-weight: 400;">,</b> all the days of July were combined from 1910-2018 to give about 3300 days. 
These are astounding graphs that show extreme tampering of RAW data.</span></div><div style="color: black;"><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh53WNREFmhE5needpCD5rJ3KdunVbQ4YfvUtOPjtZWnDwKaPR9chjz4MRlgJoaq_dqNby4HzS8edSKSXK5Ns64oW33CnBLIqYjqpQkuSFefXeYvxjxUq2FTToRafXCqfVdsU1fNEkC0g8/s1014/maxrawjul.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="858" data-original-width="1014" height="542" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh53WNREFmhE5needpCD5rJ3KdunVbQ4YfvUtOPjtZWnDwKaPR9chjz4MRlgJoaq_dqNby4HzS8edSKSXK5Ns64oW33CnBLIqYjqpQkuSFefXeYvxjxUq2FTToRafXCqfVdsU1fNEkC0g8/w640-h542/maxrawjul.jpg" width="640" /></a></div><div style="font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div></div><div style="color: black; font-size: medium;"><div class="separator" style="clear: both; font-family: "Times New Roman"; font-weight: 400; text-align: center;"><br /></div><span style="font-family: verdana;"><b style="font-weight: 400;">Below: </b>Deniliquin Min Raw for July.</span></div><div style="color: black; font-weight: 400;"><div class="separator" style="clear: both; font-family: "Times New Roman"; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicQqgPCsN_nLAGdtC6IZaHeGl5IMSQFKPyJ3ly4mzDhVXI1QOcmm1ZXWQtao7gxYEGPaiM1zIeOlVs8l7EaICrlGpb-mTqJnBKoJhDtJ97fbd5g8lo-0d9B69YAIZ0yPdXulzJOJfkquI/s1023/minrawJUL.jpg" style="margin-left: 1em; margin-right: 1em;"><span style="font-size: medium;"><img border="0" data-original-height="865" data-original-width="1023" height="542" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicQqgPCsN_nLAGdtC6IZaHeGl5IMSQFKPyJ3ly4mzDhVXI1QOcmm1ZXWQtao7gxYEGPaiM1zIeOlVs8l7EaICrlGpb-mTqJnBKoJhDtJ97fbd5g8lo-0d9B69YAIZ0yPdXulzJOJfkquI/w640-h542/minrawJUL.jpg" width="640" /></span></a></div><span style="font-family: verdana; font-size: medium;"><i>This is max and min RAW data we have been looking at.</i> </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">You are unlikely to find worse, less conforming Benford's Law graphs anywhere on the internet. </span><span style="font-family: verdana; font-size: medium;">This is as bad as it gets.</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">This is a massive red flag for a forensic audit.</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><br /></div></span></div><div><span style="font-family: verdana; font-size: medium;">SOME RANDOM BENFORD'S LAW GRAPHS</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Mackay Min Raw For July</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhReHDF6jP-Eqq6b4zqpCIhYmhKz23bv8TmkxHQWdn7vjU3sDN8v0riZUAY-yuDEcvGcr5lVq_7RBov8gcAF5AcBBFCcd3bPgquGMXw4tTBt1pvfBvaI714rHA-CV27py18L2QJZMsYBJE/s932/macminrawJUL.jpg" style="font-family: "Times New Roman"; font-size: medium; margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="883" data-original-width="932" height="606" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhReHDF6jP-Eqq6b4zqpCIhYmhKz23bv8TmkxHQWdn7vjU3sDN8v0riZUAY-yuDEcvGcr5lVq_7RBov8gcAF5AcBBFCcd3bPgquGMXw4tTBt1pvfBvaI714rHA-CV27py18L2QJZMsYBJE/w640-h606/macminrawJUL.jpg" width="640" /></a></span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Amberley Max Raw, <i>All Data.</i> Systematic tampering.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2CsCoJ3OevvGEDU8jzbjo07sEolvj4qUawnhw4Hl_JxyoCheHYCaIJbGnNwVTPtsBOL4R4x1ET8bEY3jg1s5csUBGmG6555-CLV1bFPFiufMeL610TQf6SE2HtvueP1USjH_9-zXJK4I/s1020/amberleymaxraw2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="868" data-original-width="1020" height="544" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2CsCoJ3OevvGEDU8jzbjo07sEolvj4qUawnhw4Hl_JxyoCheHYCaIJbGnNwVTPtsBOL4R4x1ET8bEY3jg1s5csUBGmG6555-CLV1bFPFiufMeL610TQf6SE2HtvueP1USjH_9-zXJK4I/w640-h544/amberleymaxraw2.jpg" width="640" /></a></div><div><br /></div><div><br 
/></div><div><span style="font-family: verdana; font-size: medium;">Below: Amberley January Min Raw. All the days of January. </span></div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyEpqvR_517_elU5u0zFCJTbZppcTvWNA_fek0EW299sc2seeS_sfGjTh8H256qcip-XRmUNWx1SoO5aecjH9vOoMFyORKlYTXdZqGsLibwFENvriU1H-nYhfe4MYlghEE7AZdq3n_I-0/s994/amberleyminrawJAN.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="851" data-original-width="994" height="548" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyEpqvR_517_elU5u0zFCJTbZppcTvWNA_fek0EW299sc2seeS_sfGjTh8H256qcip-XRmUNWx1SoO5aecjH9vOoMFyORKlYTXdZqGsLibwFENvriU1H-nYhfe4MYlghEE7AZdq3n_I-0/w640-h548/amberleyminrawJAN.jpg" width="640" /></a></div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; text-align: left;">_________________________________________________________________________</span></div></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: inherit; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Amberley Month By Month</b></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: large;"> </span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Stratifying Amberley by month shows which months have the most tampering. 
The results are p-values.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Keeping the same significance level as the BOM, any result less than 0.05 indicates rejection of the null hypothesis of conformance to Benford's Law. In other words, the data should follow Benford's Law, it doesn't, and so tampering is likely.</span></div><div><br /></div><div><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana;">Amberley Minraw 1 digit test, </span><span style="color: #2e2e2e; font-family: verdana;">p-values</span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><div>All 2.267687e-147</div><div>jan 1.106444e-17</div><div>feb 1.884201e-17</div><div>mar 7.136804e-11</div><div>apr 1.171959e-06</div><div>may 5.280244e-21</div><div>jun 5.561890e-28</div><div>jul 3.042741e-24</div><div>aug 1.439602e-32</div><div>sep 3.522860e-19</div><div>oct 9.930470e-25</div><div>nov 2.039136e-14</div><div>dec 4.546736e-23</div></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e; font-weight: 400;">This shows all the months aggregated for Minraw as well as the individual months. August and June are the worst offenders, followed by October. </span><span style="color: #2e2e2e;">April is the 'best' month. As with Bourke, August gets major cooling. 
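<br /><br />A month-by-month table like the one above can be sketched in Python as below. This is a hedged illustration under stated assumptions, not the routine that produced the listed p-values: the function names and the (month, temperature) input layout are hypothetical, and the test used is a plain chi-square goodness-of-fit on the first digit, whose 8 degrees of freedom allow the tail probability to be written in closed form.

```python
import math
from collections import Counter, defaultdict


def first_digit(value: float):
    """First significant digit of value, or None for zero or missing readings."""
    if value == 0 or not math.isfinite(value):
        return None
    return int(f"{abs(value):.6e}"[0])


def chi_square_sf_df8(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 8 degrees of freedom.

    For even degrees of freedom the tail has a closed form; with df = 8 it is
    exp(-x/2) * sum((x/2)**k / k!) for k = 0..3.
    """
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(4))


def first_digit_p_value(values) -> float:
    """P-value of the chi-square test against Benford's first-digit law."""
    digits = [d for d in map(first_digit, values) if d is not None]
    total = len(digits)
    if total == 0:
        raise ValueError("no usable readings")
    counts = Counter(digits)
    statistic = 0.0
    for digit in range(1, 10):
        expected = total * math.log10(1 + 1 / digit)
        statistic += (counts.get(digit, 0) - expected) ** 2 / expected
    return chi_square_sf_df8(statistic)


def p_values_by_month(readings) -> dict:
    """Map 'All' and each month number to its p-value.

    readings is a list of (month, temperature) pairs.
    """
    by_month = defaultdict(list)
    for month, temperature in readings:
        by_month[month].append(temperature)
    result = {"All": first_digit_p_value([t for _, t in readings])}
    for month in sorted(by_month):
        result[month] = first_digit_p_value(by_month[month])
    return result
```

Calling p_values_by_month on a station's daily series gives one p-value for all months pooled and one per calendar month, the same layout as the list above.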
</span></span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><div>Amberley Minv2 Adj 1 digit</div><div>All 7.701986e-192</div><div>jan 5.367620e-47</div><div>feb 1.269502e-25</div><div>mar 3.116875e-30</div><div>apr 8.924123e-24</div><div>may 9.250971e-26</div><div>jun 2.388032e-20</div><div>jul 2.889563e-38</div><div>aug 2.039597e-22</div><div>sep 1.678454e-19</div><div>oct 4.009116e-26</div><div>nov 6.251654e-15</div><div>dec 1.563074e-28</div></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">This compares the minv2 adjusted data and shows that adjustments overall (All at the top of the list) <i>are worse than raw,</i> which are pretty bad by themselves. </span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">January and July are the most heavily manipulated months.</span></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e; font-weight: 400;"> </span><span style="color: #2e2e2e;">Amberley Maxraw 1 digit</span></span></div><div><div><span style="font-family: verdana; font-size: medium;">All 6.528697e-217</span></div><div><span style="font-family: verdana; font-size: medium;">jan 4.243928e-74</span></div><div><span style="font-family: verdana; font-size: medium;">feb 3.451515e-48</span></div><div><span style="font-family: verdana; font-size: medium;">mar 1.279319e-52</span></div><div><span style="font-family: verdana; font-size: medium;">apr 1.141334e-69</span></div><div><span style="font-family: verdana; font-size: medium;">may 4.425933e-58</span></div><div><span style="font-family: verdana; font-size: medium;">jun 
1.069427e-58</span></div><div><span style="font-family: verdana; font-size: medium;">jul 3.903140e-49</span></div><div><span style="font-family: verdana; font-size: medium;">aug 9.602354e-70</span></div><div><span style="font-family: verdana; font-size: medium;">sep 2.312850e-53</span></div><div><span style="font-family: verdana; font-size: medium;">oct 3.374468e-63</span></div><div><span style="font-family: verdana; font-size: medium;">nov 5.669760e-48</span></div><div><span style="font-family: verdana; font-size: medium;">dec 5.804254e-100</span></div></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Overall, Maxraw data is worse than Minraw data. </span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Amberley maxv2 adj digit</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><div>All 2.701983e-234</div><div>jan 2.309923e-83</div><div>feb 2.012154e-103</div><div>mar 1.492867e-56</div><div>apr 8.215013e-52</div><div>may 2.721058e-35</div><div>jun 9.487054e-40</div><div>jul 2.774663e-59</div><div>aug 7.915751e-47</div><div>sep 2.796343e-69</div><div>oct 1.096688e-39</div><div>nov 6.902012e-48</div><div>dec 1.814576e-68</div></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Once again, <i>adjustments are worse than Raw</i>. 
February takes over from January with extreme values.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">These results from the Benford's Law first digit test show that <i>adjusted data is less compliant with the Benford distribution than Raw.</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;">_________________________________________________________________________</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><br /></div><div><span style="color: #2e2e2e;"><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;">Tracking Benford's Law For First Digit Value Over Years </span></b></div><div style="color: black; font-weight: 400;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">Amberley Minv2 Adj Data 1942-2017</span></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;">The University of Edinburgh has created a smooth Bayesian model that tracks first digit probabilities against Benford's Law over time, so that you can see exactly when each first digit probability increases or decreases. 
In effect, this allows you to see how the first digit of the temperature anomalies<b style="font-weight: 400;"> </b>changes over time. </span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;">Running this model on<b style="font-weight: 400;"> </b>the full set of Minv2 temperature anomalies took 15 minutes on a laptop and produced the graph below:</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline the digits should follow to conform to Benford's Law. 1980 would be just before 40 on the X axis (1980 - 1942 = 38).</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">This graph shows that the <i>first digit with value 1</i> has always been underused. Too few ones are used in Minv2 temp anomalies.</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">It became slightly more compliant in the 1950's then worsened again. 
There are far too few 1's in the first digit position of the Minv2 temperature anomalies.</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;">There are too many 2's and 3's right from the beginning at 1942, but use of 2's lessens (thus improving) slightly from the 1980's onwards. <i>But use of 4's increases from the 1980's.</i></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">The values of 8's and 9's in first digit position have always been underused. The 9 value is underused from the 1990's onwards.</span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><i><span style="font-family: verdana; font-size: medium;">These digit values indicate less conformance after the 1980 adjustments.</span></i></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1jNnbyhMryag70gCcH7l1pGSBvULKn9e3kUQfRi3RXPVkY4F6peM_VjOesRbSurswe5sgsnsmwiv0E2iVToFueUmrpbHI1oZdCpODc5wI91vsaUzRYdLBePtdB-HaJbkRh-TRH39f-kg/s1920/amberminv2jags.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1jNnbyhMryag70gCcH7l1pGSBvULKn9e3kUQfRi3RXPVkY4F6peM_VjOesRbSurswe5sgsnsmwiv0E2iVToFueUmrpbHI1oZdCpODc5wI91vsaUzRYdLBePtdB-HaJbkRh-TRH39f-kg/w640-h348/amberminv2jags.jpg" width="640" /></a></div><br style="color: black; font-family: "Times New Roman"; font-size: medium;" /><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><br /></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div style="color: black;"><span style="font-family: verdana; font-size: medium;">Below: Amberley 2 digit test for all data indicates large scale tampering.</span></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><div class="separator" style="clear: both; color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8zkiKLQFJdtDtuqEj6Nxy2hCbvaQI5Z0Lc2WjcGUj2FNgdaqh28WSZcpZy72QNsTrGjm9mqreAr_4mwzV6lmWnqVW9INmE6QL97v5J11b-8Z3Z4SZaiafppDbQtvdsyNR8xCLfoEtGgU/s1444/amber1980downboth.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="892" data-original-width="1444" height="396" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8zkiKLQFJdtDtuqEj6Nxy2hCbvaQI5Z0Lc2WjcGUj2FNgdaqh28WSZcpZy72QNsTrGjm9mqreAr_4mwzV6lmWnqVW9INmE6QL97v5J11b-8Z3Z4SZaiafppDbQtvdsyNR8xCLfoEtGgU/w640-h396/amber1980downboth.jpg" width="640" /></a></div><div style="color: black; font-family: "Times New Roman"; font-size: medium; font-weight: 400;"><br /></div><span style="font-size: medium; font-weight: 400;"><b style="color: black;"><div style="font-family: inherit;"><span 
style="color: #2e2e2e; font-weight: 400;"><span style="font-family: inherit; font-size: medium;"><b style="color: black;"><br /></b></span></span></div><span style="font-family: verdana;">Above:</span></b><span style="font-family: verdana;"><span style="color: black;"> Amberley tested with the first 2 digit test. On the left is the raw data. The Min Raw is already noncompliant with Benford's Law, giving a tiny p-value, so we reject the null hypothesis of conformity.</span><i style="color: black;"> This is not natural data.</i><span style="color: black;"> It has already been heavily manipulated: there is a big shortfall of values 10-15, too many values around 24-38 and again in the 40's, and then methodical spikes in the 50-90 range with big gaps signifying shortfalls.</span></span></span><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: black; font-weight: 400;"><span style="font-family: verdana; font-size: medium;">But look what happens AFTER the adjustments are made, on the right -- the Minv2 data is far less compliant: digit values from 22-47 are greatly increased in frequency, with a tapering off around 87-97. </span><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><div style="font-family: "Times New Roman";"><span style="font-size: medium;"><span style="color: #2e2e2e; font-family: verdana;">Below:</span><b style="color: #2e2e2e; font-family: verdana;"> </b><span style="color: #2e2e2e; font-family: verdana;">Bourke Max Raw and Maxv2.1. 
</span></span></div></div></div></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Adjustments make the data 'worse', if that is possible.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Extreme adjustments in both Raw and Adj data, with consistent underuse of lower digits and overuse of higher digits.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSyf-Y_TaiSp0Xauq6GzBIM2X_j4Y4EyUVBY4G-XLctCXLpuEL5oF_M9eraaCNuw3aOPqqwM-vprfWn5uOPpMXbfGef74UTcNTM7crGFdQS0-3VUZXlGWYXV16zUxB6gEJGrZOFi0aY_Y/s1427/bourke111.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="884" data-original-width="1427" height="396" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSyf-Y_TaiSp0Xauq6GzBIM2X_j4Y4EyUVBY4G-XLctCXLpuEL5oF_M9eraaCNuw3aOPqqwM-vprfWn5uOPpMXbfGef74UTcNTM7crGFdQS0-3VUZXlGWYXV16zUxB6gEJGrZOFi0aY_Y/w640-h396/bourke111.jpg" width="640" /></a></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><b>Below: </b>Bourke Min Raw Temp Anomalies vs. 
Minv2.1.</span></div><div><span style="font-family: verdana; font-size: medium;">NonConformance.</span></div><div><br /></div><div><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvN0anWYslUwB2M0NUOwUZuzcIIBfSAYVujQDMrPKUkiwGG6pwIRM8UAFlJxkSir1fcZLOUqTbO48_Wrv326D-JVdC3kV54shnW0WInc_NvHRmHU2gBCC_GCMMYBaPDLgvf_9MoLl-vyg/s1528/bourkemin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="887" data-original-width="1528" height="372" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvN0anWYslUwB2M0NUOwUZuzcIIBfSAYVujQDMrPKUkiwGG6pwIRM8UAFlJxkSir1fcZLOUqTbO48_Wrv326D-JVdC3kV54shnW0WInc_NvHRmHU2gBCC_GCMMYBaPDLgvf_9MoLl-vyg/w640-h372/bourkemin.jpg" width="640" /></a></div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Below: </b>Mackay January, first digit Benford's test, </span><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Raw Data</span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">This shows how much tampering has gone into January and July. 
</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFhirgj_1zrLTbLGdU6p5Ml-1vxsQtbYOdAEywRtIGzQZJoKv1jlloj8gKNH7Vb7zqcaDo0s9jN4I-WiRm99iV4PW9l5XR9FbzgXDFwmlUoPKd64dAFG_xm08MQgeBrvpyEkoe3_BZyZ0/s1566/mackayxxx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="877" data-original-width="1566" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFhirgj_1zrLTbLGdU6p5Ml-1vxsQtbYOdAEywRtIGzQZJoKv1jlloj8gKNH7Vb7zqcaDo0s9jN4I-WiRm99iV4PW9l5XR9FbzgXDFwmlUoPKd64dAFG_xm08MQgeBrvpyEkoe3_BZyZ0/w640-h358/mackayxxx.jpg" width="640" /></a></div></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">FINALLY BELOW:</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">This is a wall hanger--the beauty of 'naturally' occurring observational data in an outstanding pattern that is shouting, "audit me, audit me!"</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Almost as if the BOM are getting into fractals.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: 
verdana; font-size: medium;">Below:<b> </b>Sydney data indicates engineered use of specific numbers; certain numbers are repeated consistently.</span></div><div><br /></div><div><br /></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdzjl4ZepGyUhuAhG8eZXtXdtO11dGJPHBl9RFJsQDNAjLrkLp7dFoStggkR7ZLaqMsf6zCuvOB7ilXTR-u0OOJ12Hywz7sQid2IDXiEbY3HQLmu53l_TVvVhMQZlt6wWqCI9rjP7cTAM/s868/sydney.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="718" data-original-width="868" height="530" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdzjl4ZepGyUhuAhG8eZXtXdtO11dGJPHBl9RFJsQDNAjLrkLp7dFoStggkR7ZLaqMsf6zCuvOB7ilXTR-u0OOJ12Hywz7sQid2IDXiEbY3HQLmu53l_TVvVhMQZlt6wWqCI9rjP7cTAM/w640-h530/sydney.jpg" width="640" /></a></div><br /><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">The BOM has obviously never heard of Benford's Law.<i> This shows engineered specific numbers at specific distances that are overused and underused.</i> These are man-made fingerprints showing patterns in RAW data that do not occur in natural observations.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><br /></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;">_________________________________________________________________________</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: 
#2e2e2e; font-family: verdana; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">BENFORD'S LAW ANALYSIS ON GLOBAL TEMPERATURE ANOMALIES</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">-- THE END GAME IN CLIMATE CHARTS</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">The Global Temperature graphs shown by various climate agencies generally show no significance levels, error bounds or uncertainty. They are the result of averaging daily temperature anomalies into months, which are averaged into years, which are then averaged across 112 stations in Australia, or many more worldwide. Here is an example of NASA GISS data.</span><span style="color: #2e2e2e; font-family: verdana; font-size: large;"> </span><a href="https://climate.nasa.gov/vital-signs/global-temperature/" style="font-family: verdana; font-size: large;">(link</a><span style="color: #2e2e2e; font-family: verdana; font-size: large;">).</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><i>These are the primary graphs that are used by BOM in media releases.</i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><i><br /></i></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">And these are the graphs they use to argue that 15C adjustments (and more at some stations) don't matter.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; 
font-family: verdana; font-size: medium; font-weight: 400;">How reliable are they?</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">This is what BOM Global anomalies look like when analysed with Benford's Law:</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><span style="font-size: 14.0877px;"><br /></span></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><span style="font-size: 14.0877px;"><br /></span></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI8RYYgUyJF5gZIMkZ7W0VVYPmii0HA25SFvop8wODvY1t0AnyqG8iRZ2zMfRJ6F1Ly-fZau5Dr4aepM1XUEdrZlgeGuCTwTCSaFM78MC20N-0NEdwLbfdksHKfd96U7GpG_4SXm8LJyQ/s1003/bomglobal.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="800" data-original-width="1003" height="510" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI8RYYgUyJF5gZIMkZ7W0VVYPmii0HA25SFvop8wODvY1t0AnyqG8iRZ2zMfRJ6F1Ly-fZau5Dr4aepM1XUEdrZlgeGuCTwTCSaFM78MC20N-0NEdwLbfdksHKfd96U7GpG_4SXm8LJyQ/w640-h510/bomglobal.jpg" width="640" /></a></div><br /><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">An underuse of 1's and a large overuse of 2's, 3's, 4's and 5's.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Nonconformance with a p-value less than 2.16e-16.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">This means the data has likely been heavily tampered with.</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: large;"><br /></span></div><div><span 
style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">BOM talk about their data being robust because it matches other agencies:</span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;"><br /></span></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><b>Below:</b></span><span style="color: #2e2e2e; font-weight: 400;"> NASA GISS Global anomalies.</span></span></div><div><span style="color: #2e2e2e; font-family: verdana; font-size: medium; font-weight: 400;">Overuse of 3,4,5,6,7,+8's, terrible graph, nonconforming to Benford's Law.</span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfd4L7vn4SsocoMwhZiwKzbOl7RzJtgMidR4DXHnPcTMk7TwCyVBIthLcHYTWY91Y7mZfmBkp81tu-KdfDjCviER-PVPzwaJAC_EkuxOzEmX5NGHwwBAXBRxX2y2NPCeGO8ssp6-BOULc/s1006/gissglobal.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="843" data-original-width="1006" height="536" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfd4L7vn4SsocoMwhZiwKzbOl7RzJtgMidR4DXHnPcTMk7TwCyVBIthLcHYTWY91Y7mZfmBkp81tu-KdfDjCviER-PVPzwaJAC_EkuxOzEmX5NGHwwBAXBRxX2y2NPCeGO8ssp6-BOULc/w640-h536/gissglobal.jpg" width="640" /></a></div><div><br /></div><div><br /></div><span style="font-family: verdana; font-size: medium;">US agency NOAA Global anomalies are weakly conforming.</span></div><div><span style="font-family: verdana; font-size: medium;">Still an overuse of 6,7,8's.</span></div><div><br /><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; 
font-weight: 400;"><br /></span></div><div><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px; font-weight: 400;"><br /></span></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjg5mEEDCQEqGNqSE_Wup0PXxNFucRcQlCE3g0l8n1w4q3wN4KCgClMbMilfrV475cuvbquXzUo0hXW7ghE8w4lTVXrabw2121yK_hKlVBq2v-RhjWwhnBkulPY9ONqqcUT74J6yLMI43k/s979/noaa1stdigit.jpg" style="margin-left: 1em; margin-right: 1em; outline-width: 0px; user-select: auto;"><img border="0" data-original-height="835" data-original-width="979" height="546" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjg5mEEDCQEqGNqSE_Wup0PXxNFucRcQlCE3g0l8n1w4q3wN4KCgClMbMilfrV475cuvbquXzUo0hXW7ghE8w4lTVXrabw2121yK_hKlVBq2v-RhjWwhnBkulPY9ONqqcUT74J6yLMI43k/w640-h546/noaa1stdigit.jpg" width="640" /></a></div></div><div><br /></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;">Where we really begin to go into La-La land is looking at <i>land/ocean</i> global anomalies, it's apparent that it's just modeled data. 
This is not real data.</span></div><div><br /></div><div><span style="color: #2e2e2e;"><div class="separator" style="clear: both; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGTJbIOZchUJfR6e8-PC9Bl8cHLOkfJhLvL7VuoqpxZ11cjuifIp81ypxBorXjweVfktLtIt6AHHapMrKr1rH2OdtKxl0iwmj49KXFuFPKHWm9kT2yOMf0unMh8zlKWXuIdAdvu5p3RBc/s1022/landocean.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="836" data-original-width="1022" height="524" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGTJbIOZchUJfR6e8-PC9Bl8cHLOkfJhLvL7VuoqpxZ11cjuifIp81ypxBorXjweVfktLtIt6AHHapMrKr1rH2OdtKxl0iwmj49KXFuFPKHWm9kT2yOMf0unMh8zlKWXuIdAdvu5p3RBc/w640-h524/landocean.jpg" width="640" /></a></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><br /></span></div><div style="font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif;"><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><br /></span></div><span style="font-size: medium;"><span style="font-family: verdana;"><b>Below: </b>NOAA global land/ocean with 2 digit test for Benford's Law.</span><br /></span><span style="background-color: white; font-family: Georgia, Baskerville, Palatino, "Palatino Linotype", "Book Antiqua", "Times New Roman", serif; font-size: 14.0877px;"><br /></span></span></div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh73sJIHX3bG_ry4wnDJhXJBI76vBqbJLKG7dxx487YWkQpJhZQ8VC9YEa55kiQL5ZfIFxqRzmXayLuXaEzgqpJFRtbQbuewF4JILhPhw1_HBZlMgSjghwg-1qI370JvwhdZfNQjqEdTWE/s1022/noaa.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="802" data-original-width="1022" height="502" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh73sJIHX3bG_ry4wnDJhXJBI76vBqbJLKG7dxx487YWkQpJhZQ8VC9YEa55kiQL5ZfIFxqRzmXayLuXaEzgqpJFRtbQbuewF4JILhPhw1_HBZlMgSjghwg-1qI370JvwhdZfNQjqEdTWE/w640-h502/noaa.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">RESULTS</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">None of the global temperature anomalies can be taken seriously. This is obviously (badly) modeled data to be used for entertainment purposes only. 
The Global Temperature Anomalies fail conformance too.</span></div><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both;"><div class="separator" style="clear: both;"><br /></div></div></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">__________________________________________________________</span></div><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">EXTREME ROUNDING OF TEMPERATURES</span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b><b><span style="font-family: verdana; font-size: medium;">Strategically Rounding/Truncating Temperatures To Create Extreme Biases.</span></b></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Correct rounding can shift the mean by 0.1-0.2C; <i>incorrect rounding, such as truncating, can shift the mean by 0.5C.</i></span></div><div class="separator" style="clear: both; text-align: left;"><span 
style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>This is NOT related to Australia going metric in 1972.</i> Some stations have blocks of years rounded with almost no decimal values in the later years. For example, Deniliquin from <i>1998-2002 has only 30 days with any decimal values; everything else has been rounded for 4 years.</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Some stations have 25-year spans in which the share of rounded values rises from 10% to 70%. Looking at the graphs you can see which years get the most treatment.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Below: Deniliquin Maxv2, 1998-2002 all rounded! </span></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">A graphical view of rounding of Max Raw temperatures by year. Notice the high density of black dots in the 1998-2002 area, which creates a bias. 
Obviously special attention is given in those years.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7NpJzePF-bcVNCIDE6jQlrhenugaRGJa7GQL4WpEbsTl4rFla_m8UioF7wlvndhMBFlYBuCEQ5wsBnmHur-mG1f4OSpzgP3E_MjoVWib8xMwi4DNP6GO-kJOSCZc0wVmiBDRBPp-ngbw/s1049/deniround.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="589" data-original-width="1049" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7NpJzePF-bcVNCIDE6jQlrhenugaRGJa7GQL4WpEbsTl4rFla_m8UioF7wlvndhMBFlYBuCEQ5wsBnmHur-mG1f4OSpzgP3E_MjoVWib8xMwi4DNP6GO-kJOSCZc0wVmiBDRBPp-ngbw/w640-h360/deniround.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">This comes on the heels of the review panel advising BOM their thermometers/readings needed to meet world standards and increase tolerance from 0.5C to 0.1C. Rounding with no decimal digits in specific blocks of years ensures they won't be meeting world standards any time soon.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">"However, throughout the last 100 years, Bureau of Meteorology guidance has allowed for a </span></i><i><span style="font-family: verdana; font-size: medium;">tolerance of ±0.5 °C for field checks of either in-glass or resistance thermometers. 
</span></i><i><span style="font-family: verdana; font-size: medium;">This is the primary reason the Panel did not rate the observing practices </span></i><i><span style="font-family: verdana; font-size: medium;">amongst international best practices." - BOM</span></i></div></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">A more visual way to see the rounding is to view the graphs with actual black data dots which shows all the rounded temperatures. </span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Below: Deniliquin. - there are strategic patterns to rounding in RAW data. 
</span><b style="font-family: verdana; font-size: large;">The black dots are rounded temperatures!</b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4KcS9G8nWTJ10gKc3pqD8S5PpOfWPn3vuYrTvMplhbZRJvuYZQnZeQUggj7jsLXaZY-BDSFXWTrD74u7y_xfPVjTR2LhRpxc1mNdyXCLhMi5OqCwjHefz6k7g495rp0pXGLqKQY_VaDA/s778/denilquinminrawrounding.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="778" height="558" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4KcS9G8nWTJ10gKc3pqD8S5PpOfWPn3vuYrTvMplhbZRJvuYZQnZeQUggj7jsLXaZY-BDSFXWTrD74u7y_xfPVjTR2LhRpxc1mNdyXCLhMi5OqCwjHefz6k7g495rp0pXGLqKQY_VaDA/w640-h558/denilquinminrawrounding.jpg" width="640" /></a></div><br /><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Below: Bourke Minraw--strategically rounded up or down.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKIsmXlJL_KXz3cFZwgw5gXdO8C_-kE7HnLlX-vIEnvQqLN2lrdiEPAaktL09lYtldILJES90pWpilM6H4Ac5_HgH8Ef2wWUUTEKkmEOTVoAhGLtlo_GWXxm9YPDOLCRuIR9xuyjX0teg/s778/boureminrounded.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="778" height="558" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKIsmXlJL_KXz3cFZwgw5gXdO8C_-kE7HnLlX-vIEnvQqLN2lrdiEPAaktL09lYtldILJES90pWpilM6H4Ac5_HgH8Ef2wWUUTEKkmEOTVoAhGLtlo_GWXxm9YPDOLCRuIR9xuyjX0teg/w640-h558/boureminrounded.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Below: Bourke Maxraw. The years of strategic rounding/truncating are clearly visible in a 20 year block.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia8Rn2u16CMVJ0E6gg6LiSi0ccqv5mm47re_8B0D7KTSn1c-NLfdrpQx6LNe9E1ZaIEc-cK6gZWSFca2GnOeYl3lXTbrJPZiJBXvW4DEYBsjsnWB1ApIQH2mFx6u9m0H4xwFGoFz-AEdQ/s889/bourkemaxrounding.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="686" data-original-width="889" height="494" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia8Rn2u16CMVJ0E6gg6LiSi0ccqv5mm47re_8B0D7KTSn1c-NLfdrpQx6LNe9E1ZaIEc-cK6gZWSFca2GnOeYl3lXTbrJPZiJBXvW4DEYBsjsnWB1ApIQH2mFx6u9m0H4xwFGoFz-AEdQ/w640-h494/bourkemaxrounding.jpg" width="640" /></a></div><br /><span style="font-family: verdana; font-size: medium;">Below: Rutherglen Maxraw rounding/truncating.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Black dots are rounded temperatures! 
</span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDRUDM-e9eQ8r7MQoGbl-PHLn0BJsKK_ugdwAOI6_b7p4clAn_PkXQFyN0wUnYKkR6i5EjHzjGO4sXgesAOwloIyxjPPBUYxbob9jsFufEC2PpaWovqcgVtAj_RPk9UMzw9wauoIGylFM/s783/rutherroundingmaxraw.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="783" height="554" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDRUDM-e9eQ8r7MQoGbl-PHLn0BJsKK_ugdwAOI6_b7p4clAn_PkXQFyN0wUnYKkR6i5EjHzjGO4sXgesAOwloIyxjPPBUYxbob9jsFufEC2PpaWovqcgVtAj_RPk9UMzw9wauoIGylFM/w640-h554/rutherroundingmaxraw.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><span style="font-family: verdana; font-size: medium;">Below: Mackay with strategic rounding visible.</span><br /><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMvnAil5boqLKvPE8ij4PFKfQhGZJuID-oXtUJKV2x36l0IDlGcsVaIRJiR4oXfks6xh7_L_1mLKLhMguV5gUSwvDoLDV15W_YWwY4iYfEyDXihl9I6rkOjFBfuGWZtDU6bSQKXfnd9yM/s778/mackayminrawround.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="778" height="558" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMvnAil5boqLKvPE8ij4PFKfQhGZJuID-oXtUJKV2x36l0IDlGcsVaIRJiR4oXfks6xh7_L_1mLKLhMguV5gUSwvDoLDV15W_YWwY4iYfEyDXihl9I6rkOjFBfuGWZtDU6bSQKXfnd9yM/w640-h558/mackayminrawround.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: 
left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div><span style="font-family: verdana; font-size: medium;">Below: Sydney Maxv2 Adj has rounding bias concentrated on the <i>lower part (lower temps) of the graph.</i></span></div><div><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjorJzqENVM0R3UDAs-I0m9iP7ClyXHqujaoufby9BOifRGzg7znNO6suX1NOTeAO3i6Fy6u25tVWCGtfJWZEzs4TV-oHCXqyKDUWQL5Ovu84-sb1ZUxbOwHnTaibzIKASDUcRSKK11dKM/s777/sydneyroundingmax.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjorJzqENVM0R3UDAs-I0m9iP7ClyXHqujaoufby9BOifRGzg7znNO6suX1NOTeAO3i6Fy6u25tVWCGtfJWZEzs4TV-oHCXqyKDUWQL5Ovu84-sb1ZUxbOwHnTaibzIKASDUcRSKK11dKM/w640-h560/sydneyroundingmax.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><br /></div><span style="font-family: verdana; font-size: medium;">The BOM is using rounding/truncating, particularly in the years 1975-2002. The rounding becomes denser in the 1990s. This is done on Raw data; it is strategic and varies from station to station and from year to year. Rounding or truncating can add around 0.5C to the mean. 
It is likely used to add warming or cooling to the temperature series.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">It shows the RAW data has been fiddled with and lacks integrity.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><b>__________________________________________________________________________________</b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">EXTREME OUTLIERS ADDED IN</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Selective Infilling/Imputing Of Missing Data To Create Extreme Outliers And Biases.</span></b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div 
class="separator" style="clear: both; text-align: left;"><span style="background-color: #fefefe; font-family: verdana; font-size: medium;">Temperatures which are not collected because there is no instrument available, or because an instrument has failed, cannot be recovered after the fact. They are forever unavailable. Similarly, data which is inaccurate because of changes in the site metadata or instrument drift is forever inaccurate, and its accuracy cannot be improved with homogenisation adjustment algorithms.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: #fefefe;"><span style="font-family: verdana; font-size: medium;"><br /></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: #fefefe;"><span style="font-family: verdana; font-size: medium;">What we are concerned with here are <i>computer generated temperatures,</i></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="background-color: #fefefe;"><span style="font-family: verdana; font-size: medium;"><i>inserted into Adjusted data where there is no Raw, a process called imputation, infilling or interpolation.</i></span></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: large;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">This process cannot be as accurate as actual temperature readings, and it becomes worse when <b>only specific missing values are selected for infilling</b>, leaving thousands blank. 
Selecting <i>only some values</i> to infill creates a bias!</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">So computer generated temperatures can dominate selective parts of the temperature series with specific warming and cooling segments.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">In fact, this is exactly what is happening -- BOM is creating computer generated outliers in Adjusted data where Raw is missing. BUT only <i>some</i> of these missing variables are infilled, creating very biased data. </span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"> </span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiedpVU5xdTdc_w9uFwVqWpBo1FiUvhNE4RK0QUqhRB7F9CHYyWxPVO0R2iTiz7IO10ih49nMa7-BOp195jtNHQEJcs6wAAZd4K02iI8KRrbK2jk7t4xQuUZWFdgJP92wNwF5Du7CU9SMI/s511/xxxbourkemissing.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="224" data-original-width="511" height="280" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiedpVU5xdTdc_w9uFwVqWpBo1FiUvhNE4RK0QUqhRB7F9CHYyWxPVO0R2iTiz7IO10ih49nMa7-BOp195jtNHQEJcs6wAAZd4K02iI8KRrbK2jk7t4xQuUZWFdgJP92wNwF5Du7CU9SMI/w640-h280/xxxbourkemissing.jpg" width="640" /></a></div><br /><span 
style="font-family: verdana; font-size: medium;">JMP's missing pattern report flags a column with a "1" when its data is missing. The picture above is that report for Minv2.1 and Minraw. <i>What we are interested in is all the missing temps in Raw that have infilled/imputed values in Minv2.1. </i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">So in the above there are 512 values with NO Raw but FULL Minv2.1 data. There are also 622 values where both Raw and Minv2.1 are missing.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">BOM infilled outliers into the data; see below.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZi6YYYcEQQkO3H5cMgJZuFGoP-NSMRujCfX5L_RfC4M0GcS0tmxpDmDnn8CpLFc10F3632hKnAuZN8uat_midnr9gGP5_fwCvecmPo152S-RraZfIfg83aqtijry1b2Znt3gFcaJHJjI/s777/xxbourkeoutliers.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZi6YYYcEQQkO3H5cMgJZuFGoP-NSMRujCfX5L_RfC4M0GcS0tmxpDmDnn8CpLFc10F3632hKnAuZN8uat_midnr9gGP5_fwCvecmPo152S-RraZfIfg83aqtijry1b2Znt3gFcaJHJjI/w640-h560/xxbourkeoutliers.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEispL8wFOO8elx4FbtU241e65DjR0n1rLHWECMgVwRdoQ8l-EpwpuSfpU9hLE701H7xHL2ZuhFqeJJIV2WsHG7GZoJf6TimZSVWVcOzAQhFefzamkFeWBjsqIUrvV0_sBxAIZePlQT8nHI/s609/bourkeoutlier2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="609" data-original-width="509" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEispL8wFOO8elx4FbtU241e65DjR0n1rLHWECMgVwRdoQ8l-EpwpuSfpU9hLE701H7xHL2ZuhFqeJJIV2WsHG7GZoJf6TimZSVWVcOzAQhFefzamkFeWBjsqIUrvV0_sBxAIZePlQT8nHI/w334-h400/bourkeoutlier2.jpg" width="334" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Melbourne, below epitomises the selective infilling of missing data and bias creation.</b></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKUj4_3YXjr89RP3sEzqge-o904G6ciEqxBwHjVvgA0ixyAA4eoPYIh5oDOM_RRGYn6pVvJaiSiGdfQI4DRd3WKPnoebLt2yHNR8cC-qH0DRgNZxaJGJk2SFxdOvzKpE0RyOz2-V-zdnM/s499/melzxxxxx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="266" data-original-width="499" height="342" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKUj4_3YXjr89RP3sEzqge-o904G6ciEqxBwHjVvgA0ixyAA4eoPYIh5oDOM_RRGYn6pVvJaiSiGdfQI4DRd3WKPnoebLt2yHNR8cC-qH0DRgNZxaJGJk2SFxdOvzKpE0RyOz2-V-zdnM/w640-h342/melzxxxxx.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><span style="font-family: verdana; font-size: medium;">48 values with no Raw are infilled into Minv2 BUT 47 values are still left 
missing! <b>Black dots are the actual infilled values.</b></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6kQNPJG6aBkscQHkaH4dZ0F71QKmHFfPg8uqU7eHz8E2RZCXt8lx8V3rK9QV1r0SLeMwki5DL-mgJely8vxuOf_tUe8WPOAozRh3NurRedG5mWrrsOMdD8i3J5l0LTF7b5Gh71xYrd8A/s777/melbournemissing.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6kQNPJG6aBkscQHkaH4dZ0F71QKmHFfPg8uqU7eHz8E2RZCXt8lx8V3rK9QV1r0SLeMwki5DL-mgJely8vxuOf_tUe8WPOAozRh3NurRedG5mWrrsOMdD8i3J5l0LTF7b5Gh71xYrd8A/w640-h560/melbournemissing.jpg" width="640" /></a></div><br /><span style="font-family: verdana; font-size: medium;">The infilled values are all extremely cool (the lowest values in the past) and help cool down the earlier part of the temperature series.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Just to be clear about what we are seeing: these are missing Raw temperatures that have been selectively infilled/imputed/interpolated with values that are computer generated to be on the lowest boundaries of cool, in a position which 'helpfully' increases the BOM trendline in the more recent years. 
<i>This is selectively biased data.</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Palmerville (below) has had a computer generated 'record warmest Minv2 ever' created. </span><b style="font-family: verdana; font-size: large;">Black dots are the actual infilled values.</b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdX_5hkZrJ9qJdNAGve7lxn-9mvnWtB5_5CpnYKkMo_7nXHqC1FOsZZAMfftykLi80D-vsZbswOfj0CR-GE5oX4IEt6ILAG-6VTrL6ZFfSXWDyi1vxZHneDso8iqQUO65wZmC63C6lIbE/s1226/palmervillemissingrecords.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="670" data-original-width="1226" height="350" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdX_5hkZrJ9qJdNAGve7lxn-9mvnWtB5_5CpnYKkMo_7nXHqC1FOsZZAMfftykLi80D-vsZbswOfj0CR-GE5oX4IEt6ILAG-6VTrL6ZFfSXWDyi1vxZHneDso8iqQUO65wZmC63C6lIbE/w640-h350/palmervillemissingrecords.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Richmond (below) gets the same treatment.</span></div><div class="separator" style="clear: both; text-align: left;"><b style="font-family: verdana; font-size: large;">Black dots are the actual infilled values.</b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzlfOSNibxGpq-XA5w9Fwcmo2IaAtCWWzLv9Jqama76z_O3n7p6DuG48OCDjYLKaGkM1ioodnkQiLosWxQlGToFWarYHCoULlByQHVELoCr5qtqC_zsCIIi5KUKsCQb-w-pHzMm6pjQy4/s784/richmondminoutliers.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="784" height="554" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzlfOSNibxGpq-XA5w9Fwcmo2IaAtCWWzLv9Jqama76z_O3n7p6DuG48OCDjYLKaGkM1ioodnkQiLosWxQlGToFWarYHCoULlByQHVELoCr5qtqC_zsCIIi5KUKsCQb-w-pHzMm6pjQy4/w640-h554/richmondminoutliers.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><span style="font-family: verdana; font-size: medium;">Let's look at Sydney and the infilled values in detail, below with <i>relation to creating a trend.</i></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBcid_yyfoyTCcd7YxCa-HyD37CDNxycHo3ivyX0CfNhOH64BjhtF6cemvedHeqee7LnC-uCzbHq3eRB3UTc-_pAGM3tbBBROQcwmTo-K6r34CG0GxDhEZH3_rCqwEJpDeraJljeccxd4/s777/sydminfilledin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="777" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBcid_yyfoyTCcd7YxCa-HyD37CDNxycHo3ivyX0CfNhOH64BjhtF6cemvedHeqee7LnC-uCzbHq3eRB3UTc-_pAGM3tbBBROQcwmTo-K6r34CG0GxDhEZH3_rCqwEJpDeraJljeccxd4/w640-h560/sydminfilledin.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i>These infilled values are tested for a trend. 
Yep, you guessed it--<b>the infilled values by themselves trend UPWARDS</b> (below).</i></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheRiNXNJiE6cCEkYTxlJQNhxaXyYb57RTG2r2t0ReVQmab_45p6jHd1xXLzfHhTFHvQsmBrznWc3NJiGS0rM8WH3PmU2sDb73fDe6NJVtg1mP1-nK8vWJwgesOxMUFMtOTvYyXyDewjvc/s776/sydminv2missingrawuptrend.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="776" height="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheRiNXNJiE6cCEkYTxlJQNhxaXyYb57RTG2r2t0ReVQmab_45p6jHd1xXLzfHhTFHvQsmBrznWc3NJiGS0rM8WH3PmU2sDb73fDe6NJVtg1mP1-nK8vWJwgesOxMUFMtOTvYyXyDewjvc/w640-h560/sydminv2missingrawuptrend.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /><span style="font-family: verdana; font-size: medium;">Let's look at EXTREME warm and cool infilling at Port Macquarie (below). Keep in mind, these are computer generated values, all 12768 values in Minv2.1! 
</span><b style="font-family: verdana; font-size: large;">Black dots are the actual infilled values.</b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit6BcD5HjONsjvv-rXAnbmR8kNipThlEMzvatR0eJUJCMaVPcafPrvf6rhtwBk_UnC33nl7aumRfZABm0hA_vKG2ltz2SDHzuZVpviAglIPWADjiUOlp2J7-q_aw7lLOJyEWe20O7R3i0/s784/xxxportminaddin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="784" height="554" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit6BcD5HjONsjvv-rXAnbmR8kNipThlEMzvatR0eJUJCMaVPcafPrvf6rhtwBk_UnC33nl7aumRfZABm0hA_vKG2ltz2SDHzuZVpviAglIPWADjiUOlp2J7-q_aw7lLOJyEWe20O7R3i0/w640-h554/xxxportminaddin.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><span style="font-family: verdana; font-size: medium;">Similar story to the Maxv2.1. 
They infilled 13224 values complete with extreme values, yet still left some blank.</span></div><div class="separator" style="clear: both; text-align: left;"><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXijsyqxWj-QkZ7f2zEXfwfgDqhw2MpSQbFujPF_ld-jqbDXKirZoSW1krwBz12hCjKRipU6AIoUOIzB8ivTrXpQqCAuQyb_PIbJ2Fow0IMBqV7x33n5G3Ga5qDof_87C28P6FiQMrrDY/s787/portmaxfilledin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="787" height="552" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXijsyqxWj-QkZ7f2zEXfwfgDqhw2MpSQbFujPF_ld-jqbDXKirZoSW1krwBz12hCjKRipU6AIoUOIzB8ivTrXpQqCAuQyb_PIbJ2Fow0IMBqV7x33n5G3Ga5qDof_87C28P6FiQMrrDY/w640-h552/portmaxfilledin.jpg" width="640" /></a></div><br /><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Lastly, let's look at Mt. Gambier. 
The missing patterns show 10941 values where Raw is missing, but values have been infilled into Minv2.1</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">There are still 77 values missing for both Raw and Minv2.1 and <i>17 values <b>ignored</b> in Minv2.1 that exist in Raw (another interesting concept!)</i></span></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrdWCbSx-PwB53YfFd8KJzBLfyKaVhgVBh-696ZKWNF4ZEenbL_f09Ypu_p3W1yMRGHGndMmOr0oXzOxDqw41mKQIyjpcwXy8eNFk43vxAcMX3P891_UKWsJ7OtFIaO_uvcP74x32Umos/s537/xxmtgambiermissingpatterns.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="224" data-original-width="537" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrdWCbSx-PwB53YfFd8KJzBLfyKaVhgVBh-696ZKWNF4ZEenbL_f09Ypu_p3W1yMRGHGndMmOr0oXzOxDqw41mKQIyjpcwXy8eNFk43vxAcMX3P891_UKWsJ7OtFIaO_uvcP74x32Umos/w640-h266/xxmtgambiermissingpatterns.jpg" width="640" /></a></div><div><br /></div><span style="font-family: verdana; font-size: medium;">The box plot below tells us that there are a lot of outliers in Minv2.1 and Raw.</span></div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdGjoNXbNGuX8vXC2FwGbLC1-oUQxtJfUjUkKSfUX4y9jdzz1D3eY_R21yVXpKwd-Zmintd_hqfFNVD8kr-MmXjO7nI2GKCqX1mRemjpaDQN4XX_Jg4RRdUP226UJTE40DjGvOyzScHPg/s784/xxmtgambieroutliers.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="633" data-original-width="784" height="516" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdGjoNXbNGuX8vXC2FwGbLC1-oUQxtJfUjUkKSfUX4y9jdzz1D3eY_R21yVXpKwd-Zmintd_hqfFNVD8kr-MmXjO7nI2GKCqX1mRemjpaDQN4XX_Jg4RRdUP226UJTE40DjGvOyzScHPg/w640-h516/xxmtgambieroutliers.jpg" width="640" /></a></div><div><br /></div><span style="font-family: verdana; font-size: medium;">Likewise, the scatterplot shows another view of the outliers with the 'bulge' in the plot and all the little dots scattered around by themselves. This tells us to prepare for outliers.</span></div><div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxXO0yGXz5AbC6hW8RHz0KdD-XdQ1WSBKpJe2cbkCDDKWEk4SsaO7E6bG27vH1ULVUU_HIS_OQ9bEdHOTc2to9D16NlmbfhwZTU1deNtJY_JdRpS701Hth3l9M2-MZxk-PxiO69kHXdhA/s863/zzmtgambierscatter.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="863" data-original-width="711" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxXO0yGXz5AbC6hW8RHz0KdD-XdQ1WSBKpJe2cbkCDDKWEk4SsaO7E6bG27vH1ULVUU_HIS_OQ9bEdHOTc2to9D16NlmbfhwZTU1deNtJY_JdRpS701Hth3l9M2-MZxk-PxiO69kHXdhA/w528-h640/zzmtgambierscatter.jpg" width="528" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Indeed, over 10000 values have been infilled, many with extreme values creating outliers. </span><span style="font-family: verdana; font-size: medium;">These are not true observational readings because Raw is missing. 
Records have been set by computer: look at the outlier dot in 1934, way above the others.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDYkBOggbxlEmfc87NxHnpIKpgl80Y14QUKLjkmAk8wZLV-3VMY5R7WwD9o0MYwWEAeN0GxzJNosCYpykswMhQqev-QSEDMeILFEQS4_0sEKZXzvDTzLmVJ5Zn6_ylsyw76Ziil2dcBEM/s1463/xmtgambierimputedultra.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="680" data-original-width="1463" height="298" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDYkBOggbxlEmfc87NxHnpIKpgl80Y14QUKLjkmAk8wZLV-3VMY5R7WwD9o0MYwWEAeN0GxzJNosCYpykswMhQqev-QSEDMeILFEQS4_0sEKZXzvDTzLmVJ5Zn6_ylsyw76Ziil2dcBEM/w640-h298/xmtgambierimputedultra.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">The same applies to Maxv2.1 (below). 
<i>The second lowest maximum temperature ever is a computer generated infilled value!</i></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEz-arzoRyojZkw8IrSfMsFa-FHVZDjpR1YA15KRq8WlYljztNoy2sbFM2tbf0PUKsOXmPEvqvoiGq1xLrTJboxGCTSwKSHE3Je9jk38xLhaUgGEHmvYyICEYXGSAcgF48eUwTMt9nC_8/s1154/mtgambiermaxv21imputation.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="655" data-original-width="1154" height="364" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEz-arzoRyojZkw8IrSfMsFa-FHVZDjpR1YA15KRq8WlYljztNoy2sbFM2tbf0PUKsOXmPEvqvoiGq1xLrTJboxGCTSwKSHE3Je9jk38xLhaUgGEHmvYyICEYXGSAcgF48eUwTMt9nC_8/w640-h364/mtgambiermaxv21imputation.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>SUMMARY</b></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Infilling/interpolation/imputing values need to be carefully done to be statistically valid. 
And the values will never be as accurate as a valid reading.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">This is not being done by the BOM<i> because they are selecting specific values they want, leaving others blank, AND they are creating extreme value outliers in positions they want! </i>Outliers are normally removed, here they are added in<i>. </i>This data is completely without integrity.</span></div></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>__________________________________________________________________________________</b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">DUBIOUS ADJUSTMENTS</span></b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><i style="font-family: inherit;"><span style="background-color: white; color: #202122;"><b>___________________________________________________________________________________</b></span></i></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"> <span style="font-family: verdana; 
font-size: medium;"><i>"The primary
purpose of an adjusted station dataset it to provide
quality station level data for users, with areal [sic] averages being a secondary product." -BOM</i></span></div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The complete Amberley temperature series both raw and adjusted is below. The orange graph is the raw temperature, the blue is the cooled down adjusted version. By cooling the past, a warming trend is created. Notice cooling in Adj stopped around 1998.</span></div><div class="separator" style="clear: both;"><div class="separator" style="clear: both;"><br /></div></div><div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFG1GqnVhXkhDgmCxle5UKdvzzOT1N8KPhOKlJ0KaYCZqxOCaUPFVTxktfiunOXyO-RARqfLWTspN3LEwlDgM1hKQxEmet0LNRS6PuTNbpyHzrxs95H3qSDu8yNIEGiUpVMNCiER3lGwk/s828/amberzzzzz.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="679" data-original-width="828" height="524" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFG1GqnVhXkhDgmCxle5UKdvzzOT1N8KPhOKlJ0KaYCZqxOCaUPFVTxktfiunOXyO-RARqfLWTspN3LEwlDgM1hKQxEmet0LNRS6PuTNbpyHzrxs95H3qSDu8yNIEGiUpVMNCiER3lGwk/w640-h524/amberzzzzz.jpg" width="640" /></a></div><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"><b>___________________________________________________________________________________</b></div><div class="separator" style="clear: both;"><b><br /></b></div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">A Summary Of The Amberley Problem:</span></div><div class="separator" 
style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(1) A dip in the temperature series in 1980 (an 'inhomogeneity detected') made them realise the station was running warm because now it didn't match its neighbours -- therefore it was cooled down significantly from 1942 to 1998. ???</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(2) The unspecified 'neighbour stations' were totalled as 310 by NASA and several dozen by BOM.</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The stations involved were never clearly identified, so the comparison cannot be tested.</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(3) Conveniently a warming trend had been created where there was none before.</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(4) In 1998 the station mysteriously returned to normal and no more significant adjustments were required. 
</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(5) The following iteration of the temperature series from Minv1 to Minv2 resulted in them <i>now warming the cooled station after 1998</i>. </span></div><div class="separator" style="clear: both;"><b><span style="font-size: medium;"><br /></span></b></div><div class="separator" style="clear: both;"><span style="font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><b><span style="font-size: medium;">__________________________________________________________________________</span></b></div><div class="separator" style="clear: both;"><span style="font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">No evidence supplied, no documentation on the 'neighbours' involved, no meta data. 
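</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The claim in (3), that cooling the earlier part of a series produces a warming trend, is simple arithmetic and can be sketched in a few lines of Python. The numbers below are made up for illustration and are not Amberley data; the flat 12.0C series, the 1.0C offset and the 1980 cutoff are all assumptions, and this is not the BOM's homogenisation method.</span></div>
<pre>
```python
# Synthetic illustration only: made-up numbers, not Amberley data.

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

CUTOFF = 1980   # assumed breakpoint year
OFFSET = 1.0    # assumed cooling (degrees C) applied before the breakpoint

years = list(range(1942, 2019))
early_years = set(range(1942, CUTOFF))

raw = [12.0 for _ in years]  # hypothetical flat series: no trend at all
adjusted = [t - OFFSET if y in early_years else t for y, t in zip(years, raw)]

raw_slope = ols_slope(years, raw)
adj_slope = ols_slope(years, adjusted)

print(f"raw trend:      {raw_slope * 10:.3f} C per decade")
print(f"adjusted trend: {adj_slope * 10:.3f} C per decade")
```
</pre>
<div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The raw series has zero slope by construction, so any trend in the adjusted series comes entirely from the step subtracted from the early years.</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">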
</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The review panel from 10 years ago had problems with the methodology too -- </span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><div class="separator" style="clear: both;"><span style="font-family: verdana;"><i><span style="font-size: medium;">"C7 Before public release of the ACORN-SAT dataset the Bureau <b>should determine and document </b></span></i><i><span style="font-size: medium;"><b>the reasons why the new data-set shows a lower average temperature in the period prior to 1940 </b></span></i><i><span style="font-size: medium;">than is shown by data derived from the whole network, and by previous international analyses </span></i><i><span style="font-size: medium;">of Australian temperature data." </span></i></span></div></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">Also:</span></div><div class="separator" style="clear: both;"><i><span style="font-family: verdana; font-size: medium;">"C5 The Bureau is encouraged to calculate the adjustments using only the best correlated </span></i><i style="font-family: verdana;"><span style="font-size: medium;">neighbour station record and compare the results with the adjustments calculated using several </span></i><i style="font-family: verdana;"><span style="font-size: medium;">neighbouring stations. 
This would better justify one estimate or the other and quantify impacts </span></i><i style="font-family: verdana;"><span style="font-size: medium;">arising from such choices."</span></i></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-size: medium;"><span style="font-family: verdana;">Using only the 'best correlated neighbour stations' has obviously confused Gavin Schmidt from NASA, he used 310 neighbours (see </span><a href="https://jennifermarohasy.com/wp-content/uploads/2011/08/Changing_Temperature_Data.pdf" style="font-family: verdana;">Jennifer Marohasy'</a><span style="font-family: verdana;">s blog). Dr. Jennifer Marohasy documents the whole dubious adjustment saga in detail.</span></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">The BOM were eventually forced to defend their procedures in a statement:</span></div><div class="separator" style="clear: both;"><div class="separator" style="clear: both;"><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">"Amberley: the major </span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">adjustment is to minimum temperatures in 1980. 
There is very little </span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">available documentation for Amberley before the 1990s (possibly, as an </span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">RAAF base, earlier documentation may be contained in classified </span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">material) and this adjustment was identified through neighbour </span></i></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;">comparisons. <b>The level of confidence in this adjustment is very high </b></span></i><i><span style="font-family: verdana; font-size: medium;"><b>because of the size of the inhomogeneity and the large number of other </b></span></i><i><span style="font-family: verdana; font-size: medium;"><b>stations</b> in the region (high network density), which can be used as a </span></i><i><span style="font-family: verdana; font-size: medium;">reference. 
The most likely cause is a site move within the RAAF base."</span></i></div></div></div><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana; font-size: large; text-align: left;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><span style="font-size: medium;"><span style="font-family: verdana; text-align: left;">Obviously their level of confidence wasn't that large because </span><i style="font-family: verdana; text-align: left;">they warmed up their cooled temperatures</i><span style="font-family: verdana; text-align: left;"> somewhat in the next iteration of the temperature series data set (from minv1 to minv2).</span></span></div><div class="separator" style="clear: both; text-align: center;"><div style="text-align: left;"><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div style="text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Update minv2.1</span></b></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Warming continues from iteration minv2 to minv2.1 by<i> increasing the frequency of temperature repeats</i> slightly. 
Every iteration gets warmer.</span></div><div style="text-align: left;"><span style="font-size: medium;"><br /></span></div><div style="text-align: left;"><b><br /></b></div><div style="text-align: left;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjo8okMLoWiwf_yneUEYTsfx9MDIcxYKa9-S9GdnPz5RflMYwmPxRDGUeJVajJG6DL7xQ2NQ6M7r24Hri4tMEmAXlb4PfIW3vvBAMr8phl6bvtdCZ4saO9gR2ZN_9kcpizI-sSgVlTVvP4/s1249/amberleyrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="487" data-original-width="1249" height="250" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjo8okMLoWiwf_yneUEYTsfx9MDIcxYKa9-S9GdnPz5RflMYwmPxRDGUeJVajJG6DL7xQ2NQ6M7r24Hri4tMEmAXlb4PfIW3vvBAMr8phl6bvtdCZ4saO9gR2ZN_9kcpizI-sSgVlTVvP4/w640-h250/amberleyrepeats.jpg" width="640" /></a></div><br /><b><br /></b></div><div style="text-align: left;">_______________________________________________________________________________</div></div></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><span style="font-size: medium;">This whole situation is ludicrous and you get the feeling that the BOM has been caught in a lie. 
</span><span style="font-size: medium;">There are several ways to check the impact of the adjustments, though.</span></span></div></div><div class="separator" style="clear: both; text-align: center;"><i><span style="font-family: verdana; font-size: medium;"><br /></span></i></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">(1) Benford's Law before and after adjustments</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">(2) Control Charts, before and after adjustments</span></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both;"><div class="separator" style="clear: both;"><span style="font-family: verdana; font-size: medium;">(3) Tracking first digit values from 1942-2018 to see if we can spot digit values changing, using a smooth Bayesian model from the University of Edinburgh.</span></div><div class="separator" style="clear: both;"><span style="font-family: verdana;"><br /></span></div></div><div><br /></div><div><span><i><span style="background-color: white; color: #202122; font-family: verdana;"><b>__________________________________________________________</b></span></i></span></div></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">AMBERLEY TEST 1</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div 
class="separator" style="clear: both; text-align: left;"><b><span style="font-family: verdana; font-size: medium;">Benford's Law</span></b></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Below:</b> Raw and adjusted data are compared from 1942-1980 using Benford's Law first digit analysis. </span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9h2kyZoHw6vgdev2pLSmgJ7GN7Iea4oIoc3iOs8myTwRWo8eA_Hey3cl0kWEH7tA5FF6ye0y1jMG4mke17XFRcgXoUNjFsFUHK8GD1N0_fkYnDjpeSYPDm2uVf1fOSw-1tkIav1sUYlM/s1640/1digit1980down.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="835" data-original-width="1640" height="326" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9h2kyZoHw6vgdev2pLSmgJ7GN7Iea4oIoc3iOs8myTwRWo8eA_Hey3cl0kWEH7tA5FF6ye0y1jMG4mke17XFRcgXoUNjFsFUHK8GD1N0_fkYnDjpeSYPDm2uVf1fOSw-1tkIav1sUYlM/w640-h326/1digit1980down.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">Using Benford's Law for first digit analysis, we can see the adjustments make the data worse, with lower conformance and a smaller p-value.</span></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Below:</b> Benford's Law, first two digits, for January and July.</span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-family: verdana; font-size: medium;">The graphs are as bad as anything you are likely to see and would trigger an automatic audit in any financial situation.</span></div><div class="separator" style="clear: both; text-align: 
left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9pyjANscIzvp0PNWE5a23S-Eoksi4JQX4vb-nKEFaSYF71svaKPifatclqon7xRVs5ZguR0tCLxGJIvzfSBtWtaByJr8SdCN79WqRB85nANgZRVIi0684YZlhQw5rqcBxONZAixzVDfc/s1617/xxxx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="877" data-original-width="1617" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9pyjANscIzvp0PNWE5a23S-Eoksi4JQX4vb-nKEFaSYF71svaKPifatclqon7xRVs5ZguR0tCLxGJIvzfSBtWtaByJr8SdCN79WqRB85nANgZRVIi0684YZlhQw5rqcBxONZAixzVDfc/w640-h348/xxxx.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div></div><div class="separator" style="clear: both; text-align: center;"><div style="text-align: left;">___________________________________________________________________________________</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">AMBERLEY TEST 2</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Basic Quality Control - The Control Chart </b></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Besides Benfords Law, let's use Control Charts to get a handle on the Amberley data and get a second opinion.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">I put the Min Raw and Minv2.1 temperature data into a Control Chart, <i>one of the seven basic 
tools of quality control.</i> </span></div><div style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">The temperature series was already <i>'out of control' </i>in the raw sequence, but not in 1980. There are 11 warning nodes where the chart is over or under the 3 sigma limit, <i>but after adjustments, this nearly doubles</i>. There are many more warning nodes and the temperature sequence is more unstable.</span></div><div style="text-align: left;"><span style="color: #2e2e2e;"><br /></span></div><div><span style="color: #2e2e2e;"><br /></span></div></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9cyagCAV8udh7cmbxYc7wne-bL_wYo1SFRiS1LtPuYZ3ApP1y1a9fwaqN-aaCa_SLeATp-pvb2NJRY3we8FdlZo0Q9R-ty3PkUKvMUrAFdkJlcVlGA9vHceYciC4Mg77FPINpSO1_nCk/s1445/amberleycontrol.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="625" data-original-width="1445" height="277" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9cyagCAV8udh7cmbxYc7wne-bL_wYo1SFRiS1LtPuYZ3ApP1y1a9fwaqN-aaCa_SLeATp-pvb2NJRY3we8FdlZo0Q9R-ty3PkUKvMUrAFdkJlcVlGA9vHceYciC4Mg77FPINpSO1_nCk/w640-h277/amberleycontrol.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><br /></span><div><b><span style="font-size: medium;"><br /></span></b></div><div><b><span style="font-size: medium;">__________________________________________________________________________</span></b></div><div><b><span style="font-size: medium;"><br /></span></b></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">AMBERLEY TEST 3</span></div><div><b><span style="font-family: verdana; 
font-size: medium;"><br /></span></b></div><div><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div><b><span style="font-family: verdana; font-size: medium;">Tracking Benford's Law For First Digit Value Over Time </span></b></div><div><b><span style="font-family: verdana; font-size: medium;">Amberley Minv2 Adj Data 1942-2017</span></b></div><div><b><span style="font-family: verdana; font-size: medium;"><br /></span></b></div><div><span style="font-family: verdana; font-size: medium;">The University of Edinburgh has created a smooth Bayesian model that tracks performance of first digit values for Benford's Law over time so that you can see exactly when first digit probabilities increase or decrease. </span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">In effect, this allows you to see how the value of the <i>first digit in a temperature anomaly </i>changes over time. </span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">Running this model with temperature anomalies from Minv2 took 15 minutes on a laptop and produced the graph below:</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">Amberley started in 1942, so that is the zero reference on the X axis. The dotted line is the baseline showing what should occur for the digits to conform to Benford's Law. 1980 would be at 38, just before 40 on the X axis.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">This graph shows that the <i>first digit with value 1</i> has always been underused. It became slightly more compliant in the 1950's then worsened again. 
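</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">For readers who want to check first digit counts themselves, here is a minimal Python sketch of the baseline comparison behind the dotted line: the Benford probability for each leading digit, log10(1 + 1/d), set against the observed share of each digit. It is not the University of Edinburgh model and not the JMP analysis used in this post; the function names and the sample anomalies are made up for illustration.</span></div>
<pre>
```python
import math
from collections import Counter

def benford_expected():
    """Benford's Law probability for each leading digit 1 to 9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(value):
    """First non-zero digit of a number, ignoring sign; None for zero."""
    # Scientific notation puts the leading significant digit first.
    for ch in f"{abs(value):.10e}":
        if ch in "123456789":
            return int(ch)
    return None

def first_digit_shares(values):
    """Observed share of each leading digit 1 to 9 (zeros are skipped)."""
    digits = [first_digit(v) for v in values]
    digits = [d for d in digits if d is not None]
    if not digits:
        return {d: 0.0 for d in range(1, 10)}
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Hypothetical anomalies in degrees C, for illustration only.
sample_anomalies = [-1.2, 0.3, 2.5, -0.8, 1.1, 3.4, -2.2, 0.0, 1.9, -0.4]

expected = benford_expected()
observed = first_digit_shares(sample_anomalies)
for d in range(1, 10):
    print(f"digit {d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```
</pre>
<div><span style="font-family: verdana; font-size: medium;">A digit whose observed share sits below its expected value is underused in the sense used here; with only a handful of values the shares are far too noisy to mean anything, so this only makes sense on a full daily series.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">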
There are far too few 1's in first digit position of temperature anomaly Minv2.</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">There are too many 2's and 3's right from the beginning at 1942, but use of 2's lessens (thus improving) slightly from the 1980's onwards. <i>But use of 4's increases from the 1980's.</i></span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div><div><span style="font-family: verdana; font-size: medium;">The values of 8's and 9's in first digit position have always been underused. The 9 value is underused from the 1990's onwards.</span></div><div><i><span style="font-family: verdana; font-size: medium;">These digit values indicate less conformance after the 1980 adjustments.</span></i></div><div><i><br /></i></div><div><br /></div></div><div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1jNnbyhMryag70gCcH7l1pGSBvULKn9e3kUQfRi3RXPVkY4F6peM_VjOesRbSurswe5sgsnsmwiv0E2iVToFueUmrpbHI1oZdCpODc5wI91vsaUzRYdLBePtdB-HaJbkRh-TRH39f-kg/s1920/amberminv2jags.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1jNnbyhMryag70gCcH7l1pGSBvULKn9e3kUQfRi3RXPVkY4F6peM_VjOesRbSurswe5sgsnsmwiv0E2iVToFueUmrpbHI1oZdCpODc5wI91vsaUzRYdLBePtdB-HaJbkRh-TRH39f-kg/w640-h348/amberminv2jags.jpg" width="640" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><div><span style="font-family: verdana; font-size: large; text-align: 
left;">___________________________________________</span></div><div><div style="text-align: left;"><br /></div><div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">MORE DODGY ADJUSTMENTS</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><br /></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b><br /></b></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><b>Bourke Adjustments Of 0.5C = Less Than Sampling Variation</b></span></div><div style="font-family: "Times New Roman";"><span style="color: #2e2e2e; font-family: verdana;">BOM released a statement about the adjustments made at Bourke:</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><blockquote style="border: none; font-family: "Times New Roman"; font-size: medium; margin: 0px 0px 0px 40px; padding: 0px;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><div><i>"Bourke: the major adjustments (none of them more than 0.5 </i></div></span><div><span style="font-family: verdana; font-size: medium;"><i>degrees Celsius) relate to site moves in 1994 (the instrument was moved </i></span><i style="font-family: verdana; font-size: large;">from the town to the airport), 1999 (moved within the 
airport grounds) </i><i style="font-family: verdana; font-size: large;">and 1938 (moved within the town), as well as 1950s inhomogeneities that </i><i style="font-family: verdana; font-size: large;">were detected by neighbour comparisons which, based on station photos </i></div><div><span style="font-family: verdana; font-size: medium;"><i>before and after, may be related to changes in vegetation (and therefore </i></span><i style="color: #2e2e2e; font-family: verdana; font-size: large;">exposure of the instrument) around the site."</i></div></blockquote><p style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana;"><br /></span></p><p style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">Looking at Bourke, below:</span></p><p style="font-family: "Times New Roman"; font-size: medium;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1Tvch8FkTmFct6mOTTSDe2vFCE0ZGVEQd7UMwHd8b3c6pizbfN657kk3dlqhKIwO7le-vGWX1ekssuv4YU_Dzai3ZdSAj53wiQVhL4gWz4EUVQGWSHxZNhM0Rr5PGhpQOP25gJXPF3RU/s828/bbbbb.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="679" data-original-width="828" height="524" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1Tvch8FkTmFct6mOTTSDe2vFCE0ZGVEQd7UMwHd8b3c6pizbfN657kk3dlqhKIwO7le-vGWX1ekssuv4YU_Dzai3ZdSAj53wiQVhL4gWz4EUVQGWSHxZNhM0Rr5PGhpQOP25gJXPF3RU/w640-h524/bbbbb.jpg" width="640" /></a></p><div><span style="color: #2e2e2e;"><br /><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">This is strange because there are lots of adjustments in 1994, 1999 and 1938 that are far more than 0.5 degree, some are over 3C degrees.<br /></span></div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; 
font-size: medium;"><br /></span></div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">But maybe the vague language about the 1950's is where the low adjustments are -- well, it depends on the year, which is not specified.</span></div><div style="font-family: "Times New Roman"; font-size: 14.0877px;"><span style="font-family: verdana;"><br /></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><div class="separator" style="clear: both; text-align: center;"><span style="font-family: verdana;"><br /></span></div></div><div style="font-family: "Times New Roman";"><span style="font-family: verdana;">Here are the biggest adjustments in the time series for Bourke:</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-size: medium;"><br /></span></div><div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ8218NClEV3-LXQsR07ze3xoOdZ-xzrYer8Ca0M_W8Tj_ibsMw5tSKmnLJ3KxCnElvb2l4DAJn3woX6hV3kspoPv2I_8pAjlJ8hTluB6ORRhtAcyLvdoS-MvNlj33yUwigH5rm19Oh0Y/s812/bourkecooling.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="812" data-original-width="607" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ8218NClEV3-LXQsR07ze3xoOdZ-xzrYer8Ca0M_W8Tj_ibsMw5tSKmnLJ3KxCnElvb2l4DAJn3woX6hV3kspoPv2I_8pAjlJ8hTluB6ORRhtAcyLvdoS-MvNlj33yUwigH5rm19Oh0Y/w478-h640/bourkecooling.jpg" width="478" /></a></div><br /><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><br /></div><div class="separator" style="clear: both;"><span style="font-family: verdana;"><span style="font-size: medium;">So the <i>'</i><i>none of them more than 0.5 </i></span><i style="color: black;">degrees Celsius' is </i><span 
style="color: black;">meant for some unknown years in the 1950's, it seems to be misdirection to distract from the bigger adjustments all along the time series.</span></span></div><div class="separator" style="clear: both;"><span style="color: black;"><span style="font-family: verdana;"><br /></span></span></div><div class="separator" style="clear: both;"><span style="color: black;"><span style="font-family: verdana;">But look at the months column -- so many August entries I had to look at it more closely.</span></span></div><div class="separator" style="clear: both;"><span style="color: black;"><span style="font-family: verdana;"><br /></span></span></div><div class="separator" style="clear: both;"><span style="color: black;"><span style="font-family: verdana;">So there were more than 4400 adjustments over 2C degrees in the time series. That's the subset we'll look at--</span></span></div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><span style="color: black; font-size: large;"><span style="font-family: verdana;"><br /></span></span></div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium;"><span style="color: black; font-size: large;"><br /></span></div><div class="separator" style="clear: both; font-family: "Times New Roman";"><div class="separator" style="clear: both; font-size: medium; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ5HdkHQniLyhYK9gmNS4WQHFXlwcMxr79TAGXqINJakkwBW2Ht2YkNvcfWufra6awUHNa95Ol9vwP8Dbigrl8B_yg58gCnk4-BgtfSLpsKwLiOty8CnrmNcSDSVAm-P8ADsvR-6S-3yg/s360/augcool.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="360" data-original-width="187" height="400" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ5HdkHQniLyhYK9gmNS4WQHFXlwcMxr79TAGXqINJakkwBW2Ht2YkNvcfWufra6awUHNa95Ol9vwP8Dbigrl8B_yg58gCnk4-BgtfSLpsKwLiOty8CnrmNcSDSVAm-P8ADsvR-6S-3yg/w208-h400/augcool.jpg" width="208" /></a></div><br /><span style="color: black;"><span style="font-family: verdana;">Look at August -- <i><b>half of all the adjustments over 2 degrees for 1911-2019 in the time series were in August! </b></i></span></span></div><div class="separator" style="clear: both; font-family: "Times New Roman";"><span style="color: black;"><span style="font-family: verdana;"><br /></span></span></div><div class="separator" style="clear: both; font-family: "Times New Roman";"><span style="color: black;"><span style="font-family: verdana;">August is getting special attention by the BOM in Bourke with a major cooling of Minimum temperatures. </span></span></div><div style="font-family: "Times New Roman";"><span style="font-family: verdana;"><br /></span></div><span style="font-family: verdana; font-size: medium;">Getting back to the <i>'0.5 degree adjustments in the 1950's' </i>-- this is nonsense because:</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">These are the statistics for 1950-1959:</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><br /></div><div class="separator" style="clear: both; font-family: "Times New Roman"; font-size: medium; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaPzUUUvF6vyFT_LAgGxVlvpCHdqt0CiJ74PtbZiKQ6HK5OaL2WoDP3yLHiKh3tqn8AlnRj4s9OAXG9Sj9d72ZHZz6qJwfuTRVZ_gJA8TxKTA6oixcE0smxYJfwYA1Wtq7gEmvEkewV1M/s519/bourke50%2527s.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="519" 
data-original-width="372" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaPzUUUvF6vyFT_LAgGxVlvpCHdqt0CiJ74PtbZiKQ6HK5OaL2WoDP3yLHiKh3tqn8AlnRj4s9OAXG9Sj9d72ZHZz6qJwfuTRVZ_gJA8TxKTA6oixcE0smxYJfwYA1Wtq7gEmvEkewV1M/w286-h400/bourke50%2527s.jpg" width="286" /></a></div><br /><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">The mean Min Raw temp is 13.56 degrees. </span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;"><i>A single mean figure is subject to sampling variation and does not give the full picture.</i></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;"><i><br /></i></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;"> </span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="font-family: verdana; font-size: medium;">Putting Bourke Min into a Control Chart below shows what the real problems are. The upper red line is the upper 3 sigma limit and the lower red line is the lower 3 sigma limit; temps will stay between the red lines 99.7% of the time unless the series is <i>'out of control.'</i></span></div></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"> </span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">You can tell something is wrong with Bourke from the number of points that have breached the upper and lower limits at the beginning and end of the series; the 1950s don't even register. 
How can it, we are talking 0.5C.</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">Look at 2010, it is off the chart, literally....an extremely remote chance of seeing this event at random.</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;">In Control Chart language, this temp series is <i>'Out Of Control', </i>there is something very wrong with it.</span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: verdana; font-size: medium;"><br /></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><span style="color: #2e2e2e; font-family: Georgia, Baskerville, Palatino, Palatino Linotype, Book Antiqua, Times New Roman, serif;"><span style="font-size: 14.0877px;"><br /></span></span></div><div style="font-family: "Times New Roman"; font-size: medium;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFPc5kF7LVFt7YDZ-qcPFvQFXQk_6g6gOwie_PZLlZqGWBNQxnfEe5otU1HgTDPfj-Ailojkmo46Tf-une7VMCVe7_hMv2nwKatOHrj6WA6xqAsxqs-1yXvQ3hWzho8VTCtgUx6V22mXY/s671/bbbb.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="600" data-original-width="671" height="572" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFPc5kF7LVFt7YDZ-qcPFvQFXQk_6g6gOwie_PZLlZqGWBNQxnfEe5otU1HgTDPfj-Ailojkmo46Tf-une7VMCVe7_hMv2nwKatOHrj6WA6xqAsxqs-1yXvQ3hWzho8VTCtgUx6V22mXY/w640-h572/bbbb.jpg" width="640" 
/></a></div><br /><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><b>Above:</b></span><span style="color: #2e2e2e;"> Control Chart for Bourke showing the system is 'out of control' from the beginning, but the 1950s are not the problem.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">UPDATE: **********************************************</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">MORE on the specific months in which Bourke is manipulated/adjusted.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">Looking at Minv2.1, here are all the manipulations, um, adjustments of -2.7C or more, i.e. the biggest cooling adjustments, shown below. 
May gets 414 adjustments, August gets 1330.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdp7yHvJ25lI4TDvZVsCFR3IP4y-hgfoljvJQhoC880eotaOa0_-EXHYrseMWSk0nwiNRQu86NuNjTQrl-K8P5sWuEUTz-g4xM8Tpga0iGgTZXcwrljVfkVFUn6bMtmf8jGyI8ReohCbY/s527/bourkeminnegadj.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="407" data-original-width="527" height="494" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdp7yHvJ25lI4TDvZVsCFR3IP4y-hgfoljvJQhoC880eotaOa0_-EXHYrseMWSk0nwiNRQu86NuNjTQrl-K8P5sWuEUTz-g4xM8Tpga0iGgTZXcwrljVfkVFUn6bMtmf8jGyI8ReohCbY/w640-h494/bourkeminnegadj.jpg" width="640" /></a></div><br /><span style="color: #2e2e2e;"><b>What this shows is that in Minimum Temps, 97% of ALL adjustments over 84 years at -2.7C or less (cooling down) were May and August!</b> Whether or not May and August needed it, every year for 84 years, adjustments were made to May and August.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">Below, all the years the adjustments were done as well as how many.</span></span></div><div><br /></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6IQfgg-N65VC0YeQ0_fxqUn2nE3nZ5eJpZutWUgzt1yOrHrm08kJ0NGynJXWSISOKtN3SEjoLEh9oa_ZB2-eEgD4ho_iH5o21vxrucQXxRCN1Sr4aJO5W4QrsOoLFPWhjsr25X3sf7hc/s841/2021-01-23_164649.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="841" data-original-width="730" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6IQfgg-N65VC0YeQ0_fxqUn2nE3nZ5eJpZutWUgzt1yOrHrm08kJ0NGynJXWSISOKtN3SEjoLEh9oa_ZB2-eEgD4ho_iH5o21vxrucQXxRCN1Sr4aJO5W4QrsOoLFPWhjsr25X3sf7hc/w348-h400/2021-01-23_164649.jpg" width="348" /></a></div><br /><span style="color: #2e2e2e;"><i>What this means is that the largest cooling adjustments were all done on May+ August every single year</i>---whether the station moved, vegetation grew, thermometers drifted, it matters not--every year May and August were cooled by -2.7C or more with an average of 23 adjustments per year.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;">This makes a mockery of reasons concocted by BOM in hindsight such as vegetation growing, station moving up the hill then down the hill etc.</span></span></div><div><span style="font-family: verdana; font-size: medium;"><span style="color: #2e2e2e;"><br /></span></span></div></span></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;">___________________________________________________________________________________</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; 
font-size: medium;">“<i>There has been no statistically significant warming over the last 15 years.” -- 13 February 2010, <a href="https://www.cato.org/publications/commentary/climategate-beyond-inquiry-panels">Dr. Phil Jones</a></i></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Statistical Significance With NPC Test from MIT</b></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><b>Every Decade Since The 1980s Warmer? </b></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><b><br /></b></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Very often the BOM displays graphs and charts without boundaries of error or confidence intervals. 
The statistical significance is simply implied.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Given that there are problems with the historical temperature series, what if we could test just the best, most recent results, taken with modern fail-safe equipment, for statistical significance?</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">A hypothesis like this is easy to test:</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><div><i>Dr Colin Morice from the Met Office Hadley Centre:</i></div><div><i>"Each decade from the 1980s has been successively warmer than all</i></div><div><i>the decades that came before." </i></div><div><br /></div><div>We can use the Non Parametric Combination (NPC) test with R code from <a href="https://www.scholars.northwestern.edu/en/publications/nonparametric-combination-npc-a-framework-for-testing-elaborate-t">Devin Caughey</a> at MIT.</div><div><br /></div></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">This technique is common in <a href="https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.23115">brain mapping</a> labs because no assumptions are made about the distribution, inter-dependencies are handled and multiple tests are exactly combined into a single p value. 
A great signal to noise ratio and the ability to handle very small sample sizes make this the ideal candidate to test the hypothesis.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">The null hypothesis = decades since 1980 are NOT <i>all</i> successively warmer.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">The alternate hypothesis = <i>all</i> decades since 1980 are successively warmer.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">We'll use the temps from Berkeley Earth.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">The data will be split into three decades:</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">1980-1989</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">1990-1999</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">2000-2009</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">The output of NPC is a p value after exactly combining the sub-hypotheses. 
In keeping with the BOM, we use the 95% confidence level (a 5% significance level), so the null hypothesis is rejected for any p value LESS than 0.05.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"> </span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">The results using Berkeley Earth temps (except NOAA which is from NOAA) are:</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><div>Berkeley Earth temp</div><div>h0=!1>2>3----null hypothesis</div><div>h1=1>2>3=each decade warmer----alternate hypothesis</div><div><br /></div><div><i>Don't Reject The Null - each decade NOT getting warmer.</i></div><div>Alice Springs p-value = 0.4188</div><div>Amberley p-value = 0.3326</div><div>Tennant Creek p-value = 0.7159</div><div>Benalla p-value = 0.4085</div><div>Bering p-value = 0.1651</div><div>Capetown p-value = 0.2872</div><div>Corowa p-value = 0.1776</div><div>Darwin p-value = 0.5984</div><div>DeBilt, Netherlands p-value = 0.146</div><div>Deniliquin p-value = 0.4067</div><div>Echuca p-value = 0.3645</div><div>Launceston p-value = 0.3331</div><div>Mawson p-value = 0.3043</div><div>Mildura p-value = 0.2888</div><div>Mt. 
Isa p-value = 0.5782</div><div>NOAA Southern Region p-value = 0.2539 </div><div>Nowra p-value = 0.2141</div><div>Rutherglen p-value = 0.2283</div><div>Sale p-value = 0.3685</div><div>Tamworth p-value = 0.2407</div><div>Wangaratta p-value = 0.277</div><div><br /></div><div><i>Reject The Null - each decade is getting warmer</i></div><div>Beechworth p-value = 2e-04</div><div>Hobart p-value = 3e-04</div></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Beechworth is less than 40 km away from Wangaratta yet decisively rejects the null while Wangaratta does not! The same goes for Hobart and Launceston, which are 2 hours apart: Hobart rejects the null while Launceston does not.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">This shows that the premise from Met Office Hadley is not supported for most of our sample. 
Using Berkeley Earth temps on a random sample of stations, the NPC test, which calculates significance <i>without assuming a normal distribution</i>, has failed to reject the null hypothesis in most cases.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Going over the results again, I found most country stations fail to reject the null, while the capital cities, being Urban Heat Islands, </span><span style="font-size: medium;"><span style="font-family: verdana;">decisively</span><span style="font-family: verdana;"> reject the null and agree with Met Office Hadley.</span></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: large;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">This shows 2 things:</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Statistical significance/confidence intervals/boundaries of error are mostly ignored in climate presentations.</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Don't trust everything you hear - test, test, test!</span></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">As An Aside</span><span style="font-family: verdana; font-size: medium;">:</span></div><div style="text-align: left;"><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">Here are 40 000 coin tosses documented at UC Berkeley, with heads as +1 and tails as -1:</span></div><div><span style="font-family: verdana; font-size: medium;"><br /></span></div></div><div style="text-align: left;"><br /></div><div 
style="text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjF0UI96jXCnKPojHfJcY0e5fRpRQKAZjO7spoM8rRdtsHY4vA-FwHaopRzC6cuhzZ8GFi9E4wlPQvEk361pEudLb-ZSQQODUi7eZwJwyxkKRF7w4g3_ceUw5gVZndBrnitg0xMJVmju7M/s968/berkleyUnicointosses.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="829" data-original-width="968" height="548" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjF0UI96jXCnKPojHfJcY0e5fRpRQKAZjO7spoM8rRdtsHY4vA-FwHaopRzC6cuhzZ8GFi9E4wlPQvEk361pEudLb-ZSQQODUi7eZwJwyxkKRF7w4g3_ceUw5gVZndBrnitg0xMJVmju7M/w640-h548/berkleyUnicointosses.jpg" width="640" /></a></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">I took the first 1000 tosses from their supplied spreadsheet, graphed it and plotted a trend.</span></div><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">There's even a 95% percent boundary of error which is more than the BOM supply on most of their trends. </span></div><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="color: #2e2e2e;"><span style="font-family: verdana; font-size: medium;">Moral of the story: Even a sequence of coin tosses can show a trend. 
</span></div></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPhOpTKOLnVGa9rC0FQbd0hWpiE3T3gcyT64YZgTJyQ1Z-r_2CbMVtpiltA_dFweUOaz1Tk8K6GiZ15Rys-vzOvH70-F3zYxMjfGK9NO_yosgMbQcXl4rgxrM7FxLUNyW76NhBk3UYorI/s1920/1000cointossestrend.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPhOpTKOLnVGa9rC0FQbd0hWpiE3T3gcyT64YZgTJyQ1Z-r_2CbMVtpiltA_dFweUOaz1Tk8K6GiZ15Rys-vzOvH70-F3zYxMjfGK9NO_yosgMbQcXl4rgxrM7FxLUNyW76NhBk3UYorI/w640-h348/1000cointossestrend.jpg" width="640" /></a></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">__________________________________________________________</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">***UPDATES COMING ***</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">19 Jan 2021</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Over The Next Few Days/Weeks</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br 
/></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">coming-</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">NEW evidence on missing raw data being imputed/estimated/fabricated into Adjusted data that breaks records. In one case 10 000 missing raw data points are imputed/estimated/fabricated into Adjusted data with yearly records. DONE</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Count or ratio data conforms to Benford's Law. A NEW way to feed temperatures into the histogram graphs that test for Benford's Law <i>without the need to convert them to temperature anomalies.</i> This tests the histogram of repeated temperatures for Benford's Law conformance!</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">Statistical significance tests using climate data from Berkeley Earth. This shows that for many/most stations the climate agency data is not even statistically significant when testing this hypothesis--</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><i><div>Dr Colin Morice from the Met Office Hadley Centre:</div><div>"Each decade from the 1980s has been successively warmer than all</div><div>the decades that came before." 
</div></i></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">We do this using the Non Parametric Combination Test from MIT, which makes no assumptions about the distribution and automatically accounts for inter-dependencies. DONE</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">__________________________________________________________</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">This blog has taken quite a few months to research and write. The more I dug into the data, the more rotten it was. And I am still digging. It is a shocking case of extreme data tampering and fabrication. </span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">If it were financial, it would be on a larger scale than Enron (check the Enron Benford curves from my first post), and the fabrication/duplication is larger than that of Prof Staples, who retracted his studies and was fired from the University of Rotterdam (and who said his techniques were 'commonly in use' in the research labs). 
</span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">This has to be a wake-up call for the Government to launch a forensic audit. The BOM cannot be trusted with the temperature record; it should be handed over to a reputable organisation like the Bureau Of Statistics. </span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;"><br /></span></div><div style="text-align: left;"><span style="font-family: verdana; font-size: medium;">It's obvious the BOM either doesn't know or doesn't want to know about data integrity. This isn't science. The Brits have a term for this - <i>Noddy Science.</i></span></div><div style="text-align: left;"><span style="font-size: medium;"><i><br /></i></span></div><div style="text-align: left;"><span style="font-size: medium;">__________________________________________________________________________</span></div></div></div></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span style="font-size: medium;"><br /></span></div><div class="separator" style="clear: both; text-align: left;"><span><span style="font-family: verdana; font-size: medium;">More to follow in other posts, as there is much to write about in relation to climate data. 
It's making the tulip frenzy of the 1600s look like a hiccup.</span></span></div><div><div class="separator" style="clear: both; text-align: left;"><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><p><br /></p><p><br /></p></div></div>Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-64377072465858792912020-11-08T22:06:00.262-08:002020-11-22T19:58:11.311-08:00BOM Sydney Climate Data Audit Using Benfords Law And Statistical Analysis.<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4NJ9-oylLApOEbtTiPaRUl91bwUkNtqX4CRUSW1ZKnKPuitwXxl4mnS2zY3xrEK4g2DLSklSxGLvRxPUMaLUlZffdlXC5sgTh-NVCnTYF3U_5JuJd9n-5mDEFbw_R52t0wjMnhuwP_Ac/s940/heatmap.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="627" data-original-width="940" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4NJ9-oylLApOEbtTiPaRUl91bwUkNtqX4CRUSW1ZKnKPuitwXxl4mnS2zY3xrEK4g2DLSklSxGLvRxPUMaLUlZffdlXC5sgTh-NVCnTYF3U_5JuJd9n-5mDEFbw_R52t0wjMnhuwP_Ac/w400-h266/heatmap.jpg" width="400" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><b>Summary:</b><div> We test daily climate data for Sydney from 1910-2018, 40 000 days for both Max and Min temperature time series, for conformance to Benford's Law of first digit and first two digits, as it is commonly used for fraud detection and data integrity checks. We find the data fails to conform to Benford's Law test criteria by Chi-square and Kolmogorov-Smirnov, indicating tampering, even with raw data. 
We also use the Bayesian time-varying model from the University of Edinburgh to test first digit homogeneity differences over time to pinpoint the years involved.<div><br /></div><div>Then using pattern exploration software, we find large clumps of data, over two and a half months' worth, that have been "copy/pasted" into other years, as well as multiple smaller "above chance" sequences that match across different years. These patterns exist in raw data as well.</div><div><br /></div><div>Trailing digit analysis confirms data tampering and extends the results of the University of Portland analysis of Tasmanian climate data, which showed likely tampering, to Sydney data showing the same thing.</div><div><br /></div><div>Focusing on the most repeated temperatures in a time series, a new technique of repeated numbers or "number bunching" from fraud analytics is used to identify cases where repeated temperatures occur more often than expected.</div><div><br /></div><div><b><br /></b></div><div><b>Prelude:</b></div><div>In the computer industry, we used to say <i>Garbage In, Garbage Out.</i> It expressed the idea that flawed or incorrect input data will always produce faulty output. It's been claimed that 90% of the world's data has been created in the last two years (Horton, 2015), making it even more critical to check data integrity.<p>The Australian Government announced in 2016 that it has committed $2.55 billion for carbon reduction and another $1 billion to help developing countries reduce their carbon dioxide emissions. (<a href="https://www.abc.net.au/news/2016-05-05/government-spends-$500m-reducing-carbon-emissions/7388310" target="_blank">Link</a>)</p><p>The premise behind the spending is that worldwide temperatures have risen to dangerous levels, and that the rise is caused by man-made emissions. 
The most cited dataset used to prove this is the HadCRUT4 from Met Office Hadley Centre UK, and before 2017 this data had never had an independent audit.</p><p>John McLean published his dissertation <span class="person_name" face="Arial, sans-serif" style="background-color: white; font-size: 12px; text-align: center;">McLean, John D.</span><span face="Arial, sans-serif" style="background-color: white; font-size: 12px; text-align: center;"> (2017) </span><em style="background-color: white; font-family: Arial, sans-serif; font-size: 12px; text-align: center;">An audit of uncertainties in the HadCRUT4 temperature anomaly dataset plus the investigation of three other contemporary climate issues.</em><span face="Arial, sans-serif" style="background-color: white; font-size: 12px; text-align: center;"> PhD thesis, James Cook University, </span>showing comprehensively how error-ridden and unreliable this dataset actually was. (<a href="https://researchonline.jcu.edu.au/52041/" target="_blank">Link</a>)</p><div class="separator" style="clear: both; text-align: center;"><a href="https://researchonline.jcu.edu.au/52041/1/52041-mclean-2017-thesis.pdf" style="margin-left: 1em; margin-right: 1em;" target="_blank"><img border="0" data-original-height="323" data-original-width="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiPhEGzMTGM1HvRmlmhHNboPxFYPZVAPZc77Lb0sBnV4MqLe0gJBM37Ghr3n5ijSzZ4oQehNKE6cJvXDx74jCHuAYceUBIxvptW6cbBX47FrtoKe_eHAmUPhUeXnzQ1wvgRVyGPpjUqpE/s320/mac.jpg" width="320" /></a></div><br /><p><br /></p><p>In Australia, the Bureau Of Meteorology (BOM) created and maintains The Australian Climate Observations Reference Network-Surface Air Temperature (ACORN SAT) which <i>"provides the best possible dataset for analyses variability and change of temperature in Australia."</i></p><p>This dataset has also never had an independent audit despite claims that <i>"The Bureau's ACORN-SAT <b>dataset</b> and methods have been thoroughly peer-reviewed and 
found to be world-leading."</i> (<a href="http://www.bom.gov.au/climate/data/acorn-sat/" target="_blank">Link</a>)</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk35eRz-TTppeVrxo7DGW8D-Lzaf5Hkf4Bmk8xNRJmNDgpYTJwOwh22ftftUhbk-wO39xUNSsJVTEETPeR3eFw5y5DphmA9J3P51hOhYbqN3H5PaKGd-6xbZU7KpEl3VYuWLfkHbLdb4s/s616/aaaaaaa.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="279" data-original-width="616" height="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk35eRz-TTppeVrxo7DGW8D-Lzaf5Hkf4Bmk8xNRJmNDgpYTJwOwh22ftftUhbk-wO39xUNSsJVTEETPeR3eFw5y5DphmA9J3P51hOhYbqN3H5PaKGd-6xbZU7KpEl3VYuWLfkHbLdb4s/w391-h177/aaaaaaa.jpg" width="391" /></a></div><br /><p>The review panel in 2011 assessed the <b>data analysis methodology,</b> and compared the temperature trends to <i>"several global datasets"</i>, finding they <i>"exhibited essentially the same long term climate variability"</i>. This <i>"strengthened the panels view</i>" that the dataset was <i>"robust"</i>. (<a href="http://www.bom.gov.au/climate/data/acorn-sat/documents/ACORN-SAT_IPR_Panel_Report_WEB.pdf" target="_blank">Link</a>)</p><p><b><br /></b></p><p><b>Benford's Law</b></p><p><b>It has been shown that temperature anomalies conform to Benford's law</b>, as do a large number of natural phenomena and man-made data sets. (Benford's Law In The Natural Sciences, M.Sambridge et al 2010)</p><p>Benford's law has been widely applied to many varied data sets for statistical fraud and data integrity analysis, yet surprisingly has never been used to analyse climate data.</p><p>Some examples of Benfords Law: <i>(Hill, 1995a; Nigrini, 1996; Leemis, Schmeiser, and Evans, 2000; Bolton and Hand, 2002; Applying Benford’s law to detect fraudulent practices in the banking industry Theoharry Grammatikos a∗ and Nikolaos I. 
Papanikolaou 2015; Benford’s Law in Time Series Analysis of Seismic Clusters Gianluca Sottili 2015; Schräpler, Jörg-Peter (2010): Benford's Law As an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel 2019; Using Benford’s law to investigate Natural Hazard dataset homogeneity Renaud Joannes-Boyau et al 2015; Identifying Falsified Clinical Data Joanne Lee, George Judge 2008; self-reported toxic emissions data (de Marchi and Hamilton, 2006), numerical analysis (Berger and Hill, 2007), scientific fraud detection (Diekmann, 2007), quality of survey data (Judge and Schechter, 2009), election fraud analysis (Mebane, 2011)</i></p><p><b>Benford's Law states that the leading digit will be a 1 with a probability of about 30.1%</b> for many naturally occurring datasets such as the length of rivers or distance travelled by hurricanes, street addresses, and also man-made data such as tax returns and invoices, making this a very useful tool for accounting forensics.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjk1bcNmah8QpX2G_HCO9byei6ZtOnVL3fBjfShWkIa4h0MCSClrDKCr-gQKR7_0Tlrdk9SmflDmSf17gIaelzAsTT4uGCOUmk_-JGf8_nVd8qqzaJTR0m8GXbSRUGz7CtSTxDhbnHisr4/s1285/nab.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="958" data-original-width="1285" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjk1bcNmah8QpX2G_HCO9byei6ZtOnVL3fBjfShWkIa4h0MCSClrDKCr-gQKR7_0Tlrdk9SmflDmSf17gIaelzAsTT4uGCOUmk_-JGf8_nVd8qqzaJTR0m8GXbSRUGz7CtSTxDhbnHisr4/s320/nab.jpg" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><span style="text-align: left;">To conform to Benford's Law, the </span><b style="text-align: left;">leading digit </b><span style="text-align: left;">takes the value of 1 about 30.1% of the time, the value of 2 about 17.6% of the time, and so on, see table below. 
So, for street addresses that follow the law, nearly half (47.7%) begin with a 1 or a 2. Essentially this means that in the universe there are more ones than twos, more twos than threes and so on.</span></div><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZSUH7h2tu0oZ8sNqrCwDc_BoSa59DVue-dnvV4_I7VZIbcdD8kGkyc-7ZgR145KvVA9NL-ZO606XSIWcVyDrxdKLhLBg_ANxnYdh7SsOJnNlSUv4xZJ8aC-gwslcKFO_Z0wDWmSfc-U0/s1667/Benfords-Law.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1667" data-original-width="1000" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZSUH7h2tu0oZ8sNqrCwDc_BoSa59DVue-dnvV4_I7VZIbcdD8kGkyc-7ZgR145KvVA9NL-ZO606XSIWcVyDrxdKLhLBg_ANxnYdh7SsOJnNlSUv4xZJ8aC-gwslcKFO_Z0wDWmSfc-U0/s320/Benfords-Law.jpg" /></a></div><br />Data conformance to Benford's Law can be visually checked by looking at the graph of the actual versus the expected frequencies, and statistically confirmed with a Chi-square test comparing expected frequencies with actual. The Kolmogorov-Smirnov test was used as a back-up confirmation. These tests were validated in this application for accuracy using Monte Carlo simulations. (<i>Two Digit Testing for Benford’s Law, Dieter W. 
Joensseny, 2013</i>)<div><br /></div><div>Scammer Bernie Madoff's financial returns are a great example and can be found here (<a href="https://nakedshorts.typepad.com/files/madoff_fairfieldsentry3x.pdf" target="_blank">Link</a>)<div><a href="https://excelmaster.co/benford-catches-madoff/" target="_blank">This website</a> has calculated the Benford curve for one digit and first two digit probabilities from Madoff's financial returns:</div><div><br /></div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxoEAmAfvm6_TUSbz6dhld9gU1inqNOtE_KMiPoVSLFTZVZeL0sUAzhdAQHgoekR58zx7vKUylOSKPbMcsRTE_u5WBXWDSRj01chi4-UW4ZyGO-T6o_ao4CmiW15l7e1jfobueZTnPRaM/s502/benford_madoff_fairfield_5.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="408" data-original-width="502" height="325" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxoEAmAfvm6_TUSbz6dhld9gU1inqNOtE_KMiPoVSLFTZVZeL0sUAzhdAQHgoekR58zx7vKUylOSKPbMcsRTE_u5WBXWDSRj01chi4-UW4ZyGO-T6o_ao4CmiW15l7e1jfobueZTnPRaM/w400-h325/benford_madoff_fairfield_5.jpg" width="400" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOFJZAEPWOeyS30hWAIMt8PX2-noUjm3D9kQq5mHasYvk9U6onJwQikvjS4ZqGdOa1gmlLX4Mx4Qt4tPcdZM5uAOenrxTsOOZwgObGh6pTQKWg6tmCrSctGR4wAtyCmvPDVk7yk0MUuDg/s502/benford_madoff_fairfield_7.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="408" data-original-width="502" height="325" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOFJZAEPWOeyS30hWAIMt8PX2-noUjm3D9kQq5mHasYvk9U6onJwQikvjS4ZqGdOa1gmlLX4Mx4Qt4tPcdZM5uAOenrxTsOOZwgObGh6pTQKWg6tmCrSctGR4wAtyCmvPDVk7yk0MUuDg/w400-h325/benford_madoff_fairfield_7.jpg" width="400" /></a></div><br /><div><div><div class="separator" style="clear: both; text-align: center;"><br /></div>The first 
graph shows the leading digit did not have enough ones, and that there were too many twos, threes, fours and fives.</div><div><br /></div><div>Using the <i>first digit and the second digit</i> adds more power to Benford's Law. (<i>Two Digit Testing for Benford’s Law Dieter W. Joenssen, University of Technology Ilmenau, Ilmenau, Germany 2013</i>)</div><div><br /></div><div>The second graph shows even more clearly the increase in power and how non-conforming Madoff's financials were by using first <i>and</i> second leading digits in the analysis.</div><div><br /></div><div>But for Benford's Law to apply, the data must span multiple orders of magnitude, and the numbers must not be constrained by an upper or lower limit. (<i>S.Miller, Benford's Law: Theory and Applications, 2015</i><span face="Verdana, Arial, Univers, Helvetica, sans-serif" style="background-color: white; font-size: 14.4px;">)</span></div><div><span face="Verdana, Arial, Univers, Helvetica, sans-serif" style="background-color: white; font-size: 14.4px;"><br /></span></div><div> Surface temperatures won't work with Benford because they are constrained - they may range from -30 to +50 C, for example. You won't find 99 C surface temps (unless you count the errors in the HadCRUT4 data set), so the <i>digits are constrained and therefore don't conform to Benford's Law.</i></div><div><br /></div><div><b>However, temperature anomalies DO conform to Benford's Law</b>. Malcolm Sambridge from the Australian National University in Canberra showed this - (<i>Benford’s law in the natural sciences, M. Sambridge, H. Tkalčić, and A. 
Jackson 2010</i>)</div><div><br /></div><div><b>What Are Temperature Anomalies?</b></div><div><span style="background-color: white; color: #202124;">The National Oceanic and Atmospheric Administration (NOAA) describes a temperature anomaly as:</span></div></div></div><div><span style="background-color: white; color: #202124;"><br /></span></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div><div><div><span face="arial, sans-serif" style="background-color: white; color: #202124;"><i><div>"A temperature anomaly is the difference from an average, or baseline, temperature.</div></i></span></div></div></div><div><div><span face="arial, sans-serif" style="background-color: white; color: #202124;"><i><div>A positive anomaly indicates the observed temperature was warmer than the baseline,</div></i></span></div></div><div><div><span face="arial, sans-serif" style="background-color: white; color: #202124;"><i><div>while a negative anomaly indicates the observed temperature was cooler than the baseline."</div></i></span></div></div></blockquote><div><div><br /></div><div>This means that temperatures above the average of a chosen baseline block of years are classified as warmer, and temperatures below this average as cooler. 
The "average" acts as a pivot point with above and below average anomalies clearly displayed in the Met Office plot below.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-DO9CM2j5EjrMgpeA9CeoZpDpn8eIushHtTCmqZEIiXwHtHqkHLGwI-c6spYr8_kiIhRBMU34PE-id9zD9OP8Y3mzFM37Xz8dL-2eSjHsnjPm-lq3vz9frfcBimqPODxWZgXx49z_jJM/s815/zzzzzqqqqq.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="524" data-original-width="815" height="230" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-DO9CM2j5EjrMgpeA9CeoZpDpn8eIushHtTCmqZEIiXwHtHqkHLGwI-c6spYr8_kiIhRBMU34PE-id9zD9OP8Y3mzFM37Xz8dL-2eSjHsnjPm-lq3vz9frfcBimqPODxWZgXx49z_jJM/w357-h230/zzzzzqqqqq.jpg" width="357" /></a></div><br /><div>The reason temperature anomalies are used is because it makes it easy to compare and blend neighbouring stations into a spatial grid. Climatologists claim anomalies are more accurate than temperatures. From NOAA website:</div><div><div><br /></div></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div><div><i>“Anomalies more accurately describe </i><i>climate variability over larger areas than absolute temperatures do, and they give a </i><i>frame of reference that allows more meaningful comparisons between locations and </i><i>more accurate calculations of temperature trends.”</i></div></div></blockquote><div><br /></div><div>In fact, anomalies <i><b>are in most cases less accurate</b></i> than temperatures in spatial grids. 
(<i>New Systematic Errors In Anomalies Of Global Mean Temperature Time Series, Michael Limburg, Germany, 2019</i>)</div><div><br /></div><div><b>Anomalies are widely used in climate analysis and do conform to Benford's Law, which gives us a very useful and powerful tool for auditing climate data.</b></div><div><br /></div><div><br /></div><div>----------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div><br /></div><div><b>Benford's Law Analysis:</b></div><div><b>Data Integrity Audit Of BOM Climate Data Using Benfords Law</b></div><div><b>And Statistical Pattern Exploration using R and JMP</b></div><div><br /></div><div>The Bureau Of Meteorology provides Raw and Adjusted data <a href="http://www.bom.gov.au/climate/data/">here</a>. <b>Raw Data</b> <i>"is
quality controlled for basic data errors"</i>. <b>Adjusted Data</b> "<i>has been developed specifically to
account for various changes in the network over
time, including changes in coverage of stations and
observational practices.</i>"</div><div><br /></div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFqJfw1RuTOb0SMC1XLSfF8CLwllzyc8IHw6Qn9Xzx9L9DEDyZQ2K1okxLa1I0kXpGM4TwPbhR7zPJSXP7HpePeEAegfl20-8DKZBug6B0bL1M0l6AVaJlVAxrU7rLPFd7kT-GpArpBs8/s1286/pharma.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="961" data-original-width="1286" height="299" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFqJfw1RuTOb0SMC1XLSfF8CLwllzyc8IHw6Qn9Xzx9L9DEDyZQ2K1okxLa1I0kXpGM4TwPbhR7zPJSXP7HpePeEAegfl20-8DKZBug6B0bL1M0l6AVaJlVAxrU7rLPFd7kT-GpArpBs8/w400-h299/pharma.jpg" width="400" /></a></div><br /></div><div><br /></div><div>The "adjustments" by BOM are called homogeneity adjustments and are meant to account for various "errors", although it has been argued that about half of the reported global warming is due to this homogenisation procedure. (<i>Investigation of methods for hydroclimatic data homogenization, E. Steirou and D. Koutsoyiannis, 2012</i>)</div><div><br /></div></div><div><br /></div><div><br /></div><div><b>Sydney Daily Max And Min Temperature Time Series</b></div><div>The daily temperature time series for Sydney max and min temperatures extends from 1910-2018, nearly 40 000 days.</div><div><br /></div><div> Temperature Anomalies are created for each temperature time series as per BOM methodology using R code. The Benford's Law analysis and conformance tests are also done using R code.</div><div>---------------------------------------------------------------------------------------------------------------------------</div><div><b>NOTE: </b>The Minimum and Maximum adjusted data are called Minv2 and Maxv2 respectively, and Min Raw and Max Raw are the Minimum and Maximum Raw daily data from 1910-2018 as supplied by BOM. 
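<div><br /></div><div>The anomalies in this post were built in R. Purely as an illustration, the step can be sketched in Python as below. The function name <i>daily_anomalies</i>, the (year, month, day, temp) record layout, the per-calendar-day mean and the 1961-1990 baseline window are all assumptions made for this example, not BOM's code or the code used for the graphs.</div>

```python
from collections import defaultdict


def daily_anomalies(records, base_start=1961, base_end=1990):
    """Return (year, month, day, anomaly) tuples.

    records: iterable of (year, month, day, temp); temp may be None for
    missing days. The baseline is the mean temperature of each calendar
    day (month, day) over the base period. This is an assumed, simplified
    recipe for illustration only.
    """
    records = list(records)  # we need two passes over the data
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, day, temp in records:
        if temp is not None and year in range(base_start, base_end + 1):
            sums[(month, day)] += temp
            counts[(month, day)] += 1
    baseline = {key: sums[key] / counts[key] for key in sums}

    out = []
    for year, month, day, temp in records:
        key = (month, day)
        if temp is None or key not in baseline:
            continue  # skip missing values and days with no baseline
        # Round to 0.1 degree, matching the precision of the observations
        # (an assumption of this sketch).
        out.append((year, month, day, round(temp - baseline[key], 1)))
    return out
```

<div>Calling <i>daily_anomalies</i> once per series (Maxv2, Minv2, Max Raw, Min Raw) would give the four anomaly series that are then passed to the digit tests.</div>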
</div><div>--------------------------------------------------------------------------------------------------------------------------</div><div><b>Benford's Law NOTE: </b>Temp anomalies are used for Benfords Law first digit and first two digits test. In the first digit test, <i>only the leading digit is used after the - + or 0 are stripped away.</i> In other words, according to Benfords Law, leading digit 0 is thrown away, as is - or + signs. Only digits 1-9 are used.</div><div><br /></div><div><i>In the Benfords Law two digit test, only the leading two digit values (10-99) are used after stripping out - or + or leading 0. </i></div><div>----------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div><b>Below: All Sydney days for Maxv2 data with first digit Benford's law test, expected (dotted red line) versus actual frequency.</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8_1jLqxZVjNLi5tYwdl2zyaVzpZpc0nDwo7LiOErAslFNPXuXkOfA4F8ltA2afk-XsDlW8M7lBNOX0unlG11AuZZ4RQI5Icddq-nUBKF9fzucf52IH0H7xcEjsXea7rMYlT7DjKoxzaQ/s957/maxv21digit.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="957" data-original-width="956" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8_1jLqxZVjNLi5tYwdl2zyaVzpZpc0nDwo7LiOErAslFNPXuXkOfA4F8ltA2afk-XsDlW8M7lBNOX0unlG11AuZZ4RQI5Icddq-nUBKF9fzucf52IH0H7xcEjsXea7rMYlT7DjKoxzaQ/w640-h640/maxv21digit.jpg" width="640" /></a></div><div><br /></div><b>Above</b>: The first (leading) digit for the <i>complete</i> <b>Daily</b> <b>Sydney Maximum Adjusted Temps (maxv2) </b>from 1910-2018, nearly 40 000 days. 
<i>The red dotted line is the expected, the bars are the actual.</i></div><div><br /></div><div>It shows a weakly conforming curve to Benford's Law <b>over the full data set</b>, but with too few one's and too many three's and four's overall. This curve fails the chi-square test with a very small p value but is "weak" according to the Nigrini MAD index. To gain more power, the first two digits are used in the next Benford Test below.<div><br /></div><div><b>Below</b>: <b>Maxv2 with first 2 digit Benfords law test, expected and actual frequencies.</b><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAc5UJ7BKx9qUaNl4-d6fJ86H9VA-Gn0koR_ss2BrJdpkvQCuf1y0716YDVw4gG_LgxrIZOvT7Mx7qZUzFaZy0b9nUqLkygalA3IL2IqfIqG_2lwIiuXlkVE6sKYgZqRiFqn69YcBtuGU/s957/sydmaxv2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="957" data-original-width="956" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAc5UJ7BKx9qUaNl4-d6fJ86H9VA-Gn0koR_ss2BrJdpkvQCuf1y0716YDVw4gG_LgxrIZOvT7Mx7qZUzFaZy0b9nUqLkygalA3IL2IqfIqG_2lwIiuXlkVE6sKYgZqRiFqn69YcBtuGU/w640-h640/sydmaxv2.jpg" width="640" /></a></div><br /><div><br /></div></div><div><b>Above:</b> This shows a better picture why the data fails conformance. The first two digits test gives a more complete picture and is more powerful. The data set also fails the conformance tests with two digits. 
You can clearly see some digits are in use too much and some too little.</div><div>These are the values of the first two digits flagged by the software for the biggest deviations from expected:</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both;">digits<span style="white-space: pre;"> </span>absolute.diff</div><div class="separator" style="clear: both;">17<span style="white-space: pre;"> </span>317.3937215</div><div class="separator" style="clear: both;">38<span style="white-space: pre;"> </span>255.1850009</div><div class="separator" style="clear: both;">37<span style="white-space: pre;"> </span>204.2152006</div><div class="separator" style="clear: both;">10<span style="white-space: pre;"> </span>203.807979</div><div class="separator" style="clear: both;">27<span style="white-space: pre;"> </span>172.6250801</div><div class="separator" style="clear: both;">82<span style="white-space: pre;"> </span>172.5622119</div><div class="separator" style="clear: both;">42<span style="white-space: pre;"> </span>170.4305132</div><div class="separator" style="clear: both;">22<span style="white-space: pre;"> </span>165.9444006</div><div class="separator" style="clear: both;">85<span style="white-space: pre;"> </span>159.0889232</div><div class="separator" style="clear: both;">19<span style="white-space: pre;"> </span>151.2663636</div></div><br />There are far too many 17's, 38's and 37's, 42's and 85's. Looking at the curve, you can see systematic increase with "blocks" of numbers. There are too few 10's as well. The numbers seem to be in blocks of two's and three's, either too many or too few. Overall, as seen in both graphs, the mid range and larger numbers are over used. 
The Maxv2 data is non-conforming to Benford's distribution.</div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><div><b>Minv2</b></div><div><b>Below: The</b> <b>Daily</b> <b>Minimum Temperatures Adjusted (Minv2)</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKg1lf_a9Xeu1GK2gumVJnZgalrpCG67dvXh2DdZ55qXqbPWF2_OCPG2EmferaHdlUqEVRuPwa38kzha-ArmYhiZvixW_ixmTN3FjVFKKOVDQb-M41TRbWPKYExzhVycgV7YBlXbdfyr8/s1210/sydmin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="984" data-original-width="1210" height="520" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKg1lf_a9Xeu1GK2gumVJnZgalrpCG67dvXh2DdZ55qXqbPWF2_OCPG2EmferaHdlUqEVRuPwa38kzha-ArmYhiZvixW_ixmTN3FjVFKKOVDQb-M41TRbWPKYExzhVycgV7YBlXbdfyr8/w640-h520/sydmin.jpg" width="640" /></a></div><div><b>Above</b>: The minimum adjusted temps (minv2) for the first digit fail the chi-square conformance test with a small p value below our 0.005 cutoff. The fit is worse than in the maximum temperatures graph, for both the single and double digit tests using the complete data.</div><div><br /></div><div>There are too many 1's, 2's and 3's, with 4-9's being scarce. 
This shows that the numbers from 4-9 are underused and 1-3 are overused in this dataset.</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><b>Lower numbers get higher frequency than expected in Minv2 thus upward warming trend.</b></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmnXWqSEUtLL5e2gxYqJ2d3fgQfKQb2nSryo3PeM3tM9R5eKaxT0VaNoCNP7EBCUfC3Vb-uYdXe-20sOz3JMssV-ib5AiCHY5Nz6QnKwr3NsMXsk2cR6BlXw1YXtr5nlROZUQjlH3vVqk/s773/minv2trends.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="773" height="351" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmnXWqSEUtLL5e2gxYqJ2d3fgQfKQb2nSryo3PeM3tM9R5eKaxT0VaNoCNP7EBCUfC3Vb-uYdXe-20sOz3JMssV-ib5AiCHY5Nz6QnKwr3NsMXsk2cR6BlXw1YXtr5nlROZUQjlH3vVqk/w400-h351/minv2trends.jpg" width="400" /></a></div><br /><div><br /></div><div><br /></div><div><b>Below: The</b> <b>Daily</b> <b>Minimum Temperatures Adjusted (Minv2), first 2 digits Benford's test.</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDhHYnEa_pHUx8twwJ3zLZUGNVbil3vEKNZSZ1mHKz2MBi3uIMsIWMH0DWcnMO6rdK-JseboIS3sSqh_aS68kwMQRH9aRnOmc-xnAqBWwl_CKFH7I_Cxrx-ql_xUlJ7XHPdyPhMtYw-Zs/s1235/sydmin2digitall.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="977" data-original-width="1235" height="506" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDhHYnEa_pHUx8twwJ3zLZUGNVbil3vEKNZSZ1mHKz2MBi3uIMsIWMH0DWcnMO6rdK-JseboIS3sSqh_aS68kwMQRH9aRnOmc-xnAqBWwl_CKFH7I_Cxrx-ql_xUlJ7XHPdyPhMtYw-Zs/w640-h506/sydmin2digitall.jpg" width="640" /></a></div><div><b>Above</b>: This Minv2 graph for 2 digit test is more dramatic -- it clearly show how <i>Peter has been robbed to pay Paul</i> -- the higher numbers 
from 40-90 or so have been reduced in frequency, the lower numbers around 15-38 have been increased in frequency.</div><div><br /></div><div>The difference from Benford's Law here is striking. The data has a very large bias, and this is with a large data sample of nearly 40 000 days. <i>This has the potential to be more extreme when looking at specific months.</i></div><div><br /></div><div>----------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div>Let's separate the Maxv2 data into positive and negative anomalies (above/below average) before the + and - signs are stripped away to test for Benford's. <i>This will show us if the digit distribution changes between the above average and below average Maxv2 groups.</i></div><div><br /></div><div><b>Below: </b></div><div><b>Sydney Maxv2 Data, ONLY Positive Temp Anomalies Tested.</b></div><div>Looking at only the positive Maxv2 anomalies, i.e. when temperature anomalies are above-average<i>,</i> there is a <b>greater lack of conformance to Benford's Law</b>. </div><div><i><br /></i></div><div><i>Particular numbers have increased and decreased with regularity</i>; there is nothing "natural" in this number distribution. This appears to be data tampering in the<b> resultant</b> above-average temp anomalies. 
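<div><br /></div><div>As a rough sketch of the digit handling described in the notes above (sign, decimal point and leading zeros stripped, then the split into above and below average groups), something like the following Python would do. The names <i>leading_digits</i> and <i>digit_counts_by_sign</i> are hypothetical, and padding a single significant digit with a zero (0.3 counted as 30 in the two digit test) is a choice made for this example; the R code behind the graphs may treat such values differently.</div>

```python
from collections import Counter
from decimal import Decimal


def leading_digits(value, n=1):
    """First n significant digits of value as an int, or None for zero.

    Sign, decimal point and leading zeros are ignored, so -0.37 gives
    3 for n=1 and 37 for n=2.
    """
    text = format(abs(Decimal(str(value))), "f")  # plain notation, no exponent
    digits = "".join(ch for ch in text if ch.isdigit()).lstrip("0")
    if not digits:
        return None  # exact zero has no leading digit
    # Pad with zeros when fewer than n significant digits are written,
    # e.g. 0.3 is treated as 0.30 (an assumption of this sketch).
    return int(digits[:n].ljust(n, "0"))


def digit_counts_by_sign(anomalies, n=2):
    """Count leading n-digit values separately for positive and negative anomalies."""
    counts = {"positive": Counter(), "negative": Counter()}
    for a in anomalies:
        d = leading_digits(a, n)
        if d is None:
            continue  # zeros belong to neither group
        counts["positive" if a > 0 else "negative"][d] += 1
    return counts
```

<div>The two counters can then each be compared with the expected Benford frequencies, which is what the positive-only graphs do.</div>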
</div><div><br /></div><div>Higher numbers have more dramatically<i> increased in frequency</i> in the Maxv2 data.</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS-rbgf72Eul7xJkmxev7-7mWABNWLdvxfzPAWqHQ7VG2OwwNmJ2xKJhTQLTG_amFv2YfIZ1nQFMKGN8tweV0GTgkFD9r9f48FPrw3RioKaKdh5ze2F8B8DPfj3Gqp_ydCpCJybJmnlp0/s1199/maxv2anomBenfordPOS.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="980" data-original-width="1199" height="524" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS-rbgf72Eul7xJkmxev7-7mWABNWLdvxfzPAWqHQ7VG2OwwNmJ2xKJhTQLTG_amFv2YfIZ1nQFMKGN8tweV0GTgkFD9r9f48FPrw3RioKaKdh5ze2F8B8DPfj3Gqp_ydCpCJybJmnlp0/w640-h524/maxv2anomBenfordPOS.jpg" width="640" /></a></div><div><b>Above</b>:<b> ONLY POSITIVE temp anomalies for Maxv2.</b></div><div>You can clearly see the spikes where numbers appear too often and the gaps where they are too sparse.</div><div><br /></div><div><b> </b></div><div><br /></div><div><b>What About Above-Average Minimum Temps?</b></div><div>The biases are more evident, and more extreme, in the Sydney Minv2 temps. The higher numbers are reduced and the frequency of the lower numbers increased. </div><div>The <b>resultant above-average</b> temperature in the Minv2 data appears to have been tampered with quite dramatically. 
</div><div class="separator" style="clear: both; text-align: center;"><br /></div><b><div><b><br /></b></div>Below</b>: <b>ONLY POSITIVE temp anomalies for Minv2.</b><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfUFjb4sI5HivUZFaL58b-qWudy-dvdtZ9bTPe6FeWcermuD5XSb7-6jzXjOEMkBi9kZCGtJv3FPEocw9sU2lcXoTehqGE5h_hSdn2NBIfUWYZP_u6DTHZ8RKNt6mXjGtS6LrvN1M4Ku0/s1168/minv2anomBenfordPOS.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="986" data-original-width="1168" height="540" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfUFjb4sI5HivUZFaL58b-qWudy-dvdtZ9bTPe6FeWcermuD5XSb7-6jzXjOEMkBi9kZCGtJv3FPEocw9sU2lcXoTehqGE5h_hSdn2NBIfUWYZP_u6DTHZ8RKNt6mXjGtS6LrvN1M4Ku0/w640-h540/minv2anomBenfordPOS.jpg" width="640" /></a></div><br /><div><b>Results Of Min Max Temp Anoms + Benfords Law</b></div><div><b>Neither the Maxv2 nor the Minv2 temperature anomaly data conform to Benford's Law.</b> There are very large deviations from the expected Benford's curve, particularly when looking at only the positive anomalies for Minv2 and Maxv2. </div><div><br /></div><div>BOM claims that the homogeneity adjustments made to the Maxv2 and Minv2 data sets <i>are to "remove" biases of non climatic effects</i>, but they are doing the opposite - in fact very large biases are added, because <b>normal observational data with occasional corrections/adjustments would not look like this on data known to conform to Benford's Law. </b>With nearly 40 000 observations in the sample, the "adjustments" have to be <i>very large</i> to look like this. 
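<div><br /></div><div>For readers who want to reproduce the conformance checks, here is a minimal Python sketch of the two statistics quoted in this post, assuming the leading digit counts are already in a dictionary. The expected proportion for a leading value d is log10(1 + 1/d). The function names <i>benford_expected</i> and <i>benford_fit</i> are made up for this example; the results shown in the graphs came from R, not from this code.</div>

```python
import math


def benford_expected(n_digits=1):
    """Expected Benford proportions keyed by leading value (1-9 or 10-99)."""
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits
    return {d: math.log10(1 + 1 / d) for d in range(lo, hi)}


def benford_fit(counts, n_digits=1):
    """Return (chi_square, mad) for a dict of observed leading-value counts.

    chi_square is the usual sum of (observed - expected)^2 / expected;
    mad is the mean absolute deviation between observed and expected
    proportions, the quantity behind the Nigrini MAD index.
    """
    expected = benford_expected(n_digits)
    total = sum(counts.get(d, 0) for d in expected)
    if total == 0:
        raise ValueError("no usable leading digits in counts")
    chi_square = 0.0
    abs_dev = 0.0
    for d, p in expected.items():
        observed = counts.get(d, 0)
        chi_square += (observed - total * p) ** 2 / (total * p)
        abs_dev += abs(observed / total - p)
    return chi_square, abs_dev / len(expected)
```

<div>The chi-square value is compared against a chi-square distribution with 8 degrees of freedom for the first digit test and 89 for the first two digits test. Because chi-square grows with sample size, a series of nearly 40 000 days can fail it on fairly small deviations, which is one reason the MAD is reported alongside it.</div>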
</div><div class="separator" style="clear: both; text-align: center;"><br /></div><b><div><span style="font-weight: 400;">In any financial situation, this data would be flagged for a forensic audit; it suggests tampering.</span></div><div><b><br /></b></div><div><b><br /></b></div><div><b><br /></b></div>But What About RAW Data?</b><div><br /></div><div>But what about the raw temperature data? The BOM say they are "<i>unadjusted</i>" and are only subject to <i>"pre-processing"</i> and "<i>quality control</i>." (<a href="http://www.bom.gov.au/climate/data-services/content/quality-control.html" target="_blank">Link</a>) This consists of:</div><div><p><i>"To identify possible errors, weather observations received by the Bureau of Meteorology are run through a series of automated tests which include:</i></p><ul><li><i>‘common sense’ checks (e.g. wind direction must be between 0 and 360 degrees)</i></li><li><i>climatology checks (e.g. is this observation plausible at this time of year for this site?)</i></li><li><i>consistency with nearby sites (e.g. is this observation vastly different from nearby sites?)</i></li><li><i>consistency over time (e.g. is a sudden or brief temperature spike realistic?)"</i></li></ul><div>To test this, we will use the raw maximum and minimum temperature anomalies.</div></div><div>Let's start with <b>Maximum Raw Data</b>:</div><br /><div>We can see below that the Benford 2 digit test on <b>Maximum Raw Temp </b>Anomalies reveals extremely biased data, about as "unnatural" a distribution as you can get, with periodic spikes and dips. There is a man-made fingerprint here in the regularity. This <b>RAW</b> data fails a chi-square test for Benford conformance. 
</div><div><br /></div><div><b>Below:</b> <b>This is the Maximum Raw Temperature Anomalies with a two digit Benford test.</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiR4WgG6-xpBPkOTIcQMezDJcUq1zvfK5pD_ttvU-vA7h0QqzuuOpgKzSoZqAxSYGroM29gwgmF-iuQ-XrZfkPoWc806r8lb2e0bdZw5yLEtyXwVbI36rVcXL1yF4WSmyD7Om6zVOAEwnM/s910/maxrawzzzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="910" data-original-width="882" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiR4WgG6-xpBPkOTIcQMezDJcUq1zvfK5pD_ttvU-vA7h0QqzuuOpgKzSoZqAxSYGroM29gwgmF-iuQ-XrZfkPoWc806r8lb2e0bdZw5yLEtyXwVbI36rVcXL1yF4WSmyD7Om6zVOAEwnM/w620-h640/maxrawzzzzz.jpg" width="620" /></a></div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><b>Below:</b> This is the <b>Minimum Raw Temperature</b> Anomalies with a two digit Benford test. Again, biased data and definitely not raw observational data with minor preprocessing. Very cooked. Too many 15-47's, too few 10-13 and higher numbers, such as 59, 69, 79, 89. 
<div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDPpzQF7t-xAJ437bUHh67XIFTCaJIIc91za8sshlG9Ke_2APgn-m1ARb0yOCos5rGQW8VomYSntF1XDmjPNjjFqMQsJbB1sQdqtbGV6ZuOKxCecyTuWpKKlvWoOL0QJ-17hYfIaeS9tc/s937/minrawxxxx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="937" data-original-width="908" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDPpzQF7t-xAJ437bUHh67XIFTCaJIIc91za8sshlG9Ke_2APgn-m1ARb0yOCos5rGQW8VomYSntF1XDmjPNjjFqMQsJbB1sQdqtbGV6ZuOKxCecyTuWpKKlvWoOL0QJ-17hYfIaeS9tc/w620-h640/minrawxxxx.jpg" width="620" /></a></div><br /><div><b><br /></b></div><div><b><br /></b></div><div><b>Results Of Raw Temperature Anomalies Analyses And Benford's Law 2 Digit Test</b></div></div><div>The systematic tampering of particular digits forms periodic patterns.</div><div><b>The RAW data Min and Max is not raw, it is cooked. It is very cooked.</b></div><div><b><br /></b></div><div>The raw data fails all three checks of Benford's Law conformance: the chi-square test (with tiny p values), the Nigrini MAD index and the Kolmogorov-Smirnov test. </div><div><br /></div><div><br /></div><div><br /></div><div><b>Comparison With Other Climate Data Sets</b></div><div><i>"Berkeley Earth is a source of reliable, independent, non-governmental,</i></div><div><div style="text-align: left;"><i>and unbiased scientific data and analysis of the highest quality." (<a href="http://berkeleyearth.org" target="_blank">Link</a>)</i></div></div><div style="text-align: left;"><i><br /></i></div><div style="text-align: left;">Berkeley Earth has released its worldwide daily global temperature anomalies data set, with over 50 000 temperatures. 
It's not the Sydney daily data, it's global, but we can still make a quick comparison to see if the direction of the deviations is the same.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">In their global analysis (below) the same biases appear: the low numbers have been reduced in frequency and the same high numbers have been increased. What makes this stand out is how carefully the data has been manipulated above and below the expected frequency curve. The BOM data appears much more heavy handed. </div><div style="text-align: left;"> </div><div style="text-align: left;"><b>Below: Berkeley Earth Global Temp Anomalies, First 2 Digits.</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtGIebXf_NNo3JVZzhdlruOdlbARbV6NN0n-6chsM66D7dPOzZ3UG2cCVxq98Xm-MZ1stNEwVsl6M5D31Z-8CrtlXOxnRNtpZlM0jFq2eSNtJTBs6MPNHoJItHQwge_aItZM6-KeNXHdE/s1042/berkely2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="969" data-original-width="1042" height="596" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtGIebXf_NNo3JVZzhdlruOdlbARbV6NN0n-6chsM66D7dPOzZ3UG2cCVxq98Xm-MZ1stNEwVsl6M5D31Z-8CrtlXOxnRNtpZlM0jFq2eSNtJTBs6MPNHoJItHQwge_aItZM6-KeNXHdE/w640-h596/berkely2.jpg" width="640" /></a></div><div style="font-weight: bold;"><b>Above: Benford's 2 digit test shows increased frequencies of digits 40-90 and reduced frequencies of digits 10-35 in Berkeley Earth Global Anomalies.</b></div><div style="font-weight: bold;"><b><br /></b></div><div style="font-weight: bold;"><b><br /></b></div><div style="font-weight: bold;"><b><br /></b></div><div><b>Below:</b></div><div>Plotting increased frequencies by years against anomaly size.</div><div>Increasing the frequency of numbers increases their effect on the average. 
In this case, increasing a trend upwards.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmxtF0Yc9t6J_rDpNFFKaSNZY8fVGPqP1eW2Vk9dlx6r3_PG-p9h9t3CnEuZKxW4r9FcUEb7390rWpnndK-TudhCFlEQqoeN6gC0f7OkGeZMRJ-kzupeNlUv2mjU6jHgdK5gBOERtGmRY/s772/tavgzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="679" data-original-width="772" height="351" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmxtF0Yc9t6J_rDpNFFKaSNZY8fVGPqP1eW2Vk9dlx6r3_PG-p9h9t3CnEuZKxW4r9FcUEb7390rWpnndK-TudhCFlEQqoeN6gC0f7OkGeZMRJ-kzupeNlUv2mjU6jHgdK5gBOERtGmRY/w400-h351/tavgzzz.jpg" width="400" /></a></div></div><div><div><br /></div><div style="font-weight: bold;"><b><br /></b></div><div style="font-weight: bold;"><b><br /></b></div><div style="font-weight: bold;"><b><br /></b></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div style="text-align: left;"><b>Below</b>: <b>Nasa GISS Global Temp Anomalies.</b></div><div style="text-align: left;">Looking at Nasa GISS yearly world temperature anomalies below. Only first digit analysis can be done because the dataset is small, using averaged yearly anomalies, averages that are averaged. The worst of the lot. 
For entertainment purposes only.</div><div style="text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvIGw8ezI8wXd-SWbRS1-A3SVpvJUNcDjc4HqGyB7bsjEkhohuGROCuUa13JtDJITu2NfqNHAoKYFU8dwk1kP1Un9ElBd1OkibOepRT1F13L6yq4bMb33bf14V1EXjMz85y3arNaOkkh4/s1175/nasa.jpg" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="959" data-original-width="1175" height="522" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvIGw8ezI8wXd-SWbRS1-A3SVpvJUNcDjc4HqGyB7bsjEkhohuGROCuUa13JtDJITu2NfqNHAoKYFU8dwk1kP1Un9ElBd1OkibOepRT1F13L6yq4bMb33bf14V1EXjMz85y3arNaOkkh4/w640-h522/nasa.jpg" width="640" /></a></div><br /><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><b>Results Of Comparison</b></div><div style="text-align: left;">Although we weren't comparing the same thing (Sydney specific compared to global), the data from the other climate temperature providers shows the same biases in the same direction as the BOM data.</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><b>Difference Between RAW data and Adjusted data?</b></div><div style="text-align: left;">Extra temperature adjustments are added on top of raw in the BOM adjusted data sets.</div><div style="text-align: left;">These are the Adjusted data sets called Maxv2 and Minv2.</div><div style="text-align: left;">The adjustments are done by using "homogeneity" software, creating "adjusted" data sets. 
This is supposed to remove biases but instead adds biases, as was shown above with the Benford tests.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What has the homogeneity software done?</div><div style="text-align: left;">BOM claim the adjustments are small. They say that the adjustments are not needed to see the warming trends, which we know is true because looking at Raw above we know it's actually cooked -- the biases increase the frequency of large numbers and reduce the small ones in Max data, and the opposite is true in Min data. Natural numbers follow Benford's Law, the BOM ACORN data set does not.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's look at exact temperature <i>differences</i> between raw and adjusted.</div><div style="text-align: left;">This shows the result of the adjustments done to raw.</div><div style="text-align: left;">This is simply done by:</div><div style="text-align: left;">1: maxv2 - max raw</div><div style="text-align: left;">2: minv2 - min raw</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The outcome of this is that any time we get a <b>positive number, the adjusted temp is warmer than raw</b>, and when it's a <b>negative number, it is being cooled compared to raw.</b> </div><div style="text-align: left;">This lets us see what warming/cooling the BOM is adding on top of the "raw".</div><div style="text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUvlzPBFy2CnVtzSjMsWVeoK7rFYAAscFrU7ygAcof819nQQyQmSYhzcH0X_ZWuWkkdYhyT1Nfeg0grZWDQeKk3Lc8ULEzW34WXvBQd4qoQrbeLz9jA4NnXqvgy11wOG_I_hs1LSpxAmY/s1212/minmaxwarmingcomparedraw.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="868" data-original-width="1212" height="458" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUvlzPBFy2CnVtzSjMsWVeoK7rFYAAscFrU7ygAcof819nQQyQmSYhzcH0X_ZWuWkkdYhyT1Nfeg0grZWDQeKk3Lc8ULEzW34WXvBQd4qoQrbeLz9jA4NnXqvgy11wOG_I_hs1LSpxAmY/w640-h458/minmaxwarmingcomparedraw.jpg" width="640" /></a></div><br /><div style="text-align: left;"><b>Above:</b></div><div style="text-align: left;">The graph in blue represents the Maxv2 adjustments that warm raw.</div><div style="text-align: left;">The orange graph shows the Minv2 adjustments that warm raw.</div><div style="text-align: left;"><b>This is the extra warming done by software on top of Raw.</b> The adjustments are regularly updated and tweaked by BOM as the "<i>science changes</i>" and "<i>network changes</i>" are detected.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To plot the curves, <b>average</b> temperature values were used on the left vertical axis. <b>The actual values of how much the temps were modified are below</b>.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">In the blue curve we see that 1910-1920 had the most warming added to adjusted data. The actual data below tells us that the temps have been increased by around 3.5 degrees C on top of Raw. Around 1920-1940 it shot up again and then dropped again around 1980 and so forth. 
The orange curve tells a similar story with Minimum temps data.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The adjustments went to zero at the end of the time series, but we know that raw data has warming factored in already from the Benford analysis.</div><div style="text-align: left;"> </div><div style="text-align: left;"><b>The below</b> <b>graph shows data points with actual temp degrees of added warming</b><b>.</b></div><div style="text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCmoNEp5S9ZSlAFKz94VxmPSZ92PAhuB8QoX1cXr421DR36fHVmj7Pq8SlXCC9h5Ggu4xr_f7zRjUScH99ezJ_tNwpKTHBhDjsNOCzyGBQnguWAC89_E2BcLrZZ6Lo-2Rv_9vgbnE9gK0/s875/qqqqq.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="605" data-original-width="875" height="442" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCmoNEp5S9ZSlAFKz94VxmPSZ92PAhuB8QoX1cXr421DR36fHVmj7Pq8SlXCC9h5Ggu4xr_f7zRjUScH99ezJ_tNwpKTHBhDjsNOCzyGBQnguWAC89_E2BcLrZZ6Lo-2Rv_9vgbnE9gK0/w640-h442/qqqqq.jpg" width="640" /></a></div><div><br /></div><div><b>Above:</b></div>This graph shows the difference between maxv2 and raw, and between minv2 and raw, using the actual data points with no averaging. There are about 30 cases where maxv2 temperatures are increased 3 to 3.5 degrees C on top of raw.<div> <div>There is an outlier -- look at the data point in blue down near year 2000 on the horizontal axis.</div><div>It's nearly -8 C, in fact <b>it was cooled by 7.6 degrees C</b> at that point.</div><div><br /></div><div><br /></div></div><div><b>Which months are getting most of the warming from Raw to Adjusted data set?</b></div><div>In Maxv2, the months with the biggest warming over raw (3 degrees C or more in the above graphs) are January, February, October and June. 
Looking at the sheer number of times warming has been applied to each month, January, February, July and November stand out.</div></div><div><br /></div><div>In the Minv2 data set, the months that get most of the warming temperature wise are January, February and December. The months that are warmed most by number of adjustments are September, October and November.</div><div><br /></div><div>To investigate the different treatment over different months by BOM, I have separated all the days of January, then February and so on.</div><div><br /></div><div>There are roughly 3380 days in January from 1910-2018 (109 years of 31 days) so that will be our sample for Jan. All the months have over 3000 days.</div><div><br /></div><div>Monte Carlo simulations confirm the validity of using the chi-square test and the Kolmogorov-Smirnov test to check Benford's Law conformance on the first two digits at sample sizes over 2500. This means our sample size of over 3000 days is large enough for a two digit test. (<i>Two Digit Testing for Benford’s Law, Dieter W. 
Joenssen, 2013</i>)</div><div><br /></div><div><br /></div><div><b>Specific Months Using Benfords Law.</b></div><div><b>Below: JANUARY Maxv2 </b><b>Temp Anomalies</b><b> - First 2 digits Benfords Law</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg16ydiYa2y-WwFxROGtwXIgbhO0Rp6RRot3pJurn7YHj_SQ5gMN5ZcLK5sUC3dsES2BL5ZcY2gDMtL6UC5JvUbfAg7Ocs9vzUatU9V50LUBYQyqP3l1rS5Wri-jM8Y8FCWyliE0o-mco/s1129/jan.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="975" data-original-width="1129" height="552" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg16ydiYa2y-WwFxROGtwXIgbhO0Rp6RRot3pJurn7YHj_SQ5gMN5ZcLK5sUC3dsES2BL5ZcY2gDMtL6UC5JvUbfAg7Ocs9vzUatU9V50LUBYQyqP3l1rS5Wri-jM8Y8FCWyliE0o-mco/w640-h552/jan.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div><b>Below: FEBRUARY Maxv2 </b><b>Temp Anomalies</b><b> </b><b>- First 2 digits Benfords Law</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXYCNYv6eghkL_-TSgdBoCG7_B93gW1XT5KZRuDWnbUMEIvufRKd4Df2bd3ThqVw4-EWq4sgIU5HPbP1IBGcA6F_strL-9ww9Tbxx4b2G7A_X12FxFJz0aIEiSc3KG5g3EDIXwXg0dK0c/s1103/feb.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="942" data-original-width="1103" height="546" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXYCNYv6eghkL_-TSgdBoCG7_B93gW1XT5KZRuDWnbUMEIvufRKd4Df2bd3ThqVw4-EWq4sgIU5HPbP1IBGcA6F_strL-9ww9Tbxx4b2G7A_X12FxFJz0aIEiSc3KG5g3EDIXwXg0dK0c/w640-h546/feb.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><div class="separator" 
style="clear: both; text-align: center;"><br /></div><b>Below: </b><b>JANUARY Minv2 Temp Anomalies - First 2 digits Benfords Law</b><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4m1mmHJPgxlg3H1oLXhselm59yPNkM31goETFVTO6brs7GalUyaJkWq1zeviMkNvhan8KFmPdez_iNOCfQPXy-VgwnOl17tS0pLw1_lEwl8AD0DKV7xofBHFN64_G-sZYMjo2xCEADWI/s1154/janmin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="979" data-original-width="1154" height="542" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4m1mmHJPgxlg3H1oLXhselm59yPNkM31goETFVTO6brs7GalUyaJkWq1zeviMkNvhan8KFmPdez_iNOCfQPXy-VgwnOl17tS0pLw1_lEwl8AD0DKV7xofBHFN64_G-sZYMjo2xCEADWI/w640-h542/janmin.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><b>Below: </b><b>FEBRUARY Minv2 </b><b>Temp Anomalies</b><b> </b><b>- First 2 digits Benfords Law</b><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDTEvXgKt-ftxpSfJ0mZEAYoh6Uy0lZYWYU02VQJeYePuOYCadfREqZUNsShj16YTizvrEe6fZOJqjkSQA8YgCKBDYP1IQ6mYQtbFUhRzpdTS4b36T3Lnqmh7SLLaOgoEUU4QerPyOfIo/s1148/febmin.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="973" data-original-width="1148" height="542" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDTEvXgKt-ftxpSfJ0mZEAYoh6Uy0lZYWYU02VQJeYePuOYCadfREqZUNsShj16YTizvrEe6fZOJqjkSQA8YgCKBDYP1IQ6mYQtbFUhRzpdTS4b36T3Lnqmh7SLLaOgoEUU4QerPyOfIo/w640-h542/febmin.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div 
class="separator" style="clear: both; text-align: center;"><br /></div><div><b>Benford's Law Results For Individual Months:</b></div><div>Every individual month comprehensively fails the chi-square and Kolmogorov-Smirnov tests of conformance to Benford's Law. The p value is reported as less than 2.2e-16 in most cases, a tiny number. All the months exhibit very large biases. The lack of conformance to Benford's Law is extreme. These results would red flag any financial data set for a forensic audit. This signals very large data tampering.</div><div><br /></div><div><br /></div><div>*********************************************************************************</div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>University Of Edinburgh Bayesian R Code</b></div><div class="separator" style="clear: both; text-align: left;"><b>Tracking Data Conformance to Benford's Law Over Time.</b></div><div class="separator" style="clear: both; text-align: left;">Running the model below on our daily temperature anomalies data sets from 1910-2018 tracks Benford's Law conformance of the first digit <i>over time</i>. This would tell us at what point (what year) the data was modified, and by how much. </div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">It has been shown by Miguel De Carvalho to be <b>more</b> accurate than empirical methods of evaluation because it avoids the discretisation effect. (<a href="https://journals.plos.org/plosone/article/authors?id=10.1371/journal.pone.0213300" target="_blank">Link</a>)</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Miguel and Junho kindly tweaked the software and sent me the R code to run on BOM data sets to create their superb time varying graphs. 
This shows you exactly when a change was made to the data.</div><div class="separator" style="clear: both; text-align: left;">They used it to track the homogeneity of a data set recording the distance travelled by hurricanes over the years. They used it to show that <i>data in recent years was less homogeneous!</i></div><div class="separator" style="clear: both; text-align: left;">Their paper is linked below.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><a href="https://journals.plos.org/plosone/article/authors?id=10.1371/journal.pone.0213300" target="_blank">Miguel De Carvalho and Junho Lee</a> from the University of Edinburgh have created a state-of-the-art Bayesian time-varying model "<i>that tracks periods at which conformance to</i></div><div class="separator" style="clear: both; text-align: left;"><i>Benford’s Law is lower. Our methods are motivated by recent attempts to assess how the</i></div><div class="separator" style="clear: both; text-align: left;"><i>quality and homogeneity of large datasets may change over time by using the First-Digit</i></div><div class="separator" style="clear: both; text-align: left;"><i>Rule."</i></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjl9GrQy3NgSsk960DmzGE4a8rHYIhILvH-9-B98Hytnr6s-zqfG5xhf-XO6oLa-4xn-48xIFngb1WxbWREZARWA-G86wAFWoyNmXRqovWGGeI99snStj7mpUGZe7Am7paTRZ3NXnq6EHI/s969/xmiguel.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="602" data-original-width="969" height="398" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjl9GrQy3NgSsk960DmzGE4a8rHYIhILvH-9-B98Hytnr6s-zqfG5xhf-XO6oLa-4xn-48xIFngb1WxbWREZARWA-G86wAFWoyNmXRqovWGGeI99snStj7mpUGZe7Am7paTRZ3NXnq6EHI/w640-h398/xmiguel.jpg" width="640" /></a></div><br 
/><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">As a first run, I ran the model over the Berkeley Earth Global Temperature Anomalies. This is the 50 000 sample data set from 1880-2018 referenced above.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">The software used the first digit of the temperature anomalies and tracked conformance to Benford's Law by year.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">The time varying output, graphed, was as follows:</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUVBiJNoU1Xeda0GbWxCX36O9SopBLmY_ifvxzJWEWkZpDJfJdmMaRS6gs9CDLAcIGBAk6RzcHvobWw6hqW4TiE7zf8oai_Lr9vAdkd9iQcIQcPZb-7vSNOQZa6D-78hYyr6uAbuAkrts/s1585/beglobal2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="960" data-original-width="1585" height="388" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUVBiJNoU1Xeda0GbWxCX36O9SopBLmY_ifvxzJWEWkZpDJfJdmMaRS6gs9CDLAcIGBAk6RzcHvobWw6hqW4TiE7zf8oai_Lr9vAdkd9iQcIQcPZb-7vSNOQZa6D-78hYyr6uAbuAkrts/w640-h388/beglobal2.jpg" width="640" /></a></div>The outputs show the posterior mean probability of the leading digit of the temperature anomalies taking each value from 1 to 9. 
The leading digit with value 1 has the biggest effect, with the probability going up at 110 years (the data starts in 1880, which makes this about 1990) and rising way past the dotted line, which is the expected value for leading digit=1. The digit 1 was under the expected dotted line for most of the time, going up and down, but 1990 was the critical point of a large increase.</div><div><br /></div><div>With digit=2 there is a small decrease at about 1890, then a levelling off where the probability is roughly what is expected, then it also dives at about year 110, which equals 1990 as well. This means value 2 is under used. This is similar for values 3, 4 and 5.</div><div><br /></div><div>It is difficult to see on this plot, but digits 7, 8 and 9 were over used from the 100 year mark (1980) with a gradual decline.</div><div><br /></div><div>The plots are difficult to see when shrunk, so for the Sydney model I have used the raw numbers output by the model and plotted those in JMP.</div><div><br /></div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTimqWuqQWy-FQVXaqziP6tEE5qBx69NwlU2ME-_LH0CKYmayYfPd988oepyvM3pn_PeseodhlHrzkLslH6DZxjUt6AnwA-V7-isqasdz59BrY7HtLK2kww6Azz3uRe36f2uuPQ0AmxHQ/s1920/zzzzz.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="969" data-original-width="1920" height="324" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTimqWuqQWy-FQVXaqziP6tEE5qBx69NwlU2ME-_LH0CKYmayYfPd988oepyvM3pn_PeseodhlHrzkLslH6DZxjUt6AnwA-V7-isqasdz59BrY7HtLK2kww6Azz3uRe36f2uuPQ0AmxHQ/w640-h324/zzzzz.jpg" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b>The above output shows the net effect of non conformance to Benford's Law with the leading digit.</b> The smooth SSD (smooth sum of squared deviations) statistic assesses overall conformance over nine digits with the First-Digit Rule in each year, 
<i>"which avoids overestimation of the misfit due to a discretization effect, whereas a </i><i>naive empirical SSD as in can be shown to be biased."</i> (<a href="https://journals.plos.org/plosone/article/authors?id=10.1371/journal.pone.0213300" target="_blank">Link</a>)</div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">This clearly shows that from around the 115 year mark (1880+115=1995) there has been a large upward trend in lack of homogeneity, measured by <i>lack of conformance to Benford's Law</i>. In other words, certain digits have been used excessively and some too sparsely in the leading digit values of the temperature anomalies of Berkeley Earth Daily Global Anomalies. The trend increases dramatically at 2008, suggesting much more data tampering in the latter years.</div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Sydney JUNE Maxv2 Bayesian Tracking 1910-2018</b></div>The time tracking Benford's Law conformance model was run over all the Sydney Maxv2 Daily temperature Anomalies from 1910 to 2018 for June. June was one of the months that seemed to get extra attention from the BOM with warming, shown in the difference between raw and adjusted, above. So it was worth checking overall conformance.</div><div><br /></div><div>The actual output from the model is a bit hard to see exactly when posted on this blog, so I used the raw numbers that are output by the model to graph it in JMP in large format.</div><div><br /></div><div>To recap -- the first digit of each temp anomaly was checked for the values 1-9, and was tracked over the years for conformance to the first-digit rule from Benford's Law. 
This shows conformance behaviour over years (time) for each leading digit value. <br /><div> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUBfYXhq8xp2oW6ch-d3ahD3O_wVeTuFLb_U8_M85MMq3WI1GgUxn9h5HE18rW3Plz-oEQ_RmfTHl8NvCN8olA3FiclK0TCURPwPHsWipzNufKbEUqiKpP9bYbYmsPmVL1PFQ015EYbnc/s1920/d1.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1042" data-original-width="1920" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUBfYXhq8xp2oW6ch-d3ahD3O_wVeTuFLb_U8_M85MMq3WI1GgUxn9h5HE18rW3Plz-oEQ_RmfTHl8NvCN8olA3FiclK0TCURPwPHsWipzNufKbEUqiKpP9bYbYmsPmVL1PFQ015EYbnc/w640-h347/d1.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: left;"><b>Above:</b> This is the leading digit with value 1. This has the largest effect, and the orange line is how often we expect to see value = 1. The blue is the actual variation from that. We see that 1's were over used till about 1940, were under used in the 1950's, increased in the 1980's, and then shot up in the late 1990's with high usage. 
The trend upwards is similar to Berkely Earth Globals above.</td></tr></tbody></table><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgul0qv6H9vfci-Kn_xiPtxnHR50Pqq9CvkfE9Z7NqGAIC1uNF16k29jDVWhWnDOEhldLa9IMtwtfseV8XpQClndfV33fubC37ZfLXsTCw7a7_PLiTwHJJ0xhqgQ4cjJguS6NtpWP6aMLk/s1920/d2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgul0qv6H9vfci-Kn_xiPtxnHR50Pqq9CvkfE9Z7NqGAIC1uNF16k29jDVWhWnDOEhldLa9IMtwtfseV8XpQClndfV33fubC37ZfLXsTCw7a7_PLiTwHJJ0xhqgQ4cjJguS6NtpWP6aMLk/w640-h348/d2.jpg" width="640" /></a></div><div><b>Above:</b> The first digit is now equal to 2 and the use was excessive around 1910, declined in the 1920's and was overused in the 1980's, and reducing in use in the last 5 years or so.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkzujnubxpJjA33i4aogTb0L3742CUcN8HRpdPux2USDOcDOx6OWSQ5ZLkIj1EDzr0U0S7ufkd2WouAI-J8Y8AJjYdElLNnfdIK6f6UIwdI1DWV7G3KhgEtmOXcEwHQwGiRCEKZigJom4/s1920/d3.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkzujnubxpJjA33i4aogTb0L3742CUcN8HRpdPux2USDOcDOx6OWSQ5ZLkIj1EDzr0U0S7ufkd2WouAI-J8Y8AJjYdElLNnfdIK6f6UIwdI1DWV7G3KhgEtmOXcEwHQwGiRCEKZigJom4/w640-h348/d3.jpg" width="640" /></a></div><b>Above:</b> Leading digit =3, use declined greatly from 1960's, although there was a leveling out in the 1990's before dropping down to around normal expected level.</div><div><br /><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh46Jo_M2N80ncmg-YMbBdgegAE25TrhNiYCKFy9a-WmbqZfVAMDpxEURMQmz-9_L0WYBov5YymgKbWdd7wZ0EGmnqwqTSJhnuB2bhfdaYaJIc8Cv-puRuzOyGURb-fjmDrKo2yv64FyF4/s1920/d4.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh46Jo_M2N80ncmg-YMbBdgegAE25TrhNiYCKFy9a-WmbqZfVAMDpxEURMQmz-9_L0WYBov5YymgKbWdd7wZ0EGmnqwqTSJhnuB2bhfdaYaJIc8Cv-puRuzOyGURb-fjmDrKo2yv64FyF4/w640-h348/d4.jpg" width="640" /></a></div><b>Above</b>: Leading digit = 4. Almost cyclical in use, and in decline in recent years.</div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5MN_TS1rL53gvhJUwP42i3O3McSj0khddoPSeAvPFHnWsOute36JTtOGHUXHNoDBqfxHjjZn62aLoB7A7sRRGsP8Xu2lPbtf9LHLK5XAuMaPXYymCbYsPFfe5nRoWUSA7czf5fUwrBok/s1920/d5.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5MN_TS1rL53gvhJUwP42i3O3McSj0khddoPSeAvPFHnWsOute36JTtOGHUXHNoDBqfxHjjZn62aLoB7A7sRRGsP8Xu2lPbtf9LHLK5XAuMaPXYymCbYsPFfe5nRoWUSA7czf5fUwrBok/w640-h348/d5.jpg" width="640" /></a></div><div><b>Above:</b> Leading digit = 5 shows complete under use throughout the years, with an increase in the 1990's but still below expected. 
</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQRoNN77R7nCc61AO96l3oiavraPVcmIbZKCAUJbJrbP0azV0Tt6qxMIX5fU5f3XrlExXVuYIaThegma4OswXK6VR__rv7-wWez8HLNBjna_IceZ5v1gCV4eupq7XR9PtS0Uq8mNqrCkI/s1920/d6.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQRoNN77R7nCc61AO96l3oiavraPVcmIbZKCAUJbJrbP0azV0Tt6qxMIX5fU5f3XrlExXVuYIaThegma4OswXK6VR__rv7-wWez8HLNBjna_IceZ5v1gCV4eupq7XR9PtS0Uq8mNqrCkI/w640-h348/d6.jpg" width="640" /></a></div><div><b>Above:</b> Leading digit = 6, shows under use and then a sharp increase in the 1950-1980's. It has been under used from the early 90's.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3J1JJ3Oim0CWoY6D4MKNsnvh9EhfzxUnaA7Nf1ZyBj5wO9zzxBoq8Tsi6UoYimUPKhGiP_M4pgyR6XZkn1yf1jyuCha4CDjDRR2SHjFdijZOzcdoD57F7eDZjHE2OhAeamehTsrxaxFo/s1920/d7.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3J1JJ3Oim0CWoY6D4MKNsnvh9EhfzxUnaA7Nf1ZyBj5wO9zzxBoq8Tsi6UoYimUPKhGiP_M4pgyR6XZkn1yf1jyuCha4CDjDRR2SHjFdijZOzcdoD57F7eDZjHE2OhAeamehTsrxaxFo/w640-h348/d7.jpg" width="640" /></a></div><div><b>Above</b>: Leading digit = 7, this shows excessive use in 1920's -- then a gradually declining use.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixdeaI8nLN5-ataYGI-AwCGD-pKS-f5L_3qJxPOYRlTfDiVgnT24d5cIZYMOowggEuBm3jpoKx0RzgIGaKFTM-M4iwXB_77FM0798xo3K_j_oafvjLjIX56-vou3WsyBahosMOz-xQxSU/s1920/d8.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" 
data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixdeaI8nLN5-ataYGI-AwCGD-pKS-f5L_3qJxPOYRlTfDiVgnT24d5cIZYMOowggEuBm3jpoKx0RzgIGaKFTM-M4iwXB_77FM0798xo3K_j_oafvjLjIX56-vou3WsyBahosMOz-xQxSU/w640-h348/d8.jpg" width="640" /></a></div><div><b>Above:</b> Leading digit = 8. The magical date of 1980, where so much happens in the climate world, comes into play again, with excessive use in the 1980's and the 90's.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiR3fbevnaPs7LhT1OBy7vB6_JTYlFboVlgj4jViSo_j_LGOIUksB1y__Bo_swdWaXSMiv0NHpie1gOBUiJKPqIDzswSyLWS1iCLklUvJ6471QsA_2kj_nm2NkJqPZtJ8xGZtKkSWwKLU/s1920/d9.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiR3fbevnaPs7LhT1OBy7vB6_JTYlFboVlgj4jViSo_j_LGOIUksB1y__Bo_swdWaXSMiv0NHpie1gOBUiJKPqIDzswSyLWS1iCLklUvJ6471QsA_2kj_nm2NkJqPZtJ8xGZtKkSWwKLU/w640-h348/d9.jpg" width="640" /></a></div><b>Above:</b> Leading digit = 9. This generally shows under use over the years.</div><div><br /></div><div>The net result of the posterior probabilities of all the digits is shown in the SSD curve. The overall lack of conformance to Benford's Law is <b>higher</b> than for Berkeley Earth, as can be seen on the left-hand vertical axis. The lack of conformance to Benford's Law is relatively flat, with slight cyclic variations around 1910, the 1930's and the 1950's, then gradually increases from the 1970's, with an accelerated increase in the last 5 years. 
That signals the worst lack of conformance, suggesting Benford's Law conformance has been getting worse in the last 5 years or so.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTWY9c5py3fBG-BqCrqUI0P5LDGVm2jxx525oXnO6EfNpLFogxzGuAK7Tk6XUqx1AzimaPL2CIQl50ZNY5S7x-gwARcgMSMzDgBdhzfc1KuzWzgTRJ0M9vZgm36D3gHy-VOvTLU4w-hNE/s1912/junemaxSSD.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1052" data-original-width="1912" height="352" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTWY9c5py3fBG-BqCrqUI0P5LDGVm2jxx525oXnO6EfNpLFogxzGuAK7Tk6XUqx1AzimaPL2CIQl50ZNY5S7x-gwARcgMSMzDgBdhzfc1KuzWzgTRJ0M9vZgm36D3gHy-VOvTLU4w-hNE/w640-h352/junemaxSSD.jpg" width="640" /></a></div><br /><div><br /><div class="separator" style="clear: both; text-align: left;"><b>Summary:</b> The use of leading digit value = 1 increases dramatically from the 2000's, causing negative anomalies in the June data set to be warmed. 
Overall lack of conformance to Benford's first-digit rule is worse than for the Berkeley Earth global data set, as shown by the left-axis values of the SSD graph.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div></div><div><div class="separator" style="clear: both; text-align: left;">----------------------------------------------------------------------------------------------------------------------------</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Statistical Analysis Of BOM Data Sets Without Benford's Law:</b></div><div class="separator" style="clear: both; text-align: left;"><b>Pattern Exploration, Trailing Digits And Repeated Numbers.</b></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Leaving Benford's Law behind, there are other tools that help to assess data quality and detect fraud.</div><div class="separator" style="clear: both; text-align: left;">Replication problems have been increasing in scientific studies, as has data fabrication.</div><div class="separator" style="clear: both; text-align: left;"><a href="http://Retractionwatch.com" target="_blank">Retractionwatch.com</a> lists hundreds of studies that have been retracted, many for data fabrication.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Uri Simonsohn at <a href="http://datacolada.com">datacolada.com</a> is a "data detective" who has been responsible for getting several <a 
href="https://www.the-scientist.com/the-nutshell/another-victim-of-suspicious-data-40740" target="_blank">big name professors</a> to retract their studies and resign from their posts for data fabrication. His website statistically tests and attempts to replicate studies causing many retractions. </div><br />The pharmaceutical industry is also actively involved in replication of studies-</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimN1xd_kqOgOffBBI4WMjMrrGoHagqy-QLLUNsLeA7eqpYoYnkyLb3sAu57F11nZoJBdV7trBgLfgAnT9jBOf72nL86LRsqHJaky1bAEsLDiAopOH26txi8uxvFCa4LzG__woMGn4iFf0/s1236/pharma.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="838" data-original-width="1236" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimN1xd_kqOgOffBBI4WMjMrrGoHagqy-QLLUNsLeA7eqpYoYnkyLb3sAu57F11nZoJBdV7trBgLfgAnT9jBOf72nL86LRsqHJaky1bAEsLDiAopOH26txi8uxvFCa4LzG__woMGn4iFf0/s320/pharma.jpg" width="320" /></a></div><br /><div><br /></div><div>The University Of Portland did an analysis of the trailing digits in <a href="https://web.williams.edu/Mathematics/sjmiller/public_html/math/talks/TheoryApplicationsBenford2.pdf" target="_blank">Tasmanian Climate</a> data taken from “<i>Proxy Temperature Reconstruction" data from “Global Surface Temperatures Over the Past Two Millenia" (Phil D. Jones, Michael E. 
Mann), </i>the infamous "climategate" dataset.</div><div><b><br /></b></div><div><div class="separator" style="clear: both; text-align: left;">They found:</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcrewLC_SDu8P_3dfo7Y0z56lyfpNq_VB4UXnTcjPTrlxd_rAbpxjFoE-uNU97DLPAYTD5oBk6z21A_y0P6sttc8M8FKL5cfbu656veIm108V8jyD83Py0UXySLJy6ORsngi5gMTu9PNU/s1208/port4.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="692" data-original-width="1208" height="229" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcrewLC_SDu8P_3dfo7Y0z56lyfpNq_VB4UXnTcjPTrlxd_rAbpxjFoE-uNU97DLPAYTD5oBk6z21A_y0P6sttc8M8FKL5cfbu656veIm108V8jyD83Py0UXySLJy6ORsngi5gMTu9PNU/w400-h229/port4.jpg" width="400" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Trailing Digit Analysis With Sydney BOM Daily Data Sets</b></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both;">Unlike the leading digit, which is logarithmically distributed in most data (<i>Durtschi et al., 2004</i>), the trailing digit is typically uniformly distributed (<i>Preece, 1981</i>)</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><br /></div></div><div class="separator" style="clear: both; text-align: left;">The 3rd digit of a number has a nearly uniform distribution with the 4th digit being close to uniform. The Sydney ACORN data is rounded to 1/10 of a degree, so the 3rd digit will be analysed. 
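<div><br /></div><div>As a rough illustration of how close to uniform the 3rd digit is, the sketch below computes its expected frequencies under the generalised Benford's Law. It assumes the data follow Benford's Law in the first place, and the function name is purely illustrative.</div>
<pre>
```python
import math


def third_digit_probabilities():
    """Return P(third significant digit = d) for d = 0..9 under Benford's Law."""
    probs = []
    for d in range(10):
        # Sum over every possible pair of leading digits, 10 to 99.
        p = sum(math.log10(1 + 1 / (10 * k + d)) for k in range(10, 100))
        probs.append(p)
    return probs


if __name__ == "__main__":
    for digit, p in enumerate(third_digit_probabilities()):
        print(f"digit {digit}: {p:.4f}")
```
</pre>
<div>The ten values all lie within about 0.002 of 0.1, which is why a uniform distribution is a reasonable reference for the 3rd digit.</div>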
</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">NOTE: The BOM thermometers, including their electronic thermometers, have a tolerance of 0.5 of a degree. This does not meet the WMO guideline of 0.2 of a degree.</div><div class="separator" style="clear: both; text-align: left;">(<i>The Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) Data-set Report of the Independent Peer Review Panel, 4 September 2011</i>)</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Trailing digit R code from the World Bank audit by <i>Jean Ensminger and Jetson Leder-Luis</i> is used here to test various months from the Sydney Minimum and Maximum temperature data sets, both raw and adjusted (<i>Measuring Strategic Data Manipulation: Evidence from a World Bank Project</i>). 
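<div><br /></div><div>For readers without R, the idea of the test can be sketched in a few lines of Python. This is not the Ensminger and Leder-Luis code, only a minimal illustration: the function names and the sample readings are made up, and the result is compared against the standard 5% critical value for 9 degrees of freedom instead of computing an exact p value.</div>
<pre>
```python
from collections import Counter

# Critical value of the chi-square distribution, 9 degrees of freedom, alpha = 0.05.
CHI2_CRIT_DF9_5PCT = 16.919


def trailing_digit(temp_c):
    """Tenths digit of a temperature recorded to 0.1 degrees C (e.g. 23.4 gives 4)."""
    return abs(round(temp_c * 10)) % 10


def chi_square_uniform(temps):
    """Chi-square statistic of trailing-digit counts against a uniform distribution."""
    counts = Counter(trailing_digit(t) for t in temps)
    expected = len(temps) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


if __name__ == "__main__":
    # Made-up illustrative readings, NOT BOM data: the digit 5 never appears.
    sample = [20 + d / 10 for d in (0, 1, 2, 3, 4, 6, 7, 8, 9)] * 40
    stat = chi_square_uniform(sample)
    print(f"chi-square = {stat:.1f}, reject uniformity: {stat > CHI2_CRIT_DF9_5PCT}")
```
</pre>
<div>A statistic above the critical value means the trailing digits are unlikely to come from a uniform distribution at the 5% level.</div>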
</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Only October will be graphed or the analysis would be too long.</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhERak37sMF-HxD9sRS-25a-zk6nLG0ih3mpum1j0L6fxUBsfh3Rywsp1AX_K2fLAFeCh7CAyabK3PRxXcumQIXuLeNPfVQS5j3-3Nm88MCDT1SfkO6pD4pSf9kWFbZCpu6LtTqOGlSO8E/s904/maxoctraw9.4e82.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="842" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhERak37sMF-HxD9sRS-25a-zk6nLG0ih3mpum1j0L6fxUBsfh3Rywsp1AX_K2fLAFeCh7CAyabK3PRxXcumQIXuLeNPfVQS5j3-3Nm88MCDT1SfkO6pD4pSf9kWFbZCpu6LtTqOGlSO8E/w596-h640/maxoctraw9.4e82.jpg" width="596" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b>Above: All the days for October,</b> about 3300 of them from 1910-2018. This is from the <b>Sydney Max Raw </b>data set, this is unadjusted data from the BOM. 
We are looking at the 3rd digit in all the raw temperatures (not anomalies) because this test is independent of Benford's Law and can thus be used directly on temperature data.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">It produces a Chi-square p value of 9.4e-82, a tiny number, meaning the null hypothesis that the distribution is uniform is rejected with very high significance.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Next, <b>October Maxv2</b>, the adjusted data set.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjD2BoRBmHKBdlUEw6h22J43Pq8p5KD5xXNBevhIyRprQaY38uSJaPS8XDYgW_YUPHP79lr59Q_JTRMOoi2saocqwcK_Tgof5ydzEaYT-gQ0Lq9rUp86V05O2vQmoi5YV8SSNW2ePHahIc/s904/octmax1.6e76.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="842" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjD2BoRBmHKBdlUEw6h22J43Pq8p5KD5xXNBevhIyRprQaY38uSJaPS8XDYgW_YUPHP79lr59Q_JTRMOoi2saocqwcK_Tgof5ydzEaYT-gQ0Lq9rUp86V05O2vQmoi5YV8SSNW2ePHahIc/w596-h640/octmax1.6e76.jpg" width="596" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b>Above:</b> Sydney October days 1910-2018 using the <b>Maxv2</b> adjusted data set. This also fails the test for a uniform distribution. 
The frequency of the 5 digit has increased dramatically compared with Raw.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">Next, October Min Raw Data.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjABjvBVTFAfFpttlAvDGEiTUGC0ciWWI8L00CITm1sR7RFWzQvupblqe49yi_CgcXoHHWFAscuKNb3AaEPLTgTwX4KKsV2XIOk6AeOjhg6vYPDQT1WCGo3J7OjhzYAXPTXSVvdortLGyo/s904/minrawoct7e78.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="842" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjABjvBVTFAfFpttlAvDGEiTUGC0ciWWI8L00CITm1sR7RFWzQvupblqe49yi_CgcXoHHWFAscuKNb3AaEPLTgTwX4KKsV2XIOk6AeOjhg6vYPDQT1WCGo3J7OjhzYAXPTXSVvdortLGyo/w596-h640/minrawoct7e78.jpg" width="596" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b>Above: Sydney Min Raw</b> data from 1910-2018, the roughly 3300 October days. This data is supposed to be unadjusted but fails the Chi-square test for uniform distribution. 
The 5 digit again has too low a probability in the 3rd position.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Next:</b> <b>October Minv2</b>.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQAF-8Iw4pgWtnIVMFY4k_86K1V7mp2JSmAMa579wZwaQ6WDkdPc9hIYaj0dl1KwZhNJFgeoejyr6QowXNteRNNNrDPvPZpLQ0bgDpm-d71ZGqYxrgFDdUb1qAd0ZpmQYhZcgACqwC4g8/s904/octmin1.697.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="904" data-original-width="842" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQAF-8Iw4pgWtnIVMFY4k_86K1V7mp2JSmAMa579wZwaQ6WDkdPc9hIYaj0dl1KwZhNJFgeoejyr6QowXNteRNNNrDPvPZpLQ0bgDpm-d71ZGqYxrgFDdUb1qAd0ZpmQYhZcgACqwC4g8/w596-h640/octmin1.697.jpg" width="596" /></a></div><br /><div class="separator" style="clear: both; text-align: left;"><b>Above: Sydney October Minv2</b> adjusted dataset fails to comply with a uniform distribution as well, with a p value as low as that of the raw data.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">----------------------------------------------------------------------------------------------------------------------------</div><div class="separator" style="clear: both; text-align: left;"><b>Note: </b>The 5 digit is often low, and this occurs in other months too. <b>This could be indicative of a double-rounding error, where the majority of temperature readings were done in Fahrenheit and rounded to 1/10 of a degree, then later converted to Celsius and rounded to 1/10 of a degree again. 
</b></div><div class="separator" style="clear: both; text-align: left;"><i><br /></i></div><div class="separator" style="clear: both; text-align: left;"><div class="separator" style="clear: both;">"Statistical methods, especially those concerned with assessing distributional changes or temperature extremes on <b>daily time-scales</b>, are sensitive to rounding, double-rounding, and precision or unit changes. Application of precision-decoding to the GHCND database shows that <i>63% of all temperature observations are misaligned</i> due to unit conversion and double-rounding, and that many time series</div><div class="separator" style="clear: both;">contain substantial changes in precision over time." <i>(Decoding the precision of historical temperature observations, </i><i>Andrew Rhines et al) </i></div></div><div class="separator" style="clear: both; text-align: left;">-----------------------------------------------------------------------------------------------------------------------------</div><div class="separator" style="clear: both; text-align: left;"> </div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><b>Result Of Trailing Digits Analysis For Sydney Daily</b></div><div class="separator" style="clear: both; text-align: left;"><b>Min Raw, Minv2, Max Raw and Maxv2 Data Sets</b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div>All months have some problem with trailing digits not conforming to a uniform distribution. The Min temperature data for winter months are worst, closely followed with the Max temperatures in December, January and February. 
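<div><br /></div><div>The double-rounding explanation in the note above can be checked with a small simulation. The sketch below is an idealised illustration only: it assumes every 0.1 F reading in a range occurs exactly once, and the function names are hypothetical. It uses integer arithmetic to avoid floating-point rounding surprises.</div>
<pre>
```python
from collections import Counter


def f_tenths_to_c_tenths(f_tenths):
    """Convert a reading in tenths of a degree F to the nearest tenth of a degree C.

    Integer arithmetic only: C*10 = (F*10 - 320) * 5 / 9, rounded to nearest.
    """
    return (10 * (f_tenths - 320) + 9) // 18


def trailing_digit_counts(f_low_tenths, f_high_tenths):
    """Count final digits in C for every 0.1 F reading in [low, high)."""
    counts = Counter()
    for f in range(f_low_tenths, f_high_tenths):
        counts[abs(f_tenths_to_c_tenths(f)) % 10] += 1
    return counts


if __name__ == "__main__":
    # Every possible reading from 32.0 F up to (not including) 122.0 F, used once each.
    counts = trailing_digit_counts(320, 1220)
    for digit in range(10):
        print(digit, counts[digit])
```
</pre>
<div>In this idealised setup the digits 0 and 5 each appear 50 times while every other digit appears 100 times, because 18 steps of 0.1 F map onto only 10 steps of 0.1 C. Real readings would not be spread this evenly, so the effect in actual data may be weaker or differ in detail.</div>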
</div><div><br /></div><div><b>Lack of uniformity in trailing digits is a classic marker of data tampering</b> (<i>Uri Simonsohn, <a href="http://datacolada.org/74">http://datacolada.org/74</a></i>)</div></div></div><div><br /></div><div><br /></div><div>----------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><div><br /></div><div><b>Pattern Exploration Of Sydney Daily Min Max Data Sets--</b></div><div><b>Looking for Duplication and Repeated Sequences-</b></div><div><b>Beyond Chance.</b></div><div><br /></div><div><br /></div><div>If sequences from the temperature data sets are duplicated over different years, or multiple days have duplicated temperatures beyond what can be expected from chance, we have found potential data integrity issues and possible tampering. </div><div><br /></div><div>We will be using a specialised software module from JMP to find duplicates and sequences repeated beyond chance. The software calculates the probability of an event happening by chance, considering the data set size, number of unique values and repetitions within the data set. 
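<div><br /></div><div>The general idea of such a search can be sketched as follows. This is not JMP's algorithm, which is not described here; it is a simplified illustration with hypothetical function names and made-up numbers. The rarity score treats each day as an independent draw from the observed value frequencies, which real temperatures are not, so it should be read as a rough guide only.</div>
<pre>
```python
import math
from collections import Counter, defaultdict


def find_repeated_runs(values, run_length):
    """Map each run of `run_length` consecutive values to the start indices where
    it occurs, keeping only runs that occur more than once."""
    seen = defaultdict(list)
    for start in range(len(values) - run_length + 1):
        seen[tuple(values[start:start + run_length])].append(start)
    return {run: starts for run, starts in seen.items() if len(starts) > 1}


def naive_rarity_bits(values, run):
    """Crude rarity score: -log2 of the chance that one other window matches `run`,
    treating days as independent draws from the observed value frequencies."""
    freq = Counter(values)
    total = len(values)
    return -sum(math.log2(freq[v] / total) for v in run)


if __name__ == "__main__":
    # Made-up illustrative series in tenths of a degree, NOT BOM data.
    series = [81, 79, 92, 85, 70, 66, 81, 79, 92, 85, 73, 64]
    for run, starts in find_repeated_runs(series, 4).items():
        print(run, starts, round(naive_rarity_bits(series, run), 1))
```
</pre>
<div>A score of n bits corresponds to odds of 1 in 2^n, the same scale as counting heads in a row: 16 bits is 1 in 65 536.</div>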
</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSkurd2cEorB1mndy8aiB8STgJ4ojOChxKtLcZlGzodOI0TI6ITcR89PZQEX8xuyNUpS7N8LMZmKKqAAOJFFrJjEWp63u-qDiZYi0LgHXYbkvkdMNqY9BIF4s8f-i1A08WO6-eafwb8V0/s1290/rare1.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="525" data-original-width="1290" height="260" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSkurd2cEorB1mndy8aiB8STgJ4ojOChxKtLcZlGzodOI0TI6ITcR89PZQEX8xuyNUpS7N8LMZmKKqAAOJFFrJjEWp63u-qDiZYi0LgHXYbkvkdMNqY9BIF4s8f-i1A08WO6-eafwb8V0/w640-h260/rare1.jpg" width="640" /></a></div><br /><div><b>Above: </b>Daily Min Raw Data Set For Sydney for December, about 3300 days.</div><div>Straight away we find a problem, a big one. The software flags that the temperatures for 15 days are exactly duplicated, to 1/10 C, and repeated in another year.</div><div><br /></div><div>It looks like a copy/paste somewhere in the 40 000 day time series, the sheer number of days probably being the reason this hasn't been picked up before.</div><div><br /></div><div>The software gives the probability of this happening by chance as <b>zero</b>.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgr8ltvYhM-npvxaHP7itUwV7jGoHLHTcCBcimbyN8hTm0wuV6kasUtijxXA9V5Y8t8HhJuhgdSGdTh_WaTJMyWizyPXAJFr2zXOlMl9-ez3hYRKeetvV1CQ6D4CQesq1Q4RVDX5VvfaiM/s813/minraw.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="813" data-original-width="212" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgr8ltvYhM-npvxaHP7itUwV7jGoHLHTcCBcimbyN8hTm0wuV6kasUtijxXA9V5Y8t8HhJuhgdSGdTh_WaTJMyWizyPXAJFr2zXOlMl9-ez3hYRKeetvV1CQ6D4CQesq1Q4RVDX5VvfaiM/w104-h400/minraw.jpg" width="104" /></a></div><br /><div>Looking at the December Min Raw data set, we can see that an exact 
sequence has been duplicated in the following year. Recall, this is RAW, unadjusted data with just basic data quality checks and preprocessing! This identical sequence also exists in the Minv2 data set.</div><div><br /></div><div><br /></div><div>But things get worse for<b> July daily temps for Sydney 1910-2018.</b></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnirpW9LE2fKBPzXu0rpv3NETfZuoc3AzbZOiK-laWlqV1u_cwmgPNxWeZzBUCDrm6OuT8ROqQmM9vl0WNFPNLI8gknPT7kb1b-wUxEBra4L8RCVTWN72Utrq8HhX3jREIdteL7NWCyPM/s1311/rare2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="496" data-original-width="1311" height="242" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnirpW9LE2fKBPzXu0rpv3NETfZuoc3AzbZOiK-laWlqV1u_cwmgPNxWeZzBUCDrm6OuT8ROqQmM9vl0WNFPNLI8gknPT7kb1b-wUxEBra4L8RCVTWN72Utrq8HhX3jREIdteL7NWCyPM/w640-h242/rare2.jpg" width="640" /></a></div><br /><div><b>Above:</b> Both <b>Min Raw and Minv2 for July</b> have 31 days, a complete month, "copy pasted" into another year. The probability of this happening by chance is zero again.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuvSNb00XPdDwsxRan8jWOs1m9mn6FeZTb71k3CBmwtSX_Xa8vJPp8DayFeev5cr7RMQ6w1U8zONR7-wF-H8ISRU5-SqrSMqIMZUoEnZlBEVtIgocPGLv1M-rrUcAF345uZPnty6phVAc/s749/rare3.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="749" data-original-width="314" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuvSNb00XPdDwsxRan8jWOs1m9mn6FeZTb71k3CBmwtSX_Xa8vJPp8DayFeev5cr7RMQ6w1U8zONR7-wF-H8ISRU5-SqrSMqIMZUoEnZlBEVtIgocPGLv1M-rrUcAF345uZPnty6phVAc/s320/rare3.jpg" /></a></div><br /><div><b>Above:</b> A snapshot of a full month being copy pasted into another year in both Sydney Minv2 and Min Raw data. 
Again, the Raw is supposed to be relatively untouched according to BOM. Yet this copied sequence gets carried over to the adjusted Minv2 set.</div><div><br /></div><div><br /></div><div>But there's more:</div><div><b>June Minv2 + Min Raw also have a full month of 30 days copy pasted into another year. </b></div><div><b><br /></b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4wZtLg0K2Erx4VyeD_TefFsunjpdCiftSS_4Bzb6QoODeuqnfLjJw1kut8YbgFVYA5Vw-xUzUg74Aozar1M08Aiv107OzqVpqGnsgkTRZBdt0u3jmHzS5TeXugGdfMGR0EDYIUSHw95w/s1256/rare4.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="518" data-original-width="1256" height="264" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4wZtLg0K2Erx4VyeD_TefFsunjpdCiftSS_4Bzb6QoODeuqnfLjJw1kut8YbgFVYA5Vw-xUzUg74Aozar1M08Aiv107OzqVpqGnsgkTRZBdt0u3jmHzS5TeXugGdfMGR0EDYIUSHw95w/w640-h264/rare4.jpg" width="640" /></a></div><br /><div><b>Above:</b> Sydney June Daily Temps, Minv2 + Min Raw duplicated 30 day sequence.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDGpGP7KYAYj1sj2r6PAAiY0imlEa_l9ukWAUbvwYTfcm0_6FsGNzYPvPsU43yU6fHMwnNui-KEYvvoGI7LodrSIAdJN93_3JdPj1V6IRzabwfBjs2XczZ5EEEub-7pWTJrGt7hfGErrI/s649/rae5.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="649" data-original-width="328" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDGpGP7KYAYj1sj2r6PAAiY0imlEa_l9ukWAUbvwYTfcm0_6FsGNzYPvPsU43yU6fHMwnNui-KEYvvoGI7LodrSIAdJN93_3JdPj1V6IRzabwfBjs2XczZ5EEEub-7pWTJrGt7hfGErrI/s320/rae5.jpg" /></a></div><br /><div><b>Above: </b>A duplicated 30 day sequence for June Minv2 and Min Raw.</div><div><br /></div><div><br /></div><div>There are also linear relationships between datasets too, suggesting linear regression being used 
from raw to adjusted. For example, a constant of 0.6 and a slope of 1 exist between Minv2 and Min Raw in January:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_R_wkalCt-OAb1PHOuMBxzZ6a7nmn6aP0Ih7kCHhALFSPyfKjk-URzOgVziWkqtejlJ9LG4gfaUCOOilQhXCofR0bMdYg9crD5U892wu31-4tUZh36zap5q84Va1HEwMKo43QOueSZ3Q/s344/lll1.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="344" data-original-width="289" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_R_wkalCt-OAb1PHOuMBxzZ6a7nmn6aP0Ih7kCHhALFSPyfKjk-URzOgVziWkqtejlJ9LG4gfaUCOOilQhXCofR0bMdYg9crD5U892wu31-4tUZh36zap5q84Va1HEwMKo43QOueSZ3Q/s320/lll1.jpg" /></a></div><br /><div>But in some sequences between raw and adjusted, the constant is 0.2 with slope 1, then 0.3 with slope 1, then 0.4 with slope 1, and so on in a regular pattern.</div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgk57Vk5_mx0GEApdotIBEMc-9RVlZk8MID0yPQqDNsAkTPn9oxGCrpAuBBpKVhZNeFnWAUf-ijFjyLNcVk9tId26PM_KkunkQxMOFcTb4L_LapLCd3HtqDbqWX9gGYkKYte5eKNC_MS7o/s484/linearx.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="402" data-original-width="484" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgk57Vk5_mx0GEApdotIBEMc-9RVlZk8MID0yPQqDNsAkTPn9oxGCrpAuBBpKVhZNeFnWAUf-ijFjyLNcVk9tId26PM_KkunkQxMOFcTb4L_LapLCd3HtqDbqWX9gGYkKYte5eKNC_MS7o/s320/linearx.jpg" width="320" /></a></div><b>Above:</b> Direct linear relationships between minimum and maximum adjusted daily temperatures in March.<br /><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDyoOFubFwNtwY0VtDcEFq6Xh7tOkS0ZaYxEGHNaJeI-uSseDxHcZHG4dFr2_sV0zyexnfFUXjlXeki5O2-mR3humMOzOrj4hqIctmqb8nJII_-kFsT5oU4_4kFmXwt4_-2k-NaUeP-wM/s373/lll2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="373" data-original-width="325" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDyoOFubFwNtwY0VtDcEFq6Xh7tOkS0ZaYxEGHNaJeI-uSseDxHcZHG4dFr2_sV0zyexnfFUXjlXeki5O2-mR3humMOzOrj4hqIctmqb8nJII_-kFsT5oU4_4kFmXwt4_-2k-NaUeP-wM/s320/lll2.jpg" /></a></div><br /><div><br /></div><div>There are also shorter duplicated sequences that are still fairly rare.</div><div>The sequence below, from the Maxv2 June data, has a rarity equivalent to 16 heads in a row, or roughly a 1 in 65 500 chance of occurring by chance.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF9De-N1mfz12aWs4733rIWcH2KqBQrGpO7rOrHhr8ZFFTiivIcjAKJd5UM711DABkoOqW6xHiI2pbry6T0qYDCY-rM0WebGj3Syy5PT3cdHIjxC1VGg-K0UHxUTV43bE8gBwj6WEnhJs/s296/rare77.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="296" data-original-width="205" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF9De-N1mfz12aWs4733rIWcH2KqBQrGpO7rOrHhr8ZFFTiivIcjAKJd5UM711DABkoOqW6xHiI2pbry6T0qYDCY-rM0WebGj3Syy5PT3cdHIjxC1VGg-K0UHxUTV43bE8gBwj6WEnhJs/s0/rare77.jpg" /></a></div><br /><div>In Minv2 September, the number of unique temperatures and the size of the dataset give a rarity of 15.3 for the sequence below, which equals a 1 in 40 300 chance of that event happening by chance.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLT9EWgHjej-odZGUFeIJvLk7G7RbmDviUxMoucBCK4mHcy91HGCuoqG6KSTMLEhVhW27F2vb3DOFVNapvvap685Nt7LE2tnI-Z4znCQZwpxMRjLLab491sVJP6SHhIjir-kjweNW-IO8/s251/rare88.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="251" data-original-width="202" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLT9EWgHjej-odZGUFeIJvLk7G7RbmDviUxMoucBCK4mHcy91HGCuoqG6KSTMLEhVhW27F2vb3DOFVNapvvap685Nt7LE2tnI-Z4znCQZwpxMRjLLab491sVJP6SHhIjir-kjweNW-IO8/s0/rare88.jpg" /></a></div><div><br /></div><br /><div>Looking at the complete Sydney Maxv2 dataset with 40 000 days and looking for sequences duplicated ACROSS MONTHS, two extreme cases with rarity scores of 16.5 which equals a 1 in 92000 chance pop up:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwTF2ue9Ep8d0li3nG7idvcHu4Hms1NZSsa-vZhFPcmwx-p7RJSYLHKV9RWYU53I8liZk58lr9KCQRGhiF8bgYmKgDrff-ytZtmWM2-Y4U061ldqxuloI0LusxqZubOragQ_DZ3cAM2B8/s289/rare999.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="235" data-original-width="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwTF2ue9Ep8d0li3nG7idvcHu4Hms1NZSsa-vZhFPcmwx-p7RJSYLHKV9RWYU53I8liZk58lr9KCQRGhiF8bgYmKgDrff-ytZtmWM2-Y4U061ldqxuloI0LusxqZubOragQ_DZ3cAM2B8/s0/rare999.jpg" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyR_8uXF1kMVisTBCcbSyOBOVfEg7KHqeBPaWo76KgDu1tvAF5Qy3iAFdl4y9O0SibpzKGbDO6qvoGVP-VnUJLJkFnvGDhxaos7gS5eyQ0vs9CFiFZlEpEpE_afgR3wFq3VLDnb8eLv04/s293/rare999bbbb.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="250" data-original-width="293" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyR_8uXF1kMVisTBCcbSyOBOVfEg7KHqeBPaWo76KgDu1tvAF5Qy3iAFdl4y9O0SibpzKGbDO6qvoGVP-VnUJLJkFnvGDhxaos7gS5eyQ0vs9CFiFZlEpEpE_afgR3wFq3VLDnb8eLv04/s0/rare999bbbb.jpg" /></a></div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsenODTpXF9CPOa5GzBJHapRpZk90N3WuBTMHZF9luKoE2KTh5MI8DvyzvktogZFtcmy0dMw0dGEfrQtHgunpt6WXOHZPKS0nomfiRpXBl5jlbM-Q3olmFiNEHS_aIa6wKWs-y6NYbdW0/s261/marchrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="261" data-original-width="218" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsenODTpXF9CPOa5GzBJHapRpZk90N3WuBTMHZF9luKoE2KTh5MI8DvyzvktogZFtcmy0dMw0dGEfrQtHgunpt6WXOHZPKS0nomfiRpXBl5jlbM-Q3olmFiNEHS_aIa6wKWs-y6NYbdW0/s0/marchrepeats.jpg" /></a></div><br /><div><b>Above:</b> A shorter yet still improbable sequence in March Maxv2 dailies. Only sequences with less than a 1 in 40 000 probability of occurring by chance are shown here; there are many more shorter sequences in the BOM data that are also unusual. For comparison, 2 cities in the Netherlands (De Kooy and Amsterdam) and 2 regions in the U.S. (the NW and SW regions from NOAA) were checked against the Sydney sequences, and none came close to the large number of rare events.</div><br /><div><br /></div><div><b>Results Of Pattern Exploration:</b></div><div><b>Sydney has sequences copied between months and years that have</b></div><div><b>zero probability of being a chance occurrence. The large number of shorter duplicated series is also improbable.</b></div><div><b><br /></b></div><div><b>There are multiple linear relationships between raw and adjusted data, suggesting linear regression adjustments between raw and adjusted.</b></div><div><b><br /></b></div><div>Generally speaking, the country data sets (not yet posted) are even worse than the Sydney data. 
Charleville has 2 months copied, Port Macquarie has large sequences copied, Cairns has January 1950 copied into December 1950. This exists in Raw Data and is carried over into adjusted data.</div><div><br /></div><div>The data has been tampered with. Missing data cannot be an explanation for copy/pasting sequences because:</div><div><br /></div><div>1 - Data is imputed via neural nets etc. In the climate industry, data is imputed via neighboring stations with close correlation.</div><div><br /></div><div>2 - Nearly all BOM data has some missing temps, and some data sets have years of empty spaces. There are over 200 in these data sets. Why would there be an attempt to conceal 1 month of missing temp sequences?</div><div><br /></div><div>Temperature records are being reported to 1/10 of a degree C. </div><div><b>Copy/pasting months into different years, or worse, <i>into different months</i> as has happened in other data sets, is data tampering.</b> This should not happen with time series data. See below:</div><div><br /></div><div>-------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div><div><i>Weather Data: Cleaning and Enhancement, Auguste C. Boissonnade; Lawrence J. Heitkemper and David Whitehead, Risk Management Solutions; Earth Satellite Corporation</i></div></div><div><div><br /></div><div>"CLEANING OF WEATHER DATA</div><div>Weather data cleaning consists of two processes: the replacement of missing values</div><div>and the replacement of erroneous values. These processes should be performed</div><div>simultaneously to obtain the best result.</div><div>The replacement of one missing daily value is fairly easy. However, the problem</div><div>becomes much more complicated if there are blocks of daily missing values. Such</div><div>cases are not uncommon, particularly several decades ago. 
The problem of data</div><div>cleaning then becomes a problem of replacing values by interpolations between</div><div>observations across several stations (spatial interpolation) and interpolations</div><div>between observations over time (temporal interpolation)."</div></div><div><br /></div><div>----------------------------------------------------------------------------------------------------------------------------</div><div>The Best For Last........</div><div><br /></div><div class="separator" style="clear: both; text-align: left;"><b>An Analysis Of Repeating Numbers In Climate Data.</b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div><div class="separator" style="clear: both; text-align: left;"><b><br /></b></div>Uri Simonsohn is, amongst other things, a "data detective" who specialises in statistical analysis of published studies. He attempts to replicate these studies and tests the data for tampering and fabrication.<div>He has produced a very useful tool for <a href="http://datacolada.org/77" target="_blank">forensic data analysis</a>.</div><div><br /></div><div>The R code is available from him to do what he calls a "number bunching" test -- this tests for repeated numbers that occur more often than expected for a particular data set. 
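<div><br /></div><div>To make the idea concrete, here is a minimal Python sketch of a number bunching check. It is not the R code from the linked site. It assumes the statistic is the "average frequency" (sum of squared counts of each distinct value, divided by the number of observations) and that the expected distribution is simulated by shuffling the tenths digits across the whole-degree parts. The names <i>average_frequency</i> and <i>bunching_z</i> are made up for this illustration, and temperatures are assumed to be stored as integer tenths of a degree (17.8 becomes 178).</div>

```python
import random
from collections import Counter
from statistics import mean, pstdev


def average_frequency(values):
    """Sum of squared counts of each distinct value, divided by the sample size."""
    counts = Counter(values)
    n = sum(counts.values())
    return sum(c * c for c in counts.values()) / n


def bunching_z(temps_tenths, n_boot=5000, seed=0):
    """Compare observed bunching with a shuffled-digit baseline.

    temps_tenths: temperatures as integer tenths of a degree (assumption).
    Returns (observed, expected mean, distance from the mean in std errors).
    """
    rng = random.Random(seed)
    whole = [t // 10 for t in temps_tenths]   # whole-degree part
    tenth = [t % 10 for t in temps_tenths]    # last digit
    observed = average_frequency(temps_tenths)

    simulated = []
    for _ in range(n_boot):
        shuffled = tenth[:]
        rng.shuffle(shuffled)                 # break any link between degree and last digit
        rebuilt = [w * 10 + d for w, d in zip(whole, shuffled)]
        simulated.append(average_frequency(rebuilt))

    mu, sd = mean(simulated), pstdev(simulated)
    z = (observed - mu) / sd if sd > 0 else float("nan")
    return observed, mu, z
```

<div>A large positive third value means the same temperatures repeat far more often than the shuffled baseline suggests. The choice of baseline matters a great deal, so treat this as a way to understand the test rather than a replacement for the published code.</div>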
</div><div><br /></div><div>I have used this code to test the bunching of repeat temperatures in the Sydney Daily Min Max Temperature time series.</div><div><br /></div><div><br /></div><div><br /></div><div><b>Problem:</b></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLpbgrghe3UB2M3SPGktQ2UMqjjtNrU6z-hKP48s7uz2ElJIgp6wc9tshlYWMoF9UbGW4RuzfEgQC81-NDAx1DCssB7EWdnAD-kc9QPcTb7ea8GbCyPP9SMQWtFvcTLkMSLWKoGKBY9yQ/s557/march1.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="490" data-original-width="557" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLpbgrghe3UB2M3SPGktQ2UMqjjtNrU6z-hKP48s7uz2ElJIgp6wc9tshlYWMoF9UbGW4RuzfEgQC81-NDAx1DCssB7EWdnAD-kc9QPcTb7ea8GbCyPP9SMQWtFvcTLkMSLWKoGKBY9yQ/s320/march1.jpg" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCZYiOmSWewQPHkTfuP8iQvLh-0CmE6LjbeEJiBnXado0K4yFQmhd-EndwuhDfDfQIr98mCNFIjgKrGHu82iEOyH-h87-zpD6p-iHQN92nGRET5JTu_woOa7X5PCt27lINxwtyosX04_4/s556/march2.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="484" data-original-width="556" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCZYiOmSWewQPHkTfuP8iQvLh-0CmE6LjbeEJiBnXado0K4yFQmhd-EndwuhDfDfQIr98mCNFIjgKrGHu82iEOyH-h87-zpD6p-iHQN92nGRET5JTu_woOa7X5PCt27lINxwtyosX04_4/s320/march2.jpg" width="320" /></a></div><br /><div><b>Above</b>: This example is from all the days in March in the Sydney daily Min Raw and Minv2 temp time series. <b>A massive increase in repeated numbers from raw to minv2!</b></div><div><br /></div><div>Looking at the bottom picture first, shows that the most repeated temp in this series was 17.8 and it was repeated 88 times. 
The next highest repeating temp was 18.3 at 86 times and so on.</div><div><br /></div><div>The first picture in this example is showing Minv2, the <i>adjusted</i> Min temperatures for March.</div><div>Notice what happens to the repeats. They increase <i>a lot.</i></div><div><i><br /></i></div><div>Increasing number repetition is a common way of manipulating data.</div><div><br /></div><div>Let's look at December Max and Min temps. December is one of the suspect months that has a high level of tampering, from Benford's Law to number sequences that are repeated.</div><div><br /></div><div>At this point we are looking for repeated numbers. To get a quick view of this, let's graph the repeated numbers in the Min Max December time series.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjslABYJOpy7elUPXsfN-1nQMcJGAM3IBNUIWmzamQSaqk_YKWUppk2vhNJKEyVpsduk-1INx7J12_QFCCAi-Cp3NfVLTdwVR1Oafe6t_JX-56PWe-B5IcQ3TyJLXHUzR1W_2leoiAKVj0/s1920/decmaxrawmaxv2repeatsgraph.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjslABYJOpy7elUPXsfN-1nQMcJGAM3IBNUIWmzamQSaqk_YKWUppk2vhNJKEyVpsduk-1INx7J12_QFCCAi-Cp3NfVLTdwVR1Oafe6t_JX-56PWe-B5IcQ3TyJLXHUzR1W_2leoiAKVj0/w640-h347/decmaxrawmaxv2repeatsgraph.jpg" width="640" /></a></div><div><b>Above:</b> Repeated temps in Dec, Minv2 in blue and Min Raw in orange.</div><div><b>The most repeated temps have the longest spikes.</b> </div><div><br /></div><div>How many times they repeat is on the left vertical axis; the bottom axis is the actual temps. Min Raw (orange) has a single peak that is highest, but Minv2 (blue) has more overall higher spikes. 
Minv2 also appears to the eye to be more "bunchy"....more spaces and blocks or grouping.</div><div>But how much bunching is normal and how much is suspicious? </div><div><br /></div><div>This is where the number bunching software helps us. A formula (similar to entropy) computes the average frequency of each distinct number (repeated temp), and then 5000 - 10000 bootstraps are run. A graph of the results is output, showing observed repeated numbers against expected repeated numbers for this sample. See the website for more details. (<a href="http://datacolada.org/77" target="_blank">Link</a>)</div><div><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqBsmjrReQWzqHZWT5JogW507tgWS6kZd0oTuZIGPBzYAQFYSpwJoRUMOeYdpWJMSqntiGDaqEWuD9xaBITxr18-Ht-KndbTL-j7Q2sWTw-l8argBR3ndxAl1ujOxWVYIZdgVWOLkQXUY/s1920/decminv2minrawgraph.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1042" data-original-width="1920" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqBsmjrReQWzqHZWT5JogW507tgWS6kZd0oTuZIGPBzYAQFYSpwJoRUMOeYdpWJMSqntiGDaqEWuD9xaBITxr18-Ht-KndbTL-j7Q2sWTw-l8argBR3ndxAl1ujOxWVYIZdgVWOLkQXUY/w640-h347/decminv2minrawgraph.jpg" width="640" /></a></div><br /><div><b>Above:</b> This is the Min Raw in orange and Minv2 in blue for December. 
The number bunching analysis for repeated numbers will be run again with this data to assess the bunching of repeats.</div><div><br /></div><div>Number Bunching Results.</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjisCf_c7r1JWjPNwXnaMKNQ2mYGWtuNTuS0RP8ThvLPOoKuNRSpkqnKu6K7U0AwhV8_x-eCfevO99YLqlqd7Cmf7I5Lazu1ofhLPGNKshUySpNAiNgjk18myTydN_B3bwK2CCcPQ3CEBM/s1202/decmaxrawrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjisCf_c7r1JWjPNwXnaMKNQ2mYGWtuNTuS0RP8ThvLPOoKuNRSpkqnKu6K7U0AwhV8_x-eCfevO99YLqlqd7Cmf7I5Lazu1ofhLPGNKshUySpNAiNgjk18myTydN_B3bwK2CCcPQ3CEBM/w640-h488/decmaxrawrepeats.jpg" width="640" /></a></div><div><b>Above:</b> Results of number bunching analysis for Max raw Sydney temps.</div><div>This shows the expected average frequencies against the observed average frequencies. The <i>red line is the</i> <i>observed</i> average frequencies for Max Raw data. The red line is within the distribution; it is 2.02 Std errors from the mean, about a 1 in 20 occurrence. 
This is well within expectation.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfsQpFX5DtEZf-IZCceGnFfeaji9Ao2nFdpvfSa0f3H_6feH-Y9u466LgGQDh2GPVFDwVuecvg4kckwQWaOrVnSN9a05xsPXn8gtEZZVeUAl-WE-PTqK2N9xVY7L5340aLnslAQRrJgN4/s1202/decmaxv2repeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfsQpFX5DtEZf-IZCceGnFfeaji9Ao2nFdpvfSa0f3H_6feH-Y9u466LgGQDh2GPVFDwVuecvg4kckwQWaOrVnSN9a05xsPXn8gtEZZVeUAl-WE-PTqK2N9xVY7L5340aLnslAQRrJgN4/w640-h488/decmaxv2repeats.jpg" width="640" /></a></div><div><b>Above:</b> Maxv2 -- the expected average frequencies and the observed average frequencies have been separated by a massive Std error of 27.9. We are seeing far too many observed average repeated numbers against what is expected for this sample. We would expect to see this bunching in fewer than 1 in 100 million times. 
</div><div><br /></div><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSBIfH33MAQ7h-mn7YegOmAdT0ixF7vNTpTHUWX5HzL2jk14Yt7DBaeUp6zCbtXXEf8qzg2Gx6GTWu_uyd3el4bIB_yysUOZrBRVHiu5aOggaggVa0zJ_uXXDZ22KgGYeBVTvrxcVWg4/s1202/decminrawrepeats.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSBIfH33MAQ7h-mn7YegOmAdT0ixF7vNTpTHUWX5HzL2jk14Yt7DBaeUp6zCbtXXEf8qzg2Gx6GTWu_uyd3el4bIB_yysUOZrBRVHiu5aOggaggVa0zJ_uXXDZ22KgGYeBVTvrxcVWg4/w640-h488/decminrawrepeats.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: left;"><b>Above</b>: The Min Raw Data tells a similar story: there are too many repeated numbers. The observed repeats are 7.6 Std errors from the mean. This is rarer than a 1 in a million occurrence.</td></tr></tbody></table><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3hetphapxFBRAIKaqlzDxcO3gp8Ildn-P2IW4gHIZlaQOi8Cx7vxAu-9ouXBFHKvC8iqpqGOE3oDktf1g3vNXwHuxhKuAHVeTrAQ8wD-E1QX1hkR1oXz9hJWml0Y-nm3VCdf_DNYwir8/s1202/decminv2repeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3hetphapxFBRAIKaqlzDxcO3gp8Ildn-P2IW4gHIZlaQOi8Cx7vxAu-9ouXBFHKvC8iqpqGOE3oDktf1g3vNXwHuxhKuAHVeTrAQ8wD-E1QX1hkR1oXz9hJWml0Y-nm3VCdf_DNYwir8/w640-h488/decminv2repeats.jpg" width="640" /></a></div><br /><div><b>Above:</b> Minv2 - the observed average repeated numbers (red line) here is so far out of expectation, 41.5 Std errors, we never expect to see this. 
The numbers become too tiny for any meaningful computation. The data has extremely high rate of bunching. Extremely high number of repeated temps.</div><div><br /></div><div><br /></div><div><b>June below:</b></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI9sAMk2Z4RyEeOjS8gLMeOs2IqZPHR3ICbsg618_Rpx7aM7-onwaMhOZmTHAfomn5tJyqfnx9iwpvB_s9FPmwCXsD4MNjQeA1wewy068wQEh99Exuh0EUPif7qXzGKYt58U7-4UDp9OI/s1202/junemaxrawrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI9sAMk2Z4RyEeOjS8gLMeOs2IqZPHR3ICbsg618_Rpx7aM7-onwaMhOZmTHAfomn5tJyqfnx9iwpvB_s9FPmwCXsD4MNjQeA1wewy068wQEh99Exuh0EUPif7qXzGKYt58U7-4UDp9OI/w640-h488/junemaxrawrepeats.jpg" width="640" /></a></div><div><b>Above:</b> June Max Raw data has standard error of nearly 12, a very high level of bunching we would virtually never expect to see.</div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1AtG0H_0U4j17q24c0_lkSfBLqPKKMV8nsjcJG8N6Ca2Mw3Xt0CO2v1C7tdHJwAhXVyqgFOSM8lmVRvjs9IYMNN6W9EfMjGflSC93sbBgepIjN-RgC_YTtjrJbdW5Is_4ZqnWVSZHdB0/s1202/junemaxv2repeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1AtG0H_0U4j17q24c0_lkSfBLqPKKMV8nsjcJG8N6Ca2Mw3Xt0CO2v1C7tdHJwAhXVyqgFOSM8lmVRvjs9IYMNN6W9EfMjGflSC93sbBgepIjN-RgC_YTtjrJbdW5Is_4ZqnWVSZHdB0/w640-h488/junemaxv2repeats.jpg" width="640" /></a></div><b>Above:</b> The Maxv2 adjusted data for June....and is it adjusted! It was bad in Raw, it is a whopper in adjusted Maxv2 data. 
The standard error of 49 is massive, the chance of seeing this in this sample is nil. A high level of manipulation in repeated numbers (temps).<br /><div><br /></div><div> </div><div><b>October below:</b></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9BFV0OVA6ZypsP4GfH9MfOKZUTPIITArBe1-ewB9QuQc41o5Z6ngo2QZNAZa7o77QxFUlNCdbMva89pa7bLMDfFS9nMK9CmbjE8Riy2JQYAGCLDijS-M6aFX2fCx35i7kk9AyIDS98jU/s1202/octmarawrepeats.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9BFV0OVA6ZypsP4GfH9MfOKZUTPIITArBe1-ewB9QuQc41o5Z6ngo2QZNAZa7o77QxFUlNCdbMva89pa7bLMDfFS9nMK9CmbjE8Riy2JQYAGCLDijS-M6aFX2fCx35i7kk9AyIDS98jU/w640-h488/octmarawrepeats.jpg" width="640" /></a></div><div><b>Above:</b> The October Max Raw data has observed average repeated numbers against expected average repeated numbers of 4.8 Std errors past the mean, highly unusual but not beyond expectation. 
This is rarer than a 1 in 150 000 event.</div><div><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3tvIx0QLYtLC1HtjMPFmtqdZ_gjToh7ayKCpy-DUQmkf4f8HJfO5utXTMoeEB0JMXvQdgJam7pJdfKWnlJ3OXfxzkw4e23dckHpHedoDaNTxy5Z_qg2cHod-EpuyZfWf1-HLduoCpad4/s1202/octmaxv2repeats.jpg" style="margin-left: 1em; margin-right: 1em; outline-width: 0px; user-select: auto;"><img border="0" data-original-height="916" data-original-width="1202" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3tvIx0QLYtLC1HtjMPFmtqdZ_gjToh7ayKCpy-DUQmkf4f8HJfO5utXTMoeEB0JMXvQdgJam7pJdfKWnlJ3OXfxzkw4e23dckHpHedoDaNTxy5Z_qg2cHod-EpuyZfWf1-HLduoCpad4/w640-h488/octmaxv2repeats.jpg" width="640" /></a></div><br /><div><b>Above:</b> The October Maxv2 adjusted data set has far too many observed repeats against expected repeated temps, over 24 Std errors. Too tiny a probability to calculate. We would not expect to see this.</div><div><br /></div><div><br /></div><div><b>Results:</b></div><div>The frequency of repeated temperatures, called "number bunching" in this software analysis, tests how likely it is that the data has been tampered with. <i>A much more extreme outcome exists here</i> than in the study Uri Simonsohn highlights on his website, where he supplies the <a href="http://datacolada.org/77" target="_blank">R code to test this</a>. The suspect study he used was retracted for suspected fabrication. The BOM data is extremely suspicious.</div><div>--------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div><b>Wrapping Up:</b></div><div> The first step, and the biggest one that takes up most analysis time, is data cleaning and preprocessing and integrity checks. 
If the data has no integrity at input, it is not worth pursuing.</div><div><br /></div><div>There are many questions to be answered on the data integrity of not only the Sydney Min Max temperature time series, but many/all other cities and towns. </div><div><br /></div><div><b>Preliminary work shows even worse results for the smaller towns compared to the Sydney data.</b></div><div>More posts will follow to document more from the BOM temperature data time series that is used for data modeling and projections. The garbage in - garbage out scenario means no credibility can be given to climate modeling using this data.</div><div><br /></div><div>Looking at other climate data providers such as Berkeley Earth shows similar problems. The ocean temperature anomalies from Berkeley Earth will be looked at in the future, but preliminary work shows they have no use whatsoever. The ocean surface temp anomalies are so far from conforming to Benford's Law, it is clear they are only "guesstimates" (interpolations, they call it). Any meaningful modeling output from these anomalies is doomed. </div><div><br /></div><div>At the very least, a<b> Government forensic audit should be performed on The BOM climate data.</b></div><div><b>It is extremely suspicious and would have been flagged in any financial data base for an audit.</b></div><div><br /></div><div><br /></div><div>-----------------------------------------------------------------------------------------------------------------------------</div><div><br /></div><div>Increased Uncertainty Besides Dirty Data</div><div><b>Errors that Increase Uncertainty Even More</b></div><div><br /></div><div>1 - Double Rounding errors exist in most climate data and have mostly not been corrected.</div><div><div>(Decoding the precision of historical temperature observations, Andrew Rhines, Martin P. Tingley, Karen A. 
McKinnon, Peter Huybers)</div><div><br /></div><div>2 - Errors in using anomalies (New systematic errors in anomalies global mean temperatures time-series, by Michael Limburg, 2014)</div></div><div><br /></div><div>3 - Uncertainty. Autocorrelated time series do not follow Gaussian error propagation. Darwin 30 temp average has an uncertainty of plus or minus 0.4 C degrees, putting any warming within the boundaries of error.</div><div>(Can we trust time series of historical climate data? About some oddities in applying standard error</div><div>propagation laws to climatological measurements, Michael Limburg (EIKE), Porto Conference) </div><div><br /></div><div>4 - <a href="https://www.wiley.com/en-au/The+Flaw+of+Averages%3A+Why+We+Underestimate+Risk+in+the+Face+of+Uncertainty-p-9781118073759" target="_blank">Flaw Of Averages</a>. Using averages means that on average you are wrong. Particularly when you use averages of <i>averages.</i></div><div><br /></div><div>5 - BOM thermometer (including electronic) tolerances are 0.5 C degrees, below WMO suggested specs of 0.2 C degrees.</div></div><div><br /></div><div>6 - Errors in inadequate spatial sampling. "<i>While the Panel is broadly satisfied with the ACORN-SAT network coverage, it is concerned that network coverage in some of the more remote areas of Australia is sparse</i>." Report of the Independent Peer Review Panel 4 September 2011. </div><div>This relates to: "Global and hemispheric temperature trends: uncertainties related to inadequate spatial sampling", (Thomas R. Karl, Richard W. Knight, John R. Christy, 1993)</div><div><br /></div><div>7 - (Confidence intervals for time averages in the presence of long-range correlations, a case study on Earth surface temperature anomalies, M. Massah and H. Kantz, Max Planck Institute for the Physics of Complex Systems, Dresden, Germany) -- "Time averages, a standard tool in the analysis of environmental data, suffer severely from long-range correlations." 
Uncertainties are larger than expected, again.</div><div><br /></div><div>More analysis to follow in other blogs.</div>Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-17292154120784126662017-04-04T19:05:00.003-07:002017-09-14T00:30:51.873-07:00JonBenet Ransom Note Analysis Using Syntactic Ngrams -- Or Taking The Words Away And Looking At Structure.<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0g5XwL1Em5uLoEtCU17mRbIzC_CPWxFxZ-IgeWWcrG0EsMxBHlw2w3yQLBZfDtk8H0YS9PTD41U-r-lEpN1CXj5OgncZMtZ3ZN6LBJldAMxl9Q-_iey9cUhXwYw-0agEr3GYx9Mu4HBI/s1600/R_Workshop_zzzzz__001.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0g5XwL1Em5uLoEtCU17mRbIzC_CPWxFxZ-IgeWWcrG0EsMxBHlw2w3yQLBZfDtk8H0YS9PTD41U-r-lEpN1CXj5OgncZMtZ3ZN6LBJldAMxl9Q-_iey9cUhXwYw-0agEr3GYx9Mu4HBI/s640/R_Workshop_zzzzz__001.jpg" width="640" /></a></div>
<br />
<br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">New state of the art software is being released in various domains, much of which can help in stylometry analysis. I have decided to bite the bullet finally and move over from Matlab to R, the open source statistical software.</span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">The best permutation and nonparametric combination test software is now on R -</span></span><br />
<a href="http://caughey.mit.edu/software"><span style="font-family: "verdana" , sans-serif; font-size: large;">http://caughey.mit.edu/software</span></a><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">This allows you to compare samples against a baseline without worrying whether your data follows a normal curve, or whether you have more variables than samples, and so on. Devin Caughey has written some very nice papers on this, and now his software is available on R.</span></span><br />
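<span style="font-family: "verdana" , sans-serif; font-size: large;">To show why a permutation test does not care about normality, here is a minimal Python sketch of a plain two-sample permutation test on the difference in means. It is not Devin Caughey's software and it does not do nonparametric combination; the function name <i>permutation_test</i> and its parameters are made up for this illustration.</span><br />

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    The group labels are shuffled many times; the p-value is the share of
    shuffles whose difference is at least as large as the observed one.
    No distributional assumption (such as normality) is needed.
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            hits += 1

    # Add one to both counts so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

<span style="font-family: "verdana" , sans-serif; font-size: large;">In a stylometry setting the two samples could be the rate of some marker in a questioned text and in a known author's texts. That usage is an assumption for illustration, not something the packages above prescribe.</span><br />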
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Now with the release of Stylo R package, I have well and truly moved over to R:</span></span><br />
<a href="https://sites.google.com/site/computational%20stylistics/home"><span style="font-family: "verdana" , sans-serif; font-size: large;">https://sites.google.com/site/computational stylistics/home</span></a><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">This is a superb stylometry package with some of the latest developments in stylo analysis such as Burrows Delta and Consensus Bootstrap Tree, rolling Delta etc. These guys know their stuff and have written a great program.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Two more bits of software complete the analysis puzzle, starting with the state of the art Stanford Parser from the Stanford NLP Group - <a href="https://nlp.stanford.edu/software/lex-parser.shtml">https://nlp.stanford.edu/software/lex-parser.shtml</a></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">And with the advent of syntactic ngrams by Google and others, there are some great ideas along these lines, with software to produce them. Dr. Grigori Sidorov has an interesting site along with some great papers he has written. He has done some interesting work on syntactic ngrams and calls them <i>sngrams</i>. His site and the software in Python -- <a href="http://www.cic.ipn.mx/~sidorov/">http://www.cic.ipn.mx/~sidorov/</a></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">Also worth mentioning is the authorship software Toccata by Richard Forsyth, along with his other software. I bought Beagle from him in the eighties, and still have fond memories of it. All his new stuff is in Python:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><a href="https://www.richardsandesforsyth.net/software.html">https://www.richardsandesforsyth.net/software.html</a></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">That's a round up of the software, so let's put it together slowly.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<b><span style="font-size: large;">The Problem:</span></b></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">A 374 word ransom note was left at the scene of the murder, or accidental homicide, of JonBenet Ramsey. The FBI, the police and lead investigator James Kolar agree the note was part of the <i>"staging"</i> of the crime scene.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">A staged ransom note is trying to portray what it is not. The writer was aware that the handwriting would be extensively analysed afterwards. This alone means that handwriting analysis (physically comparing writing) would be useless in a court of law, because a lot of effort would have been made to fake and randomise the appearance of the note, and it could never be "beyond reasonable doubt."</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<b><span style="font-size: large;">Linguistic Analysis:</span></b></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Linguistic analysis is an option and has progressed in leaps and bounds over the last few years (Koppel, Eder, Rybicki, Hoover et al).</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">It has been known for a long time that people tend to write with their own "style". Function words, for example "at", "by", "be", "but" and "can", provide linguistic fingerprints because people are unaware of these tiny words and they are <i>not context sensitive</i>, making them good markers in many cases. </span></span><br />
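<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">As a rough illustration of a function word fingerprint, the Python sketch below counts how often each of the five example words appears per token in a text, and compares two such profiles. The word list is just the five examples above (real studies use far longer lists), and the names <i>function_word_profile</i> and <i>profile_distance</i> are made up for this sketch.</span></span><br />

```python
import re
from collections import Counter

# Only the five examples from the text; a real marker list would be much longer.
FUNCTION_WORDS = ("at", "by", "be", "but", "can")


def function_word_profile(text, words=FUNCTION_WORDS):
    """Relative frequency of each function word among all tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in words}


def profile_distance(p, q):
    """Mean absolute difference between two profiles built on the same words."""
    return sum(abs(p[w] - q[w]) for w in p) / len(p)


if __name__ == "__main__":
    known = function_word_profile("You can be at home by noon but not later")
    questioned = function_word_profile("We can wait but you must be ready by then")
    print(known)
    print(profile_distance(known, questioned))
```

<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">A smaller distance means the two texts use these little words at more similar rates. With a text as short as 374 words the rates are noisy, which is one reason function words alone are not enough.</span></span><br />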
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">By themselves they are not enough however. And so the search is on for more markers and more software to separate the signal from the noise.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span><span style="font-size: large;">WritePrint, which is embedded into Jstylo (earlier post), creates about 800 different variables and used to be considered the gold standard.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Another clever method used with success in a stylometry competition was by the team of Koppel, Akiva and Dagan with their <i>"Unstable"</i> words as markers:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<a href="http://onlinelibrary.wiley.com/doi/10.1002/asi.20428/abstract"><span style="font-size: large;">http://onlinelibrary.wiley.com/doi/10.1002/asi.20428/abstract</span></a></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span><span style="font-size: large;"><b>The JonBenet Ramsey Ransom Note:</b></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">With the JonBenet ransom note, using content words would fail. In other words, pronouns probably need to be ignored, and content words cannot be used because all ransom notes bear similarities along these lines.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">One ransom note would be linked to another if you used word frequencies of "you" and "money" and "die", for example. Since the JonBenet note is staged or faked (she was dead when the note was written, the note was purported to be from a "faction"), it is likely that there would be red herrings in the writing in order to attribute it to a radical group.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Any spelling mistakes, hyphens and strange letter formations etc would be obvious and probably useless as markers because the writer knew the note would be analysed, and keeping in mind the dynamics of staging, you would expect conscious errors/red herrings etc.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">What we need to do is look for <i>unconscious style markers</i> and text structure, things that are written out of habit. Just as the handwriting experts noted that the last part of the note was the most fluid, it is likely that the last part also has the most unconscious markers due to force of habit: the writer concentrating on staging the note in the beginning, and the writing becoming more "free flowing" with habit taking over at the end.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">It is also likely that if the crime was covered up by the parents after the son accidentally hit JonBenet on the head with a torch in a fit of rage for snatching some pineapple from him in a midnight snack as per the CBS show (which seems to line up the evidence as the most likely scenario), it would be natural to think that both parents are involved to <i>some extent</i>, one dictating some text or ideas, the other writing.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;">People write differently to how they talk, and use different parts of the brain to process written text and verbal, so one of the parents would be dominating in their unconscious writing style unless the letter was being quoted verbatim (unlikely.)</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;">
<b><span style="font-size: large;">Parts-Of-Speech Analysis:</span></b></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<i><span style="font-size: large;">The idea is to take away the words, leaving the lexical structure of the ransom note.</span></i></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><i><br /></i>
This is easily done with the Stanford Parser or the Stanford Tagger, both written in Java. I have also used the MontyLingua Tagger, which is written in Python.</span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">What a Part-of-Speech Tagger does is replace words with part-of-speech lexical categories such as Verbs, Nouns, Pronouns, Determiners etc. The most widely used tag set is the Penn Treebank tag set, which has 36 tags:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_nfFVVZdGQFeedUHM4MX_TIdXB2rQ_wFvy7rO60Vqhuca74jJ_6Fww206cpBQuDmFS2Fyk1zy9X2byL6uHBCpFRWqBjdWmr28FUC9NFEdcW0XhRE14TksXhnDlrzpV16iS4lOvvJnQ3s/s1600/tags.tif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: "verdana" , sans-serif; font-size: large;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_nfFVVZdGQFeedUHM4MX_TIdXB2rQ_wFvy7rO60Vqhuca74jJ_6Fww206cpBQuDmFS2Fyk1zy9X2byL6uHBCpFRWqBjdWmr28FUC9NFEdcW0XhRE14TksXhnDlrzpV16iS4lOvvJnQ3s/s640/tags.tif" width="484" /></span></a></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span></div>
<span style="font-family: "verdana" , sans-serif; font-size: large;">This means every word in a text automatically gets tagged with one of the above part-of-speech tags. There are 6 different Verb tags, and <i>depending on the context of the writing</i>, each verb gets its assigned Tag from this list.</span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">As an example, let's look at a snippet of text from the ransom note using the word <i>"hence"</i>, and one of Patsy's notes with the word <i>"hence"</i>, and tag them:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<b><span style="font-size: medium;">1<span class="Apple-tab-span" style="white-space: pre;"> </span> /NN of/IN eternal/JJ life/NN and/CC <i>hence</i>/RB ,/, no/DT hope/NN </span></b></span><br />
<span style="font-family: "verdana" , sans-serif;"><b><span style="font-size: medium;"><br /></span></b></span>
<b><span style="font-family: "verdana" , sans-serif; font-size: medium;">2<span class="Apple-tab-span" style="white-space: pre;"> </span> /NN of/IN the/DT money/NN and/CC <i>hence</i>/RB a/DT earlier/JJR delivery/NN </span></b><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">The top line, from the Patsy note, tells us there is a Noun followed by a Preposition and then an Adjective. The ransom note below it is slightly different, but the lexical structure is <i>very</i> similar. Each word is followed by a slash and then the tag assigned by the parser.</span></span><br />
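<span style="font-family: "verdana" , sans-serif; font-size: large;">As a minimal sketch of how the two tagged snippets can be compared, the Python below splits each word/TAG token and lists the positions where both snippets carry the same tag. The function name split_tagged is made up for this illustration and the leading line numbers are left out. It is not the tagger itself, only a way of reading its output.</span><br />

```python
def split_tagged(text):
    """Split 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last slash so tokens such as ',/,' or '/NN' still parse
        word, tag = token.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs


# The two tagged snippets shown above
patsy = "/NN of/IN eternal/JJ life/NN and/CC hence/RB ,/, no/DT hope/NN"
ransom = "/NN of/IN the/DT money/NN and/CC hence/RB a/DT earlier/JJR delivery/NN"

patsy_tags = [tag for _, tag in split_tagged(patsy)]
ransom_tags = [tag for _, tag in split_tagged(ransom)]

# Positions (counting from 0) where both snippets carry the same tag
shared = [i for i, (a, b) in enumerate(zip(patsy_tags, ransom_tags)) if a == b]

print(patsy_tags)
print(ransom_tags)
print(shared)
```

<span style="font-family: "verdana" , sans-serif; font-size: large;">Six of the nine positions carry the same tag, which is the kind of overlap the eye picks up when reading the two lines.</span><br />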
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Looking at the ransom note now, <i>and deleting all the words, only keeping the parts of speech tags, it looks like this:</i></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: medium;">VB RB ! PRP VBP DT NN IN NNS WDT VBP DT JJ JJ NN. PRP VBP NN PRP$ NN CC</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">RB DT NN IN PRP VBZ. IN DT NN PRP VBP PRP$ NN IN PRP$ NN. PRP VBZ JJ CC</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">JJ CC IN PRP VBP PRP$ TO VB CD, PRP MD VB PRP$ NNS TO DT NN. PRP MD VB</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">CD, CD CD IN PRP$ NN. CD, CD MD VB IN CD NNS CC DT VBG CD, CD IN CD NNS.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">VB JJ IN PRP VBP DT JJ NN NN TO DT NN. WRB PRP VBP NN PRP MD VB DT NN IN</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">DT JJ NN NN. PRP MD VB PRP IN CD CC CD VBP NN TO VB PRP IN NN. DT NN MD</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">VB VBG RB PRP VBP PRP TO VB VBN. IN PRP VBP PRP VBG DT NN JJ, PRP MD VB</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">PRP JJ TO VB DT JJR NN IN DT NN CC RB DT JJR NN NN IN PRP$ NN. DT NN IN</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">PRP$ NNS MD VB IN DT JJ NN IN PRP$ NN. PRP MD RB VB VBN PRP$ NNS IN JJ</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">NN. DT CD NNS VBG IN PRP$ NN VBP RB RB IN PRP RB PRP VBP PRP RB TO VB</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">PRP. VBG TO NN IN PRP$ NN, JJ IN NNP,NN, FW, MD VB IN PRP$ NN VBG</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">VBD. IN PRP VBP PRP VBG TO DT JJ NN, PRP VBZ. IN PRP JJ NN NNS, PRP VBZ.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">IN DT NN VBZ IN DT NN VBN CC VBD IN, PRP VBZ. PRP MD VB VBN IN JJ NNS</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">CC IN DT VBP VBN , PRP VBZ. PRP MD VB TO VB PRP CC VB VBN IN PRP VBP JJ</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">IN NNP NN NNS CC NNS. PRP VBP DT CD NN IN VBG PRP$ NN IN PRP VBP TO IN</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">JJ PRP. VB PRP$ NNS CC PRP VBP DT CD NN IN VBG PRP$ RB. PRP CC PRP$ NN</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">VBP IN JJ NN IN RB IN DT NNS. VB NN TO VB DT NN NNP. PRP VBP RB DT RB JJ</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">NN IN RB VBP RB VBP IN VBG MD VB JJ. VB VB PRP NNP. VB IN JJ JJ JJ NN</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">IN PRP. PRP VBZ IN TO PRP RB NNP!</span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">This is the ransom note with all the words and content deleted, leaving only the Penn Treebank Tags such as Nouns and Adjectives. So<b> we have minimised the text to its basic lexical structure of 36 tags.</b></span></span><br />
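<span style="font-family: "verdana" , sans-serif; font-size: large;">Once the words are gone, what the software compares is how often each tag is used. A rough sketch of that counting step in Python, using only the first line of the tag listing above, could look like this. The function name tag_profile and the regular expression are my own choices for the illustration, not part of any tagger or of Stylo.</span><br />

```python
import re
from collections import Counter

# First line of the tag-only ransom note shown above
tag_line = "VB RB ! PRP VBP DT NN IN NNS WDT VBP DT JJ JJ NN. PRP VBP NN PRP$ NN CC"


def tag_profile(text):
    """Return the relative frequency of each POS tag, ignoring punctuation."""
    # A tag is a run of capital letters, optionally ending in '$' (as in PRP$)
    tags = re.findall(r"[A-Z]+\$?", text)
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.most_common()}


profile = tag_profile(tag_line)
for tag, share in list(profile.items())[:5]:
    print(f"{tag}\t{share:.2f}")
```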
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">We do this with all of Patsy's notes, about 15 000 words, and John Ramsey's letter of 10 000 words. We also add in two genuine ransom notes, the short Robert Wiles notes and the very long 982 word ransom note from the Barbara Mackle kidnapping.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Running all the POS tags in R through the brilliant Stylo R Package and its Bootstrap Consensus Tree, we get this output:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5m_0R0psPONtwIprv1Kg3iYbGT10Rvj50LzK8vkyrS4YL9JMe6jaOQoJWqpn_5_PnaOQ_Uyci0WTinM9mXKbfb9GX9lBP1rbUM-sY99zz3rADPEToOzEUND42IzSJwungRsIrNxBSxuM/s1600/postagsnalysis.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: "verdana" , sans-serif; font-size: large;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5m_0R0psPONtwIprv1Kg3iYbGT10Rvj50LzK8vkyrS4YL9JMe6jaOQoJWqpn_5_PnaOQ_Uyci0WTinM9mXKbfb9GX9lBP1rbUM-sY99zz3rADPEToOzEUND42IzSJwungRsIrNxBSxuM/s640/postagsnalysis.jpg" width="640" /></span></a></div>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;"><i><br /></i></span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><i>Using NO words, only parts of speech, the POS structure of one of Patsy's notes is similar to the ransom note</i>, while the other ransom notes get binned together as being similar, and the two Christmas notes get put together too.</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-size: large;">Using a clustering algorithm, where the most similar texts are clumped together, this dendrogram is produced on the twenty most frequent POS tags:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikbgVsAfiMjWvqqRDQWpJrAfROaU4gVMGmyq5qm7qnolM2A9fm4hiKig6RUMv2-k_VEwn-xrYP9XA9xoDpjycP2KwUkWJuc7D-sAvQS8bUalu2D_PmjmZjTdgae4G9yj9wIpTXo12d1ck/s1600/dendo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: "verdana" , sans-serif; font-size: large;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikbgVsAfiMjWvqqRDQWpJrAfROaU4gVMGmyq5qm7qnolM2A9fm4hiKig6RUMv2-k_VEwn-xrYP9XA9xoDpjycP2KwUkWJuc7D-sAvQS8bUalu2D_PmjmZjTdgae4G9yj9wIpTXo12d1ck/s640/dendo.jpg" width="640" /></span></a></div>
<span style="font-family: "verdana" , sans-serif;"><span style="font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">This lumps Patsy with the ransom note, puts her other notes close to John's, and leaves the real kidnapping notes from Wiles and Mackle on the outskirts of Patsy and the JonBenet ransom note.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Now we ask the software to classify who wrote the note, or more accurately, who is the closest match. Using the SVM classifier, one of the best classifiers with a proven track record in authorship work, Patsy is determined to be the author.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">One of the most recent and powerful algorithms for determining the distance, i.e. the closeness of match, is Burrows' Delta, which is included in the package along with modifications such as Eder's Delta and Argamon's Delta. Using these, <b>the output is again Patsy as the author.</b></span></span><br />
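<span style="font-family: "verdana" , sans-serif; font-size: large;">For readers who want to see what a Delta distance does, here is a bare-bones sketch of the classic Burrows' Delta in Python: each feature frequency is turned into a z-score across all texts, and the distance between two texts is the average absolute difference of their z-scores. This is not Stylo's implementation. The function name burrows_delta, the text labels and the frequency numbers are all made up for the illustration and do not come from the notes analysed here.</span><br />

```python
from statistics import mean, stdev


def burrows_delta(profiles, a, b):
    """Mean absolute difference of the z-scored feature frequencies of texts a and b."""
    features = sorted({f for p in profiles.values() for f in p})
    diffs = []
    for f in features:
        values = [p.get(f, 0.0) for p in profiles.values()]
        spread = stdev(values)
        if spread == 0:
            continue  # a feature with no variation cannot separate texts
        mu = mean(values)
        z_a = (profiles[a].get(f, 0.0) - mu) / spread
        z_b = (profiles[b].get(f, 0.0) - mu) / spread
        diffs.append(abs(z_a - z_b))
    return mean(diffs)


# Invented relative tag frequencies, for illustration only
profiles = {
    "questioned":  {"NN": 0.20, "PRP": 0.12, "VBP": 0.09, "IN": 0.11},
    "candidate_a": {"NN": 0.19, "PRP": 0.11, "VBP": 0.08, "IN": 0.12},
    "candidate_b": {"NN": 0.26, "PRP": 0.05, "VBP": 0.03, "IN": 0.15},
}

for name in ("candidate_a", "candidate_b"):
    print(name, round(burrows_delta(profiles, "questioned", name), 3))
```

<span style="font-family: "verdana" , sans-serif; font-size: large;">The smaller the Delta, the closer the match. In this toy data candidate_a sits closer to the questioned text than candidate_b simply because the numbers were chosen that way.</span><br />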
<span style="font-family: "verdana" , sans-serif; font-size: large;"><b><br /></b>
<b>Is there a way to get more linguistic structure out of the writing, i.e. more information than POS Tags can give us?</b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Yes there is. This brings us to:</span></span><br />
<b><span style="font-family: "verdana" , sans-serif; font-size: large;">Syntactic Ngrams</span></b><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><b><br /></b>
<b>Part 1 - Parsing Text To Create A Dependency Tree:</b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Recall that POS Tags (above) give us lexical structure: a word is replaced with a verb or noun tag. But they tell us nothing about the syntactic dependency tree structure, which tells us what the subject and object of the sentence are, which word is at the head (root) of the tree and so on.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">We are now going to extract syntactic information. This is very different to POS Tags/ Parts Of Speech.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">http://demo.ark.cs.cmu.edu/parse/about.html</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">What we extract with syntactic parsing is the tree structure of a sentence -- which word is the object, which word is dependent on another -- in order to <i>create a tree structure that is non linear</i>. This means the words in a sentence are not listed by the parser in the order they are written, but in the order assessed to be syntactically correct according to a dependency tree.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">The critical take away point from this is that syntactic structure is NON LINEAR, meaning<i> the order of the sentence from the parser is different to how it was written</i>. The state-of-the-art Stanford Parser has an accuracy of about 97% and reveals the syntactic structure of text without words, <i>as a first step!</i></span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span><span style="font-family: "verdana" , sans-serif;">An example of the parser output for the sentence:</span></span><br />
<span style="font-family: "verdana" , sans-serif;"><span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<b><span style="font-family: "verdana" , sans-serif;">The boy with the brown eyes ate the cake.</span></b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">
<span style="font-family: "verdana" , sans-serif;">det(boy-2, The-1)</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">nsubj(ate-7, boy-2)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">case(eyes-6, with-3)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">det(eyes-6, the-4)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">amod(eyes-6, brown-5)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">nmod(boy-2, eyes-6)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">root(ROOT-0, ate-7)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">det(cake-9, the-8)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">dobj(ate-7, cake-9)</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Root (ate) is at the top of the tree. In the listing, the line above it is a noun modifier (nmod), and brown at -5 (5th word) is dependent on eyes at -6. There are around 50 tags from the dependency parser, such as determiners, noun subjects etc.</span></span><br />
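<span style="font-family: "verdana" , sans-serif; font-size: large;">To make the listing above easier to work with, a small Python sketch can read each relation(head-index, dependent-index) line into a tuple, then pick out the root word and the words that hang off "eyes". The regular expression and the names EDGE and parse_edges are assumptions made for this example, written against the nine lines shown above rather than against every possible parser output.</span><br />

```python
import re

# Parser output for "The boy with the brown eyes ate the cake." as listed above
parser_output = """\
det(boy-2, The-1)
nsubj(ate-7, boy-2)
case(eyes-6, with-3)
det(eyes-6, the-4)
amod(eyes-6, brown-5)
nmod(boy-2, eyes-6)
root(ROOT-0, ate-7)
det(cake-9, the-8)
dobj(ate-7, cake-9)
"""

# relation(headword-index, dependentword-index)
EDGE = re.compile(r"([\w:]+)\((\S+)-(\d+), (\S+)-(\d+)\)")


def parse_edges(text):
    """Turn each line into a (relation, head_index, dep_index, dep_word) tuple."""
    edges = []
    for line in text.splitlines():
        match = EDGE.fullmatch(line.strip())
        if match:
            rel, _head, head_i, dep, dep_i = match.groups()
            edges.append((rel, int(head_i), int(dep_i), dep))
    return edges


edges = parse_edges(parser_output)
root_word = next(dep for rel, _, _, dep in edges if rel == "root")
eyes_children = [dep for _, head_i, _, dep in edges if head_i == 6]

print(root_word)
print(eyes_children)
```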
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Onwards now to:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<b><span style="font-family: "verdana" , sans-serif;">Part 2 - Ngrams</span></b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Ngrams have been used for a long time and are one of the most reliable indicators of authorship (Sidorov 2014). Ngrams can be characters or words. You can think of it as a sliding window:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Using the above sentence again, which comes from a Google PowerPoint presentation about their ngrams:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif;">
<b><span style="font-family: "verdana" , sans-serif;">The boy with the brown eyes ate the cake.</span></b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><b><br /></b>
A bigram or 2 unit ngram is a 2 word sliding window:</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">The Boy, Boy With, With The, The Brown and so on.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">A trigram is a unit of 3 words or characters (words in our example) and goes like this:</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">The Boy With, Boy With The, With The Brown, The Brown Eyes, Brown Eyes Ate and so on.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Two to five ngram units are the most useful in authorship (Sidorov).</span></span><br />
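<span style="font-family: "verdana" , sans-serif; font-size: large;">The sliding window is simple enough to write in a few lines of Python. This is a generic sketch and the function name ngrams is my own, not taken from any of the packages mentioned in this post.</span><br />

```python
def ngrams(tokens, n):
    """Slide a window of n tokens across the list, one step at a time."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "The boy with the brown eyes ate the cake"
tokens = sentence.split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams[:4])
print(trigrams[:5])
```

<span style="font-family: "verdana" , sans-serif; font-size: large;">The same function works on characters if you pass it a list of letters instead of a list of words.</span><br />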
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;"><br /></span>
<b><span style="font-family: "verdana" , sans-serif;">Part 3 - Syntactic Ngrams</span></b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">The final piece to this puzzle is the syntactic ngram. Google has used them to index several million books and 320 billion ngrams, with its ngram viewer:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<a href="https://books.google.com/ngrams"><span style="font-family: "verdana" , sans-serif;">https://books.google.com/ngrams</span></a></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">This is a simplistic interface though, and can only be used for frequencies. More sophisticated analysis is possible by downloading the Google ngram data.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;">Notice a problem in the last trigram string above:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;"><i>Brown Eyes Ate</i></span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><i><br /></i></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">This is obviously misleading and won't help with the text analysis of that sentence, i.e. the subject is missing. <i>You never get this output when you use syntactic ngrams</i>, so they are far more powerful, contain more information and are more relevant to the text being analysed!</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">And once again, the beauty with syntactic ngrams is that they are<i> non linear</i>, they contain structure information in a different order according to the parser tree.</span><br />
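<span style="font-family: "verdana" , sans-serif; font-size: large;">A small Python sketch shows why "Brown Eyes Ate" cannot come out of a syntactic ngram. Instead of sliding along the sentence, it walks down the dependency tree from head to dependent, using the edges from the parser output in Part 1. The function name sn_grams is made up, the tree is keyed by word for readability (real code would use the word indices), and only continuous head-to-dependent paths are collected, so this is a simplification of what the sngram software does.</span><br />

```python
# Dependency tree for "The boy with the brown eyes ate the cake",
# written as head -> dependents, from the parser output in Part 1.
children = {
    "ate": ["boy", "cake"],
    "boy": ["The", "eyes"],
    "eyes": ["with", "the", "brown"],
    "cake": ["the"],
}


def sn_grams(children, n):
    """Collect every head-to-dependent path of n words in the tree."""
    def paths_from(word, length):
        if length == 1:
            return [(word,)]
        found = []
        for child in children.get(word, []):
            for tail in paths_from(child, length - 1):
                found.append((word,) + tail)
        return found

    words = set(children) | {c for kids in children.values() for c in kids}
    result = []
    for word in sorted(words):
        result.extend(paths_from(word, n))
    return result


syntactic_trigrams = sn_grams(children, 3)
for gram in syntactic_trigrams:
    print(" ".join(gram))
```

<span style="font-family: "verdana" , sans-serif; font-size: large;">The six paths include "ate boy eyes" and "boy eyes brown", which keep the subject attached to its verb, and nothing like "brown eyes ate" appears.</span><br />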
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">As mentioned, this example is from a Google presentation as they explain the purpose of their ngram viewer.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">But there is more power in these little guys yet!</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Thanks to Dr. Grigori Sidorov, we can produce syntactic ngrams, which he calls <b>sngrams</b>, and also mixed versions of them--you can mix the syntactic tags from the parser with POS tags (above) or words or lemmas.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">You now have mixed sngrams, or sngrams with relations, which he calls snrgrams.</span></span><br />
<br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">He has a site and software in Python to create various sngrams in different sizes along with some interesting papers:</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<a href="http://www.cic.ipn.mx/~sidorov/"><span style="font-family: "verdana" , sans-serif;">http://www.cic.ipn.mx/~sidorov/</span></a></span><br />
<br />
<br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">The take away point from this is that text goes into the Stanford Parser, the output from that goes into the sngram software, and the output from that is sngrams or snrgrams (if you mixed them) of various sizes, i.e. bigrams, trigrams etc.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">Long story short--<i>these snrgrams have been shown to be the most powerful use of ngrams in various applications! </i></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><i><br /></i></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><a href="http://www.g-sidorov.org/Sidorov2014_IJCLA.pdf">http://www.g-sidorov.org/Sidorov2014_IJCLA.pdf</a></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">The JonBenet ransom note is coded as a 2 unit SNRGRAM (bigram) with Syntactic tags and POS tags.</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">We are using the power of syntactic tags combined with POS tags, containing more linguistic structure information than ever.</span><br />
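<span style="font-family: "verdana" , sans-serif; font-size: large;">To give a feel for what a mixed bigram is, here is a hedged Python sketch on the boy-and-cake sentence. It assumes each unit is written as the syntactic relation of the head word followed by the POS tag of its dependent in square brackets. That is only my reading of the notation, and the actual sngram software may build its units differently. The relations come from the parser output in Part 1, the POS tags were filled in by hand, and the function name mixed_bigrams is invented for the example.</span><br />

```python
# (index, word, POS tag, relation to head, head index) for the example sentence.
# Relations are from the parser output in Part 1; the Penn Treebank POS tags
# were added by hand for this illustration.
tokens = [
    (1, "The", "DT", "det", 2),
    (2, "boy", "NN", "nsubj", 7),
    (3, "with", "IN", "case", 6),
    (4, "the", "DT", "det", 6),
    (5, "brown", "JJ", "amod", 6),
    (6, "eyes", "NNS", "nmod", 2),
    (7, "ate", "VBD", "root", 0),
    (8, "the", "DT", "det", 9),
    (9, "cake", "NN", "dobj", 7),
]


def mixed_bigrams(tokens):
    """Label each head-dependent pair as headRelation[dependentPOS] (assumed format)."""
    relation_of = {index: rel for index, _, _, rel, _ in tokens}
    grams = []
    for _, _, pos, _, head in tokens:
        if head == 0:
            continue  # the root word has no head inside the sentence
        grams.append(f"{relation_of[head]}[{pos}]")
    return grams


print(" ".join(mixed_bigrams(tokens)))
```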
<br />
<span style="font-family: "verdana" , sans-serif; font-size: large;">The output of the ransom note looks like this:</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;"><span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span><span style="font-family: "verdana" , sans-serif;">root[RB] nmod[IN] root[NNS] root[VBP] acl:relcl[NN] dobj[DT]</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">acl:relcl[WDT] root[PRP] dobj[JJ] root[DT] nmod[VBP] ccomp[IN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">ccomp[PRP] root[NNS] dobj[PRP$] cc[CC] root[VBZ] root[PRP] conj[DT]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dobj[RB] dobj[NN] nmod[IN] dobj[PRP$] nmod[DT] nmod[PRP$] root[NN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[PRP] conj[VBP] dobj[PRP$] advcl[IN] advcl[VB] advcl[PRP] conj[NNS]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[DT] conj[PRP] root[JJ] conj[NN] nmod[TO] xcomp[TO] root[VBZ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[CC] root[PRP] root[VB] conj[MD] xcomp[CD] nmod[IN] dobj[$] root[CD]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[PRP$] root[NN] root[PRP] root[MD] nmod[IN] nmod[$] conj[JJ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">conj[NNS] root[CD] nsubj[$] root[$] root[IN] root[CC] conj[DT] root[VB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">conj[$] amod[CD] root[MD] ccomp[IN] ccomp[PRP] root[VBP] dobj[JJ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[DT] dobj[DT] root[JJ] nmod[TO] ccomp[NN] dobj[NN] nmod[IN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[VBP] advcl[PRP] nmod[DT] advcl[NN] nmod[NN] root[NN] root[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[JJ] advcl[WRB] root[MD] dobj[DT] dobj[NN] nmod[CC] nmod[IN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod:tmod[RB] advcl[PRP] advcl[NN] root[NN] root[PRP] advcl[TO] root[VB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dobj[CD] root[MD] nmod[CD] xcomp[VB] root[VBP] advcl[JJ] advcl[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">advcl[IN] xcomp[TO] nsubj[DT] root[NN] root[VB] root[MD] xcomp[VB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">xcomp[PRP] conj[RB] advcl[VBG] root[JJ] nmod[PRP$] advcl[IN] dobj[JJR]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">conj[DT] nmod[DT] root[PRP] root[MD] root[VBP] advcl[PRP] dobj[DT]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dep[PRP] xcomp[TO] dep[RB] nmod[IN] conj[NN] dep[NN] conj[JJR] dobj[CC]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">xcomp[NN] dobj[NN] nmod[IN] nsubj[NNS] nmod[DT] nmod[PRP$] nmod[NN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nsubj[DT] root[NN] nmod[JJ] root[MD] nmod[IN] root[NNS] dobj[PRP$]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[NN] root[PRP] root[VB] nmod[JJ] root[RB] root[MD] xcomp[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[NNS] dobj[PRP$] advcl[VB] advcl[PRP] root[RB] root[VBP] xcomp[RB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">acl[RP] xcomp[TO] nsubj[DT] root[PRP] advcl[IN] nsubj[VBG] nsubj[CD]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">acl[NN] nmod[IN] nmod[VBN] nmod[FW] nmod[NNS] root[VBG] case[IN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[PRP$] nmod[TO] nmod[NNP] nmod[NN] root[NN] csubj[NN] nmod[JJ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[MD] acl[VBG] root[VBP] advcl[PRP] nmod[DT] advcl[VBG] dep[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nmod[TO] dep[NN] root[PRP] advcl[IN] nmod[JJ] advcl[PRP] root[VB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">advcl[NNS] root[PRP] advcl[IN] dobj[NN] advcl[VBN] acl[VBN] advcl[DT]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">acl[IN] advcl[NN] advcl[VBZ] acl[CC] nsubj[DT] root[NN] root[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">advcl[IN] nmod[IN] root[NNS] advcl[DT] advcl[IN] conj[PRP] root[VBZ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[CC] root[PRP] root[VB] nmod[JJ] root[MD] advcl[VBP] conj[VBN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">ccomp[IN] xcomp[PRP] conj[VB] ccomp[NNS] conj[JJ] nmod[IN] nmod[NNS]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">ccomp[PRP] nmod[CC] nmod[NN] root[MD] xcomp[TO] root[CC] root[PRP]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">ccomp[VBP] root[VB] nmod[NNP] root[VBN] xcomp[PRP] acl[NN] dobj[PRP$]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">advcl[VB] advcl[PRP] xcomp[RB] dobj[DT] acl[IN] xcomp[TO] root[NN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[PRP] advcl[IN] dobj[NN] amod[CD] acl[VBP] dobj[VBG] root[NNS]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[VBP] conj[PRP] dobj[DT] acl[IN] conj[NN] acl[NN] root[CC]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dobj[PRP$] dobj[VBG] amod[CD] dobj[NN] root[NNS] root[VBP] cc[RB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nsubj[CC] root[JJ] root[IN] conj[PRP$] cc[IN] root[PRP] conj[DT]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">nsubj[NN] root[RB] dobj[DT] xcomp[TO] root[VB] xcomp[NNP] root[RB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dobj[NN] ccomp[IN] acl:relcl[VBP] root[VBP] root[RB] advmod[IN] root[JJ]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">acl:relcl[JJ] ccomp[MD] ccomp[NN] root[PRP] amod[RB] root[VB] ccomp[VB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[DT] acl:relcl[RB] root[VB] xcomp[PRP] root[RB] root[NNP] nmod[IN]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">dobj[NNP] dobj[DT] root[NN] dobj[JJ] advmod[RB] advmod[PRP] nmod[TO]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: medium;">root[VBZ] root[PRP] root[IN] root[RB]</span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>
<span style="font-family: "verdana" , sans-serif;"><br /></span>
<i><span style="font-family: "verdana" , sans-serif;">Again, there are no words here, just ngrams with syntactic structure that is NON LINEAR, not in the same order as written.</span></i></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;">
<span style="font-family: "verdana" , sans-serif;">Doing this for all the text as before and using the Stylo R Package software gave the following results...</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span>Treating each sngram as a single word in Stylo, over various cutoffs for the most frequent sngrams, gave nearly the same result as using 2-character sequences from the sngrams, which in turn was nearly the same as using 4-character sequences with up to the 500 most frequent character combinations--<b>they all red-flagged Patsy as the most likely author!</b></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkmMqUxRqjrzEF0WRGlRxE5yhmjruzoki9cm-trUyFb048gSfmnxK7e9tBondtENW1qP217Wh9fgT03vRyAj6fTPj73_5aO5PPApvhj8UA4YsMeFAKFYeZvkgCVo1Fy83IosltZAwA59Q/s1600/2gram.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkmMqUxRqjrzEF0WRGlRxE5yhmjruzoki9cm-trUyFb048gSfmnxK7e9tBondtENW1qP217Wh9fgT03vRyAj6fTPj73_5aO5PPApvhj8UA4YsMeFAKFYeZvkgCVo1Fy83IosltZAwA59Q/s640/2gram.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJAK-V68Q_tIrZRWCTCVedOlvaVEbZHj4wIDXPSwAPb-ZX-1fIv77ZJ-2o0KnLZ3E4tJrc1xagqiFtdVn5ln7HfnGgbFuKmWiGhJjMf7BiTXYz1dh_4Ev4G09reHIaPFbIpjGfqs6LU20/s1600/4gramconsesus.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJAK-V68Q_tIrZRWCTCVedOlvaVEbZHj4wIDXPSwAPb-ZX-1fIv77ZJ-2o0KnLZ3E4tJrc1xagqiFtdVn5ln7HfnGgbFuKmWiGhJjMf7BiTXYz1dh_4Ev4G09reHIaPFbIpjGfqs6LU20/s640/4gramconsesus.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEju-AEDYj9C56hhKW_ViBOdosu9u-7nck_Z3sLUppJW6E6yQ4kI4PAle3qPP0SO4PJG41C7xi7gFE3bFiolNlH7Sz-E7e-ofUwXqP9OVUDfjMy_Ubzj5T5l22arAmyqVaofzxIhbBr0Pog/s1600/4grams+detla.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEju-AEDYj9C56hhKW_ViBOdosu9u-7nck_Z3sLUppJW6E6yQ4kI4PAle3qPP0SO4PJG41C7xi7gFE3bFiolNlH7Sz-E7e-ofUwXqP9OVUDfjMy_Ubzj5T5l22arAmyqVaofzxIhbBr0Pog/s640/4grams+detla.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2qHgdAobgVyDIE_CsXYC55RUdzmDqOjY1Xz5rwfT7PidSMKZvH4GYlSMef2jtDK4Bl7nuD_7BVcTjj2WkDXy5NubZtIWn7R_g5nbTOSZdXD9sLwxhzsXd0gFhL858apH2GjXFIi62wP0/s1600/4grams.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2qHgdAobgVyDIE_CsXYC55RUdzmDqOjY1Xz5rwfT7PidSMKZvH4GYlSMef2jtDK4Bl7nuD_7BVcTjj2WkDXy5NubZtIWn7R_g5nbTOSZdXD9sLwxhzsXd0gFhL858apH2GjXFIi62wP0/s640/4grams.jpg" width="640" /></a></div>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span><span style="font-family: "verdana" , sans-serif;">In other words, this was the most stable output of any analysis I have done over a whole range of settings, showing that the sngrams carry the relevant syntactic information, despite the lack of words!</span></span><br />
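To make the feature idea concrete, here is a minimal Python sketch of a character n-gram frequency profile built from sngram tokens. It is not Stylo's own code: the function names are made up for illustration, and it assumes n-grams are counted inside each token rather than across token boundaries, which may differ from how Stylo slices the text.<br />

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text` (empty if too short)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def top_ngram_profile(tokens, n, top_k):
    """Relative frequencies of the top_k most common character n-grams.

    Hypothetical helper: frequencies are relative to ALL n-grams counted,
    so a truncated profile sums to less than 1.
    """
    counts = Counter()
    for token in tokens:
        counts.update(char_ngrams(token, n))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.most_common(top_k)}


# A few sngram tokens copied from the listing earlier in this post.
sample = "root[MD] acl[VBG] root[VBP] advcl[PRP] nmod[DT] advcl[VBG] dep[PRP]".split()

profile_2 = top_ngram_profile(sample, n=2, top_k=5)  # 2-character features
profile_4 = top_ngram_profile(sample, n=4, top_k=5)  # 4-character features
```

Profiles like these, built per text, are the kind of frequency table a stylometry tool compares between authors. Changing `n` and `top_k` corresponds to the "range of settings" mentioned above.<br />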
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">As a final note, I also used the sngrams as input into Jstylo, the authorship attribution software from Drexel University, together with the same Enron Corpus etc from my earlier post. Just like the results above, the sngrams increased the likelihood of Patsy being the author.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Let me know if you have any questions. </span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">A project I have in the pipeline is to use sngrams for lie detection in written statements.</span></span><br />
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;"><br /></span></span>
<span style="font-family: "verdana" , sans-serif; font-size: large;"><span style="font-family: "verdana" , sans-serif;">Coming soon! </span></span>
<span style="font-family: "verdana" , sans-serif;"><br /></span>
Posted by Tom Berger<br />
<br />
<b>Lessons From The Poker Table -- Spot The Liar In The Real World.</b><br />
Published 2016-08-29; updated 2020-04-23<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk0zJVgb7CtGPs86uwKQtlKepmpW4Q1s6ckFaI7nnG-6XLVeNOpDJgt3IAs-jaE9Oji-EQTAFrIpfB7ykQL-HGUPMTRlryER_vcfhYyerDO1hLj1rZl2GjOpMWpm88Q6ocfrKK5E7mgnk/s1600/poker-games.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk0zJVgb7CtGPs86uwKQtlKepmpW4Q1s6ckFaI7nnG-6XLVeNOpDJgt3IAs-jaE9Oji-EQTAFrIpfB7ykQL-HGUPMTRlryER_vcfhYyerDO1hLj1rZl2GjOpMWpm88Q6ocfrKK5E7mgnk/s640/poker-games.jpg" width="640" /></a></div>
<br />
<br />
I have been playing small stakes poker for a few years now, with the original intent to study the cues and tells that come from bluffing to see if I could translate generic observations from the poker table into real world deception detection.<br />
<br />
Reading about a police interview technique developed by Professor R.E Geiselman of UCLA, an expert in detecting lies, gave me my first clue:<br />
<br />
<i>"When asked if they want to add anything, deceptive people tend to say NO quickly whereas truthful people either go ahead and add something new or they at least think about it before saying NO."</i><br />
<br />
It occurred to me that a speeding up and slowing down behaviour translated directly into a specific bluffing scenario that went like this:<br />
<br />
1 -- If you are strong, but bluffing to be weak, what do you do? You look at your cards and then your chips and pretend to think whether you are going to increase your bet. You stall.<br />
<br />
2 -- If you are weak but bluffing to be strong, what do you do? You don't hesitate, you move your chips out quickly.<br />
<br />
This poker table analogy holds directly in a real world scenario: people who move or act too quickly (quicker than their baseline!) are potentially deceptive or lacking confidence. It's a red flag moment at the table and it's a red flag moment in the real world too. Of course, this means you need a baseline: some small talk beforehand that gives you an opportunity to gauge normal behaviour.<br />
<br />
Something else I noted on the poker table was that people who I thought were bluffing seemed to be slightly more friendly, or polite <i>at that point</i>, as if not wanting to antagonise other players or draw attention to themselves.<br />
<br />
Frank Enos from Columbia University in his thesis says:<br />
<i>"Preliminary findings suggested that pleasantness is the most promising factor in predicting
deception..."</i><br />
<i><br /></i>
As expected, there is an overlap between the poker table and real world scenarios, which makes poker a great laboratory for studying human behaviour. It's just a matter of paying attention.<br />
<br />
Posted by Tom Berger<br />
<br />
<b>How to Get your Message Out With Clinton's Media Strategist Idea.</b><br />
Published 2016-08-22<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxA35VIAEa5lbhX_EhCn9pFzrCU_CsPHQzfcmkv2yIzazNgkHmp3QJ0aiZUWRHObQRLLwaXLqoJ4rEjgtqyDSWVmNMwSYCJ89Sv3Ch7wbiK96gxW3E_Q310MptYh-XF8jPv6wiJqrIfIQ/s1600/583556126.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="425" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxA35VIAEa5lbhX_EhCn9pFzrCU_CsPHQzfcmkv2yIzazNgkHmp3QJ0aiZUWRHObQRLLwaXLqoJ4rEjgtqyDSWVmNMwSYCJ89Sv3Ch7wbiK96gxW3E_Q310MptYh-XF8jPv6wiJqrIfIQ/s640/583556126.jpg" width="640" /></a></div>
<br />
An effective linguistic technique for getting your message out comes from Bill Clinton's media strategist, as recounted in Dick Morris's book <i>Behind The Oval Office</i>.<br />
<br />
Clinton was frustrated by the fact that he had created millions of jobs and cut the deficit, but it went largely unnoticed and uncredited.<br />
<br />
His media strategist Bob Squier suggested that two messages be combined, creating a presupposition for one of the messages. The idea, said Squier, was to talk about the jobs that had been created while also talking about what you are going to do.<br />
<br />
Squier continued, "<i>For example, the seven million jobs we've created won't be much use if we can't find educated people to fill them. That's why we want a tax deduction for college tuition to help kids go to college to take those jobs</i>."<br />
<br />
This turned out to be very effective and works because it assumes or presupposes that part of the message is a fact.<br />
<br />
Let's say you want to get the message out:<br />
1 -- This is the world's safest car.<br />
2 -- Now you can afford it.<br />
<br />
Putting the two claims across individually allows someone to dispute <i>both</i> messages.<br />
Instead, combine the two messages into one, such as:<br />
<br />
<b>Now you can afford the world's safest car.</b><br />
<br />
If you disagree with this message, you are disagreeing about the fact that you can afford the car, not that it is the world's safest car, because the safety aspect is now assumed or presupposed.<br />
<br />
In fact, this technique has been used by advertisers and politicians for a long time, and it's been shown to be an effective way to get a message out because in our busy daily lives we assume presuppositions are correct to save ourselves cognitive processing time.<br />
<br />
Posted by Tom Berger<br />
<br />
<b>Miller's Law And Lochte's "Apology".</b><br />
Published 2016-08-19<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhypHrnaWnTKidjX8Dukpghvt2gVPun5_ZJXxdr3qLA1lLSy1rPlsEaIchIiohMHPrJADXKYcup203yMLSZQ56A6YlPscYP07mSPqgjJhSUIQGohjY-O_V_YywjgEY5zAOuwAE22dNeuVg/s1600/2016-08-20_084810.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhypHrnaWnTKidjX8Dukpghvt2gVPun5_ZJXxdr3qLA1lLSy1rPlsEaIchIiohMHPrJADXKYcup203yMLSZQ56A6YlPscYP07mSPqgjJhSUIQGohjY-O_V_YywjgEY5zAOuwAE22dNeuVg/s640/2016-08-20_084810.jpg" width="640" /></a></div>
<br />
Twenty-four hours after I deconstructed Ryan Lochte's lie in my last post, the U.S. swimmers admitted they had lied about the robbery after they trashed a service station.<br />
<br />
Lochte issued an apology -- sort of.<br />
<br />
<span style="background-color: #fefefe; color: #262626; font-family: "cnn" , "helvetica neue" , "helvetica" , "arial" , "utkal" , sans-serif; font-size: 18px; line-height: 30px;">"I wanted to apologize for my behavior last weekend -- for not being more careful and candid in how I described the events of that early morning and for my role in taking the focus away from the many athletes fulfilling their dreams of participating in the Olympics," he said Friday on Instagram.</span><br />
<span style="background-color: #fefefe; color: #262626; font-family: "cnn" , "helvetica neue" , "helvetica" , "arial" , "utkal" , sans-serif; font-size: 18px; line-height: 30px;"><br /></span><span style="font-family: inherit;">
As in lie</span> detection, where you need to listen very carefully to what is said without putting your own interpretation on it, the same applies to listening to an apology.<br />
<br />
What he probably means is he regrets the lack of <i>care</i> in telling his lie which made it so easy to deconstruct.<br />
<br />
<div style="color: rgba(0, 0, 0, 0.701961); line-height: 32px; margin-bottom: 32px;">
<span style="font-family: inherit;"><strong>Miller's Law</strong> by Princeton Professor George Miller instructs us to suspend judgement and not put our own interpretation on something that someone says.</span><br />
<span style="font-family: inherit;"><br />The law states --</span><span style="font-family: inherit;"><i><b>"To understand what another person is saying, you must assume that it is true and try to imagine what it could be true of."</b></i></span><span style="font-family: inherit;"><br /><br />This is a way of stopping you from making a snap judgement and interpreting what someone is saying using your own internal "dictionary", because very often they are using their own different internal "dictionary".</span></div>
Posted by Tom Berger<br />
<br />
<b>Deception Analysis Of Ryan Lochte’s Robbery Account</b><br />
Published 2016-08-18<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJkkdA9o50zGKlPY2obfsv9G1WAePwybqhQDTrEZxfjMOze4S22RgSPol-lRBuIcVQuSKNxvanKCfiJmoilaRLsF57-Sr5QkFXeLnnHGz-FeoiFBDAdX0_7wNIpMqm8dRbgToC4wnTrTI/s1600/ryan-lochte-2016-o-trials-2251-720x500.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="443" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJkkdA9o50zGKlPY2obfsv9G1WAePwybqhQDTrEZxfjMOze4S22RgSPol-lRBuIcVQuSKNxvanKCfiJmoilaRLsF57-Sr5QkFXeLnnHGz-FeoiFBDAdX0_7wNIpMqm8dRbgToC4wnTrTI/s640/ryan-lochte-2016-o-trials-2251-720x500.jpg" width="640" /></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
I've been analysing police criminal statements to determine
verbal/written cues for quite a few years now, with a view to developing
automated software to red flag statement inconsistencies and deception.</div>
<br />
<br />
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;">It turns out that there is evidence that liars
tend to tell a less coherent story, items are more likely to be out of sequence,
they are less likely to include conversations or sensory details such as what
something smelled or looked like, and there are more likely to be
contradictions, and so on.</span><br />
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;"><br /></span>
<br />
<div class="MsoNormal">
My interest was piqued by the controversy arising from the
reported robbery in Rio of 4 U.S. swimmers, and the fact that the judge said
that there are inconsistencies between the swimmers' statements.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
I decided to have a look at Ryan Lochte’s statement if I could
find a direct quote. <br />
This is what Lochte said on NBC Today:<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<i>“<b>We got pulled over</b>, in the taxi, and these guys came out with a
badge,<br />
police badge, <b>no lights</b>, no nothing
just a police badge and <b>they pulled us
over</b>.</i></div>
<div class="MsoNormal">
<i><br />
<b>They pulled out their guns</b>, <b>they told the other swimmers</b> to get
down on the ground — <br />
they got down on the ground. <o:p></o:p></i></div>
<div class="MsoNormal">
<i><br /></i></div>
<div class="MsoNormal">
<i>I refused, I was like
we didn't do anything wrong, so — <br />
I'm not getting down on the ground.</i></div>
<div class="MsoNormal">
<i><br />
<b>And then</b> <b>the guy pulled out his gun</b>, he cocked it, put it to my forehead and
he said,<br />
<b>'Get down,' and I put my hands up</b>, I
was like 'whatever.' </i></div>
<div class="MsoNormal">
<i><br />
<b>He</b> took <b>our money</b>, he took <b>my wallet</b>
— he left my cell phone, he left my credentials.”<o:p></o:p></i></div>
<div class="MsoNormal">
<i><br /></i></div>
<div class="MsoNormal">
<i><br /></i></div>
<div class="MsoNormal">
Looking at some of the most interesting parts of the
statement, Lochte starts off with <b>we</b> got pulled over, and then ends the sentence with an out of sequence “<b>they</b> pulled us over” after telling us what he didn't see.<br />
<br />
When most people are asked an open question, they describe what happened, <i>not</i>
what didn’t happen. <br />
Saying what didn’t happen in response to an open question is called a <i>spontaneous negation</i> by FBI agent John
Schafer in his book and is a red flag in deception.<o:p></o:p></div>
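As a rough illustration of how software could red flag this cue, here is a minimal Python sketch that lists the clauses of a statement containing a negation. The word list and the function name are my own assumptions for illustration. The sketch cannot tell whether the question was open or whether the negation was spontaneous, so it only marks passages for a human to review.<br />

```python
import re

# Assumed, deliberately small list of negation cues; contractions such as
# "didn't" are caught by the second alternative (straight or curly apostrophe).
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|nothing|never|none)\b|\w+n['’]t\b",
    re.IGNORECASE,
)


def spontaneous_negations(statement):
    """Return (clause, negations) pairs for every clause containing a negation.

    Clauses are split naively on commas, full stops and semicolons.
    """
    flagged = []
    for clause in re.split(r"[,.;]", statement):
        hits = [match.group(0) for match in NEGATION_PATTERN.finditer(clause)]
        if hits:
            flagged.append((clause.strip(), hits))
    return flagged


# Part of the NBC quote above, lightly flattened into one line.
quote = ("these guys came out with a badge, police badge, no lights, "
         "no nothing just a police badge")
flags = spontaneous_negations(quote)
# flags holds the two clauses "no lights" and "no nothing just a police badge"
```

On this fragment the sketch flags "no lights" and "no nothing just a police badge", the same passages picked out by hand above.<br />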
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
“....they told the <i><b>other</b></i>
swimmers to get down on the ground...”<o:p></o:p></div>
<div class="MsoNormal">
<br />
Lochte didn’t say “they told us to get down on the ground”, he said the <i>“other
swimmers”</i> were told this. He is isolating himself from the group. It’s no
longer <i>we</i> and <i>us.</i><o:p></o:p></div>
<div class="MsoNormal">
<br />
It seems Lochte is still standing around, with attitude to boot (“I’m not
getting on the ground”).<br />
Only when a gun is put to his forehead and he is told to get down, <i>after</i> the other swimmers had already been told to
get down and had done so, does he react, and <b>at this point</b> <b>he puts his
hands up</b>.<br />
<!--[if !supportLineBreakNewLine]--><br />
<!--[endif]--><o:p></o:p></div>
<div class="MsoNormal">
Then some interesting bits: <i>“And then the guy pulled out his
gun....”</i><br />
<br />
1 – <i>“Then”</i> indicates that some time has passed; perhaps something is being
skipped over.<br />
<br />
2 – <b>“the”</b> is out of context. <i>“And then the guy pulled out his gun, and cocked
it...”</i> Using <b>the</b> in this manner indicates that the gunman is already
known. <br />
<br />
3 – The most glaring problem is that the guns were already out in the earlier
part, but now we are being told <i>“then the guy pulled out his gun”</i>.<br />
<br />
4 – The gunman tells him to get down and <b>then</b> he puts up his hands. <br />
<br />
5 – Lochte portrays himself as a hero by being dismissive towards the
gunman with the <i>“whatever” </i>attitude.<br />
<br />
6 – “He took <b>our</b> money, he took <b>my</b> wallet..”<br />
It wasn’t <b>they</b> took <b>our</b> wallets. Lochte is treated differently again, with his
wallet being taken by the single gunman, while the others had their money
taken.<br />
<!--[if !supportLineBreakNewLine]--><br />
<!--[endif]--><o:p></o:p></div>
<div class="MsoNormal">
This statement is riddled with inconsistencies and red flags and appears very deceptive.<br />
<br />
It would seem something else happened that is being covered up with this “robbery”.<o:p></o:p></div>
<div class="MsoNormal">
Lochte never told the police about the robbery. Afterwards he sent a text message to his mother, who was also in Rio.<br />
<br />
Only when media reports came out via his mother did police get Lochte and
another swimmer, Feigen, in to make statements. Reportedly Lochte’s statement
said there was only one robber involved, while Feigen's statement said there were
several robbers but only one was armed.<br />
<br />
<o:p></o:p></div>
<div class="MsoNormal">
Media report:<br />
” <i>Judge Blanc De Cnop noted that Lochte had
said a single robber approached<br />
the athletes and demanded all their money (400 real, or $124).</i></div>
<div class="MsoNormal">
<i><br />
Feigen's statement said a number of robbers targeted the athletes<br />
but only one was armed, the statement said. Another potential issue<br />
highlighted by judge was the behavior of the athletes on arrival at the<br />
Olympic Village in the aftermath."</i><br />
<br />
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Lochte’s mother played it down, saying, <i>“They just took
their wallets and basically that was it.”</i></div>
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;"><br /></span>
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;">Looking at Lochte’s statement on NBC, there are
many red flags raised, but bringing all the other media reports into the mix
lifts this to another level. </span><br />
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;"><br /></span>
<span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;">I think this whole episode was best summed up by local television news announcer Mariana Godoy --</span><br />
<br />
<i><span style="font-family: "calibri" , "sans-serif"; font-size: 11.0pt; line-height: 115%;">"</span><span style="font-family: "calibri" , sans-serif; font-size: 14.6666669845581px; line-height: 16.8666667938232px;">So the American swimmer lied about the robbery?</span><span style="font-family: "calibri" , sans-serif; font-size: 14.6666669845581px; line-height: 16.8666667938232px;"> He went from one party to another party and didn't want to tell his Mommy about it?"</span></i><br />
<br />
Posted by Tom Berger<br />
<br />
<b>Melania Trump Plagiarises Michelle Obama Speech</b><br />
Published 2016-07-19<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghnt7migw9MmxcIFoURyukS31pQXoXAkI4wTga1_iBjRKF7wrrPkKgwRuChHBKBKRKImL-uwD3ydPwMB26AbVto0B1aAlFpg97dvOMjainDMXX-3AadD1kMwG9Zd4iYOtzmA4Ec7rfEkI/s1600/trump.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghnt7migw9MmxcIFoURyukS31pQXoXAkI4wTga1_iBjRKF7wrrPkKgwRuChHBKBKRKImL-uwD3ydPwMB26AbVto0B1aAlFpg97dvOMjainDMXX-3AadD1kMwG9Zd4iYOtzmA4Ec7rfEkI/s640/trump.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Jarrett Hill first noticed the close similarities of Melania Trump's Speech to Michelle Obama's 2008 speech --</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
https://twitter.com/JarrettHill</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The paragraphs in question are very close:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfZT7E_zTUGnjQV493P2dDkO5oPfailndzQpL3_YWrbba6wDOlqwpEhoyDiUBW81xRa4yzN_uTiCkzaQaezWndiMtxeh0dw6EihcYYxdtoKWrSCVQfi4GxrB3GKsGY9GCB-RC1qih8Eu4/s1600/Cns6js4W8AAWSIq.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="490" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfZT7E_zTUGnjQV493P2dDkO5oPfailndzQpL3_YWrbba6wDOlqwpEhoyDiUBW81xRa4yzN_uTiCkzaQaezWndiMtxeh0dw6EihcYYxdtoKWrSCVQfi4GxrB3GKsGY9GCB-RC1qih8Eu4/s640/Cns6js4W8AAWSIq.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
From: NPR Politics</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Running both <i>full</i> speeches through the anti-plagiarism software Jstylo from Drexel University, along with 60 extra random emails from the Enron corpus to act as a placebo, and using a Bayesian text classifier, gives this:</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc8xDaevGu3qwTGrzDdqArnzBkeZnZIE-knUUNBMDDJ1bolmy_WgvN6db8WVXxgH1USToQTwWRK1mKBGmMxdJmNvVDzeggnhkiuj88q7W_OgqRB3sZguS_axpQ8m3KmtOFs3ahw2JxfAA/s1600/melania11.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc8xDaevGu3qwTGrzDdqArnzBkeZnZIE-knUUNBMDDJ1bolmy_WgvN6db8WVXxgH1USToQTwWRK1mKBGmMxdJmNvVDzeggnhkiuj88q7W_OgqRB3sZguS_axpQ8m3KmtOFs3ahw2JxfAA/s640/melania11.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyfVW1KUKsWIE2jCUB08vapjVALq95zUD9pAgBeMOQuuvoKS3e2qxFTHVBZhvytf_14A3FCdFQpvy22p2xI_ogE7CEGXXuj2_pAT9A0ATOZ83YeRWwS5AMvKlfDxWf8WEEadTPtQj27p8/s1600/melani22a.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="440" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyfVW1KUKsWIE2jCUB08vapjVALq95zUD9pAgBeMOQuuvoKS3e2qxFTHVBZhvytf_14A3FCdFQpvy22p2xI_ogE7CEGXXuj2_pAT9A0ATOZ83YeRWwS5AMvKlfDxWf8WEEadTPtQj27p8/s640/melani22a.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The anti-plagiarism software picks Michelle Obama's 2008 speech as the closest match to Melania's 2016 speech. In this case it is 100% sure.</div>
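For readers wondering what a Bayesian text classifier does under the hood, here is a minimal Python sketch of a multinomial naive Bayes classifier over whole words with add-one smoothing. It is not Jstylo's implementation or its feature set: the class name, the whitespace tokenizer and the toy training texts are all assumptions made for illustration.<br />

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesAuthor:
    """Toy multinomial naive Bayes over word counts, one class per author."""

    def __init__(self):
        self.word_counts = {}        # author -> Counter of words
        self.doc_counts = Counter()  # author -> number of training texts
        self.vocab = set()

    def train(self, author, text):
        words = tokenize(text)
        self.word_counts.setdefault(author, Counter()).update(words)
        self.doc_counts[author] += 1
        self.vocab.update(words)

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each known author."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for author, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[author] / total_docs)
            for word in tokenize(text):
                # Add-one (Laplace) smoothing so unseen words never give log(0).
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[author] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training texts (invented, NOT the real speeches or Enron emails).
clf = NaiveBayesAuthor()
clf.train("speaker_a", "work hard for what you want in life")
clf.train("speaker_b", "please send the quarterly gas report by friday")
guess = clf.predict("you work hard for what you want")
```

The classifier picks whichever candidate gives the text the highest score. Naive Bayes scores tend to be pushed towards certainty on long texts, so a reported 100% is best read as "closest match among the candidates supplied".<br />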
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Trump is running at 76% lies in his statements according to verification website Politifact's truth-o-meter, and his wife seems to have acquired the deception habit too.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
Posted by Tom Berger<br />
<br />
<b>The Lying Game -- U.K. Police Statement Video</b><br />
Published 2016-07-17<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj028NyFS8m9v_03dJFKTMuyUyMmf-MHNAJZsECMIN82MBAmHb1dhfRnw1DuNWG8Ghe_4Rvb8zEI_LS_UAS2ptGfnMKmq_V_qBxgAQCI_o6eL5O91blR-UBlkQB_CVST5i90uXNM4p6Tgw/s1600/2016-07-18_154425.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="350" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj028NyFS8m9v_03dJFKTMuyUyMmf-MHNAJZsECMIN82MBAmHb1dhfRnw1DuNWG8Ghe_4Rvb8zEI_LS_UAS2ptGfnMKmq_V_qBxgAQCI_o6eL5O91blR-UBlkQB_CVST5i90uXNM4p6Tgw/s640/2016-07-18_154425.jpg" width="640" /></a></div>
<br />
<br />
A superb 1 hour show in which U.K. police and a forensic psychologist use statement analysis of public TV appeals to spot the liar in murder cases.<br />
<br />
<a href="https://www.youtube.com/watch?v=G5cBUr__1Sk">https://www.youtube.com/watch?v=G5cBUr__1Sk</a><br />
<br />
Take note how people close their eyes in a blocking action when they don't like what they hear.<br />
Notice too how the U.K. police seat suspects on a couch instead of at a table and chair, making it much easier to watch non verbals.<br />
<br />
In particular, when people lie, there is a "cognitive overload", and instead of making the gestures which are normal and feature in most honest conversation, the body "locks down" and doesn't move.<br />
<br />
Liars rehearse lies, but seldom rehearse gestures.<br />
<br />
A terrific show.<br />
<br />
Posted by Tom Berger<br />
<br />
<b>Unmasking The JonBenet Ransom Note With Stylometry Software (new additions July 2018)</b><br />
Published 2016-07-12; updated 2018-07-02<br />
<br />
The tragic and bizarre murder of JonBenet Ramsey is now 20 years old. The ransom note from this case is analysed using the latest stylometric software to determine its authorship.<br />
<br />
The program Jstylo has Writeprints as its backbone, which <i><b>"<span style="background-color: white; color: #333333; font-family: "arial" , sans-serif; font-size: 14px; line-height: 20px;">automatically extracts thousands of multilingual, structural, and semantic features to determine who is creating 'anonymous' content online. Writeprint can look at a posting on an online bulletin board, for example, and compare it with writings found elsewhere on the Internet. </span></b></i><br />
<i><b><span style="background-color: white; color: #333333; font-family: "arial" , sans-serif; font-size: 14px; line-height: 20px;"><br /></span></b></i>
<i><b><span style="background-color: white; color: #333333; font-family: "arial" , sans-serif; font-size: 14px; line-height: 20px;">By analyzing these certain features, it can determine with more than 95 percent accuracy if the author has produced other content in the past." <a href="https://www.nsf.gov/news/news_summ.jsp?cntn_id=110040">(University of Arizona)</a></span></b></i><br />
<br />
The software uses "<span style="background-color: white; color: #333333; font-family: "arial" , sans-serif; font-size: 14px; line-height: 20px;"><i><b>cutting-edge technology and novel new approaches to track their moves online, providing an invaluable tool in the global war on terror"</b></i> . <a href="https://www.nsf.gov/news/news_summ.jsp?cntn_id=110040">(University Arizona)</a></span><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUb7sLX4Nhtv7FUv0kPDboXZD2fz79tOC4ytrjDdwKVJyjuJ3zjm8BaPzbSgPseEVVnD5s5d63CATk5ULJXf1Wq0XelHKyb1DxoD5ylIqiBUToaopgtT4_rrP7Ro3KtpPiRY_8BHuIL04/s1600/spider22.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="352" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUb7sLX4Nhtv7FUv0kPDboXZD2fz79tOC4ytrjDdwKVJyjuJ3zjm8BaPzbSgPseEVVnD5s5d63CATk5ULJXf1Wq0XelHKyb1DxoD5ylIqiBUToaopgtT4_rrP7Ro3KtpPiRY_8BHuIL04/s640/spider22.jpg" width="640" /></a></div>
<br />
<br />
Over the years there have been dozens of handwriting studies done, but considering that this long, rambling, strange and bizarre ransom note was designed to disguise handwriting (the letter <b>a</b> for example <i>changes 6 times in its construction</i>), logically there would never be a match that would stand up in court.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdTk5b-YqvBJstjMZ-bH78DaSddTQBhUxRX3HB4rPuNKPxFw5lAdudSmy6Mbvys7eOcXXAClC-jU4HRb5kGhwUoy4vO7j2Gb4tY9q8mfw0WhqwOL9rlJmi27SshXyjD9WILpYUYUWFUI8/s1600/41kULu3BdbL._SX331_BO1%252C204%252C203%252C200_.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdTk5b-YqvBJstjMZ-bH78DaSddTQBhUxRX3HB4rPuNKPxFw5lAdudSmy6Mbvys7eOcXXAClC-jU4HRb5kGhwUoy4vO7j2Gb4tY9q8mfw0WhqwOL9rlJmi27SshXyjD9WILpYUYUWFUI8/s400/41kULu3BdbL._SX331_BO1%252C204%252C203%252C200_.jpg" width="267" /></a></div>
<br />
<br />
Drexel University released anti-plagiarism software called<b> Jstylo</b>, which piqued my interest in this murder case and the ransom note.<br />
<br />
<b>There are 375 words in the ransom note. Forsyth and Holmes show that a minimum of 250 words is required to attribute a document to an anonymous author. </b><br />
<span style="font-size: x-small;">R. S. Forsyth and D. I. Holmes, “Feature-finding for text classification,”
Literary and Linguistic Computing, vol. 11, no. 4, pp. 163–174, 1996</span><br />
<br />
This made it viable to test the ransom note against writing from Patsy and John Ramsey.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhcExsAbY1dNWtPNQuSgcYsvpojrQXa9fpKMXNJCKSyatmsazG0Vlinkm1_A0V1BitXYe4Spy8bjvMDst2BHt6VtY7FeYMIVhQ22uJzbY7dz_buHgCuhYV89VJwxBQ9j_s8Q2iJlzeduI/s1600/jjjjj.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhcExsAbY1dNWtPNQuSgcYsvpojrQXa9fpKMXNJCKSyatmsazG0Vlinkm1_A0V1BitXYe4Spy8bjvMDst2BHt6VtY7FeYMIVhQ22uJzbY7dz_buHgCuhYV89VJwxBQ9j_s8Q2iJlzeduI/s400/jjjjj.jpg" width="310" /></a></div>
<br />
<br />
The software has been shown to be effective with accuracy rates of around 80% in identifying anonymous users on hack forums, with probabilities rising to 93%-97% accuracy in identifying a target document from among 50 authors (Abbasi and Chen). Rates drop to around 90% for 100 authors.<br />
<br />
It has also been used to identify the authorship of programming source code.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpgQzcqFB65BV47xUNTc0XhyO7W-bFtlKqCDJA7cvyH0ijWZhU-3pntLWSKRsOkqvk7HIdeRtlGJ4CVYIAisnEThPNCXSRs_F8ZOrupatiLKWnG8rtCjRhSnoXGctvV53m4yxwEyEcoHA/s1600/zzzz.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpgQzcqFB65BV47xUNTc0XhyO7W-bFtlKqCDJA7cvyH0ijWZhU-3pntLWSKRsOkqvk7HIdeRtlGJ4CVYIAisnEThPNCXSRs_F8ZOrupatiLKWnG8rtCjRhSnoXGctvV53m4yxwEyEcoHA/s640/zzzz.jpg" width="436" /></a></div>
<br />
<br />
<br />
I downloaded their software<b> </b><a href="https://psal.cs.drexel.edu/index.php/JStylo-Anonymouth">JSTYLO</a><br />
<br />
This is superb for a few reasons: it has embedded in it <i>WEKA</i>, an incredible data mining suite from the University of Waikato, NZ, and also <i>WRITEPRINTS</i>, the gold standard forensic stylometric characteristic generator for author identification, with an automated interface.<br />
<br />
Combined together, over 800 variables are created by Writeprints limited for each piece of text, which is then analysed by Weka for a linguistic "fingerprint" amongst all the test samples you give it.<br />
<br />
<b>Stylometry is the statistical analysis of writing <i>style</i> to identify authorship.</b> <b>This style involves many "invisible" words such as articles, function words, adverbs and pronouns which become unique to us as we develop our writing style; it is not just a frequency count of obvious words. The hidden unconscious aspect of this makes it ideal for computer analysis.</b> (James Pennebaker, <i>The Secret Life Of Pronouns</i>)<br />
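<br />
To make the idea of "invisible" words concrete, here is a minimal sketch of a function-word profile. It is not what Writeprints does internally; the short word list and the sample sentence are made up for illustration, and a real feature set would use hundreds of function words plus many other measures.<br />

```python
import re
from collections import Counter

# Illustrative list only (an assumption); real feature sets are far larger.
FUNCTION_WORDS = ["the", "a", "an", "and", "of", "to", "in",
                  "that", "it", "i", "you", "we", "not", "hence"]


def function_word_profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {word: counts[word] / total for word in FUNCTION_WORDS}


# Made-up sample sentence, not taken from any case document.
sample = "The weather was cold, and hence we stayed in the house and read a book."
profile = function_word_profile(sample)
print({word: round(freq, 3) for word, freq in profile.items() if freq > 0})
```

Two writers can cover the same topic with the same obvious words and still produce quite different profiles of this kind, which is the property stylometry relies on.<br />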
<br />
There was always suspicion on the parents because of their strange behaviour. There is interesting video interview footage on the internet, where they ask Patsy if she would take a lie detector test and she says she would whilst simultaneously shaking her head in a no motion, a classic incongruity between what was said and done (see former FBI agent Joe Navarro's book <i>What Everybody Is Saying</i> for more on this non verbal cue).<br />
<br />
<b>Deceptive people use language differently to innocent people</b> (see ten Brinke and Porter, <i>Psychology, Crime & Law 2015</i>). Another interesting study on language changes in deception concerns Dutch Professor Diederik Stapel, who reported false data in 25 of his academic papers. The study compared his 25 fraudulent papers with his 25+ legitimate papers. <a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0105937">Academic Fraud Study</a><br />
<br />
The outcome: <i>"<span style="background-color: white; color: #333333; font-family: "arial"; font-size: 13px; line-height: 18px;"><b>This research supports recent findings that language cues vary systematically with deception, and that</b> <b>deception can be revealed in fraudulent scientific discourse.</b>"</span></i><br />
<br />
For this post, I will only look at the stylometric aspect of the ransom note. A future post will look at the linguistics of this case.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinR-oG81vxdBFqFlWe2wl5euQ66Msi2tn1wppDEi-Y611fNyYyfzloi87y7syGjSGZV4-Qe2z2ftBaHXxb_x_iRU8Z3CPp_nZan6Y-mylclfYOyzWCQ5-AaV9GibmBwi3_6T7Yq6o4iXE/s1600/NewNoteScan-p1-Crop-706x1024.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinR-oG81vxdBFqFlWe2wl5euQ66Msi2tn1wppDEi-Y611fNyYyfzloi87y7syGjSGZV4-Qe2z2ftBaHXxb_x_iRU8Z3CPp_nZan6Y-mylclfYOyzWCQ5-AaV9GibmBwi3_6T7Yq6o4iXE/s640/NewNoteScan-p1-Crop-706x1024.jpg" width="440" /></a></div>
<br />
Above: page 1 of the two and a half page ransom note.<br />
<br />
I located 5 notes written by Patsy Ramsey, including 1995 and 1996 Xmas notes. I haven't had much luck locating anything sizable written by John Ramsey, however.<br />
<br />
But I needed a placebo--lots of random notes and emails to test against.<br />
<br />
Many universities are using the Enron Email Corpus from Carnegie Mellon--https://www.cs.cmu.edu/~./enron/<br />
<br />
The email servers were seized during the Enron fraud trials where a dozen executives went to jail. After the court case, the emails were acquired by a university and have been made available for various political/social studies. It is the largest email corpus (1.5 million emails) which shows day to day life in a large corporation.<br />
<br />
The emails make a perfect training set, and have been used as that in various studies, as well as creating models such as being able to identify male and female writing with 80% probability.<br />
<br />
<b>Schein and Caver show that attribution accuracy is greatly affected by topic. I've tried to avoid this by using the Enron dataset, whose topics vary greatly.</b><br />
<br />
<a href="http://stealthserver01.ece.stevens-tech.edu/gendercreatetext?count=9885">http://stealthserver01.ece.stevens-tech.edu/gendercreatetext?count=9885</a><br />
<br />
The reason the Enron corpus is being used by the University of British Columbia and others for language and social engineering studies is that Enron was in effect a small city -- it was a vast corporate structure that had thousands of daily emails on all subjects, from business, to small talk to flirting to deception.<br />
<br />
I downloaded the Enron corpus and randomly selected about 60 emails and added the Ramsey letters mentioned above.<br />
<br />
All this was put into Jstylo, the authorship attribution software.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheIvoUAX5JjgQWCwoBobHl8JGGiFYAlKUlespjpThBfcm2yqk8ksMTNJPA-SXJYuHeqzvH_TAJjPrPrTx1G6zqJBAKbpfRA0m9duOh3bb6Q0_zQfJVHek_9VZcA2lNPn2TXEi9e5V2ml0/s1600/jj1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="604" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheIvoUAX5JjgQWCwoBobHl8JGGiFYAlKUlespjpThBfcm2yqk8ksMTNJPA-SXJYuHeqzvH_TAJjPrPrTx1G6zqJBAKbpfRA0m9duOh3bb6Q0_zQfJVHek_9VZcA2lNPn2TXEi9e5V2ml0/s640/jj1.jpg" width="640" /></a></div>
<br />
<br />
The ransom note was put in the Test side; the 80 odd emails and texts were put in the Training side.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7Nrb8ew7-J7Xf6gR_Bqyrsq6f9JJOdYOlJG914WnNqS-Bi90WnUPhiAp2oJyaOYcgxY4RPGiYjjcLfCrWZf_6T6M0ahq89GKtk7voosNc74Io5bTqXO-rUjVTBhKVSv4l-9IgWLZwPcM/s1600/jj2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7Nrb8ew7-J7Xf6gR_Bqyrsq6f9JJOdYOlJG914WnNqS-Bi90WnUPhiAp2oJyaOYcgxY4RPGiYjjcLfCrWZf_6T6M0ahq89GKtk7voosNc74Io5bTqXO-rUjVTBhKVSv4l-9IgWLZwPcM/s640/jj2.jpg" width="640" /></a></div>
<br />
With all the emails and ransom note loaded in, I went to the data mining section and selected an algorithm with the least error <i>after cross validation</i> which looks for similarity between the writing samples.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9rblmjf26mnZXFCk7trwGvHfJ3KelnZspyTUP9mwZ_cd0aFKmkmrUQiiT8EpESKWkxfgYa9hyphenhyphenK2NVlCOKXzZtJIV2QxWArK7QDxP916YrzckebhVzGeEF8jRFki_xZcS21mmDot4XK8w/s1600/jj3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="604" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9rblmjf26mnZXFCk7trwGvHfJ3KelnZspyTUP9mwZ_cd0aFKmkmrUQiiT8EpESKWkxfgYa9hyphenhyphenK2NVlCOKXzZtJIV2QxWArK7QDxP916YrzckebhVzGeEF8jRFki_xZcS21mmDot4XK8w/s640/jj3.jpg" width="640" /></a></div>
<br />
The next step was to run the "trained" model (trained on Enron and Patsy writing) on the test writing (ransom note) and look for the closest match. In effect I am asking it--which text does this ransom note look most like? <br />
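<br />
The question "which text does this ransom note look most like?" can be sketched in a few lines. This is only an illustration of closed-set matching under my own assumptions: it swaps in character trigram counts and cosine similarity in place of the Writeprints features and Weka classifiers that Jstylo actually uses, and the author names and texts are invented.<br />

```python
import math
import re
from collections import Counter


def char_trigram_vector(text):
    """Count overlapping character trigrams, a common stylometric feature."""
    cleaned = re.sub(r"\s+", " ", text.lower())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_authors(training, test_text):
    """Rank candidate authors by similarity of their pooled text to the test text."""
    test_vec = char_trigram_vector(test_text)
    scores = {
        author: cosine(char_trigram_vector(" ".join(docs)), test_vec)
        for author, docs in training.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented training texts for two hypothetical authors.
training = {
    "author_a": ["We went to the market and hence we were late.",
                 "We hope that you are all well."],
    "author_b": ["Quarterly revenue figures attached.",
                 "Pls review pricing model asap."],
}
ranking = rank_authors(training, "We were tired and hence we went home.")
print(ranking)
```

Note that a ranking like this always names a closest author, even when the true author is not among the training samples.<br />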
<br />
Writeprints creates 800 variables per document, creating a "sliding window" as it analyses a broad range of text characteristics.<br />
<br />
Result--Patsy Ramsey at 75%.<br />
<br />
I ran it again with different emails and text, and then different data mining algorithms, same result.<br />
<br />
There is some good advice on which classifier to pick by <a href="http://blog.echen.me/2011/04/27/choosing-a-machine-learning-classifier/">Edwin Chen</a>.<br />
<br />
<br />
<b>More....</b><br />
<b><br /></b>
Patsy died of cancer in 2006, and that is probably why the Police Commissioner Mark Beckner said they don't expect to make any arrests in the future, even though the case is still open.<br />
<br />
Police Chief Mark Beckner did participate in an interview on Reddit, and one of the questions that always stuck out to me was this:<br />
<br />
<div style="background-color: white; color: #212121; line-height: 32px; margin-bottom: 24px;">
<span style="font-family: inherit;"><i>Q: “When Patsy wrote out the sample ransom note for handwriting comparison, it is interesting that she wrote “$118,000” out fully in words (as if trying to be different from the note). <br />Who writes out long numbers in words? Does this seem contrived to you?”</i></span><br /><i style="background-color: transparent; font-family: inherit;">Beckner: “The handwriting experts noted several strange observations.”</i></div>
<br />
<br />
<b>Update 1: Sept 2016</b><br />
<b><br /></b>
It has been pointed out to me by two people, <a href="https://solvingjonbenet.blogspot.com.au/">DocG</a> and also Eve Berger (no relation) from LinkedIn, that John Ramsey was also reported as having used the notorious <i>"and hence"</i> in an interview. I did find a transcript of this interview with both John and Patsy talking to student journalists, including an incredible part where Patsy says, <i>"...Even If We Are Guilty.....".</i><br />
<br />
Shades of O.J Simpson and <i>"If I Did It.....".</i><br />
That's worth a look all on its own, which I'll do in the next day or two.<br />
<br />
<a href="http://www.forumsforjustice.org/forums/showthread.php?10314-Newseum-Ramsey-Interview-(J-and-P-interview-transcripts-from-Journalism-Class-visit)">John + Patsy Transcript</a><br />
<br />
But, how unusual is <i>"and hence"</i>? Well, using the Google Ngram viewer, which searches books from 1800 to 2008, here's a graph I made:<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4gs6Udi-Cu0Lz19NE8uBgXdIF5V2WAccaZ1X-eAMqQTwvF__Tog0ycj1rhbPN-tL-_fUYTXpa3odAg_-vUv-N2EcpHN-c75NC9yAGCLOvzG1OuAs2QOvNGCDR17AdlMziJKauYPbyROw/s1600/hence.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4gs6Udi-Cu0Lz19NE8uBgXdIF5V2WAccaZ1X-eAMqQTwvF__Tog0ycj1rhbPN-tL-_fUYTXpa3odAg_-vUv-N2EcpHN-c75NC9yAGCLOvzG1OuAs2QOvNGCDR17AdlMziJKauYPbyROw/s640/hence.jpg" width="640" /></a></div>
<br />
Very uncommon, it would seem.<br />
I will look into the linguistics using the interview material soon, referencing some of the recent <i>automated </i>deception detection methods.<br />
<br />
<b><br /></b>
<b>Update 2:</b><br />
<br />
<b>I've had a few questions about Jstylo.</b><br />
<b><br /></b>
<b>Let's get something out of the way, DocG asked me to use <i>his text to test,</i> which turned out to be speech, a NO-NO. The results didn't work because it should be speech to speech, text to text. I told him this when I found out, and said it wasn't<i> valid</i>. He couldn't accept it because of the <a href="https://youarenotsosmart.com/2011/03/25/the-sunk-cost-fallacy/">Sunk Cost Fallacy.</a> He loved the outcome because he thought he had found a weak link. </b><br />
<br />
<b>DocG said he uses "instinct", "intuition" and "social research experience". I told him I was only interested in EMPIRICAL results against his "intuition", so we agree to disagree.</b><br />
<br />
1 -- Firstly, Jstylo is a <i>closed system</i>.<br />
This means that the suspect must be among the text samples you are analysing. The software will pick the closest match.<br />
<br />
2 -- <i>Speech with speech and text with text</i>. People use language differently when they talk compared to how they write. Different parts of the brain are used for speech and writing. If you want to identify speech, use all speech as your input. If you want to identify written text, all your inputs should be text.<br />
<br />
The Pennebaker text analysis software LIWC has frequency averages over many thousands of samples for blogs, speech, newspapers etc. This program shows the dramatic and consistent differences between speech and written text, see below for average frequencies.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVe4P0ZhpgEuHnuAg-7UD-dYN9opQdc1dMqSXMa6suyGQvikq9ozerJy-4j3AG1TxEKbUjI0SMc5r_yIfL7KlJ8P9d_j4aPdo6lgCPrwLbvgu-aS56dtRDbCQVPF4aXJC7WdJ1F7zNMDk/s1600/table.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="210" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVe4P0ZhpgEuHnuAg-7UD-dYN9opQdc1dMqSXMa6suyGQvikq9ozerJy-4j3AG1TxEKbUjI0SMc5r_yIfL7KlJ8P9d_j4aPdo6lgCPrwLbvgu-aS56dtRDbCQVPF4aXJC7WdJ1F7zNMDk/s320/table.jpg" width="320" /></a></div>
<br />
<br />
3 -- Generally, the more text samples that you have from your target, the better. The recommended amount of text to identify a target document is 550 words, but Forsyth and Holmes show that 250 words is a minimum. For the various authors to test against, about 5000+ words are recommended.<br />
<br />
I have been reading a study where <i>reviewers</i> <i>on YELP are linked (identified) and where the reviews average only 149 words in length: </i>https://arxiv.org/pdf/1405.4918.pdf<br />
I don't have more details on this.<br />
<br />
4 -- There seems to be a way to create an <i>open</i> system with Jstylo, where if it doesn't identify an author, it won't just point to the closest match, but will come up with <b>unknown</b>.<br />
http://www.stolerman.net/papers/classify_verify_ifip-wg11.9-2014.pdf<br />
I don't have more details on this.<br />
<br />
5 -- Jstylo is not a black box, it is an automated GUI or interface combining established open source software:<i> JGAAP, Writeprints and Weka.</i> Writeprints uncovers writing characteristics. Input features can be added or removed, and the spreadsheet can be exported showing the most significant variables.<br />
<br />
6 -- News, Academic papers and Security Conferences using Jstylo around the world:<br />
https://psal.cs.drexel.edu/index.php/Main_Page<br />
<br />
7 -- All software works on this principle:<br />
garbage in = garbage out<br />
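<br />
Points 1 and 4 above describe the difference between a closed system and an open one. As a rough sketch of the open idea only, and not the method in the linked paper, a decision rule can refuse to name anyone when the best score is weak or too close to the runner-up. The threshold and margin values below are arbitrary assumptions.<br />

```python
def attribute(scores, threshold=0.5, margin=0.1):
    """Return the best-scoring author, or 'unknown' when the evidence is weak.

    scores maps author name to a similarity score between 0 and 1.
    threshold and margin are hypothetical cut-offs chosen for illustration.
    """
    if not scores:
        return "unknown"
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_author, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Reject when the top score is low or barely ahead of second place.
    if best_score < threshold or best_score - runner_up < margin:
        return "unknown"
    return best_author


print(attribute({"author_a": 0.82, "author_b": 0.41}))
print(attribute({"author_a": 0.44, "author_b": 0.41}))
```

A closed system corresponds to always returning the top-ranked name, whatever the scores look like.<br />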
<br />
<br />
<b>Ransom Note Contradictions</b><br />
<br />
The writer of the ransom note probably <i>did not</i> commit the murder, although they were part of the cover up. The note is a contradictory and naive attempt to use psychological misdirection to point the investigators in another direction.<br />
<br />
First it's a "faction" (a small dissenting group within a larger group??), then there's a suggestion it may be someone at John Ramsey's workplace who is aware of his exact Christmas bonus, there are numerous movie quotes in an effort to appear<i> more criminal</i>, and a psychological attempt to issue a secondary threat of not releasing the body for "proper" burial because the writer knew the child was already dead.<br />
<br />
<b>The numerous contradictions involve telling a sleeping person to be <i>well rested</i>, not realising a kidnapper doesn't <i>deliver</i> a victim, crossing out <i>deliver</i> then using the word <i>pickup</i>.</b><br />
<br />
<b>There is also the issue of a kidnapper calling between 8.00-10.00am with delivery instructions, yet banking hours start at <a href="http://www.bank-locations.com/usbank_locations/30865-Boulder-CO.html">9.00am</a>, and the option of withdrawing the money earlier for an earlier delivery/pickup phone call from the kidnapper!</b><br />
<br />
The CBS show established the murder weapon was the flashlight. The expert forensic pathologist was able to show that a 10 year old child could create the exact injury (same hole dimensions too) on a human skull with pigskin using the flashlight. The flashlight belonged to the Ramsey household, yet had been wiped of prints, <i>as well as the batteries</i>. The motivation to wipe the batteries clean becomes clear if you think about <i>guilty knowledge</i>.<br />
<br />
Pathologist Dr. Werner Spitz said that the child was brain dead from the blow to the skull, so the intricate garrote was theatrical misdirection to shift attention away.<br />
<br />
The Ramseys themselves ignored nearly all the instructions on the note: they phoned the police, they invited friends over, John sent his friend to the bank, they had no concern when the telephone call deadline passed without incident, and so on.<br />
<br />
<b>Guilty knowledge relies not on lying but <i>recognition</i> of information you shouldn't know with resultant anomalous behaviour.</b><br />
<br />
<b>911 Call</b><br />
<br />
The 911 call also stood out in using the strange phrase, "<i>We have a kidnapping</i>..."<br />
Many 911 calls are used to set up an alibi.<br />
This one is no exception, IMO.<br />
<br />
Check out FBI research on guilty and innocent 911 calls and their checklist.<br />
https://leb.fbi.gov/2008-pdfs/leb-june-2008<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY1Bv1kj52dTS5BHvzCfO7KOaRKf8jz3E0X5OodrMe2uuIXDPP9E_4a5jYQsB02go6qustr-yRYputV89xVTeplGiacYaPQQ3NlzpzgNwmY6MHYRaw1dcYGgUdDEWSGx2VWvVmWqdKUus/s1600/zzzzzz.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="283" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY1Bv1kj52dTS5BHvzCfO7KOaRKf8jz3E0X5OodrMe2uuIXDPP9E_4a5jYQsB02go6qustr-yRYputV89xVTeplGiacYaPQQ3NlzpzgNwmY6MHYRaw1dcYGgUdDEWSGx2VWvVmWqdKUus/s400/zzzzzz.jpg" width="400" /></a></div>
<br />
<br />
<br />
<br />
Porter and ten Brinke 2015 note that females give off more guilty verbal cues than males, and that is certainly the case here with Patsy giving more red flag cues over the course of the investigation, particularly in her video interviews and her statements. Automated software using verbal and written analysis also confirms this.<br />
<br />
<br />
<b>Update 2 Sept 2016:</b><br />
<b>2nd Jstylo Run</b><br />
<br />
I have been studying and testing more of the Jstylo software capabilities over the last week. I've decided to run it again over different training samples instead of Enron.<br />
<br />
Drexel University provide different problem sets, and there is one with a couple of dozen authors, each with 4 or 5 pieces of text to test against.<br />
<br />
I used 2 of the top classifiers here, Weka's SMO and Random Forest with 300 trees, on a shortened version of Writeprints called Writeprints Limited.<br />
<br />
I included 2 of Patsy's known texts, and John Ramsey's written speech from when he was running for office in Michigan.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4TpI2fHneQnYPQmA77Yidb7Flhygt3XabDD89Vi2XCt-fbVXTkwxQDqDbqzC0kCWcb-NX88uETcQCKjlv5t_yltQ5cjAsKC6KexSBZ7mFoxMacuXiSwmk8n3UqS5AJOKSPEFl74SLUZQ/s1600/new.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="324" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4TpI2fHneQnYPQmA77Yidb7Flhygt3XabDD89Vi2XCt-fbVXTkwxQDqDbqzC0kCWcb-NX88uETcQCKjlv5t_yltQ5cjAsKC6KexSBZ7mFoxMacuXiSwmk8n3UqS5AJOKSPEFl74SLUZQ/s640/new.jpg" width="640" /></a></div>
<br />
Using different classifiers and different training authors from my first test, I got the same results with Patsy leading the pack in both classifiers and John Ramsey barely moving the needle. I removed each of the four texts from Patsy one at a time and retested, and each text made a difference -- <b>each written text from Patsy contributed something to the classification.</b> These are not probabilities, but ranking results.<br />
<br />
Patsy has linguistic fingerprints on the ransom note. Even a visual examination shows she uses <b>exact</b> whole sentence structures, not just the words "and hence".<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZKbZKMs67E5Yaj924sDpYvVXloK-5VbSsiOeRCK7MK_qMUhzyCjeQtNPOciqG1fgGVWhVFly_Q0l7ZkDzr5sg0GHu0mKr179HZIacxqfQlx98juzxmsCAyOHHvODAUDq5N0bB87v23Gk/s1600/sentence.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZKbZKMs67E5Yaj924sDpYvVXloK-5VbSsiOeRCK7MK_qMUhzyCjeQtNPOciqG1fgGVWhVFly_Q0l7ZkDzr5sg0GHu0mKr179HZIacxqfQlx98juzxmsCAyOHHvODAUDq5N0bB87v23Gk/s640/sentence.jpg" width="640" /></a></div>
<br />
The first sentence is from the ransom note, the second is from her Christmas note to friends. The word <i>delivery</i> was crossed out and <i>pickup</i> was added when the author realised that a kidnapper would not deliver the kidnap victim back, but would phone to say where the victim could be found.<br />
<br />
The complete sentence structure is identical, on each side of "and hence". It is part of her "linguistic fingerprint", besides all the invisible characteristics that get picked up by the Writeprints software.<br />
<br />
Different software, different analysis--<br />
<b><i>Different Ransom Notes</i> Comparisons Using Linguistic Inquiry and Word Count software</b><br />
<br />
Also known as LIWC, this software from psychologist James Pennebaker from the University of Texas has been well validated and used in many studies, over 6000 on Google Scholar, to date.<br />
<br />
According to Tausczik and Pennebaker:<br />
<i>"LIWC is a transparent text analysis program that counts words in psychologically meaningful categories. Empirical results using LIWC demonstrate its ability to detect </i><i>meaning in a wide variety of experimental settings, including to show attentional focus, emotionality, social relationships, thinking styles, and individual differences."</i><br />
<i><br /></i>
LIWC has been used in various studies, from assessing depression to deception detection (Newman Pennebaker).<br />
<br />
Of interest to me is the Gender analysis, again from Tausczik and Pennebaker:<br />
<br />
<i>"Sex differences in language use show that women use more social words and references to others, and men use more complex language. A meta-analysis of the texts </i><i>from many studies shows that the largest language </i><i>differences between males and females are in the complexity of the language </i><i>used and the degree of social references (Newman, Groom, Handelman, & Pennebaker, 2008). </i><i>Males had higher use of large words, articles, and prepositions. </i><br />
<i>Females had higher use of social words, and pronouns, </i><i>including first-person singular and third-person pronouns."</i><br />
<i><br /></i>
I located 2 more actual ransom notes, the longest ones I could find. These are the <a href="https://en.wikipedia.org/wiki/Barbara_Mackle_kidnapping">Barbara Mackle kidnapping</a> and the <a href="https://en.wikipedia.org/wiki/Leopold_and_Loeb">Leopold and Loeb</a> kidnapping. All the kidnappers were caught and convicted and were men.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjztCVL9vZk1kxb3Ikou0UwOscet4PjDo7kWMozW1DHRfGeuApw0gvhvk7zulfjOJ4ymFywH-eGBOW3Z8EJ-aCGaX3P7FL3GdeVhvFfWQUcS2MQeynD5dvF_aAPFCrT97QNETRPlyrGZtA/s1600/spreadsheet.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="428" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjztCVL9vZk1kxb3Ikou0UwOscet4PjDo7kWMozW1DHRfGeuApw0gvhvk7zulfjOJ4ymFywH-eGBOW3Z8EJ-aCGaX3P7FL3GdeVhvFfWQUcS2MQeynD5dvF_aAPFCrT97QNETRPlyrGZtA/s640/spreadsheet.jpg" width="640" /></a></div>
<br />
<br />
LIWC was run on all the ransom notes, as well as on 4 notes written by Patsy, whose results were averaged.<br />
<br />
As per Pennebaker above, the Mackle and Leopold notes have no <i>I</i> pronoun and lower <i>We</i> <i>He She</i> pronouns. Women use fewer articles, and again the Mackle and Leopold notes have more articles. Women use more social words, and the JonBenet note has very high social language.<br />
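<br />
For readers who have not used LIWC, the kind of number in the spreadsheet above is simply the percentage of words in a text that fall into a category. The sketch below shows that calculation only. The real LIWC dictionaries are proprietary and far larger, so the tiny word lists here are my own assumptions, and the sample sentence is invented.<br />

```python
import re

# Tiny stand-in word lists (assumptions), not the actual LIWC dictionaries.
CATEGORIES = {
    "i": {"i", "me", "my", "mine"},
    "we": {"we", "us", "our", "ours"},
    "article": {"a", "an", "the"},
    "social": {"family", "friend", "friends", "daughter", "talk", "you", "your"},
}


def category_percentages(text):
    """Return, for each category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {
        name: 100.0 * sum(token in words for token in tokens) / total
        for name, words in CATEGORIES.items()
    }


# Invented sample sentence, not taken from any of the notes.
result = category_percentages("We love our family and the friends who talk to us")
print({name: round(pct, 1) for name, pct in result.items()})
```

Comparing two notes then comes down to comparing these percentages category by category, ideally against the published averages for the same type of text.<br />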
<br />
What is very interesting here is that <i><b>anxiety of the letter writer</b></i> is revealed in writing, and even though the JonBenet note was written in the house and would have taken about half an hour to write (21 minutes just to copy it, as the CBS show noted), there was NO anxiety. Yet there was anxiety in the other pre-written notes!<br />
<br />
Also, as a measure of authenticity, the JonBenet note is very low and there were more tentative words (not shown, but also a female indicator).<br />
<br />
<br />
<b>3rd Supporting Software Analysis</b><br />
<br />
<br />
Whissell's Dictionary of Affect is a very useful measure of <a href="https://www.god-helmet.com/whissel-dictionary-of-affect/index.htm">pleasantness</a>, not what the words mean but a<b> sentiment rating</b> of the overall pleasantness of the text.<br />
<br />
I have found a direct correlation to pleasantness and deception, and a study at <a href="http://www.cs.columbia.edu/~frank/enos_thesis.pdf">Columbia University </a>confirms this, but increased social language increases pleasantness too:<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjen3qagEYgbaZ8oJBH4Xed9yQ1Z1gn467aj2p-4cv3UwAoAq3jEdfgkDtKa60tXW7S3_EdEwVhlAcIkrqfcOBEW_FEVRs4cOk_4A7XhCOdeGCHS_PpfU-SCBMhQZJehG7IN3KkzGavK4s/s1600/pleasant.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjen3qagEYgbaZ8oJBH4Xed9yQ1Z1gn467aj2p-4cv3UwAoAq3jEdfgkDtKa60tXW7S3_EdEwVhlAcIkrqfcOBEW_FEVRs4cOk_4A7XhCOdeGCHS_PpfU-SCBMhQZJehG7IN3KkzGavK4s/s640/pleasant.jpg" width="640" /></a></div>
<br />
The JonBenet note is above average in pleasantness and social language and higher than both other ransom notes, showing it more likely to be written by a female as per Pennebaker above.<br />
<br />
As FBI profiler Roger Depue wrote in his book, the ransom note was essentially nonsensical, obviously staged, and was "feminine" with terms such as <i>"gentlemen watching over", </i>and telling sleeping people to <i>"be well rested."</i><br />
<br />
*********************************************************************************<br />
<br />
<br />
<b>MORE on Analysing the Syntactic and linguistic Structure of the JonBenet Ransom note, or taking the words away and leaving the parts of speech and Syntactic Tree structure here of the text:</b><br />
<b><br /></b><a href="http://www.elastictruth.com/2017/04/new-analysis-of-ramsey-ransom-note.html">http://www.elastictruth.com/2017/04/new-analysis-of-ramsey-ransom-note.html</a><br />
<br />
<br />
<br />
<br />
<br />
<b>FBI interest on this website:</b><br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaAIEQrKtHGb-UPwcJ4I8TV79GstRGO-mC2zvXcjBA69cicfjzDouShpkuAypxtHZJyc-dPv9uBMOcOwRBjQR6Fo0XTMLtZL6FJNjfRDjWZN1sYUm2PSwhqkZyI6NifSsj8CpVhRF0t9w/s1600/fbi.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="270" data-original-width="501" height="215" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaAIEQrKtHGb-UPwcJ4I8TV79GstRGO-mC2zvXcjBA69cicfjzDouShpkuAypxtHZJyc-dPv9uBMOcOwRBjQR6Fo0XTMLtZL6FJNjfRDjWZN1sYUm2PSwhqkZyI6NifSsj8CpVhRF0t9w/s400/fbi.jpg" width="400" /></a></div>
<br />
I have had a very amicable email exchange with Frank Marsh from the FBI, who wanted to know a bit about training suggestions and material on statement analysis etc. It's a credit to the agency that they take the time to look around to see if there is any new information or techniques they need to know. I blocked out a few details above for privacy. I would love to take up Frank on his offer of a dinner "next time I am near Quantico."<br />
<br />
<br />
<br />
*********************************************************************************<br />
<br />
<br />
<b>NEW ADDITION July 2018</b><br />
<br />
In <b>Clustering Analysis</b>, variables that are similar to each other form a cluster or group. The software runs through the data in an <b>unsupervised</b> way, which means no target variable is used, and looks for the closest and most similar clusters. It is similar to a decision tree, but works without a classification variable.<br />
<br />
The software I am using is the free MDL Cluster software: https://www.kdnuggets.com/2016/08/mdl-clustering-unsupervised-attribute-ranking-discretization-clustering.html<br />
<br />
It is very efficient, on par with or better than k-means and EM, and works as a standalone Java executable. It can also be used for discretisation and corpus building, although I won't be using those capabilities.<br />
<br />
<i>I wanted to see what other ransom notes have in common. Obviously at surface level and at the most basic, they all have demands, a list of instructions, possibly a threat and so on.</i><br />
<br />
The JonBenet ransom note was a very long rambling note of 370 words.<br />
I managed to find 3 other complete ransom notes, two even longer than the JonBenet one:<br />
<br />
1 - The Leopold and Loeb ransom note of 401 words, <a href="https://en.wikipedia.org/wiki/Leopold_and_Loeb">https://en.wikipedia.org/wiki/Leopold_and_Loeb</a><br />
<br />
2 - The Barbara Mackle ransom note at a whopping 972 words, <a href="https://en.wikipedia.org/wiki/Barbara_Mackle_kidnapping">https://en.wikipedia.org/wiki/Barbara_Mackle_kidnapping</a><br />
<br />
3 - The Rob Wiles ransom note at 152 words, <a href="https://crimewatchdaily.com/2017/06/08/ransom-arrest-conviction-but-no-body-what-happened-to-robert-wiles/">https://crimewatchdaily.com/2017/06/08/ransom-arrest-conviction-but-no-body-what-happened-to-robert-wiles/</a><br />
<br />
Along with the ransom notes, I have four notes from Patsy Ramsey, titled Patsy 1 + 2 and Patsy 1995 and 1996. A 2110 word letter from John Ramsey was included in this experiment.<br />
<br />
The idea is to see what a clustering algorithm would find by lumping Patsy and John Ramsey along with the four ransom notes. The software is blind to who the note belongs to--the classification variable which specifies the owner of the note is NOT used by the software. In other words, the clustering software is looking for similarities.<br />
<br />
At the basic word level, all the ransom notes are similar. This is obvious and useless: a ransom note is completely different to a Christmas card, for example.<br />
<br />
So we need to look at a deeper level. I have already looked at a syntactic level in another blog post; now I want to use LIWC, the Linguistic Inquiry and Word Count software from James Pennebaker at the University of Texas.<br />
<br />
<a href="http://journals.sagepub.com/doi/abs/10.1177/0261927X09351676?journalCode=jlsa">http://journals.sagepub.com/doi/abs/10.1177/0261927X09351676?journalCode=jlsa</a><br />
<br />
The built in dictionary has categories for things like anger, negation, function words etc. I wanted to try this on a custom built dictionary by Jeremy Frimer called the Prosocial dictionary--<br />
<a href="http://www.pnas.org/content/112/21/6591">http://www.pnas.org/content/112/21/6591</a><br />
<br />
This has been used to test interesting hypotheses, such as a decline in prosocial (helping, caring) language tracking with dissatisfaction with politics. The authors have even created a model tracking approval ratings of politicians based on their prosocial language.<br />
<br />
I downloaded the Prosocial Lexicon and ran it over the ransom notes. This added up how often certain words in certain categories appeared in each note:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSai27pDbO0ISdXKemh1JjT6agYMOE3bsCJmi-_h3WaN6eEiBvGIDltSFVbsxQtEcs0JW2TwoAc-y1oLZl2opNCKzKG49ifK_pVrn9C2kdw9O_8JS9NbLSCmSmu1FWID3p_3tYV58MARk/s1600/cluster222222.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="252" data-original-width="1600" height="99" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSai27pDbO0ISdXKemh1JjT6agYMOE3bsCJmi-_h3WaN6eEiBvGIDltSFVbsxQtEcs0JW2TwoAc-y1oLZl2opNCKzKG49ifK_pVrn9C2kdw9O_8JS9NbLSCmSmu1FWID3p_3tYV58MARk/s640/cluster222222.jpg" width="640" /></a></div>
<br />
There were about 127 columns, many sparse, only a few are shown here. This is now the input for the MDL Cluster software. It ignores the last classification variable of who the author of the note was, and runs through all the variables, looking for similar groups or clusters.<br />
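<br />
A minimal sketch of how this kind of dictionary count can be built, assuming a LIWC-style lexicon where a trailing * matches any word that starts with the stem. The lexicon entries, the sample sentence and the function names here are hypothetical, chosen only for illustration; this is not the Prosocial Lexicon or LIWC's own code:<br />

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: category -> list of patterns.
# A trailing "*" matches any word starting with that stem (LIWC-style).
LEXICON = {
    "help*": ["help*"],
    "support*": ["support*"],
    "care": ["care", "caring", "kind*"],
}


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matches(token, pattern):
    """True if token matches a pattern; a trailing '*' means prefix match."""
    if pattern.endswith("*"):
        return token.startswith(pattern[:-1])
    return token == pattern


def category_rates(text, lexicon=LEXICON):
    """Return each category's share of the total word count (0.0 to 1.0)."""
    tokens = tokenize(text)
    counts = Counter()
    for token in tokens:
        for category, patterns in lexicon.items():
            if any(matches(token, p) for p in patterns):
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {category: counts[category] / total for category in lexicon}


# Hypothetical sample text, not taken from any of the notes.
sample = "Please help us. We are helping and supporting the family."
rates = category_rates(sample)
```

Each note becomes one row of such rates, one column per category. Dividing by the note length matters here because the notes range from 152 to over 2000 words, so raw counts would mostly reflect length.<br />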
<br />
The output using the best 20 variables was:<br />
Attributes: 21<br />
Ignored attribute: filename<br />
Instances: 9 (non-sparse)<br />
Attribute-values in original data: 57<br />
Numeric attributes with missing values (replaced with mean): 0<br />
Minimum encoding length of data: 450.94<br />
---------------------------------------------------------------<br />
(48.70) (9.74)<br />
#ProSocial<=0.008772 (11.45)<br />
#support*<=0 (-0.66) [0,1,0,0,0,1,1,0,0] jon_ransom.txt<br />
#support*>0 (-2.00) [0,0,0,1,0,0,0,1,0] mackle_demand.txt<br />
#ProSocial>0.008772 (11.57)<br />
#help*<=0 (-2.00) [0,0,1,0,0,0,0,0,1] leo_loeb.txt<br />
#help*>0 (-2.00) [1,0,0,0,1,0,0,0,0] john_letter.txt<br />
<br />
---------------------------------------<br />
Number of clusters (leaves): 4<br />
Correctly classified instances: 4 (44%)<br />
Time(ms): 41<br />
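<br />
The printed tree can be read as a set of nested threshold rules. A small sketch of that reading, assuming the cluster numbers simply follow the order in which the leaves are printed; the thresholds are copied from the output above, while the note names and feature values are hypothetical and not the real data:<br />

```python
def assign_cluster(prosocial, support, help_):
    """Map one note's feature values to a leaf of the printed tree.

    Thresholds are copied from the MDL Cluster output; the cluster
    numbering (0-3 in leaf order) is an assumption for illustration.
    """
    if prosocial <= 0.008772:
        # Low overall prosocial language: split on the "support*" words.
        return 0 if support <= 0 else 1
    # Higher prosocial language: split on the "help*" words.
    return 2 if help_ <= 0 else 3


# Hypothetical feature values (prosocial, support*, help*), illustration only.
notes = {
    "note_a": (0.004, 0.0, 0.0),
    "note_b": (0.006, 0.002, 0.0),
    "note_c": (0.012, 0.0, 0.0),
    "note_d": (0.015, 0.0, 0.003),
}
clusters = {name: assign_cluster(*values) for name, values in notes.items()}
```

Read this way, a note lands in the first leaf when it has little prosocial language overall and no "support" words at all.<br />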
<br />
<br />
A new spreadsheet was created by the software, showing the clusters:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf5fV_0cvHxeb15DsrNoKGcPjC66McLGmJss_GmYrGtSPNlTJ8sDWyl8Vg3taoR2uVmwEZzf23YXfvc3cPEkq6CIznlgqqkgmOXMA4KZC6gszpDwy6TbF_Oj5mQAEf4tBORi5n6Z54Z9E/s1600/cluster2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="277" data-original-width="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf5fV_0cvHxeb15DsrNoKGcPjC66McLGmJss_GmYrGtSPNlTJ8sDWyl8Vg3taoR2uVmwEZzf23YXfvc3cPEkq6CIznlgqqkgmOXMA4KZC6gszpDwy6TbF_Oj5mQAEf4tBORi5n6Z54Z9E/s1600/cluster2.jpg" /></a></div>
<br />
<b>Four clusters were found. Cluster 0 shows Patsy 1995 and Patsy 1996 lumped with the JonBenet ransom note!!</b> There was similarity with Patsy2 note and the Mackle note in Cluster1, as well as John's letter and Patsy 1 in Cluster3. Rob Wiles and the Leopold and Loeb note were put together in Cluster2.<br />
<br />
<i>A completely automated clustering approach with NO information about who wrote which note, groups Patsy with the JonBenet ransom note, even though on the surface, all the ransom notes appear similar in that there are demands and instructions and so on.</i><br />
<br />
The Prosocial Dictionary tracks helping and caring language; it could probably be thought of as an empathy indicator, and it has proved useful in a few studies. It seems that the Patsy notes and the JonBenet ransom note are in the same cluster because of a low level of caring language in the ransom note, which has a similar "signature" to Patsy's. It has been observed by a few people that the ransom note is "feminine" in the way it is written, talking about being well rested and so on. This is confirmed by various online gender handwriting analysis sites when the ransom note is analysed.<br />
<br />
<b>Another potential direction is the new field of Sentiment Analysis</b>, used to detect the sentiment in product reviews, hotel reviews and so on--<br />
<a href="https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html">https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html</a><br />
<br />
An exciting new method is DepecheMood, which used 37 000 terms along with emotion scores--<br />
https://arxiv.org/abs/1405.1605<br />
<br />
They have built an online website to test text--<br />
<a href="http://www.depechemood.eu/">http://www.depechemood.eu/</a><br />
<br />
Plugging the ransom notes into DepecheMood shows different sentiment--<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKKEzrhc0ShcGNhJWQUQWsg7i3EyRxTzJcVLC7Jli7oNirdRRgZne-377IbCKWyY40Jcjsod-U-qR5FcIpjFAtaAFYmABzShokeZRjN2zvR_VOJQas90A-pNPxN8ZdWYsTA5la2v309PE/s1600/ransomMood.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="562" data-original-width="1028" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKKEzrhc0ShcGNhJWQUQWsg7i3EyRxTzJcVLC7Jli7oNirdRRgZne-377IbCKWyY40Jcjsod-U-qR5FcIpjFAtaAFYmABzShokeZRjN2zvR_VOJQas90A-pNPxN8ZdWYsTA5la2v309PE/s640/ransomMood.jpg" width="640" /></a></div>
<br />
The JonBenet Ransom Note above<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgj2kq9qRxP8JlrBxhIX0kniO45sY3FxbGNGn2NkMWXQrzUKX2X6MaN5IoOoOnpAYdFhaVwV-8WQEaM0vZh3zSQ8161f1eEDcIXHNGNlWiAXSKfW-gEhEI8e39qOfmQO48If397M4WpTGM/s1600/mackleMood.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="566" data-original-width="1025" height="352" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgj2kq9qRxP8JlrBxhIX0kniO45sY3FxbGNGn2NkMWXQrzUKX2X6MaN5IoOoOnpAYdFhaVwV-8WQEaM0vZh3zSQ8161f1eEDcIXHNGNlWiAXSKfW-gEhEI8e39qOfmQO48If397M4WpTGM/s640/mackleMood.jpg" width="640" /></a></div>
<br />
The Mackle Note above<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSzU-ZSI0CycTVot7knPeYAb9ggCryvkyU-Yy5qhrdMqnpQhIElJghySs5QTdNgIpV9ZRSRxY0L4tla9zzPfzfMfxRIfY5xsnTey0Pw_qenhqJE_Gwxzx123xZ67qRm2Vj3yv2h-n0XEo/s1600/leoMood.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="563" data-original-width="1030" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSzU-ZSI0CycTVot7knPeYAb9ggCryvkyU-Yy5qhrdMqnpQhIElJghySs5QTdNgIpV9ZRSRxY0L4tla9zzPfzfMfxRIfY5xsnTey0Pw_qenhqJE_Gwxzx123xZ67qRm2Vj3yv2h-n0XEo/s640/leoMood.jpg" width="640" /></a></div>
<br />
The Leopold Note above<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgdjOklLN7LLsLz-8sQUPQoWtiO3uzxzWr_A8INjWCUiOgf-7wcDrqY5uQOrzrWRAY5_35YukDDJ2HSNTidscAIMI1Xx7XILhZDPedTMh-7vqifEXnNzOYyY9l43s9tVFs-5yF6UMUpIw/s1600/wilesMood.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="563" data-original-width="1033" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgdjOklLN7LLsLz-8sQUPQoWtiO3uzxzWr_A8INjWCUiOgf-7wcDrqY5uQOrzrWRAY5_35YukDDJ2HSNTidscAIMI1Xx7XILhZDPedTMh-7vqifEXnNzOYyY9l43s9tVFs-5yF6UMUpIw/s640/wilesMood.jpg" width="640" /></a></div>
<br />
The Rob Wiles note above<br />
<br />
<br />
It's interesting to see that the two top emotions in the JonBenet ransom note (top) are Sadness and Anger, consistent with what you would expect if the CBS special scenario played out, i.e. JonBenet being accidentally killed by her brother as she snatched some pineapple from him during a late night snack.<br />
<br />
More to follow......<br />
<br />
<b>Negative Elections Work Because "Fear Is An Effective Means Of Persuasion".</b><br />
Tom Berger, 2016-07-10<br />
<br />
Studies show that human beings are more motivated by loss than by gain.<br />
<br />
If you frame <i>the same message</i> in two different ways, such as..."if you insulate your windows, you will SAVE a dollar a day in heating" compared with "if you fail to insulate your windows, you will LOSE a dollar a day", most people are more motivated to act on the loss message.<br />
<br />
(see Robert Cialdini - <a href="https://www.influenceatwork.com/">https://www.influenceatwork.com/</a>)<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWSASZlgchX2oM1qcmS2OB1nZok05qKDncw3YCjfV9VhwUOHzSGu_W8VwmalfVVQp1qxDQPnhlNg8S9c0ld3SI8TABSdXXFhlUaNsQb2ROYeIZEk7aLJ9aiJOLp8VQdERFL7Nfr2AUq8M/s1600/gains-losses-20141022100620929.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWSASZlgchX2oM1qcmS2OB1nZok05qKDncw3YCjfV9VhwUOHzSGu_W8VwmalfVVQp1qxDQPnhlNg8S9c0ld3SI8TABSdXXFhlUaNsQb2ROYeIZEk7aLJ9aiJOLp8VQdERFL7Nfr2AUq8M/s320/gains-losses-20141022100620929.jpg" width="320" /></a></div>
<br />
<br />
The headline above comes from this persuasion study:<br />
<br />
<a href="http://www.ncurproceedings.org/ojs/index.php/NCUR2014/article/viewFile/867/489">http://www.ncurproceedings.org/ojs/index.php/NCUR2014/article/viewFile/867/489</a><br />
<br />
It concludes with <b>"the stronger the fear appeal, the greater the chance the individual will accept the recommendation of action."</b><br />
<b><br /></b>
This helps explain why deceptive negative election campaigns are becoming more common. Chris Mitchell in the Australian laments that so many fellow journalists repeated the message uncritically:<br />
<br />
"<span style="color: #333333; font-family: 'Times New Roman', sans-serif; font-size: 19px; line-height: 22px;"><i>All week journalists from the national broadcaster and much of the print and commercial electronic media seemed to agree with Bill Shorten that Labor’s dishonest Medicare scare had shown up the Coalition for being out of touch with voters.</i></span><br />
<span style="color: #333333; font-family: 'Times New Roman', sans-serif; font-size: 19px; line-height: 22px;"><i><br /></i></span>
<span style="color: #333333; font-family: 'Times New Roman', sans-serif; font-size: 19px; line-height: 22px;"><i>The 2014 budget recommended a small Medicare co-payment of exactly the kind Labor wanted to introduce under former prime minister Bob Hawke 25 years ago. It was the only budget since 2010 that sought to deal with the issue S&P is warning about."</i></span><br />
<span style="color: #333333; font-family: 'Times New Roman', sans-serif; font-size: 19px; line-height: 22px;"><br /></span>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzsbrBCofwsIrfP4dZ7Rj1NTn1Bk-kEze93DEBm-7maYUmEZNAuxUre7dd1TeqyH9K2ZavmoLZdnggxAdhZBXXRvzs8Ou89gULJhZ2zJDeMVWXF9-OJe_n1jZ4N_lURwqXyov0Ad52Uo0/s1600/baby-purple-elephant-3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="542" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzsbrBCofwsIrfP4dZ7Rj1NTn1Bk-kEze93DEBm-7maYUmEZNAuxUre7dd1TeqyH9K2ZavmoLZdnggxAdhZBXXRvzs8Ou89gULJhZ2zJDeMVWXF9-OJe_n1jZ4N_lURwqXyov0Ad52Uo0/s640/baby-purple-elephant-3.jpg" width="640" /></a></div>
<br />
<br />
Don't think of a purple elephant!!<br />
Of course, when I say that, you think of a purple elephant.<br />
<br />
The brain does not automatically process negatives, a basic principle of neurolinguistics. Any negation such as not, don't or un is initially processed subconsciously by the brain in the positive. So if you say to a child, "Don't spill your milk", the child's brain first subconsciously processes <i>spill your milk, </i>and then <i>Don't </i>is added on to the sentence by the conscious brain.<br />
<br />
Saying <i>don't </i>makes it more likely that the milk will be spilled. Just like thinking of a purple elephant.<br />
That's why <i>un</i>caring or <i>non</i>violent are weak messages, but also another reason why the negative message in an election campaign stays with us.<br />
<br />
But the downside of going negative is that <b>"</b><span style="background-color: white; font-family: Tahoma, Arial, sans-serif; font-size: 14px; line-height: 20px;"><b>such ads may work to both shrink and polarize the electorate,</b>” as the political scientist Shanto Iyengar of Stanford </span>has long pointed out.<br />
<br />
This was the case with the Australian election, with record numbers of voters leaving the major parties to vote for the independents. Labor had the second lowest number of primary votes in its history, while the Liberals lost at least 1.7 million voters who moved to right of centre independents.<br />
<br />
With changing times comes lack of accountability for lies and deceptions during an election. Football players are more likely to be punished for foul play than a politician who lies in an attempt to influence votes. Voting is an emotional process, not a logical one, and when you are trying to sell something, whether a politician or a beer, it can be more effective to sell on an emotional basis instead of relying on the facts.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ypasO748re5KiIom5Df3zAHCbZORSXoh4T2wOydrPN5_xJ_w9NslhpItYwlWZ5N_Zv6GEz2i-jaJByRlRe1VCl1fke5QfJe7C2682m4YX-XmSOTZwUI1Jtzwzrj6-1md7clu560_9O4/s1600/barack-obama_3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="414" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ypasO748re5KiIom5Df3zAHCbZORSXoh4T2wOydrPN5_xJ_w9NslhpItYwlWZ5N_Zv6GEz2i-jaJByRlRe1VCl1fke5QfJe7C2682m4YX-XmSOTZwUI1Jtzwzrj6-1md7clu560_9O4/s640/barack-obama_3.jpg" width="640" /></a></div>
<br />
During the 2010 campaign, Obama employed 29 behavioural scientists and psychologists, including best-selling authors Dan Ariely and Richard Thaler, to create proposals to reduce emotions and create reason, and then show the science behind it.<br />
<br />
One of the things that came out of this was to never rebuff a negative or deceptive claim with a negation such as <i>not, isn't, doesn't</i>. The claim was made that Obama was a Muslim. The Obama team did not respond with "Obama <i>isn't</i> a Muslim", they responded with a positive statement saying "Obama is a Christian", and so on.<br />
<br />
Responding with a negation such as <i>don't spill your milk</i> is more likely to anchor <i>spill your milk</i>, or Obama is a Muslim, or Malcolm Turnbull is going to privatise Medicare, in the mind.<br />
<br />
Obama is using science to respond to negative campaigns, something Malcolm Turnbull should have done a long time ago.<br />
<br />
<b>Learning From Politicians: Applause Cues For Speakers</b><br />
Tom Berger, 2016-07-03<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoc4lHXk_pKFs6MuPz-Eu7JRAXLjf8GegYYvbH4P8k_SXMZn9Vwxf5ejHbZpsb09S9oARKM1P4g-MH-dhXKMvumfe5wlduCJ2MGXHd4NBT2qPoagTuSLccfBOt9-LiSVWKk7J1-wEDXr0/s1600/business-people-clapping-2015-billboard-650.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="422" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoc4lHXk_pKFs6MuPz-Eu7JRAXLjf8GegYYvbH4P8k_SXMZn9Vwxf5ejHbZpsb09S9oARKM1P4g-MH-dhXKMvumfe5wlduCJ2MGXHd4NBT2qPoagTuSLccfBOt9-LiSVWKk7J1-wEDXr0/s640/business-people-clapping-2015-billboard-650.jpg" width="640" /></a></div>
<br />
<br />
Applause generation cues are crafted into a political talk because audiences need to know not only if they are going to applaud but <i>when</i> they are going to applaud.<br />
<br />
In particular, two of these rhetorical techniques (Atkinson 1983), the 3 item list (rule of three) and the contrast principle, are very useful for speakers, newspaper editors, advertisers and anyone in a situation designed to persuade.<br />
<br />
The Rule Of Three, or trios and triplets, is everywhere in western culture. We have an inherent attraction to the number three: it allows us to express a concept, emphasise it and make it memorable.<br />
<br />
When a 3 item list is included in a speech, we recognise it as such and can anticipate the completion of that point, so it becomes a natural applause cue.<br />
<br />
So for example, Tony Blair was applauded for his famous 3 item list, "Ask me my main priorities for Government, and I tell you: education, education and education."<br />
<br />
On election night, opposition leader Bill Shorten said, "The Labor party is re-energised, it is unified and it is more determined than ever."<br />
<br />
Obama said, "Homes have been lost; jobs shed; businesses shuttered."<br />
<br />
The second important rhetorical technique is the Contrast Principle, which is fundamental to the way our brain makes decisions.<br />
<br />
Advertising is in essence contrast -- you show that you are the only red apple amongst green apples.<br />
Things such as before and after diet pictures, before and after hair loss programs and so on, have been used in advertising for decades.<br />
<br />
Contrast highlights and exaggerates the difference from what came before, so for example in retail you will always be sold a suit first, <i>then</i> a jumper or shirt because it appears more trivial in price. A real estate agent will show you an older run down property first, then show you something closer to your brief, and it <i>appears even more suitable because of the contrast that preceded it.</i><br />
<i><br /></i>
In speech, an example is John Major saying, "We are in Europe to help shape it, <i>not</i> to be shaped by it." To be effective in speech, the second part of the contrast needs to be very similar to the first part.<br />
<br />
Atkinson's research in 1984 showed that the 3 item list and the contrast technique were used by virtually all "charismatic speakers", and that the media often selected such passages for print.<br />
<br />
Research by John Heritage and David Greatbatch backs up Atkinson's findings and shows that in a political context, nearly half of all the applause generated in speeches was from these two rhetorical techniques alone.<br />
<br />
A good reason for all speakers to be aware of and use the Rule Of Three and the Contrast Principle.<br />
<br />
<b>Judging Competence + Success From A Face.</b><br />
Tom Berger, 2016-06-30<br />
<br />
Previous studies have shown that personality traits such as competence and trustworthiness can be reliably judged from a face (Hess, Adams, & Kleck, 2005). Similar studies showed a 70% correlation to predicting election results in the U.S based on looking at photos and scoring the same traits.<br />
<br />
However, can you tell which CEOs are most successful from their face?<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP8KfNn6Ar7aUWzz1TPDtVdPyncJKFMvtDt1z3fCKIiN249pAsyGTX3Hf3oXIfTtGuvgNMP_N0-nD85cSoP5cdMROHlnUe5e7TV9WgXWJ4sTjGsb1oTVo8vaVOWlObuuGiksG_mXBg6FE/s1600/faces-pano_26145.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="296" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP8KfNn6Ar7aUWzz1TPDtVdPyncJKFMvtDt1z3fCKIiN249pAsyGTX3Hf3oXIfTtGuvgNMP_N0-nD85cSoP5cdMROHlnUe5e7TV9WgXWJ4sTjGsb1oTVo8vaVOWlObuuGiksG_mXBg6FE/s640/faces-pano_26145.jpg" width="640" /></a></div>
<span style="font-size: x-small;"> credit-- inc.com</span><br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Undergraduates were asked to judge CEOs' likability and competence based on a series of photos in a study by Nicholas Rule and Nalini Ambady:<br />
<br />
http://psych.utoronto.ca/users/rule/pubs/2008/Rule&Ambady(2008_PsychSci).pdf<br />
<br />
It turns out that leaders that scored the highest also ran the most successful companies. Nicholas Rule says, "<i>These findings suggest that naive judgments may provide
more accurate assessments of individuals than well-informed
judgments can</i>."<br />
<br />
First impressions are critical, because we do judge a book by its cover. Which brings me to Bill Shorten's demeanor when interviewed on TV.<br />
<br />
I have noticed in the last week that Shorten appears to be going into "sad mode" as he begins to speak. This morning while being interviewed on channel 9 with one day to the election, Bill Shorten's face visibly changed at the moment he began to speak. His eyebrows over the nose went together and up, his top eyelids dropped slightly, in a classic sad expression.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFspuFPMvmwkoGiYMApec-lQXs4WZZAaEbx_shHh9O3GpX4L9OKnmn4y0Hfp8QhQqOs3MSa5dDntphlwjodAxaJH3vL4wO7GYu0d-3fBvt1ddxgBsMaXtTO3F53OLrPXlQ1Fjc1sv1Z2k/s1600/sad.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="384" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFspuFPMvmwkoGiYMApec-lQXs4WZZAaEbx_shHh9O3GpX4L9OKnmn4y0Hfp8QhQqOs3MSa5dDntphlwjodAxaJH3vL4wO7GYu0d-3fBvt1ddxgBsMaXtTO3F53OLrPXlQ1Fjc1sv1Z2k/s640/sad.jpg" width="640" /></a></div>
<br />
Whether this is to invoke underdog sympathy, or portray a large burden being placed on his shoulders, it appears to be a conscious attempt because he visibly changes as he begins the interview.<br />
<br />
The world's pioneer and authority on facial expressions, Dr. Paul Ekman, in his book Emotions Revealed says,<br />
<br />
"The eyebrows are very important, highly reliable signs of sadness."<br />
And when talking about actors he says, "It makes them seem more empathetic, warm and kind, <i>but that may not be a true reflection of what they are feeling.</i>"<br />
<br />
I wanted to test whether Bill Shorten is using this as part of his persona. I downloaded all his speeches for the month of June, as well as Malcolm Turnbull's to use as a comparison.<br />
<br />
I ran the speeches through the psychological text analysis tool LIWC from James Pennebaker from the University of Texas to categorise dozens and dozens of words related to sadness from all the speeches.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguByZT2dhG-gsVQlNsulM3EAvW4Xr2LvHmWyeexhdTr9hk8HtK44AmxeKoYqAfSS_xv5an-PKrA-GTI4OIfhgiTW3t1UFptDH99YzkhMg2Jzd0FLKIJsMrYiOxxUIIc140FqnjSojdMyE/s1600/jmp2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguByZT2dhG-gsVQlNsulM3EAvW4Xr2LvHmWyeexhdTr9hk8HtK44AmxeKoYqAfSS_xv5an-PKrA-GTI4OIfhgiTW3t1UFptDH99YzkhMg2Jzd0FLKIJsMrYiOxxUIIc140FqnjSojdMyE/s640/jmp2.jpg" width="498" /></a></div>
<br />
<br />
<br />
Compared to Turnbull, Shorten uses more sad language, using more words like lose, missing, tragic, deprived.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTE5yPjnidIUEnCNJ2_i_SA8ZEo0wmLDSJpIjNntgPYVSRBdxMJmLST2XFWfzsi86I7n5aV-g1kyV0uiaEtFG6sqkkNBB1K78jyhsUItRuU33ayiIVIIA8WtrvRsZtRuVPU9DfWANXeTs/s1600/npc2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTE5yPjnidIUEnCNJ2_i_SA8ZEo0wmLDSJpIjNntgPYVSRBdxMJmLST2XFWfzsi86I7n5aV-g1kyV0uiaEtFG6sqkkNBB1K78jyhsUItRuU33ayiIVIIA8WtrvRsZtRuVPU9DfWANXeTs/s320/npc2.jpg" width="320" /></a></div>
<br />
<br />
This persona may be part of the political spin developed by Shorten and the Labor campaign to invoke empathy, but portraying trustworthiness and competence has been shown to be a better combination.<br />
<br />
<b>Election Word Watching - Who's Most Deceptive?</b><br />
Tom Berger, 2016-06-27<br />
<br />
I downloaded 28 speeches from Malcolm Turnbull and Bill Shorten (14 each) for the month of June. This includes scripted and unscripted Q+A sessions. I ran all the speeches through the text analysis software, which used 4500 words in 80 categories to analyse what was said and to get deeper into the state of mind and intent of the leaders.<br />
<br />
The most frequent words used by both leaders are shown here as word clouds. The larger the word, the more often it was used.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0ijuWQD4ABpm7-CELK3oyCRQYMXK8Tgwo9LmHO8Xwi225C-RftWn4AnuMhJtmX1PhOQsYiULkNxSxiZnurHibYES3ZTbgI0mnU05V1PXS6DhndE-77UNb1ska81u3KhoCWERInCF043E/s1600/cloudshorten2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="558" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0ijuWQD4ABpm7-CELK3oyCRQYMXK8Tgwo9LmHO8Xwi225C-RftWn4AnuMhJtmX1PhOQsYiULkNxSxiZnurHibYES3ZTbgI0mnU05V1PXS6DhndE-77UNb1ska81u3KhoCWERInCF043E/s640/cloudshorten2.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
Above: Bill Shorten word cloud.</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlxLaLLS4jtFt0htqeo0vLtLbCJHUo9JcmuanQ5VQh5_Qj6AzSbQqtOzd72HRHoBIFXpaB08hSAUECNFJpxso334NTj12NbWXXWsqh5RNzys3C_yuN3r0yOavBQwa-0RAiGjX5_wCyoAo/s1600/wordturnbull.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="540" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlxLaLLS4jtFt0htqeo0vLtLbCJHUo9JcmuanQ5VQh5_Qj6AzSbQqtOzd72HRHoBIFXpaB08hSAUECNFJpxso334NTj12NbWXXWsqh5RNzys3C_yuN3r0yOavBQwa-0RAiGjX5_wCyoAo/s640/wordturnbull.jpg" width="640" /></a></div>
Above: Malcolm Turnbull word cloud<br />
<br />
While both leaders mention the other side, Shorten stands out with how often he uses the word "Turnbull"; it's nearly as large as "Labor".<br />
<br />
Next we look at Equivocation or Hedge words (I believe, think, might, should, could etc.), which reduce commitment and can "indicate a less positive experience or an unwillingness to communicate information". (Wiener and Mehrabian 1968)<br />
<br />
I'll also look at Negations, words like no, not and all contractions of not. Equivocation and Negation were pinpointed by Susan Adams of the FBI as the strongest indicators of deception in her paper Indicators Of Veracity And Deception: An Analysis Of Written Statements Made To Police - 2006.<br />
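<br />
A rough sketch of how hedge and negation rates could be computed per speech, assuming small hand-made word lists. The lists, the sample sentence and the function names are hypothetical and far smaller than the categories used by the text analysis software:<br />

```python
import re

# Hypothetical word lists for illustration; real dictionaries are larger.
HEDGES = {"believe", "think", "might", "should", "could", "maybe", "perhaps"}
NEGATIONS = {"no", "not", "never", "none", "nothing"}


def tokenize(text):
    """Lower-case the text and split it into word tokens (keeps apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def is_negation(token):
    """Count plain negation words and any contraction of 'not' (don't, isn't)."""
    return token in NEGATIONS or token.endswith("n't")


def rates_per_100(text):
    """Return hedge and negation counts per 100 words of the text."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty text
    hedges = sum(1 for t in tokens if t in HEDGES)
    negations = sum(1 for t in tokens if is_negation(t))
    return {"hedge": 100 * hedges / total, "negation": 100 * negations / total}


# Hypothetical sample sentence, not taken from either leader's speeches.
sample = "I think we might win, but I don't know and I will not guess."
result = rates_per_100(sample)
```

Expressing the counts per 100 words keeps long and short speeches comparable, which matters when scripted speeches and Q+A sessions are mixed.<br />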
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPSjvBhdPPCgfOlGcsWdIlp6ViFJPhC97bORlATqzUyOfNvE2BQDhA4Vu2Z1Ji-wvu943pIkjuhDGh7-_IaZLR6klvU3xupE19vTrftScgZSXvk6sJ0g03Ted2pKiGYcO1fBgewcGqnQ4/s1600/equiv.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="438" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPSjvBhdPPCgfOlGcsWdIlp6ViFJPhC97bORlATqzUyOfNvE2BQDhA4Vu2Z1Ji-wvu943pIkjuhDGh7-_IaZLR6klvU3xupE19vTrftScgZSXvk6sJ0g03Ted2pKiGYcO1fBgewcGqnQ4/s640/equiv.jpg" width="640" /></a></div>
<br />
<br />
The results are highly significant at below the 1% level. Bill Shorten uses far more Equivocation and Negation compared to Malcolm Turnbull. This is a red flag in deception, but it is also high in <i>losing</i> political parties in the 100 year election speech analysis I did. Winners of elections used less, losers used more of these words.<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJoQc-8L5bnOSfCdt4DB44Xa9hQREg6Kf4hpbVU7lhml35iN-Tcc7quwt5PTWSOuNem9dkyuV7hSzucw-Yo-E1gdkE1Z02gH-iGFtaraIq6FMuHOAVcgh2D8CNTHkbgo6RKUocp8NN8WM/s1600/power.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="418" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJoQc-8L5bnOSfCdt4DB44Xa9hQREg6Kf4hpbVU7lhml35iN-Tcc7quwt5PTWSOuNem9dkyuV7hSzucw-Yo-E1gdkE1Z02gH-iGFtaraIq6FMuHOAVcgh2D8CNTHkbgo6RKUocp8NN8WM/s640/power.jpg" width="640" /></a></div>
<br />
<br />
Tracking what drives the election leaders, we break down affiliation, achievement and power indicators in speech. People high in affiliation are concerned with relationships and close allies. Both leaders are similar in affiliation and also in achievement orientation.<br />
<br />
The power indicator measures how driven the individual is by control, status and prestige. David Winter argued in his analysis of U.S. presidents that the degree of power was an indicator of political effectiveness, but I have found it to be a negative indicator in Australian elections - the public don't seem to vote for power language used in elections.<br />
<br />
Looking at words relating to anger, Turnbull scores higher (more angry). This is a negative indicator, as it was in the German election prediction that used Twitter to gauge public sentiment, which I mentioned in a previous post.<br />
<br />
Both leaders are very similar on indicators of Analytical thinking, both are similar in Risk and Reward indicators, and both are similar in Authenticity.<br />
<br />
Shorten uses far more "I" pronouns, which tend to be "more personal, but more insecure." (LIWC analysis by psychologist James Pennebaker).<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhwqN-dNwz4xj6m7lBcyY2PXAIO8lKdogpk6pdlPXlO2HKhU3IZ3yL-4X7hekiHCDJP7QvIW_T4AKhk2M77umlYI1etEY4xGRUCRwvqGAaOcLl12iG8_ZxvaAUSezQ8B10iPEOIbSM1ow/s1600/present2222.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="396" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhwqN-dNwz4xj6m7lBcyY2PXAIO8lKdogpk6pdlPXlO2HKhU3IZ3yL-4X7hekiHCDJP7QvIW_T4AKhk2M77umlYI1etEY4xGRUCRwvqGAaOcLl12iG8_ZxvaAUSezQ8B10iPEOIbSM1ow/s640/present2222.jpg" width="640" /></a></div>
<br />
<br />
Both leaders are similar in focus on the future and the past events, but Bill Shorten is <i>far</i> more likely to focus on the present. There doesn't appear to be that much of a difference on the graph, but it is highly significant on the statistical analysis.<br />
<br />
Pennebaker says, "<span style="background-color: white; color: #555555; font-family: "georgia" , "times new roman" , "times" , serif; font-size: 12px; line-height: 12px;">People oriented towards the present are thinking about current events that are psychologically close. Present-focused people tend to be more neurotic, depressed, and pessimistic than either past- or future-oriented people.</span><span style="background-color: white; color: #555555; font-family: "georgia" , "times new roman" , "times" , serif; font-size: 12px; line-height: 12px;"> </span>"<br />
<br />
In total, Malcolm Turnbull has only 3 categories of words out of 80 in which he scores significantly higher than Bill Shorten, whereas Shorten has 12 such categories.<br />
<br />
Shorten has a problem with language <i>compared</i> to Malcolm Turnbull (this only relates when compared to each other) - he is perceived as more negative and is flagged as more deceptive.<br />
<br />
In closing, the sincerity problem Shorten has is really shown in these pictures (gifs) below.<br />
He is delivering a speech where he says, "I believe..." then looks down at his notes to remind himself what he believes in. This really is indicative of poor preparation or going into "auto pilot" during a speech.<br />
<br />
But just when you think he was tired or having a bad day, he does it again during a second important speech about his asylum seeker/refugee policy!<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYo5jsnVmtKv0vpUPuPd4dhINNYKwmeJxWqGXsOP_w3YJp-Y_5gimxrlcJyNnLV8vAsAz8Vrw6B8QLYV4Uz7wDbx_H16jh3IeElgrSNrJzkiKF6gWmlbRNC39kazFrdzDeppkqTg_qAFQ/s1600/shorten2.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYo5jsnVmtKv0vpUPuPd4dhINNYKwmeJxWqGXsOP_w3YJp-Y_5gimxrlcJyNnLV8vAsAz8Vrw6B8QLYV4Uz7wDbx_H16jh3IeElgrSNrJzkiKF6gWmlbRNC39kazFrdzDeppkqTg_qAFQ/s320/shorten2.gif" width="320" /></a></div>
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq-aZSERtRE5untb-7aSljekxkDctZPBu4gS0mc5g5WEUKzMOf-O1iHZ4bxAFx6YkLY2f4ws5eWsbdtLRPERob8Dydb-pgoa491ftLnO9Aw8DHV5R6MdyeC8Zs86lEem8s6v6YWQddL1Y/s1600/shortengif.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="319" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq-aZSERtRE5untb-7aSljekxkDctZPBu4gS0mc5g5WEUKzMOf-O1iHZ4bxAFx6YkLY2f4ws5eWsbdtLRPERob8Dydb-pgoa491ftLnO9Aw8DHV5R6MdyeC8Zs86lEem8s6v6YWQddL1Y/s320/shortengif.gif" width="320" /></a></div>
<br />
<br />
This looks for all the world as if Bill Shorten doesn't know what he believes.<br />
<br />
Wrapping this up with the latest Betfair prices on the election: Labor is diving further, with Liberal at $1.13, an almost 89% chance of winning with less than a week to go.<br />
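<br />
For anyone wondering where the percentage comes from: decimal odds convert to an implied probability by taking the reciprocal. A minimal Python sketch, assuming we ignore the exchange's commission and any margin built into the price (the function name is just for illustration):<br />

```python
def implied_probability(decimal_odds):
    """Implied win probability from decimal odds, ignoring commission and margin."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


# $1.13 is the Liberal price quoted above.
print(f"{implied_probability(1.13):.1%}")  # prints 88.5%
```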
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpFgUxbUaUMPlv30XTVVLTSFyrXoH4ToL0wZY6oO5xxd08Gd-ZfuFOZK1lVswags4CgSrA79YkW7C798cv07COq4IsnFDk5QNzSWWdK9umx9X5Z4Kk9qb8sS74u3dKjzvwfh1VN2LYzIE/s1600/odds.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="226" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpFgUxbUaUMPlv30XTVVLTSFyrXoH4ToL0wZY6oO5xxd08Gd-ZfuFOZK1lVswags4CgSrA79YkW7C798cv07COq4IsnFDk5QNzSWWdK9umx9X5Z4Kk9qb8sS74u3dKjzvwfh1VN2LYzIE/s640/odds.jpg" width="640" /></a></div>
Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-32709535900040711232016-06-26T20:25:00.002-07:002016-06-27T18:43:06.599-07:00
Power handshakes and other election non verbals.<br />
Handshakes are often the first, and sometimes the only, point of contact we have with another person. How we shake hands affects how we are perceived. The handshake has the power to leave an impression, so it is interesting to watch how the election leaders approach it.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPT3nyJqkdx6xe00A9XcrQ4nFC1gzZaOI59NKbZPt7gPPxUauSjPjNzsvm36ZQFrPcdw5delKfeeYkl6zB9z7UZbOEiLLfrAuZoIUizFYpjDcGK-qWiT5kK-5AVHumBLwCvffpaZbZqk4/s1600/shake3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="414" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPT3nyJqkdx6xe00A9XcrQ4nFC1gzZaOI59NKbZPt7gPPxUauSjPjNzsvm36ZQFrPcdw5delKfeeYkl6zB9z7UZbOEiLLfrAuZoIUizFYpjDcGK-qWiT5kK-5AVHumBLwCvffpaZbZqk4/s640/shake3.jpg" width="640" /></a></div>
<br />
<br />
Turnbull tends to pull Shorten in when they shake hands, to establish dominance. This can appear very negative if it's exaggerated, as with Tony Abbott's aggressive hand jousting exhibition with Kevin Rudd.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYeN3jO7egjqhSuGnvHSv-qOyawpiB-vu92AooGi8L3dTStuY6U4cOSjA0EaXckDjIqTC3p37iqRR-rD8VGyn3wn4HgBwgKizllgmVxbGKV7gMZD2E43jfc-TrryCiaRDc1EaKyzqxOto/s1600/abbott.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="406" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYeN3jO7egjqhSuGnvHSv-qOyawpiB-vu92AooGi8L3dTStuY6U4cOSjA0EaXckDjIqTC3p37iqRR-rD8VGyn3wn4HgBwgKizllgmVxbGKV7gMZD2E43jfc-TrryCiaRDc1EaKyzqxOto/s640/abbott.jpg" width="640" /></a></div>
<br />
Turnbull appears more subtle--<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvHLh9mlUNIUyBgqFXMYDKBxMVpbzcZwpzcVSxbkZMqtf6hg9ui-GB2iDABkWK3K3_gyguvr4xJcLoDdJRfyMtrGMKMKq50FF6eFRxD7WsjTu8KNcqan48_5V0wxuUZkyLEtjjYkz0Ey4/s1600/shake1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="410" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvHLh9mlUNIUyBgqFXMYDKBxMVpbzcZwpzcVSxbkZMqtf6hg9ui-GB2iDABkWK3K3_gyguvr4xJcLoDdJRfyMtrGMKMKq50FF6eFRxD7WsjTu8KNcqan48_5V0wxuUZkyLEtjjYkz0Ey4/s640/shake1.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6TtbNRaAcieoveCg9Hl9BUto0SjY5kXvRY1yYTqIRt3ntwW54rw3ocQZW7kAUxP6wMl_sxVJHr5-0USnVxNrBh-5UN10ZLclWc4mpAePKrBm7JYors7vJgQB0lcYHlrUHTXNx_CqPoPc/s1600/shake2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6TtbNRaAcieoveCg9Hl9BUto0SjY5kXvRY1yYTqIRt3ntwW54rw3ocQZW7kAUxP6wMl_sxVJHr5-0USnVxNrBh-5UN10ZLclWc4mpAePKrBm7JYors7vJgQB0lcYHlrUHTXNx_CqPoPc/s640/shake2.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiNK4yY4Vj3Ks2P-VP0j8PaSask0eX21WmwNKtc39bbLQvQEGVimc9vGlFs0Pq-Rwh7nmRFMXpWviP3ILcB6zM_HBHBzgRUEd57l_nJDSkQ9F_gJHhBiufvHCm61A6r106T5Dg2kwkue5w/s1600/shake4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="424" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiNK4yY4Vj3Ks2P-VP0j8PaSask0eX21WmwNKtc39bbLQvQEGVimc9vGlFs0Pq-Rwh7nmRFMXpWviP3ILcB6zM_HBHBzgRUEd57l_nJDSkQ9F_gJHhBiufvHCm61A6r106T5Dg2kwkue5w/s640/shake4.jpg" width="640" /></a></div>
<br />
Although with Prince Charles, Malcolm Turnbull made no attempt at a power handshake:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9DscxbI4JgfjKOQkWpiLLmzh6s-3dJ7Z4We6y5JdCqd4HKxnJXBTd2c9vbvJm9txbs4GilpowrNYnzw6dyCpQuKLSyYkhrqEKndGxUvt6i9dezPFoUBMcI81Nb3nNO1GX-syqskiq5lc/s1600/charles.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9DscxbI4JgfjKOQkWpiLLmzh6s-3dJ7Z4We6y5JdCqd4HKxnJXBTd2c9vbvJm9txbs4GilpowrNYnzw6dyCpQuKLSyYkhrqEKndGxUvt6i9dezPFoUBMcI81Nb3nNO1GX-syqskiq5lc/s640/charles.jpg" width="640" /></a></div>
<br />
Possibly the very worst handshake you could do, and one that almost always leaves a negative impression, is referred to as the "Politician's Handshake". It involves the left hand covering the handshake in a two handed gesture, in an attempt to appear more friendly.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ebCtChn4V085lNo5CRa07sjcsUPlZLVStd8cvbcaYpUMnTBgju1i_lyS8A2a85c6PRtvUmbhk97HCMcrIV0hqzCNvONXJl70DPZZRBII5O4_KJLCq_z8nSSV58335ymLTIX70woFA3A/s1600/poli333.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="486" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ebCtChn4V085lNo5CRa07sjcsUPlZLVStd8cvbcaYpUMnTBgju1i_lyS8A2a85c6PRtvUmbhk97HCMcrIV0hqzCNvONXJl70DPZZRBII5O4_KJLCq_z8nSSV58335ymLTIX70woFA3A/s640/poli333.jpg" width="640" /></a></div>
<br />
As with Shorten's jacket-off-and-rolled-up-sleeves approach, it's manufactured to give him a friendly, one-of-the-people, let's-get-to-work look.<br />
<br />
During the election campaign, Malcolm Turnbull has come across as more confident, competent and relaxed than Bill Shorten. Whereas Turnbull stands up straight and holds his head up, Shorten tucks his chin in, a sign of discomfort.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi98tbEPUXUzv24Hn3HfNg_rCAXECOig8rGwohjwflPxWo9YIui5iwaR5U7X7qKlT8B8CUd23Q5jn797Lc01WZy-5UffxDsxJM1BOhg8JxNc2wLp27ciS2n9pUyIV9GHETkTi7RuT0Zj70/s1600/xxturnbull-shorten.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi98tbEPUXUzv24Hn3HfNg_rCAXECOig8rGwohjwflPxWo9YIui5iwaR5U7X7qKlT8B8CUd23Q5jn797Lc01WZy-5UffxDsxJM1BOhg8JxNc2wLp27ciS2n9pUyIV9GHETkTi7RuT0Zj70/s640/xxturnbull-shorten.jpg" width="640" /></a></div>
<br />
As a small child, if you smelt something disgusting, you wrinkled your nose and pursed your lips. Thirty years later, if you read a contract or see something you don't like, you purse your lips again. The lip pursing normally lasts only a brief moment, but it's an extremely reliable cue for discomfort.<br />
<br />
When there is tension or stress, we have tension in our mouth area, which results in the lips being "sucked" in, and extreme discomfort or stress can make the lips disappear completely. Look around the airport next time you fly, and watch people when a flight is cancelled. Again, this is a very reliable indicator of stress and discomfort.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_dpRwpDy9ftUzc1_fOhFwHNq40czy96X5Hxn0_sXaFp6tH-GjOPDQHaj7E7ApiS37VIGbroZ9k6aCBxThJIdSf8B_43uyBwMfQ_wvRtzkxysUrPq18QaHar_oqbrc6A9ugL63_qBuW7A/s1600/zzface.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="478" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_dpRwpDy9ftUzc1_fOhFwHNq40czy96X5Hxn0_sXaFp6tH-GjOPDQHaj7E7ApiS37VIGbroZ9k6aCBxThJIdSf8B_43uyBwMfQ_wvRtzkxysUrPq18QaHar_oqbrc6A9ugL63_qBuW7A/s640/zzface.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj41b3i8_dwq3uUspw8DdmyrfFJOleo25MnsLji7-bx0UNcodEJOD9f_eX6MRhbMaQrnAI3Na6vJyPg1c7AyXvS58YZo449wad-xA3EcixlKiaUhyphenhyphen6EuAuwZtr1HbQhDNRw4pX68iz52Ys/s1600/zzzz.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj41b3i8_dwq3uUspw8DdmyrfFJOleo25MnsLji7-bx0UNcodEJOD9f_eX6MRhbMaQrnAI3Na6vJyPg1c7AyXvS58YZo449wad-xA3EcixlKiaUhyphenhyphen6EuAuwZtr1HbQhDNRw4pX68iz52Ys/s640/zzzz.jpg" width="640" /></a></div>
<br />
Shorten doesn't appear to have been coached to control overt discomfort and stress cues; even the Press have picked up on this, publishing his most extreme displays:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgejvpVa-6nbLmBo-l2o-moKrkAnHdykNDiua-vCfq90Y-zneTpgRX-bX4Eg-mfkpdXNQNVVctsTr1QLQThQi1HfKqnd8KVErQyFZITcBrXtw3jE24nS3scHWzwMwoGl7dTmE6jpLIa-x8/s1600/grumpy.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgejvpVa-6nbLmBo-l2o-moKrkAnHdykNDiua-vCfq90Y-zneTpgRX-bX4Eg-mfkpdXNQNVVctsTr1QLQThQi1HfKqnd8KVErQyFZITcBrXtw3jE24nS3scHWzwMwoGl7dTmE6jpLIa-x8/s640/grumpy.jpg" width="640" /></a></div>
<br />
<a href="http://www.smh.com.au/federal-politics/political-opinion/alex-ellinghausen-photographing-malcolm-turnbull-tony-abbott-and-bill-shorten-20151210-glkboq.html">http://www.smh.com.au/federal-politics/political-opinion/alex-ellinghausen-photographing-malcolm-turnbull-tony-abbott-and-bill-shorten-20151210-glkboq.html</a><br />
<br />
During interviews and debates, Malcolm Turnbull has his non verbals mostly under control, ensuring that he looks confident. Confidence is correlated with perceived competence in studies, so it is critical that a leader appear confident.<br />
<br />
He speaks with large <i>open</i> gestures showing the palms of his hands, he stands straight, and he tends not to point, which many people find highly offensive. Turnbull's large open gestures contrast with Shorten's more closed approach: hands closer to the body and sometimes pointing.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyN7upz-2oLlQwMwtctUR695qkNh0o9Ex_cymcUkG-8gSF-5QqnX8pv1-y8f2kFXP1MLo6KTHkJ_S-s-2PzsHSp0fjd2KeDkNONqEuFLCSv7LozpFGNCuV6LbLNUZ2e32GjS3hicI43Cs/s1600/closed.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyN7upz-2oLlQwMwtctUR695qkNh0o9Ex_cymcUkG-8gSF-5QqnX8pv1-y8f2kFXP1MLo6KTHkJ_S-s-2PzsHSp0fjd2KeDkNONqEuFLCSv7LozpFGNCuV6LbLNUZ2e32GjS3hicI43Cs/s640/closed.jpg" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8LsmmWI4BfOp0_p5LtqbycZX6GILwzfpHPnqR7N47W5vrJrXT_RyuklT82l9hyCYa6Qb4om1jofedu64rdf0hEtHgRACnVKI524kCUbxHdxgvG-3rE4aEYVG6X5IA7n6wDs0fecl0juQ/s1600/open.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8LsmmWI4BfOp0_p5LtqbycZX6GILwzfpHPnqR7N47W5vrJrXT_RyuklT82l9hyCYa6Qb4om1jofedu64rdf0hEtHgRACnVKI524kCUbxHdxgvG-3rE4aEYVG6X5IA7n6wDs0fecl0juQ/s640/open.jpg" width="640" /></a></div>
<br />
Millions of years ago, when we didn't like what we saw or were intimidated, we would run away. Today in a business setting this translates to us <i>leaning away </i>from someone who says something we don't agree with. We lean back in our chair, or lean away, and in a more extreme case we <i>turn our body away from what we don't like or agree with</i>.<br />
<br />
This is evident when you watch political debates, and it is another problem that Bill Shorten has. He turns away slightly or looks with a sideways glance. Covering or turning away, known as "ventral denial", shows discomfort with what is being said or asked. An easy way to see this is to watch which direction the feet point. The feet point to where the body wants to go.<br />
<br />
If you talk to a client and their foot points to the exit door, they need to go. Jury consultant Jo-Ellan Dimitrius cites a study of jury members -- when jury members don't like a witness, they face the witness but their feet point to the exit door.<br />
<br />
So watch where the body (use the belly button as a directional indicator) points, and be aware of the more subtle version of feet pointing away.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAz-EiOBYarCLzyGGHj8eEfHRqgSOV0LAv0QOa6bZc8NrFjxbZyMW6wcouSPqqRUHXt7SEbRnG1CUxIxizSZwcKinaVjzyOHIVXvTnLiztTLZJrTqVVczl9BCx9v_s1xUXBv0qNyYJ1eM/s1600/aaaa.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAz-EiOBYarCLzyGGHj8eEfHRqgSOV0LAv0QOa6bZc8NrFjxbZyMW6wcouSPqqRUHXt7SEbRnG1CUxIxizSZwcKinaVjzyOHIVXvTnLiztTLZJrTqVVczl9BCx9v_s1xUXBv0qNyYJ1eM/s640/aaaa.jpg" width="640" /></a></div>
<br />
Bill Shorten is not faring well with his display of non verbals during this election campaign. I'll look at the verbals of the leaders in the next post.<br />
<br />Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-3166746296539457382016-06-23T18:28:00.002-07:002016-08-22T23:48:25.120-07:00
"Why Most Published Research Findings Are False."<br />
Why False Positives are the downfall of most research papers.<br />
<br />
The headline comes from this academic paper --<br />
<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/</a><br />
<br />
It's based around a massive statistical problem at the moment. Simply put, the problem is this - because computers and software have become so powerful, multiple testing is becoming the norm.<br />
<br />
Many years ago, one experiment would have been carefully considered and a hypothesis formulated; testing would then determine whether this hypothesis was correct or not at, say, the 95% level (the most common).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHHywCWIFPKBzGK8lziLU112sIvddC5XXv9FKRqNf-0JBcJgvLVfuMrwhadZkIwU5YPeHT_aBGm9Bcv0KJn0Qwkzgu3B3lMnwjzKAHp0k04utuiDrtKQvYq_-O3UiEG4C74or4d1DbQos/s1600/eklund_figure1-1024x836.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="261" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHHywCWIFPKBzGK8lziLU112sIvddC5XXv9FKRqNf-0JBcJgvLVfuMrwhadZkIwU5YPeHT_aBGm9Bcv0KJn0Qwkzgu3B3lMnwjzKAHp0k04utuiDrtKQvYq_-O3UiEG4C74or4d1DbQos/s320/eklund_figure1-1024x836.png" width="320" /></a></div>
<span style="font-size: x-small;">Above graph from paper showing brain scan data grossly exaggerated because of incorrect statistical tests and controls.</span><br />
<br />
<br />
This means that False Positives due to luck (thinking you've found something when you haven't) would occur at around the 5% level, and you could be fairly sure you had a positive result at the 95% level.<br />
<br />
Nowadays, running 20, 50 or 100 tests and looking for an effect <i>after the fact</i> creates a massively biased data set. It's not 5% False Positives anymore, it's more like 35-40% (in brain scans).<br />
<br />
You are now <i>guaranteed to have lots and lots of positive effects that are due to luck alone</i>. It's called data steering, among many other terms. The bias is so large you don't have a result even if you think you do.<br />
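<br />
A quick back-of-the-envelope calculation shows how fast this blows up. The Python sketch below assumes the tests are independent and that there is no real effect in any of them, which is a simplification (brain scan voxels, for example, are correlated), so treat it as an illustration rather than the figure for any particular study.<br />

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """P(at least one false positive) across n independent tests with no real effect."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n in (1, 20, 50, 100):
    print(n, round(familywise_error_rate(n), 3))
```

Under these assumptions, 20 tests at the 5% level already give roughly a 64% chance of at least one false positive.<br />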
<br />
In neuroscience and brain scans, they are testing thousands of voxels at a time. The errors are magnified the more tests are run, and researchers now realise they have serious problems with fMRI scan analysis --<br />
<br />
<a href="http://reproducibility.stanford.edu/big-problems-for-common-fmri-thresholding-methods/">http://reproducibility.stanford.edu/big-problems-for-common-fmri-thresholding-methods/</a><br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Yd0vjFDKx9ASYo3iZHJwIPbVYh0hKFGqW128L9E55MV32iQe-BXBwNdV1J0bhQD0nsKZFxS7BsKiwH_Tf_7dRcPXrgIGeE3C9MHXovRq5v7TNfK2EcHaxuOGPj2r5knf070Aksoejus/s1600/pain_small.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Yd0vjFDKx9ASYo3iZHJwIPbVYh0hKFGqW128L9E55MV32iQe-BXBwNdV1J0bhQD0nsKZFxS7BsKiwH_Tf_7dRcPXrgIGeE3C9MHXovRq5v7TNfK2EcHaxuOGPj2r5knf070Aksoejus/s320/pain_small.png" width="298" /></a></div>
<br />
<br />
This helps to explain why studies cannot be replicated in many, if not most, cases. Nowadays, only 18% of Pharma tests pass the Phase 2 stage, and only 50% pass the Phase 3 stage.<br />
<br />
It's been estimated that since 2004, only 7% of studies have accounted for this multiple testing error. So all studies from drug trials to economics that do not use some form of Multiplicity Control (compensating for the fact that multiple testing has been done) are useless!<br />
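<br />
To make "Multiplicity Control" concrete, here is a minimal Python sketch of one widely used form of it, the Benjamini-Hochberg false discovery rate procedure. This is the standard textbook procedure, not the newer method discussed in this post, and the p-values are made up for illustration.<br />

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Sort indices by p-value, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Made-up p-values: five are below 0.05, but only two survive the correction.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p))
```

Without any correction, five of these ten results would be reported as significant at the 5% level; after the correction only the two smallest p-values are kept.<br />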
<br />
So what does this have to do with word + deception analysis? Everything. Many variables are considered, and some kind of multiplicity control to compensate for this is vital.<br />
<br />
I emailed an Italian professor, Livio Finos at the University of Padova, who wrote the MATLAB code for this book, on which I have based my statistical tests -<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHAZP1FgeSgCjmxUt6mrvesPWA2nEP7j63t4gL0aOfcePRXY76eAiapgM4hqxSg1kpA11IibTrv166UhVJ4DuYgiSo3xD3BYD0NEsFx07dapWOkBw6DCPFydgry9NEnr80sPs7URKcYwk/s1600/complex.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHAZP1FgeSgCjmxUt6mrvesPWA2nEP7j63t4gL0aOfcePRXY76eAiapgM4hqxSg1kpA11IibTrv166UhVJ4DuYgiSo3xD3BYD0NEsFx07dapWOkBw6DCPFydgry9NEnr80sPs7URKcYwk/s320/complex.jpg" width="192" /></a></div>
<br />
I checked with him whether I had correctly understood the procedure he advocated in his paper on a new method of Multiplicity Control -- <a href="http://link.springer.com/article/10.1007%2FBF02741320">http://link.springer.com/article/10.1007%2FBF02741320</a><br />
<br />
He confirmed by email that I had the correct procedure. I then hired Victor, a Ukrainian freelance programmer, and after a few days of back-and-forth emails, culminating in a 4 hour Skype session in the middle of the night, I managed to get the MATLAB code written and tested. This means I can correct for multiple tests with a new, efficient process in addition to the industry standard FDR approach, and use Non Parametric Permutation Tests for all analysis, so the results I get are reliable.<br />
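<br />
For readers who haven't met a permutation test before, the idea can be sketched in a few lines of Python. This is a generic two-group test on a difference in means, written for illustration only; it is not the MATLAB code described above, and the function name and defaults are assumptions.<br />

```python
import random


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means between groups a and b."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up word rates (per 100 words) for two groups of speeches.
group_a = [2.1, 2.4, 1.9, 2.8, 2.6]
group_b = [1.2, 1.5, 1.1, 1.7, 1.4]
print(permutation_test(group_a, group_b))
```

The appeal of this approach is that it makes no assumption about the data being normally distributed: the p-value is simply the share of label shufflings that produce a difference at least as large as the one observed.<br />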
<br />Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-57455159713771423292016-06-23T16:29:00.002-07:002016-06-25T15:15:50.149-07:00
Turnbull, Shorten + 100 Years Of Election Speeches<br />
How do Malcolm Turnbull's election speech and Bill Shorten's Budget response compare to the last 100 years of election speeches in Australia?<br />
<br />
To find out, I downloaded election speeches from 1903 onwards from:<br />
<a href="http://electionspeeches.moadoph.gov.au/explore">http://electionspeeches.moadoph.gov.au/explore</a><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIB7lFUKEvcAHXMaJ52YIoViuPaWzFuc4oHKOR7UM9cxF63feroTgO7WDzbbaLnhfQcWrYzhaxalarwvbnfVrhl_MK2-MeXBc84FX0zNmrKkL3rnJjoBhCIukDuJ04V1a95w6iVuSStZ4/s1600/xxxxx.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIB7lFUKEvcAHXMaJ52YIoViuPaWzFuc4oHKOR7UM9cxF63feroTgO7WDzbbaLnhfQcWrYzhaxalarwvbnfVrhl_MK2-MeXBc84FX0zNmrKkL3rnJjoBhCIukDuJ04V1a95w6iVuSStZ4/s400/xxxxx.jpg" width="400" /></a></div>
<br />
<br />
I cleaned up the data, removing things like [APPLAUSE], and used a psychological text analysis tool called LIWC (Linguistic Inquiry And Word Count), developed by James Pennebaker at the University of Texas, which has been verified and used in over 6000 articles and studies on Google Scholar.<br />
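<br />
The clean-up step is simple to sketch. The Python below strips anything in square brackets, on the assumption that bracketed text in a transcript is always an annotation such as [APPLAUSE] and never part of what was said; the function name is made up for this example.<br />

```python
import re


def strip_annotations(text):
    """Remove bracketed annotations such as [APPLAUSE] and tidy up the whitespace."""
    without_tags = re.sub(r"\[[^\]]*\]", " ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


print(strip_annotations("We will deliver. [APPLAUSE] And we will govern. [Laughter]"))
# prints: We will deliver. And we will govern.
```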
<br />
The program looks at about four and a half thousand words in eighty categories. These categories tap emotional and cognitive dimensions of the speaker, revealing things like state of mind, tone and how analytic someone is, as well as keeping track of the almost "invisible" function words used in speech.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0S3DF9tyoFs6WKvYA899BPP6WwkKrr9qBhac77YYcyY00e_hxGDfYemNzV_kWt1ZzgvLkgTGR4q1gLL-R1_mUXkkGgSRrdyr9gOgvatEeN4k2_ok3BCDHM_K2qw2TMOp2mE_rU03At3Q/s1600/liwc2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="276" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0S3DF9tyoFs6WKvYA899BPP6WwkKrr9qBhac77YYcyY00e_hxGDfYemNzV_kWt1ZzgvLkgTGR4q1gLL-R1_mUXkkGgSRrdyr9gOgvatEeN4k2_ok3BCDHM_K2qw2TMOp2mE_rU03At3Q/s320/liwc2.jpg" width="320" /></a></div>
<br />
<br />
Custom MATLAB code was used to find highly significant groups of words that were able to separate the winners from the losers over all the elections.<br />
<br />
This in itself was amazing to me; I wasn't sure whether there would be a clean separation between what the winners of the last 100 years of elections said, and what the losers said.<br />
<br />
It turns out election winners tend to use certain language, as do the losers. Comparing Malcolm Turnbull's election speech to this "model" placed him in the winner group, while placing Bill Shorten in the losers group by a long margin, hence the prediction that Turnbull has about an 85-90% chance of winning.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmU1PfifXk_Ku84K37QzptNGRgxAGBsO8XoTV0_WTo1V3OeS7a0y0xF9bgfF1Xl38iKUJWyn7r3JIsxnfx08ytZh-7ZqhlAHfBVlecXWWBbNREPJW9LrhZ-W7S-dmbxawcIrkoQokyCsA/s1600/graph33.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmU1PfifXk_Ku84K37QzptNGRgxAGBsO8XoTV0_WTo1V3OeS7a0y0xF9bgfF1Xl38iKUJWyn7r3JIsxnfx08ytZh-7ZqhlAHfBVlecXWWBbNREPJW9LrhZ-W7S-dmbxawcIrkoQokyCsA/s320/graph33.jpg" width="320" /></a></div>
<br />
Some of these categories are positive (which winners have more of) and some are negative (which losers typically have more of). For instance, the text analysis program has groups of words relating to power-awareness, capturing the degree to which people use words such as command, boss, victim and defeat. This measures how aware speakers are of the power they have. In Australian politics less is better: it tended to be higher in the losers group.<br />
<br />
Turnbull comes across higher on the Authentic algorithm (capturing cognitive complexity and relatively low rates of negative emotion) and also his Tone is higher, both being positive.<br />
<br />
On the downside, Turnbull makes greater use of the word They (from the last post), which is negative.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnzrQWHchCOqjqtKpH1oMKcHs76PREYA7jktDxpSd46CCMk5Y15s-_hIAcVh808rnotzUablvcUrOfllhqwOdY7d5aLypNWUa7fWximlLT8ghsLAO5Q2SDiqqY4dOILvroYwZ9uogYZU/s1600/equivVSneg.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="295" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnzrQWHchCOqjqtKpH1oMKcHs76PREYA7jktDxpSd46CCMk5Y15s-_hIAcVh808rnotzUablvcUrOfllhqwOdY7d5aLypNWUa7fWximlLT8ghsLAO5Q2SDiqqY4dOILvroYwZ9uogYZU/s320/equivVSneg.jpg" width="320" /></a></div>
<br />
<br />
Shorten has a problem with far too much equivocation. This is hedging language which reduces commitment and allows one to minimise what has been said if it turns out to be wrong at a future date: words like I believe, I think, I thought, suppose and so on. He also uses too many negations such as couldn't, shouldn't and wouldn't (which is an indicator of deception or spin if it's not in response to a specific question).<br />
<br />
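As a rough illustration of how such word categories can be counted, here is a minimal Python sketch. It is not the text analysis software used for this post; the word lists contain only the examples named above, and the function name rate_per_100_words is made up for the example.<br />

```python
import re

# Illustrative lists only: just the examples mentioned in the post,
# not the full dictionaries used by the real text analysis software.
HEDGES = ("i believe", "i think", "i thought", "suppose")
NEGATIONS = ("couldn't", "shouldn't", "wouldn't")


def rate_per_100_words(text, phrases):
    """Return how many times the phrases occur per 100 words of text."""
    # Lower-case and normalise curly apostrophes so "Couldn’t" matches.
    lowered = text.lower().replace("\u2019", "'")
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    # Re-join the cleaned words so multi-word phrases can be matched.
    normalised = " ".join(words)
    hits = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", normalised))
        for phrase in phrases
    )
    return 100.0 * hits / len(words)


sample = "I think we couldn't win, and I suppose they wouldn't"
hedge_rate = rate_per_100_words(sample, HEDGES)
negation_rate = rate_per_100_words(sample, NEGATIONS)
```

Rates per 100 words make speeches of different lengths comparable, which is why a rate rather than a raw count is used here.<br />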
This analysis was Turnbull against ALL the election winners, then against all the election losers; the same was done with Shorten. It compared them to the model - how well did they fit in the previous election winners group, how well did they fit in the losers group.<br />
<br />
It did not compare Turnbull's speech directly with Shorten's; I will be doing that in the next post.Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-69940868175582496682016-06-19T23:04:00.001-07:002016-06-22T15:07:22.080-07:00The politics of WE and THEY.The Secret Life Of Pronouns by James Pennebaker used statistical analysis and research to show that pronouns such as I, we and they, as well as the small function words that are essentially "invisible" to us in day-to-day communication, actually reveal our feelings and state of mind, and come close to being a fingerprint for each individual.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn0hwKS6q0WmMDeU2ADDu1W9gZBpSCbBjejdxmkGTZAuvdI0pqx2NbJ1SekMN0pIvoaF7qDG802Y0lm6JNIxbXv9v_wI3zJOnLte947DTt4Dy_X9PiJM9603dqeECyUbLFqGyHDS3uth0/s1600/graphic.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="285" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjn0hwKS6q0WmMDeU2ADDu1W9gZBpSCbBjejdxmkGTZAuvdI0pqx2NbJ1SekMN0pIvoaF7qDG802Y0lm6JNIxbXv9v_wI3zJOnLte947DTt4Dy_X9PiJM9603dqeECyUbLFqGyHDS3uth0/s320/graphic.jpg" width="320" /></a></div>
<br />
<span style="color: #6d6b6b; font-family: sans-serif; line-height: 21.4500007629395px;"><span style="font-size: xx-small;">Credit: Jane Fennessy/Blue Vapours</span></span><br />
<span style="color: #6d6b6b; font-family: sans-serif; font-size: 14.3000001907349px; line-height: 21.4500007629395px;"><br /></span>
The way we write reveals a lot of psychological information about us, and is unique to such a degree that anonymous text can be attributed to its writer.<br />
<br />
J.K. Rowling was unmasked as (and admitted to being) "first-time" writer Robert Galbraith; she was detected solely through her use of language, in particular the small pronouns and function words.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCu1jrT1bBOAUlFIfxe1wcB3l0K4Y9j7uNhuEN67J_hiz9Xnh-neY0eX90GxVUI1TOGDN7gd2DDzNAmDlvbMP4CH6aoKX_h4T5gYWpJfma9RCvBlKpYAnNnxJPbVOxVvJfVI0-LQioA0Q/s1600/2016-06-20_134239.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCu1jrT1bBOAUlFIfxe1wcB3l0K4Y9j7uNhuEN67J_hiz9Xnh-neY0eX90GxVUI1TOGDN7gd2DDzNAmDlvbMP4CH6aoKX_h4T5gYWpJfma9RCvBlKpYAnNnxJPbVOxVvJfVI0-LQioA0Q/s400/2016-06-20_134239.jpg" width="205" /></a></div>
<br />
So what does this have to do with politics? Words, in particular pronouns, are vital tools of the politician. During a speech or interview they attempt to present themselves in a positive light, while presenting negative aspects of their opponents, and pronouns in particular are used for this purpose.<br />
<br />
One way of doing this is the WE (or us) against THEM dichotomy. Vote for us and you'll be better off, vote for them and you'll be worse off.<br />
<br />
And with politics becoming more <i>personality driven</i> as the major political parties become less ideological, politicians present different identities to different voter groups in an attempt to relate to each of them. A key persuasion principle is that people like people who are <i>similar</i> to them.<br />
<br />
The meaning of WE is that of group membership: it's the rest of the group that includes me. Politicians' use of the pronoun WE has several purposes: to talk on behalf of their party, to reduce their individual responsibility, or to include or exclude listeners from a group. This makes WE a very useful linguistic tool for the politician.<br />
<br />
"WE as a nation stand tall..." refers to all of us.<br />
"WE have been able to build ever closer links...." now refers to the party.<br />
"WE are a true and trusted friend..... " now refers back to all of us.<br />
"WE are called upon to do so...." is ambiguous, but refers to a political event.<br />
"What do WE know?" refers to all Australians.<br />
"WE know that the Liberals are in danger of blowing out the budget......." now becomes the political party.<br />
<br />
So in using WE, there is a continual shift between all of us (all Australians) and the party, and the <i>listener can include themselves in that group or not.</i><br />
<br />
THEY is similar as a distancing tool: "THEY are not able to balance the budget", "THEY are worried...", "THEY will say......"<br />
<br />
THEY can include the Australian people or the party, so a politician saying "THEY won't be fooled by...." is distancing himself from that claim by attributing that premise to all Australian people.<br />
<br />
So WE and THEY involve distancing or including, allowing the listener to be part of that group or not, and even as an ambiguous reference allowing the politician to appeal to a wider group, and allowing the comment to be "softened" or negated at a later date.<br />
<br />
The primary goal is to always represent oneself in a positive light, and WE and THEY are very useful for that.Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-60852389134489868202016-06-15T00:29:00.000-07:002016-06-19T20:32:26.809-07:00Betfair Odds on Liberals Diving....A five cents dive on Betfair for the Liberals in 24 hours puts them at an 86% chance of winning.<br />
<br />
The markets are now nearly in the middle of my estimate from 5 weeks ago of an 85-90% probability in the Liberals' favour.<br />
<br />
Yet just this weekend media reports have them "neck and neck".<br />
I think not.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjqZ-z9pDT5HiJfCv-gahYIdjjrbZUG_XLRHQFilrFDIWayFOMB8rrd84A8e3vdJSR_g1qfl9DSfBFBkEU-fT5ISLBLREbkMghB9gbkgJVR5Zm9XziyhBxKmba0togFttCsxKEKPtXcps/s1600/2016-06-15_171430.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="387" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjqZ-z9pDT5HiJfCv-gahYIdjjrbZUG_XLRHQFilrFDIWayFOMB8rrd84A8e3vdJSR_g1qfl9DSfBFBkEU-fT5ISLBLREbkMghB9gbkgJVR5Zm9XziyhBxKmba0togFttCsxKEKPtXcps/s640/2016-06-15_171430.jpg" width="640" /></a></div>
<br />Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0tag:blogger.com,1999:blog-1417239720543668306.post-89027528751475281762016-06-14T11:35:00.000-07:002016-06-21T15:31:35.907-07:00Elections and DeceptionsElection mode in Australia and the U.S means electoral spin and truth stretching, although in Trump's case it was stretched and broken long ago -- Pulitzer-winning web site PolitiFact.com and its Truth-O-Meter have tracked Trump's election statements and rated 70% of them as false or lies!<br />
<br />
Politics is becoming less about policies and political messages and more personality driven as the major parties begin to look more and more similar.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvX8hD4IY8v_Nv3yrYDfFZYP2wni7AZYbZ7qiz0kOWG_tkO4wfy-ozFYa2vjbENn5MrJhu9LpF-HTzFOGsgunBJNldcVLWtkNSSdn2_G5xHdR3uT9IbbINutEFkCrW_DMW_LhbbWnru5g/s1600/7285216-3x2-940x627.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvX8hD4IY8v_Nv3yrYDfFZYP2wni7AZYbZ7qiz0kOWG_tkO4wfy-ozFYa2vjbENn5MrJhu9LpF-HTzFOGsgunBJNldcVLWtkNSSdn2_G5xHdR3uT9IbbINutEFkCrW_DMW_LhbbWnru5g/s640/7285216-3x2-940x627.jpg" width="640" /></a></div>
<br />
<br />
Studies from the U.S show over 70% accuracy in predicting election outcomes by showing voters photographs of politicians and asking them who looks more competent, and who looks more trustworthy.<br />
<br />
With a more cynical electorate, politicians try to appeal to a wide range of people, to be everything to everyone, and so represent themselves with varying degrees of spin - a form of deception.<br />
<br />
Politicians don't answer questions that would make them look bad or alienate part of their electorate, and so use equivocation and hedging to minimise the impact of what they are saying, even ignoring the question (ignoring the question in a political interview happened around 40% of the time, UK, Peter Bull, Claptrap And Ambiguities).<br />
<br />
So words from speeches and interviews can be used to track spin and honesty among politicians. And just like in FBI Statement Analysis, we tend to see discomfort and stress in the communication leakage of politicians who are stretching the truth. We see it in the "I" pronoun usage being reduced as they distance themselves from what is being said, we see it in an increase of words conveying negative emotions (Pennebaker, Uni Texas), and see it with an increase of action words to keep the story moving along.<br />
<br />
<span style="text-align: center;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgec6GJeF4i2iAJBkA0_lT0HTMJ9xWljKjqdxrZHcn9W7atfGeRpClc1OUOdsysljH-ING4nllH6hqUoKzLM9P0jci1_Icc2C1c7WKez3TLcC0JmGF-LgMu6uP1PM99aR8LsFh53q3iZm4/s1600/congress.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgec6GJeF4i2iAJBkA0_lT0HTMJ9xWljKjqdxrZHcn9W7atfGeRpClc1OUOdsysljH-ING4nllH6hqUoKzLM9P0jci1_Icc2C1c7WKez3TLcC0JmGF-LgMu6uP1PM99aR8LsFh53q3iZm4/s640/congress.jpg" width="640" /></a></div>
<br />
<br />
<br />
<br />
Words have also been used in election and voter sentiment prediction, such as tens of thousands of twitter feeds being used to predict the German election outcome based on voter sentiment--<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCzdOcvG2E3PCm9GlQ26QCSon-Go1gyDanDJg_ZiN3R6Otiiy2chYzhP-Brv0YpJgn66eZQ7w1QRAgMD39MoPOIICCIhWW2WkMf5dcQ_tM7YTF7eH0GzI6kDsjk2e75InETXhbtKtv9L4/s1600/twitter.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="408" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCzdOcvG2E3PCm9GlQ26QCSon-Go1gyDanDJg_ZiN3R6Otiiy2chYzhP-Brv0YpJgn66eZQ7w1QRAgMD39MoPOIICCIhWW2WkMf5dcQ_tM7YTF7eH0GzI6kDsjk2e75InETXhbtKtv9L4/s640/twitter.jpg" width="640" /></a></div>
<br />
<br />
Jeremy Frimer also tracked voter sentiment with what he calls Prosocial language, plotting a direct correlation between the language used in Congress and voter sentiment, which plunged to an all-time low of 9% satisfaction with Congress.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8S3U39Rli_QwipN9GrXHFG2ZVVCITSNavKeEuMAcfBWvMnUk4Kw9tn7ol7MimclBYEzue1aLG5cDDCYMWQ_eQ2NMyrH-sdvxDacPoQAXyjcxbalwhQcCHj2oATZ7K_2bhG8nLAdFQ9ls/s1600/voter.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="258" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8S3U39Rli_QwipN9GrXHFG2ZVVCITSNavKeEuMAcfBWvMnUk4Kw9tn7ol7MimclBYEzue1aLG5cDDCYMWQ_eQ2NMyrH-sdvxDacPoQAXyjcxbalwhQcCHj2oATZ7K_2bhG8nLAdFQ9ls/s320/voter.jpg" width="320" /></a></div>
<br />
<br />
I have analysed election speeches from Malcolm Turnbull and Bill Shorten, as well as the election speeches of every Prime Minister and Opposition Leader in Australia going back to 1903, using psychological text analysis software and <i>custom MATLAB code</i>, to determine whether a model could be built on linguistics alone (on what is said in speeches and interviews) to separate the winners from the losers.<br />
<br />
It turns out that there are 9 categories or "bundles" of words that Australian election winners use more of and election losers use less of, or vice versa. This is significant at the 5% level.<br />
<br />
This allows for a fairly easy creation and testing of a linguistic model for winning Australian Elections.<br />
<br />
So for example, words that relate to negative emotions (nearly 400 words selected by psychologists are in this group) are highly significant, and using them reveals conscious and even subconscious negative emotions about what is being said. This type of analysis has also been used in criminal statement analysis.<br />
<br />
I ran my model over Malcolm Turnbull's speeches as well as Bill Shorten's and gave Turnbull an 85-90% probability of winning. This was 4 weeks ago, when the Betfair odds on the Liberals were $1.45. I was hoping they would lengthen for the Liberals, but only Shorten's have lengthened!<br />
<br />
Alas, Turnbull is at unbackable odds of $1.22 at the moment, which translates into about an 80% chance of winning, so the Betfair market is moving closer to my assessment.<br />
<br />
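For reference, decimal odds convert to an implied probability as 1 / odds. Here is a minimal Python sketch of that arithmetic; it ignores Betfair commission and any market margin, and the function name implied_probability is made up for the example.<br />

```python
def implied_probability(decimal_odds):
    """Convert decimal odds (e.g. 1.22) to an implied win probability.

    Simplification: ignores commission and market margin.
    """
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


earlier = implied_probability(1.45)  # roughly 0.69
current = implied_probability(1.22)  # roughly 0.82
```

So the move from $1.45 to $1.22 corresponds to the market's implied chance rising from roughly 69% to roughly 82%.<br />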
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Jn4Mh5de0C5EoEojdfZuI2aiVdrjAG04mNLOhLTZyOzvkI_W_2DuBV9Yb7qxYGyPH0ZNSzDiH1rD8WuPWGWwbSGQIwGTgvDui_8VjfkacRkC70CW5EbOwNkx1txMUSjVfo35AcAcuaM/s1600/xxxxxx.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="416" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Jn4Mh5de0C5EoEojdfZuI2aiVdrjAG04mNLOhLTZyOzvkI_W_2DuBV9Yb7qxYGyPH0ZNSzDiH1rD8WuPWGWwbSGQIwGTgvDui_8VjfkacRkC70CW5EbOwNkx1txMUSjVfo35AcAcuaM/s640/xxxxxx.jpg" width="640" /></a></div>
This is despite the media hyping up the chances of Labor, painting the election as "too close to call" about a week ago. The significance of the Betfair market (and most betting markets) has been well known for quite a while as far more accurate than polls.<br />
<br />
The lower take on winnings from Betfair and zero longshot/favourite bias means a more accurate barometer than bookmakers, media and polls. See for example <a href="http://researchdmr.com/RothschildPOQ2009.pdf">http://researchdmr.com/RothschildPOQ2009.pdf</a><br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8vmpyYhoPpeF35dfLaMUvOpBCvi5M4wR-cEkWS6_P-6V9wrByG-PDyzi2H2gRHTQdD9xGiuvrHvcN84yFHImdIcYlXCvPAsszSVsuhabEg7ZxrzppLPv8RuyUvfc6xz0mHIS9NZ3Yau0/s1600/4000.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="384" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8vmpyYhoPpeF35dfLaMUvOpBCvi5M4wR-cEkWS6_P-6V9wrByG-PDyzi2H2gRHTQdD9xGiuvrHvcN84yFHImdIcYlXCvPAsszSVsuhabEg7ZxrzppLPv8RuyUvfc6xz0mHIS9NZ3Yau0/s640/4000.jpg" width="640" /></a></div>
<br />
<br />
<br />
Of course, it's not just politicians stretching the truth during a campaign, it's the media too. Each has an agenda to push.<br />
<br />
<i>Language is never neutral.</i> The same words can mean different things to different people because we all carry our own internal dictionary. When we listen to someone talk, we assume things based on our interpretation of what they are saying. Words create pictures in our mind.<br />
<br />
That's why the Bush campaign used "Tax Relief" instead of "Tax Cuts"; the visual mind's-eye message is much more powerful because you are now giving people relief from a burden. And so "Drilling for Oil" becomes "Energy Exploration", and "Gambling" becomes "Gaming" (you don't think of a loser in a raincoat with a crushed up form guide anymore, gaming is something new and exciting with family-friendly venues).<br />
<br />
Whenever euphemisms are used, such as the military use of "collateral damage" or the media love of the legal euphemism "execute" for a wanton murder, an agenda is being promoted.<br />
<br />
Words are the toolbox of persuasion. Language experts are on both sides of politics, advising which words to use and which phrases poll higher: experts like Frank Luntz for the Republicans and George Lakoff for the Democrats.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbbFDm1obkPLNsFAypqRP0wUBt6Zdh2cllxOLejbdqGQ_wbA5PrDlzYTq_9Kp0N77ksET5TnQt2m2o3wVzUbQeqfiSU7KJfeHFfVqdMC3B19TV8WOkB2_3anCPXMfeJhGr383dKw-0ksc/s1600/lak.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="259" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbbFDm1obkPLNsFAypqRP0wUBt6Zdh2cllxOLejbdqGQ_wbA5PrDlzYTq_9Kp0N77ksET5TnQt2m2o3wVzUbQeqfiSU7KJfeHFfVqdMC3B19TV8WOkB2_3anCPXMfeJhGr383dKw-0ksc/s640/lak.jpg" width="640" /></a></div>
<br />
<br />
Both of these experts have written books which can be bought on Amazon, and show the current state of the art as far as political linguistics goes.<br />
<br />
If you accept a word and use it, you accept <i>the frame </i>(as Lakoff calls it) that comes with that word.<br />
<br />
Different interest groups use words that frame their agenda the way they want it. So "Boat People" becomes "Refugees" which becomes "Asylum Seekers", while others may say "Economic Refugees" depending on their frame, and so on.<br />
<br />
A handy way to see how word usage changes over time, becoming more or less popular, is to use Google's Ngram Viewer:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi00luo1YxXRGGmR3N1kIPPz5IiYudjFhOFMWHGuMJUp-xk2dPAw7ERfUDjLZcJmgNRWMyKophF3sCxFSRDLo8U6uxIJbQZkk-daTsJJU2K2yTX4hAzRk1YQNDKdrtVthmQh2ItSMisLNY/s1600/ngramviewer.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="310" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi00luo1YxXRGGmR3N1kIPPz5IiYudjFhOFMWHGuMJUp-xk2dPAw7ERfUDjLZcJmgNRWMyKophF3sCxFSRDLo8U6uxIJbQZkk-daTsJJU2K2yTX4hAzRk1YQNDKdrtVthmQh2ItSMisLNY/s640/ngramviewer.jpg" width="640" /></a></div>
<br />
<br />Tom Bergerhttp://www.blogger.com/profile/03196089855453332363noreply@blogger.com0Hobart TAS, Australia-42.881903 147.32381399999997-43.627037 146.03292049999996 -42.136769 148.61470749999998