Deception Detection In Non Verbals, Linguistics And Data.

Lessons From The Poker Table --Spot The Liar In The Real World.

I have been playing small stakes poker for a few years now, with the original intent to study the cues and tells that come from bluffing to see if I could translate generic observations from the poker table into real world deception detection.

Reading about a police interview technique developed by Professor R.E Geiselman of UCLA, an expert in detecting lies, gave me my first clue:

"When asked if they want to add anything, deceptive people tend to say NO quickly whereas truthful people either go ahead and add something new or they at least think about it before saying NO."

It occurred to me that a speeding up and slowing down behaviour translated directly into a specific bluffing scenario that went like this:

1 -- If you are strong, but bluffing to be weak, what do you do? You look at your cards and then your chips and pretend to think whether you are going to increase your bet. You stall.

2 -- If you are weak but bluffing to be strong, what do you do? You don't hesitate, you move your chips out quickly.

This poker table analogy holds directly in a real world scenario: People who move or act too quickly (quicker than their baseline!) are potentially deceptive or lacking confidence. It's a red flag moment at the table and it's a red flag moment in the real world too. Of course, this means you need a baseline, a situation where some small talk has taken place beforehand and where you have had an opportunity to gauge behaviour.

Something else I noted on the poker table was that people who I thought were bluffing seemed to be slightly more friendly, or polite at that point, as if not wanting to antagonise other players or draw attention to themselves.

Frank Enos from Columbia University in his thesis says:
"Preliminary findings suggested that pleasantness is the most promising factor in predicting deception..."

As expected, there is an overlap between the poker table and real world scenarios, making it a great laboratory to study human behaviour. It's just a matter of paying attention.

How to Get your Message Out With Clinton's Media Strategist Idea.

An effective linguistic technique to get your messages out comes from Bill Clinton's media strategist, as told in Dick Morris's book Behind The Oval Office.

Clinton was frustrated by the fact that he had created millions of jobs and cut the deficit, but it went largely unnoticed and uncredited.

His media strategist Bob Squier suggested that two messages be combined, creating a presupposition for one of the messages. The idea, said Squier, was to talk about the jobs that had been created while also talking about what you are going to do.

Squier continued, "For example, the seven million jobs we've created won't be much use if we can't find educated people to fill them. That's why we want a tax deduction for college tuition to help kids go to college to take those jobs."

This turned out to be very effective and works because it assumes or presupposes that part of the message is a fact.

Let's say you want to get the message out:
1 -- This is the world's safest car.
2 -- Now you can afford it.

Putting both premises across individually allows someone to dispute both messages.
Instead, combine the two messages into one, such as:

Now you can afford the world's safest car.

If you disagree with this message, you are disagreeing about the fact that you can afford the car, not that it is the world's safest car, because the safety aspect is now assumed or presupposed.

In fact this technique has been used by advertisers and politicians for a long time and it's been shown to be an effective way to get a message out because in our busy daily life we assume presuppositions are correct to save ourselves cognitive processing time.

Miller's Law And Lochte's "Apology".

Twenty-four hours after I deconstructed Ryan Lochte's lie in my last post, the U.S. swimmers admitted they had lied about the robbery after they trashed a service station.

Lochte issued an apology -- sort of.

"I wanted to apologize for my behavior last weekend -- for not being more careful and candid in how I described the events of that early morning and for my role in taking the focus away from the many athletes fulfilling their dreams of participating in the Olympics," he said Friday on Instagram.

As in lie detection, where you need to listen very carefully to what is said without putting your own interpretation on it, the same applies to listening to an apology.

What he probably means is he regrets the lack of care in telling his lie, which made it so easy to deconstruct.

Miller's Law, from Princeton Professor George Miller, instructs us to suspend judgement and not put our own interpretation on what someone says.

The law states --
"To understand what another person is saying, you must assume that it is true and try to imagine what it could be true of."

This is a way of stopping you from making a snap judgement and interpreting what someone is saying using your own internal "dictionary", because very often they are using their own different internal "dictionary".

Deception Analysis Of Ryan Lochte’s Robbery Account

I've been analysing police criminal statements to determine verbal/written cues for quite a few years now, with a view to developing automated software to red flag statement inconsistencies and deception.

It turns out that there is evidence that liars tend to tell a less coherent story, items are more likely to be out of sequence, they are less likely to include conversations or sensory details such as what something smelled or looked like, and there are more likely to be contradictions, and so on.

My interest was piqued by the controversy arising from the reported robbery in Rio of 4 U.S. swimmers, and the fact that the judge said that there are inconsistencies between the swimmers' statements.

I decided to have a look at Ryan Lochte’s statement if I could find a direct quote.
This is what Lochte said on NBC Today:

“We got pulled over, in the taxi, and these guys came out with a badge,
police badge, no lights, no nothing just a police badge and they pulled us over.

They pulled out their guns, they told the other swimmers to get down on the ground —
they got down on the ground.

I refused, I was like we didn't do anything wrong, so —
I'm not getting down on the ground.

And then the guy pulled out his gun, he cocked it, put it to my forehead and he said,
'Get down,' and I put my hands up, I was like 'whatever.' 

He took our money, he took my wallet — he left my cell phone, he left my credentials.”

Looking at some of the most interesting parts of the statement, Lochte starts off with “we got pulled over”, and then ends the sentence with an out of sequence “they pulled us over” after telling us what he didn't see.

When most people are asked an open question, they describe what happened, not what didn’t happen.
Saying what didn’t happen in response to an open question is called a spontaneous negation by FBI agent John Schafer in his book and is a red flag in deception.
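A spontaneous negation check is easy to rough out in code. The following is only a minimal sketch: the list of negation markers is my own small, hypothetical selection rather than anything taken from Schafer's book, and a flagged word is a prompt to look closer, not proof of deception.

```python
import re

# Hypothetical, deliberately small set of negation markers (an assumption,
# not a published list): a few whole words plus any contraction ending in n't.
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|never|nothing|nobody|none)\b|\b\w+n't\b",
    re.IGNORECASE,
)


def find_negations(statement: str) -> list[str]:
    """Return every negation marker found, in order of appearance."""
    # Normalise curly apostrophes so "didn’t" and "didn't" are treated alike.
    normalised = statement.replace("\u2019", "'")
    return [m.group(0).lower() for m in NEGATION_PATTERN.finditer(normalised)]


statement = (
    "We got pulled over, in the taxi, and these guys came out with a badge, "
    "police badge, no lights, no nothing just a police badge and they pulled us over."
)
print(find_negations(statement))  # ['no', 'no', 'nothing']
```

Run on the first sentence of the statement, it picks out "no lights, no nothing", the part where we are told what was not there.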

“....they told the other swimmers to get down on the ground...”

Lochte didn’t say “they told us to get down on the ground”, he said the “other swimmers” were told this. He is isolating himself from the group. It’s no longer we and us.

It seems Lochte is still standing around, with attitude to boot (“I’m not getting on the ground”).
When a gun is pointed at his forehead and he is told to get down, after the other swimmers were told to get down and did so, he puts his hands up.

Then some interesting bits: “And then the guy pulled out his gun....”

1 – “Then” indicates that some time had passed; perhaps something is being skipped over.

2 – “the” is out of context. In “And then the guy pulled out his gun, he cocked it...”, using “the” in this manner implies that the gunman has already been introduced.

3 – The most obvious glaring problem is that the guns were already out in the earlier part, but now we are being told “then the guy pulled out his gun”.

4 – The gunman tells him to get down and then he puts up his hands.

5 – Lochte portrays himself as a hero by being dismissive towards the gunman with the “whatever” attitude.

6 – “He took our money, he took my wallet..”
It wasn’t “they took our wallets”. Lochte is treated differently again, with his wallet being taken by the single gunman, while the others had their money taken.

This statement is riddled with inconsistencies and red flags and appears very deceptive.

It would seem something else happened that is being covered up with this “robbery”.
Lochte never told the police about the robbery, he sent a text message to his mother afterwards who was also in Rio.

Only when media reports came out via his mother did police get Lochte and another swimmer, Feigen, in to make statements. Reportedly Lochte’s statement said there was only one gunman involved while Feigen's statement said there were several robbers but only one was armed.

Media report:
"Judge Blanc De Cnop noted that Lochte had said a single robber approached
the athletes and demanded all their money (400 real, or $124).

Feigen's statement said a number of robbers targeted the athletes
but only one was armed, the statement said. Another potential issue
highlighted by judge was the behavior of the athletes on arrival at the
Olympic Village in the aftermath."

Lochte’s mother played it down, saying, “They just took their wallets and basically that was it.”

Looking at Lochte’s statement on NBC, there are many red flags raised, but bringing all the other media reports into the mix lifts this to another level. 

I think this whole episode was best summed up by local television news announcer Mariana Godoy --

"So the American swimmer lied about the robbery?  He went from one party to another party and didn't want to tell his Mommy about it?"

Melania Trump Plagiarises Michelle Obama Speech

Jarrett Hill first noticed the close similarities of Melania Trump's Speech to Michelle Obama's 2008 speech --

The paragraphs in question are very close:
from:NPR Politics

Running both full speeches through the anti-plagiarism detection software Jstylo from Drexel University, along with 60 extra random emails from Enron to act as a placebo, and using a Bayesian text classifier, gives this:

The anti-plagiarism software picks Michelle Obama's 2008 speech as the closest match to Melania's 2016 speech. In this case it is 100% sure.

Trump is running at 76% lies in his statements according to verification website Politifact's truth-o-meter, and his wife seems to have acquired the deception habit too.

The Lying Game--U.K Police Statement Video

A superb 1 hour show with U.K. police and forensic psychologist statement analysis of public TV appeals to spot the liar in murder cases.

Take note how people close their eyes in a blocking action when they don't like what they hear.
Notice too how the U.K. police seat suspects on a couch instead of at a table and chair, making it much easier to watch non verbals.

In particular, when people lie, there is a "cognitive overload", and instead of the gestures which are normal and feature in most honest conversation, the body "locks down" and doesn't move.

Liars rehearse lies, but seldom rehearse gestures.

A terrific show.

Unmasking The JonBenet Ransom Note With Stylometry Software (new additions July 2018)

The tragic and bizarre murder of JonBenet Ramsey is 20 years old. The ransom note from this case is analysed using the latest stylometric software to determine the authorship.

The program Jstylo has Writeprints as its backbone, which "automatically extracts thousands of multilingual, structural, and semantic features to determine who is creating 'anonymous' content online. Writeprint can look at a posting on an online bulletin board, for example, and compare it with writings found elsewhere on the Internet.

By analyzing these certain features, it can determine with more than 95 percent accuracy if the author has produced other content in the past." (University of Arizona)

The software uses "cutting-edge technology and novel new approaches to track their moves online, providing an invaluable tool in the global war on terror". (University of Arizona)

Over the years there have been dozens of handwriting studies done, but considering that this long, rambling, strange and bizarre ransom note was designed to disguise handwriting (the letter a, for example, changes 6 times in its construction), logically there would never be a match that would stand up in court.

Drexel University released anti-plagiarism software called Jstylo, which piqued my interest in this murder case and the ransom note.

There are 375 words in the ransom note. Forsyth and Holmes show that a minimum of 250 words is required to attribute a document to an anonymous author.
R. S. Forsyth and D. I. Holmes, “Feature-finding for text classification,” Literary and Linguistic Computing, vol. 11, no. 4, pp. 163–174, 1996.

This made it viable to test the ransom note against writing from Patsy and John Ramsey.

The software has been shown to be effective with accuracy rates of around 80% in identifying anonymous users on hack forums, with probabilities rising to 93%-97% accuracy in identifying a target document from among 50 authors (Abbasi and Chen). Rates drop to around 90% for 100 authors.

It's also been used to identify programming source code authorship.

I downloaded their software, Jstylo.

This is superb for a few reasons: it has embedded in it WEKA, an incredible data mining suite from the University of Waikato, NZ, and also WRITEPRINTS, the gold standard forensic stylometric characteristic generator for author identification, with an automated interface.

Combined together, over 800 variables are created by Writeprints Limited for each piece of text, which is then analysed by Weka for a linguistic "fingerprint" amongst all the test samples you give it.

Stylometry is the statistical analysis of writing style to identify authorship. This style involves many "invisible" words such as articles, function words, adverbs and pronouns which become unique to us as we develop our writing style; it is not just a frequency count of obvious words. The hidden, unconscious aspect of this makes it ideal for computer analysis. (James Pennebaker, The Secret Life Of Pronouns)
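To make the "invisible words" idea concrete, here is a minimal sketch of one stylometric feature family: the relative frequency of a handful of function words. The word list is a tiny illustrative assumption of mine; it is not the Writeprints feature set, which is far larger and covers much more than word counts.

```python
import re
from collections import Counter

# Small illustrative set (an assumption); real feature sets are far larger.
FUNCTION_WORDS = ("the", "a", "an", "and", "of", "to", "in", "i", "we", "you", "it", "that")


def function_word_profile(text: str) -> dict[str, float]:
    """Relative frequency of each function word, per word of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {word: counts[word] / total for word in FUNCTION_WORDS}


profile = function_word_profile("The cat and the dog sat in the sun.")
print(profile["the"])  # 3 of the 9 words are "the"
```

Two authors writing about the same topic will share the obvious content words, but profiles like this one tend to differ, which is what the classifier gets to work with.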

There was always suspicion on the parents because of their strange behaviour. There is interesting video interview footage on the internet, where they ask Patsy if she would take a lie detector test and she says she would whilst simultaneously shaking her head in a no motion, a classic incongruity between what was said and done (see former FBI agent Joe Navarro's book What Every Body Is Saying for more on this non verbal cue).

Deceptive people use language differently to innocent people (see ten Brinke and Porter, Psychology, Crime & Law, 2015). Another interesting study on language changes in deception relates to Dutch Professor Diederik Stapel, who reported false data in 25 of his academic papers. The study compared his 25 fraudulent papers with his 25+ legitimate papers. Academic Fraud Study

The outcome: "This research supports recent findings that language cues vary systematically with deception, and that deception can be revealed in fraudulent scientific discourse."

For this post, I will only look at the stylometric aspect of the ransom note. A future post will look at the linguistics of this case.

Above: page 1 of the two and a half page ransom note.

I located 5 notes written by Patsy Ramsey, including 1995 and 1996 Xmas notes. I haven't had much luck locating anything sizable written by John Ramsey, however.

But I needed a placebo--lots of random notes and emails to test against.

Many universities are using the Enron Email Corpus from Carnegie Mellon--

The email servers were seized during the Enron fraud trials where a dozen executives went to jail. After the court case, the emails were acquired by a university and have been made available for various political/social studies. It is the largest email corpus (1.5 million emails) and shows day to day life in a large corporation.

The emails make a perfect training set and have been used as such in various studies, as well as for building models, such as one that identifies male and female writing with 80% probability.

Schein and Caver show that attribution accuracy is greatly affected by topic. I've tried to avoid this by greatly varying topics using the Enron dataset.

The reason the Enron corpus is being used by the University of British Columbia and others for language and social engineering studies is that Enron was in effect a small city -- it was a vast corporate structure that had thousands of daily emails on all subjects, from business, to small talk to flirting to deception.

I downloaded the Enron corpus and randomly selected about 60 emails and added the Ramsey letters mentioned above.

All this was put into Jstylo, the authorship attribution software.

The ransom note went in the Test side; the 80-odd emails and texts went in the Training side.

With all the emails and the ransom note loaded in, I went to the data mining section and selected the algorithm with the least error after cross-validation; it looks for similarity between the writing samples.

The next step was to run the "trained" model (trained on Enron and Patsy writing) on the test writing (ransom note) and look for the closest match. In effect I am asking it: which text does this ransom note look most like?
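The "which text does this look most like?" question can be sketched in a few lines. This is not how Jstylo or Weka do it internally; as a stand-in I use character trigram counts and cosine similarity, and the sample texts are made-up placeholders rather than case material. Note that a function like this always returns its closest match, even when nothing in the training set is a good fit.

```python
import math
from collections import Counter


def char_trigrams(text: str) -> Counter:
    """Count overlapping three-character sequences in lower-cased text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def closest_author(test_text: str, training: dict[str, str]) -> tuple[str, float]:
    """Return the training label most similar to test_text, with its score."""
    test_vec = char_trigrams(test_text)
    scores = {label: cosine(test_vec, char_trigrams(sample)) for label, sample in training.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up placeholder snippets (hypothetical), standing in for the training side.
training = {
    "author_a": "Please find attached the quarterly figures for your review.",
    "author_b": "hey are we still on for lunch tomorrow? let me know!",
}
label, score = closest_author("Attached please find the figures for review.", training)
print(label, round(score, 3))
```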

Writeprints creates 800 variables per document, creating a "sliding window" as it analyses a broad range of text characteristics.

Result--Patsy Ramsey at 75%.

I ran it again with different emails and text, and then different data mining algorithms, same result.

There is some good advice on which classifier to pick by Edwin Chan.


Patsy died of cancer in 2006, and that is probably why Police Chief Mark Beckner said they don't expect to make any arrests in the future, even though the case is still open.

Beckner did participate in an interview on Reddit, and one of the questions that always stuck out to me was this:

Q: “When Patsy wrote out the sample ransom note for handwriting comparison, it is interesting that she wrote ‘$118,000’ out fully in words (as if trying to be different from the note).
Who writes out long numbers in words? Does this seem contrived to you?”
Beckner: “The handwriting experts noted several strange observations.”

Update 1: Sept 2016

It has been pointed out to me by two people, DocG and also Eve Berger (no relation) from LinkedIn, that John Ramsey was also reported as having used the notorious "and hence" in an interview. I did find a transcript of this interview with both John and Patsy talking to student journalists, including an incredible part where Patsy says, "...Even If We Are Guilty.....".

Shades of O.J. Simpson and "If I Did It.....".
That's worth a look all on its own, which I'll do in the next day or two.

John + Patsy Transcript

But how unusual is "and hence"? Well, using the Google Ngram viewer, which searches books from 1800 to 2008, here's a graph I made:

Very uncommon, it would seem.
I will look into the linguistics using the interview material soon, referencing some of the recent automated deception detection methods.

Update 2:

I've had a few questions about Jstylo.

Let's get something out of the way. DocG asked me to use his text to test, which turned out to be speech, a NO-NO. The results didn't work because it should be speech to speech, text to text. I told him this when I found out, and said it wasn't valid. He couldn't accept it because of the Sunk Cost Fallacy. He loved the outcome because he thought he had found a weak link.

DocG said he uses "instinct", "intuition" and "social research experience".  I told him I was only interested in EMPIRICAL results against his "intuition", so we agree to disagree.

1 -- Firstly, Jstylo is a closed system.
This means that the suspect must be among the text samples you are analysing. The software will pick the closest match.

2 -- Speech with speech and text with text. People use language differently when they talk compared to how they write. Different parts of the brain are used for speech and writing. If you want to identify speech, use all speech as your input. If you want to identify written text, all your inputs should be text.

The Pennebaker text analysis software LIWC has frequency averages over many thousands of samples for blogs, speech, newspapers etc. This program shows the dramatic and consistent differences between speech and written text; see below for average frequencies.

3 -- Generally, the more text samples that you have from your target, the better. The recommended amount of text to identify a target document is 550 words, but Forsyth and Holmes show that 250 words is a minimum. For the various authors to test against, about 5000+ words each is recommended.

I have been reading a study where reviewers on YELP are linked (identified) and where the reviews average only 149 words in length.
I don't have more details on this.

4 -- There seems to be a way to create an open system with Jstylo, where if it doesn't identify an author, it won't just point to the closest match, but will come up with unknown.
I don't have more details on this.

5 -- Jstylo is not a black box; it is an automated GUI or interface combining established open source software: JGAAP, Writeprints and Weka. Writeprints uncovers writing characteristics. Input features can be added or removed, and the spreadsheet can be exported showing the most significant variables.

6 -- News, Academic papers and Security Conferences using Jstylo around the world:

7 -- All software works on this principle:
garbage in = garbage out

Ransom Note Contradictions

The writer of the ransom note probably did not commit the murder, although they were part of the cover up. The note is a contradictory and naive attempt to use psychological misdirection to point the investigators in another direction.

First it's a "faction" (a small dissenting group within a larger group??), then there's a suggestion it may be someone at John Ramsey's workplace who is aware of his exact Christmas bonus; there are numerous movie quotes in an effort to appear more criminal, and a psychological attempt to issue a secondary threat of not releasing the body for "proper" burial because the writer knew the child was already dead.

The numerous contradictions involve telling a sleeping person to be well rested, not realising a kidnapper doesn't deliver a victim, crossing out deliver then using the word pickup.

There is also the issue of a kidnapper calling between 8.00-10.00am with delivery instructions, yet banking hours start at 9.00am, and the option of withdrawing the money earlier for an earlier delivery/pickup phone call from the kidnapper!

The CBS show established the murder weapon was the flashlight. The expert forensic pathologist was able to show that a 10-year-old child could create the exact injury (same hole dimensions too) on a human skull with pigskin using the flashlight. The flashlight belonged to the Ramsey household, yet had been wiped of prints, as had the batteries. The motivation to wipe the batteries clean becomes clear if you think about guilty knowledge.

Pathologist Dr. Werner Spitz said that the child was brain dead from the blow to the skull, so the intricate garrote was theatrical misdirection to shift attention away.

The Ramseys themselves ignored nearly all the instructions on the note: they phoned the police, they invited friends over, John sent his friend to the bank, they had no concern when the telephone call deadline passed without incident, and so on.

Guilty knowledge relies not on lying but on recognition of information you shouldn't know, with resultant anomalous behaviour.

911 Call

The 911 call also stood out in using the strange phrase, "We have a kidnapping..."
Many 911 calls are used to set up an alibi.
This one is no exception, IMO.

Check out FBI research on guilty and innocent 911 calls and their checklist.

Porter and ten Brinke 2015 note that females give off more guilty verbal cues than males, and that is certainly the case here with Patsy giving more red flag cues over the course of the investigation, particularly in her video interviews and her statements. Automated software using verbal and written analysis also confirms this.

Update 2 Sept 2016:
2nd Jstylo Run

I have been studying and testing more of the Jstylo software capabilities over the last week. I've decided to run it again over different training samples instead of Enron.

Drexel University provides different problem sets, and there is one with a couple of dozen authors, each with 4 or 5 pieces of text to test against.

I used 2 of the top classifiers here, Weka's SMO and Random Forest with 300 trees, on a shortened version of Writeprints called Writeprints Limited.

I included 2 of Patsy's known texts, and John Ramsey's written speech when he was running for office in Michigan.

Using different classifiers and different training authors from my first test, I got the same results with Patsy leading the pack in both classifiers and John Ramsey barely moving the needle. I removed each of the four texts from Patsy one at a time and retested, and each text made a difference --  each written text from Patsy contributed something to the classification. These are not probabilities, but ranking results.

Patsy has linguistic fingerprints on the ransom note. Even a visual examination shows she uses exact whole sentence structures, not just the words "and hence".

The first sentence is from the ransom note, the second is from her Christmas note to friends. The word delivery was crossed out and pickup was added when the author realised that a kidnapper would not deliver the kidnap victim back, but would phone to say where the victim could be found.

The complete sentence structure is identical, on each side of "and hence". It is part of her "linguistic fingerprint", besides all the invisible characteristics that get picked up by the Writeprints software.

Different software, different analysis--
Different Ransom Notes Comparisons Using Linguistic Inquiry and Word Count software

Also known as LIWC, this software from psychologist James Pennebaker from the University of Texas has been well validated and used in many studies, over 6000 on Google Scholar, to date.

According to Tausczik and Pennebaker:
"LIWC is a transparent text analysis program that counts words in psychologically meaningful categories. Empirical results using LIWC demonstrate its ability to detect  meaning in a wide variety of experimental settings, including to show attentional focus, emotionality, social relationships, thinking styles, and individual differences."

LIWC has been used in various studies, from assessing depression to deception detection (Newman, Pennebaker).

Of interest to me is the Gender analysis, again from Tausczik and Pennebaker:

"Sex differences in language use show that women use more social words and references to others, and men use more complex language. A meta-analysis of the texts from many studies shows that the largest language differences between males and females are in the complexity of the language used and the degree of social references (Newman, Groom, Handelman, & Pennebaker, 2008). Males had higher use of large words, articles, and prepositions.
Females had higher use of social words, and pronouns, including first-person singular and third-person pronouns."

I located 2 more actual ransom notes, the longest ones I could find. These are the Barbara Mackle kidnapping and the Leopold and Loeb kidnapping. All the kidnappers were caught and convicted and were men.

LIWC was run on all the ransom notes as well as a complete average on 4 of Patsy's notes she wrote.

As per Pennebaker above, the Mackle and Leopold notes have no I pronoun and lower We, He and She pronouns. Women use fewer articles, and again the Mackle and Leopold notes have more articles. Women use more social words, and the JonBenet note has very high social language.

What is very interesting here is that anxiety of the letter writer is revealed in writing, and even though the JonBenet note was written in the house and would have taken about half an hour to write (21 minutes just to copy it, as the CBS show noted), there was NO anxiety. Yet there was anxiety in the other pre-written notes!
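The kind of count behind these comparisons can be sketched as follows. To be clear, the categories and word lists below are a tiny hypothetical stand-in that I made up for illustration; they are NOT the LIWC dictionary, which is far more extensive.

```python
import re

# Hypothetical mini-categories for illustration only; NOT the LIWC dictionary.
CATEGORIES = {
    "i": {"i", "me", "my", "mine", "myself"},
    "we": {"we", "us", "our", "ours", "ourselves"},
    "article": {"a", "an", "the"},
}


def category_rates(text: str) -> dict[str, float]:
    """Percentage of words in the text that fall into each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        name: 100.0 * sum(token in words for token in tokens) / total
        for name, words in CATEGORIES.items()
    }


rates = category_rates("We took the money and I kept my share of the cash")
for name, pct in rates.items():
    print(f"{name}: {pct:.1f}%")
```

Expressing each category as a percentage of total words is what lets a 370 word note be compared with a 972 word one.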

Also, as a measure of authenticity, the JonBenet note is very low and there were more tentative words (not shown, but also a female indicator).

3rd Supporting Software Analysis

Whissell's Dictionary of Affect is a very useful measure of pleasantness, not what the words mean but a sentiment rating of the overall pleasantness of the text.

I have found a direct correlation between pleasantness and deception, and a study at Columbia University confirms this, but increased social language increases pleasantness too:

The JonBenet note is above average in pleasantness and social language and higher than both other ransom notes, showing it more likely to be written by a female as per Pennebaker above.

As FBI profiler Roger Depue wrote in his book, the ransom note was essentially nonsensical, obviously staged, and was "feminine" with terms such as "gentlemen watching over", and telling sleeping people to "be well rested".


More on analysing the syntactic and linguistic structure of the JonBenet ransom note, taking the words away and leaving the parts of speech and syntactic tree structure of the text:

FBI interest on this website:

I have had a very amicable email exchange with Frank Marsh from the FBI who wanted to know a bit about training suggestions and material on statement analysis etc. It's a credit to the agency that they take the time to look around to see if there is any new information or techniques they need to know. I blocked out a few details above for privacy. I would love to take up Frank on his offer of a dinner "next time I am near Quantico."



In Clustering Analysis, variables that are similar to each other form a cluster or group. The software runs through the data in an unsupervised way, which means there is no target variable used, and looks for the closest and most similar clusters. It is similar to a decision tree, but works without a classification variable.

The software I am using is the free MDL Cluster software:

It is very efficient, on par with or better than k-means and EM, and works as a standalone Java executable. It can also be used for discretisation and corpus building, although I won't be using those capabilities.

I wanted to see what other ransom notes have in common. Obviously at surface level and at the most basic, they all have demands, a list of instructions, possibly a threat and so on.

The JonBenet ransom note was a very long rambling note of 370 words.
I managed to find 3 other complete ransom notes, two even longer than the JonBenet one:

1 - The Leopold and Loeb ransom note of 401 words

2 - The Barbara Mackle ransom note at a whopping 972 words

3 - The Rob Wiles ransom note at 152 words

Along with the ransom notes, I have four notes from Patsy Ramsey, titled Patsy 1 + 2 and Patsy 1995 and 1996. A 2110 word letter from John Ramsey was included in this experiment.

The idea is to see what a clustering algorithm would find by lumping Patsy and John Ramsey along with the four ransom notes. The software is blind to who the note belongs too--the classification variable which specifies the owner of the note is NOT used by the software. In other words, the clustering software is looking for similarities.

At the basic word level, all the ransom notes are similar. This is obvious and useless, a ransom note is completely different to a Christmas Card for example.

So we need to look at a deeper level. I have already looked at a syntactic level in another blog post; now I want to use LIWC, the Linguistic Inquiry and Word Count software from James Pennebaker at the Uni of Texas.

The built in dictionary has categories for things like anger, negation, function words etc. I wanted to try this on a custom built dictionary by Jeremy Frimer called the Prosocial dictionary--

This has been used in interesting hypotheses, such as a decline in prosocial (helping, caring) language tracking with dissatisfaction with politics. They have even created a model tracking approval ratings of politicians based on their prosocial language.

I downloaded the Prosocial Lexicon and ran it over the ransom notes. This added up how often certain words in certain categories appeared in each note:

There were about 127 columns, many sparse, only a few are shown here. This is now the input for the MDL Cluster software. It ignores the last classification variable of who the author of the note was, and runs through all the variables, looking for similar groups or clusters.
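The counting step can be sketched roughly as follows. This is a minimal illustration, not LIWC itself or the actual Prosocial lexicon: the three lexicon entries are hypothetical, and I am assuming that a trailing asterisk means "any word starting with this stem", as column names like help* and support* suggest.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; entries ending in * match any word with that prefix
LEXICON = {
    "help*": "ProSocial",
    "support*": "ProSocial",
    "care": "ProSocial",
}


def category_rates(text, lexicon=LEXICON):
    """Return each matched entry and category as a fraction of total words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for entry, category in lexicon.items():
            if entry.endswith("*"):
                hit = word.startswith(entry[:-1])
            else:
                hit = word == entry
            if hit:
                counts[entry] += 1
                counts[category] += 1
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {name: n / total for name, n in counts.items()}


sample = "We will help you. Helping is easy if you support us."
rates = category_rates(sample)
print(rates)
```

Running this over every note gives one row per note and one column per entry or category, which is the kind of table a clustering tool can take as input.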

The output using the best 20 variables was:
Attributes: 21
Ignored attribute: filename
Instances: 9 (non-sparse)
Attribute-values in original data: 57
Numeric attributes with missing values (replaced with mean): 0
Minimum encoding length of data: 450.94
(48.70) (9.74)
#ProSocial<=0.008772 (11.45)
  #support*<=0 (-0.66) [0,1,0,0,0,1,1,0,0] jon_ransom.txt
  #support*>0 (-2.00) [0,0,0,1,0,0,0,1,0] mackle_demand.txt
#ProSocial>0.008772 (11.57)
  #help*<=0 (-2.00) [0,0,1,0,0,0,0,0,1] leo_loeb.txt
  #help*>0 (-2.00) [1,0,0,0,1,0,0,0,0] john_letter.txt

Number of clusters (leaves): 4
Correctly classified instances: 4 (44%)
Time(ms): 41

A new spreadsheet was created by the software, showing the clusters:

Four clusters were found. Cluster 0 shows Patsy 1995 and Patsy 1996 lumped with the JonBenet ransom note!! There was similarity between the Patsy 2 note and the Mackle note in Cluster 1, as well as John's letter and Patsy 1 in Cluster 3. Rob Wiles and the Leopold and Loeb note were put together in Cluster 2.

A completely automated clustering approach with NO information about who wrote which note groups Patsy with the JonBenet ransom note, even though on the surface all the ransom notes appear similar in that there are demands and instructions and so on.

The Prosocial Dictionary tracks helping and caring language. It could probably be thought of as an empathy indicator, which has proved useful for a few studies. It seems that the Patsy notes and the JonBenet ransom note are in the same cluster because of a low level use of caring language in the ransom note which has a similar "signature" to Patsy. It has been observed by a few people that the ransom note is "feminine" in the way it is written, talking about being well rested and so on. This is confirmed with various online Gender handwriting analysis sites when the ransom note is analysed.

Another potential direction is the new field of Sentiment Analysis, used to detect the sentiment in product reviews, hotel reviews and so on--

An exciting new method is DepecheMood, which uses 37,000 terms along with emotion scores--

They have built an online website to test text--

Plugging the ransom notes into DepecheMood shows different sentiment--

The JonBenet Ransom Note above

The Mackle Note above

The Leopold Note above

The Rob Wiles note above

It's interesting to see that the two top emotions in the JonBenet ransom note (top) are Sadness and Anger, consistent with what you would expect if the CBS special scenario played out, i.e. JonBenet being accidentally killed by her brother as she snatched some pineapple from him during a late night snack.

More to follow......

Negative Elections Work Because "Fear Is An Effective Means Of Persuasion".

Studies show that human beings are more motivated by loss than by gain.

If you frame the same message in two different ways, such as..."if you insulate your windows, you will SAVE a dollar a day in heating" compared with "if you fail to insulate your windows, you will LOSE a dollar a day", most people are more motivated to act on the loss message.

(see Robert Cialdini)

The headline above comes from this persuasion study:

It concludes with "the stronger the fear appeal, the greater the chance the individual will accept the recommendation of action."

This helps explain why deceptive negative election campaigns are becoming more common. Chris Mitchell in The Australian laments that so many fellow journalists continued the message uncritically:

"All week journalists from the national broadcaster and much of the print and commercial electronic media seemed to agree with Bill Shorten that Labor’s dishonest Medicare scare had shown up the Coalition for being out of touch with voters.

The 2014 budget recommended a small Medicare co-payment of exactly the kind Labor wanted to introduce under former prime minister Bob Hawke 25 years ago. It was the only budget since 2010 that sought to deal with the issue S&P is warning about."

Don't think of a purple elephant!!
Of course, when I say that, you think of a purple elephant.

The brain does not automatically process negatives, a basic principle of neurolinguistics. Any negation such as not, don't or un is initially processed subconsciously by the brain in the positive. So if you say to a child, "Don't spill your milk", the child's brain first subconsciously processes spill your milk, and then Don't is added on to the sentence by the conscious brain.

Saying don't makes it more likely that the milk will be spilled. Just like thinking of a purple elephant.
That's why uncaring or nonviolent are weak messages, but also another reason why the negative message in an election campaign stays with us.

But the downside of going negative is that "such ads may work to both shrink and polarize the electorate," as the political scientist Shanto Iyengar of Stanford has long pointed out.

This was the case with the Australian election, with record numbers of voters leaving the major parties to vote for the independents. Labor had the second lowest number of primary votes in its history, while the Liberals lost at least 1.7 million voters to right-of-centre independents.

With changing times comes lack of accountability for lies and deceptions during an election. Football players are more likely to be punished for foul play than a politician who lies in an attempt to influence votes. Voting is an emotional process, not a logical one, and when you are trying to sell something, whether a politician or a beer, it can be more effective to sell on an emotional basis instead of relying on the facts.

During the 2010 campaign, Obama employed 29 behavioural scientists and psychologists, including best-selling authors Dan Ariely and Richard Thaler, to create proposals to reduce emotions and create reason, and then show the science behind it.

One of the things that came out of this was to never rebuff a negative or deceptive claim with a negation such as not, isn't, doesn't. The claim was made that Obama was a Muslim. The Obama team did not respond with "Obama isn't a Muslim", they responded with a positive statement saying "Obama is a Christian", and so on.

Responding with a negation such as "don't spill your milk" is more likely to anchor "spill your milk", "Obama is a Muslim" or "Malcolm Turnbull is going to privatise Medicare" in the mind.

Obama is using science to respond to negative campaigns, something Malcolm Turnbull should have done a long time ago.

Learning From Politicians: Applause Cues For Speakers

Applause generation cues are crafted within a political talk because audiences need to know not only if they are going to applaud but when they are going to applaud.

In particular, two of these rhetorical techniques (Atkinson 1983), the 3 item list (rule of three) and the contrast principle, are very useful for speakers, newspaper editors, advertising and any situation designed to persuade.

The Rule Of Three, or trios and triplets, are everywhere in western culture. We have an inherent attraction to the number three: it allows us to express a concept, to emphasise it and to make it memorable.

When a 3 item list is included in a speech, we recognise it as such and can anticipate the completion of that point, so it becomes a natural applause cue.

So for example, Tony Blair was applauded for his famous 3 item list, "Ask me my main priorities for Government, and I tell you: education, education and education."

On election night, opposition leader Bill Shorten said, "The Labor party is re-energised, it is unified and it is more determined than ever."

Obama said, "Homes have been lost; jobs shed; businesses shuttered."

The second important rhetorical technique is the Contrast Principle, which is fundamental to the way our brain makes decisions.

Advertising is in essence contrast -- you show that you are the only red apple amongst green apples.
Things such as before and after diet pictures, before and after hair loss programs and so on, have been used in advertising for decades.

Contrast highlights and exaggerates what precedes it, so for example in retail you will always be sold a suit first, then a jumper or shirt, because it appears more trivial in price. A real estate agent will show you an older run down property first, then show you something closer to your brief, and it appears even more suitable because of the contrast that preceded it.

In speech, an example is John Major saying, "We are in Europe to help shape it, not to be shaped by it." To be effective in speech, the second part of the contrast needs to be very similar to the first part.

Atkinson's research in 1984 showed that the 3 item list and the contrast techniques were used by virtually all "charismatic speakers", and that the media often selected such passages as part of their print.

Research by John Heritage and David Greatbatch backs up Atkinson's findings and shows that in a political context, nearly half of all the applause generated in speeches was from these two rhetorical techniques alone.

A good reason for all speakers to be aware of and use the Rule Of Three and the Contrast Principle.

Judging Competence + Success From A Face.

Previous studies have shown that personality traits such as competence and trustworthiness can be reliably judged from a face (Hess, Adams, & Kleck, 2005). Similar studies showed a 70% correlation to predicting election results in the U.S based on looking at photos and scoring the same traits.

However, can you tell which CEOs are most successful from their face?


Undergraduates were asked to judge CEOs' likability and competence based on a series of photos in a study by Nicholas Rule and Nalini Ambady:

It turns out that leaders that scored the highest also ran the most successful companies. Nicholas Rule says, "These findings suggest that naive judgments may provide more accurate assessments of individuals than well-informed judgments can."

First impressions are critical, because we do judge a book by its cover. Which brings me to Bill Shorten's demeanor when interviewed on TV.

I have noticed in the last week that Shorten appears to be going into "sad mode" as he begins to speak. This morning while being interviewed on Channel 9 with one day to the election, Bill Shorten's face visibly changed at the moment he began to speak. His eyebrows over the nose went together and up, his top eyelids dropped slightly, in a classic sad expression.

Whether this is to invoke underdog sympathy, or portray a large burden being placed on his shoulders, it appears to be a conscious attempt because he visibly changes as he begins the interview.

The world's pioneer and authority on facial recognition, Dr. Paul Ekman in his book Emotions Revealed says,

"The eyebrows are very important, highly reliable signs of sadness."
And when talking about actors he says, "It makes them seem more empathetic, warm and kind, but that may not be a true reflection of what they are feeling."

I wanted to test whether Bill Shorten is using this as part of his persona. I downloaded all his speeches for the month of June, as well as Malcolm Turnbull's to use as a comparison.

I ran the speeches through the psychological text analysis tool LIWC from James Pennebaker from the University of Texas to categorise dozens and dozens of words related to sadness from all the speeches.

Compared to Turnbull, Shorten uses more sad language, using more words like lose, missing, tragic, deprived.

This persona may be part of the political spin developed by Shorten and the Labor campaign to invoke empathy, but portraying trustworthiness and competence has been shown to be a better combination.

Election Word Watching - Who's Most Deceptive?

I downloaded 28 speeches from Malcolm Turnbull and Bill Shorten (14 each) for the month of June. This includes scripted and unscripted Q+A sessions. I ran all the speeches through the text analysis software, which used 4500 words in 80 categories to analyse what was said and to get deeper into the state of mind and intent of the leaders.

The most frequent words used by both leaders are shown here as a word cloud. The larger the word, the more often it was used.

Above: Bill Shorten word cloud.

Above: Malcolm Turnbull word cloud.

While both leaders mention the other side, Shorten stands out with how often he uses the word "Turnbull"; it's nearly as large as "Labor".

Next we look at Equivocation or Hedge words (I believe, think, might, should, could etc.), which reduce commitment and can "indicate a less positive experience or an unwillingness to communicate information" (Wiener and Mehrabian 1968).

I'll also look at Negations, words like no, not and all contractions of not. Equivocation and Negation were pinpointed by Susan Adams of the FBI as the strongest indicators of deception in her 2006 paper Indicators Of Veracity And Deception: An Analysis Of Written Statements Made To Police.
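As a rough illustration of how hedge and negation rates can be counted, here is a minimal sketch in Python. The word lists are short illustrative stand-ins that I made up for the example; they are not the LIWC categories, nor the coding scheme from the Adams paper.

```python
import re

# Illustrative word lists only (assumptions, not the real LIWC categories)
HEDGES = {"believe", "think", "might", "should", "could", "maybe", "perhaps"}
NEGATIONS = {"no", "not", "never", "none"}


def marker_rates(text):
    """Return hedge and negation counts per 100 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty text
    hedges = sum(w in HEDGES for w in words)
    # Contractions of "not" (don't, isn't, won't) count as negations
    negations = sum(w in NEGATIONS or w.endswith("n't") for w in words)
    return {"hedge": 100 * hedges / total, "negation": 100 * negations / total}


sample = "I think we might win, but I don't know and I will not guess."
rates = marker_rates(sample)
print(rates)
```

Expressing the counts per 100 words keeps a long speech comparable with a short one.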

The results are highly significant at below the 1% level. Bill Shorten uses far more Equivocation and Negation compared to Malcolm Turnbull. This is a red flag in deception, but it is also high in losing political parties in the 100 year election speech analysis I did. Winners of elections used less, losers used more of these words.

Tracking what drives the election leaders, we break down affiliation, achievement and power indicators in speech. People high in affiliation are concerned with relationships and close allies. Both leaders are similar in affiliation and also achievement orientation.

The power indicator measures how driven the individual is by control, status and prestige. David Winter argued in his analysis of U.S presidents that the degree of power was an indicator of political effectiveness, but I have found it to be a negative indicator in Australian elections - the public don't seem to vote for power language used in elections.

Looking at words relating to anger, Turnbull scores higher (more angry). This is a negative indicator here, as it was in the German election prediction that used Twitter to determine public sentiment, which I mentioned in a previous post.

Both leaders are very similar on indicators of Analytical thinking, both are similar in Risk and Reward indicators, and both are similar in Authenticity.

Shorten uses far more "I" pronouns which tends to be "more personal, but more insecure." (LIWC analysis by psychologist James Pennebaker).

Both leaders are similar in focus on the future and the past events, but Bill Shorten is far more likely to focus on the present. There doesn't appear to be that much of a difference on the graph, but it is highly significant on the statistical analysis.

Pennebaker says, "People oriented towards the present are thinking about current events that are psychologically close. Present-focused people tend to be more neurotic, depressed, and pessimistic than either past- or future-oriented people."

In total Malcolm Turnbull only has 3 categories of words out of 80 that are more statistically excessive than Bill Shorten, whereas Shorten has 12 categories.

Shorten has a problem with language compared to Malcolm Turnbull (this only relates when compared to each other) - he is perceived as more negative and is flagged as more deceptive.

In closing, the sincerity problem Shorten has is really shown in these pictures (gifs) below.
He is delivering a speech where he says, "I believe..." then looks down at his notes to remind himself what he believes in. This really is indicative of poor preparation or going into "auto pilot" during a speech.

But just when you think he was tired or having a bad day, he does it again during a second important speech about his asylum seeker/refugee policy!

This looks to all the world that Bill Shorten doesn't know what he believes.

Wrapping this up with the latest Betfair prices on the election, Labor is diving further, with Liberal at $1.13, almost 89% chance of winning with less than a week to go.
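For anyone wondering where the percentage comes from, a decimal price converts to an implied probability by taking its reciprocal. The sketch below is a simplification that ignores exchange commission and any bookmaker margin, and the function name is just one I picked for the example.

```python
def implied_probability(decimal_price):
    """Convert a decimal betting price to a naive implied probability."""
    return 1.0 / decimal_price


liberal = implied_probability(1.13)
print(f"{liberal:.1%}")  # prints 88.5%
```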

Power handshakes and other election non verbals.

Handshakes are often the first and sometimes only point of contact we have with another person. How we do it affects how we are perceived. The handshake has the power to leave an impression, so it is interesting watching how the election leaders approach this.

Turnbull tends to pull in Shorten when they shake hands to establish dominance. This can appear very negative if it's exaggerated as with Tony Abbott's aggressive hand jousting exhibition with Kevin Rudd.

Turnbull appears more subtle--

Although with Prince Charles, Malcolm Turnbull made no attempt at a power handshake:

Possibly the very worst handshake you could do, and one that almost always leaves a negative impression, is referred to as the "Politician's Handshake" and involves the left hand covering the handshake in a two handed gesture, in an attempt to appear more friendly.

As with Shorten's jacket-off-and-rolled-up-sleeves approach, it's manufactured to give him a friendly, one-of-the-people, "let's get to work" look.

During the election campaign, Malcolm Turnbull has come across as more confident, competent and relaxed than Bill Shorten. Whereas Turnbull stands up straight and holds his head up, Shorten tucks his chin in, a sign of discomfort.

As a small child, if you smelt something disgusting, you wrinkled your nose and pursed your lips. Thirty years later, if you read a contract or see something you don't like, you purse your lips again. The lip pursing is normally only a brief moment, but it's an extremely reliable cue for discomfort.

When there is tension or stress, we have tension in our mouth area, which results in the lips being "sucked" in, and extreme discomfort or stress can make the lips disappear completely. Look around the airport next time you fly, and watch people when a flight is cancelled. Again, this is a very reliable indicator of stress and discomfort.

Shorten doesn't appear to have been coached to control overt discomfort and stress cues; even the press have picked up on this, publishing his most extreme displays:

During interviews and debates, Malcolm Turnbull has his non verbals mostly under control, ensuring that he looks confident. Confidence is correlated to perceived competence in studies, so it is critical that a leader appear confident.

He speaks with large open gestures showing the palms of his hands, he stands straight, and he tends not to point, which is highly offensive for many people. Turnbull's large open gestures are contrasted with Shorten's more closed approach: hands closer to the body and sometimes pointing.

Millions of years ago, when we didn't like what we saw or were intimidated, we would run away. Today in a business setting this translates to us leaning away from someone who says something we don't agree with. We lean back in our chair or lean away, and in a more extreme case we turn our body away from what we don't like or agree with.

This is evident when you watch political debates, but it is another problem that Bill Shorten has. He turns away slightly or looks with a sideways glance. Covering or turning away, or "ventral denial" shows discomfort with what is being said or asked. An easy way to see this is to watch which direction the feet point. The feet point to where the body wants to go.

If you talk to a client and their foot points to the exit door, they need to go. Jury consultant Jo-Ellan Dimitrius cites a study of jury members: when jury members don't like a witness, they face the witness but their feet point to the exit door.

So watch where the body (use the belly button as a directional indicator) points, and be aware of the more subtle version of feet pointing away.

Bill Shorten is not faring well with his display of non verbals during this election campaign. I'll look at the verbals of the leaders in the next post.

© ElasticTruth
