From dating profiles to Brexit – how to spot an online lie



There are three things you can be sure of in life: death, taxes – and lying. The last certainly appears to have been borne out by the UK’s recent Brexit referendum, with a number of the Leave campaign’s pledges looking more like porkie pies than solid truths.

But from internet advertising, visa applications and academic articles to political blogs, insurance claims and dating profiles, there are countless places we can tell digital lies. So how can one go about spotting these online fibs? Well, we, along with Ko de Ruyter from City University London’s Cass Business School and Mike Friedman of the Catholic University of Louvain, have developed a digital lie detector – and it can uncover a whole host of internet untruths.

In our new research, we used linguistic cues to compare tens of thousands of emails pre-identified as lies with those known to be truthful. And from this comparison, we developed a text analytic algorithm that can detect deception. It works on three levels.

1. Word use

Keyword searches can be a reasonable approach when dealing with large amounts of digital data. So, we first uncovered differences in word usage between the two document sets. These differences identify text that is likely to contain a lie. We found that individuals who lie generally use fewer personal pronouns, such as I, you, and he/she, and more adjectives, such as brilliant, fearless, and sublime. They also use fewer first-person singular pronouns, such as I, me, mine, with discrepancy words, such as could, should, would, as well as more second-person pronouns (you, your) with achievement words (earn, hero, win).

Fewer personal pronouns indicate an author’s attempt to dissociate themselves from their words, while using more adjectives is an attempt to distract from the lie through a flurry of superfluous descriptions. Fewer first-person singular pronouns combined with discrepancy words indicate a lack of subtlety and a positive self-image, while more second-person pronouns combined with achievement words indicate an attempt to flatter recipients. We therefore included these combinations of search terms in our algorithm.
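To make the idea of word-use cues concrete, here is a minimal Python sketch. It is not the algorithm from the study: the tiny word lists contain only the examples mentioned above, and the names `CATEGORIES`, `tokenize` and `category_rates` are hypothetical. It simply measures what share of a message's words falls into each category.

```python
import re

# Illustrative lexicons built only from the example words in the text.
# A real system would need much larger, validated word lists (assumption).
CATEGORIES = {
    "first_person_singular": {"i", "me", "mine"},
    "second_person": {"you", "your"},
    "adjectives": {"brilliant", "fearless", "sublime"},
    "discrepancy": {"could", "should", "would"},
    "achievement": {"earn", "hero", "win"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text):
    """Return, for each category, the share of tokens that belong to it."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {
        name: sum(token in words for token in tokens) / total
        for name, words in CATEGORIES.items()
    }


if __name__ == "__main__":
    message = "You could win big with your brilliant, fearless plan."
    for name, rate in category_rates(message).items():
        print(f"{name}: {rate:.2f}")
```

Rates like these could then be compared between a set of known lies and a set of known truths to see which categories differ. How the cues are weighted and combined is left open here.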

2. Structure scrutiny

Another part of the solution lay in analysing the variance of cognitive process words, such as cause, because, know and ought – and we identified a relationship between the way these structure words are used and lies.

Liars cannot generate deceptive emails from actual memory so they avoid spontaneity to evade detection. That does not mean that liars use more cognitive process words overall than people who are telling the truth, but they do include these words more consistently. For example, they tend to connect every sentence to the next – “we know this happened because of this, because this ought to be the case”. Our algorithm detects such usage of process words in communications.
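The notion of "consistent" use can be illustrated with a short Python sketch. It assumes, purely for illustration, that consistency is measured as the variance of the per-sentence rate of cognitive process words. The word list holds only the four examples given above, and `sentence_rates` and `cognitive_variance` are hypothetical names, not part of the study.

```python
import re
from statistics import pvariance

# Example cognitive process words taken from the text (not a full lexicon).
COGNITIVE_WORDS = {"cause", "because", "know", "ought"}


def sentence_rates(text):
    """Share of cognitive process words in each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    rates = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        if tokens:
            hits = sum(token in COGNITIVE_WORDS for token in tokens)
            rates.append(hits / len(tokens))
    return rates


def cognitive_variance(text):
    """Variance of the per-sentence rates; lower means more even usage."""
    rates = sentence_rates(text)
    return pvariance(rates) if rates else 0.0


if __name__ == "__main__":
    even = "We know this happened. We know that followed."
    uneven = "We know this because we ought to. It rained all day."
    print(cognitive_variance(even), cognitive_variance(uneven))
```

Under this assumption, a message that spreads such words evenly across its sentences gets a low variance, while one that uses them only here and there gets a higher one, regardless of the overall count.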

3. Cross-email approach

We also studied the ways in which a sender of an email alters their linguistic style while exchanging a number of emails with someone else. This part of the study revealed that the longer the exchange went on, the more the sender tended to use the function words that the receiver was using.




Function words are words that contribute to the syntax, or structure, rather than the meaning of a sentence – for example an, am, to. Over the course of an exchange, senders revised the linguistic style of their messages to match that of the receiver. Our algorithm therefore identifies and records such matching.
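As a rough illustration of style matching, the Python sketch below scores how closely two messages agree in their use of function words. The scoring formula is one simple choice made here for illustration and is not necessarily the one used in the study. The function-word list holds only the three examples above, and `function_word_rate`, `style_match` and `matching_trend` are hypothetical names.

```python
import re

# Example function words from the text; a real list would be far longer.
FUNCTION_WORDS = {"an", "am", "to"}


def function_word_rate(text):
    """Share of tokens in the text that are function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in FUNCTION_WORDS for token in tokens) / len(tokens)


def style_match(sender_text, receiver_text):
    """Score near 1 when both rates are similar, near 0 when they differ."""
    a = function_word_rate(sender_text)
    b = function_word_rate(receiver_text)
    # The small constant keeps the division defined when both rates are zero.
    return 1 - abs(a - b) / (a + b + 1e-4)


def matching_trend(exchange):
    """Score each (sender message, receiver message) pair in order."""
    return [style_match(sent, received) for sent, received in exchange]


if __name__ == "__main__":
    exchange = [
        ("Hello there, nice profile", "I am going to an event soon"),
        ("I am keen to join", "I am happy to meet"),
    ]
    print(matching_trend(exchange))
```

Scores that rise over successive pairs would be consistent with the sender adapting their style to the receiver as the exchange goes on.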

Exciting applications

Consumer watchdogs can use this technology to assign a “possibly lying” score to advertisements of a dubious nature. Security companies and national border forces can use the algorithm to assess documents, such as visa applications and landing cards, to better monitor compliance with access and entry rules and regulations. Secretaries of higher education exam committees and editors of academic journals can improve their proofing tools for automatically checking student theses and academic articles for plagiarism.

In fact, the potential applications go on and on. Political blogs can successfully monitor their social media interactions for textual anomalies, while dating and review sites can classify messages submitted by users on the basis of their “possibly lying” score. Insurance companies can make better use of their time and resources available for claim auditing. Accountants, tax advisers, and forensic specialists can investigate financial statements and tax claims and find deceptive smoking guns through our algorithm.

Humans are startlingly bad at consciously detecting deception. Indeed, human accuracy when it comes to spotting a lie is just 54%, hardly better than chance. Our digital lie detector, meanwhile, is 70% accurate. It can be put to work to fight fraud wherever it occurs in computerised content, and as the technology evolves, its Pinocchio warnings can be wholly automated and its accuracy will increase even further. Just as Pinocchio’s nose reflexively signalled falsehood, so does our digital lie detector. Fibbers beware.

[The Conversation]

July 8, 2016
