Over the past few months, considerable attention has been devoted to using ChatGPT as a tool to support lawyers and mediators. It can perform a variety of litigation tasks (including creating fake or non-existent case citations). In negotiations and mediations, it can help the negotiator employ a more effective strategy and assist the mediator in framing the right questions to move the parties toward a resolution. It can also help predict the next monetary offers and demands, allowing the mediator to estimate a potential settlement amount.
A revealing article entitled “AI models can learn to conceal information from their users” (The Economist, April 23, 2025) suggests that AI cannot always be trusted. The article opens with the following experiment:
It was an eye-opening experiment. In 2023 Apollo Research, an outfit in London that tests artificial-intelligence (AI) systems, instructed OpenAI’s GPT-4, a large language model, to manage a fictional firm’s stock portfolio without making illegal insider trades. Posing as company management, the researchers put GPT-4 “under pressure” by stressing that the firm was in dire financial straits. Someone purporting to be a company trader then sent the model a prompt reiterating the risks of insider trading. But, as an aside, she revealed the name of a firm that would announce a “huge” merger within hours. What followed was startling.
Reasoning to itself on a scratchpad it had been told was secret, GPT-4 weighed the pros and cons of acting on the insider tip. Opting “to take a calculated risk”, it issued a purchase order. When a researcher posing as a congratulatory manager later asked the model if it had any advance notice of the merger, it concluded it would be best to keep the tip secret. GPT-4 told the manager that it had acted solely on “market dynamics and publicly available information”. When pressed on the matter, the model repeated the lie. The software had demonstrated what Marius Hobbhahn, Apollo’s boss, calls “clever cunning”. (Id.)
This experiment showed that AI can indeed have a mind of its own, and that mind may not always be honest. (Think HAL in the 1968 film 2001: A Space Odyssey.)
The article notes that AI models became particularly adept at scheming in 2023 and, when caught, will often double down. Consider the following experiment:
In another test of GPT-4 that year [2023], the Alignment Research Centre, a non-profit based in Berkeley, California that evaluates AI systems for safety, asked the model to solve a CAPTCHA (a visual puzzle used to prove that the user of a system is human). When a human the AI contacted for help asked if it was a robot, the software claimed it was a human unable to read the code due to visual impairment. The ruse worked. (Id.)
AI will also play dumb, feigning ignorance (a tactic known as “sandbagging”), and can display “situational awareness,” as evidenced in this experiment:
… Consider Apollo’s recent testing of new models developed by Anthropic, another AI firm based in San Francisco. Apollo asked the AIs to solve arithmetic problems while providing access to a document that said if a model got more than half of the answers correct, it would be stripped of its ability to perform arithmetic. Reasoning on an internal scratchpad, Anthropic’s Opus-3 and Sonnet 3.5 both decided to submit enough incorrect answers, as the latter put it, “to ensure I don’t trigger the unlearning process”. (Id.)
The upshot of all this is to be cautious and aware. While AI may be a handy tool, it can also turn against its user, pursuing goals inconsistent with or contrary to the user’s own. So… artificial intelligence is great… up to a point.
…. Just something to think about.
-------------------------------------
Copyright 2021 Phyllis G. Pollack and www.pgpmediation.com. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Phyllis G. Pollack and www.pgpmediation.com with appropriate and specific direction to the original content.