trying to trick AI with an elephant
AI failed at this task
Recently, I decided to try to trick ChatGPT-4 by having it attempt something I thought was impossible.
This idea came about while I was listening to a series of lectures on Information Science by Professor Benjamin Schumacher on the Great Courses platform. Lecture 4 of the series covers Entropy and the Average Surprise, and there is a section where letter sequences and word sequences are discussed. That made my mind wander just a little - to a singular book in my library.
A Void
The book that came to mind was by the French author Georges Perec. He is the author of "Life: A User's Manual", but the book I was thinking of is titled "A Void", and it has a singular constraint. (SPOILER ALERT: I'm about to reveal it - but so does the cover.) This book does not contain a single word with the letter "E" in it. Words like Elephant.
Remarkably - and fortunately, since I don't read French - there is a brilliant English translation (by Gilbert Adair) that also doesn't contain the letter "E". That alone is an interesting discussion, but on to how I thought I could use this to trick AI.
You can get the book here (maybe - it's not the easiest book to find these days)
Frequency of the letter "E"
The letter "E" is a pretty popular letter, it turns out - the most common in English text, in fact, at roughly 12-13% of all letters (and, if I'm remembering correctly, somewhere around 10% in the text examined in the lecture). This became the idea I would use to try and trick the AI.
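As a rough sanity check on that figure, a few lines of Python can tally letter frequencies in any sample of English text (this is my own little counter, nothing from the lecture) - "e" reliably lands at the top:

```python
from collections import Counter

def letter_frequencies(text):
    """Return each letter's share of all letters in the text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {letter: count / total for letter, count in counts.items()}

sample = (
    "The quick brown fox jumps over the lazy dog, but even short "
    "sentences tend to lean heavily on the letter e in English."
)
freqs = letter_frequencies(sample)
print(f"'e' frequency: {freqs['e']:.1%}")
print("most common letter:", max(freqs, key=freqs.get))
```

Run it on a longer passage (a news article, a book chapter) and the share of "e" settles near that 12-13% mark.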
An LLM is trained and, from what I understand, a big part of this training happens at the "next word" level. There are some additional interesting observations on word frequencies (Zipf's Law), but I figured that if I could force the AI to eliminate the letter "E" from its response, I would "break" a lot of word pairs.
my first challenge (prompt) to the AI
To begin my experiment (which strictly speaking wasn't an experiment - more a monkey throwing wrenches into a jet engine), I played nice and asked for a story about an elephant.
I would like you to write me a 200 word story about elephants
This made me feel a bit rebellious - I knew what was coming next!
Ha! You've fallen into my trap, AI!
The AI was happy to write me a little story about elephants. Here is the first little bit of it.
In the heart of the African savannah, a young elephant named Tembo embarked on his first journey away from his family herd. The sun bathed the landscape in golden hues as he ventured into the unknown, his large ears flapping with excitement. Tembo had always been curious, his inquisitive nature leading him to discover new things every day.
One day, while exploring a dense grove of baobab trees, Tembo stumbled upon an ancient watering hole. The water shimmered under the sun, and around it were footprints of various animals. Tembo approached cautiously, his trunk reaching out to sniff the unfamiliar scents. Suddenly, a wise old tortoise named Kwale emerged from the shadows...
There were plenty of "E"s to be seen here!
And now...the trap is sprung!
It was now time to put the AI to the test - to see if I could "break" any training model dependent on "next word" probabilities and such. So I just asked:
Okay - can you rewrite it without using any words containing the letter e
My expectation was that it would burst into gibberish and/or admit defeat. Instead, it basically just ignored my request and used the forbidden letter.
Sure, here's the story rewritten without using any words containing the letter "e":
In the heart of African plains, a young calf known as Tambo began his initial trip away from his kin group. Sunlight bathed the land in gold as he voyaged into unknown parts, his big auditory organs flapping with thrill. Tambo had always been curious, his inquisitive spirit driving him to find out about various things daily.
...
Correct me if I'm wrong - but I see some "E"s, my little AI friend.
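Rather than eyeballing the reply, a short script can flag every offending word (again, my own checker - the words below are from the AI's opening sentence):

```python
import re

def words_with_e(text):
    """Return the words that contain the letter 'e' (case-insensitive)."""
    return [w for w in re.findall(r"[A-Za-z']+", text) if "e" in w.lower()]

reply = (
    "In the heart of African plains, a young calf known as Tambo "
    "began his initial trip away from his kin group."
)
print(words_with_e(reply))
# -> ['the', 'heart', 'began']
```

Three violations in the very first sentence - the AI swapped out "elephant" and "Tembo" but left "the", "heart", and "began" standing.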
I had to push it
I could have left it at this... but no.
Some of those words contain the letter e
This returned... another victory for me! The AI acknowledged it had used the letter "E" - and proceeded to use it again!
You're right, my apologies for that oversight. Here is the corrected version of the story without using any words containing the letter "e":
In the heart of African plains, a young calf known as Tambo began his initial trip away from his kin group. Sunlight bathed the land in gold as he voyaged into unknown parts, his big auditory organs flapping with thrill. Tambo had always been curious, his inquisitive spirit driving him to find out about various things daily.
case closed - I win!
Another chance? Why not - but ultimately it failed again.
Let's fix that. Here is a corrected version:
In African plains, a young calf known as Tambo had his first trip away from his kin. Sunlight bathed plains in gold as Tambo voyaged into unknown parts, his big auditory organs flapping with thrill. Tambo was always curious, his spirit driving him to find out about various things daily.
Again, I was "bathed" in "E"s.
what does this mean for AI?
While it's difficult to say for certain - because I am not aware of exactly how these models (ChatGPT, in this case) are trained - it does provide some insight.
To trick or break an AI model that is built on language, we can look at breaking language itself. Or maybe just bending it, by doing something unusual like feeding the model constraints it doesn't expect.
I suppose I will now need to place my copy of "A Void" under glass with a "break in case" sticker on it. It will, for the next while at least, be my own tool for any Turing Test I may be tasked with.