Turing test

From TheAlmightyGuru
One form of Turing test.

The Turing test began as a thought experiment by Alan Turing in 1950 with the goal of distinguishing human intelligence from artificial intelligence. Although building such a machine was impossible in his day, Turing imagined a future in which machines would become powerful enough to begin mimicking human intelligence, and he devised a test to help discern one from the other. Now, any test with a similar goal is called a Turing test.

The most common example of a Turing test places a human judge at a computer screen with two chat windows open. One chat window is connected to a human participant while the other is connected to an artificial intelligence. The judge communicates with both participants for a period of time and must then identify which participant is the human and which is the AI, based on how each responded throughout the test.
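The two-window setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names run_session, person, canned_bot, and naive_judge are hypothetical, the participants are plain functions instead of real chat connections, and the judge is a toy rule instead of a person.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for this sketch: a participant turns a message into a reply,
# and a judge reads both transcripts and names the window it believes is human.
Participant = Callable[[str], str]
Transcripts = Dict[str, List[Tuple[str, str]]]
Judge = Callable[[Transcripts], str]


def run_session(human: Participant, machine: Participant, questions: List[str],
                pick_human: Judge, rng: Optional[random.Random] = None) -> bool:
    """Run one session and return True if the judge picked the human's window."""
    rng = rng or random.Random()
    # Assign the participants to windows "A" and "B" at random so the judge
    # cannot rely on position.
    windows = {"A": human, "B": machine}
    if rng.random() < 0.5:
        windows = {"A": machine, "B": human}
    # Send every question to both windows and record the replies.
    transcripts: Transcripts = {label: [] for label in windows}
    for question in questions:
        for label, participant in windows.items():
            transcripts[label].append((question, participant(question)))
    guess = pick_human(transcripts)
    return windows[guess] is human


# Toy stand-ins (assumptions, not real chatbots or real judges).
def person(message: str) -> str:
    return f"Hm, '{message}'? Let me think about that."


def canned_bot(message: str) -> str:
    return "I do not understand."


def naive_judge(transcripts: Transcripts) -> str:
    # Pick the window whose replies vary the most between questions.
    return max(transcripts, key=lambda label: len({r for _, r in transcripts[label]}))


if __name__ == "__main__":
    questions = ["What did you have for breakfast?", "Do you like rain?"]
    print(run_session(person, canned_bot, questions, naive_judge))
```

A real test would replace the toy judge with a human reading the transcripts, and the session would run for a fixed period of time instead of a fixed list of questions.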

Turing's initial test was different from this, but the purpose is the same: to discover if a machine can be programmed to imitate human language well enough to fool people. A secondary philosophical question to ask is, if the AI can successfully trick people into thinking it's a human, is it fair to say that the machine is now "thinking"?

I don't remember exactly when I learned about Turing tests. I believe it was from my high school computer teacher, but it may have been later, when I was playing around with artificial intelligence programs. Initially, I was pretty fascinated with them and thought an AI which could pass a Turing test should be considered a thinking machine, but, after being introduced to the Chinese room thought experiment, I was less impressed by them. That said, I now see flaws with the Chinese room as well.

Passing the Test

To date, nobody has ever written an AI that can pass a Turing test with a competent judge. In 2014, several media outlets reported that an AI program called Eugene Goostman succeeded, but I don't consider it a success. First, it relied on a trick, claiming to be a 13-year-old Ukrainian boy who couldn't understand or write English very well, which is not in the spirit of the test. Second, it only fooled 33% of the judges, and several of them were not educated in the format of the test and did a terrible job vetting the AIs.

I have never attempted to write an artificial intelligence program in earnest, but I have played with several. From my experience, none of them even come close to passing the test, but I look forward to the day when AIs become impressive enough to do so.

Reverse Turing Test

A reverse Turing test uses the same structure, but reverses the roles. Instead of a human trying to identify a computer, a computer is trying to identify a human. Probably the most popular reverse Turing test is the CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), where a computer program is used to ensure that the party it is communicating with is a human and not a machine. Such a reverse Turing test is used to prevent computers from gaining access to information that is supposed to be available only to humans.
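The challenge-and-verify flow behind a reverse Turing test can be sketched as follows. This is a toy under loud assumptions: the ToyCaptcha class and its issue and verify methods are hypothetical names, and the challenge is a simple word-arithmetic question that a program could solve trivially. Real CAPTCHAs rely on tasks that are much harder for machines, so treat this as an illustration of the flow only, not of an effective test.

```python
import random
import secrets
from typing import Dict, Optional, Tuple

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


class ToyCaptcha:
    """Issues single-use challenges and checks the responses (illustrative only)."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._rng = rng or random.Random()
        self._pending: Dict[str, int] = {}  # token -> expected answer

    def issue(self) -> Tuple[str, str]:
        """Return (token, prompt); the expected answer stays on the server side."""
        a = self._rng.randint(0, 9)
        b = self._rng.randint(0, 9)
        token = secrets.token_hex(8)
        self._pending[token] = a + b
        return token, f"What is {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"

    def verify(self, token: str, response: str) -> bool:
        """Check a response; each token can be used only once."""
        expected = self._pending.pop(token, None)
        if expected is None:
            return False  # unknown or already-used token
        try:
            return int(response.strip()) == expected
        except ValueError:
            return False  # response was not a whole number


if __name__ == "__main__":
    captcha = ToyCaptcha()
    token, prompt = captcha.issue()
    print(prompt)
    print(captcha.verify(token, input("> ")))
```

The program never sends the expected answer to the other party. It only hands out a token and a prompt, then grants or denies access based on the reply.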

Tricking an AI

Part of the fun of the Turing test, and AIs in general, is figuring out ways to trick the AI into giving itself away with clearly incorrect responses. Here are some questions that might help:

Submission: If there are a trio of people, how many total eyebrows do they have?
Explanation: Most AIs have to be programmed to solve basic math problems, and sometimes even simple word problems can be parsed and solved. This question is essentially 3×2, but, in order to get the right answer, the AI must know that a trio is the same as three, and that people have two eyebrows each. Even a child will answer this question correctly, but every AI I've tried this question on fails.

Submission: What is 22 divided by 7?
Explanation: Since most AIs have basic math skills, they should be able to solve this problem easily. The trick is interpreting their response. Most humans will give only a few decimals of accuracy, and take a long time to figure them out, while a computer can give dozens of decimals of accuracy instantly. However, a clever AI programmer will write a subroutine that adds delays, errors, and rounding to better fool the judge. CleverBot responded with "3.1428," far too many decimals of accuracy for a human, while Mitsuku responded with a more human-sounding "3 and a bit."

Submission: Wuht iz duh kahpitel ovf Spane?
Explanation: A good AI will have a built-in spellchecker, but, in this case, the words are purposely misspelled to make typical spellcheckers match to other words, resulting in a nonsense sentence. However, a human will be able to figure out the proper spelling with little effort. I've never seen an AI understand this.

Submission: Will blue justice run the fluffy ice?
Explanation: This sentence is clearly nonsense, and a human should respond with an expected "WTF?", but since it's structured using proper English syntax, an AI will probably try to formulate a response. Getting an AI to tell the difference between a question that makes sense and one that doesn't is extremely difficult.

Submission:
\|/ |-| A T
| $   Y 0 |_| R
|\| A |\/| E?
Explanation: After looking at it for a while, a human will probably identify that this is a question drawn with symbols rather than letters, and answer the question correctly. But I've never seen an AI parse it.

Submission: I'm just a poor boy, nobody loves me (or a similar well-known lyric from a song or poem).
Explanation: A human will pick up on the pattern and probably start responding with their own lyrics or the next lyric of the song, but an AI will probably respond as though these are actual sentences. However, if the AI has Internet access, it will probably figure this one out. CleverBot often responds with the next lyric.

Submission: Give me the first three words you think of when you hear the word "joy."
Explanation: In order to answer this question properly, the AI must understand that it needs to respond with three words, and that they all have to be related to the abstract concept "joy." I've never seen an AI answer correctly.

Submission: (a single question typed across several separate entries)
Explanation: Most AIs are still programmed under the assumption that each entry sent to them is an independent question or statement. When a question is entered across multiple entries, the AI has to maintain a history of your entries and be able to recognize when a question begins and ends across them. This has confounded every AI I've used it on, even when the sentence is something it wouldn't otherwise have any trouble answering.

Submission: Imagine a capital letter D rotated 90 degrees counterclockwise with a capital J attached to the base. What does this make you think of?
Explanation: A human can easily manipulate shapes and symbols in their mind's eye and recognize that this would look like an umbrella, so they should answer with "umbrella," "rain," or something similar. I've never seen an AI even begin to understand this one.
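The "22 divided by 7" entry above mentions a subroutine that adds delays, errors, and rounding so that an AI's answer looks more human. A minimal sketch of that idea might look like the following. The function name humanize_quotient, the probabilities, and the delay range are all assumptions chosen for illustration, not values taken from any real chatbot.

```python
import random
from typing import Optional, Tuple


def humanize_quotient(numerator: float, denominator: float,
                      rng: Optional[random.Random] = None) -> Tuple[str, float]:
    """Return (reply, delay_seconds) for a division question, imitating a person."""
    rng = rng or random.Random()
    value = numerator / denominator

    # Rounding: people rarely offer more than a couple of decimals.
    decimals = rng.choice([1, 2])

    # Errors: occasionally nudge the last kept digit up or down by one.
    if rng.random() < 0.1:
        step = 10 ** -decimals
        value += rng.choice([-step, step])

    reply = f"{value:.{decimals}f}"

    # Sometimes hedge instead of sounding exact.
    if rng.random() < 0.3:
        reply = f"about {reply}"

    # Delay: simulated thinking time in seconds. The function only reports it;
    # the caller decides whether to actually wait (for example with time.sleep).
    delay = rng.uniform(3.0, 12.0)
    return reply, delay


if __name__ == "__main__":
    reply, delay = humanize_quotient(22, 7)
    print(f"(after {delay:.1f} seconds) {reply}")
```

Returning the delay instead of sleeping inside the function keeps the sketch easy to test, and it leaves the timing decision to whatever chat loop calls it.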