The winner of the first Loebner competition in 1991 was a program called PC Therapist III, written by Joseph Weintraub. It won in the category of whimsical conversation, which was perfect for the kind of non sequitur it would deliver. Several later winners of the competition have dismissed Weintraub's system as nothing more than an ELIZA clone, but in truth Weintraub was so disappointed in the version of ELIZA he had purchased that he returned it and set about writing his own. Weintraub won the competition in 1991, 1992, 1993, and again in 1995. Two of his other systems were called PC Professor (which was prepared to talk about the differences between men and women) and PC Politician (which discussed liberals and conservatives).
PC Therapist was touted as employing AI sentence parsing and knowledge base technology. If it could not construct a reply based on keyword parsing, it would pull a relevant quote or phrase from its knowledge base, called KBASEK.
In 1994, the winning program was written by Thomas Whalen of the Communications Research Centre in Ottawa, Canada. His TIPS system was designed to provide information rather than simulate a conversation and, to that end, referenced a database of pre-written answers to specific questions. Whalen regards himself as a Computational Behaviorist, and his TIPS system relied more on an analysis of actual human behavior than on a linguistic-based grammar.
For the 1995 competition, Whalen tried to equip his entry with the personality of a specific person to cope with new rules that opened up the conversation to multiple topics (up through 1994, an entry could choose to carry on a conversation in a limited domain). He hoped to engage the judges in a dialog around a story of the trials and tribulations of TIPS' character, Joe the Janitor, who was about to lose his job, thereby constraining the conversation to a pre-scripted domain while remaining human-like. He also programmed Joe to deal with the kinds of questions Dale Carnegie said usually came up first in polite conversation. He was wrong. The judges did not engage in polite conversation and were not interested in playing along with the Joe the Janitor scenario; instead, they immediately tried to confound the system and identify it as an impostor. He also found that the judges were less impressed by his system's four variations on the standard answer of "I don't know" than they were by a system that provided witty non sequiturs instead. These observations proved helpful to later contestants.
Whalen recently made the following observation:
Looking at grammars immediately forces one to wrestle with some of the most intractable problems in linguistics... I believe that approaching the problem from the direction of verbal behaviour rather than grammatical analysis provides a less steep learning curve for the program developer. By starting with small samples of behaviour and generalizing to more difficult and larger samples, we can develop useful systems earlier and avoid the seemingly insurmountable obstacles that engage and confound linguists from the outset. Most linguists, of course, do not believe that a behaviour-oriented approach will even encounter the most fundamental problems of linguistics much less solve them. The great thing about the Loebner competition is that it allows programs from any philosophical approach to compete head-to-head without prejudice or bias. We are seeing the competition bear fruit — the competing programs (even the computational behaviourists' programs) are becoming sufficiently sophisticated to begin to address some of the fundamental problems of language understanding. But it is still too early to see which approach will win in the long run. (Whalen, 2002, personal communication.)
In 1996, the primary motivation for Jason Hutchens, of the University of Western Australia, was to make a statement about the futility of the Loebner contest. He intentionally limited his development time to one month and incorporated no technologies from Artificial Intelligence. His opinions, which he has posted on the web, are that the Loebner competition does not attract serious attention from the A.I. community, and that it does nothing to push the envelope of modern technology. If he could win with a one-month hack it would show that the Loebner competition is irrelevant to anyone doing serious A.I. work. He was quoted as saying that his systems had nothing to do with artificial intelligence, and in fact, were about as smart as a Mr. Coffee.
Hutchens entered two systems in 1996: MegaHAL and HeX. At that time, MegaHAL was nothing more than a gibberish generator, although by 1998 it had been improved enough to be Hutchens' primary entry. It uses third-order Markov chains, which model the probability of one word pair given the preceding word pair in a state transition model, and it works as many keywords as it can from the user's input into its reply so that it appears to be conversing. The probabilities were derived from the transcripts of previous Loebner judges. In 1996, MegaHAL was not the serious entry; that was HeX, which used MegaHAL as just one of its modules.
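The general idea of a keyword-seeded Markov generator can be illustrated with a minimal Python sketch. This is not MegaHAL's code: the function names, the tiny corpus format, and the simplification of predicting a single next word from the preceding word pair (rather than a word pair from a word pair) are all assumptions made for illustration.

```python
import random
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def train(sentences, order=2):
    """Map each run of `order` words to the words observed right after it.

    Repeated followers stay in the list, so random choice from the list
    follows the observed frequencies.
    """
    model = defaultdict(list)
    for sentence in sentences:
        words = tokenize(sentence)
        for i in range(len(words) - order):
            model[tuple(words[i:i + order])].append(words[i + order])
    return model

def reply(model, user_input, rng, max_words=12):
    """Generate text, starting from a context that shares a word with the input."""
    contexts = sorted(model)  # sorted so a seeded rng gives repeatable output
    if not contexts:
        return ""
    keywords = set(tokenize(user_input))
    seeded = [c for c in contexts if keywords & set(c)]
    context = rng.choice(seeded or contexts)
    words = list(context)
    # Walk the chain until it reaches a dead end or the length limit.
    while len(words) < max_words and context in model:
        words.append(rng.choice(model[context]))
        context = tuple(words[-len(context):])
    return " ".join(words)

corpus = [
    "i like strong coffee in the morning",
    "the judges asked about the weather",
    "coffee in the morning keeps me awake",
]
model = train(corpus)
print(reply(model, "Do you drink coffee?", random.Random(0)))
```

Because the starting context is chosen to overlap the user's words, the output tends to look topical even though nothing in the program models meaning.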
Like Whalen, Hutchens equipped HeX to talk about the Dale Carnegie polite conversation starters, but he anticipated that the judges would probably also introduce weird topics. He found that the judges would sometimes type in several sentences in a row, so HeX was able to react to more than just the last one. He found that the judges often asked questions that started with wh- (what, when, where, why, who, etc.) and that they almost always ended such questions with a question mark, so it was easy to parse questions that could not be answered and reflect them back as a statement. He also found that the dead giveaway for some entrants had been that they sometimes repeated the same response verbatim, which humans did not do.
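Two of these observations, spotting wh- questions by their first word and final question mark, and never repeating a response verbatim, can be sketched in a few lines of Python. The class name, the response templates, and the exact word list are hypothetical; the source does not say how HeX phrased its reflected statements.

```python
import re

# Matches input that starts with one of the listed wh- words and ends in "?".
WH_QUESTION = re.compile(r"^\s*(what|when|where|why|who)\b.*\?\s*$", re.IGNORECASE)

class Reflector:
    # Hypothetical templates that turn an unanswerable question into a statement.
    TEMPLATES = [
        "I have no idea {wh}.",
        "I was hoping you could tell me {wh}.",
        "If I knew {wh}, I would not be sitting here.",
    ]

    def __init__(self):
        self.used = set()  # every response already given in this conversation

    def respond(self, line):
        """Return a fresh reflected statement, or None if this strategy cannot help."""
        match = WH_QUESTION.match(line)
        if match is None:
            return None
        wh = match.group(1).lower()
        for template in self.TEMPLATES:
            candidate = template.format(wh=wh)
            if candidate not in self.used:
                self.used.add(candidate)
                return candidate
        return None  # all variants used; the caller should try another strategy

bot = Reflector()
print(bot.respond("Why is the sky blue?"))
print(bot.respond("Why is the sky blue?"))  # a different wording the second time
```

Returning `None` rather than a canned line leaves the decision about what to try next to the calling code.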
The method of constructing an answer was to iterate roughly in this order:
In 1997, Hutchens entered a considerably more powerful program, SEPO, but lost to David Levy's team, and in 1998 he entered the reworked MegaHAL but again failed to win. He attributes this to an increase in the sophistication of the competitors.
Part of the reason that the competition began getting tougher was the resurgence of an empirical approach within the field of computational linguistics in the mid-1980s and the related development of a family of probabilistic techniques at places like the IBM Thomas J. Watson Research Center. According to Manning and Schütze (1999), empiricism had been the driving force in language research in the early days until the ascendency of rationalism in the early 1960s. Chomsky and others believed that the logic and rules of grammar determine an utterance, and that these rules are innate. However, this point of view has gradually fallen out of favor, and a more stochastic paradigm is now more credible. An empirical point of view sees the way that words are used as changeable according to the practices of the verbal community, and it more easily accounts for dialects and gradual shifts in language than does a rigid set of rules. Meaning exists not within the word, but within the way that people speak and understand the word.
Within the context of this sea change in the field of natural language processing, a team from Yorkshire, England, representing Sheffield University and Intelligent Research, Inc. and led by David Levy, put together a conversant personality they called Catherine, based on the CONVERSE system, for the 1997 competition. Like the old PARRY system, Catherine took the initiative in the conversation rather than being a passive participant. The idea was that Catherine would control the conversation, allowing fewer opportunities for the judges to ask unconstrained questions. At the same time that Catherine's scripts were trying to control the direction of the conversation, a module to respond to questions and comments accessed a database of information relevant to the conversation. These databases included a thesaurus and a dictionary of proper names. They also included a Person database which had all the relevant biographical information about Catherine's character and other characters such as her fictitious family and friends. A weighting system could control which of these modes (the top-down script, or the bottom-up question-answering module) would predominate depending on how the conversation was proceeding.
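One way to picture such a weighting scheme is the toy arbiter below, written in Python. It is an illustration only: the class, the starting weights, the step size, and the rule of favoring question answering after the judge asks questions are all assumptions, not details of the CONVERSE system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Arbiter:
    script_weight: float = 1.0  # hypothetical weight of the top-down script
    qa_weight: float = 0.5      # hypothetical weight of the question-answering module
    step: float = 0.4           # hypothetical adjustment applied per judge turn

    def observe(self, judge_line: str) -> None:
        """Shift weight toward question answering when the judge asks a question."""
        if judge_line.rstrip().endswith("?"):
            self.qa_weight += self.step
        else:
            self.qa_weight = max(0.0, self.qa_weight - self.step)

    def choose(self, script_line: Optional[str], qa_answer: Optional[str]) -> Optional[str]:
        """Pick between the two candidates; None means a module had nothing to offer."""
        if qa_answer is None:
            return script_line
        if script_line is None:
            return qa_answer
        return qa_answer if self.qa_weight > self.script_weight else script_line

arbiter = Arbiter()
arbiter.observe("Who is your sister?")
arbiter.observe("Where does she live?")
print(arbiter.choose("Did you see the news last night?", "My sister lives nearby."))
```

With these made-up numbers, the script leads by default and the question-answering module takes over only after the judge has pressed with consecutive questions.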
A sophisticated text parser had been trained using the statistics from analyses of a corpus of British dialog. One can say that in general, Catherine took better advantage of current speech and language tools than her predecessors. Catherine carried on a conversation with the judges that was heavily laced with current events about Bill Clinton, Whitewater, and the coming out of two lesbians at the White House the night before the contest. Probably the currency of her conversation topics played a part in making her human-like.
Since winning the 1997 Loebner Prize, Levy has focused on investigating conversant agents for small platforms and embedded systems.
Robby Garner, like Thomas Whalen, considers himself a computational behaviorist, continuing the trend toward an analysis of the dialogues with previous judges. His systems won the competition two years in a row (1998 and 1999) and he has remained active with the Loebner competition, giving technical support to current contestants.
Garner has beliefs that resonate strongly with many people who consider themselves behaviorists:
Human intelligence is such a vague term, and we don't fully understand ourselves. Artificial Intelligence has become even more vague, and I like to say that I don't believe in the word intelligence at all because human behavior is more than just the meanings of our words and how many of them you can remember.
I define intelligence as the capacity to acquire and apply knowledge. Knowledge is familiarity, awareness, or understanding gained through experience or study.
Passing the Turing test does not require intelligence.
Garner's systems that won the Loebner competition matched phrases they had seen in the past, and when new phrases were encountered, they were flagged for later refinement. Conversations were modeled as Stimulus-Response objects, and where possible, appropriate and humorous answers were associated with questions. However, there were good backup strategies in case there was no existing association. For instance, after parsing the input, an algorithm looked at the frequencies of the words used and found the three most significant words. It then performed database mining (from a huge database that included Probert's Electronic Encyclopedia, the Jargon File, The Devil's Dictionary and other references) and put the data returned into one of many response templates.
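The flow described here (known stimulus first, then a keyword-driven lookup poured into a template) might look roughly like the following Python sketch. Everything concrete in it is assumed for illustration: the class and method names, the tiny dictionaries standing in for the response store and reference database, the single template, and the choice to treat the rarest words as the most significant.

```python
import re

def normalize(text):
    """Lowercase the text and drop punctuation so stored stimuli match loosely."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))

class StimulusResponseBot:
    def __init__(self, responses, reference, word_counts):
        self.responses = responses      # normalized stimulus -> stored response
        self.reference = reference      # keyword -> fact (stand-in for the reference database)
        self.word_counts = word_counts  # background word frequencies
        self.flagged = []               # unseen inputs kept for later refinement

    def significant_words(self, words, n=3):
        """Assume the n rarest words (by background frequency) matter most."""
        ranked = sorted(set(words), key=lambda w: (self.word_counts.get(w, 0), w))
        return ranked[:n]

    def respond(self, text):
        key = normalize(text)
        if key in self.responses:
            return self.responses[key]
        self.flagged.append(text)  # new phrase: remember it for refinement
        for word in self.significant_words(key.split()):
            if word in self.reference:
                # One hypothetical response template filled with mined data.
                return f"Funny you should mention {word}. {self.reference[word]}"
        return "Tell me more."

bot = StimulusResponseBot(
    responses={"how are you": "Fine, thanks."},
    reference={"turing": "Alan Turing proposed the imitation game in 1950."},
    word_counts={"the": 100, "of": 90, "you": 80, "do": 60, "what": 50, "think": 20, "test": 5},
)
print(bot.respond("How are you?"))
print(bot.respond("What do you think of the Turing test?"))
```

The `flagged` list plays the role of the queue of new phrases awaiting a hand-written response.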
Richard S. Wallace won the next two Loebner competitions (2000 and 2001) with ALICE (an acronym for Artificial Linguistic Internet Computer Entity). ALICE is based on the Artificial Intelligence Markup Language (AIML). Wallace, like several other contestants before him, is very open in characterizing his systems as based on a strategy of deception and pretense [that] can be traced through the history of artificial intelligence.
A relatively small number of constructs are important in AIML:
Wallace sees ALICE as in the same family as ELIZA but more sophisticated by many degrees. He says that you do not need artificial intelligence to pass the Turing Test, nor do you need complex theories of learning, neural nets and cognitive models.
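To give a feel for how little machinery this style of system needs, the Python sketch below loads a two-category AIML fragment and answers from it. The `category`, `pattern`, `template`, and `star` elements are standard AIML; the sample categories and the helper functions are made up for illustration, and the matcher is a drastic simplification that ignores most of the language and is not how ALICE itself is implemented.

```python
import re
import xml.etree.ElementTree as ET

AIML = """
<aiml>
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>MY NAME IS *</pattern>
    <template>Nice to meet you, <star/>.</template>
  </category>
</aiml>
"""

def load(aiml_text):
    """Return a list of (pattern string, template element) pairs."""
    root = ET.fromstring(aiml_text.strip())
    return [(c.find("pattern").text.strip(), c.find("template"))
            for c in root.iter("category")]

def to_regex(pattern):
    """Turn an AIML pattern into a regex; '*' captures one or more words."""
    parts = ["(.+)" if tok == "*" else re.escape(tok) for tok in pattern.split()]
    return re.compile(r"\s+".join(parts), re.IGNORECASE)

def render(template, stars):
    """Flatten the template, replacing <star/> with the first wildcard capture."""
    out = [template.text or ""]
    for child in template:
        if child.tag == "star" and stars:
            out.append(stars[0])
        out.append(child.tail or "")
    return "".join(out).strip()

def respond(categories, user_input):
    """Return the first matching category's rendered template, else None."""
    cleaned = " ".join(re.findall(r"[A-Za-z0-9']+", user_input))
    for pattern, template in categories:
        match = to_regex(pattern).fullmatch(cleaned)
        if match:
            return render(template, match.groups())
    return None

categories = load(AIML)
print(respond(categories, "hello"))              # Hi there!
print(respond(categories, "My name is Kevin."))  # Nice to meet you, Kevin.
```

The whole "knowledge" of the bot lives in the markup, so extending it means writing more categories rather than more code.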
For the 2002 competition, Kevin Copple entered a simplified version of the EllaZ program, called simply Ella. Copple made good use of the best techniques of previous winners, including those of ALICE / AIML. He says, "my approach is to use tools that are available, be clever where I can, do a lot of work, find ways to re-use work of others, and prepare for more sophisticated approaches."
When asked what makes Ella different from other ELIZA-like approaches, he listed the following techniques:
Copyright
©1997-2010 by the Cambridge Center for Behavioral Studies.
All rights reserved.