Ron Kaplan, Director of Natural Language Research, Palo Alto Research Center and Marti Hearst, Professor, SIMS, UC Berkeley; Science Advisor for Search, Yahoo! spoke on the state of natural language interfaces for search.
Synopsis: Kaplan says that today we're at the level where it's like talking to a one-year-old. What's desired is something more akin to conversing with "an intelligent research assistant." He adds, "It's not just about search. How do we interact with the world of ubiquitous computing [talking to remotes, your fridge, your car, sensors of all kinds, etc]? They'll be useful to extent we can have natural conversations." His prediction: "We'll be at 8-year-old level in 2010. In the [classic] hockeystick curve, I'm going to claim we're at the inflection point."
Marti Hearst claims that a well-thought-out user interface can itself help guide people and speaks about the role of inference: "What will people want to do next based on other people who had same question?" She's even more optimistic: "Shouldn't online travel agencies be more like a travel agent? Maybe we'll be there in about 4 years. And a pretty good desktop assistant? I'd say 5 years because there is a lot of government research in this area."
Progress in Search: A Conversational User Interface (CUI) by 2015?
From program: Our debate at AC2005 will consider differing estimates of the difficulty and strategies necessary to achieving a first-generation conversational user interface (CUI, pronounced "cooey") within the coming decade. Achieving a functional CUI would be perhaps the single most important and empowering artificial intelligence/intelligence amplification breakthrough we may witness in our lifetimes. It would give us the ability to talk to, be productive with, and be continually educated by our computers, cellphones, internet, and other complex technologies using simple but natural human conversation.
Sibley Verbeck, StreamSage (Moderator): More of an [intelligence] amplification than a technology itself. It's language understanding, it's inference. For instance, control applications: at lunchtime, we chatted about getting rid of remotes and talking to the devices in our living room. Being at Comcast, I have to think about television. Say you are watching the news and something comes on. Any two-minute news story generates more questions than it answers. What if we augmented it: queued up a more in-depth version of the story, provided other information?
What is under the hood? How close are we?
Ron Kaplan, Director of Natural Language Research, Palo Alto Research Center
Slides titled: Converging on Conversation: Search and Everything Else
It's not just about search. How do we interact with the world of ubiquitous computing? They'll be useful to extent we can have natural conversations.
Where are we now? It's like talking to a 1-year-old. Can't say what you want, nor get what you need. Adding more words yields no hits (usually).
Issue today is precision.
Workarounds: order by popularity (based on incoming links, clickthroughs for these hits - assuming you are a generic person: "if it's good for everyone else, it's good for you") - it's not individualized. You order a cookbook for your mother and now you get cookbook recommendations on Amazon for life.
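A minimal sketch of the popularity workaround, to show why it is not individualized. The Hit fields, the sample URLs, and the choice to sort by links first and clickthroughs second are all assumptions made for illustration, not how any particular engine ranks.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One search result with two popularity signals (hypothetical fields)."""
    url: str
    incoming_links: int
    clickthroughs: int


def rank_by_popularity(hits):
    """Order hits by incoming links, then clickthroughs, most popular first.

    Note what is missing: no argument describes the person searching,
    so every user gets the same ordering.
    """
    return sorted(
        hits,
        key=lambda h: (h.incoming_links, h.clickthroughs),
        reverse=True,
    )


# Invented sample data.
SAMPLE_HITS = [
    Hit("a.example", incoming_links=10, clickthroughs=5),
    Hit("b.example", incoming_links=50, clickthroughs=1),
    Hit("c.example", incoming_links=10, clickthroughs=9),
]

if __name__ == "__main__":
    for hit in rank_by_popularity(SAMPLE_HITS):
        print(hit.url)
```

Personalization, in this picture, would mean the ranking function also takes something about the user (context, stated intent such as "this is a gift"), which is the gap Kaplan points at.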
What's desired: A better model is not a 1-year-old, but an intelligent research assistant.
Let's say I want to know... What prevented the Northwest strike? Then you have a conversation for clarification. Next question: By mechanics or flight attendants? Mimics having a back-and-forth conversation with your assistant to refine the search rather than keywords and documents [model].
Typing is a problem sometimes. IM is anything but Instant. Language is efficient; unsaid but understood in context and based on expectations ("The fish seemed ready to eat"). The speaker and hearer must model each other. Problem is that it's an intricate interaction of intricate operations.
Component technologies: Accurate speech, robust grammatical analysis, ontologies and inference, personalization (how does it figure out "me", context, expectations), dialog models (primitive but useful)
Personalization is not very good right now. Needs discussion along with observed behavior ("I'm buying this for my grandmother"). The personal context (speech, interests) can be on client side.
Also needed: a coherent architecture that enables modularity.
We need another 2 cycles of Moore's Law. GB + GIPS.
We'll be at 8-year-old level in 2010. In the [classic] hockeystick curve, I'm going to claim we're at the inflection point.
Marti Hearst, Professor, SIMS, UC Berkeley; Science Advisor for Search, Yahoo!
SIMS is interdisciplinary - within School of Information - includes economics, law, IT
In the short term, a lot of focused, domain-specific interfaces. Such as "shortcuts" on search engines like "SFO flights" - but, of course, people don't like memorizing command languages.
User interface design can often make up for a lack of natural language processing technology by limiting choices, suggesting next choices, etc.
Compute statistics on data. For instance, factoid questions: "Who is president of Uganda?" (For straightforward questions with single answers.)
Automating dialogue is not that far ahead today. Only a little bit of work in that area [thus far].
Large-scale behavior collections, such as spelling corrections. Making inferences: What will people want to do next based on other people who had the same question?
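One way to picture the "what will people want to do next" inference is to count which query most often follows a given query in other people's sessions. This is only a sketch under assumptions: the session log below is invented, and build_followups and suggest_next are hypothetical helper names, not anything described in the talk.

```python
from collections import Counter, defaultdict


def build_followups(sessions):
    """Count, for each query, which queries other users issued right after it."""
    followups = defaultdict(Counter)
    for session in sessions:
        # Pair each query with the one that followed it in the same session.
        for current, following in zip(session, session[1:]):
            followups[current][following] += 1
    return followups


def suggest_next(followups, query, k=2):
    """Return up to k of the most common follow-up queries for `query`."""
    counts = followups.get(query, Counter())
    return [next_query for next_query, _ in counts.most_common(k)]


# Invented sample log: each inner list is one user's session, in order.
SAMPLE_SESSIONS = [
    ["sfo flights", "sfo parking"],
    ["sfo flights", "sfo parking"],
    ["sfo flights", "sfo hotels"],
    ["sfo hotels", "sfo parking"],
]

if __name__ == "__main__":
    table = build_followups(SAMPLE_SESSIONS)
    print(suggest_next(table, "sfo flights"))
```

With only four sessions the counts are meaningless; the idea depends on having a very large behavior collection to count over.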
Spelling example: Dictionaries are not enough because many words aren't in standard dictionaries. Use other people's mistakes to map to other misspellings. If it's a horrible misspelling, then map to the closest (better) misspelling until you [iteratively] hit on the correct spelling. It first maps to other misspellings. The number of correct spellings obviously must outnumber the misspellings. This algorithm works only because there are so many queries [to map against].
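The iterative mapping can be sketched roughly as follows. This is an illustration under stated assumptions, not the method any search engine actually uses: "closest" is taken to mean smallest Levenshtein edit distance, "better" is taken to mean "seen more often in the query log", and the log counts and the max_distance cutoff are invented.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correction_path(query, query_counts, max_distance=2):
    """Hop to the closest more-frequent spelling until none is left.

    Each hop moves to a strictly more frequent spelling, so the loop
    always terminates. Returns every spelling visited, ending with the
    most frequent one reachable.
    """
    path = [query]
    current = query
    while True:
        current_count = query_counts.get(current, 0)
        candidates = []
        for term, count in query_counts.items():
            if count <= current_count:
                continue
            distance = edit_distance(current, term)
            if distance <= max_distance:
                candidates.append((distance, -count, term))
        if not candidates:
            return path
        # Closest spelling first; ties go to the more frequent one.
        current = min(candidates)[2]
        path.append(current)


# Invented query-log counts: the correct spelling outnumbers the misspellings.
QUERY_COUNTS = Counter({
    "schwarzenegger": 1000,
    "schwarzeneger": 120,
    "shwarzeneger": 15,
    "shwarzenegar": 2,
})

if __name__ == "__main__":
    print(" -> ".join(correction_path("shwarzenegar", QUERY_COUNTS)))
```

In the sample data the worst misspelling is more than two edits away from the correct spelling, so it gets there only by stepping through the intermediate misspellings, which is the point of mapping to other people's mistakes first.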
Shouldn't online travel agencies be more like a travel agent? Maybe we'll be there in about 4 years.
And a pretty good desktop assistant? I'd say 5 years because there is a lot of government research in this area.