Not to be crass about it, but are we even interested in what robots have to say? Once you’ve gotten a laugh out of hearing the robotic Siri say, “I can’t be your designated driver,” what’s left? Isn’t Siri kind of a killjoy, a shoddy electronic wash-out? Sure, we’ve spent the last four articles building up to a future of brilliant talking robots. But what can we realistically expect from our artificially intelligent friends?
Me and my robot
For years now, I’ve been experimenting with robots, trying to get them to do serious shit for me. As the creator of the Social Robotics Lab at the University of Luxembourg, I researched how humans can collaborate with robots. (In computer-science speak, I worked on Coalition Formation and Normative Multi-agent Systems in the context of Distributed AI, as applied to robots.) My goal was to get my main robot, Julie, to interact with the European public at Mudam, the I.M. Pei-designed modern-art museum in the city of Luxembourg. The idea was for Julie to act out my research, like a mechanical Hippolyta in A Midsummer Night’s Dream. And the collaboration worked, in a one-on-one context:

Photo: P. Caire
Julie: [Holding out cards depicting various world-class cities] Do you want to play?
Audience member: Yes!
Julie: Yes! Hmmmm… [Eyes blinking, placing its arms akimbo] I am going to find a city for you! Let me see if I can think of a city…. Can you show me Amsterdam?… I will give you a minute to find the card.
Audience member: [Sorting through the cards, and smiling mischievously; presents Julie with the card for Rome] Here it is!
Julie: [cheerfully] That’s not Amsterdam, that’s Rome. Let’s try this again. Show me Amsterdam.
Audience member: [beaming] Here’s Amsterdam!
Julie: [Placing one arm overhead triumphantly] Well done!
Audience member: Woo-hoo!
As luck would have it, Julie was a smash hit, wowing audiences for a six-month run. As a computer scientist, however, I was eager to move on to next-level talking robots.
Robots Here, Robots There, Robots, Robots, Everywhere
So, what do robots’ next-level verbal skills look like, at the moment? To what extent can robots recognize and process speech, these days? And can they tell us anything we want to hear?
Consider the findings of Dr. Kristiina Jokinen, Senior Researcher at the National Institute of Advanced Industrial Science and Technology, in Tokyo. Dr. Jokinen focuses her research on elder care, nursing, and education. Notice that all of these are excellent proving grounds for robots, because they lend themselves to tasks that can be routinized. In the health-care sector, for example, interactive robots can check a patient’s vital signs and report them to caregivers. Robots can even guide humans in performing these tasks. Take a look at the human-robot dialogue below, which took place in 2018. In this simulated health-care setting, Dr. Jokinen’s robot, Yuki, instructs her on how to change an elderly patient’s position. It begins like this:

Dr. Jokinen: Hello Yuki!
Yuki: Hi Kristiina! How can I help you today?
Dr. Jokinen: Well, could you tell me how to change a person’s position?
Yuki: Ok, let me instruct you how to change the person’s position. First, have you checked the person’s current state?
Dr. Jokinen: What do you mean by the current state?
Yuki: I mean that the person is sleeping, or the person is awake and conscious. Is the person sleeping or awake?
Dr. Jokinen: Awake.
Yuki: Ok. And is the person able to communicate with you?
Dr. Jokinen: Yes, the person can communicate with me.
Yuki: Ok. Follow my instructions. Say next when you are ready to start.
Dr. Jokinen: Next.
Yuki: Ok. First move the person sideways.
Dr. Jokinen: Next….
After issuing several more directions, Yuki wraps up the session.
Yuki: Finally, make sure the person feels fine. Does the person seem to be fine?
Dr. Jokinen: Yes, the person seems to be fine.
Yuki: Ok, very good. Then we are done.
Are you impressed? I am—because the robot is functioning in practically the same role as a human instructor, in this context.
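For readers who like to peek under the hood, an exchange like the one above can be approximated with surprisingly little machinery. Below is a minimal sketch, in Python, of a scripted instruction dialogue in the same spirit: a couple of precondition questions, a list of steps advanced by saying “next,” and a closing check. The prompts, the placeholder step, and the plain text interface are my own illustrative assumptions; this is not Dr. Jokinen’s system, which handles speech recognition, timing, and far richer dialogue management.

    # A toy instruction dialogue, loosely modeled on the Yuki exchange above.
    # The prompts and steps are illustrative assumptions, not Dr. Jokinen's script.

    PRECONDITIONS = [
        ("Is the person sleeping or awake?", {"awake"}),
        ("Is the person able to communicate with you?", {"yes"}),
    ]

    STEPS = [
        "First, move the person sideways.",
        "Support the person's head and shoulders.",  # illustrative placeholder step
        "Finally, make sure the person feels fine.",
    ]

    def run_instruction_dialogue():
        print("Robot: Let me instruct you how to change the person's position.")
        # 1. Check the preconditions before giving any instructions.
        for question, accepted_answers in PRECONDITIONS:
            answer = input(f"Robot: {question}\nYou: ").strip().lower()
            if answer not in accepted_answers:
                print("Robot: Then please call a nurse before we continue.")
                return
        # 2. Walk through the steps, advancing only when the user says "next".
        print("Robot: Ok. Follow my instructions. Say 'next' when you are ready.")
        for step in STEPS:
            while input("You: ").strip().lower() != "next":
                print("Robot: Say 'next' when you are ready for the instruction.")
            print(f"Robot: {step}")
        # 3. Close with a final check, as Yuki does.
        if input("Robot: Does the person seem to be fine?\nYou: ").strip().lower().startswith("y"):
            print("Robot: Ok, very good. Then we are done.")
        else:
            print("Robot: Please check on the person and call for help if needed.")

    if __name__ == "__main__":
        run_instruction_dialogue()

The point of the sketch is simply that routinized tasks map naturally onto scripted dialogues, which is exactly why elder care and education make such good proving grounds for robots.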
Human to Robot: “Please Understand Me”
So, if human-robot interactions are so encouraging, then what’s preventing these collaborations from taking place in daily life?
The trouble is that, before a robot can execute a task we give it, the robot must “understand” us accurately. And how do we know our robot has understood us correctly? It’s complicated, because neuroscientists don’t even have a complete picture of how human comprehension works. Fortunately, a handful of scientists are unraveling this conundrum.
Enter Dr. Melanie Mitchell, Davis Professor of Complexity at the Santa Fe Institute and Professor of Computer Science at Portland State University. In her 2019 book, Artificial Intelligence: A Guide for Thinking Humans, Dr. Mitchell envisions a more advanced human-robot relationship in which, for example, robots “get” complex linguistic structures, like analogies. Finding this holy grail of relationships means recognizing the crucial difference between a computer program seeming to understand a line of dialogue and the program genuinely understanding that line, as humans do.

Santa Fe Institute workshop, in 2014. Photo: Minesh Bacrania
To highlight robots’ currently limited “understanding,” Dr. Mitchell cites one of the Winograd schemas, a miniature language-comprehension test. Here’s the test: read the pair of sentences below. You’ll notice they are identical except for one word: Sentence 1 uses feared, whereas Sentence 2 uses advocated. You’ll also notice that changing that single word dramatically shifts the sentences’ meaning. Take a look:
SENTENCE 1: “The city council refused the demonstrators a permit because they feared violence.”
Question: Who feared violence?
A. The city council B. The demonstrators
SENTENCE 2: “The city council refused the demonstrators a permit because they advocated violence.”
Question: Who advocated violence?
A. The city council B. The demonstrators
In Sentence 1, the pronoun preceding feared, the word they, refers to the city council, whereas in Sentence 2, they refers to the demonstrators.
You get it, I get it—but the robot doesn’t get it. Dr. Mitchell explains why: as humans, we can rely on our common sense, accumulated from the moment we are born. Being human provides an enormous amount of experience in the physical world—experience that robots simply do not have, at least not yet. So sure, robots can process sentences, but they can’t understand those sentences, in part because they lack the physical background.
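To make the gap concrete, here is a minimal sketch of how a purely statistical language model can be coaxed into “answering” the schema: substitute each candidate referent for the pronoun they and keep whichever substituted sentence the model scores as more probable. The sketch assumes the Hugging Face transformers library and the small GPT-2 model; it illustrates the general scoring trick, not the evaluations Dr. Mitchell describes.

    # Score the two Winograd sentences by substituting each candidate for "they"
    # and asking a small language model which version it finds more probable.
    # Assumes: pip install torch transformers. Illustration only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def avg_neg_log_likelihood(text: str) -> float:
        # Lower means the model considers the sentence more probable (per token).
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs, labels=inputs["input_ids"])
        return outputs.loss.item()

    def resolve(sentence: str, candidates) -> str:
        # Pick the candidate whose substitution yields the more probable sentence.
        return min(candidates, key=lambda c: avg_neg_log_likelihood(sentence.replace("they", c)))

    candidates = ["the city council", "the demonstrators"]
    s1 = "The city council refused the demonstrators a permit because they feared violence."
    s2 = "The city council refused the demonstrators a permit because they advocated violence."

    print("Sentence 1, 'they' =", resolve(s1, candidates))
    print("Sentence 2, 'they' =", resolve(s2, candidates))

A small model run this way may well pick the same referent for both sentences, which is Dr. Mitchell’s point in a nutshell: pattern-matching over text is not the same as commonsense understanding of city councils, permits, and violence.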
Human to Robot: “A Penny for Your Thoughts?”
Dr. Mitchell is right, of course: robots don’t truly understand us, and that is a barrier to getting them to work with us more smoothly. Today, most robots merely execute tasks; they cannot narrate the decision-making behind those tasks.
This is a potential deal-breaker, because knowing what your robot is thinking is a must-have, not a nice-to-have. Imagine that you’re at the Mini dealership and your robot tells you to buy the Mini Clubman rather than the Mini Cooper. You’re going to want to know the concrete reason you’re being told to buy that car. Because it’s bigger? Cheaper? Gets better mileage? And so it goes.
Say hello to Dr. Manuela Veloso, computer science professor and head of Machine Learning at Carnegie Mellon University. Veloso and her team create learning algorithms that enable robots to ask, and answer, all manner of questions, illuminating their “reasoning” process. It was in Dr. Veloso’s lab that CoBots—Collaborative Robots—were created. These babies are autonomous, indoor, mobile—and service-oriented. Dr. Veloso is expanding robots’ capabilities by training them to communicate more—and more effectively—with users.

CoBots embody Dr. Veloso’s idea that, for us to benefit from robots, we must feel safe, comfortable, and trusting when we’re around them. The feeling of trust, in particular, hinges on knowing what the robot is thinking, so to speak. Here is a sample dialogue in which one of Dr. Veloso’s CoBots explains its plan to perform the task at hand.
User: Go deliver a pencil to the meeting room.
CoBot: You want me to deliver the object pencil, is that correct?
User: Yes….
Now, the robot explains to the user how it “understands” the meaning of the action (go deliver); how it identifies the object (pencil); how it “understands” where to find the pencil (the office); and how it “understands” where, exactly, the meeting room is located (room 7502). The interaction concludes:
CoBot: I have found that object pencil is most likely to be found in location office. I am going to get object pencil from location office and will deliver it to room 7502, is that correct?
User: Yes
This is no ordinary sound bite: it is concrete evidence of a speech-based interaction in which the robot is transparent in its “thinking,” so that the user can understand what will happen next—and trust the robot’s process.
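For the curious, the confirm-before-execute pattern in that exchange can be sketched in a few dozen lines of Python: parse the request into a structured plan, fill in where the object is most likely to be found, state the whole plan back to the user, and act only after an explicit yes. The keyword parsing, the object-location table, and the room number below are illustrative assumptions; Dr. Veloso’s CoBots do this with real task planners, building maps, and learned models.

    # A toy version of the "explain the plan, then ask for confirmation" pattern
    # shown in the CoBot exchange above. The parsing, the object-location table,
    # and the room number are illustrative assumptions, not Dr. Veloso's code.
    import re

    # Where the robot believes each object is most likely to be found (assumed data).
    LIKELY_LOCATIONS = {"pencil": "office", "coffee": "kitchen"}
    ROOMS = {"meeting room": "room 7502"}

    def parse_request(utterance: str):
        # Very naive keyword parsing of "deliver a <object> to the <place>" requests.
        match = re.search(r"deliver an? (\w+) to the ([\w ]+)", utterance.lower())
        if not match:
            return None
        obj, place = match.group(1), match.group(2).strip()
        return {"action": "deliver", "object": obj, "destination": ROOMS.get(place, place)}

    def confirm(question: str) -> bool:
        return input(f"CoBot: {question}\nUser: ").strip().lower().startswith("y")

    def handle(utterance: str):
        plan = parse_request(utterance)
        if plan is None:
            print("CoBot: Sorry, I did not understand the request.")
            return
        # Step 1: confirm the parsed object before planning anything.
        if not confirm(f"You want me to deliver the object {plan['object']}, is that correct?"):
            return
        # Step 2: state the full plan (where the object is, where it will go)
        # and execute only after the user approves.
        source = LIKELY_LOCATIONS.get(plan["object"], "the supply room")
        if confirm(
            f"I have found that object {plan['object']} is most likely to be found in "
            f"location {source}. I am going to get object {plan['object']} from location "
            f"{source} and will deliver it to {plan['destination']}, is that correct?"
        ):
            print(f"CoBot: Executing: fetch {plan['object']} from {source}, "
                  f"deliver to {plan['destination']}.")

    if __name__ == "__main__":
        handle("Go deliver a pencil to the meeting room.")

The structure matters more than the details: every assumption the robot makes is spoken aloud before it acts, and that transparency is where the user’s trust comes from.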
O Robot! My Robot!
It was six years ago, practically to the day, that my team and I debuted Julie, our robot, to Luxembourg’s eager museumgoers. More than 80,000 people interacted with Julie, who served as an elegant AI emissary to the human world. For most of those people, it was their first conversation with a robot, and an enthralling one.
Beyond the audience’s enthusiasm, my team and I knew that the Julies of the world would evolve into something bigger still: robots that could serve humans in ever more profound ways.
And here we are, with robots increasingly able to educate people, to help care for people in medical contexts, to improve the quality of life in so many different settings. Indeed, the robot spoke, and no doubt will continue to speak to our needs as humans in myriad ways.
Related Posts
The Robot Spoke—and Said, “I love you” (Part 4) — An article by Dr. Patrice Caire
The Robot Spoke—And Sounded Smarter Than Ever (Part 3) — An article by Dr. Patrice Caire
The Robot Spoke: How Machines Got the Hang of It (Part 2) — An article by Dr. Patrice Caire
The robot spoke, but what did it say? (Part 1) — An article by Dr. Patrice Caire