Human-first vs AI-first approach to building a smart and fast bot
I’ve designed two chatbot products in the past couple years. The first was smart and slow. The second is not as smart but fast. Both intend to be smart and fast. I’ll get to the benefits, drawbacks, and design decisions for both.
In 2015, with a team of fewer than 10 people, we built a bot that automated your office. Half of that team were human agents. This year, I am designing a personal shopping bot at eBay. None of the team are human agents.
The office assistant bot (Large) did well immediately. Teams completed transactions from day 1 and reengaged weekly. We were as efficient as humanly possible in getting people what they wanted. We entertained people in ways that were extremely time consuming. We invested in picking up nuances and building relationships. It was smart and slow. It wasn’t scalable for us. To slay, we started building for speed and efficiency, the qualities of a bot. (Read about the birth of baby bot.)
eBay wanted to tackle personal shopping from a different direction, without human assistants. The personal shopping space is tough regardless of whether a human or an AI responds to requests. This bot will be all AI at birth, and it will be fast and scalable.
How long will the AI take to be smart enough to understand what we say and imply? What about threading and group chat? Does it know how to do that? How does the AI know the right and efficient questions to ask? And, what about this and that?
Fuck it. I’m in.
(New Product Development team at eBay)
Based on my experience designing for Large and eBay ShopBot, here are the benefits and drawbacks of building a human-first bot and an AI-first bot, plus the design directions for making them fast, smart, and scalable.
Large: half human bot
Flexible but inconsistent
No dead ends. People got what they asked for unless it was illegal, unethical, or out of our business model. We understood intent and trolls. And, guessed at emotions. We were responsive and changed course whenever we wanted. Obvi, because humans.
Live user testing daily. We saw patterns and located kinks. We tested communication styles, from a consistent bot-like dialogue to natural human language. We tested interactions, speed, length of information, types of information. Anything we wanted, without changing a line of code.
Inconsistent experience. Humans are fluid and make mistakes. One agent would be funnier than another or have more specific knowledge about an item. Response speed and attention differed depending on the hour. An agent might forget to ask for a piece of information, which increased back-and-forth. Anything can happen.
Humans need rest. When bouncing from one agent to another, the customer was bound to feel the difference, even if only slightly.
Personal but costly
Highly considered results. Agents asked only the questions that mattered for finding personalized options. We consulted experts, friends, and our own team. We all had special interests and knowledge and pitched in on what was the best item to get, for what reason and purpose.
Extremely time consuming. We wanted to offer limited and personalized results, which required learning about each person’s preferences and decision making process. And, a lot of trial and error.
An agent spends 10 minutes considering 10+ factors to narrow down to 3 strong options, crafts them into structured messages, and sends them back to the customer.
In just a few seconds, the person sends back, “More options.”
Rejected. Ok, let’s try again. Why didn’t these options work? I want to find you what you want.
"I just want more options.”
We only had a handful of agents.
Design direction: improve speed and efficiency
Creating a consistent voice: We started with a high-level voice guideline with personality. As mentioned, we tested different tones, from more human to more bot-like. We found that sounding robotic increased efficiency. People had a lower expectation of what it could actually understand. When the bot sounded too smart and too much like a human, people pushed its limits by saying things that had nothing to do with a request. Though entertaining for the person, it created a lot of noise. After finding a tone that met our purpose, we minimized agents talking directly to people. Agents spoke to the bot, which spoke with people in a consistent branded voice.
Understanding service: We ran the natural course of finding and getting people what they want by observing how people asked our bot for things. We researched the necessary info different types of vendors needed, translated those bits into easily answerable questions, and mashed them up with details the customer provided. Then we optimized and automated that part of the flow.
Speeding up workflow: We started with parts of the experience where an agent wasn’t necessary to do the work and where we could easily script. We designed how agents got info from the bot, how the bot asked agents to perform a task, and how the bot chatted with people.
Minimizing mistakes: This came with having a consistent voice, structured flows, and templated messages. We made sure to always follow up, never leave anyone hanging, and remember to charge the team or person.
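The templated-message idea above can be sketched in a few lines. The template names, wording, and `render` helper here are my own illustration, not Large's actual system; the point is that agents fill in structured fields and the bot renders them in one branded voice, so an incomplete message can't slip through.

```python
from string import Template

# Templated replies keep tone and structure consistent across agents.
# These templates and field names are hypothetical examples.
TEMPLATES = {
    "options": Template(
        "Here are $count picks for you: $items. Want more details on any of them?"
    ),
    "follow_up": Template(
        "Just checking in on the $item you asked about. Still interested?"
    ),
}

def render(template_name, **fields):
    """Render an agent's structured input as a consistent bot message.

    Template.substitute raises KeyError if a required field is missing,
    so an agent can't accidentally send an incomplete message.
    """
    return TEMPLATES[template_name].substitute(**fields)
```

With this kind of layer, "minimizing mistakes" becomes a property of the system rather than of each agent's attention.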
Large was becoming faster and more scalable like a bot, while human agents stayed in the background understanding intent when intent was lost, personalizing results, and curating databases of options for the bot to automatically pull from.
eBay ShopBot: all AI, no humans
Consistent but rigid
No playing favorites. Everyone gets the same amount of attention. eBay ShopBot speaks to everyone with a consistent voice, knowledge, and flow.
Fewer mistakes. It won’t go off course or skip a step by accident. If it does skip a step, it’s by design. It’s autonomous.
Unaware at birth. Baby AI doesn’t know what to say, what to think, and what to remember. It doesn’t understand what people are saying and what they really mean by it. We need to teach it, a lot, before it can serve anyone. It does only what we program it to do. This raises the question, “Is it smart enough to launch?”
No humans to the rescue. Dead ends are easy to come by when it doesn’t understand. There aren’t any human agents jumping in to move the conversation along. Baby AI better grow up fast.
Won’t bend the rules. The bot is always on the customer’s side by design, but doesn’t know when and how to go out of its way for the customer like a human would.
Scalable but generic
Always on. It doesn’t need a break unless it’s broken. It doesn’t take a paycheck.
Quicker results. It’s not that smart but it’s fast. So fast that we need to add artificial delays and pace the messages to not overwhelm people. The bot can answer more requests at once than any team of humans can.
Generic options. Results aren’t as considered and curated because baby AI doesn’t fully understand intent or personal preferences, or it simply hasn’t learned the words being used yet. It needs to learn world knowledge, understand the catalogue of items along with their individual attributes, and read between the lines. This can be difficult for humans too, not just AI.
More effort from customers. For the AI to grow up and be more personalized, people need to put in the work. This can definitely turn people off. They ask the bot for help and the bot is asking for help in return. Tell me why. Give me feedback. Teach me teach me.
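The message pacing mentioned under "Quicker results" can be sketched simply. The per-character delay heuristic, its cap, and the `send_fn` callback are my own assumptions for illustration, not ShopBot's actual implementation:

```python
import time

def paced_send(messages, send_fn, per_char=0.03, max_delay=2.0):
    """Deliver messages one at a time with a short 'typing' pause.

    send_fn is whatever delivers a single chat message to the person.
    """
    for msg in messages:
        # Longer messages pause longer, capped so nobody waits too long.
        time.sleep(min(len(msg) * per_char, max_delay))
        send_fn(msg)
```

The delay does nothing functional; it only keeps a burst of instant replies from overwhelming the person on the other end.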
Design direction: improve smarts
Teaching baby AI: Create conversation flows that offer value to the user but also teach the AI. It’s a hard balance to not complicate the user experience for the sake of the bot. But if the AI is going to set the service apart, then slay.
Not designing search: We consciously think beyond traditional search when creating the experience to avoid rebuilding search in chat. Admittedly, it may feel like search in its infancy because of the rigidity, but as the AI gets smarter, the bot will increasingly set itself apart as a personal shopping assistant. We’re designing conversations that lead to the AI understanding intent. The goal is for the bot to only ask the key questions necessary to find the best results for that person.
Managing expectations: The bot doesn’t understand a lot of what people say. Therefore, dead-ends. We can teach people what it can understand and what it can’t. We can offer an alternative path when the bot knows it’s stuck. It can try to correct itself with the help of the customer. Whatever we design, it will never leave someone hanging intentionally.
eBay ShopBot keeps trying and offers other options when it doesn't understand
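The "never leave someone hanging" rule above can be sketched as a confidence check with a fallback path. The toy `classify` function, its keyword matching, and the threshold are hypothetical stand-ins for a real intent classifier:

```python
def classify(text):
    """Toy intent classifier: returns (intent, confidence).

    A real system would use an NLU model; this keyword lookup is only
    here to make the fallback logic runnable.
    """
    known = {"shoes": "browse_shoes", "bag": "browse_bags"}
    for word, intent in known.items():
        if word in text.lower():
            return intent, 0.9
    return "unknown", 0.2

def respond(text, threshold=0.5):
    intent, confidence = classify(text)
    if confidence >= threshold:
        return f"On it! Searching for: {intent}"
    # Fallback: admit the limit and offer a way forward instead of a dead end.
    return ("I didn't quite get that. Try naming an item, "
            "like 'shoes' or 'bag', and I'll find options.")
```

The design decision lives in the fallback branch: when the bot knows it's stuck, it says so and offers an alternative path rather than going silent.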
Building trust: People will learn to trust the bot if it’s consistent, accurate, looking out for them, responsive, transparent, and smart. The biggest hurdle is a smart AI, but it’s not the only trust signal we need to design for. Sentence structure, images, context, and nagginess are all design problems.
eBay ShopBot is still a baby, but it’s getting smarter. It learns by picking up implicit feedback based on people’s actions. It will be even smarter with explicit feedback. We need to design better ways for people to quickly signal if the options the bot offers are on point or not, and why. It also needs history, geography, meteorology, and mathematics classes. It needs to know current events, pop culture, and world knowledge. It’s not so that it can answer these questions, but so that it can find people relevant options.
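The implicit-versus-explicit feedback distinction above can be made concrete with a small sketch. The event names and weights are my own illustrative assumptions, not ShopBot's actual signals:

```python
# Implicit signals come from what people do; explicit signals come from
# what they directly tell the bot. Weights here are illustrative only.
SIGNAL_WEIGHTS = {
    "viewed": 0.1,        # implicit: opened an option's details
    "purchased": 1.0,     # implicit: the strongest positive action
    "thumbs_up": 0.8,     # explicit: said the option was on point
    "thumbs_down": -0.8,  # explicit: said the option missed
}

def score_option(events):
    """Aggregate feedback events for one option into a relevance score."""
    return sum(SIGNAL_WEIGHTS.get(event, 0.0) for event in events)
```

Explicit feedback is weighted heavily because it carries the "why" that actions alone can't; designing a quick way for people to give it is the hard part.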
Either approach to building a bot works, depending on your objectives and resources. Not all bots need to be smart and fast, or all AI, though being all AI makes scaling easier. And not all bots are trying to replace humans.
Are you building a half human or all AI bot? Why did you choose that route? What are the benefits and drawbacks you’ve experienced from your approach?
If interested in more of the never-ending challenge of designing for AI, send me a signal by sharing this post or following me.
@elainelee: New Product Development at eBay (ShopBot). Previously designed for Large (Slack bot). Once, designed for Assembly and General Assembly. Not the same.