TL;DR: Move over, Siri, Alexa, and Google Assistant: researchers at Stanford’s Open Virtual Assistant Lab (OVAL) are hard at work on Almond, an open, privacy-focused virtual assistant. The research prototype translates user commands into customized programs while allowing users to govern installation and server setup. The ongoing, multidisciplinary effort is growing, and researchers hope to get the technology into the hands of everyday users within the next year.
Ever since they found a home in internet-connected phones, laptops, and smartwatches, virtual assistants such as Alexa, Siri, and Google Assistant have been eager to serve as our newest windows into the web. These bots can look up weather forecasts, remind you to brush your teeth, and even sing you a lullaby.
But that doesn’t mean they can be trusted. In 2019, for example, several news outlets reported that Amazon, Apple, Facebook, Google, and Microsoft had hired human contractors to analyze anonymized virtual assistant recordings. The revelations contributed to growing concern over whether it is safe to welcome proprietary systems into our homes, offices, and vehicles.
“We see increasing dominance of commercial virtual systems; for example, Amazon’s virtual assistant, Alexa, has 70% of the current market share,” said Giovanni Campagna, Lead Developer of Stanford’s Almond project. “The way I see it, the fact that a significant amount of data goes through Amazon is a risk to consumer privacy.”
That’s why Giovanni and other team members from Stanford’s Open Virtual Assistant Lab (OVAL) are hard at work on their own privacy-focused virtual assistant, Almond. The ultimate goal of the active research project is to provide the public with open-source virtual assistant technology for the open web.
“We want to make it easier for businesses to own this new interface to the web, as well as their relationships with their customers, and make it easier for consumers to switch between various services and maintain ownership of their data,” he said.
Giving Users a Voice in the Virtual Assistant Experience
Almond offers several features that leading proprietary virtual assistants do not. With Almond, users can attach monitors and filters to their commands. For example, a user can have Almond frequently scan a news site and send a notification each time a new article is published, or watch the site only for articles featuring a chosen keyword and receive notifications accordingly. Users also have the power to govern the installation and server setup processes, securing data from prying eyes.
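The monitor-and-filter behavior described above can be illustrated with a minimal sketch. This is not Almond's code or its command language; the `Article` type, the `monitor` function, and the idea of passing in successive "snapshots" of a site's article list are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one article on a news site."""
    url: str
    title: str


def monitor(
    snapshots: Iterable[Iterable[Article]],
    keyword: Optional[str] = None,
) -> Iterator[Article]:
    """Yield each article the first time it appears in a snapshot.

    If `keyword` is given, only articles whose title contains it
    (case-insensitive) are yielded; the rest are silently skipped.
    """
    seen: Set[str] = set()
    for snapshot in snapshots:
        for article in snapshot:
            if article.url in seen:
                continue  # already handled in an earlier scan
            seen.add(article.url)
            if keyword is None or keyword.lower() in article.title.lower():
                yield article  # this is where a notification would be sent
```

Calling `monitor(snapshots)` corresponds to "notify me about every new article," while `monitor(snapshots, keyword="privacy")` corresponds to the filtered version of the same command.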
The platform is remarkably advanced when you consider the force behind it. Whereas Amazon has thousands of engineers and contractors dedicated to annotating data and improving the Alexa platform, the core Almond team consists of five Ph.D. students and one professor.
“The challenge is to develop AI-based, natural language technology at a scale and a cost where it becomes affordable for everybody to contribute to and build this open-source virtual assistant,” Giovanni said. “It shouldn’t only be possible for Google and Amazon.”
The Almond project is fortunate to have OVAL on its side. Stanford professors across several disciplines, including computer science, digital civil society, and public policy, are part of the research lab and are working toward its goal of building an open-source, consumer-friendly virtual assistant that can be deployed independently of the servers governed by industry giants.
“We believe in privacy, choice, and open competition,” Giovanni said. “We believe in providing convenience to users without having to be mediated by these too-large companies.”
OVAL is led by Faculty Director Prof. Monica Lam, who specializes in virtual assistants, natural language processing, programming in natural language, machine learning, and privacy. She also draws on her experience as a co-author of Compilers: Principles, Techniques, and Tools (2nd Edition) to help with Almond, which, from a development standpoint, is a compiler.
“We want to build development infrastructure and systems that make it easier for everybody to contribute to the tool,” Giovanni said. “Then, as the tool improves, everybody can use it to build a bot they can deploy on a website or create a cohesive virtual assistant package that a consumer can download.”
Fostering Enhanced, Privacy-Focused Interactions in the IoT World
Almond’s purpose is to help users access anything on the web via a personal assistant designed to protect user privacy. First, by executing all data operations on a local device, the technology safeguards login credentials for social networks, financial institutions, and other online services.
“We’re protecting your actual data — for example, your statement and balance when you access online banking,” Giovanni said. “Because Almond is open-source, you get a choice of where to install it — on your personal computer, your phone, a virtual machine, or a cloud server. We imagine in the future there will be multiple hosting providers for people who lack the interest or the expertise to set up their own server. But the point is, every Almond is independent.”
Second, Almond builds its natural language model without listening in on user conversations.
“We don’t have the capability to do that, nor do we have the interest to do that from a privacy point of view,” Giovanni said. “Instead, we built a system where the bulk of training data is synthesized from developer information. We can generate a million dialogues automatically.”
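To make the idea of synthesizing training data concrete, here is a minimal sketch of template-based generation. It is an illustration under stated assumptions, not the Almond team's pipeline: the phrase lists, the `synthesize` function, and the `monitor(...) => notify` target notation are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical developer-supplied fragments: natural-language text mapped
# to the identifier used in the target program.
PHRASES: List[str] = ["tell me when", "notify me when", "let me know when"]
SOURCES: Dict[str, str] = {"the news site": "news", "my bank": "bank"}
EVENTS: Dict[str, str] = {
    "publishes an article": "new_article",
    "posts a statement": "new_statement",
}


def synthesize() -> List[Tuple[str, str]]:
    """Combine every phrase, source, and event into (sentence, program) pairs."""
    examples: List[Tuple[str, str]] = []
    for phrase, (src_text, src_id), (ev_text, ev_id) in product(
        PHRASES, SOURCES.items(), EVENTS.items()
    ):
        sentence = f"{phrase} {src_text} {ev_text}"
        # Made-up target notation, used only to show the pairing.
        program = f"monitor({src_id}.{ev_id}) => notify"
        examples.append((sentence, program))
    return examples
```

Three phrases, two sources, and two events already yield twelve labeled pairs without recording a single user. Some combinations, such as a bank publishing an article, are implausible, so a real generator would need constraints on which fragments can be combined; the point here is only that the number of examples grows multiplicatively with each list.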
The small amount of user data that Almond does require can be collected from users who agree to share data for research purposes.
“You can even acquire that from your own market research development team if you are a business that has access to that type of resource,” Giovanni said. “The idea is to reduce the amount of real data that you have to collect and annotate as a way to preserve privacy.”
An Ongoing, Multidisciplinary Research Effort
Giovanni said Almond, part of a multidisciplinary academic research project that studies aspects of human-computer interaction, is still very much a prototype.
“Understanding how people are going to interact with the system is absolutely part of our research,” he said. “We started building as a platform for our own research, and we hope to evolve first as a platform for other researchers, and, in the future, as a consumer product. But we’re not quite there yet.”
This summer, the Alfred P. Sloan Foundation granted the Almond project funds to be used toward a public release of the open-source project.
“We have development releases out there for very early adopters, but we want to have something in the hands of users around June of next year,” Giovanni said.
For the initial release, the Almond team will focus on providing a streamlined user experience built around the most frequently used digital assistant skills, including weather, timers, reminders, and music, among other sound-based actions suitable for a smart speaker.
“The natural comparison is trying to match Alexa, but we also hope that our technology is able to build richer conversations over multiple turns — so that the user can ask questions, and the bot can offer questions for clarification, make suggestions, and so on.”
He said the team’s goal is to achieve those advancements while maintaining low development costs.
“We’re fortunate to be in the Stanford environment. There are lots of professors and students who are interested in social good projects, and they have great ideas from NLP, HCI, security, and privacy. We have a track record of building systems that are used by many people (from software-defined networking to compiler infrastructure). My last project as a Ph.D. student here will involve getting this in the hands of users. And that’s where you learn the most.”