Accessibility and Interaction in a World Beyond Screens

Android Makers by droidcon is one of the largest conferences globally for Android developers. It features talks, workshops, and networking opportunities, all focused on the latest advancements in Android development. I had the opportunity to give a keynote speech at last week’s conference held in Paris from 25th to 26th April 2024.  

In my keynote, I discussed how the increased usage of emerging technologies, such as brain-computer interfaces (BCIs), wearables, IoT, and XR, is drastically changing traditional interaction patterns. As we transition from graphical user interfaces to organic user interfaces, what does this mean for accessibility? See my slides here:

Slides for Accessibility and Interaction in a World Beyond Screens keynote


The Challenge

Accessibility is about removing barriers so that users can engage and participate in everyday activities. Approximately 1.3 billion people have a significant disability. That's 1 in 6 people, or 16% of the population. What does it mean to fully include them in our product journeys?

At Google, we have a product inclusion North Star that compels us to “Build for Everyone.” We work with communities with different usability needs from start to finish. Product inclusion can’t be condensed into a list of checkboxes. We have to understand the unique needs, preferences, and challenges our users face at various stages of the development process, and hold our product teams accountable to address these needs. Being intentional during ideation, UX research and design, implementation, user testing, and marketing is key to successful, inclusive outcomes. We build for equitable user experiences, not just minimum usability. And we hold ourselves accountable through testing and best practices.

Emerging Interaction Patterns and Interfaces

Command Line Interfaces (CLIs): The earliest interfaces, CLIs, were static, somewhat disconnected, and abstract. Understanding a system through a CLI can be very different from understanding it visually. It's very directed, with clear commands, and the user bears responsibility for recalling all of them. If you stray from the list of specific commands, the system simply doesn't execute them.

Graphical User Interfaces (GUIs): With GUIs, we have a more responsive interface that is somewhat indirect. We can have compound actions and explore what is available to us visually.  Icons also help us understand the purpose of a particular action.

Natural User Interfaces: Now we have more natural interfaces that are contextual and intuitive. An example is using a stylus with an iPad, where different texture brushes and pressure sensitivity allow for an experience very similar to physical drawing and painting.

Organic User Interfaces: With many of these emerging technologies, especially AI-enabled technologies and brain-computer interfaces (BCIs), we see a move to organic user interfaces that are more fluid and anticipate what we're going to do.

Introducing the SMaG Interaction Model

The SMaG Interaction Model describes these emerging interaction patterns, which can be used to create more accessible experiences. They include voice, touch, manipulation, gestures, and brain-computer interfaces.

Speech Recognition: Speech is the most natural interaction pattern for those with mobility and vision disabilities. For instance, you might say "Hey watch, say something funny" to a connected device. This initiates an audio stream that is captured by the device. The system then works out what was actually said through natural language understanding: it tries to identify the different tokens, what they might mean, and their sentiment. Then, it personalizes the response based on context, memory, and knowledge of the user. The vision for your system will determine how these components are put together.
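To make the capture step concrete, here is a minimal Kotlin sketch using Android's built-in SpeechRecognizer API. The downstream understanding and personalization steps are represented by a plain callback, and the app is assumed to hold the RECORD_AUDIO permission; treat it as a shape of the flow, not production code.

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

// Capture a single utterance and pass the top transcription to whatever
// understanding/personalization logic sits downstream (here, just a callback).
fun listenOnce(context: Context, onTranscript: (String) -> Unit) {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)
    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onResults(results: Bundle) {
            results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()
                ?.let(onTranscript)
            recognizer.destroy()
        }
        override fun onError(error: Int) { recognizer.destroy() }
        // The remaining callbacks are not needed for this sketch.
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onPartialResults(partialResults: Bundle?) {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    })
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
    }
    recognizer.startListening(intent)
}
```

In a real product you would also surface partial results and errors to the user, which matters most for people who rely on speech as their only input channel.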

When considering speech interfaces for accessibility, think about how to address user needs holistically. For instance, speech interfaces may serve populations with vision or motion impairments well, as these interactions do not require sight, hand movement, or body effort. Note that voice works best in quiet environments. Avoid using voice where errors can be costly, because the process of understanding what is being said may not reach the desired accuracy threshold, and high accuracy may be necessary to avoid excluding groups who rely solely on a speech interface.

Manipulation: Using touch as a method of interaction is another way to ensure accessibility. There is discrete touch, continuous touch, deformation, and haptic feedback. These interactions work well in mixed-reality environments, where controllers provide the accuracy needed to interact with the environment. Touch interaction can be both accurate and reliable.
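As a small illustration, here is a Kotlin sketch of discrete touch paired with confirming haptic feedback on Android; the onActivate callback is a placeholder for whatever action the touch triggers.

```kotlin
import android.os.Build
import android.view.HapticFeedbackConstants
import android.view.View

// Discrete touch with a confirming vibration: the tap triggers the action and a
// short haptic pulse confirms it, which helps users who cannot rely on sight alone.
// Using a click listener (rather than raw touch events) keeps the control reachable
// by switch access, keyboards, and screen readers as well.
fun attachAccessibleTap(view: View, onActivate: () -> Unit) {
    view.setOnClickListener {
        val constant = if (Build.VERSION.SDK_INT >= 30) {
            HapticFeedbackConstants.CONFIRM      // richer "confirm" effect on API 30+
        } else {
            HapticFeedbackConstants.KEYBOARD_TAP // widely supported fallback
        }
        it.performHapticFeedback(constant)
        onActivate()
    }
}
```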

Touch interfaces serve accessibility best when physical form complements or enhances function, when precision and accuracy are desired, in public places where voice or gesture tracking would be difficult, and for populations with sight, hearing, or speech impairments. Keep in mind that this interface may exclude populations with motion impairments.

Gestures: Gestures are even more natural and organic. Hand gestures can be detected in a computer vision-enabled environment, like AR or VR. Gestures, including hand movements, eye movements, facial movements, and body movements, can be even more natural than touch interaction. These systems understand what a user is trying to convey based on facial expression and hand and eye movement. Body movement can also be tracked to understand the position of the body during a particular action.
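As a rough sketch of what vision-based gesture detection can look like on Android, the snippet below uses the MediaPipe Tasks gesture recognizer (covered further in the next post in this series). It assumes the com.google.mediapipe:tasks-vision dependency and a "gesture_recognizer.task" model file bundled in the app's assets.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import com.google.mediapipe.framework.image.BitmapImageBuilder
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.vision.core.RunningMode
import com.google.mediapipe.tasks.vision.gesturerecognizer.GestureRecognizer

// Recognize a hand gesture (e.g. "Thumb_Up", "Open_Palm") in a single frame.
fun recognizeGesture(context: Context, frame: Bitmap): String? {
    val options = GestureRecognizer.GestureRecognizerOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("gesture_recognizer.task").build()
        )
        .setRunningMode(RunningMode.IMAGE)
        .build()
    val recognizer = GestureRecognizer.createFromOptions(context, options)
    val result = recognizer.recognize(BitmapImageBuilder(frame).build())
    recognizer.close()
    // gestures() holds one list of categories per detected hand; take the top one.
    return result.gestures().firstOrNull()?.firstOrNull()?.categoryName()
}
```

For live camera input you would use RunningMode.LIVE_STREAM with a result listener rather than recognizing one bitmap at a time.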

Gesture-based interactions shine when users are interacting with large displays, in conditions where you'd want to avoid touch for sterility purposes, and for populations with speech or hearing impairments. Avoid using gestures in public places, where the subject would be hard to separate from a crowd, and in interactions where errors can have life-threatening impact. As with speech interfaces, you have to determine the necessary accuracy threshold to avoid excluding populations relying solely on gesture interactions.

Brain-Computer Interfaces (BCI): BCIs are a novel interaction method in which sensors track thoughts and brain impulses to determine what a user wants to do. BCIs allow people to control devices using their thoughts by translating brain signals into commands for hardware or software, such as robotic arms or computers. BCIs use electrodes to detect brain signals, from the scalp or from the surface of the brain (the cerebral cortex), and then analyze those signals to determine what commands to relay to an output device.
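Because BCI toolchains vary widely, here is a purely illustrative Kotlin sketch of that pipeline. None of these types come from a real SDK (OpenBCI and similar platforms expose their own APIs), so treat it as the shape of the flow rather than working driver code.

```kotlin
// All types here are hypothetical, defined only to illustrate the pipeline:
// electrodes produce a signal, a decoder analyzes it, and a command is relayed.
data class EegSample(val channelMicrovolts: DoubleArray, val timestampMs: Long)

enum class BciCommand { SELECT, NEXT, PREVIOUS, NONE }

interface SignalSource { fun read(): EegSample }                             // electrodes
interface SignalDecoder { fun decode(window: List<EegSample>): BciCommand }  // analysis

fun runBciLoop(
    source: SignalSource,
    decoder: SignalDecoder,
    dispatch: (BciCommand) -> Unit  // relay to an output device (cursor, robotic arm, ...)
) {
    val window = ArrayDeque<EegSample>()
    while (true) {
        window.addLast(source.read())
        if (window.size > 250) window.removeFirst() // keep roughly 1 s of data at 250 Hz
        val command = decoder.decode(window.toList())
        if (command != BciCommand.NONE) dispatch(command)
    }
}
```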

BCIs are still experimental, but they hold vast potential. For example, they could help people with paralysis control their limbs, or allow those with speech or hearing impairments to communicate with more people. A major risk to note is that a direct connection between brain and machine raises significant privacy and security concerns that must be considered when developing these interfaces.

How We Build for the Future

Given the new technologies and interaction patterns emerging, how can Android developers prepare and build for this future? Here are some tools we’ll explore:

  1. MediaPipe: An open source framework from Google that enables machine learning capabilities for applications that use voice, computer vision, and text, and can be used on mobile, web, desktop, edge devices, and IoT.

  2. ARCore: A cross-platform augmented reality (AR) SDK from Google that allows developers to build immersive experiences on Android, iOS, Unity, and Web (a minimal session-setup sketch follows this list).

  3. OpenBCI: A collection of libraries (including Java and Kotlin) to help build brain-computer interfaces into your applications.
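As a taste of item 2, here is a minimal ARCore session-setup sketch in Kotlin. It assumes the com.google.ar:core dependency, an ARCore-supported device with Google Play Services for AR installed, and that camera permission and install prompts are handled elsewhere.

```kotlin
import android.app.Activity
import com.google.ar.core.Config
import com.google.ar.core.Session

// Create and configure an ARCore session. Depth and plane detection make it easier
// to anchor content to the world and, for accessibility, to describe surroundings.
fun createArSession(activity: Activity): Session {
    val session = Session(activity)
    val config = Config(session).apply {
        planeFindingMode = Config.PlaneFindingMode.HORIZONTAL_AND_VERTICAL
        if (session.isDepthModeSupported(Config.DepthMode.AUTOMATIC)) {
            depthMode = Config.DepthMode.AUTOMATIC
        }
    }
    session.configure(config)
    return session
}
```

A render loop would then resume the session and call session.update() each frame to obtain camera and tracking state.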

Stay tuned for the next post in this series where we’ll dig into ways MediaPipe can help us build a more accessible future with emerging technologies.

Conclusion

We are all part of building a future where no one is excluded from engaging with our products. Build your application to immerse people: let them take full part in it, not just as a user or an actor in your system, but as someone who is delighted by an experience that feels tailor-made for them.