### Our Research Direction: Designing for Accessibility
Digital equity means ensuring that every user has fair access to online resources. Our early research, however, has surfaced a significant obstacle to that goal: the “accessibility gap,” the lag between the release of new features and the creation of the assistive layers that make those features usable for people with disabilities. In response, we are championing a shift from purely reactive tools to agentic systems that integrate directly with user interfaces.
### Research Pillar: Improving Accessibility with Multi-Agent Systems
At the heart of our accessible-technology initiatives are multimodal, multi-agent AI systems, which are well suited to building interfaces that adapt to diverse needs. One of our prototypes applies this approach to web readability: a design built around a central Orchestrator that acts as a strategic reading manager.
#### The Orchestrator’s Role
The Orchestrator simplifies navigation by maintaining a shared understanding of the document at hand, letting users bypass complex menus in favor of more direct interactions. By delegating tasks to specialized sub-agents, the design makes content more accessible:
- **The Summarization Agent**: Processes long or intricate documents, distilling them into manageable pieces and surfacing the crucial insights.
- **The Settings Agent**: Manages user-interface adjustments, such as text scaling, on the fly, so users don’t have to fumble with static menus; the system adapts to their needs in real time.
Our research indicates that this modular approach makes interactions noticeably smoother: because each specialized task is routed to the appropriate agent, users can engage with technology more intuitively, sidestepping the frustration of hunting for the “correct” button on their devices.
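The delegation pattern described above can be sketched in a few lines of Python. This is an illustrative toy, not the prototype’s actual code: the class names, the `dispatch` method, and the naive one-sentence “summary” are all placeholders standing in for real model calls and UI hooks.

```python
# Minimal sketch of an Orchestrator routing requests to sub-agents.
# All names and behaviors here are illustrative placeholders.

class SummarizationAgent:
    """Sub-agent that condenses long passages (stubbed here)."""
    def handle(self, request: dict) -> str:
        text = request["text"]
        # Stand-in for a real summarization model:
        # return the first sentence as a naive "summary".
        return text.split(". ")[0] + "."

class SettingsAgent:
    """Sub-agent that applies UI adjustments such as text scaling."""
    def __init__(self):
        self.settings = {"text_scale": 1.0}

    def handle(self, request: dict) -> str:
        self.settings[request["setting"]] = request["value"]
        return f"{request['setting']} set to {request['value']}"

class Orchestrator:
    """Keeps a shared view of the current document and routes each
    user intent to the appropriate specialized sub-agent."""
    def __init__(self, document: str):
        self.document = document  # shared context for all agents
        self.agents = {
            "summarize": SummarizationAgent(),
            "settings": SettingsAgent(),
        }

    def dispatch(self, intent: str, **params) -> str:
        request = {"text": self.document, **params}
        return self.agents[intent].handle(request)

orch = Orchestrator("Screen readers parse the DOM. They then speak it aloud.")
print(orch.dispatch("summarize"))
print(orch.dispatch("settings", setting="text_scale", value=1.5))
```

The key design point is that the user expresses an intent once and the Orchestrator decides which agent acts on it, rather than the user navigating to the right menu themselves.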
### Toward Multimodal Fluency
Our exploration doesn’t stop at improved text readability. We are working to move beyond basic text-to-speech toward multimodal fluency. Leveraging Gemini’s ability to process voice, vision, and text concurrently, we have built prototypes that convert live video feeds into immediate, interactive audio descriptions.
#### Enhancing Situational Awareness
This approach goes beyond simple narration of visual content; it fosters situational awareness. Our co-design sessions have shown how letting users interactively query their surroundings can reshape their digital experience: asking for specific visual details in real time lets users engage with their environment directly and significantly reduces cognitive load. A passive viewing experience becomes a conversational exploration, empowering users to gather information actively in dynamic settings.
### Transforming Accessibility Engagement
Our commitment to bridging the accessibility gap represents a fundamental shift in how we approach design. By pairing strategic orchestration with multimodal engagement, we aim to change how users interact with technology and to make it accessible to everyone. Through ongoing research and development, we will keep pushing the boundaries of accessible design.

