Unveiling Granite 4.0 1B Speech: A Breakthrough in Automatic Speech Recognition
We’re thrilled to introduce Granite 4.0 1B Speech, the latest addition to IBM’s Granite Speech collection. The model is designed for enterprise applications that run on resource-constrained devices. Its compact architecture focuses on two tasks: multilingual Automatic Speech Recognition (ASR) and bidirectional Automatic Speech Translation (AST).
Key Features of Granite 4.0 1B Speech
One of the standout aspects of Granite 4.0 1B Speech is its optimized design. With half the parameters of its predecessor, granite-speech-3.3-2b, the new model offers improved English transcription accuracy and faster inference. Speculative decoding speeds up responses without sacrificing quality.
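To make the idea of speculative decoding concrete, here is a minimal toy sketch of its greedy variant. It is a generic illustration, not Granite's actual implementation: the function names, the integer "tokens", and the two toy models (`toy_target`, `toy_draft`) are all assumptions made up for this example. A cheap draft model proposes a few tokens, the larger target model checks them, and only tokens the target agrees with are kept.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # maps a token context to the next token


def greedy_decode(next_token: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens))
    return tokens


def speculative_greedy_decode(target_next: NextToken, draft_next: NextToken,
                              prompt: List[int], max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding with a draft model proposing up to k tokens per round."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The draft model cheaply proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, limit - len(tokens))):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks each proposed token. Real systems score all
        #    positions in one batched forward pass; this toy loops for clarity.
        for i, tok in enumerate(proposal):
            expected = target_next(tokens + proposal[:i])
            if expected != tok:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            tokens.extend(proposal)  # every proposed token was accepted
    return tokens[:limit]


def toy_target(context: List[int]) -> int:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except after token 4."""
    return 0 if context[-1] == 4 else (context[-1] + 1) % 10
```

Because every kept token is either confirmed or produced by the target model, this greedy variant returns the same sequence as decoding with the target alone. The speedup in practice comes from the target verifying several draft tokens per pass instead of generating them one at a time.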
Granite 4.0 1B Speech now supports a broader range of languages, including English, French, German, Spanish, Portuguese, and Japanese. The addition of Japanese ASR support is particularly noteworthy, as it was a highly requested feature in the community. The model also introduces keyword list biasing, which improves its recognition of names and acronyms.
Performance Metrics
Granite 4.0 1B Speech recently ranked #1 on the OpenASR leaderboard. This ranking reflects its strong performance among open speech recognition systems.
ASR accuracy is commonly measured with Word Error Rate (WER): the number of word substitutions, deletions, and insertions in a transcript, divided by the number of words in the reference, usually reported as a percentage. A lower WER indicates better accuracy. As illustrated in Chart 1, Granite 4.0 1B Speech achieves low WER across various datasets while maintaining a lower parameter count than many of its competitors.
Chart 1: Granite 4.0 1B Speech achieves exceptionally low WER, demonstrating robust ASR accuracy across numerous benchmarks despite being a compact model.
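For readers who want to see how WER is computed, here is a minimal sketch using word-level edit distance. It assumes plain whitespace tokenization with no text normalization; leaderboards typically normalize casing and punctuation first, so this is not the exact scoring pipeline behind Chart 1. The function name `word_error_rate` is our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, if the reference is "the cat sat on the mat" and the transcript is "the cat sat on mat", one word out of six is dropped, so the WER is 1/6, or about 16.7%. Since insertions count as errors, WER can exceed 100% when the transcript contains many extra words.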
Compatibility and Licensing
Like its predecessors, Granite 4.0 1B Speech is released under the Apache 2.0 license, a permissive license that lets developers and enterprises use, modify, and integrate the model in their applications, including commercial ones. Native support is available in both Transformers and vLLM, making the model accessible for a wide range of uses.
Our evaluation spans a range of standard ASR and AST benchmarks across English, multilingual, and translation tasks. Granite 4.0 1B Speech consistently performs comparably to models with significantly higher parameter counts.
Recommended Usage
In production settings, we recommend pairing Granite 4.0 1B Speech with Granite Guardian. This combination adds risk detection capabilities, so applications can use the model while maintaining safety and compliance.
Try It Out!
Ready to experience the innovation of Granite 4.0 1B Speech? Don’t hesitate to experiment with the model and share your feedback. Your input is invaluable as we continue to enhance and refine our offerings.
Explore the future of speech recognition technology today!

