The Language of Technology: How Transcription Shapes Communication
Earlier this year, I had the privilege of giving a talk about my research at Oxford’s All Souls College. To complement the presentation, I collaborated with a chef to design a unique menu inspired by my work in southwest Western Australia. As I pondered the menu, I typed “Boorloo,” the Nyungar name for the City of Perth, into my device. Autocorrect, however, had other ideas and promptly replaced it with “Barolo.” Amusing as it was, this slip offered a deeper insight into the complexities of language technologies and their inherent biases.
The Hidden Bias of Autocorrect
This miscorrection highlights how increasingly ubiquitous technologies like autocorrect and speech recognition are rooted in mainstream English data. The algorithms behind these technologies prioritize familiar terms and phrases, glossing over cultural or regional vocabulary. In this instance, the dictionary’s ignorance of “Boorloo” reflects a broader tendency to privilege certain dialects while marginalizing others. This is not merely a trivial occurrence; it underscores the need to recognize and address the ways in which language technologies shape our understanding of the world.
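To see how this erasure can happen mechanically, here is a minimal sketch of a dictionary-based corrector. Everything in it is an illustrative assumption: the toy vocabulary, and the use of plain edit distance as the similarity measure. Real keyboard autocorrect uses far richer models (keyboard geometry, word frequency, personalization), but the core failure mode is the same: a word absent from the vocabulary gets silently mapped to its nearest in-vocabulary neighbour.

```python
# Toy model of autocorrect erasing an out-of-vocabulary word.
# The vocabulary and the plain-Levenshtein approach are illustrative
# assumptions, not how any particular phone keyboard actually works.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def autocorrect(word: str, vocabulary: set[str]) -> str:
    """Keep in-vocabulary words; replace anything else with its nearest neighbour."""
    if word in vocabulary:
        return word
    return min(vocabulary, key=lambda v: levenshtein(word, v))

vocab = {"barolo", "perth", "menu", "college"}  # "boorloo" is missing
print(autocorrect("boorloo", vocab))  # prints "barolo": the unfamiliar word is silently replaced
```

Because the corrector can only ever answer from its vocabulary, any word the dictionary never learned is not merely flagged; it is rewritten into something the system already knows.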
Understanding the Dynamics of Transcription
The process of transcription, particularly in automatic speech recognition, is far from straightforward. It involves more than just listening and writing down words. Each transcription protocol includes specific assumptions about what constitutes “standard” speech. Renowned linguist Mary Bucholtz eloquently noted that “all transcripts take sides.” The implications are significant: the linguistic choices made during transcription can affect not only how a presentation is perceived but also its content and credibility.
Recent research from Cornell University and Carnegie Mellon illustrates this phenomenon: viewers shown error-ridden, automatically generated subtitles rated speakers as less clear and knowledgeable than viewers who received accurate captions. The discrepancies in transcription quality directly shaped audience perceptions, underscoring the growing importance of accurate transcription.
The Stakes for First Nations Communities
The implications of transcription errors are particularly grave for First Nations people in Australia. The disconnect between conventional transcription practices and authentic communication methods can lead to significant misunderstandings. In many Indigenous communities, elements like pauses and silences carry immense communicative weight. In settings like Wadeye, for instance, silence is not merely an absence of sound; it plays a critical role in conveying meaning.
However, transcription systems, often developed in northern hemisphere academic contexts, typically treat these silences as gaps that need to be filled. This tendency to edit out pauses strips them of their significance, resulting in a loss of essential cultural context.
Furthermore, common words from non-English languages—such as “Boorloo”—are frequently misrepresented or overlooked. This can lead to serious consequences, especially in high-stakes environments, where inaccuracies can impact legal decisions, medical diagnoses, and social welfare.
Today, the rise of artificial intelligence (AI) in transcription poses additional challenges. Hospitals and general practices across Australia increasingly rely on AI scribes, which can introduce errors into critical medical records. A recent study of multiple AI systems found frequent mistakes in transcription and documentation: around 50% of the analyzed samples contained factual inaccuracies, and “hallucinations” (fabricated content) were a common occurrence. In one instance, a male patient was mistakenly recorded as using contraceptive pills, illustrating the dangers of automated systems in sensitive contexts.
Toward Inclusive Transcription Practices
To mitigate these issues, the development of more diverse and inclusive models for automatic speech recognition is essential. However, those of us currently tasked with transcription in various fields—whether in journalism, oral history, legal contexts, or sociolinguistic research—must take specific steps to enhance our practices.
- Transparency of Conventions: It is imperative to make transcription conventions explicit. By doing so, we acknowledge what our systems can and can’t represent, allowing for greater clarity and understanding.
- Recognizing Limitations: We must resist the urge to normalize all speech into something that aligns with an imagined standard. Recognizing the limitations of our transcription methods helps in acknowledging the richness and complexity of diverse languages and dialects.
- Accountability in Representation: Transcribing spoken language into written form is inherently a decision-making process. The goal should not be achieving perfect objectivity but rather ensuring accountability for choices made about what gets included and excluded in the transcript.
In today’s digital age, the intersection of language and technology warrants thoughtful consideration. By addressing biases and emphasizing transparency, we can make strides toward a more equitable representation of diverse voices. The journey begins with recognizing the significance of every word, pause, and silence in our communications.

