But few people had the language proficiency to transcribe the audio manually. Inspired by voice assistants like Siri, Mahelona started researching natural-language processing. “It became absolutely necessary to teach computers to speak Māori,” says Jones.
But Te Hiku faced a chicken-and-egg problem. To build a te reo Māori speech recognition model, it needed a large amount of transcribed audio. To transcribe that audio, it needed the advanced speakers whose small numbers it was trying to compensate for in the first place. There were, however, many beginner and intermediate speakers who could read te reo words aloud better than they could recognize them in a recording.
So Jones and Mahelona, along with Te Hiku COO Suzanne Duncan, came up with a clever solution: instead of transcribing existing audio, they would ask people to record themselves reading a series of sentences designed to capture all the sounds in the language. For an algorithm, the resulting dataset would serve the same function. From those thousands of pairs of spoken and written sentences, it would learn to recognize te reo Māori syllables in audio.
The team announced a contest. Jones, Mahelona, and Duncan contacted every Māori community group they could find, including traditional kapa haka dance troupes and waka ama racing teams, and revealed that the team that submitted the most recordings would win a grand prize of $5,000.
The entire community mobilized. Competition heated up. One Māori community member, Te Mihinga Komene, an educator and advocate for using digital technology to revitalize te reo Māori, recorded 4,000 phrases alone.
Money was not the only motivator. People believed in Te Hiku’s vision and trusted it to protect their data. “Te Hiku Media says, ‘What you give us, we are here as kaitiaki [guardians]. We take care of it, but you still own your sound,’” says Te Mihinga. “That’s important. Those values define us as Māori.”
Within 10 days, Te Hiku had amassed 310 hours of speech-text pairs from about 200,000 audio recordings made by about 2,500 people, a level of engagement unheard of among researchers in the AI community. “Nobody could have done it except a Māori organization,” says Caleb Moses, a Māori data scientist who was involved in the project.
The amount of data was still small compared with the thousands of hours typically used to train English language models, but it was enough to get started. Using the data to bootstrap an existing open-source model from the Mozilla Foundation, Te Hiku created its first te reo Māori speech recognition model, with 86% accuracy.