(CNN Business) - While Google’s latest smartphone, the Pixel 4, got the attention at a major launch event in New York on Tuesday, the unveiling of a recording and transcription app powered by artificial intelligence was perhaps the biggest surprise of the day.
The Recorder app is designed to record meetings, music, conferences and more. It can recognize and transcribe what you are saying in real time and identify other types of sound, such as music and applause. You can also search your recordings for specific words. For example, you can search for “rainbows” and get results that show where that word was spoken in each recording.
Recorder will come with the new Pixel phone, part of Google's flagship phone line, which showcases the latest features of its Android operating system. The phone starts at $799, or $100 more than the base iPhone 11 model, and arrives in stores on October 24. Starting in December, Google will also roll out Recorder to older Pixel phone models.
While Recorder may sound like a fairly simple application, Sherry Lin, product manager of Recorder, told CNN Business that it was not easy to make its fast transcription work without draining the phone's battery. Google had to figure out how to fit onto the phone a lot of artificial intelligence that usually sits on a remote server.
Lin said in an interview on Tuesday that, when the team started, they honestly were not sure they could pull it off.
As countless journalists and university students know, there are many apps for recording audio on your smartphone, and some of them, such as Otter.ai, use artificial intelligence to turn conversations into transcripts, allowing you to do things like search the resulting recordings. In general, though, if you want to do more than simply record a conversation, you will need an Internet connection. That is because much of the artificial intelligence involved in analyzing and transcribing, say, a fascinating lecture on the Hegelian dialectic tends to run on a distant server, not on your smartphone.
A demonstration of the new Google Recorder application, which will come on the new Pixel phone.
To show that Recorder works on the phone itself, Sabrina Ellis, vice president of product management at Google, pointed out during an onstage demonstration of the application on Tuesday that the phone was in airplane mode.
Lin said there are two reasons for keeping all of Recorder's operations on the phone: to help protect user privacy by keeping audio and the related text on the device, and to convert speech to text faster than a round trip to a remote server would allow.
However, making the application usable on a phone was complicated, partly because it relies on multiple artificial intelligence components that can drain the phone's battery and bog down its main processor. These include one model aimed specifically at transcription (a retrained and restructured version of the model that powers the Google Assistant), one that handles search, another that inserts punctuation into transcripts and another that classifies sounds other than speech.
Lin said that when she and her team started working on the application in March, the transcription model, which makes up most of the application's artificial intelligence, drained the phone's battery in less than half an hour and made the phone heat up.
"We thought: 'We will never arrive unless we send an air conditioning unit with that thing,'" she joked.
At first, the software also froze the phone and was simply too large to deliver to consumers through Google Play, the company's online app store.
To shrink the artificial intelligence behind the application, Lin said, the team "cut" the transcription model and trained it to capture long-form speech and to ignore background noise. The training was done essentially by feeding the model long recordings of things like meetings, interviews and YouTube conferences.
Lin said the application does not use remote workers to listen to user recordings, a long-standing industry practice with virtual assistants that has been changing following media scrutiny over privacy concerns. (An exception could be if a user reports an error, such as a strange static sound, and gives explicit permission for the company to listen to a recording, she said.)
According to Lin, by default the application saves all recordings and transcriptions on the phone, and the data is covered by the Android device's standard encryption. The company cannot see any data related to a recording unless the user chooses to export it to a Google product such as Google Drive or Gmail, she said.
One thing the Recorder team is working on now is figuring out who is speaking when there is more than one voice in a recording, Lin said. Currently, the application transcribes all audio as if a single person were speaking, and the team wants to work out how to segment the transcribed speech by speaker.
"It's one of those things where it is so easy for humans to do and so difficult for a computer system," she said.