Tue. 01/10 – Now We Have VALL-E To Take Over My Podcast Voice

Tech Brew Ride Home

Amalgamated Internets, LLC

Technology, News, Tech News

4.7 • 1K Ratings

🗓️ 10 January 2023

⏱️ 20 minutes

Summary

Well, now there’s VALL-E, a text-to-speech technology that could fully replace me as this podcast narrator. It looks like Microsoft wants to do everything just short of buying OpenAI entirely. More layoffs at Coinbase. Why the whole 5G interfering with airplanes thing still isn’t resolved. And not everything that says it’s ChatGPT is really ChatGPT.

Sponsors:
RefundsPro.com

Links:
Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio (Ars Technica)
Microsoft eyes $10 billion bet on ChatGPT (Semafor)
Buy with Prime, which brings Prime to third-party sites, officially launches in U.S. on Jan. 31 (TechCrunch)
Coinbase to slash 20% of workforce in second major round of job cuts (CNBC)
FAA giving airlines another year to fix altimeters that can’t handle 5G signals (Ars Technica)
Sketchy ChatGPT App Soars Up App Store Charts, Charges $7.99 Weekly Subscription (MacRumors)
YouTube Experiment 1
YouTube Experiment 2

Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript

0:00.0	Welcome to the Techmeme Ride Home for Tuesday, January 10th, 2023. I'm Brian McCullough. Today:
0:09.0	Well, now there's VALL-E, a text-to-speech technology that could fully replace me as this podcast narrator.
0:16.6	It looks like Microsoft wants to do everything just short of buying OpenAI entirely.
0:21.9	More layoffs at Coinbase. Why the whole 5G interfering with airplanes thing still
0:26.4	isn't resolved. And not everything that says it's ChatGPT is really ChatGPT. Here's what you missed today in the world of tech.
0:35.0	Well, it seems as though once again my instinct to investigate deeper into a topic was perfectly timed.
0:44.6	Microsoft has unveiled VALL-E,
0:47.3	a text-to-speech AI model trained on 60,000 hours of English speech that can simulate a person's voice from just three seconds of sample audio.
0:57.0	Quoting Ars Technica.
0:59.0	Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything and do it in a way that
1:07.4	attempts to preserve the speaker's emotional tone. Its creators speculate that
1:12.1	VALL-E could be used for high-quality text-to-speech applications,
1:16.0	speech editing where a recording of a person could be edited and changed from a text transcript,
1:22.0	making them say something they originally didn't,
1:26.3	and audio content creation when combined with other generative AI models like GPT-3.
1:33.6	Microsoft calls VALL-E a neural codec language model,
1:38.7	and it builds off of a technology called EnCodec,
1:42.0	which Meta announced in October 2022.
1:45.0	Unlike other text-to-speech methods that typically synthesize speech by manipulating waveforms,
1:51.0	VALL-E generates discrete audio codec codes from text and acoustic
1:56.3	prompts.
1:57.6	It basically analyzes how a person sounds, breaks that information into discrete components called tokens, thanks to EnCodec,
	...

Disclaimer: The podcast and artwork embedded on this page are from Amalgamated Internets, LLC, and are the property of its owner and not affiliated with or endorsed by Tapesearch.

Generated transcripts are the property of Amalgamated Internets, LLC and are distributed freely under the Fair Use doctrine. Transcripts generated by Tapesearch are not guaranteed to be accurate.