hpr3219 :: Linux Inlaws S01E18: Voice Recognition and Text to Speech
How to place fake prank calls into podcasts, and what TTS has to do with it
Hosted by monochromec on Thursday 2020-12-03. Flagged as Explicit and released under a CC-BY-SA license.
Tags: voice recognition, text to speech, wavenet, tacotron 2, DeepSpeech, Lyrebird.
Part of the series: Linux Inlaws
This is Linux Inlaws, a series on free and open source software, black humour, the revolution and freedom in general (this includes ideas and software) and generally having fun.
In this episode, Chris is harassed by quite a few artificial nuisance callers, among them drug lords, Irish nurses and some random Linux Inlaws Chief Financial Officer. Based on these examples, our two heroes discuss the history and current state of text-to-speech (TTS) and voice recognition. We attempted to use voice recognition software to produce a transcript of the show.
- WaveNet: https://deepmind.com/blog/article/wavenet-generative-model-raw-audio
- Tacotron 2: https://ai.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html
- DeepSpeech: https://github.com/mozilla/DeepSpeech
- Lyrebird / Welcome.AI: https://www.welcome.ai/lyrebird
- Nvidia Tacotron 2: https://github.com/NVIDIA/tacotron2
- TensorFlow: https://www.tensorflow.org
- PyTorch: https://pytorch.org
- Mel spectrograms: https://medium.com/analytics-vidhya/understanding-the-mel-spectrogram-fca2afa2ce53
- Graphcore: https://www.graphcore.ai
- FPGA: https://en.wikipedia.org/wiki/Field-programmable_gate_array
- IBM ROMP: https://en.wikipedia.org/wiki/IBM_ROMP
- Google's TTS: https://cloud.google.com/text-to-speech
- Apple M1: https://www.gsmarena.com/the_apple_m1_is_the_first_armbased_chipset_for_macs_with_the_fastest_cpu_cores_and_top_igpu-news-46222.php
- Secure Enclaves: https://support.apple.com/guide/security/secure-enclave-overview-sec59b0b31ff/web
- OSDU: https://www.opengroup.org/osdu/forum-homepage
- Jack Kerouac's On the Road: https://en.wikipedia.org/wiki/On_the_Road
Automatically generated using whisper
whisper --model tiny --language en hpr3219.wav