Text-to-Speech and Decoder

Speech recognition for Hindi language

Text-to-Speech model for the Hindi language (Flipkart):- The aim of this project was to generate Hindi speech audio from given text sentences. I started this project while working at Flipkart. Key learnings:

  • Unsupervised generative models
  • Autoregressive models
  • Flow (& inverse Flow) based models
  • Papers I’ve read - Link

C++ decoder for Speech recognition engine (Flipkart):- Worked on the decoder module of the Automatic Speech Recognition (ASR) pipeline. Key responsibilities:

  • Implemented new features in the production C++ codebase.
  • Reduced latency and memory consumption.
  • Blog post 1 - Intro to CTC Loss
  • Blog post 2 - DP-based prefix beam search PP
Image reference link. This image is only for a general overview of the ASR pipeline and doesn't reflect the actual work.
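
For readers unfamiliar with the topic of the second blog post, here is a minimal sketch of CTC prefix beam search written in Python for brevity. It is an illustration only, not the production C++ decoder: the function names (`ctc_prefix_beam_search`, `logsumexp`), the input layout (one list of per-symbol log-probabilities per frame), and the absence of a language model are all assumptions made for this example.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_width=4, blank=0):
    """Hypothetical helper: decode one utterance without a language model.

    log_probs: list of frames, each a list of log-probabilities per symbol.
    Returns (best_prefix, log_probability_of_best_prefix).
    """
    # Each prefix tracks two scores: log-prob of all alignments ending in
    # blank, and log-prob of all alignments ending in a non-blank symbol.
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for s, lp in enumerate(frame):
                if s == blank:
                    # A blank keeps the prefix unchanged.
                    n_b, n_nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(n_b, p_b + lp, p_nb + lp), n_nb)
                    continue
                new_prefix = prefix + (s,)
                n_b, n_nb = next_beams[new_prefix]
                if prefix and prefix[-1] == s:
                    # Repeated symbol: it extends the prefix only if the
                    # previous alignment ended in blank ...
                    next_beams[new_prefix] = (n_b, logsumexp(n_nb, p_b + lp))
                    # ... otherwise it collapses into the same prefix.
                    s_b, s_nb = next_beams[prefix]
                    next_beams[prefix] = (s_b, logsumexp(s_nb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (n_b, logsumexp(n_nb, p_b + lp, p_nb + lp))
        # Keep only the beam_width most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])
    best_prefix, scores = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix), logsumexp(*scores)
```

With two frames that each assign probability 0.6 to blank and 0.4 to a single symbol, picking the most likely symbol per frame yields an empty output (probability 0.36), whereas summing over alignments as above favours the one-symbol output (probability 0.64). This merging of alignments that collapse to the same prefix is the dynamic-programming step the search relies on.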