Ashutosh Chaubey

I am a founding data scientist at Anoki AI, where I work on text-to-speech and multimodal audio models. I am a Bronze Medallist from the 2021 batch of the Indian Institute of Technology, Roorkee.

At Anoki, I have been working on text-to-speech and on content retrieval using audio and text. Prior to this, I was a Data Scientist at LG Ad Solutions, where I worked on speaker recognition, audio-based automatic content recognition, and voice cloning. In the past, I interned at Adobe Research, where I worked with Dr. Sumit Shekhar on active learning for content labelling in documents. I also interned at the Video Analytics Lab, IISc Bengaluru, where I worked with Prof. R. Venkatesh Babu on human pose estimation from a single RGB image. During my undergraduate studies, I worked with Prof. R. Balasubramanian on automatic evaluation of machine-synthesized speech.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  GitHub

I'm interested in speaker-specific properties of speech, security in speech processing and developing data-constrained speech processing systems.

Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose
To appear in IEEE ASRU, 2023
preprint / cite

We propose two novel approaches for imposter identification in unseen speaker recognition. The first uses speaker-specific thresholding; the second relies on meta-learning to decouple the problem of imposter identification from speaker identification.

Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose
Interspeech, 2022
poster / paper / cite

Inspired by their use in computer vision, we apply relation networks to the task of speaker recognition and propose enhancements in the form of global supervision and a faster training regime.

OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis
Sumit Shekhar, Bhanu Prakash Reddy Guda, Ashutosh Chaubey, Ishan Jindal, Avneet Jain
CVPR Workshops, 2022
paper / patent / cite

We propose a reinforcement policy-based active learning approach for document content labelling tasks such as object detection, layout detection, and named entity recognition.

Universal Adversarial Perturbations: A Survey
Ashutosh Chaubey*, Nikhil Agrawal*, Kavya Barnwal, Keerat K. Guliani, Pramod Mehta
Survey paper, arXiv 2020
paper / cite

We present a comprehensive survey on universal adversarial perturbations, covering both attacks and defenses, along with future directions for the topic.

A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech
Ashutosh Chaubey*, Jaynil Jaiswal*, Sasi Kiran Reddy Bhimvarapu, Shashank Kashyap, Puneet Kumar, Balasubramanian Raman, Partha Pratim Roy
ACPR, 2019
paper / cite

We propose a technique that leverages the discriminator from a GAN-based TTS model for automatic evaluation of machine-synthesized speech.

Indian Institute of Technology Roorkee
Bachelor of Technology in Computer Science & Engineering
July 2017 - May 2021, Grade Point - 9.718/10

Student Societies:

  • Chair | ACM IIT Roorkee Chapter (Link)
  • Co-President | Vision and Language Group (Link)
  • Mentor | Student Mentorship Programme (Link)


Presented my paper "A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech" at the Student Academic Conference at Inter IIT Tech Meet 2019.

My hobbies include cooking and playing the guitar. I am obsessed with the songs of Prateek Kuhad and Anuv Jain.

