Ashutosh Chaubey

I am a CS PhD student at the Institute for Creative Technologies, University of Southern California, where I am advised by Prof. Mohammad Soleymani at the Intelligent Human Perception Lab. I am a Bronze Medallist from the 2021 batch of the Indian Institute of Technology, Roorkee.

Prior to this, I was a Founding Research Engineer at Anoki AI, where I worked on projects related to content retrieval using audio/text and text-to-speech. I have also worked at LG Ad Solutions as a Data Scientist, where I worked on speaker recognition, audio-based automatic content recognition, and voice cloning. In the past, I interned at Adobe Research, where I worked with Dr. Sumit Shekhar on active learning for content labelling in documents. I also interned at the Video Analytics Lab, IISc Bengaluru, where I worked with Prof. R. Venkatesh Babu on human pose estimation from a single RGB image. As an undergraduate, I worked with Prof. R. Balasubramanian on automatic evaluation of machine-synthesized speech.

I am always eager to collaborate on research with people in academia as well as industry. Please reach out to achaubey at usc dot edu to discuss potential collaborations.
Currently, I am looking for Research/Applied Scientist internship positions for Summer 2025. Please reach out if you have open positions.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  GitHub

Masters/Undergrad Students

If you are a student and would like to discuss my papers or how to apply for a PhD program in the US, please email me at achaubey at usc dot edu.

For students who wish to join our lab, please check our lab's open positions.

News
  • Aug '24 - I joined the Intelligent Human Perception Lab @ Institute for Creative Technologies, USC.
  • Apr '24 - I will be starting my PhD @ University of Southern California this Fall. Fight on!
  • Sep '23 - One paper has been accepted at ASRU, 2023. See you in Taipei!
  • Apr '23 - I have joined Anoki Inc. as a Founding Research Engineer.
  • Sep '22 - I will be at Interspeech 2022 in Incheon, Korea.
  • Jun '22 - One paper has been accepted at Interspeech, 2022!
  • Jul '21 - I have started my industry experience by joining LG Ad Solutions as a Data Scientist. On to new challenges!
Research

I have experience working on speaker recognition and speech processing systems. More recently, I have been working on computer vision, where I deal with problems related to multi-person interaction, behaviour generation and diffusion models.

Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose
IEEE ASRU, 2023
poster / paper / cite

Proposed two novel approaches for imposter identification in unseen speaker recognition. One of the approaches is speaker-specific thresholding and the other relies on meta-learning to decouple the problem of imposter identification from speaker identification.

Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose
Interspeech, 2022
poster / paper / cite

Inspired by their use in computer vision, we use relation networks for the task of speaker recognition and propose enhancements in terms of global supervision and a faster training regime.

OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis
Sumit Shekhar, Bhanu Prakash Reddy Guda, Ashutosh Chaubey, Ishan Jindal, Avneet Jain
CVPR Workshops, 2022
paper / patent / cite

We propose a reinforcement policy-based active learning approach for document content labelling tasks such as object detection, layout detection and named entity recognition.

Universal Adversarial Perturbations: A Survey
Ashutosh Chaubey*, Nikhil Agrawal*, Kavya Barnwal, Keerat K. Guliani, Pramod Mehta
Survey paper, arXiv 2020
paper / cite

We present a comprehensive survey on universal adversarial perturbations, covering both attacks and defenses, along with future directions for the topic.

A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech
Ashutosh Chaubey*, Jaynil Jaiswal*, Sasi Kiran Reddy Bhimvarapu, Shashank Kashyap, Puneet Kumar, Balasubramanian Raman, Partha Pratim Roy
ACPR, 2019
paper / cite

We propose a technique that leverages the discriminator from a GAN-based TTS model for automatic evaluation of machine-synthesized speech.

Education
University of Southern California
PhD, Computer Science
August 2024 - Present

Graduate Researcher - Intelligent Human Perception Lab, Institute for Creative Technologies

Indian Institute of Technology Roorkee
BS, Computer Science
July 2017 - May 2021, Grade Point - 9.718/10

Student Societies:

  • Chair | ACM IIT Roorkee Chapter (Link)
  • Co-President | Vision and Language Group (Link)
  • Mentor | Student Mentorship Programme (Link)

Miscellaneous

Presented my paper A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech at the Student Academic Conference at Inter IIT Tech Meet 2019.

My hobbies include cooking and playing guitar. I am obsessed with the songs of Prateek Kuhad and Anuv Jain.


