Ashutosh Chaubey
I am a CS PhD student at the Institute for Creative Technologies, University of Southern California, where I am advised by Prof. Mohammad Soleymani at the Intelligent Human Perception Lab. I am a Bronze Medallist from the 2021 batch of the Indian Institute of Technology, Roorkee.
Prior to this, I was a Founding Research Engineer at Anoki AI where I worked on projects related to content retrieval using audio/text and text-to-speech.
I have also worked at LG Ad Solutions as a Data Scientist,
where I worked on speaker recognition, audio-based automatic content recognition, and voice cloning.
In the past, I interned at Adobe Research,
where I worked with Dr. Sumit Shekhar
on active learning for content labelling in documents.
I have also interned at the Video Analytics Lab, IISc Bengaluru,
where I worked with Prof. R. Venkatesh Babu
on human pose estimation from a single RGB image. Back at my undergraduate college,
I worked with Prof. R. Balasubramanian
on automatic evaluation of machine synthesized speech.
I am always eager to collaborate on research with people in academia as well as industry. Please reach out to achaubey at usc dot edu to discuss potential collaborations.
Currently, I am looking for Research/Applied Scientist internship positions for Summer 2025. Please reach out if you have open positions.
Email  / 
CV  / 
Google Scholar  / 
Linkedin  / 
Github
|
|
Masters/Undergrad Students
If you are a student and would like to discuss my papers or how to apply for a PhD program in the US, please email me at achaubey at usc dot edu.
For students who wish to join our lab, please check our lab's open positions.
News
- Aug '24 - I joined the Intelligent Human Perception Lab @ Institute for Creative Technologies, USC.
- Apr '24 - I will be starting my PhD @ University of Southern California this Fall. Fight on!
- Sep '23 - One paper has been accepted at ASRU, 2023. See you in Taipei!
- Apr '23 - I have joined Anoki Inc. as a Founding Research Engineer.
- Sep '22 - I will be at Interspeech 2022 in Incheon, Korea.
- Jun '22 - One paper has been accepted at Interspeech, 2022!
- Jul '21 - I have started my industry experience by joining LG Ad Solutions as a Data Scientist. On to new challenges!
|
Research
I have experience working on speaker recognition and speech processing systems.
More recently, I have been working on computer vision, where I deal with problems related to multi-person interaction, behaviour generation, and diffusion models.
|
|
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Ashutosh Chaubey,
Sparsh Sinha,
Susmita Ghose
IEEE ASRU, 2023
poster
/
paper
/
cite
Proposed two novel approaches for imposter identification in unseen speaker recognition.
One of the approaches is speaker-specific thresholding and the other relies on meta-learning to decouple
the problem of imposter identification from speaker identification.
|
|
Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey,
Sparsh Sinha,
Susmita Ghose
Interspeech, 2022
poster
/
paper
/
cite
Inspired by their use in computer vision, we apply relation networks to speaker recognition and propose enhancements in the form of global supervision and a faster training regime.
|
|
OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis
Sumit Shekhar,
Bhanu Prakash Reddy Guda,
Ashutosh Chaubey,
Ishan Jindal,
Avneet Jain
CVPR Workshops, 2022
paper
/
patent
/
cite
We propose a reinforcement-policy-based active learning approach for document content labelling tasks such as object detection, layout detection, and named entity recognition.
|
|
Universal Adversarial Perturbations: A Survey
Ashutosh Chaubey*,
Nikhil Agrawal*,
Kavya Barnwal,
Keerat K. Guliani,
Pramod Mehta
Survey paper, arXiv 2020
paper
/
cite
We present a comprehensive survey of universal adversarial perturbations, covering both attacks and defenses, along with future directions for the topic.
|
|
A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech
Ashutosh Chaubey*,
Jaynil Jaiswal*,
Sasi Kiran Reddy Bhimvarapu,
Shashank Kashyap,
Puneet Kumar,
Balasubramanian Raman,
Partha Pratim Roy
ACPR, 2019
paper
/
cite
We propose a technique that leverages the discriminator from a GAN-based TTS model for automatic evaluation of machine synthesized speech.
|
|
University of Southern California
PhD, Computer Science
August 2023 - Present
Graduate Researcher - Intelligent Human Perception Lab, Institute for Creative Technologies
|
|
Indian Institute of Technology Roorkee
BS, Computer Science
July 2017 - May 2021, Grade Point - 9.718/10
Student Societies:
- Chair | ACM IIT Roorkee Chapter (Link)
- Co-President | Vision and Language Group (Link)
- Mentor | Student Mentorship Programme (Link)
|
|
|