Kartik Audhkhasi

I am a Research Staff Member in the Speech Group at IBM Research. My current research focuses on end-to-end (E2E), all-neural speech recognition and processing. E2E speech systems use deep recurrent or convolutional neural networks to solve the machine learning task at hand directly (e.g., recognizing words or detecting keywords in speech). In comparison, conventional systems combine several disjoint sub-systems, such as hidden Markov models and neural networks, to achieve the same objective. As a result, E2E systems are easier to build while performing as well as conventional systems.

I also work on other aspects of automatic speech recognition such as language modeling, distributed semantic word embeddings, and keyword search from speech. I have worked extensively on the IARPA Babel program for speech recognition and keyword search for low-resource languages.

My Ph.D. thesis, advised by Prof. Shrikanth Narayanan (SAIL, University of Southern California), was titled “A computational framework for diversity in ensembles for humans and machine systems”. Prior to this, I completed my B.Tech. in Electrical Engineering and M.Tech. in Information & Communication Technology at the Indian Institute of Technology, Delhi.

My research interests include automatic speech recognition, natural language processing, neural networks, machine learning, and big data.

Useful Links: Google Scholar Profile, CV, LinkedIn

Office Address:
1101 Kitchawan Road
IBM T. J. Watson Research Center
Yorktown Heights, NY 10598
Email: kaudhkha@us.ibm.com