ASR.2: Automatic Speech Recognition II |
Session Type: Poster |
Poster Time: Monday, December 18, 15:30 - 17:00 |
Location: Poster Area |
Session Chair: Thomas Hain, University of Sheffield
|
ASR.2.1: EXPLORING ARCHITECTURES, DATA AND UNITS FOR STREAMING END-TO-END SPEECH RECOGNITION WITH RNN-TRANSDUCER |
Kanishka Rao; Google Inc., United States |
Hasim Sak; Google Inc., United States |
Rohit Prabhavalkar; Google Inc., United States |
|
ASR.2.2: UNSUPERVISED ADAPTATION OF STUDENT DNNS LEARNED FROM TEACHER RNNS FOR IMPROVED ASR PERFORMANCE |
Lahiru Samarakoon; Hong Kong University of Science and Technology, China |
Brian Mak; Hong Kong University of Science and Technology, China |
|
ASR.2.3: EXPLORING NEURAL TRANSDUCERS FOR END-TO-END SPEECH RECOGNITION |
Eric Battenberg; Baidu SVAIL, United States |
Jitong Chen; Baidu SVAIL, United States |
Rewon Child; Baidu SVAIL, United States |
Adam Coates; Baidu SVAIL, United States |
Yashesh Gaur; Baidu SVAIL, United States |
Yi Li; Baidu SVAIL, United States |
Hairong Liu; Baidu SVAIL, United States |
Sanjeev Satheesh; Baidu SVAIL, United States |
Anuroop Sriram; Baidu SVAIL, United States |
Zhenyao Zhu; Baidu SVAIL, United States |
|
ASR.2.4: UNSUPERVISED ADAPTATION WITH DOMAIN SEPARATION NETWORKS FOR ROBUST SPEECH RECOGNITION |
Zhong Meng; Georgia Institute of Technology, United States |
Zhuo Chen; Microsoft Corporation, United States |
Vadim Mazalov; Microsoft Corporation, United States |
Jinyu Li; Microsoft Corporation, United States |
Yifan Gong; Microsoft Corporation, United States |
|
ASR.2.5: INCREMENTAL TRAINING AND CONSTRUCTING THE VERY DEEP CONVOLUTIONAL RESIDUAL NETWORK ACOUSTIC MODELS |
Sheng Li; National Institute of Information and Communications Technology, Japan |
Xugang Lu; National Institute of Information and Communications Technology, Japan |
Peng Shen; National Institute of Information and Communications Technology, Japan |
Ryoichi Takashima; National Institute of Information and Communications Technology, Japan |
Tatsuya Kawahara; Kyoto University, Japan |
Hisashi Kawai; National Institute of Information and Communications Technology, Japan |
|
ASR.2.6: ON LATTICE GENERATION FOR LARGE VOCABULARY SPEECH RECOGNITION |
David Rybach; Google Inc., Germany |
Michael Riley; Google Inc., United States |
Johan Schalkwyk; Google Inc., United States |
|
ASR.2.7: SIMPLIFYING VERY DEEP CONVOLUTIONAL NEURAL NETWORK ARCHITECTURES FOR ROBUST SPEECH RECOGNITION |
Joanna Rownicka; University of Edinburgh, United Kingdom |
Steve Renals; University of Edinburgh, United Kingdom |
Peter Bell; University of Edinburgh, United Kingdom |
|
ASR.2.8: LANGUAGE MODELING WITH HIGHWAY LSTM |
Gakuto Kurata; IBM Research, Japan |
Bhuvana Ramabhadran; IBM Research, United States |
George Saon; IBM Research, United States |
Abhinav Sethy; IBM Research, United States |
|
ASR.2.9: DIRECT MODELING OF RAW AUDIO WITH DNNS FOR WAKE WORD DETECTION |
Kenichi Kumatani; Amazon Inc., United States |
Sankaran Panchapagesan; Amazon Inc., United States |
Minhua Wu; Amazon Inc., United States |
Minjae Kim; Amazon Inc., United States |
Nikko Strom; Amazon Inc., United States |
Gautam Tiwari; Amazon Inc., United States |
Arindam Mandal; Amazon Inc., United States |
|
ASR.2.10: IMPROVING THE EFFICIENCY OF FORWARD-BACKWARD ALGORITHM USING BATCHED COMPUTATION IN TENSORFLOW |
Khe Chai Sim; Google Inc., United States |
Arun Narayanan; Google Inc., United States |
Tom Bagby; Google Inc., United States |
Tara Sainath; Google Inc., United States |
Michiel Bacchiani; Google Inc., United States |
|
ASR.2.11: LANGUAGE INDEPENDENT END-TO-END ARCHITECTURE FOR JOINT LANGUAGE IDENTIFICATION AND SPEECH RECOGNITION |
Shinji Watanabe; Johns Hopkins University, United States |
Takaaki Hori; Mitsubishi Electric Research Laboratories, United States |
John Hershey; Mitsubishi Electric Research Laboratories, United States |
|
ASR.2.12: KEYWORD SPOTTING FOR GOOGLE ASSISTANT USING CONTEXTUAL SPEECH RECOGNITION |
Assaf Hurwitz Michaely; Google Inc., United States |
Xuedong Zhang; Google Inc., United States |
Gabor Simko; Google Inc., United States |
Carolina Parada; Google Inc., United States |
Petar Aleksic; Google Inc., United States |
|
ASR.2.13: INVESTIGATION OF TRANSFER LEARNING FOR ASR USING LF-MMI TRAINED NEURAL NETWORKS |
Pegah Ghahremani; Johns Hopkins University, United States |
Vimal Manohar; Johns Hopkins University, United States |
Hossein Hadian; Johns Hopkins University, United States |
Daniel Povey; Johns Hopkins University, United States |
Sanjeev Khudanpur; Johns Hopkins University, United States |
|
ASR.2.14: MULTI-LEVEL LANGUAGE MODELING AND DECODING FOR OPEN VOCABULARY END-TO-END SPEECH RECOGNITION |
Takaaki Hori; Mitsubishi Electric Research Laboratories, United States |
Shinji Watanabe; Johns Hopkins University, United States |
John Hershey; Mitsubishi Electric Research Laboratories, United States |
|
ASR.2.15: LANGUAGE MODELING WITH NEURAL TRANS-DIMENSIONAL RANDOM FIELDS |
Bin Wang; Tsinghua University, China |
Zhijian Ou; Tsinghua University, China |
|
ASR.2.16: LISTENING WHILE SPEAKING: SPEECH CHAIN BY DEEP LEARNING |
Andros Tjandra; Nara Institute of Science and Technology, Japan |
Sakriani Sakti; Nara Institute of Science and Technology, Japan |
Satoshi Nakamura; Nara Institute of Science and Technology, Japan |
|
ASR.2.17: ATTENTION-BASED WAV2TEXT WITH FEATURE TRANSFER LEARNING |
Andros Tjandra; Nara Institute of Science and Technology, Japan |
Sakriani Sakti; Nara Institute of Science and Technology, Japan |
Satoshi Nakamura; Nara Institute of Science and Technology, Japan |
|