This webpage may contain errors. Please do NOT trust the following list, although the maintainer has tried his best to correct the mistakes. If you find an error, please contact the maintainer via email at “contact [at] ishikawa.cc”.

Modeling Future Cost for Neural Machine Translation

Authors: Chaoqun Duan, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Conghui Zhu, Tiejun Zhao

Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System

Authors: Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Pretraining Techniques for Sequence-to-Sequence Voice Conversion

Authors: Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda

Many-to-Many Voice Transformer Network

Authors: Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda

Preordering Encoding on Transformer for Translation

Authors: Yuki Kawara, Chenhui Chu, Yuki Arase

Overview of the Eighth Dialog System Technology Challenge: DSTC8

Authors: Seokhwan Kim, Michel Galley, R. Chulaka Gunasekara, Sungjin Lee, Adam Atkinson, Baolin Peng, Hannes Schulz, Jianfeng Gao, Jinchao Li, Mahmoud Adada, Minlie Huang, Luis A. Lastras, Jonathan K. Kummerfeld, Walter S. Lasecki, Chiori Hori, Anoop Cherian, Tim K. Marks, Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta

Editorial: Special Issue on the Eighth Dialog System Technology Challenge

Authors: Seokhwan Kim, Hannes Schulz, R. Chulaka Gunasekara, Chiori Hori, Abhinav Rastogi, Luis Fernando D'Haro

Spatial Active Noise Control Based on Kernel Interpolation of Sound Field

Authors: Shoichi Koyama, Jesper Brunnström, Hayato Ito, Natsuki Ueno, Hiroshi Saruwatari

$F_0$-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model

Authors: Yongwei Li, Jianhua Tao, Donna Erickson, Bin Liu, Masato Akagi

Corruption Is Not All Bad: Incorporating Discourse Structure Into Pre-Training via Corruption for Essay Scoring

Authors: Farjana Sultana Mim, Naoya Inoue, Paul Reisert, Hiroki Ouchi, Kentaro Inui

Time-Domain Audio Source Separation With Neural Networks Based on Multiresolution Analysis

Authors: Tomohiko Nakamura, Shihori Kozuka, Hiroshi Saruwatari

Gamma Boltzmann Machine for Audio Modeling

Authors: Toru Nakashika and Kohei Yatabe

An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning

Authors: Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li

Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network

Authors: Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda

Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice Separation

Authors: Weitao Yuan, Bofei Dong, Shengbei Wang, Masashi Unoki, Wenwu Wang

Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model

Authors: Atsushi Ando, Ryo Masumura, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono, Tomoki Toda

Towards More Diverse Input Representation for Neural Machine Translation

Authors: Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao, Muyun Yang, Hai Zhao

Nonparallel Voice Conversion With Augmented Classifier Star Generative Adversarial Networks

Authors: Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo

ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion

Authors: Hirokazu Kameoka, Kou Tanaka, Damian Kwasny, Takuhiro Kaneko, Nobukatsu Hojo

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

Authors: Tomi Kinnunen, Héctor Delgado, Nicholas W. D. Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md. Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

Massive Exploration of Pseudo Data for Grammatical Error Correction

Authors: Shun Kiyono, Jun Suzuki, Tomoya Mizumoto, Kentaro Inui

Optimizing Source and Sensor Placement for Sound Field Control: An Overview

Authors: Shoichi Koyama, Gilles Chardon, Laurent Daudet

NAUTILUS: A Versatile Voice Cloning System

Authors: Hieu-Thi Luong and Junichi Yamagishi

Spherical-Harmonic-Domain Feedforward Active Noise Control Using Sparse Decomposition of Reference Signals from Distributed Sensor Arrays

Authors: Yu Maeno, Yuki Mitsufuji, Prasanga N. Samarasinghe, Naoki Murata, Thushara D. Abhayapala

Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain

Authors: Yuki Mitsufuji, Stefan Uhlich, Norihiro Takamune, Daichi Kitamura, Shoichi Koyama, Hiroshi Saruwatari

Independent Low-Rank Matrix Analysis Based on Time-Variant Sub-Gaussian Source Model for Determined Blind Source Separation

Authors: Shinichi Mogami, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo, Nobutaka Ono

Jointly Optimal Denoising, Dereverberation, and Source Separation

Authors: Tomohiro Nakatani, Christoph Böddeker, Keisuke Kinoshita, Rintaro Ikeshita, Marc Delcroix, Reinhold Haeb-Umbach

Bayesian Singing Transcription Based on a Hierarchical Generative Model of Keys, Musical Notes, and F0 Trajectories

Authors: Ryo Nishikimi, Eita Nakamura, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii

Multi-Source Neural Machine Translation With Missing Data

Authors: Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura

Microphone Array Wiener Post Filtering Using Monotone Operator Splitting

Authors: Kenta Niwa, Hironobu Chiba, Noboru Harada, Guoqiang Zhang, W. Bastiaan Kleijn

A Flow-Based Deep Latent Variable Model for Speech Spectrogram Modeling and Enhancement

Authors: Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii

Unsupervised Neural Machine Translation With Cross-Lingual Language Representation Agreement

Authors: Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

Machine Speech Chain

Authors: Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Corrections to "Machine Speech Chain"

Authors: Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Neural Machine Translation With Sentence-Level Topic Context

Authors: Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior

Authors: Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara

Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method

Authors: Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari

Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation

Authors: Naoki Makishima, Shinichi Mogami, Norihiro Takamune, Daichi Kitamura, Hayato Sumino, Shinnosuke Takamichi, Hiroshi Saruwatari, Nobutaka Ono

ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder

Authors: Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

Authors: Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

Positive Emotion Elicitation in Chat-Based Dialogue Systems

Authors: Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura

Many-to-Many and Completely Parallel-Data-Free Voice Conversion Based on Eigenspace DNN

Authors: Tetsuya Hashimoto, Daisuke Saito, Nobuaki Minematsu

Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma

Authors: Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Yuta Kawachi, Noboru Harada

Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition

Authors: Takafumi Moriya, Tomohiro Tanaka, Takahiro Shinozaki, Shinji Watanabe, Kevin Duh

Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling

Authors: Michael Heck, Sakriani Sakti, Satoshi Nakamura

Sequence-to-Sequence Models for Emphasis Speech Translation

Authors: Quoc Truong Do, Sakriani Sakti, Satoshi Nakamura

DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score

Authors: Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda

Sentence Selection and Weighting for Neural Machine Translation Domain Adaptation

Authors: Rui Wang, Masao Utiyama, Andrew M. Finch, Lemao Liu, Kehai Chen, Eiichiro Sumita

A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis

Authors: Manu Airaksinen, Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

Single-Channel Speech Enhancement With Phase Reconstruction Based on Phase Distortion Averaging

Authors: Yukoh Wakabayashi, Takahiro Fukumori, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita

Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech

Authors: Cassia Valentini-Botinhao and Junichi Yamagishi

Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis

Authors: Xin Wang, Shinji Takaki, Junichi Yamagishi

Mel-Cepstrum-Based Quantization Noise Shaping Applied to Neural-Network-Based Speech Waveform Synthesis

Authors: Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization

Authors: Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, Li Li

Context Adaptive Neural Network Based Acoustic Models for Rapid Adaptation

Authors: Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Christian Huemmer, Tomohiro Nakatani

Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models

Authors: Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

Boundary Matching Filters for Spherical Microphone and Loudspeaker Arrays

Authors: César D. Salvador, Shuichi Sakamoto, Jorge Treviño, Yôiti Suzuki

Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms

Authors: Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno

A Neural Approach to Source Dependence Based Context Model for Statistical Machine Translation

Authors: Kehai Chen, Tiejun Zhao, Muyun Yang, Lemao Liu, Akihiro Tamura, Rui Wang, Masao Utiyama, Eiichiro Sumita

Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks

Authors: Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari

Duration-Controlled LSTM for Polyphonic Sound Event Detection

Authors: Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda

Translation Quality Estimation Using Only Bilingual Corpora

Authors: Lemao Liu, Atsushi Fujita, Masao Utiyama, Andrew M. Finch, Eiichiro Sumita

Note Value Recognition for Piano Transcription Using Markov Random Fields

Authors: Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon

Simultaneous Optimization of Multiple Tree-Based Factor Analyzed HMM for Speech Synthesis

Authors: Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Introduction to the Special Section on Sound Scene and Event Analysis

Authors: Gaël Richard, Tuomas Virtanen, Juan Pablo Bello, Nobutaka Ono, Hervé Glotin

Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models

Authors: Naoyuki Kanda, Xugang Lu, Hisashi Kawai

Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR

Authors: Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani

Informative Acoustic Feature Selection to Maximize Mutual Information for Collecting Target Sources

Authors: Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Hitoshi Ohmuro

Preserving Word-Level Emphasis in Speech-to-Speech Translation

Authors: Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura

Estimating Speech Recognition Accuracy Based on Error Type Classification

Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine

Authors: Toru Nakashika, Tetsuya Takiguchi, Yasuhiro Minami

Transition-Based Dependency Parsing Exploiting Supertags

Authors: Hiroki Ouchi, Kevin Duh, Hiroyuki Shindo, Yuji Matsumoto

Near and Far Field Speech-in-Noise Intelligibility Improvements Based on a Time-Frequency Energy Reallocation Approach

Authors: Tudor-Catalin Zorila, Yannis Stylianou, Tatsuma Ishihara, Masami Akamine

Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization

Authors: Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari

Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion

Authors: Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki

Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis

Authors: Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, Graham Neubig, Sakriani Sakti, Satoshi Nakamura

Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance

Authors: Zhizheng Wu, Phillip L. De Leon, Cenk Demiroglu, Ali Khodabakhsh, Simon King, Zhen-Hua Ling, Daisuke Saito, Bryan Stewart, Tomoki Toda, Mirjam Wester, Junichi Yamagishi

Non-Negative Group Sparsity with Subspace Note Modelling for Polyphonic Transcription

Authors: Ken O'Hanlon, Hidehisa Nagano, Nicolas Keriven, Mark D. Plumbley

A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis

Authors: Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Zhen-Hua Ling, Junichi Yamagishi

Summarizing a Document by Trimming the Discourse Tree

Authors: Tsutomu Hirao, Masaaki Nishino, Yasuhisa Yoshida, Jun Suzuki, Norihito Yasuda, Masaaki Nagata

Optimal Coding of Generalized-Gaussian-Distributed Frequency Spectra for Low-Delay Audio Coder With Powered All-Pole Spectrum Estimation

Authors: Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya

Summarization Based on Task-Oriented Discourse Parsing

Authors: Xun Wang, Yasuhisa Yoshida, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata

Bilingual Continuous-Space Language Model Growing for Statistical Machine Translation

Authors: Rui Wang, Hai Zhao, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita

Generative Modeling of Voice Fundamental Frequency Contours

Authors: Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Kento Kadowaki, Yasunori Ohishi, Kunio Kashino

Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration

Authors: Daichi Kitamura, Hiroshi Saruwatari, Hirokazu Kameoka, Yu Takahashi, Kazunobu Kondo, Satoshi Nakamura

Automatic Speech Recognition for Mixed Dialect Utterances by Mixing Dialect Language Models

Authors: Naoki Hirayama, Koichiro Yoshino, Katsutoshi Itoyama, Shinsuke Mori, Hiroshi G. Okuno

Automatic Expressive Opinion Sentence Generation for Enjoyable Conversational Systems

Authors: Yoichi Matsuyama, Akihiro Saito, Shinya Fujie, Tetsunori Kobayashi

Resolution Warped Spectral Representation for Low-Delay and Low-Bit-Rate Audio Coder

Authors: Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya

AutoMashUpper: automatic creation of multi-song music mashups

Authors: Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto

Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes

Authors: Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno

Multichannel sound source dereverberation and separation for arbitrary number of sources based on Bayesian nonparametrics

Authors: Takuma Otsuka, Katsuhiko Ishiguro, Takuya Yoshioka, Hiroshi Sawada, Hiroshi G. Okuno

Harmonic/percussive sound separation based on anisotropic smoothness of spectrograms

Authors: Hideyuki Tachibana, Nobutaka Ono, Hirokazu Kameoka, Shigeki Sagayama

Wave Field Reconstruction Filtering in Cylindrical Harmonic Domain for With-Height Recording and Reproduction

Authors: Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda, Yôiti Suzuki

A Synthesis Model With Intuitive Control Capabilities for Rolling Sounds

Authors: Simon Conan, Olivier Derrien, Mitsuko Aramaki, Sølvi Ystad, Richard Kronland-Martinet

Dependency Parse Reranking with Rich Subtree Features

Authors: Mo Shen, Daisuke Kawahara, Sadao Kurohashi

Bayesian Nonparametrics for Microphone Array Processing

Authors: Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno

Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays

Authors: Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani

A MAP-based Online Estimation Approach to Ensemble Speaker and Speaking Environment Modeling

Authors: Yu Tsao, Shigeki Matsuda, Chiori Hori, Hideki Kashioka, Chin-Hui Lee

Alaryngeal Speech Enhancement Based on One-to-Many Eigenvoice Conversion

Authors: Hironori Doi, Tomoki Toda, Keigo Nakamura, Hiroshi Saruwatari, Kiyohiro Shikano

Dominance Based Integration of Spatial and Spectral Features for Speech Enhancement

Authors: Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto

Diffused Sensing for Sharp Directive Beamforming

Authors: Kenta Niwa, Yusuke Hioka, Ken'ichi Furuya, Yoichi Haneda

Scalable Speech Coding for IP Networks: Beyond iLBC

Authors: Koji Seto and Tokunbo Ogunfunmi

Feature Enhancement With Joint Use of Consecutive Corrupted and Noise Feature Vectors With Discriminative Region Weighting

Authors: Masayuki Suzuki, Takuya Yoshioka, Shinji Watanabe, Nobuaki Minematsu, Keikichi Hirose

Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling

Authors: Stanislaw Andrzej Raczynski, Emmanuel Vincent, Shigeki Sagayama

A Multichannel MMSE-Based Framework for Speech Source Separation and Noise Reduction

Authors: Mehrez Souden, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani, Hiroshi Sawada

Optimized Speech Dereverberation From Probabilistic Perspective for Time Varying Acoustic Transfer Function

Authors: Masahito Togami, Yohei Kawaguchi, Ryu Takeda, Yasunari Obuchi, Nobuo Nukaga

Underdetermined Sound Source Separation Using Power Spectrum Density Estimated by Combination of Directivity Gain

Authors: Yusuke Hioka, Ken'ichi Furuya, Kazunori Kobayashi, Kenta Niwa, Yoichi Haneda

Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data

Authors: Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda

Analytical Approach to Wave Field Reconstruction Filtering in Spatio-Temporal Frequency Domain

Authors: Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda

Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech

Authors: Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernáez, Ibon Saratxaga

Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction

Authors: Ryoichi Miyazaki, Hiroshi Saruwatari, Takayuki Inoue, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo

Reproducing Virtual Sound Sources in Front of a Loudspeaker Array Using Inverse Wave Propagator

Authors: Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda

Statistical Voice Conversion Based on Noisy Channel Model

Authors: Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu

Bitext Dependency Parsing With Auto-Generated Bilingual Treebank

Authors: Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li

Topic-Dependent-Class-Based $n$-Gram Language Model

Authors: Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa

An Efficient Time-Frequency Method for Synthesizing Noisy Sounds With Short Transients and Narrow Spectral Components

Authors: Damián Marelli, Mitsuko Aramaki, Richard Kronland-Martinet, Charles Verron

Speaker Identification and Verification by Combining MFCC and Phase Information

Authors: Seiichi Nakagawa, Longbiao Wang, Shinji Ohtsuka

Round-Robin Duel Discriminative Language Models

Authors: Takanobu Oba, Takaaki Hori, Atsushi Nakamura, Akinori Ito

Product of Experts for Statistical Parametric Speech Synthesis

Authors: Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda

Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera

Authors: Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato

Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information

Authors: Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada

Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment

Authors: Matthias Mauch, Hiromasa Fujihara, Masataka Goto

Introduction to the Special Section on Deep Learning for Speech and Language Processing

Authors: Dong Yu, Geoffrey E. Hinton, Nelson Morgan, Jen-Tzung Chien, Shigeki Sagayama

Diffuse Noise Suppression Using Crystal-Shaped Microphone Arrays

Authors: Nobutaka Ito, Hikaru Shimizu, Nobutaka Ono, Shigeki Sagayama

An Attempt to Calibrate Headphones for Reproduction of Sound Pressure at the Eardrum

Authors: Ryouichi Nishimura, Parham Mokhtari, Hironori Takemoto, Hiroaki Kato

Theoretical Analysis of Musical Noise in Generalized Spectral Subtraction Based on Higher Order Statistics

Authors: Takayuki Inoue, Hiroshi Saruwatari, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo

Musical Noise Controllable Algorithm of Channelwise Spectral Subtraction and Adaptive Beamforming Based on Higher Order Statistics

Authors: Hiroshi Saruwatari, Yohei Ishikawa, Yu Takahashi, Takayuki Inoue, Kiyohiro Shikano, Kazunobu Kondo

Controlling the Perceived Material in an Impact Sound Synthesizer

Authors: Mitsuko Aramaki, Mireille Besson, Richard Kronland-Martinet, Sølvi Ystad

Continuous Stochastic Feature Mapping Based on Trajectory HMMs

Authors: Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda

HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering

Authors: Tuomo Raitio, Antti Suni, Junichi Yamagishi, Hannu Pulakka, Jani Nurminen, Martti Vainio, Paavo Alku

Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization

Authors: Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi, Hiroshi G. Okuno

Time-Frequency Synthesis of Noisy Sounds With Narrow Spectral Components

Authors: Damián Marelli, Mitsuko Aramaki, Richard Kronland-Martinet, Charles Verron

Introduction to the Special Issue on Processing Reverberant Speech: Methodologies and Applications

Authors: Tomohiro Nakatani, Walter Kellermann, Patrick A. Naylor, Masato Miyoshi, Biing-Hwang Juang

Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction

Authors: Tomohiro Nakatani, Takuya Yoshioka, Keisuke Kinoshita, Masato Miyoshi, Biing-Hwang Juang

Penalized Logistic Regression With HMM Log-Likelihood Regressors for Speech Recognition

Authors: Øystein Birkenes, Tomoko Matsui, Kunio Tanabe, Sabato Marco Siniscalchi, Tor André Myrvoll, Magne Hallstein Johnsen

Analysis and Recognition of NAM Speech Using HMM Distances and Visual Information

Authors: Panikos Heracleous, V.-A. Tran, Takayuki Nagai, Kiyohiro Shikano

Blind Source Separation With Parameter-Free Adaptive Step-Size Method for Robot Audition

Authors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino

A 3-D Immersive Synthesizer for Environmental Sounds

Authors: Charles Verron, Mitsuko Aramaki, Richard Kronland-Martinet, Grégory Pallone

Optimum Loss Factor for a Perfectly Matched Layer in Finite-Difference Time-Domain Acoustic Simulation

Authors: Parham Mokhtari, Hironori Takemoto, Ryouichi Nishimura, Hiroaki Kato

Introduction to the Special Section on Voice Transformation

Authors: Yannis Stylianou, Tomoki Toda, Chung-Hsien Wu, Alexander Kain, Olivier Rosec

Synthesis of Child Speech With HMM Adaptation and Voice Conversion

Authors: Oliver Watts, Junichi Yamagishi, Simon King, Kay Berkling

Thousands of Voices for HMM-Based Speech Synthesis-Analysis and Application of TTS Systems Built on Various ASR Corpora

Authors: Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian, Yong Guan, Rile Hu, Keiichiro Oura, Yi-Jian Wu, Keiichi Tokuda, Reima Karhila, Mikko Kurimo

Editorial for the Special Issue on Signal Models and Representations of Musical and Environmental Sounds

Authors: Bertrand David, Masataka Goto, Laurent Daudet, Paris Smaragdis

Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis

Authors: Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, Ren-Hua Wang

Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis

Authors: Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-Hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals

Beamforming With a Maximum Negentropy Criterion

Authors: Ken'ichi Kumatani, John W. McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner, Weifeng Li

Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-step Linear Prediction

Authors: Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi

Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation

Authors: Hiroko Kato Solvang, Yuichi Nagahara, Shoko Araki, Hiroshi Sawada, Shoji Makino

Blind Spatial Subtraction Array for Speech Enhancement in Noisy Environment

Authors: Yu Takahashi, Tomoya Takatani, Keiichi Osako, Hiroshi Saruwatari, Kiyohiro Shikano

Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources

Authors: Leandro E. Di Persia, Diego H. Milone, Masuzo Yanagida

Integrated Speech Enhancement Method Using Noise Suppression and Dereverberation

Authors: Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi

Binaural Localization Based on Weighted Wiener Gain Improved by Incremental Source Attenuation

Authors: Yoshifumi Nagata, Satoshi Iwasaki, Takahiko Hariyama, Toyota Fujioka, Tomita Obara, Takayuki Wakatake, Masato Abe

Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm

Authors: Junichi Yamagishi, Takao Kobayashi, Yuji Nakano, Katsumi Ogata, Juri Isogai

Speech Dereverberation Based on Maximum-Likelihood Estimation With Time-Varying Gaussian Source Model

Authors: Tomohiro Nakatani, Biing-Hwang Juang, Takuya Yoshioka, Keisuke Kinoshita, Marc Delcroix, Masato Miyoshi

Stochastic Analysis of the FXLMS-Based Narrowband Active Noise Control System

Authors: Yegui Xiao, Akira Ikuta, Liying Ma, Khashayar Khorasani

Specmurt Analysis of Polyphonic Music Signals

Authors: Shoichiro Saito, Hirokazu Kameoka, Keigo Takahashi, Takuya Nishimoto, Shigeki Sagayama

A Quick Search Method for Audio Signals Based on a Piecewise Linear Representation of Feature Trajectories

Authors: Akihiro Kimura, Kunio Kashino, Takayuki Kurozumi, Hiroshi Murase

Computational Models of Similarity for Drum Samples

Authors: Elias Pampalk, Perfecto Herrera, Masataka Goto

An Efficient Hybrid Music Recommender System Using an Incrementally Trainable Probabilistic Generative Model

Authors: Kazuyoshi Yoshii, Masataka Goto, Kazuhiro Komatani, Tetsuya Ogata, Hiroshi G. Okuno

A Cascaded Broadcast News Highlighter

Authors: Heidi Christensen, Yoshihiko Gotoh, Steve Renals

A Method for Automatic Detection of Vocal Fry

Authors: Carlos Toshinori Ishi, Ken-Ichi Sakakibara, Hiroshi Ishiguro, Norihiro Hagita

Adaptive Beamforming With a Minimum Mutual Information Criterion

Authors: Ken'ichi Kumatani, Tobias Gehrig, Uwe Mayer, Emilian Stoimenov, John W. McDonough, Matthias Wölfel

Dereverberation and Denoising Using Multichannel Linear Prediction

Authors: Marc Delcroix, Takafumi Hikichi, Masato Miyoshi

Spatio-Temporal FastICA Algorithms for the Blind Separation of Convolutive Mixtures

Authors: Scott C. Douglas, Malay Gupta, Hiroshi Sawada, Shoji Makino

Adaptive Parallel Quadratic-Metric Projection Algorithms

Authors: Masahiro Yukawa, Konstantinos Slavakis, Isao Yamada

Multichannel Bin-Wise Robust Frequency-Domain Adaptive Filtering and Its Application to Adaptive Beamforming

Authors: Wolfgang Herbordt, Herbert Buchner, Satoshi Nakamura, Walter Kellermann

Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments

Authors: Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama

A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering

Authors: Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama

On Active Noise Control Systems With Online Acoustic Feedback Path Modeling

Authors: Muhammad Tahir Akhtar, Masahide Abe, Masayuki Kawamata

Precise Dereverberation Using Multichannel Linear Prediction

Authors: Marc Delcroix, Takafumi Hikichi, Masato Miyoshi

Geometrically Constrained Independent Component Analysis

Authors: Mirko Knaak, Shoko Araki, Shoji Makino

Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics

Authors: Ian R. Lane, Tatsuya Kawahara, Tomoko Matsui, Satoshi Nakamura

Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error

Authors: Erik McDermott, Timothy J. Hazen, Jonathan Le Roux, Atsushi Nakamura, Shigeru Katagiri

Harmonicity-Based Blind Dereverberation for Single-Channel Speech Signals

Authors: Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi

A Dynamic Compressive Gammachirp Auditory Filterbank

Authors: Toshio Irino and Roy D. Patterson

Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements

Authors: Toshio Irino, Roy D. Patterson, Hideki Kawahara

Impairment Factor Framework for Wide-Band Speech Codecs

Authors: Sebastian Möller, Alexander Raake, Nobuhiko Kitawaki, Akira Takahashi, Marcel Wältermann

Fast Implementation of KLT-Based Speech Enhancement Using Vector Quantization

Authors: Yoshifumi Nagata, K. Mitsubori, T. Kagi, Toyota Fujioka, Masato Abe

Blind Extraction of Dominant Target Sources Using ICA and Time-Frequency Masking

Authors: Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino

Objective Assessment Methodology for Estimating Conversational Quality in VoIP

Authors: Akira Takahashi, Atsuko Kurashima, Hideaki Yoshino

A New Robust Narrowband Active Noise Control System in the Presence of Frequency Mismatch

Authors: Yegui Xiao, Liying Ma, Khashayar Khorasani, Akira Ikuta

Lossless Audio Coding Using the IntMDCT and Rounding Error Shaping

Authors: Yoshikazu Yokotani, Ralf Geiger, G. D. T. Schuller, Soontorn Oraintara, K. R. Rao

Comparative study on corpora for speech translation

Authors: Gen-ichiro Kikui, Seiichi Yamamoto, Toshiyuki Takezawa, Eiichiro Sumita

Using multiple edit distances to automatically grade outputs from Machine translation systems

Authors: Yasuhiro Akiba, Kenji Imamura, Eiichiro Sumita, Hiromi Nakaiwa, Shun'ichi Yamamoto, Hiroshi G. Okuno

Analysis-synthesis of impact sounds by real-time dynamic filtering

Authors: Mitsuko Aramaki and Richard Kronland-Martinet

The ATR multilingual speech-to-speech translation system

Authors: Satoshi Nakamura, Konstantin Markov, Hiromi Nakaiwa, Gen-ichiro Kikui, Hisashi Kawai, Takatoshi Jitsuhiro, Jinsong Zhang, Hirofumi Yamamoto, Eiichiro Sumita, Seiichi Yamamoto

Blind source separation based on a fast-convergence algorithm combining ICA and beamforming

Authors: Hiroshi Saruwatari, Toshiya Kawamura, Tsuyoki Nishikawa, Akinobu Lee, Kiyohiro Shikano

Evaluation of the affective valence of speech using pitch substructure

Authors: Norman D. Cook, Takashi X. Fujisawa, Kazuaki Takami

Speech enhancement based on auto gain control

Authors: Yoshifumi Nagata, Toyota Fujioka, Masato Abe

Introduction to the Special Issue on Spontaneous Speech Processing

Authors: Sadaoki Furui, Mary E. Beckman, Julia Hirschberg, Shuichi Itahashi, Tatsuya Kawahara, Satoshi Nakamura, Shrikanth S. Narayanan

Speech-to-text and speech-to-speech summarization of spontaneous speech

Authors: Sadaoki Furui, Tomonori Kikuchi, Yosuke Shinnaka, Chiori Hori

Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers

Authors: Tatsuya Kawahara, Masahiro Hasegawa, Kazuya Shitaoka, Tasuku Kitade, Hiroaki Nanjo

Morphological analysis of the corpus of spontaneous Japanese

Authors: Kiyotaka Uchimoto, Kazuma Takaoka, Chikashi Nobata, Atsushi Yamada, Satoshi Sekine, Hitoshi Isahara

Variational Bayesian estimation and clustering for speech recognition

Authors: Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda

Coloration perception depending on sound direction

Authors: Yoshikazu Seki and Kiyohide Ito

Combined approach of array processing and independent component analysis for blind separation of acoustic signals

Authors: Futoshi Asano, Shiro Ikeda, Michiaki Ogawa, Hideki Asoh, Nobuhiko Kitawaki

The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech

Authors: Shoko Araki, Ryo Mukai, Shoji Makino, Tsuyoki Nishikawa, Hiroshi Saruwatari

Extending the sound impulse response of room using extrapolation

Authors: Zihou Meng, Kimihiro Sakagami, Masayuki Morimoto, Guoan Bi, Alex ChiChung Kot

Active control system for low-frequency road noise combined with an audio system

Authors: Hisashi Sano, Toshio Inoue, Akira Takahashi, Kenichi Terai, Yoshio Nakamura

Weighted autocorrelation for pitch extraction of noisy speech

Authors: Tetsuya Shimamura and Hajime Kobayashi

A structural Bayes approach to speaker adaptation

Authors: Koichi Shinoda and Chin-Hui Lee

An application of discriminative feature extraction to filter-bank-based speech recognition

Authors: Alain Biem, Shigeru Katagiri, Erik McDermott, Biing-Hwang Juang

HMM-separation-based speech recognition for a distant moving speaker

Authors: Tetsuya Takiguchi, Satoshi Nakamura, Kiyohiro Shikano

A Japanese TTS system based on multiform units and a speech modification algorithm with harmonics reconstruction

Authors: Satoshi Takano, Kimihito Tanaka, Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima

A minimax search algorithm for robust continuous speech recognition

Authors: Hui Jiang, Keikichi Hirose, Qiang Huo

Speech enhancement based on the subspace method

Authors: Futoshi Asano, Satoru Hayamizu, Takeshi Yamada, Satoshi Nakamura

Speech visualization by integrating features for the hearing impaired

Authors: Akira Watanabe, Shingo Tomishige, Masahiro Nakatake

Efficient training algorithms for HMMs using incremental estimation

Authors: Yoshihiko Gotoh, Michael M. Hochberg, Harvey F. Silverman

Flexible speech understanding based on combined key-phrase detection and verification

Authors: Tatsuya Kawahara, Chin-Hui Lee, Biing-Hwang Juang

Fast deconvolution of multichannel systems using regularization

Authors: Ole Kirkeby, Philip Arthur Nelson, Hareo Hamada, Felipe Orduña-Bustamante

Design and description of CS-ACELP: a toll quality 8 kb/s speech coder

Authors: Redwan Salami, Claude Laflamme, Jean-Pierre Adoul, Akitoshi Kataoka, Shinji Hayashi, Takehiro Moriya, Claude Lamblin, Dominique Massaloux, Stéphane Proust, Peter Kroon, Yair Shoham

Estimation of the waveform of a sound source by using an iterative technique with many sensors

Authors: Masato Abe, Kiyohito Fujii, Yoshifumi Nagata, Toshio Sone, Ken'iti Kido

An 8-kb/s conjugate structure CELP (CS-CELP) speech coder

Authors: Akitoshi Kataoka, Takehiro Moriya, Shinji Hayashi

Subadaptive piecewise linear quantization for speech signal (64 kbit/s) compression

Authors: Hiroto Saito, Isao Umoto, Akira Sasou, Shogo Nakamura, Yoshihiko Horio, Tahiro Kubota

Adaptive cepstral analysis of speech

Authors: Keiichi Tokuda, Takao Kobayashi, Satoshi Imai

Weighted RLS adaptive beamformer with initial directivity

Authors: Futoshi Asano, Yôiti Suzuki, Toshio Sone

Inverse filter design and equalization zones in multichannel sound reproduction

Authors: Philip Arthur Nelson, Felipe Orduña-Bustamante, Hareo Hamada