|
|
(9 intermediate revisions by 3 users not shown) |
Line 1: |
Line 1: |
− | =Task To Do=
| |
− | ==Speech Recognition==
| |
− | ===End-to-End speech recognition===
| |
− | * Discriminative-Learning code implementation
| |
− | :* Zhiyuan Tang
| |
− | *Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang
| |
| | | |
− | ===Multi-task=== | + | =Tasks at hand= |
− | * Fusion of speech-recognition and speech-rate
| + | |
− | :* Xiangyu Zeng
| + | |
− | * Self-informed neural network structure learning
| + | |
− | :* Mengyuan Zhao
| + | |
| | | |
− | ===Integrate the class information to HCLG fst for speech recognition=== | + | ==Speech Recognition== |
− | | + | |
− | ===Distant speech recognition===
| + | |
− | *RNN-DAE: echo or reverberation
| + | |
− | :*Xuewei Zhang/Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang
| + | |
− | *Reverberation
| + | |
− | :*Multi-microphones
| + | |
− | :*(Lasso),Xuewei Zhang
| + | |
− | | + | |
− | ===Voice conversation===
| + | |
− | | + | |
− | ===Sparse DNN===
| + | |
− | *Zhiyuan Tang
| + | |
− | | + | |
− | ===Correlation based SENONE cluster===
| + | |
− | | + | |
− | ===NN Multi-GPU parallel training===
| + | |
− | *Multi-Machine
| + | |
− | :*Sheng Su
| + | |
− | *Multi-GPU on one Machine
| + | |
− | :*Sheng Su
| + | |
− | * nnet3 code test
| + | |
− | | + | |
− | ===Audio Embedding===
| + | |
− | *Ke Ning
| + | |
− | | + | |
− | ===RNN training accelerating===
| + | |
− | | + | |
− | ===Data selection===
| + | |
− | *Zhiyong Zhang
| + | |
− | *Sub-modular data selection
| + | |
− | *Objective-function loss training self-adaptation
| + | |
− | | + | |
− | ===Decoder===
| + | |
− | *Confidence output for task-required
| + | |
− | | + | |
− | ==Speaker Verification==
| + | |
− | ===binary code===
| + | |
− | *Lantian Li
| + | |
− | | + | |
− | ===RNN-ivector===
| + | |
− | *Lantian Li
| + | |
− | | + | |
− | ===DNN clustering===
| + | |
− | *Lantian Li
| + | |
− | | + | |
− | =Task DONE=
| + | |
− | ==Multi-Mode features based VAD==
| + | |
− | * Shi Yin
| + | |
− | | + | |
− | ==DNN based Language identification and Speaker identification==
| + | |
− | * Xuewei Zhang/Zhiyuan Tang
| + | |
| | | |
− | ==Neural network visualization== | + | ===joint learning=== |
− | * Mian Wang,DONE | + | * Hang Luo, Zhiyuan Tang |
| | | |
− | ==Dark knowledge== | + | ===visualization=== |
− | * Mengyuan Zhao, Xiangyu Zeng, Zhiyong Zhang, Chao Liu | + | * Ying Shi, Zhiyuan Tang |
| | | |
− | ==Normal RNN speech recognition== | + | ==Speaker Recognition== |
− | * Mengyuan Zhao | + | *Lantian Li, Yixiang Chen |
| | | |
− | ==Momentum-like Hessian-Free acceleration==
| |
− | * Nesterov/Adagrad/AdaDelta/Adam
| |
− | * Zhiyong Zhang/Xiangyu Zeng
| |
| | | |
− | ==Activation value normalization through time --Batch Normalization== | + | =Tasks Done= |
− | * Zhiyong Zhang
| + | |
| | | |
− | ==Mix-training Balance decision tree== | + | =Technical Reports to write= |
− | * Zhiyong Zhang
| + | |
| | | |
− | ==20-h Chinese data-set release== | + | =Papers to write= |
− | * Xuewei Zhang
| + | |
| | | |
− | ==Unbound activation function(Rectifier/Maxout/Pnorm) go-through searching method== | + | =Patents to write= |
− | * nnet3 test --Xuewei Zhang
| + | |
| | | |
− | =Technical Report To Write= | + | =Patents done= |
− | 1, DNN-DAE based noise cancellation -- Xiangyu Zeng / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 2, Speech Rate DNN speech recognition --Shi Yin/Xiangyu Zeng --DONE
| + | |
− | 3, CNN+fbank feature combination --Mian Wang /Yiye Lin /Mengyuan Zhao /Shi Yin
| + | |
− | 4, Uyghur low-resource acoustic model enhancement -- Shi Yin / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 5, Uyghur 20h database release --Kaer /Shi Yin --DONE
| + | |
− | 6,Dark-Knowledge Transfer
| + | |
− | *: Xiangyu Zeng/ Mengyuan Zhao / Zhiyong Zhang
| + | |
| | | |
− | =Paper to Write= | + | =Projects= |
| | | |
− | =Project=
| |
− | * Xiaomi TV
| |
− | :*Mengyuan Zhao/Zhiyong Zhang
| |
− | :*TAG-lm & Domain-specific general lm
| |
| | | |
− | *Chinese-English mix-training
| + | ------------------------------ |
| + | [[task previous]] |