- Reproduced the batch-normalization experiments on the Aurora4 dataset; the results are similar to ZZY's.
- Tested Ad-grd & Ad-max on the sino-100h dataset. Ad-max performs badly. Ad-grd looks reasonable, but training diverges after 5 iterations.