Difference between revisions of "2024-08-19"

From cslt Wiki
Line 49:
  |Junming Yuan
  ||
- * Verified two parameters in Hubert pre-training config file that were confused with the original paper.[https://z1et6d3xtb.feishu.cn/docx/PaATdHi26oEc0Pxovd4cSyp0nQ2]
+ * Verified two parameters in Hubert pretraining config file that were confused with the original paper.[https://z1et6d3xtb.feishu.cn/docx/PaATdHi26oEc0Pxovd4cSyp0nQ2]
  ** Confirmed that in the second iteration of pretraining, features should be extracted from the 6-th layer of the transformer, not the 9-th layer.
  *** in 175k step, result of 6-th layer: 71.55/9.39, result of 9-th layer: 37.31/16.72
- ** Basically confirmed the setting of the parameter 'untie_final_proj' for the two iterations of pre-training.
+ ** Basically confirmed the setting of the parameter 'untie_final_proj' for the two iterations of pretraining.
  ||
  *

Revision as of 08:47, 19 August 2024 (Monday)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
Lantian Li
Ying Shi
Zhenghai You
Junming Yuan
  • Verified two parameters in the HuBERT pretraining config file that were unclear with respect to the original paper.[1]
    • Confirmed that in the second pretraining iteration, features should be extracted from the 6th transformer layer, not the 9th.
      • At step 175k: 6th layer 71.55/9.39; 9th layer 37.31/16.72.
    • Largely confirmed the setting of the 'untie_final_proj' parameter for both pretraining iterations.
Chen Chen
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du
Yang Wei
Lily
Turi
Yue Gu
Qi Qu