Difference between revisions of "2024-08-19"

From cslt Wiki
(Created page with "{| class="wikitable" !People !! This Week !! Next Week !! Task Tracking (<font color="red">DeadLine</font>) |- |- |Dong Wang || * || * || * |- |- |Lantian Li || *...")
 
Line 49:
  |Junming Yuan
  ||
- *
+ * Verified two parameters in the HuBERT pre-training config file whose settings were unclear when compared with the original paper.[https://z1et6d3xtb.feishu.cn/docx/PaATdHi26oEc0Pxovd4cSyp0nQ2]
+ ** Confirmed that in the second iteration of pre-training, features should be extracted from the 6th transformer layer, not the 9th.
+ *** At step 175k, the 6th-layer result is 71.55/9.39 and the 9th-layer result is 37.31/16.72.
+ ** Largely confirmed the setting of the parameter 'untie_final_proj' for the two iterations of pre-training.
  ||
  *

Revision as of 08:46, 19 August 2024 (Mon)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
Lantian Li
Ying Shi
Zhenghai You
Junming Yuan
  • Verified two parameters in the HuBERT pre-training config file whose settings were unclear when compared with the original paper.[1]
    • Confirmed that in the second iteration of pre-training, features should be extracted from the 6th transformer layer, not the 9th.
      • At step 175k, the 6th-layer result is 71.55/9.39 and the 9th-layer result is 37.31/16.72.
    • Largely confirmed the setting of the parameter 'untie_final_proj' for the two iterations of pre-training.
Chen Chen
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du
Yang Wei
Lily
Turi
Yue Gu
Qi Qu