Chapter 12: The Basic Machine Learning Workflow
Teaching Materials
Extended Reading
- Wikipedia: No free lunch theorem [5]
- Wikipedia: Gradient descent [6][7]
- Baidu Baike: Gradient descent [8][9]
- Zhihu: Gradient descent [10]
- Zhihu: Mini-batch gradient descent [11]
- Zhihu: Gradient descent with momentum [12] (a minimal code sketch follows this list)
- Wikipedia: Simulated annealing [13][14]
- Baidu Baike: Simulated annealing [15][16]
- Zhihu: Simulated annealing explained [17]
- Wikipedia: Newton's method [18][19]
- Wikipedia: Occam's razor [20][21]
- Baidu Baike: Occam's razor [22][23]
- Wikipedia: Overfitting [24][25]
- Wikipedia: GPT-3 [26][27]
- Synced (机器之心): When we talk about fairness in machine learning, what exactly are we talking about? [28]
- Synced (机器之心): Data augmentation [29]
- Zhihu: Data augmentation [30][31]
- What is model pre-training [32]
- Transfer learning [33]
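Several of the links above cover gradient descent and its mini-batch and momentum variants. The following is a minimal sketch, not taken from any of the linked articles, of mini-batch gradient descent with momentum on a toy least-squares problem; the data, learning rate, and momentum coefficient are illustrative assumptions.

```python
# A minimal sketch of mini-batch gradient descent with momentum on a toy
# least-squares problem. All names and hyperparameters are illustrative
# assumptions, not taken from the linked articles.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + 1 plus a little noise; we recover w and b by
# minimizing the mean squared error.
X = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * X + 1.0 + 0.1 * rng.normal(size=100)

def grad(w, b, xb, yb):
    """Gradient of the mean squared error on one mini-batch."""
    err = w * xb + b - yb
    return (2.0 * err * xb).mean(), (2.0 * err).mean()

def sgd_momentum(lr=0.1, beta=0.9, epochs=50, batch_size=10):
    w, b = 0.0, 0.0            # parameters
    vw, vb = 0.0, 0.0          # velocity (momentum) accumulators
    for _ in range(epochs):
        order = rng.permutation(len(X))          # reshuffle each epoch
        for start in range(0, len(X), batch_size):
            sel = order[start:start + batch_size]
            gw, gb = grad(w, b, X[sel], y[sel])
            vw = beta * vw + gw                  # momentum: decayed running sum of gradients
            vb = beta * vb + gb
            w -= lr * vw                         # update parameters along the velocity
            b -= lr * vb
    return w, b

print(sgd_momentum())  # should approach (3.0, 1.0)
```

Setting beta to 0 recovers plain mini-batch gradient descent, and setting batch_size to len(X) recovers full-batch gradient descent.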
Video Demonstrations
Demo links
Developer Resources
For Advanced Readers
- 王东 (Dong Wang), 机器学习导论 (Introduction to Machine Learning), Chapter 1 "Introduction" and Chapter 11 "Optimization Methods" [http://mlbook.cslt.org]
- Wolpert, David (1996), "The Lack of A Priori Distinctions between Learning Algorithms", Neural Computation, pp. 1341–1390 [*][https://web.archive.org/web/20161220125415/http://www.zabaras.com/Courses/BayesianComputing/Papers/lack_of_a_priori_distinctions_wolpert.pdf]
- Sebastian Ruder, "An overview of gradient descent optimization algorithms", 2017 [https://arxiv.org/pdf/1609.04747.pdf]
- Kirkpatrick, S.; Gelatt Jr, C. D.; Vecchi, M. P. (1983). "Optimization by Simulated Annealing". Science. 220 (4598): 671–680. [https://sci2s.ugr.es/sites/default/files/files/Teaching/GraduatesCourses/Metaheuristicas/Bibliography/1983-Science-Kirkpatrick-sim_anneal.pdf] (a toy sketch follows this list)
- Brown et al. (2020), "Language Models are Few-Shot Learners" [https://arxiv.org/pdf/2005.14165.pdf]
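The Kirkpatrick et al. paper introduces simulated annealing. The sketch below is a toy illustration of the idea; the objective function, neighbour proposal, and geometric cooling schedule are illustrative assumptions, not taken from the paper.

```python
# A toy simulated-annealing sketch in the spirit of Kirkpatrick et al. (1983).
# The objective, neighbourhood, and cooling schedule are illustrative assumptions.
import math
import random

def objective(x):
    # A one-dimensional function with several local minima;
    # its global minimum is near x ≈ -1.3.
    return x * x + 10.0 * math.sin(x)

def simulated_annealing(x0=5.0, t0=10.0, cooling=0.99, steps=3000):
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(steps):
        # Propose a random neighbour of the current solution.
        cand = x + random.uniform(-1.0, 1.0)
        fc = objective(cand)
        # Always accept improvements; accept worse moves with probability
        # exp(-(fc - fx) / t), which shrinks as the temperature t decreases.
        if fc < fx or random.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling   # geometric cooling schedule
    return best_x, best_f

print(simulated_annealing())  # typically lands near the global minimum around x ≈ -1.3
```

Cooling more slowly (a cooling factor closer to 1) keeps uphill moves likely for longer and so makes escaping local minima more reliable, at the cost of more iterations.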