Distributed Machine Learning and Big Data Analysis with PySpark

Apache Spark is an open-source cluster computing framework. Originally developed at the University of California, Berkeley, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
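
As a minimal sketch of what "implicit data parallelism" means in PySpark (assuming a local Spark installation and the standard pyspark package; the application name and sample data below are only illustrative), the following example distributes a small collection with parallelize() and counts words with parallel transformations; Spark partitions the data, runs the transformations on the partitions in parallel, and recomputes lost partitions from lineage if an executor fails.

 # A minimal PySpark sketch, assuming a local Spark installation.
 from pyspark import SparkContext
 
 # "local[*]" runs Spark locally with one worker thread per CPU core.
 sc = SparkContext("local[*]", "WordCountSketch")
 
 # parallelize() splits the local collection into partitions that
 # the transformations below process in parallel.
 lines = sc.parallelize([
     "spark provides implicit data parallelism",
     "spark provides fault tolerance",
 ])
 
 counts = (lines.flatMap(lambda line: line.split())   # split lines into words
                .map(lambda word: (word, 1))          # pair each word with a count of 1
                .reduceByKey(lambda a, b: a + b))     # sum counts per word across partitions
 
 # collect() gathers the distributed results back to the driver.
 print(counts.collect())
 sc.stop()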