![Page 1](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/1.jpg)
Square and Circle in Concert: High-Performance Distributed Machine Learning Based on Spark on Angel
![Page 2](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/2.jpg)
![Page 3](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/3.jpg)
Origins
![Page 4](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/4.jpg)
Tencent's product requirements
• Big Data, Small Model: d << n
• Sparse Big Data, Big Model: d ≈ n
(In the slide's diagram, n labels the data size and d the model size.)
Goal: find an industrial-grade distributed machine learning platform that can handle models with billions of dimensions
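To make the billion-dimension requirement concrete, here is a back-of-the-envelope estimate of what a single dense model vector costs in memory. The 8-byte float64 representation and the figure of 100 executors are assumptions chosen for illustration; neither comes from the slides.

```python
BYTES_PER_FLOAT64 = 8  # assumption: weights are stored as 64-bit floats


def dense_vector_gib(dim: int) -> float:
    """Size in GiB of one dense float64 vector with `dim` entries."""
    return dim * BYTES_PER_FLOAT64 / 2**30


def aggregation_traffic_gib(dim: int, num_executors: int) -> float:
    """Data arriving at a single aggregation point when every executor
    sends one dense gradient of length `dim`."""
    return num_executors * dense_vector_gib(dim)


dim = 10**9  # "billion-scale" model dimension
model_gib = dense_vector_gib(dim)
traffic_gib = aggregation_traffic_gib(dim, num_executors=100)  # 100 is hypothetical

print(f"one dense model vector: {model_gib:.2f} GiB")
print(f"dense gradients from 100 executors, per iteration: {traffic_gib:.0f} GiB")
```

Under these assumptions, a single node that must hold the model and receive every gradient becomes the limiting factor well before the data volume does.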
![Page 5](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/5.jpg)
Bottlenecks of machine learning on Spark
[Figure, shown twice: a single Driver holding the Model, surrounded by Executors.]
![Page 6](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/6.jpg)
One Issue
https://issues.apache.org/jira/browse/SPARK-6932
A Prototype of Parameter Server
2015
![Page 7](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/7.jpg)
Glint & Yahoo
2016
![Page 8](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/8.jpg)
Philosophy
Spark: Worker, Worker, Worker, Worker (immutable)
Angel: PS, PS, PS (mutable)
Square and circle in concert (方圆并济)
![Page 9](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/9.jpg)
Spark on Angel
![Page 10](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/10.jpg)
Core abstractions
Mapper/Reducer, RDD, PSModel
![Page 11](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/11.jpg)
RDD vs PSModel
RDD: RDD-1, RDD-2, RDD-3, RDD-4, RDD-5 (epoch-1 through epoch-5)
PSModel: a single PSModel across epoch-1 through epoch-5 and beyond
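A minimal sketch of the contrast this slide appears to draw, assuming the diagram means "one new immutable RDD per epoch versus one PSModel updated in place". The names `train_immutable` and `MutableModel` are hypothetical and exist only for this illustration; they are not Spark or Angel APIs.

```python
from typing import List, Sequence, Tuple


def train_immutable(w0: Tuple[float, ...],
                    deltas: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """RDD style: every epoch derives a new immutable value from the previous one."""
    history = [w0]
    for delta in deltas:
        prev = history[-1]
        history.append(tuple(p + d for p, d in zip(prev, delta)))
    return history


class MutableModel:
    """PSModel style: one long-lived object that is updated in place each epoch."""

    def __init__(self, values: Sequence[float]) -> None:
        self.values = list(values)

    def push(self, delta: Sequence[float]) -> None:
        for i, d in enumerate(delta):
            self.values[i] += d


deltas = [(0.5, 0.25), (0.5, 0.25), (0.5, 0.25)]  # one update per epoch

history = train_immutable((0.0, 0.0), deltas)     # 1 initial value + 3 new ones

model = MutableModel([0.0, 0.0])
for delta in deltas:
    model.push(delta)                             # same object every epoch

print(len(history), "immutable versions; final:", history[-1])
print("mutable model after the same updates:", model.values)
```

Both paths reach the same weights; the difference is that the immutable path materializes a new version per epoch, while the mutable path keeps a single object alive for the whole run.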
![Page 12](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/12.jpg)
Core abstraction of RDD
• Partitions: Partition-1, Partition-2, Partition-3, Partition-4, …, Partition-n
• Compute Func
• Dependencies
• Partitioners
• Preferred locations
• Storage: Node Memory (MemoryBlock-n), Node Disk (DiskBlock-n)
• RDD to RDD (Transformation or Action)
![Page 13](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/13.jpg)
Core abstraction of PSModel
• PSModel: pull the model M, push the update ΔM
• PSClient
• PSServer, each holding a Shard
• PSPartitioner: Partition1, Partition2, Partition3, Partition-…
• MatrixContext
• Sync
• Clock
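The pull/push cycle against a sharded model can be sketched as follows. This is a single-process toy, not Angel's implementation: the class names echo the slide's labels, `range_partition` is a hypothetical stand-in for whatever PSPartitioner does, and the clock is a bare counter whose role in synchronization is not modeled here.

```python
from typing import List, Tuple


def range_partition(dim: int, num_servers: int) -> List[Tuple[int, int]]:
    """Split [0, dim) into near-equal contiguous ranges, one per server."""
    base, extra = divmod(dim, num_servers)
    bounds, start = [], 0
    for s in range(num_servers):
        end = start + base + (1 if s < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


class PSServer:
    """Holds one shard: the model entries in [start, end)."""

    def __init__(self, start: int, end: int) -> None:
        self.start, self.end = start, end
        self.values = [0.0] * (end - start)

    def pull(self) -> List[float]:
        return list(self.values)

    def push(self, delta: List[float]) -> None:
        for i, d in enumerate(delta):
            self.values[i] += d  # updates are additive increments


class PSClient:
    """Routes pull/push for the whole model to the right shards."""

    def __init__(self, dim: int, num_servers: int) -> None:
        self.servers = [PSServer(a, b) for a, b in range_partition(dim, num_servers)]
        self.clock = 0

    def pull(self) -> List[float]:
        model: List[float] = []
        for server in self.servers:
            model.extend(server.pull())
        return model

    def push(self, delta: List[float]) -> None:
        for server in self.servers:
            server.push(delta[server.start:server.end])

    def tick(self) -> int:
        """Advance the logical clock after one round of updates."""
        self.clock += 1
        return self.clock


client = PSClient(dim=10, num_servers=3)
client.push([float(i) for i in range(10)])  # push ΔM
client.tick()
print(client.pull())                        # pull M
```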
![Page 14](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/14.jpg)
Architecture of Spark on Angel
[Figure: component diagram. Spark side: SparkDriver with AngelContext; Executors, each running several TASKs and holding a PSModel; Spark RDD. Angel side: Parameter Server made of PSServers, each holding a Shard. PSAgents sit between the two sides.]
![Page 15](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/15.jpg)
[Figure: Angel component diagram. Parameter Server: PSServers, each holding a Shard. Worker: Tasks, each with a DataBlock and a PsClient, connected through PSAgents. Workers pull the Model M and push ΔM. Other labels: psFunc, Model Partitioner, syncProtocol.]
![Page 16](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/16.jpg)
• Rich machine learning and mathematical computation libraries
• Friendly user programming interface
• Industrial-grade, production-ready parameter server
![Page 17](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/17.jpg)
Comparison of Angel and Glint
PSPartitioner: Partition1, Partition2, Partition3, Partition-…
• Richer model partitioning
• More flexible asynchronous modes
• More powerful psFunc
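The slide does not spell out what psFunc does. The sketch below assumes it means a function evaluated on the server side against each shard, so that only small partial results travel back to the caller. Both function names are hypothetical, and the "transferred" count is simply the number of values that would cross the network in each approach.

```python
from typing import List, Tuple

Shards = List[List[float]]


def pull_then_max(shards: Shards) -> Tuple[float, int]:
    """Client-side approach: pull every value, then reduce locally."""
    pulled = [v for shard in shards for v in shard]
    return max(pulled), len(pulled)


def server_side_max(shards: Shards) -> Tuple[float, int]:
    """psFunc-style approach: each server reduces its own shard and
    returns one partial result; the client merges the partials."""
    partials = [max(shard) for shard in shards]  # runs "on" each server
    return max(partials), len(partials)


shards = [[0.1, 0.7, 0.3], [0.9, 0.2], [0.4, 0.5, 0.6, 0.8]]

client_result, client_transferred = pull_then_max(shards)
server_result, server_transferred = server_side_max(shards)

print("same answer:", client_result == server_result)
print("values transferred:", client_transferred, "vs", server_transferred)
```

With a billion-dimension model the gap between "all values" and "one value per server" is what makes server-side functions attractive.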
![Page 18](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/18.jpg)
Angel's positioning
https://github.com/tencent/angel
![Page 19](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/19.jpg)
Developing with Spark on Angel
![Page 20](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/20.jpg)
Angel API design
1. Start PS
2. Load Model
3. runTask
4. parse & preProcess
5. train
6. learn
7. push & pull
8. Save Model (HDFS)
Components in the diagram: AngelClient, MLRunner, TrainTask, MLLearner, MLModel, PSModel, PSServer (Model), DataBlock of LabledData, HDFS
![Page 21](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/21.jpg)
Spark on Angel API design
• Spark: RDD1, RDD2, RDD3, …
• Spark on Angel: SparkPSContext, PSModel, { RDD_PS_Functions }, PSVector, PSMatrix, BreezePSVector, CachedPSVector
• Angel: AngelClient, PSClient, PSServer (Shard)
MLModel, RDD
![Page 22](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/22.jpg)
Basic usage of Spark on Angel
![Page 23](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/23.jpg)
Transparent replacement of Vector
<<trait>> NumericOps[T]: def round(t: T): T, def dot(t: T): T, def max(t: T): T, …
<<class>> BreezeVector: def round(t: T): T, def dot(t: T): T, def max(t: T): T, …
<<class>> BreezePSVector: def round(t: T): T, def dot(t: T): T, def max(t: T): T, …
• BreezeVector and BreezePSVector mix in the same trait
• BreezePSVector can therefore transparently replace BreezeVector
[Figure: inside an Executor, a Task holds several BreezePSVectors, which reach the Angel PS through PSClient and PSAgent.]
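The idea of swapping a local vector for a parameter-server-backed one behind a shared interface can be sketched like this. Python has no Scala traits, so an abstract base class plays that role; `LocalVector`, `RemoteVector` and `FakeParameterStore` are hypothetical stand-ins, and only `dot` is modeled out of the methods listed on the slide.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class NumericOps(ABC):
    """Shared interface, playing the role of the NumericOps[T] trait."""

    @abstractmethod
    def dot(self, other: "NumericOps") -> float:
        ...


class LocalVector(NumericOps):
    """Values live in the task's own memory."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def dot(self, other: "LocalVector") -> float:
        return sum(a * b for a, b in zip(self.values, other.values))


class FakeParameterStore:
    """In-process stand-in for a parameter server."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}
        self.server_side_calls = 0

    def put(self, key: str, values: List[float]) -> None:
        self.vectors[key] = list(values)

    def dot(self, key_a: str, key_b: str) -> float:
        self.server_side_calls += 1  # the work happens where the data lives
        a, b = self.vectors[key_a], self.vectors[key_b]
        return sum(x * y for x, y in zip(a, b))


class RemoteVector(NumericOps):
    """Holds only a key; every operation is delegated to the store."""

    def __init__(self, store: FakeParameterStore, key: str) -> None:
        self.store, self.key = store, key

    def dot(self, other: "RemoteVector") -> float:
        return self.store.dot(self.key, other.key)


def squared_norm(v: NumericOps) -> float:
    """Algorithm code written once against the shared interface."""
    return v.dot(v)


store = FakeParameterStore()
store.put("w", [3.0, 4.0])

local_norm = squared_norm(LocalVector([3.0, 4.0]))
remote_norm = squared_norm(RemoteVector(store, "w"))
print(local_norm, remote_norm, store.server_side_calls)
```

`squared_norm` never learns which kind of vector it received, which is the sense in which the replacement is transparent.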
![Page 24](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/24.jpg)
![Page 25](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/25.jpg)
Angel's algorithms
Spark on Angel: Available
![Page 26](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/26.jpg)
LR on Angel
[Figure: HDFS ×3, Worker ×3, PS ×4, with steps numbered 0, 1 and 2.]
• Pull parameters from PS
• Push update value to PS
![Page 27](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/27.jpg)
[Spark on Angel] LR
[spark_on_angel_quick_start.md]
Angel: wPS, gradientPS {BreezeOps}
Spark: sampleRDD, mapPartitions
DenseVector, Array
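The slide's labels suggest a loop in which each partition of sampleRDD pulls the weights (wPS), computes a local gradient, and adds it into a shared gradient vector (gradientPS). Below is a single-process sketch of that loop under those assumptions. Plain Python lists stand in for the PS vectors and for RDD partitions; the data, learning rate and epoch count are made up; none of this is the Spark on Angel API.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (features, label in {0, 1})


def predict(w: List[float], x: List[float]) -> float:
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def partition_gradient(w: List[float], partition: List[Sample]) -> List[float]:
    """Sum of per-sample logistic-loss gradients over one partition."""
    grad = [0.0] * len(w)
    for x, y in partition:
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad


def log_loss(w: List[float], partitions: List[List[Sample]]) -> float:
    total, n = 0.0, 0
    for partition in partitions:
        for x, y in partition:
            p = predict(w, x)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            n += 1
    return total / n


def train(partitions: List[List[Sample]], dim: int,
          lr: float = 0.5, epochs: int = 20) -> List[float]:
    w_ps = [0.0] * dim                     # stands in for wPS
    n = sum(len(p) for p in partitions)
    for _ in range(epochs):
        gradient_ps = [0.0] * dim          # stands in for gradientPS
        for partition in partitions:       # stands in for sampleRDD.mapPartitions
            w_local = list(w_ps)           # pull
            grad = partition_gradient(w_local, partition)
            for j, gj in enumerate(grad):  # push: increments add up on the PS
                gradient_ps[j] += gj
        for j in range(dim):               # gradient step applied to the weights
            w_ps[j] -= lr * gradient_ps[j] / n
    return w_ps


# Tiny made-up dataset: first feature is a constant bias term.
partitions = [
    [([1.0, 2.0], 1), ([1.0, -1.0], 0)],
    [([1.0, 1.5], 1), ([1.0, -2.0], 0)],
]

initial_loss = log_loss([0.0, 0.0], partitions)
w = train(partitions, dim=2)
final_loss = log_loss(w, partitions)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```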
![Page 28](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/28.jpg)
Optimization methods
![Page 29](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/29.jpg)
[Spark on Angel] LR with Optimizer
Angel: wPS, statePS
Spark: sampleRDD, mapPartitions
DenseVector
Breeze.optimizer: SGD, OWLQN, LBFGS
DiffFunction(BreezePSVector) : (Double, BreezePSVector)
[spark_on_angel_optimizer.md]
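The DiffFunction signature on this slide maps a weight vector to a (loss, gradient) pair, which lets an optimizer stay unaware of where the vector lives. The sketch below illustrates that contract with plain gradient descent swapped in for SGD, OWLQN or LBFGS, and with a made-up quadratic objective instead of logistic regression. All names are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
DiffFunction = Callable[[Vector], Tuple[float, Vector]]  # w -> (loss, gradient)


def gradient_descent(f: DiffFunction, w0: Vector,
                     lr: float, iters: int) -> Tuple[Vector, float]:
    """Minimize f using only its (loss, gradient) contract."""
    w = list(w0)
    for _ in range(iters):
        _, grad = f(w)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    loss, _ = f(w)
    return w, loss


def make_quadratic(target: Vector) -> DiffFunction:
    """f(w) = sum_i (w_i - target_i)^2, with gradient 2 * (w - target)."""
    def f(w: Vector) -> Tuple[float, Vector]:
        diffs = [wi - ti for wi, ti in zip(w, target)]
        return sum(d * d for d in diffs), [2.0 * d for d in diffs]
    return f


w_opt, final_loss = gradient_descent(make_quadratic([1.0, -2.0]),
                                     w0=[0.0, 0.0], lr=0.25, iters=60)
print(w_opt, final_loss)
```

Because the optimizer only calls `f(w)`, replacing the list-based vector with a parameter-server-backed type would not change `gradient_descent` itself, provided that type supports the same arithmetic.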
![Page 30](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/30.jpg)
GBDT: Tree Model + Boosting
[Figure: two decision trees (tree 1, tree 2) with split conditions Age<30, Wage<10K and IsMale?, each with Y/N branches. An instance's prediction is the sum of its leaf values in tree 1 and tree 2.]
predict( ) 5+0.5=5.5
predict( ) 10+1.5=11.5
predict( ) 1+1.5=2.5
predict( ) 1+0.5=1.5
predict( ) 1+1.5=2.5
Instances: A, B, C, D, E
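The sums on this slide follow the usual boosting rule: the prediction is the sum of one leaf value per tree. In the sketch below the leaf values (5, 10, 1 and 0.5, 1.5) come from the slide's sums, but the assignment of split conditions to trees and every attribute of instances A to E are hypothetical, since the original figure is not recoverable from the transcript.

```python
from typing import Callable, Dict, List

Instance = Dict[str, float]
Tree = Callable[[Instance], float]


def tree_1(p: Instance) -> float:
    # Hypothetical structure: Age<30 at the root, then IsMale?
    if p["age"] < 30:
        return 10.0 if p["is_male"] else 5.0
    return 1.0


def tree_2(p: Instance) -> float:
    # Hypothetical structure: a single split on Wage<10K
    return 0.5 if p["wage"] < 10_000 else 1.5


def predict(trees: List[Tree], p: Instance) -> float:
    """Boosted prediction: add up the leaf value from every tree."""
    return sum(tree(p) for tree in trees)


# Made-up instances chosen so the sums match the slide.
people: Dict[str, Instance] = {
    "A": {"age": 25, "is_male": 0, "wage": 8_000},
    "B": {"age": 22, "is_male": 1, "wage": 12_000},
    "C": {"age": 40, "is_male": 1, "wage": 15_000},
    "D": {"age": 55, "is_male": 0, "wage": 6_000},
    "E": {"age": 35, "is_male": 0, "wage": 20_000},
}

predictions = {name: predict([tree_1, tree_2], p) for name, p in people.items()}
print(predictions)
```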
![Page 31](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/31.jpg)
GBDT on Angel: model storage
PS1, PS2, PS3: each stores feature ID, feature value, leaf prediction
Also shown: grad histogram, hess histogram
![Page 32](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/32.jpg)
GBDT on Angel (1): building the forest
PS1 PS2 PS3
Worker1 Worker2 Worker3
![Page 33](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/33.jpg)
GBDT on Angel (2): splitting tree nodes
find split feature & value
[gbdt_on_angel.md]
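A sketch of how a split could be chosen from merged gradient and hessian histograms. The slides name the histograms and the "find split feature & value" step but give no formula, so the gain below is the second-order gain commonly used in gradient boosting, stated here as an assumption. The histogram values, the regularizer `lam` and the function names are made up.

```python
from typing import List, Optional, Tuple


def merge_histograms(worker_hists: List[List[float]]) -> List[float]:
    """Element-wise sum of per-worker histograms, as additive pushes to a PS would give."""
    return [sum(bins) for bins in zip(*worker_hists)]


def best_split(grad_hist: List[float], hess_hist: List[float],
               lam: float = 1.0) -> Tuple[Optional[int], float]:
    """Scan bin boundaries of one feature and return (best_bin, best_gain).

    Splitting after bin b sends bins 0..b left and the rest right.
    gain = GL^2/(HL+lam) + GR^2/(HR+lam) - G^2/(H+lam)
    """
    g_total, h_total = sum(grad_hist), sum(hess_hist)
    parent = g_total * g_total / (h_total + lam)
    best_bin, best_gain = None, 0.0
    g_left = h_left = 0.0
    for b in range(len(grad_hist) - 1):
        g_left += grad_hist[b]
        h_left += hess_hist[b]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = (g_left * g_left / (h_left + lam)
                + g_right * g_right / (h_right + lam)
                - parent)
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


# Two workers, one feature bucketed into four bins (made-up numbers).
grad = merge_histograms([[-2.0, -1.0, 1.0, 2.0], [-2.0, -1.0, 1.0, 2.0]])
hess = merge_histograms([[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]])

split_bin, split_gain = best_split(grad, hess)
print("split after bin", split_bin, "with gain", round(split_gain, 3))
```

Only the histograms need to be shared between workers, which is why keeping them on the parameter server fits this step.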
![Page 34](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/34.jpg)
[Spark on Angel] GBDT
Spark: Instance RDD zip Gradient RDD zip Prediction RDD, then map
Angel (PS): InstanceLayout, Grad Histogram, SplitFeature, SplitValue, LeafWeight
[spark_on_angel_gbdt.md]
![Page 35](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/35.jpg)
![Page 36](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/36.jpg)
(Spark on Angel) vs Spark: LR
![Page 37](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/37.jpg)
Angel vs XGBoost: GBDT
![Page 38](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/38.jpg)
Angel vs Spark: LDA
![Page 39](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/39.jpg)
Angel vs Spark: GD-LR
![Page 40](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/40.jpg)
Angel vs Spark: ADMM-LR
![Page 41](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/41.jpg)
Features of Spark on Angel
![Page 42](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/42.jpg)
OpenSource & Perspective
![Page 43](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/43.jpg)
Angel open source
• [GBDT] The purposes of using parameter server in GBDT #7
(PR 60)
![Page 44](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/44.jpg)
Academic innovation
• Papers at top international conferences (CCF Class A)
![Page 45](https://reader034.fdocuments.co/reader034/viewer/2022042807/5f79b21a23d375299c2214bd/html5/thumbnails/45.jpg)
Version roadmap (What is Next)
V1.3, V1.5, V2.0