Monitor Dashboard
Cluster
File System
Monitor
Bigdata Monitor
Spark Monitor
Hadoop Monitor
DeepLearning Monitor
DeepLearning Monitor
Application List
Application 1
Cluster Master: IP 1
Start Time: 2016/07/09, 11:22:33
End Time: 2016/07/09, 11:22:33
Iteration Number: 4
Static Parameters
Solver Parameters
net: "lenet_memory_train_test.prototxt"
test_iter: 1
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.001
power: 0.75
display: 10
max_iter: 10000
snapshot: 5000
snapshot_prefix: "mnist_lenet"
solver_mode: GPU
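The solver settings above select Caffe's "inv" learning rate policy, which decays the rate as base_lr * (1 + gamma * iter) ^ (-power). The following is a minimal sketch of that formula using the values shown on this page; the function name `inv_learning_rate` and the sampled iterations are illustrative and do not come from the dashboard.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.001, power=0.75):
    """Learning rate under the "inv" policy.

    Defaults mirror the solver parameters listed above
    (base_lr, gamma, power); the function name is hypothetical.
    """
    return base_lr * (1.0 + gamma * iteration) ** (-power)


if __name__ == "__main__":
    # Sample a few iterations up to max_iter (10000 in the listing above).
    for it in (0, 500, 5000, 10000):
        print(f"iter {it:>5}: lr = {inv_learning_rate(it):.6f}")
```

At iteration 0 the rate equals base_lr, and it decreases monotonically from there, so later iterations take smaller steps.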
Train Test Parameters
L1: data
L2: data
L3: conv1
L4: pool1
L5: conv2
L6: pool2
L7: ip1
L8: relu1
L9: ip2
L10: accuracy
L11: loss
Algorithm’s view and Quality Optimization
Time
Time Estimation: 100%, 1min9s
Iteration Elapsed: 10%, 1min9s
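The page does not say how the time estimate is derived. One simple possibility, shown here purely as an assumption, is linear extrapolation from the fraction of iterations completed and the time elapsed so far. The helper names `estimate_remaining_seconds` and `format_duration` are hypothetical, and the duration format only imitates the "1min9s" style shown above.

```python
def estimate_remaining_seconds(elapsed_seconds, fraction_done):
    """Linearly extrapolate the remaining run time.

    Assumes every iteration costs about the same, which is an
    assumption of this sketch, not something the dashboard states.
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_seconds = elapsed_seconds / fraction_done
    return total_seconds - elapsed_seconds


def format_duration(seconds):
    """Render seconds in the dashboard's "1min9s" style."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}min{secs}s"


if __name__ == "__main__":
    # 10% of iterations done after 1min9s (69 seconds), as in the panel above.
    remaining = estimate_remaining_seconds(69, 0.10)
    print("estimated remaining:", format_duration(remaining))
```

A real estimator would likely smooth over recent iterations, since test passes and snapshots make some iterations slower than others.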
Problem Analysis and Suggestions
Name: Optimize, Suggestion
Name: Optimize
Name: Optimize, Suggestion
Name: Optimize, Suggestion
System’s view and Performance Optimization
Communication Overhead: Optimize, Suggestion
GPU Utilization: Optimize, Suggestion
CPU Utilization: Optimize, Suggestion
CPU/GPU Memory over All Nodes: Optimize
Disk I/O: Optimize
Network I/O: Optimize
Disk: Optimize