| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APS | Walker (mean normalized return) | 7.62±7.36 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APS | Quadruped (mean normalized return) | 20.73±5.13 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APS | Jaco (mean normalized return) | 0.75±0.82 | #8 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APS | Walker (mean normalized return) | 8.87±7.33 | #7 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APS | Quadruped (mean normalized return) | 21.26±5.30 | #6 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APS | Jaco (mean normalized return) | 0.87±1.01 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APS | Walker (mean normalized return) | 8.29±7.70 | #8 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APS | Quadruped (mean normalized return) | 19.27±12.35 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APS | Jaco (mean normalized return) | 1.73±1.86 | #8 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APS | Walker (mean normalized return) | 8.07±7.13 | #7 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APS | Quadruped (mean normalized return) | 22.74±5.44 | #6 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APS | Jaco (mean normalized return) | 0.38±0.43 | #8 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APS | Walker (mean normalized return) | 77.49±25.00 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APS | Quadruped (mean normalized return) | 65.51±10.79 | #1 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APS | Jaco (mean normalized return) | 81.53±6.54 | #1 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APS | Walker (mean normalized return) | 74.41±30.02 | #6 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APS | Quadruped (mean normalized return) | 56.81±12.66 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APS | Jaco (mean normalized return) | 50.04±3.76 | #6 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APS | Walker (mean normalized return) | 66.80±31.50 | #9 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APS | Quadruped (mean normalized return) | 64.97±9.15 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APS | Jaco (mean normalized return) | 41.39±5.93 | #7 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APS | Walker (mean normalized return) | 76.50±29.10 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APS | Quadruped (mean normalized return) | 59.76±14.86 | #2 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APS | Jaco (mean normalized return) | 61.34±6.64 | #4 |