Partially observable environments present an important open challenge in the domain of sequential control learning with delayed rewards. Despite numerous attempts during the last two decades, the majority of reinforcement learning algorithms and associated approximate models applied to this context still assume Markovian state transitions...
METHOD | TYPE | |
---|---|---|
Working Memory Models |