no code implementations • 5 Jan 2024 • Parvin Malekzadeh, Ming Hou, Konstantinos N. Plataniotis
In this paper, we propose an algorithm that clarifies the theoretical connection between aleatory and epistemic uncertainty, unifies aleatory and epistemic uncertainty estimation, and quantifies the combined effect of both uncertainties for risk-sensitive exploration.
1 code implementation • 4 Jan 2024 • Parvin Malekzadeh, Konstantinos N. Plataniotis, Zissis Poulos, Zeyu Wang
Distributional Reinforcement Learning (RL) estimates the return distribution mainly by learning quantile values through minimizing the quantile Huber loss function. This loss entails a threshold parameter that is often selected heuristically or via hyperparameter search, which may not generalize well and can be suboptimal.
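For readers unfamiliar with the loss mentioned above, the following is a minimal sketch of the standard quantile Huber loss as commonly used in quantile-based distributional RL. It is not taken from the paper's code. The function names, the convention that the error is `target - prediction`, and the division by the threshold `kappa` are assumptions of this illustration.

```python
def huber(u, kappa):
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    a = abs(u)
    if a <= kappa:
        return 0.5 * u * u
    return kappa * (a - 0.5 * kappa)


def quantile_huber_loss(td_errors, taus, kappa=1.0):
    """Mean quantile Huber loss over paired errors and quantile levels.

    td_errors: iterable of errors u = target - predicted quantile value
               (sign convention assumed for this sketch).
    taus:      iterable of quantile levels in (0, 1), one per error.
    kappa:     threshold parameter; must be positive here.
    """
    td_errors = list(td_errors)
    taus = list(taus)
    if kappa <= 0:
        raise ValueError("kappa must be positive in this sketch")
    if len(td_errors) != len(taus) or not td_errors:
        raise ValueError("need equally sized, non-empty inputs")

    total = 0.0
    for u, tau in zip(td_errors, taus):
        # Asymmetric weight: tau for non-negative errors (under-estimates),
        # 1 - tau for negative errors (over-estimates).
        weight = abs(tau - (1.0 if u < 0 else 0.0))
        total += weight * huber(u, kappa) / kappa
    return total / len(td_errors)


if __name__ == "__main__":
    errors = [0.5, -2.0]
    levels = [0.25, 0.75]
    # The same errors give different losses for different thresholds.
    print(quantile_huber_loss(errors, levels, kappa=1.0))
    print(quantile_huber_loss(errors, levels, kappa=0.1))
```

Running the script with two different values of `kappa` shows that the loss, and hence the gradient signal, depends on the threshold, which is why its selection matters.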
no code implementations • 16 Oct 2023 • Parvin Malekzadeh, Ming Hou, Konstantinos N. Plataniotis
Combining two ideas, the hybrid model-based successor feature (MB-SF) and uncertainty, yields an approach to sample-efficient, uncertainty-aware knowledge transfer across tasks with different transition dynamics and/or reward functions.
no code implementations • 15 Dec 2022 • Parvin Malekzadeh, Konstantinos N. Plataniotis
Despite this exploratory behaviour of active inference (AIF), its usage is limited to discrete spaces due to the computational challenges associated with the expected free energy (EFE).
no code implementations • 31 Mar 2022 • Parvin Malekzadeh, Mohammad Salimibeni, Ming Hou, Arash Mohammadi, Konstantinos N. Plataniotis
Recent studies in neuroscience suggest that Successor Representation (SR)-based models adapt to changes in goal locations or the reward function faster than model-free algorithms, and at lower computational cost than model-based algorithms.
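The adaptation property described above can be illustrated with a minimal tabular sketch. This is a textbook-style example, not the method of the paper. The three-state chain, the discount factor, and the function names are all hypothetical. Under a fixed policy with transition matrix P, the SR is M = I + γP + γ²P² + ..., and state values follow as V = M r, so a new reward vector only requires a new matrix-vector product.

```python
def successor_representation(P, gamma, iters=500):
    """Approximate M = (I - gamma * P)^(-1) by fixed-point iteration.

    P is a row-stochastic transition matrix (list of lists) under a
    fixed policy; the update M <- I + gamma * P @ M converges for
    0 <= gamma < 1.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iters):
        M = [
            [
                identity[i][j]
                + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def state_values(M, rewards):
    """Values V = M @ r for a per-state reward vector r."""
    return [sum(m * r for m, r in zip(row, rewards)) for row in M]


if __name__ == "__main__":
    # Hypothetical chain: state 0 -> 1 -> 2, with state 2 absorbing.
    P = [
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0],
    ]
    M = successor_representation(P, gamma=0.5)

    # Reward at the original goal (state 2).
    print(state_values(M, [0.0, 0.0, 1.0]))
    # Goal moved to state 1: M is reused, only the reward vector changes.
    print(state_values(M, [0.0, 1.0, 0.0]))
```

The design point is that `M` depends only on the dynamics and the policy, so a change in reward does not require relearning it.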
no code implementations • 30 Dec 2021 • Mohammad Salimibeni, Arash Mohammadi, Parvin Malekzadeh, Konstantinos N. Plataniotis
The proposed MAK-TD/SR frameworks account for the continuous nature of the action space associated with high-dimensional multi-agent environments and exploit Kalman Temporal Difference (KTD) to address parameter uncertainty.
1 code implementation • 30 May 2020 • Parvin Malekzadeh, Mohammad Salimibeni, Arash Mohammadi, Akbar Assa, Konstantinos N. Plataniotis
As a result, the proposed MM-KTD framework can learn the optimal policy with a significantly reduced number of samples compared to its DNN-based counterparts.