Search Results for author: David M. Bossens

Found 10 papers, 5 papers with code

Robust Lagrangian and Adversarial Policy Gradient for Robust Constrained Markov Decision Processes

no code implementations • 22 Aug 2023 • David M. Bossens

Adversarial RCPG also formulates the worst-case dynamics based on the Lagrangian, but learns it directly and incrementally as an adversarial policy through gradient descent, rather than indirectly and abruptly through constrained optimisation on a sorted value list.

Incremental Learning
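
The adversarial update described above can be pictured with a minimal sketch: an adversary chooses among a set of candidate transition models via a softmax policy and follows REINFORCE-style gradient ascent on the negated Lagrangian. Everything here (the candidate models, `lagrangian_value`, step size `eta`) is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models = 4                        # candidate worst-case transition models
theta = np.zeros(n_models)          # adversary's softmax policy parameters
eta = 0.1                           # adversary step size (assumed)

def lagrangian_value(model_idx, lam=1.0, budget=0.5):
    """Placeholder: V_reward - lam * (V_cost - budget) under model_idx."""
    v_reward = [1.0, 0.8, 0.6, 0.9][model_idx]
    v_cost = [0.4, 0.7, 0.9, 0.5][model_idx]
    return v_reward - lam * (v_cost - budget)

for step in range(200):
    probs = np.exp(theta - theta.max()); probs /= probs.sum()
    k = rng.choice(n_models, p=probs)        # adversary samples a model
    L = lagrangian_value(k)
    grad_logp = -probs; grad_logp[k] += 1.0  # d log pi(k) / d theta
    theta += eta * (-L) * grad_logp          # ascend on -L: seek worst case
```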

Low Variance Off-policy Evaluation with State-based Importance Sampling

1 code implementation • 7 Dec 2022 • David M. Bossens, Philip S. Thomas

In off-policy reinforcement learning, a behaviour policy performs exploratory interactions with the environment to obtain state-action-reward samples which are then used to learn a target policy that optimises the expected return.

Density Ratio Estimation • Off-policy Evaluation
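
The contrast between ordinary per-decision importance sampling and a state-based variant can be sketched as below; which states to skip is an illustrative assumption here (the paper derives principled criteria), and γ = 1 is assumed for brevity.

```python
def is_return(trajectory, pi, b, skip_states=frozenset()):
    """Per-decision importance-sampled return (gamma = 1 for brevity).

    trajectory: list of (state, action, reward) tuples
    pi, b: target/behaviour policies as pi[s][a] action probabilities
    skip_states: states whose likelihood ratio is dropped (state-based IS);
                 pass the empty set to recover ordinary importance sampling
    """
    w, g = 1.0, 0.0
    for s, a, r in trajectory:
        if s not in skip_states:
            w *= pi[s][a] / b[s][a]       # accumulate likelihood ratios
        g += w * r                        # reward weighted by ratios so far
    return g
```

Dropping ratios at states that do not influence the return shrinks the variance of the cumulative weight; making that selection rigorous is, broadly, the problem the paper addresses.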

Trust in Language Grounding: a new AI challenge for human-robot teams

no code implementations • 5 Sep 2022 • David M. Bossens, Christine Evers

The challenge of language grounding is to fully understand natural language by grounding language in real-world referents.

Resilient robot teams: a review integrating decentralised control, change-detection, and learning

no code implementations • 21 Apr 2022 • David M. Bossens, Sarvapali Ramchurn, Danesh Tarapore

Purpose of review: This paper reviews opportunities and challenges for decentralised control, change-detection, and learning in the context of resilient robot teams.

Change Detection • Fault Detection • +2

Explicit Explore, Exploit, or Escape ($E^4$): near-optimal safety-constrained reinforcement learning in polynomial time

no code implementations • 14 Nov 2021 • David M. Bossens, Nicholas Bishop

Constrained Markov decision processes (CMDPs) can provide long-term safety constraints; however, the agent may violate the constraints in an effort to explore its environment.

Reinforcement Learning (RL)
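
A toy illustration of the safety constraint in a CMDP (not the $E^4$ algorithm itself): exploration would be permitted only while the empirical estimate of expected discounted cost stays within a budget `d`. The function name and default values are illustrative assumptions.

```python
def within_cost_budget(cost_episodes, d=1.0, gamma=0.99):
    """cost_episodes: list of per-episode cost sequences [c_0, c_1, ...].

    Returns True if the mean discounted cost satisfies the CMDP budget d.
    """
    discounted = [sum(gamma**t * c for t, c in enumerate(ep))
                  for ep in cost_episodes]
    return sum(discounted) / len(discounted) <= d
```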

Quality-Diversity Meta-Evolution: customising behaviour spaces to a meta-objective

1 code implementation • 8 Sep 2021 • David M. Bossens, Danesh Tarapore

To illuminate the elite solutions for a space of behaviours, QD algorithms require the definition of a suitable behaviour space.

Dimensionality Reduction
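
The role of a behaviour space is easiest to see through a MAP-Elites-style archive, a standard QD building block that meta-evolution approaches build on: each behaviour descriptor is discretised into a grid cell that keeps only its best ("elite") solution. The grid resolution and the assumption of descriptors in [0, 1]^k are illustrative.

```python
class Archive:
    """Minimal MAP-Elites-style grid archive (illustrative sketch)."""

    def __init__(self, bins=10):
        self.bins = bins
        self.cells = {}                   # cell index -> (fitness, solution)

    def insert(self, solution, descriptor, fitness):
        """descriptor: values in [0, 1]^k, discretised into a grid cell."""
        cell = tuple(min(int(d * self.bins), self.bins - 1)
                     for d in descriptor)
        if cell not in self.cells or fitness > self.cells[cell][0]:
            self.cells[cell] = (fitness, solution)   # keep the cell's elite
```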

Lifetime policy reuse and the importance of task capacity

1 code implementation • 3 Jun 2021 • David M. Bossens, Adam J. Sobey

A long-standing challenge in artificial intelligence is lifelong reinforcement learning, where learners are given many tasks in sequence and must transfer knowledge between tasks while avoiding catastrophic forgetting.

Reinforcement Learning
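
A minimal sketch of reusing a fixed-size policy library across tasks follows; the selection rule (epsilon-greedy over running per-task value estimates) is an illustrative assumption, not necessarily the paper's mechanism. Keeping the library smaller than the number of tasks is what makes a notion like task capacity meaningful.

```python
import random

class PolicyLibrary:
    """Fixed-size library of policies shared across tasks (sketch)."""

    def __init__(self, n_policies, n_tasks, eps=0.1):
        self.values = [[0.0] * n_policies for _ in range(n_tasks)]
        self.counts = [[0] * n_policies for _ in range(n_tasks)]
        self.eps = eps

    def select(self, task):
        """Epsilon-greedy choice of which policy to reuse for this task."""
        if random.random() < self.eps:
            return random.randrange(len(self.values[task]))
        row = self.values[task]
        return row.index(max(row))

    def update(self, task, policy, ret):
        """Incrementally average the observed return for (task, policy)."""
        self.counts[task][policy] += 1
        n = self.counts[task][policy]
        self.values[task][policy] += (ret - self.values[task][policy]) / n
```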

On the use of feature-maps and parameter control for improved quality-diversity meta-evolution

no code implementations • 21 May 2021 • David M. Bossens, Danesh Tarapore

In Quality-Diversity (QD) algorithms, which evolve a behaviourally diverse archive of high-performing solutions, the behaviour space is a difficult design choice that should be tailored to the target application.

Feature Selection • Reinforcement Learning • +1
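
One way to picture a feature-map is as a projection from raw behavioural features to a low-dimensional descriptor used for archive binning. The random linear map below is purely illustrative; the paper studies choosing and controlling such maps rather than fixing one.

```python
import numpy as np

rng = np.random.default_rng(1)
raw_dim, desc_dim = 12, 2                     # assumed dimensions
W = rng.standard_normal((desc_dim, raw_dim))  # illustrative random map

def feature_map(raw_features):
    """Map raw behavioural features to a descriptor in [0, 1]^2."""
    z = W @ raw_features                      # project to descriptor space
    return 1.0 / (1.0 + np.exp(-z))           # squash for grid binning
```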

Rapidly adapting robot swarms with Swarm Map-based Bayesian Optimisation

1 code implementation • 21 Dec 2020 • David M. Bossens, Danesh Tarapore

We also investigate disturbances in the swarm's operating environment, where the swarm has to adapt to drastic changes in the number of available resources and to one of the robots behaving disruptively towards the rest of the swarm, with 30 unique conditions for each such perturbation.

Bayesian Optimisation

QED: using Quality-Environment-Diversity to evolve resilient robot swarms

1 code implementation • 4 Mar 2020 • David M. Bossens, Danesh Tarapore

To allow fault recovery from randomly injected faults to different robots in a swarm, a model-free approach may be preferable due to the accumulation of faults in models and the difficulty of predicting the behaviour of neighbouring robots.
