Symbolic regression is the process of identifying mathematical expressions that fit observed output from a black-box process.
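As a minimal illustration of the task described above, the sketch below scores a handful of candidate expressions against sampled outputs and keeps the best fit. The candidate set, the data, and the helper names (`CANDIDATES`, `best_expression`) are assumptions made for this example. Practical systems search a far larger expression space than a fixed list.

```python
import math

# Hypothetical candidate expressions; a real system would search a much larger space.
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x ** 2,
    "x**2 + x": lambda x: x ** 2 + x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}


def mean_squared_error(f, xs, ys):
    """Average squared difference between f(x) and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def best_expression(xs, ys):
    """Return the name of the candidate with the lowest error on the data."""
    return min(
        CANDIDATES,
        key=lambda name: mean_squared_error(CANDIDATES[name], xs, ys),
    )


# Stand-in for observations from a black-box process (assumed for illustration).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [x ** 2 + x for x in xs]

best = best_expression(xs, ys)
print(best)
```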
In this work, we propose a general hybrid approach, based on deep reinforcement learning (DRL) and constraint programming (CP), for solving combinatorial optimization problems.
Our goal is to learn a semantic parser that maps natural language utterances into executable programs when only indirect supervision is available: examples are labeled with the correct execution result, but not the program itself.
This paper presents a proof of concept for SeaPearl, a new constraint programming (CP) solver implemented in Julia that supports machine learning routines in order to learn branching decisions using reinforcement learning.
Relative to a set of randomly generated problem instances, agents trained through reinforcement learning techniques are capable of producing short quantum programs which generate high quality solutions on both types of quantum resources.
Harnessing the statistical power of neural networks to perform language understanding and symbolic reasoning is difficult when it requires executing efficient discrete operations against a large knowledge base.
In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of the policy gradient estimate.
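One way a buffer of promising trajectories can lower gradient variance is sketched below, loosely in the spirit of the sentence above and not as the authors' implementation. It assumes a toy softmax policy over a small, enumerable set of whole trajectories. The contribution of buffered trajectories is summed exactly, and only the remainder is estimated by sampling. All names (`buffer_split_gradient`, `rewards`, `buffer`) are hypothetical.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_prob(probs, a):
    """Gradient of log softmax probability of index a with respect to the logits."""
    return [(1.0 if i == a else 0.0) - p for i, p in enumerate(probs)]


def exact_gradient(logits, rewards):
    """Exact gradient of expected reward: sum_a pi(a) R(a) grad log pi(a)."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for a, r in enumerate(rewards):
        g = grad_log_prob(probs, a)
        for i in range(len(logits)):
            grad[i] += probs[a] * r * g[i]
    return grad


def buffer_split_gradient(logits, rewards, buffer, n_samples, rng):
    """Exact sum over buffered trajectories plus a sampled estimate for the rest."""
    probs = softmax(logits)
    n = len(logits)
    grad = [0.0] * n
    # Inside the buffer: enumerate, so this part contributes no sampling variance.
    for a in buffer:
        g = grad_log_prob(probs, a)
        for i in range(n):
            grad[i] += probs[a] * rewards[a] * g[i]
    # Outside the buffer: sample from the policy restricted to unbuffered
    # trajectories and weight by the probability mass left outside the buffer.
    pi_b = sum(probs[a] for a in buffer)
    outside = [a for a in range(n) if a not in buffer]
    if outside and n_samples > 0:
        weights = [probs[a] for a in outside]
        for a in rng.choices(outside, weights=weights, k=n_samples):
            g = grad_log_prob(probs, a)
            for i in range(n):
                grad[i] += (1.0 - pi_b) * rewards[a] * g[i] / n_samples
    return grad


# Toy setup (assumed): five trajectories, sparse reward, buffer holds the rewarded ones.
logits = [0.2, -0.1, 0.0, 0.5, -0.3]
rewards = [0.0, 0.0, 1.0, 0.0, 1.0]
buffer = {2, 4}

estimate = buffer_split_gradient(logits, rewards, buffer, n_samples=4, rng=random.Random(0))
exact = exact_gradient(logits, rewards)
```

In this toy setup every rewarded trajectory is already in the buffer, so the sampled part contributes nothing and the estimate matches the exact gradient regardless of the random seed.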
We compare NICE with (1) a baseline integer programming formulation that produces a feasible crew schedule, and (2) a robust integer programming formulation that explicitly tries to minimize the impact of disruptions.
We convert the DeepMind Mathematics Dataset into a reinforcement learning environment by interpreting it as a program synthesis problem.
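To make the idea of treating a question as a program synthesis episode concrete, here is a minimal sketch under strong assumptions. The action set (push an operand, apply a binary operator), the postfix encoding, and the class name `ToyMathEnv` are all hypothetical and are not taken from the paper. The agent emits one token per step and receives a reward only when the program evaluates to the known answer.

```python
import operator

# Hypothetical operator vocabulary for the toy environment.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


class ToyMathEnv:
    """One question per episode; actions build a postfix program on a stack."""

    def __init__(self, operands, answer, max_steps=5):
        self.operands = operands
        self.answer = answer
        self.max_steps = max_steps
        # Actions: push one of the question's operands, or apply an operator.
        self.actions = [f"arg{i}" for i in range(len(operands))] + list(OPS)
        self.reset()

    def reset(self):
        self.stack = []
        self.steps = 0
        return tuple(self.stack)

    def step(self, action):
        """Apply one token; return (observation, reward, done)."""
        self.steps += 1
        if action in OPS:
            if len(self.stack) < 2:
                # An ill-formed program ends the episode with no reward.
                return tuple(self.stack), 0.0, True
            b = self.stack.pop()
            a = self.stack.pop()
            self.stack.append(OPS[action](a, b))
        else:
            self.stack.append(self.operands[int(action[3:])])
        solved = len(self.stack) == 1 and self.stack[0] == self.answer
        done = solved or self.steps >= self.max_steps
        return tuple(self.stack), (1.0 if solved else 0.0), done


# Example question (assumed): "What is 2 + 3 * 4?" with answer 14.
env = ToyMathEnv(operands=(2, 3, 4), answer=14)
for token in ["arg0", "arg1", "arg2", "mul", "add"]:
    observation, reward, done = env.step(token)
```

The reward here depends only on the execution result, so several different programs can earn it for the same question.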