Preference Conditioned Neural Multi-objective Combinatorial Optimization
Multiobjective combinatorial optimization (MOCO) problems arise in many real-world applications. However, solving these problems exactly is very challenging, particularly when they are NP-hard. Many handcrafted heuristic methods have been proposed to tackle different MOCO problems over the past decades. In this work, we generalize the idea of neural combinatorial optimization and develop a learning-based approach to approximate the whole Pareto set for a given MOCO problem without any further search procedure. Concretely, we propose a single preference-conditioned attention model that directly generates approximate Pareto solutions for all the different trade-offs. We design an efficient multiobjective reinforcement learning algorithm to train the model on different preferences simultaneously. Experimental results show that our proposed method significantly outperforms other methods on the multiobjective traveling salesman problem (MOTSP), the multiobjective vehicle routing problem (MOVRP), and the multiobjective knapsack problem (MOKP) in solution quality, speed, and model efficiency.
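The core idea of conditioning a single model on a preference vector can be illustrated with a minimal sketch. The snippet below is only an assumption-laden illustration, not the paper's implementation: it samples a preference vector from the probability simplex and collapses a multi-objective cost vector into a single scalar via weighted-sum scalarization (one common choice; the helper names `sample_preference` and `scalarize` are hypothetical), which a reinforcement learning algorithm could then optimize.

```python
import random

def sample_preference(m):
    """Sample a preference vector uniformly from the (m-1)-simplex.
    Hypothetical helper: the model would be conditioned on such vectors
    during training so one network covers all trade-offs."""
    draws = sorted(random.random() for _ in range(m - 1))
    points = [0.0] + draws + [1.0]
    return [points[i + 1] - points[i] for i in range(m)]

def scalarize(objectives, preference):
    """Weighted-sum scalarization: collapse an m-objective cost vector
    into one scalar reward signal for the RL training step."""
    return sum(w * f for w, f in zip(objectives, preference))

# Usage: a 2-objective tour cost under an even 50/50 trade-off.
costs = [10.0, 4.0]   # e.g. two edge-cost objectives of a MOTSP tour
pref = [0.5, 0.5]
print(scalarize(costs, pref))  # -> 7.0
```

Sweeping the preference vector over the simplex at inference time then yields one approximate Pareto solution per preference, with no per-instance search.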