1 code implementation • EMNLP 2021 • Subhajit Chaudhury, Prithviraj Sen, Masaki Ono, Daiki Kimura, Michiaki Tatsubori, Asim Munawar
We outline a method for end-to-end differentiable symbolic rule learning and show that such symbolic policies outperform previous state-of-the-art methods in text-based RL for the coin collector environment from 5-10x fewer training games.
1 code implementation • 7 May 2024 • Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda
Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously.
no code implementations • 27 Feb 2024 • Keshav Ramji, Young-suk Lee, Ramón Fernandez Astudillo, Md Arafat Sultan, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos
It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response.
1 code implementation • 23 Feb 2024 • Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras
There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks.
no code implementations • 4 Feb 2024 • Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernandez Astudillo
Following the success of Proximal Policy Optimization (PPO) for Reinforcement Learning from Human Feedback (RLHF), new techniques such as Sequence Likelihood Calibration (SLiC) and Direct Policy Optimization (DPO) have been proposed that are offline in nature and use rewards in an indirect manner.
no code implementations • 31 Aug 2023 • Hina Raja, Asim Munawar, Mohammad Delsoz, Mohammad Elahi, Yeganeh Madadi, Amr Hassan, Hashem Abu Serhan, Onur Inam, Luis Hermandez, Sang Tran, Wuqas Munir, Alaa Abd-Alrazaq, Hao Chen, SiamakYousefi
Moreover, the extendibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.
1 code implementation • 5 Jul 2023 • Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, Alexander Gray
Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games.
no code implementations • EMNLP 2021 • Daiki Kimura, Masaki Ono, Subhajit Chaudhury, Ryosuke Kohita, Akifumi Wachi, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, Alexander Gray
Deep reinforcement learning (RL) methods often require many trials before convergence, and no direct interpretability of trained policies is provided.
1 code implementation • ACL 2021 • Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray
We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games.
no code implementations • 3 Mar 2021 • Daiki Kimura, Subhajit Chaudhury, Akifumi Wachi, Ryosuke Kohita, Asim Munawar, Michiaki Tatsubori, Alexander Gray
Specifically, we propose an integrated method that enables model-free reinforcement learning from external knowledge sources in an LNNs-based logical constrained framework such as action shielding and guide.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Phongtharin Vinayavekhin, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana
Moreover, learning for different microphone array layouts makes the task more complicated due to the infinite number of possible layouts.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante
In this paper, we propose novel deep learning based algorithms for multiple sound source localization.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Phongtharin Vinayavekhin
This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources.
1 code implementation • EMNLP 2020 • Subhajit Chaudhury, Daiki Kimura, Kartik Talamadupula, Michiaki Tatsubori, Asim Munawar, Ryuki Tachibana
Our bootstrapped agent shows improved generalization in solving unseen TextWorld games, using 10x-20x fewer training games compared to previous state-of-the-art methods despite requiring less number of training episodes.
no code implementations • 19 Feb 2020 • Subhajit Chaudhury, Daiki Kimura, Phongtharin Vinayavekhin, Asim Munawar, Ryuki Tachibana, Koji Ito, Yuki Inaba, Minoru Matsumoto, Shuji Kidokoro, Hiroki Ozaki
In this paper, we study the case of event detection in sports videos for unstructured environments with arbitrary camera angles.
no code implementations • 17 Dec 2019 • Michiaki Tatsubori, Asim Munawar, Takao Moriyama
Planning is a critical component of any artificial intelligence system that concerns the realization of strategies or action sequences typically for intelligent agents and autonomous robots.
no code implementations • 23 Mar 2019 • Daiki Kimura, Minori Narita, Asim Munawar, Ryuki Tachibana
Visual anomaly detection is common in several applications including medical screening and production quality check.
no code implementations • 2 Oct 2018 • Subhajit Chaudhury, Daiki Kimura, Asim Munawar, Ryuki Tachibana
Experimental results show that the proposed adversarial learning method from raw videos produces a similar performance to state-of-the-art imitation learning techniques while frequently outperforming existing hand-crafted video imitation methods.
1 code implementation • 21 Sep 2018 • Tu-Hoa Pham, Giovanni De Magistris, Don Joven Agravante, Subhajit Chaudhury, Asim Munawar, Ryuki Tachibana
We consider the problem of reinforcement learning under safety requirements, in which an agent is trained to complete a given task, typically formalized as the maximization of a reward signal over time, while concurrently avoiding undesirable actions or states, associated to lower rewards, or penalties.
no code implementations • 12 Sep 2018 • Akifumi Wachi, Hiroshi Kajino, Asim Munawar
This paper presents a learning algorithm called ST-SafeMDP for exploring Markov decision processes (MDPs) that is based on the assumption that the safety features are a priori unknown and time-variant.
1 code implementation • 21 Aug 2018 • Takao Moriyama, Giovanni De Magistris, Michiaki Tatsubori, Tu-Hoa Pham, Asim Munawar, Ryuki Tachibana
Common approaches to control a data-center cooling system rely on approximated system/environment models that are built upon the knowledge of mechanical cooling and electrical and thermal management.
Systems and Control
no code implementations • 22 Jun 2018 • Phongtharin Vinayavekhin, Subhajit Chaudhury, Asim Munawar, Don Joven Agravante, Giovanni De Magistris, Daiki Kimura, Ryuki Tachibana
This paper is a contribution towards interpretability of the deep learning models in different applications of time-series.
no code implementations • 16 Aug 2017 • Asim Munawar, Phongtharin Vinayavekhin, Giovanni De Magistris
In the results section we demonstrate the features of the algorithm using MNIST handwritten digit dataset and latter apply the technique to a real-world obstacle detection problem.
no code implementations • 14 Aug 2017 • Tadanobu Inoue, Giovanni De Magistris, Asim Munawar, Tsuyoshi Yokoya, Ryuki Tachibana
High precision assembly of mechanical parts requires accuracy exceeding the robot precision.
no code implementations • 4 Jul 2017 • Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan, Ryuki Tachibana
We present a conditional generative model that maps low-dimensional embeddings of multiple modalities of data to a common latent space hence extracting semantic relationships between them.