10 Mar 2024 • Debolena Basak, P. K. Srijith, Maunendra Sankar Desarkar
We propose TICOD, a Transformer-based Image Captioning and Object Detection model that trains both tasks jointly by combining the losses from the image captioning and object detection networks.
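As a rough illustration of what combining the two losses could look like, the sketch below merges a captioning loss and a detection loss into a single training objective. The weighted-sum form, the function name `joint_loss`, and the weight `lam` are assumptions made for this example only; the text above says that the losses are combined but does not state how.

```python
def joint_loss(caption_loss: float, detection_loss: float, lam: float = 1.0) -> float:
    """Combine two per-batch task losses into one scalar objective.

    Assumption: a weighted sum with a hypothetical weight `lam`.
    The source only states that the two losses are combined.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    # Minimizing the combined scalar updates parameters for both tasks at once.
    return caption_loss + lam * detection_loss


# Example: with lam = 1.0 both tasks contribute equally.
total = joint_loss(caption_loss=2.0, detection_loss=0.5)
```

Under this assumed form, a larger `lam` shifts training toward object detection and a smaller one toward captioning.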