Search Results for author: Andrew Gilbert

Found 31 papers, 8 papers with code

PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization

no code implementations • 27 Mar 2024 • Edward Fish, Jon Weinbren, Andrew Gilbert

This paper introduces a novel approach to temporal action localization (TAL) in few-shot learning.

Few-Shot Learning Few Shot Temporal Action Localization +1

A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN

no code implementations • 8 Mar 2024 • Cristiana Tiago, Andrew Gilbert, Ahmed S. Beela, Svein Arne Aase, Sten Roar Snare, Jurica Sprem

A quantitative analysis of the 3D segmentations given by the models trained with the synthetic images indicated the potential use of this GAN approach to generate 3D synthetic data, use the data to train DL models for different clinical tasks, and therefore tackle the problem of scarcity of 3D labeled echocardiography datasets.

Computed Tomography (CT) Data Augmentation +2

ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet

1 code implementation • 5 Dec 2023 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

This paper introduces ViscoNet, a novel method that enhances text-to-image human generation models with visual prompting.

Image Generation Visual Prompting

ZeST-NeRF: Using temporal aggregation for Zero-Shot Temporal NeRFs

no code implementations • 30 Nov 2023 • Violeta Menéndez González, Andrew Gilbert, Graeme Phillipson, Stephen Jolly, Simon Hadfield

Recent approaches have had great success at performing novel view image synthesis of static scenes.

Image Generation Video Editing

Multi-Resolution Audio-Visual Feature Fusion for Temporal Action Localization

no code implementations • 5 Oct 2023 • Edward Fish, Jon Weinbren, Andrew Gilbert

Temporal Action Localization (TAL) aims to identify actions' start, end, and class labels in untrimmed videos.

Temporal Action Localization

DECORAIT -- DECentralized Opt-in/out Registry for AI Training

no code implementations • 25 Sep 2023 • Kar Balan, Alex Black, Simon Jenni, Andrew Gilbert, Andy Parsons, John Collomosse

We report a prototype of DECORAIT, which explores hierarchical clustering and a combination of on/off-chain storage to create a scalable decentralized registry to trace the provenance of GenAI training data in order to determine training consent and reward creatives who contribute that data.

MOFO: MOtion FOcused Self-Supervision for Video Understanding

1 code implementation • 23 Aug 2023 • Mona Ahmadian, Frank Guerin, Andrew Gilbert

Despite the importance of motion in supervised learning techniques for action recognition, SSL methods often do not explicitly consider motion information in videos.

Action Classification Action Recognition +3

DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer

no code implementations • 9 Jul 2023 • Dan Ruta, Gemma Canet Tarrés, Andrew Gilbert, Eli Shechtman, Nicholas Kolkin, John Collomosse

Neural Style Transfer (NST) is the field of study applying neural techniques to modify the artistic appearance of a content image to match the style of a reference style image.

Image Generation Style Transfer

UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer

1 code implementation • 18 Apr 2023 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

Text-to-image models (T2I) such as StableDiffusion have been used to generate high quality images of people.

Ranked #1 on Pose Transfer on Deep-Fashion (FID metric)

Disentanglement Pose Transfer +2

ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer

no code implementations • 12 Apr 2023 • Dan Ruta, Gemma Canet Tarres, Alexander Black, Andrew Gilbert, John Collomosse

Representation learning aims to discover individual salient features of a domain in a compact and descriptive form that strongly identifies the unique characteristics of a given sample respective to its domain.

Descriptive Disentanglement +1

NeAT: Neural Artistic Tracing for Beautiful Style Transfer

1 code implementation • 11 Apr 2023 • Dan Ruta, Andrew Gilbert, John Collomosse, Eli Shechtman, Nicholas Kolkin

As a component of curating this data, we present a novel model able to classify if an image is stylistic.

Image Generation Style Transfer

EKILA: Synthetic Media Provenance and Attribution for Generative Art

no code implementations • 10 Apr 2023 • Kar Balan, Shruti Agarwal, Simon Jenni, Andy Parsons, Andrew Gilbert, John Collomosse

We present EKILA, a decentralized framework that enables creatives to receive recognition and reward for their contributions to generative AI (GenAI).

SVS: Adversarial refinement for sparse novel view synthesis

1 code implementation • 14 Nov 2022 • Violeta Menéndez González, Andrew Gilbert, Graeme Phillipson, Stephen Jolly, Simon Hadfield

This is a view synthesis problem where the number of reference views is limited, and the baseline between target and reference view is significant.

Novel View Synthesis

HyperNST: Hyper-Networks for Neural Style Transfer

no code implementations • 9 Aug 2022 • Dan Ruta, Andrew Gilbert, Saeid Motiian, Baldo Faieta, Zhe Lin, John Collomosse

We present HyperNST, a neural style transfer (NST) technique for the artistic stylization of images, based on Hyper-networks and the StyleGAN2 architecture.

Style Transfer

Two-Stream Transformer Architecture for Long Video Understanding

no code implementations • 2 Aug 2022 • Edward Fish, Jon Weinbren, Andrew Gilbert

Pure vision transformer architectures are highly effective for short video classification and action recognition tasks.

Action Recognition Inductive Bias +3

Light-weight spatio-temporal graphs for segmentation and ejection fraction prediction in cardiac ultrasound

1 code implementation • 6 Jul 2022 • Sarina Thomas, Andrew Gilbert, Guy Ben-Yosef

In particular, segmentations of the left ventricle can be used to derive ventricular volume, ejection fraction (EF) and other relevant measurements.

Ranked #2 on Echonet-Dynamic

LV Segmentation Segmentation +1

SaiNet: Stereo aware inpainting behind objects with generative networks

no code implementations • 14 May 2022 • Violeta Menéndez González, Andrew Gilbert, Graeme Phillipson, Stephen Jolly, Simon Hadfield

In this work, we present an end-to-end network for stereo-consistent image inpainting with the objective of inpainting large missing regions behind objects.

Image Inpainting

StyleBabel: Artistic Style Tagging and Captioning

no code implementations • 10 Mar 2022 • Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.

Attribute Representation Learning +2

KPE: Keypoint Pose Encoding for Transformer-based Image Generation

1 code implementation • 9 Mar 2022 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

Therefore, we propose a new method, Keypoint Pose Encoding (KPE), which is 10 times more memory efficient and over 73% faster at generating high quality images from text input conditioned on the pose.

Image Generation

Human-like Relational Models for Activity Recognition in Video

no code implementations • 12 Jul 2021 • Joseph Chrol-Cannon, Andrew Gilbert, Ranko Lazic, Adithya Madhusoodanan, Frank Guerin

We apply the method to a challenging subset of the something-something dataset and achieve a more robust performance against neural network baselines on challenging activities.

Activity Recognition

ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity

no code implementations • ICCV 2021 • Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse

We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style.

Representation Learning

Rethinking movie genre classification with fine-grained semantic clustering

no code implementations • 4 Dec 2020 • Edward Fish, Jon Weinbren, Andrew Gilbert

We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information within the multi-modal content of movies.

Classification Clustering +2

Neural Architecture Search for Deep Image Prior

2 code implementations • 14 Jan 2020 • Kary Ho, Andrew Gilbert, Hailin Jin, John Collomosse

We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP).

Decoder Image Restoration +2

Automated Left Ventricle Dimension Measurement in 2D Cardiac Ultrasound via an Anatomically Meaningful CNN Approach

no code implementations • 6 Nov 2019 • Andrew Gilbert, Marit Holden, Line Eikvil, Svein Arne Aase, Eigil Samset, Kristin McLeod

Treating the problem as a landmark detection problem, we propose a modified U-Net CNN architecture to generate heatmaps of likely coordinate locations.

Doppler Spectrum Classification with CNNs via Heatmap Location Encoding and a Multi-head Output Layer

no code implementations • 6 Nov 2019 • Andrew Gilbert, Marit Holden, Line Eikvil, Mariia Rakhmail, Aleksandar Babic, Svein Arne Aase, Eigil Samset, Kristin McLeod

We analyze example images that fall outside of our proposed classes to show our confidence metric can prevent many misclassifications.

Decision Making General Classification

Semantic Estimation of 3D Body Shape and Pose using Minimal Cameras

no code implementations • 8 Aug 2019 • Andrew Gilbert, Matthew Trumble, Adrian Hilton, John Collomosse

We aim to simultaneously estimate the 3D articulated pose and high fidelity volumetric occupancy of human performance, from multiple viewpoint video (MVV) with as few as two views.

Ranked #163 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation Decoder

Volumetric performance capture from minimal camera viewpoints

no code implementations • ECCV 2018 • Andrew Gilbert, Marco Volino, John Collomosse, Adrian Hilton

We present a convolutional autoencoder that enables high fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views.

Deep Autoencoder for Combined Human Pose Estimation and body Model Upscaling

no code implementations • ECCV 2018 • Matthew Trumble, Andrew Gilbert, Adrian Hilton, John Collomosse

We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views.

Ranked #9 on 3D Human Pose Estimation on Total Capture

3D Human Pose Estimation

Disentangling Structure and Aesthetics for Style-Aware Image Completion

no code implementations • CVPR 2018 • Andrew Gilbert, John Collomosse, Hailin Jin, Brian Price

Content-aware image completion or in-painting is a fundamental tool for the correction of defects or removal of objects in images.

Total capture: 3D human pose estimation fusing video and inertial sensors

no code implementations • BMVC 2017 • Matthew Trumble, Andrew Gilbert, Charles Malleson, Adrian Hilton, John Collomosse

We incorporate this model within a dual stream network integrating pose embeddings derived from MVV and a forward kinematic solve of the IMU data.

Ranked #11 on 3D Human Pose Estimation on Total Capture

3D Human Pose Estimation

Image and Video Mining through Online Learning

no code implementations • 9 Sep 2016 • Andrew Gilbert, Richard Bowden

On the UCF11 video dataset, the accuracy is 86.7% despite using only 90 labelled examples from a dataset of over 1200 videos, instead of the standard 1122 training videos.

Action Recognition Active Learning +3
