Search Results for author: Simon Jenni

Found 23 papers, 3 papers with code

Learning Video Representations by Transforming Time

no code implementations • ECCV 2020 • Simon Jenni, Givi Meishvili, Paolo Favaro

Our representations can be learned from data without human annotation and provide a substantial boost to the training of neural networks on small labeled data sets for tasks such as action recognition, which require accurately distinguishing the motion of objects.

Action Recognition Self-Supervised Learning

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

no code implementations • 23 Apr 2024 • Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo

To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.

Hallucination In-Context Learning +2

Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models

no code implementations • 5 Apr 2024 • Gihyun Kwon, Simon Jenni, Dingzeyu Li, Joon-Young Lee, Jong Chul Ye, Fabian Caba Heilbron

While there has been significant progress in customizing text-to-image generation models, generating images that combine multiple personalized concepts remains challenging.

Text-to-Image Generation

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

no code implementations • 20 Dec 2023 • Ishan Rajendrakumar Dave, Simon Jenni, Mubarak Shah

To address these issues, we propose 1) a more challenging reformulation of temporal self-supervision as frame-level (rather than clip-level) recognition tasks and 2) an effective augmentation strategy to mitigate shortcuts.

Action Classification Attribute +7

DECORAIT -- DECentralized Opt-in/out Registry for AI Training

no code implementations • 25 Sep 2023 • Kar Balan, Alex Black, Simon Jenni, Andrew Gilbert, Andy Parsons, John Collomosse

We report a prototype of DECORAIT, which explores hierarchical clustering and a combination of on/off-chain storage to create a scalable decentralized registry to trace the provenance of GenAI training data in order to determine training consent and reward creatives who contribute that data.

Meta-Personalizing Vision-Language Models to Find Named Instances in Video

1 code implementation • CVPR 2023 • Chun-Hsiao Yeh, Bryan Russell, Josef Sivic, Fabian Caba Heilbron, Simon Jenni

Large-scale vision-language models (VLM) have shown impressive results for language-guided search applications.

Retrieval Word Embeddings

EKILA: Synthetic Media Provenance and Attribution for Generative Art

no code implementations • 10 Apr 2023 • Kar Balan, Shruti Agarwal, Simon Jenni, Andy Parsons, Andrew Gilbert, John Collomosse

We present EKILA, a decentralized framework that enables creatives to receive recognition and reward for their contributions to generative AI (GenAI).

VADER: Video Alignment Differencing and Retrieval

no code implementations • ICCV 2023 • Alexander Black, Simon Jenni, Tu Bui, Md. Mehrab Tanjim, Stefano Petrangeli, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse

We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos.

Misinformation Retrieval +2

Audio-Visual Contrastive Learning with Temporal Self-Supervision

no code implementations • 15 Feb 2023 • Simon Jenni, Alexander Black, John Collomosse

We propose a self-supervised learning approach for videos that learns representations of both the RGB frames and the accompanying audio without human supervision.

Action Recognition Audio Classification +3

Spatio-Temporal Crop Aggregation for Video Representation Learning

no code implementations • ICCV 2023 • Sepehr Sameni, Simon Jenni, Paolo Favaro

We propose Spatio-temporal Crop Aggregation for video representation LEarning (SCALE), a novel method that enjoys high scalability at both training and inference time.

Action Classification Dimensionality Reduction +3

SImProv: Scalable Image Provenance Framework for Robust Content Attribution

no code implementations • 28 Jun 2022 • Alexander Black, Tu Bui, Simon Jenni, Zhifei Zhang, Viswanathan Swaminathan, John Collomosse

We present SImProv, a scalable image provenance framework to match a query image back to a trusted database of originals and identify possible manipulations on the query.

Re-Ranking Retrieval

Video-ReTime: Learning Temporally Varying Speediness for Time Remapping

no code implementations • 11 May 2022 • Simon Jenni, Markus Woodson, Fabian Caba Heilbron

Furthermore, we propose an optimization for video re-timing that enables precise control over the target duration and performs more robustly on longer videos than prior methods.

Action Recognition

Representation Learning by Detecting Incorrect Location Embeddings

1 code implementation • 10 Apr 2022 • Sepehr Sameni, Simon Jenni, Paolo Favaro

We represent object parts with image tokens and train a ViT to detect which token has been combined with an incorrect positional embedding.

Ranked #91 on Image Classification on ObjectNet (using extra training data)

Image Classification Object +2

Learning to Deblur and Rotate Motion-Blurred Faces

no code implementations • 14 Dec 2021 • Givi Meishvili, Attila Szabó, Simon Jenni, Paolo Favaro

Our method handles the complexity of face blur by implicitly learning the geometry and motion of faces through the joint training on three large datasets: FFHQ and 300VW, which are publicly available, and a new Bern Multi-View Face Dataset (BMFD) that we built.

Time-Equivariant Contrastive Video Representation Learning

no code implementations • ICCV 2021 • Simon Jenni, Hailin Jin

We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos.

Action Recognition Contrastive Learning +3

VPN: Video Provenance Network for Robust Content Attribution

no code implementations • 21 Sep 2021 • Alexander Black, Tu Bui, Simon Jenni, Vishy Swaminathan, John Collomosse

We present VPN, a content attribution method for recovering provenance information from videos shared online.

Contrastive Learning

Self-Supervised Multi-View Synchronization Learning for 3D Pose Estimation

no code implementations • 13 Oct 2020 • Simon Jenni, Paolo Favaro

Current state-of-the-art methods cast monocular 3D human pose estimation as a learning problem by training neural networks on large data sets of images and corresponding skeleton poses.

3D Pose Estimation Monocular 3D Human Pose Estimation +1

Video Representation Learning by Recognizing Temporal Transformations

no code implementations • 21 Jul 2020 • Simon Jenni, Givi Meishvili, Paolo Favaro

Action Recognition Representation Learning +1

Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

no code implementations • CVPR 2020 • Simon Jenni, Hailin Jin, Paolo Favaro

Based on this criterion, we introduce a novel image transformation that we call limited context inpainting (LCI).

Learning to Have an Ear for Face Super-Resolution

no code implementations • CVPR 2020 • Givi Meishvili, Simon Jenni, Paolo Favaro

To combine the aural and visual modalities, we propose a method to first build the latent representations of a face from the lone audio track and then from the lone low-resolution image.

Audio Super-Resolution Face Reconstruction +2

On Stabilizing Generative Adversarial Training with Noise

no code implementations • CVPR 2019 • Simon Jenni, Paolo Favaro

We notice that the distributions of real and generated data should match even when they undergo the same filtering.

Deep Bilevel Learning

1 code implementation • ECCV 2018 • Simon Jenni, Paolo Favaro

Our approach is based on the principles of cross-validation, where a validation set is used to limit model overfitting.

Bilevel Optimization

Self-Supervised Feature Learning by Learning to Spot Artifacts

no code implementations • CVPR 2018 • Simon Jenni, Paolo Favaro

To generate images with artifacts, we pre-train a high-capacity autoencoder and then use a damage-and-repair strategy: first, we freeze the autoencoder and damage the output of the encoder by randomly dropping its entries.

Self-Supervised Learning
