Search Results for author: Amnon Shashua

Found 37 papers, 13 papers with code

Tradeoffs Between Alignment and Helpfulness in Language Models

no code implementations • 29 Jan 2024 • Yotam Wolf, Noam Wies, Dorin Shteyman, Binyamin Rothberg, Yoav Levine, Amnon Shashua

Representation engineering yields gains in alignment oriented tasks such as resistance to adversarial attacks and reduction of social biases, but was also shown to cause a decrease in the ability of the model to perform basic tasks.

Language Modelling

Paper
Add Code

Generating Benchmarks for Factuality Evaluation of Language Models

2 code implementations • 13 Jul 2023 • Dor Muhlgay, Ori Ram, Inbal Magar, Yoav Levine, Nir Ratner, Yonatan Belinkov, Omri Abend, Kevin Leyton-Brown, Amnon Shashua, Yoav Shoham

FACTOR automatically transforms a factual corpus of interest into a benchmark evaluating an LM's propensity to generate true facts from the corpus vs. similar but incorrect statements.

Language Modelling Retrieval

341

Paper
Code

Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework

no code implementations • 4 Jul 2023 • Eliya Segev, Maya Alroy, Ronen Katsir, Noam Wies, Ayana Shenhav, Yael Ben-Oren, David Zar, Oren Tadmor, Jacob Bitterman, Amnon Shashua, Tal Rosenwein

Here we propose $\textit{Align With Purpose}$, a $\textbf{general Plug-and-Play framework}$ for enhancing a desired property in models trained with the CTC criterion.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Fundamental Limitations of Alignment in Large Language Models

no code implementations • 19 Apr 2023 • Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua

An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users.

Paper
Add Code

In-Context Retrieval-Augmented Language Models

1 code implementation • 31 Jan 2023 • Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, were shown to significantly improve language modeling performance.

Language Modelling Retrieval +1

228

Paper
Code

Parallel Context Windows for Large Language Models

1 code implementation • 21 Dec 2022 • Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training.

In-Context Learning Playing the Game of 2048 +2

Paper
Code

MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning

no code implementations • 1 May 2022 • Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, Moshe Tenenholtz

Huge language models (LMs) have ushered in a new era for AI, serving as a gateway to natural-language-based knowledge tasks.

Paper
Add Code

Standing on the Shoulders of Giant Frozen Language Models

no code implementations • 21 Apr 2022 • Yoav Levine, Itay Dalmedigos, Ori Ram, Yoel Zeldes, Daniel Jannai, Dor Muhlgay, Yoni Osin, Opher Lieber, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

To demonstrate this, we introduce three novel methods for leveraging frozen models: input-dependent prompt tuning, frozen readers, and recursive LMs, each of which vastly improves on current frozen-model approaches.

Paper
Add Code

Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks

1 code implementation • 6 Apr 2022 • Noam Wies, Yoav Levine, Amnon Shashua

Recently, several works have demonstrated high gains by taking a straightforward approach for incorporating intermediate supervision in compounded natural language problems: the sequence-to-sequence LM is fed with an augmented input, in which the decomposed tasks' labels are simply concatenated to the original input.

Paper
Code

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

no code implementations • ICLR 2022 • Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua

We highlight a bias introduced by this common practice: we prove that the pretrained NLM can model much stronger dependencies between text segments that appeared in the same training example, than it can between text segments that appeared in different training examples.

Chunking In-Context Learning +4

Paper
Add Code

Which transformer architecture fits my data? A vocabulary bottleneck in self-attention

no code implementations • 9 May 2021 • Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua

After their successful debut in natural language processing, Transformer architectures are now becoming the de-facto standard in many domains.

Paper
Add Code

Neural tensor contractions and the expressive power of deep neural quantum states

no code implementations • 18 Mar 2021 • Or Sharir, Amnon Shashua, Giuseppe Carleo

We establish a direct connection between general tensor networks and deep feed-forward artificial neural networks.

Tensor Networks

Paper
Add Code

The Depth-to-Width Interplay in Self-Attention

1 code implementation • NeurIPS 2020 • Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, Amnon Shashua

Our guidelines elucidate the depth-to-width trade-off in self-attention networks of sizes up to the scale of GPT3 (which we project to be too deep for its size), and beyond, marking an unprecedented width of 30K as optimal for a 1-Trillion parameter network.

Paper
Code

On the Ethics of Building AI in a Responsible Manner

no code implementations • 30 Mar 2020 • Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua

The AI-alignment problem arises when there is a discrepancy between the goals that a human designer specifies to an AI learner and a potential catastrophic outcome that does not reflect what the human designer really wants.

BIG-bench Machine Learning Ethics

Paper
Add Code

SenseBERT: Driving Some Sense into BERT

no code implementations • ACL 2020 • Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham

The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding.

Ranked #11 on Word Sense Disambiguation on Words in Context

Language Modelling Natural Language Inference +2

Paper
Add Code

Deep autoregressive models for the efficient variational simulation of many-body quantum systems

2 code implementations • 11 Feb 2019 • Or Sharir, Yoav Levine, Noam Wies, Giuseppe Carleo, Amnon Shashua

Artificial Neural Networks were recently shown to be an efficient representation of highly-entangled many-body quantum states.

Variational Monte Carlo

Paper
Code

Quantum Entanglement in Deep Learning Architectures

no code implementations • 26 Mar 2018 • Yoav Levine, Or Sharir, Nadav Cohen, Amnon Shashua

Modern deep learning has enabled unprecedented achievements in various domains.

BIG-bench Machine Learning Image Classification

Paper
Add Code

Benefits of Depth for Long-Term Memory of Recurrent Networks

no code implementations • ICLR 2018 • Yoav Levine, Or Sharir, Amnon Shashua

We prove that deep recurrent networks support Start-End separation ranks which are exponentially higher than those supported by their shallow counterparts.

Attribute Time Series Analysis

Paper
Add Code

On the Long-Term Memory of Deep Recurrent Networks

1 code implementation • 25 Oct 2017 • Yoav Levine, Or Sharir, Alon Ziv, Amnon Shashua

A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data, is their ability to model intricate long-term temporal dependencies.

Attribute Tensor Networks

Paper
Code

Sum-Product-Quotient Networks

no code implementations • 12 Oct 2017 • Or Sharir, Amnon Shashua

We present a novel tractable generative model that extends Sum-Product Networks (SPNs) and significantly boosts their power.

Paper
Add Code

On a Formal Model of Safe and Scalable Self-driving Cars

3 code implementations • 21 Aug 2017 • Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua

In the second part we describe a design of a system that adheres to our safety assurance requirements and is scalable to millions of cars.

Autonomous Driving Self-Driving Cars

326

Paper
Code

Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions

no code implementations • 5 May 2017 • Nadav Cohen, Or Sharir, Yoav Levine, Ronen Tamari, David Yakira, Amnon Shashua

Expressive efficiency refers to the ability of a network architecture to realize functions that require an alternative architecture to be much larger.

Inductive Bias

Paper
Add Code

Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design

no code implementations • ICLR 2018 • Yoav Levine, David Yakira, Nadav Cohen, Amnon Shashua

This description enables us to carry a graph-theoretic analysis of a convolutional network, with which we demonstrate a direct control over the inductive bias of the deep network via its channel numbers, that are related to the min-cut in the underlying graph.

Inductive Bias

Paper
Add Code

Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions

no code implementations • ICLR 2018 • Nadav Cohen, Ronen Tamari, Amnon Shashua

By introducing and analyzing the concept of mixed tensor decompositions, we prove that interconnecting dilated convolutional networks can lead to expressive efficiency.

Paper
Add Code

On the Expressive Power of Overlapping Architectures of Deep Learning

1 code implementation • ICLR 2018 • Or Sharir, Amnon Shashua

Expressive efficiency refers to the relation between two architectures A and B, whereby any function realized by B could be replicated by A, but there exists functions realized by A, which cannot be replicated by B unless its size grows significantly larger.

Attribute

Paper
Code

Tensorial Mixture Models

2 code implementations • 13 Oct 2016 • Or Sharir, Ronen Tamari, Nadav Cohen, Amnon Shashua

Other methods, based on arithmetic circuits and sum-product networks, do allow tractable marginalization, but their performance is challenged by the need to learn the structure of a circuit.

Paper
Code

Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving

no code implementations • 11 Oct 2016 • Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua

Second, the Markov Decision Process model often used in robotics is problematic in our case because of unpredictable behavior of other agents in this multi-agent scenario.

Autonomous Driving Multi-agent Reinforcement Learning +3

Paper
Add Code

Learning a Metric Embedding for Face Recognition using the Multibatch Method

no code implementations • NeurIPS 2016 • Oren Tadmor, Yonatan Wexler, Tal Rosenwein, Shai Shalev-Shwartz, Amnon Shashua

This work is motivated by the engineering task of achieving a near state-of-the-art face recognition on a minimal computing budget running on an embedded system.

Face Recognition

Paper
Add Code

Inductive Bias of Deep Convolutional Networks through Pooling Geometry

1 code implementation • 22 May 2016 • Nadav Cohen, Amnon Shashua

In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and by this gain insight into the benefit of functions brought forth by depth - they are able to efficiently model strong correlation under favored partitions of the input.

Inductive Bias

Paper
Code

On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training

no code implementations • 23 Apr 2016 • Shai Shalev-Shwartz, Amnon Shashua

We compare the end-to-end training approach to a modular approach in which a system is decomposed into semantically meaningful components.

Autonomous Driving

Paper
Add Code

Convolutional Rectifier Networks as Generalized Tensor Decompositions

no code implementations • 1 Mar 2016 • Nadav Cohen, Amnon Shashua

Second, and more importantly, we show that depth efficiency is weaker with convolutional rectifier networks than it is with convolutional arithmetic circuits.

Paper
Add Code

Long-term Planning by Short-term Prediction

no code implementations • 4 Feb 2016 • Shai Shalev-Shwartz, Nir Ben-Zrihem, Aviad Cohen, Amnon Shashua

We argue that dual versions of the MDP framework (that depend on the value function and the $Q$ function) are problematic for autonomous driving applications due to the non Markovian of the natural state space representation, and due to the continuous state and action spaces.

Autonomous Driving

Paper
Add Code

On the Expressive Power of Deep Learning: A Tensor Analysis

no code implementations • 16 Sep 2015 • Nadav Cohen, Or Sharir, Amnon Shashua

In this work we derive a deep network architecture based on arithmetic circuits that inherently employs locality, sharing and pooling.

Paper
Add Code

Deep SimNets

no code implementations • CVPR 2016 • Nadav Cohen, Or Sharir, Amnon Shashua

We present a deep layered architecture that generalizes convolutional neural networks (ConvNets).

Paper
Add Code

SimNets: A Generalization of Convolutional Networks

1 code implementation • 3 Oct 2014 • Nadav Cohen, Amnon Shashua

We present a deep layered architecture that generalizes classical convolutional neural networks (ConvNets).

Paper
Code

ShareBoost: Efficient multiclass learning with feature sharing

no code implementations • NeurIPS 2011 • Shai Shalev-Shwartz, Yonatan Wexler, Amnon Shashua

We consider the problem of learning a multiclass predictor that uses only few features, and in particular, the number of used features should increase sub-linearly with the number of possible classes.

Paper
Add Code

Introduction to Machine Learning: Class Notes 67577

1 code implementation • 23 Apr 2009 • Amnon Shashua

Introduction to Machine learning covering Statistical Inference (Bayes, EM, ML/MaxEnt duality), algebraic and spectral methods (PCA, LDA, CCA, Clustering), and PAC learning (the Formal model, VC dimension, Double Sampling theorem).

BIG-bench Machine Learning Clustering +1

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.