no code implementations • FNP (LREC) 2022 • Anik Saha, Jian Ni, Oktie Hassanzadeh, Alex Gittens, Kavitha Srinivas, Bulent Yener
Causal information extraction is an important task in natural language processing, particularly in the finance domain.
no code implementations • 7 Mar 2024 • Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications.
1 code implementation • 29 Aug 2023 • Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener
Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions.
1 code implementation • 7 Aug 2023 • Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener
Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation.
no code implementations • 31 May 2023 • Deniz Koyuncu, Alex Gittens, Bülent Yener, Moti Yung
Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed.
no code implementations • 12 May 2023 • Alex Gittens, Malik Magdon-Ismail
Open question: Can label complexity be reduced by $\Omega(n)$ with tight $(1+d/n)$-approximation?
no code implementations • 20 Apr 2023 • Anik Saha, Alex Gittens, Bulent Yener
This paper proposes a two-stage method to distill multiple word senses from a pre-trained language model (BERT): attention over the senses of a word in context extracts sense information, which is then transferred to fit multi-sense embeddings in a skip-gram-like framework.
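The attention-over-senses step can be illustrated with a minimal NumPy sketch; the sense vectors, dimensions, and context vector below are illustrative stand-ins (the paper derives these from BERT representations), not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_senses = 8, 3

# Hypothetical sense vectors for one word, plus a context vector
# (illustrative stand-ins for quantities derived from BERT states)
senses = rng.normal(size=(n_senses, d))
context = rng.normal(size=d)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Attend over the word's senses given the context, then represent this
# occurrence of the word as the attention-weighted mix of its senses
attn = softmax(senses @ context)
word_vec = attn @ senses
```

The resulting per-occurrence vector is what a skip-gram-like objective could then fit multi-sense embeddings against.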
1 code implementation • 20 Feb 2023 • Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem.
no code implementations • 8 Jul 2021 • Daniel Park, Haidar Khan, Azer Khan, Alex Gittens, Bülent Yener
Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from "white box" settings, where the adversary has complete knowledge of the model, to "black box" settings, where it has none.
1 code implementation • ACL (NLP4Prog) 2021 • Gabriel Orlanski, Alex Gittens
We evaluate prior state-of-the-art CoNaLa models with this additional data and find that our proposed method of using the body and mined data beats the BLEU score of the prior state-of-the-art by $71.96\%$.
Ranked #1 on Code Generation on CoNaLa-Ext
no code implementations • 27 Apr 2021 • Kevin Kim, Alex Gittens
This work proposes to learn fair low-rank tensor decompositions by regularizing the Canonical Polyadic Decomposition factorization with the kernel Hilbert-Schmidt independence criterion (KHSIC).
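The regularizer mentioned above can be illustrated by computing an empirical HSIC statistic between a factor matrix and a sensitive attribute. This is a minimal sketch using linear kernels (the paper's KHSIC setting is kernelized); the data and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60
F = rng.normal(size=(n, 4))                             # stand-in for one CP factor matrix
s = rng.integers(0, 2, size=n).astype(float)[:, None]   # stand-in sensitive attribute

def hsic(X, Y):
    """Empirical HSIC with linear kernels: measures dependence between
    the rows of X and Y via centered kernel (here, Gram) matrices."""
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kx, Ky = X @ X.T, Y @ Y.T
    return np.trace(Kx @ H @ Ky @ H) / (n - 1) ** 2

# In the fair-decomposition setup, a term like this would be added to the
# CP reconstruction loss to penalize dependence on the sensitive attribute.
penalty = hsic(F, s)
```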
1 code implementation • 16 Apr 2021 • Dong Hu, Alex Gittens, Malik Magdon-Ismail
Specifically, we consider a setting in which one can obtain either low-noise, high-cost observations of individual entries or high-noise, low-cost observations of entire columns.
no code implementations • 10 Feb 2021 • Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal
The information is extracted and stored in a structured format using knowledge graphs such that the semantics of the threat intelligence can be preserved and shared at scale with other security analysts.
1 code implementation • 20 Jun 2020 • Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal
The knowledge graph that uses MALOnt is instantiated from a corpus comprising hundreds of annotated malware threat reports.
no code implementations • 27 Sep 2019 • Malik Magdon-Ismail, Alex Gittens
We give a fast oblivious L2-embedding of $A\in \mathbb{R}^{n \times d}$ to $B\in \mathbb{R}^{r \times d}$ satisfying $(1-\varepsilon)\|Ax\|_2^2 \le \|Bx\|_2^2 \le (1+\varepsilon) \|Ax\|_2^2.$ Our embedding dimension $r$ equals $d$, a constant independent of the distortion $\varepsilon$.
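The subspace-embedding property being guaranteed can be checked numerically. This sketch uses a generic Gaussian sketching matrix with $r > d$ rather than the paper's construction (which achieves $r = d$); sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, r = 500, 5, 200
A = rng.normal(size=(n, d))

# Generic oblivious sketch: a scaled Gaussian matrix applied to A.
# (The paper's embedding is stronger, achieving r = d; this is only
# an illustration of the property being preserved.)
S = rng.normal(size=(r, n)) / np.sqrt(r)
B = S @ A

# Empirically check the distortion |‖Bx‖² / ‖Ax‖² - 1| on random directions
worst = 0.0
for _ in range(50):
    x = rng.normal(size=d)
    ratio = np.linalg.norm(B @ x) ** 2 / np.linalg.norm(A @ x) ** 2
    worst = max(worst, abs(ratio - 1))
```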
no code implementations • ACL 2017 • Alex Gittens, Dimitris Achlioptas, Michael W. Mahoney
An unexpected {``}side-effect{''} of such models is that their vectors often exhibit compositionality, i. e., \textit{adding}two word-vectors results in a vector that is only a small angle away from the vector of a word representing the semantic composite of the original words, e. g., {``}man{''} + {``}royal{''} = {``}king{''}.
no code implementations • 9 Jun 2017 • Shusen Wang, Alex Gittens, Michael W. Mahoney
This work analyzes the application of this paradigm to kernel $k$-means clustering, and shows that applying the linear $k$-means clustering algorithm to $\frac{k}{\epsilon} (1 + o(1))$ features constructed using a so-called rank-restricted Nyström approximation results in cluster assignments that satisfy a $1 + \epsilon$ approximation ratio in terms of the kernel $k$-means cost function, relative to the guarantee provided by the same algorithm without the use of the Nyström method.
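The pipeline of building Nyström features and then running plain (linear) $k$-means on them can be sketched as follows. This uses an unrestricted Nyström map rather than the paper's rank-restricted variant, and all data, landmark counts, and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data: two well-separated Gaussian blobs
X = np.vstack([rng.normal(0.0, 0.3, size=(40, 2)),
               rng.normal(3.0, 0.3, size=(40, 2))])
gamma = 1.0

def rbf(A, B):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Nystrom features from m sampled landmarks: Z = K_nm @ K_mm^{-1/2},
# so that Z @ Z.T approximates the full kernel matrix
m = 10
idx = rng.choice(len(X), m, replace=False)
K_nm = rbf(X, X[idx])
K_mm = rbf(X[idx], X[idx])
w, V = np.linalg.eigh(K_mm + 1e-8 * np.eye(m))
Z = K_nm @ V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ V.T

# Plain Lloyd's k-means (k = 2) on the Nystrom features
centers = Z[[0, 40]]
for _ in range(20):
    labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([Z[labels == c].mean(0) for c in (0, 1)])
```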
no code implementations • ICML 2017 • Shusen Wang, Alex Gittens, Michael W. Mahoney
In particular, there is a bias-variance trade-off in sketched MRR that is not present in sketched LSR.
1 code implementation • 5 Jul 2016 • Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat
We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms.
no code implementations • CVPR 2015 • Da Kuang, Alex Gittens, Raffay Hamid
In recent years, several feature encoding schemes for the bags-of-visual-words model have been proposed.
no code implementations • 7 Apr 2015 • Jiyan Yang, Alex Gittens
Recent work has demonstrated that using random feature maps can significantly decrease the training and testing times of kernel-based algorithms without substantially lowering their accuracy.
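The random-feature idea can be illustrated with a minimal sketch of random Fourier features for the RBF kernel (the Rahimi–Recht construction); the data sizes and parameter values here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, D, gamma, rng):
    """Map X to D random Fourier features so that the feature inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(50, 5))
gamma = 0.5
Z = random_fourier_features(X, D=4000, gamma=gamma, rng=rng)

# Compare the feature-space Gram matrix against the exact RBF kernel
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-gamma * sq)
K_approx = Z @ Z.T
err = np.abs(K_exact - K_approx).max()
```

A linear method trained on `Z` then behaves approximately like its kernelized counterpart, at a fraction of the cost.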
no code implementations • 2 Apr 2014 • Da Kuang, Alex Gittens, Raffay Hamid
The dominant cost in solving least-square problems using Newton's method is often that of factorizing the Hessian matrix over multiple values of the regularization parameter ($\lambda$).
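The amortization being referred to can be sketched with ridge-regularized least squares: one factorization of $A$ yields the solution for every $\lambda$ via cheap diagonal solves. This uses an SVD for illustration (the paper's method may differ), with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 10))
b = rng.normal(size=100)

# Factor once: the SVD of A diagonalizes the Hessian A^T A, so the
# solution for any lambda reduces to scaling singular components.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Utb = U.T @ b

def ridge_from_svd(lam):
    return Vt.T @ (s / (s**2 + lam) * Utb)

lams = [0.1, 1.0, 10.0]
sols = [ridge_from_svd(lam) for lam in lams]

# Sanity check against direct solves of (A^T A + lam I) x = A^T b
direct = [np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ b) for lam in lams]
max_err = max(np.abs(x - y).max() for x, y in zip(sols, direct))
```

Each additional $\lambda$ costs only $O(d^2)$ after the one-time factorization, instead of a fresh $O(nd^2)$ factorization per $\lambda$.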
no code implementations • 17 Dec 2013 • Raffay Hamid, Ying Xiao, Alex Gittens, Dennis Decoste
Kernel approximation using randomized feature maps has recently gained a lot of interest.
no code implementations • 12 Nov 2013 • Christos Boutsidis, Alex Gittens, Prabhanjan Kambadur
Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis.
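For reference, the baseline algorithm whose cost is at issue can be sketched in a few lines of NumPy: build an affinity matrix, form the normalized graph Laplacian, and cluster using its bottom eigenvectors. The toy data and bandwidth below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy data: two 1-D clusters
X = np.concatenate([rng.normal(0.0, 0.2, 30), rng.normal(2.5, 0.2, 30)])[:, None]

# Affinity matrix and symmetric normalized graph Laplacian
sq = (X - X.T) ** 2
W = np.exp(-sq)
d = W.sum(1)
L = np.eye(len(X)) - W / np.sqrt(d)[:, None] / np.sqrt(d)[None, :]

# For k = 2 clusters, the sign pattern of the second-smallest
# eigenvector (the Fiedler vector) separates the two groups
vals, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] > 0).astype(int)
```

The full eigendecomposition here is the expensive step — cubic in the number of points — which is exactly the bottleneck that motivates approximate variants.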
no code implementations • 7 Mar 2013 • Alex Gittens, Michael W. Mahoney
Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices.