no code implementations • FNP (LREC) 2022 • Anik Saha, Jian Ni, Oktie Hassanzadeh, Alex Gittens, Kavitha Srinivas, Bulent Yener
Causal information extraction is an important task in natural language processing, particularly in the finance domain.
no code implementations • 7 Mar 2024 • Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications.
1 code implementation • 29 Aug 2023 • Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener
Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions.
1 code implementation • 7 Aug 2023 • Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener
Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation.
no code implementations • 31 May 2023 • Deniz Koyuncu, Alex Gittens, Bülent Yener, Moti Yung
Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed.
no code implementations • 12 May 2023 • Alex Gittens, Malik Magdon-Ismail
Open question: Can label complexity be reduced by $\Omega(n)$ with tight $(1+d/n)$-approximation?
no code implementations • 20 Apr 2023 • Anik Saha, Alex Gittens, Bulent Yener
This paper proposes a two-stage method to distill multiple word senses from a pre-trained language model (BERT): attention over the senses of a word in context extracts sense information, which is then transferred to fit multi-sense embeddings in a skip-gram-like framework.
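The attention-over-senses step can be illustrated with a minimal NumPy sketch; the sense vectors, dimensions, and context vector below are illustrative stand-ins (the paper derives these from BERT representations), not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_senses = 8, 3

# Hypothetical sense vectors for one word, plus a context vector
# (illustrative stand-ins for quantities derived from BERT states)
senses = rng.normal(size=(n_senses, d))
context = rng.normal(size=d)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Attend over the word's senses given the context, then represent this
# occurrence of the word as the attention-weighted mix of its senses
attn = softmax(senses @ context)
word_vec = attn @ senses
```

The resulting per-occurrence vector is what a skip-gram-like objective could then fit multi-sense embeddings against.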
1 code implementation • 20 Feb 2023 • Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem.
no code implementations • 8 Jul 2021 • Daniel Park, Haidar Khan, Azer Khan, Alex Gittens, Bülent Yener
Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from "white box" settings, where the adversary has complete knowledge of the model, to "black box" settings, where it has none.
1 code implementation • ACL (NLP4Prog) 2021 • Gabriel Orlanski, Alex Gittens
We evaluate prior state-of-the-art CoNaLa models with this additional data and find that our proposed method of using the body and mined data beats the BLEU score of the prior state-of-the-art by $71.96\%$.
Ranked #1 on Code Generation on CoNaLa-Ext
no code implementations • 27 Apr 2021 • Kevin Kim, Alex Gittens
This work proposes to learn fair low-rank tensor decompositions by regularizing the Canonical Polyadic Decomposition factorization with the kernel Hilbert-Schmidt independence criterion (KHSIC).
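The regularizer mentioned above can be illustrated by computing an empirical HSIC statistic between a factor matrix and a sensitive attribute. This is a minimal sketch using linear kernels (the paper's KHSIC setting is kernelized); the data and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60
F = rng.normal(size=(n, 4))                             # stand-in for one CP factor matrix
s = rng.integers(0, 2, size=n).astype(float)[:, None]   # stand-in sensitive attribute

def hsic(X, Y):
    """Empirical HSIC with linear kernels: measures dependence between
    the rows of X and Y via centered kernel (here, Gram) matrices."""
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kx, Ky = X @ X.T, Y @ Y.T
    return np.trace(Kx @ H @ Ky @ H) / (n - 1) ** 2

# In the fair-decomposition setup, a term like this would be added to the
# CP reconstruction loss to penalize dependence on the sensitive attribute.
penalty = hsic(F, s)
```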
1 code implementation • 16 Apr 2021 • Dong Hu, Alex Gittens, Malik Magdon-Ismail
Specifically, we consider a setting in which one can obtain either low-noise, high-cost observations of individual entries or high-noise, low-cost observations of entire columns.
no code implementations • 10 Feb 2021 • Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal
The information is extracted and stored in a structured format using knowledge graphs such that the semantics of the threat intelligence can be preserved and shared at scale with other security analysts.
1 code implementation • 20 Jun 2020 • Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal
The knowledge graph that uses MALOnt is instantiated from a corpus comprising hundreds of annotated malware threat reports.
no code implementations • 27 Sep 2019 • Malik Magdon-Ismail, Alex Gittens
We give a fast oblivious L2-embedding of $A\in \mathbb{R}^{n \times d}$ to $B\in \mathbb{R}^{r \times d}$ satisfying $(1-\varepsilon)\|Ax\|_2^2 \le \|Bx\|_2^2 \le (1+\varepsilon) \|Ax\|_2^2.$ Our embedding dimension $r$ equals $d$, a constant independent of the distortion $\varepsilon$.
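The subspace-embedding property being guaranteed can be checked numerically. This sketch uses a generic Gaussian sketching matrix with $r > d$ rather than the paper's construction (which achieves $r = d$); sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, r = 500, 5, 200
A = rng.normal(size=(n, d))

# Generic oblivious sketch: a scaled Gaussian matrix applied to A.
# (The paper's embedding is stronger, achieving r = d; this is only
# an illustration of the property being preserved.)
S = rng.normal(size=(r, n)) / np.sqrt(r)
B = S @ A

# Empirically check the distortion |‖Bx‖² / ‖Ax‖² - 1| on random directions
worst = 0.0
for _ in range(50):
    x = rng.normal(size=d)
    ratio = np.linalg.norm(B @ x) ** 2 / np.linalg.norm(A @ x) ** 2
    worst = max(worst, abs(ratio - 1))
```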
no code implementations • ACL 2017 • Alex Gittens, Dimitris Achlioptas, Michael W. Mahoney
An unexpected {``}side-effect{''} of such models is that their vectors often exhibit compositionality, i. e., \textit{adding}two word-vectors results in a vector that is only a small angle away from the vector of a word representing the semantic composite of the original words, e. g., {``}man{''} + {``}royal{''} = {``}king{''}.
no code implementations • 9 Jun 2017 • Shusen Wang, Alex Gittens, Michael W. Mahoney
This work analyzes the application of this paradigm to kernel $k$-means clustering, and shows that applying the linear $k$-means clustering algorithm to $\frac{k}{\epsilon} (1 + o(1))$ features constructed using a so-called rank-restricted Nyström approximation results in cluster assignments that satisfy a $1 + \epsilon$ approximation ratio in terms of the kernel $k$-means cost function, relative to the guarantee provided by the same algorithm without the use of the Nyström method.
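The pipeline of building Nyström features and then running plain (linear) $k$-means on them can be sketched as follows. This uses an unrestricted Nyström map rather than the paper's rank-restricted variant, and all data, landmark counts, and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data: two well-separated Gaussian blobs
X = np.vstack([rng.normal(0.0, 0.3, size=(40, 2)),
               rng.normal(3.0, 0.3, size=(40, 2))])
gamma = 1.0

def rbf(A, B):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Nystrom features from m sampled landmarks: Z = K_nm @ K_mm^{-1/2},
# so that Z @ Z.T approximates the full kernel matrix
m = 10
idx = rng.choice(len(X), m, replace=False)
K_nm = rbf(X, X[idx])
K_mm = rbf(X[idx], X[idx])
w, V = np.linalg.eigh(K_mm + 1e-8 * np.eye(m))
Z = K_nm @ V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ V.T

# Plain Lloyd's k-means (k = 2) on the Nystrom features
centers = Z[[0, 40]]
for _ in range(20):
    labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([Z[labels == c].mean(0) for c in (0, 1)])
```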
no code implementations • ICML 2017 • Shusen Wang, Alex Gittens, Michael W. Mahoney
In particular, there is a bias-variance trade-off in sketched MRR that is not present in sketched LSR.
1 code implementation • 5 Jul 2016 • Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat
We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms.
no code implementations • CVPR 2015 • Da Kuang, Alex Gittens, Raffay Hamid
In recent years, several feature encoding schemes for the bags-of-visual-words model have been proposed.
no code implementations • 7 Apr 2015 • Jiyan Yang, Alex Gittens
Recent work has demonstrated that using random feature maps can significantly decrease the training and testing times of kernel-based algorithms without substantially lowering their accuracy.
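The random-feature idea can be illustrated with a minimal sketch of random Fourier features for the RBF kernel (the Rahimi–Recht construction); the data sizes and parameter values here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, D, gamma, rng):
    """Map X to D random Fourier features so that the feature inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(50, 5))
gamma = 0.5
Z = random_fourier_features(X, D=4000, gamma=gamma, rng=rng)

# Compare the feature-space Gram matrix against the exact RBF kernel
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-gamma * sq)
K_approx = Z @ Z.T
err = np.abs(K_exact - K_approx).max()
```

A linear method trained on `Z` then behaves approximately like its kernelized counterpart, at a fraction of the cost.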
no code implementations • 2 Apr 2014 • Da Kuang, Alex Gittens, Raffay Hamid
The dominant cost in solving least-square problems using Newton's method is often that of factorizing the Hessian matrix over multiple values of the regularization parameter ($\lambda$).
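The amortization being referred to can be sketched with ridge-regularized least squares: one factorization of $A$ yields the solution for every $\lambda$ via cheap diagonal solves. This uses an SVD for illustration (the paper's method may differ), with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 10))
b = rng.normal(size=100)

# Factor once: the SVD of A diagonalizes the Hessian A^T A, so the
# solution for any lambda reduces to scaling singular components.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Utb = U.T @ b

def ridge_from_svd(lam):
    return Vt.T @ (s / (s**2 + lam) * Utb)

lams = [0.1, 1.0, 10.0]
sols = [ridge_from_svd(lam) for lam in lams]

# Sanity check against direct solves of (A^T A + lam I) x = A^T b
direct = [np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ b) for lam in lams]
max_err = max(np.abs(x - y).max() for x, y in zip(sols, direct))
```

Each additional $\lambda$ costs only $O(d^2)$ after the one-time factorization, instead of a fresh $O(nd^2)$ factorization per $\lambda$.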
no code implementations • 17 Dec 2013 • Raffay Hamid, Ying Xiao, Alex Gittens, Dennis Decoste
Kernel approximation using randomized feature maps has recently gained a lot of interest.
no code implementations • 12 Nov 2013 • Christos Boutsidis, Alex Gittens, Prabhanjan Kambadur
Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis.
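For reference, the baseline algorithm whose cost is at issue can be sketched in a few lines of NumPy: build an affinity matrix, form the normalized graph Laplacian, and cluster using its bottom eigenvectors. The toy data and bandwidth below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy data: two 1-D clusters
X = np.concatenate([rng.normal(0.0, 0.2, 30), rng.normal(2.5, 0.2, 30)])[:, None]

# Affinity matrix and symmetric normalized graph Laplacian
sq = (X - X.T) ** 2
W = np.exp(-sq)
d = W.sum(1)
L = np.eye(len(X)) - W / np.sqrt(d)[:, None] / np.sqrt(d)[None, :]

# For k = 2 clusters, the sign pattern of the second-smallest
# eigenvector (the Fiedler vector) separates the two groups
vals, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] > 0).astype(int)
```

The full eigendecomposition here is the expensive step — cubic in the number of points — which is exactly the bottleneck that motivates approximate variants.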
no code implementations • 7 Mar 2013 • Alex Gittens, Michael W. Mahoney
Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices.