1 code implementation • 23 Nov 2021 • David Cortes
The isolation forest algorithm for outlier detection exploits a simple yet effective observation: if taking some multivariate data and making uniformly random cuts across the feature space recursively, it will take fewer such random cuts for an outlier to be left alone in a given subspace as compared to regular observations.
1 code implementation • 26 Oct 2021 • David Cortes
Isolation forest or "iForest" is an intuitive and widely used algorithm for anomaly detection that follows a simple yet effective idea: in a given data distribution, if a threshold (split point) is selected uniformly at random within the range of some variable and data points are divided according to whether they are greater or smaller than this threshold, outlier points are more likely to end up alone or in the smaller partition.
2 code implementations • 2 Jan 2020 • David Cortes
This work describes an outlier detection procedure (named "OutlierTree") loosely based on the GritBot software developed by RuleQuest research, which works by evaluating and following supervised decision tree splits on variables, in whose branches 1-d confidence intervals are constructed for the target variable and potential outliers flagged according to these confidence intervals.
1 code implementation • 15 Nov 2019 • David Cortes
This work proposes a non-iterative strategy for missing value imputations which is guided by similarity between observations, but instead of explicitly determining distances or nearest neighbors, it assigns observations to overlapping buckets through recursive semi-random hyperplane cuts, in which weighted averages are determined as imputations for each variable.
1 code implementation • 27 Oct 2019 • David Cortes
This work briefly explores the possibility of approximating spatial distance (alternatively, similarity) between data points using the Isolation Forest method envisioned for outlier detection.
2 code implementations • 11 Nov 2018 • David Cortes
This work explores adaptations of successful multi-armed bandits policies to the online contextual bandits scenario with binary rewards using binary classification algorithms such as logistic regression as black-box oracles.
1 code implementation • 5 Nov 2018 • David Cortes
This work explores non-negative low-rank matrix factorization based on regularized Poisson models (PF or "Poisson factorization" for short) for recommender systems with implicit-feedback data.
1 code implementation • 2 Sep 2018 • David Cortes
This work explores the ability of collective matrix factorization models in recommender systems to make predictions about users and items for which there is side information available but no feedback or interactions data, and proposes a new formulation with a faster cold-start prediction formula that can be used in real-time systems.