Feature Engineering

393 papers with code • 1 benchmark • 5 datasets

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with one row per observation and one column per feature.
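As a rough illustration of gathering several tables into one, the sketch below aggregates a child table onto a parent table using only the Python standard library. The `customers` and `orders` tables, their column names, and the `build_feature_table` helper are all hypothetical and chosen for the example; in practice a dataframe library would usually do this work.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent table: one row per customer (the observations).
customers = [
    {"customer_id": 1, "signup_year": 2019},
    {"customer_id": 2, "signup_year": 2021},
    {"customer_id": 3, "signup_year": 2022},
]

# Hypothetical child table: zero or more rows per customer.
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 15.0},
]


def build_feature_table(customers, orders):
    """Aggregate orders per customer and join the result onto customers."""
    # Group order amounts by the key shared between the two tables.
    amounts = defaultdict(list)
    for order in orders:
        amounts[order["customer_id"]].append(order["amount"])

    rows = []
    for customer in customers:
        spent = amounts.get(customer["customer_id"], [])
        rows.append({
            "customer_id": customer["customer_id"],
            "signup_year": customer["signup_year"],
            # Features derived from the child table.
            "order_count": len(spent),
            "total_amount": sum(spent),
            # Customers without orders get 0.0 (an assumption of this sketch).
            "mean_amount": mean(spent) if spent else 0.0,
        })
    return rows


feature_table = build_feature_table(customers, orders)
for row in feature_table:
    print(row)
```

Each row of `feature_table` is one observation and each key is one feature. Customers with no orders still appear, with zero-valued aggregates, so no observation is dropped by the join.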

The traditional approach, known as manual feature engineering, is to build features one at a time using domain knowledge, a process that is tedious, time-consuming, and error-prone. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
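To make the problem-dependence concrete, here is a minimal sketch of hand-written features for a hypothetical loan-application record. The field names (`annual_income`, `total_debt`, `applied_on`, `account_opened`) and the helper `engineer_loan_features` are assumptions made for illustration only; none of them would carry over to a dataset with a different schema.

```python
from datetime import date


def engineer_loan_features(record):
    """Build hand-crafted features for one hypothetical loan application."""
    income = record["annual_income"]
    return {
        # Domain knowledge: debt relative to income says more than raw debt.
        # Guard against division by zero when no income is reported.
        "debt_to_income": record["total_debt"] / income if income > 0 else None,
        # Domain knowledge: weekend applications may behave differently.
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        "applied_on_weekend": record["applied_on"].weekday() >= 5,
        # Age of the account, in days, at the time of application.
        "account_age_days": (record["applied_on"] - record["account_opened"]).days,
    }


record = {
    "annual_income": 50000.0,
    "total_debt": 10000.0,
    "applied_on": date(2024, 3, 2),
    "account_opened": date(2023, 3, 2),
}

features = engineer_loan_features(record)
print(features)
```

Every line of the function encodes a decision about this particular schema, which is why such code rarely transfers to a new dataset unchanged.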

Benchmarks

These leaderboards track progress in Feature Engineering.

Dataset	Best Model
2019_test set	CNN

Libraries

Use these libraries to find Feature Engineering models and implementations

shenweichen/DeepCTR • 6 papers • 7,354 stars

xue-pai/FuxiCTR • 6 papers • 787 stars

UlionTse/mlgb • 6 papers • 311 stars

DataCanvasIO/DeepTables • 4 papers • 636 stars

See all 12 libraries.

Subtasks

Imputation

Most implemented papers

Discovering Neural Wirings

allenai/dnw • NeurIPS 2019

In this work we propose a method for discovering neural wirings.


Mill.jl and JsonGrinder.jl: automated differentiable feature extraction for learning from raw JSON data

CTUAvastLab/Mill.jl • 19 May 2021

Learning from raw data input, thus limiting the need for manual feature engineering, is one of the key components of many successful applications of machine learning methods.


Modelling Context with User Embeddings for Sarcasm Detection in Social Media

samiroid/CUE-CNN • CoNLL 2016

We introduce a deep neural network for automated sarcasm detection.


Deep Voice: Real-time Neural Text-to-Speech

NVIDIA/nv-wavenet • ICML 2017

We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks.


Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

shenweichen/DeepCTR • 18 Apr 2017

CTR prediction in real-world business is a difficult machine learning problem with large scale nonlinear sparse data.


Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking

gtolomei/ml-feature-tweaking • 20 Jun 2017

There are many circumstances, however, where it is important to understand (i) why a model outputs a certain prediction on a given instance, (ii) which adjustable features of that instance should be modified, and finally (iii) how to alter such a prediction when the mutated instance is input back to the model.
