Feature Engineering

393 papers with code • 1 benchmark • 5 datasets

Feature engineering is the process of taking a dataset and constructing explanatory variables — features — that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with rows containing the observations and features in the columns.
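As a minimal sketch of gathering data from several tables into one feature table, the example below aggregates a child table to one row per observation and joins it onto the parent table. It assumes pandas is available; the `customers` and `orders` tables and all column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2020, 2021],
})

# Hypothetical child table: zero or more rows per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 30.0, 5.0, 15.0, 40.0],
})

# Aggregate the child table down to one row per customer.
order_features = (
    orders.groupby("customer_id")["amount"]
    .agg(order_count="count", total_amount="sum", mean_amount="mean")
    .reset_index()
)

# Left join so that customers without any orders are kept.
feature_table = customers.merge(order_features, on="customer_id", how="left")

# A customer with no orders has a count and total of zero;
# the mean is left missing because it is undefined.
feature_table[["order_count", "total_amount"]] = (
    feature_table[["order_count", "total_amount"]].fillna(0)
)

print(feature_table)
```

The result has one row per customer and one column per feature, which is the shape most model-training code expects.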

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be re-written for each new dataset.
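To illustrate why manual feature engineering is problem-dependent, here is a small sketch of hand-written features for a hypothetical loan-application record. The function name `build_loan_features`, the field names, and the choice of features are all assumptions made for this example; none of them would carry over unchanged to a different dataset.

```python
from datetime import datetime
from math import log1p


def build_loan_features(row):
    """Build hand-crafted features for one hypothetical loan-application record."""
    applied = datetime.strptime(row["applied_on"], "%Y-%m-%d")
    return {
        # Domain knowledge: debt relative to income, guarded against zero income.
        "debt_to_income": row["debt"] / row["income"] if row["income"] > 0 else 0.0,
        # Compress a heavy-tailed variable with log(1 + x).
        "log_income": log1p(row["income"]),
        # Calendar features derived from the application date.
        "applied_on_weekend": applied.weekday() >= 5,  # Saturday=5, Sunday=6
        "applied_month": applied.month,
    }


record = {"applied_on": "2024-04-06", "income": 4000.0, "debt": 1000.0}
features = build_loan_features(record)
print(features)
```

Each feature encodes a judgment about this particular problem, so a new dataset with different fields would need a new function written from scratch.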

Benchmarks


These leaderboards are used to track progress in Feature Engineering.

Dataset	Best Model
2019_test set	CNN

Libraries

Use these libraries to find Feature Engineering models and implementations

shenweichen/DeepCTR • 6 papers • 7,358 stars

xue-pai/FuxiCTR • 6 papers • 791 stars

UlionTse/mlgb • 6 papers • 311 stars

DataCanvasIO/DeepTables • 4 papers • 636 stars

See all 12 libraries.

Datasets

Subtasks

Imputation

Latest papers with no code


Sentiment analysis and random forest to classify LLM versus human source applied to Scientific Texts

no code yet • 5 Apr 2024

After the launch of ChatGPT v. 4, there has been a lively global discussion about the ability of this artificial-intelligence-powered platform, and of similar ones, to automatically produce all kinds of texts, including scientific and technical texts.


The Death of Feature Engineering? BERT with Linguistic Features on SQuAD 2.0

no code yet • 4 Apr 2024

We conclude that the BERT base model will be improved by incorporating the features.


AI WALKUP: A Computer-Vision Approach to Quantifying MDS-UPDRS in Parkinson's Disease

no code yet • 2 Apr 2024

Parkinson's Disease (PD) is the second most common neurodegenerative disorder.


Leveraging Machine Learning for Early Autism Detection via INDT-ASD Indian Database

no code yet • 2 Apr 2024

Using the proposed model, we succeeded in predicting ASD using a minimized set of 20 questions rather than the 28 questions presented in AMI with promising accuracy.


Explainable AI Integrated Feature Engineering for Wildfire Prediction

no code yet • 1 Apr 2024

In our research, we conducted a thorough assessment of various machine learning algorithms for both classification and regression tasks relevant to predicting wildfires.


PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans

no code yet • 27 Mar 2024

Information from neuroimaging examinations (CT, MRI) is increasingly used to support diagnoses of dementia, e.g., Alzheimer's disease.


Thelxinoë: Recognizing Human Emotions Using Pupillometry and Machine Learning

no code yet • 27 Mar 2024

In this study, we present a method for emotion recognition in Virtual Reality (VR) using pupillometry.


VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections

no code yet • 24 Mar 2024

Therefore, mini-batch training for graph transformers is a promising direction, but the limited samples in each mini-batch cannot support effective dense attention to encode informative representations.


Utilizing the LightGBM Algorithm for Operator User Credit Assessment Research

no code yet • 21 Mar 2024

First, from the massive user-evaluation data provided by operators, key features are extracted through data preprocessing and feature engineering, and a multi-dimensional feature set with statistical significance is constructed. Next, linear regression, decision tree, LightGBM, and other machine learning algorithms are used to build several base models and identify the best one. Finally, ensemble methods such as averaging, voting, blending, and stacking are applied to refine multiple fusion models and establish the one best suited to operator user evaluation.


Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification

no code yet • 20 Mar 2024

By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification across several machine learning algorithms.

