Feature Engineering

393 papers with code • 1 benchmark • 5 datasets

Feature engineering is the process of taking a dataset and constructing explanatory variables (features) that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with observations in the rows and features in the columns.

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be re-written for each new dataset.
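As a rough illustration of the description above, here is a minimal sketch of manual feature engineering in plain Python: two small tables are gathered into a single table with one row per observation and one column per hand-written feature. The tables, column names (`customer_id`, `signup_year`, `amount`), the `build_feature_table` helper, and the `current_year` default are all hypothetical and exist only for this example. In practice a dataframe library would typically be used instead of lists of dictionaries.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one row per customer (parent table) and
# zero or more transactions per customer (child table).
customers = [
    {"customer_id": 1, "signup_year": 2019},
    {"customer_id": 2, "signup_year": 2021},
    {"customer_id": 3, "signup_year": 2022},
]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]


def build_feature_table(customers, transactions, current_year=2024):
    """Return one row per customer (observation) with hand-built feature columns."""
    # Group the child table by the key it shares with the parent table.
    amounts = defaultdict(list)
    for t in transactions:
        amounts[t["customer_id"]].append(t["amount"])

    rows = []
    for c in customers:
        spent = amounts.get(c["customer_id"], [])
        rows.append({
            "customer_id": c["customer_id"],
            # Feature derived from the parent table itself.
            "account_age_years": current_year - c["signup_year"],
            # Aggregation features derived from the child table.
            "n_transactions": len(spent),
            "total_amount": sum(spent),
            "mean_amount": mean(spent) if spent else 0.0,
        })
    return rows


feature_table = build_feature_table(customers, transactions)
for row in feature_table:
    print(row)
```

Every feature here is written by hand against these specific column names, which is why code of this kind has to be rewritten for each new dataset.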

Latest papers with no code

Sentiment analysis and random forest to classify LLM versus human source applied to Scientific Texts

no code yet • 5 Apr 2024

After the launch of ChatGPT v. 4, there has been a vivid global discussion about the ability of this artificial-intelligence-powered platform, and of similar ones, to automatically produce all kinds of texts, including scientific and technical texts.

The Death of Feature Engineering? BERT with Linguistic Features on SQuAD 2.0

no code yet • 4 Apr 2024

We conclude that the BERT base model will be improved by incorporating the features.

AI WALKUP: A Computer-Vision Approach to Quantifying MDS-UPDRS in Parkinson's Disease

no code yet • 2 Apr 2024

Parkinson's Disease (PD) is the second most common neurodegenerative disorder.

Leveraging Machine Learning for Early Autism Detection via INDT-ASD Indian Database

no code yet • 2 Apr 2024

Using the proposed model, we succeeded in predicting ASD with promising accuracy using a minimized set of 20 questions rather than the 28 questions presented in AMI.

Explainable AI Integrated Feature Engineering for Wildfire Prediction

no code yet • 1 Apr 2024

In our research, we conducted a thorough assessment of various machine learning algorithms for both classification and regression tasks relevant to predicting wildfires.

PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans

no code yet • 27 Mar 2024

Information from neuroimaging examinations (CT, MRI) is increasingly used to support diagnoses of dementia, e.g., Alzheimer's disease.

Thelxinoë: Recognizing Human Emotions Using Pupillometry and Machine Learning

no code yet • 27 Mar 2024

In this study, we present a method for emotion recognition in Virtual Reality (VR) using pupillometry.

VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections

no code yet • 24 Mar 2024

Therefore, mini-batch training for graph transformers is a promising direction, but the limited samples in each mini-batch cannot support effective dense attention to encode informative representations.

Utilizing the LightGBM Algorithm for Operator User Credit Assessment Research

no code yet • 21 Mar 2024

First, key features are extracted from the massive user-evaluation data provided by operators using data preprocessing and feature engineering methods, and a multi-dimensional feature set with statistical significance is constructed. Then, multiple base models are built with linear regression, decision tree, LightGBM, and other machine learning algorithms to find the best base model. Finally, ensemble methods such as Averaging, Voting, Blending, and Stacking are integrated to refine multiple fusion models and establish the most suitable fusion model for operator user evaluation.

Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification

no code yet • 20 Mar 2024

By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification across several machine learning algorithms.