Model Selection
497 papers with code • 0 benchmarks • 1 dataset
Given a set of candidate models, the goal of Model Selection is to select the model that best approximates the observed data and captures its underlying regularities. Model Selection criteria are defined to strike a balance between goodness of fit and model complexity, favoring models that generalize well beyond the observed data.
Latest papers
Evaluating Large Language Models as Generative User Simulators for Conversational Recommendation
Synthetic users are cost-effective proxies for real users in the evaluation of conversational recommender systems.
Detection of Unobserved Common Causes based on NML Code in Discrete, Mixed, and Continuous Variables
Causal discovery in the presence of unobserved common causes from observational data only is a crucial but challenging problem.
On the Model-Agnostic Multi-Source-Free Unsupervised Domain Adaptation
Specifically, we first conduct source model selection based on the proposed selection principles.
Defining Expertise: Applications to Treatment Effect Estimation
An expert's actions thus naturally encode part of their domain knowledge and can help make inferences within the same domain: knowing that doctors try to prescribe the best treatment for their patients, we can infer that treatments prescribed more frequently are likely to be more effective.
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation
Toward a more scalable, robust, and fine-grained evaluation, we implement six reframing operations to construct evolving instances that test LLMs against diverse queries and data noise and probe their problem-solving sub-abilities.
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
Large language models (LLMs) have recently received considerable attention as alternative solutions for task planning.
Model Assessment and Selection under Temporal Distribution Shift
We investigate model assessment and selection in a changing environment, by synthesizing datasets from both the current time period and historical epochs.
MT-HCCAR: Multi-Task Deep Learning with Hierarchical Classification and Attention-based Regression for Cloud Property Retrieval
In response, this paper introduces MT-HCCAR, an end-to-end deep learning model employing multi-task learning to simultaneously tackle cloud masking, cloud phase retrieval (classification tasks), and COT prediction (a regression task).
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning
Despite this, their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.
Valid causal inference with unobserved confounding in high-dimensional settings
We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small relative to the sample size; the latter is formalized in terms of convergence rates.