Search Results for author: Shahriar Golchin

Found 5 papers, 2 papers with code

Large Language Models As MOOCs Graders

no code implementations • 6 Feb 2024 • Shahriar Golchin, Nikhil Garuda, Christopher Impey, Matthew Wenger

Specifically, we focus on two state-of-the-art LLMs, GPT-4 and GPT-3.5, across three distinct courses: Introductory Astronomy, Astrobiology, and the History and Philosophy of Astronomy.

Tasks: Astronomy, Philosophy

Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models

1 code implementation • 10 Nov 2023 • Shahriar Golchin, Mihai Surdeanu

We propose the Data Contamination Quiz (DCQ), a simple and effective approach to detect data contamination in large language models (LLMs) and estimate its extent.

Tasks: Multiple-choice, Sentence

Time Travel in LLMs: Tracing Data Contamination in Large Language Models

1 code implementation • 16 Aug 2023 • Shahriar Golchin, Mihai Surdeanu

To estimate contamination of individual instances, we employ "guided instruction": a prompt consisting of the dataset name, partition type, and a random-length initial segment of a reference instance, asking the LLM to complete it.
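A minimal sketch of how such a guided-instruction prompt could be assembled. The function name, prompt wording, and prefix-splitting logic are illustrative assumptions, not the authors' exact implementation:

```python
# Hypothetical sketch of a "guided instruction" prompt builder; the exact
# wording used in the paper may differ.
import random


def guided_instruction(dataset_name: str, partition: str,
                       reference_instance: str, seed: int = 0) -> str:
    """Build a prompt from the dataset name, partition type, and a
    random-length initial segment of a reference instance."""
    rng = random.Random(seed)
    words = reference_instance.split()
    # Keep a random-length prefix: at least one word, never the full instance.
    cut = rng.randint(1, max(1, len(words) - 1))
    prefix = " ".join(words[:cut])
    return (
        f"Instruction: You are given the first piece of an instance from the "
        f"{partition} split of the {dataset_name} dataset. "
        f"Complete the second piece exactly as it appears in the dataset.\n"
        f"First piece: {prefix}\n"
        f"Second piece:"
    )


prompt = guided_instruction(
    "WNLI", "train",
    "The trophy does not fit into the brown suitcase because it is too large.",
    seed=42,
)
print(prompt)
```

If the LLM reproduces the withheld continuation verbatim, that is taken as evidence the instance was seen during training.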

Tasks: In-Context Learning, WNLI

Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords

no code implementations • 14 Jul 2023 • Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour

We propose a novel task-agnostic in-domain pre-training method that sits between generic pre-training and fine-tuning.

A Compact Pretraining Approach for Neural Language Models

no code implementations • 25 Aug 2022 • Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour

We construct these compact subsets from the unstructured data using a combination of abstractive summaries and extractive keywords.

Tasks: Domain Adaptation
