Search Results for author: Juliana Freire

Found 12 papers, 5 papers with code

ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models

1 code implementation 27 Oct 2023 Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire

We introduce ArcheType, a simple, practical method for context sampling, prompt serialization, model querying, and label remapping, which enables large language models to solve CTA problems in a fully zero-shot manner.

Ranked #1 on Column Type Annotation on WDC SOTAB (Weighted F1 metric)

Column Type Annotation Zero-Shot Learning

eTOP: Early Termination of Pipelines for Faster Training of AutoML Systems

no code implementations 17 Apr 2023 Haoxiang Zhang, Juliana Freire, Yash Garg

Recent advancements in software and hardware technologies have enabled the use of AI/ML models in everyday applications, which has significantly improved the quality of service rendered.

AutoML Feature Engineering

AlphaD3M: Machine Learning Pipeline Synthesis

no code implementations 3 Nov 2021 Iddo Drori, Yamuna Krishnamurthy, Remi Rampin, Raoni de Paula Lourenco, Jorge Piazentin Ono, Kyunghyun Cho, Claudio Silva, Juliana Freire

We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self play.

AutoML BIG-bench Machine Learning +3

Correlation Sketches for Approximate Join-Correlation Queries

no code implementations 7 Apr 2021 Aécio Santos, Aline Bessa, Fernando Chirigati, Christopher Musco, Juliana Freire

The increasing availability of structured datasets, from Web tables and open-data portals to enterprise data, opens up opportunities to enrich analytics and improve machine learning models through relational data augmentation.

Data Augmentation

Auctus: A Dataset Search Engine for Data Augmentation

no code implementations 10 Feb 2021 Sonia Castelo, Rémi Rampin, Aécio Santos, Aline Bessa, Fernando Chirigati, Juliana Freire

The large volumes of structured data currently available, from Web tables to open-data portals and enterprise data, open up new opportunities for progress in answering many important scientific, societal, and business questions.

Data Augmentation

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

1 code implementation arXiv 2020 Jorge Piazentin Ono, Sonia Castelo, Roque Lopez, Enrico Bertini, Juliana Freire, Claudio Silva

In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to search and generate end-to-end learning pipelines.

Human-Computer Interaction

Debugging Machine Learning Pipelines

1 code implementation 11 Feb 2020 Raoni Lourenço, Juliana Freire, Dennis Shasha

Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions.

BIG-bench Machine Learning

AutoML using Metadata Language Embeddings

2 code implementations 8 Oct 2019 Iddo Drori, Lu Liu, Yi Nian, Sharath C. Koorathota, Jie S. Li, Antonio Khalil Moretti, Juliana Freire, Madeleine Udell

We use these embeddings in a neural architecture to learn the distance between best-performing pipelines.

AutoML

Visus: An Interactive System for Automatic Machine Learning Model Building and Curation

no code implementations 5 Jul 2019 Aécio Santos, Sonia Castelo, Cristian Felix, Jorge Piazentin Ono, Bowen Yu, Sungsoo Hong, Cláudio T. Silva, Enrico Bertini, Juliana Freire

In this paper, we present Visus, a system designed to support the model building process and curation of ML data processing pipelines generated by AutoML systems.

AutoML BIG-bench Machine Learning

A Topic-Agnostic Approach for Identifying Fake News Pages

1 code implementation 2 May 2019 Sonia Castelo, Thais Almeida, Anas Elghafari, Aécio Santos, Kien Pham, Eduardo Nakamura, Juliana Freire

Fake news and misinformation have been increasingly used to manipulate popular opinion and influence political processes.

Misinformation TAG

Bootstrapping Domain-Specific Content Discovery on the Web

no code implementations 25 Feb 2019 Kien Pham, Aécio Santos, Juliana Freire

Given a domain of interest $D$, subject-matter experts (SMEs) must search for relevant websites and collect a set of representative Web pages to serve as training examples for creating a classifier that recognizes pages in $D$, as well as a set of pages to seed the crawl.
