1 code implementation • 19 Apr 2024 • Sibei Chen, Yeye He, Weiwei Cui, Ju Fan, Song Ge, Haidong Zhang, Dongmei Zhang, Surajit Chaudhuri
Spreadsheets are widely recognized as the most popular end-user programming tools, which blend the power of formula-based computation with an intuitive table-based interface.
no code implementations • 13 Oct 2023 • Peng Li, Yeye He, Dror Yashar, Weiwei Cui, Song Ge, Haidong Zhang, Danielle Rifinski Fainman, Dongmei Zhang, Surajit Chaudhuri
Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks.
1 code implementation • 27 Jul 2023 • Peng Li, Yeye He, Cong Yan, Yue Wang, Surajit Chaudhuri
Relational tables, where each row corresponds to an entity and each column corresponds to an attribute, have been the standard for tables in relational databases.
no code implementations • 4 Jun 2023 • Dezhan Tu, Yeye He, Weiwei Cui, Song Ge, Haidong Zhang, Han Shi, Dongmei Zhang, Surajit Chaudhuri
Data pipelines are widely employed in modern enterprises to power a variety of Machine-Learning (ML) and Business-Intelligence (BI) applications.
no code implementations • 11 Dec 2021 • Yeye He, Jie Song, Yue Wang, Surajit Chaudhuri, Vishal Anil, Blake Lassiter, Yaron Goland, Gaurav Malhotra
As data lakes become increasingly popular in large enterprises today, there is a growing need to tag or classify data assets (e.g., files and databases) in data lakes with additional metadata (e.g., semantic column-types), as the inferred metadata can enable a range of downstream applications like data governance (e.g., GDPR compliance) and dataset search.
1 code implementation • 25 Jun 2021 • Junwen Yang, Yeye He, Surajit Chaudhuri
In this work, we propose to automate multiple such steps end-to-end, by synthesizing complex data pipelines with both string transformations and table-manipulation operators.
no code implementations • 10 Jan 2020 • Kaushik Chakrabarti, Zhimin Chen, Siamak Shakeri, Guihong Cao, Surajit Chaudhuri
For (ii), we develop novel features to compute structure-aware match and train a machine learning model.
no code implementations • 13 May 2019 • Amanda E. Bauer, Eric C. Bellm, Adam S. Bolton, Surajit Chaudhuri, A. J. Connolly, Kelle L. Cruz, Vandana Desai, Alex Drlica-Wagner, Frossie Economou, Niall Gaffney, J. Kavelaars, J. Kinney, Ting S. Li, B. Lundgren, R. Margutti, G. Narayan, B. Nord, Dara J. Norman, W. O'Mullane, S. Padhi, J. E. G. Peek, C. Schafer, Megan E. Schwamb, Arfon M. Smith, Erik J. Tollerud, Anne-Marie Weijmans, Alexander S. Szalay
A Kavli Foundation-sponsored workshop on the theme "Petabytes to Science" was held from the 12th to the 14th of February 2019 in Las Vegas.
Instrumentation and Methods for Astrophysics
no code implementations • Proceedings of the VLDB Endowment 2019 • Anshuman Dutt, Chi Wang, Azade Nazi, Srikanth Kandula, Vivek Narasayya, Surajit Chaudhuri
Query optimizers depend on selectivity estimates of query predicates to produce a good execution plan.
no code implementations • 8 Nov 2018 • Silu Huang, Chi Wang, Bolin Ding, Surajit Chaudhuri
A machine learning configuration refers to a combination of preprocessor, learner, and hyperparameters.