1 code implementation • 17 Oct 2023 • Ralph Peeters, Christian Bizer
We show that for use cases that do not allow data to be shared with third parties, open-source LLMs can be a viable alternative to hosted LLMs given that a small amount of training data or matching knowledge...
Ranked #1 on Entity Resolution on Amazon-Google
1 code implementation • 5 May 2023 • Ralph Peeters, Christian Bizer
Always using the same set of 10 handpicked demonstrations leads to an improvement of 4.92% over the zero-shot performance.
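A minimal sketch of what "always using the same set of handpicked demonstrations" means in practice: the same fixed in-context examples are prepended to every query pair. The demonstration texts, question wording, and function names below are illustrative assumptions, not the paper's actual prompt.

```python
# Hypothetical handpicked demonstrations: (offer A, offer B, gold label).
DEMONSTRATIONS = [
    ("DYMO D1 Tape 12mm x 7m", "DYMO D1 12mm x 7m label tape", "Yes"),
    ("Apple iPhone 13 128GB Blue", "Samsung Galaxy S21 128GB", "No"),
]

def build_prompt(offer_a: str, offer_b: str) -> str:
    """Prepend the same fixed demonstrations to every query pair."""
    lines = ["Do the two product descriptions refer to the same product?"]
    for a, b, label in DEMONSTRATIONS:
        lines.append(f"Product 1: {a}\nProduct 2: {b}\nAnswer: {label}")
    # The query pair is appended last, with the answer left open for the LLM.
    lines.append(f"Product 1: {offer_a}\nProduct 2: {offer_b}\nAnswer:")
    return "\n\n".join(lines)

prompt = build_prompt("Lenovo ThinkPad T14 16GB", "ThinkPad T14 Gen 2 16GB RAM")
```

Because the demonstration block never changes, it can be cached and reused across all pairs, which is what makes this setup cheap compared to per-query demonstration selection.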
1 code implementation • 23 Jan 2023 • Ralph Peeters, Reng Chiz Der, Christian Bizer
It also shows that for entity matching contrastive learning is more training data efficient compared to cross-encoders.
1 code implementation • SemTab@ISWC 2023 • Keti Korini, Ralph Peeters, Christian Bizer
This paper presents the WDC Schema.org Table Annotation Benchmark (SOTAB) for comparing the performance of table annotation systems.
Ranked #1 on Columns Property Annotation on WDC SOTAB
1 code implementation • 4 Feb 2022 • Ralph Peeters, Christian Bizer
We thus conclude that contrastive pre-training has a high potential for product matching use cases in which explicit supervision is available.
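To make the contrastive pre-training idea concrete, here is a small NumPy sketch of an InfoNCE-style loss over a batch of offer embeddings: each anchor offer is pulled toward its matching offer while all other in-batch offers act as negatives. This is a generic illustration of the technique, not the paper's exact training objective or hyperparameters.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray,
                  temperature: float = 0.07) -> float:
    """InfoNCE loss: row i of `positives` is the matching offer for row i
    of `anchors`; every other row in the batch serves as a negative."""
    # L2-normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Loss is the negative log-probability of the diagonal (true) pairs.
    return float(-np.mean(np.diag(log_probs)))
```

When anchors and positives are aligned the loss is near zero; misaligned pairs drive it up, which is the signal that pulls matching offers together in embedding space.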
Ranked #1 on Entity Resolution on WDC Computers-xlarge
1 code implementation • 7 Oct 2021 • Ralph Peeters, Christian Bizer
Using the use case of matching product offers from different e-shops, this poster explores to which extent the performance of Transformer-based matchers can be improved by complementing a small set of training pairs in the target language (German in our case) with a larger set of English-language training pairs.
1 code implementation • Proceedings of the VLDB Endowment 2021 • Ralph Peeters, Christian Bizer
The task can be approached by learning a binary classifier which distinguishes pairs of entity descriptions for the same real-world entity from descriptions of different entities.
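The pair-classification framing described above can be illustrated with a deliberately simple baseline: a token-overlap similarity with a decision threshold stands in for the learned binary classifier. The threshold value and function names are illustrative assumptions; the actual systems learn this decision from training pairs.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two entity descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def is_match(desc_a: str, desc_b: str, threshold: float = 0.5) -> bool:
    """Toy binary classifier: a pair scoring above the threshold is
    predicted to describe the same real-world entity."""
    return jaccard(desc_a, desc_b) >= threshold
```

A learned matcher replaces the hand-set similarity and threshold with a model trained on labeled matching and non-matching pairs, but the input/output contract is the same: two descriptions in, one match/non-match decision out.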
Ranked #1 on Entity Resolution on WDC Watches-xlarge
2 code implementations • DI2KG: International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs @ VLDB 2020 • Ralph Peeters, Christian Bizer, Goran Glavas
Adding the masked language modeling objective in the intermediate training step in order to further adapt the language model to the application domain leads to an additional increase of up to 3% F1.
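The masked language modeling objective mentioned above boils down to a simple data-preparation step: randomly replace a fraction of input tokens with a mask symbol and train the model to recover the originals. The sketch below shows only this masking step with assumed parameter names; the BERT-style recipe additionally keeps or randomizes some selected tokens, which is omitted here for brevity.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """MLM data prep: replace each token with [MASK] with probability
    `mask_prob`; the model is trained to predict the masked originals."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(tok)    # prediction target
        else:
            masked.append(tok)
            labels.append(None)   # not scored in the MLM loss
    return masked, labels
```

Running this over domain-specific text (here, product offers) is what adapts the language model to the application domain before fine-tuning on the matching task.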
Ranked #1 on Entity Resolution on WDC Computers-small (using extra training data)