1 code implementation • 26 Dec 2023 • Timo Spinde, Smi Hinterreiter, Fabian Haak, Terry Ruas, Helge Giese, Norman Meuschke, Bela Gipp
However, we have identified a lack of interdisciplinarity in existing projects, and a need for more awareness of the various types of media bias to support methodologically thorough performance evaluations of media bias detection systems.
1 code implementation • 22 May 2023 • Ankit Satpute, André Greiner-Petter, Moritz Schubotz, Norman Meuschke, Akiko Aizawa, Olaf Teschke, Bela Gipp
This demo paper presents the first tool to annotate the reuse of text, images, and mathematical formulae in a document pair -- TEIMMA.
no code implementations • 12 May 2023 • Bela Gipp, André Greiner-Petter, Moritz Schubotz, Norman Meuschke
This project investigated new approaches and technologies to enhance the accessibility of mathematical content and its semantic information for a broad range of information retrieval applications.
no code implementations • 17 Mar 2023 • Norman Meuschke, Apurva Jagdale, Timo Spinde, Jelena Mitrović, Bela Gipp
Using the new framework, we benchmark ten freely available tools in extracting document metadata, bibliographic references, tables, and other content elements from academic PDF documents.
no code implementations • 11 May 2022 • Corinna Breitinger, Kay Herklotz, Tim Flegelskamp, Norman Meuschke
For example, in the fields of biomedicine and chemistry, researchers are not only interested in textual relevance but may also want to discover or compare the chemical entity information found in a paper's full text.
no code implementations • 14 Dec 2021 • Timo Spinde, Kanishka Sinha, Norman Meuschke, Bela Gipp
We present a free and open-source tool for creating web-based surveys that include text annotation tasks.
1 code implementation • 18 Nov 2021 • Johannes Stegmüller, Fabian Bauer-Marquart, Norman Meuschke, Terry Ruas, Moritz Schubotz, Bela Gipp
Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations.
1 code implementation • 15 Nov 2021 • Jan Philip Wahle, Nischal Ashok, Terry Ruas, Norman Meuschke, Tirthankar Ghosal, Bela Gipp
We expect that evaluating a broad spectrum of datasets and models will benefit future research in developing misinformation detection systems.
2 code implementations • 15 Jun 2021 • Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp
We present two supervised (pre-)training methods to incorporate gloss definitions from lexical resources into neural language models (LMs).
no code implementations • 10 Jun 2021 • Norman Meuschke
Analyzing non-textual content complements text-based detection approaches and increases the detection effectiveness, particularly for disguised forms of academic plagiarism.
no code implementations • 23 Mar 2021 • Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp
The rise of language models such as BERT allows for high-quality text paraphrasing.
2 code implementations • 22 Mar 2021 • Jan Philip Wahle, Terry Ruas, Tomáš Foltýnek, Norman Meuschke, Bela Gipp
Employing paraphrasing tools to conceal plagiarized text is a severe threat to academic integrity.
1 code implementation • 23 May 2020 • Cornelius Ihle, Moritz Schubotz, Norman Meuschke, Bela Gipp
Plagiarism detection systems are essential tools for safeguarding academic and educational integrity.
no code implementations • 22 May 2020 • Philipp Scharpf, Moritz Schubotz, Abdou Youssef, Felix Hamborg, Norman Meuschke, Bela Gipp
In this paper, we show how selecting and combining encodings of natural and mathematical language affect classification and clustering of documents with mathematical content.
no code implementations • 20 Mar 2020 • Moritz Schubotz, André Greiner-Petter, Norman Meuschke, Olaf Teschke, Bela Gipp
This poster summarizes our contributions to Wikimedia's processing pipeline for mathematical formulae.
no code implementations • 27 Jun 2019 • Norman Meuschke, Vincent Stange, Moritz Schubotz, Michael Kramer, Bela Gipp
Overall, we show that analyzing the similarity of mathematical content and academic citations is a striking supplement for conventional text-based detection approaches for academic literature in the STEM disciplines.