Doctoral Thesis (2022-11-28)

Doctoral Thesis Norman Meuschke

Below you will find the data, source code, and other resources for the doctoral thesis:

Analyzing Non-Textual Content Elements to Detect Academic Plagiarism

Norman Meuschke, University of Konstanz, 2021, Grade: summa cum laude.

PDF | DOI | BibTeX

Doctoral Defense

Slides

Click on the image to download the slides for the defense talk (PDF, 7 MB)

Data & Source Code

Hybrid Plagiarism Detection System HyPlag

    • Demo system (user: guest@hyplag.org | pw: hybridPD)
    • Source code (log in to GitHub first! user: hyplag-guest | pw: hybridPD20)

Citation-based Plagiarism Detection

    • Source Code: see HyPlag source code above
    • Data:
      • Reference collection: 185,170 documents from the PMC OAS collection, provided as part of the CITREC dataset (5 GB zipped, ~20 GB raw); includes document metadata, citation data, and pre-computed similarity scores
      • User-perceived cases of plagiarism (available upon request)

Image-based Plagiarism Detection

    • Source Code
    • Data: 15 test cases embedded into a reference collection of 10,000 images extracted from PMC OAS documents (547 MB zipped)

Mathematics-based Plagiarism Detection

    • Source Code: see HyPlag source code above
    • Data:
      • Test cases: 10 confirmed cases of plagiarism available as PDF and TEI
        (log in to GitHub first! user: hyplag-guest | pw: hybridPD20)
      • Reference collection: 105,120 arXiv documents converted to XHTML

