CitePlag is the first prototype of a Citation-based Plagiarism Detection (CbPD) system. The prototype was recently demonstrated at the SIGIR 2013 conference.

What makes CitePlag novel?

In contrast to existing text-based approaches to plagiarism detection, CitePlag does not rely solely on literal text matches to judge whether a document is suspicious. Instead, it uses the distinctive placement of citations in the full text of documents to determine similarity and detect potential plagiarism.

By examining citation placement, position, and order, CitePlag forms a text-independent, and even language-independent, “fingerprint” of the semantic content of a document, which can then be used to detect potential unoriginality and plagiarism.
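To make the idea of a citation "fingerprint" concrete, here is a minimal sketch in Python. It is only an illustration and not CitePlag's actual implementation. It assumes each document has already been reduced to the ordered list of sources it cites, and it scores two documents by the longest sequence of citations they share in the same order. The function names and the reference keys are hypothetical.

```python
from typing import List, Sequence


def longest_common_citation_sequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return the longest run of citations (not necessarily adjacent)
    that appears in the same order in both documents."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the length of the longest common subsequence
    # of a[i:] and b[j:]; the extra row and column stay at zero.
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one such sequence.
    shared: List[str] = []
    i = j = 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            shared.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


def citation_order_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Illustrative score in [0, 1]: shared ordered citations divided by
    the length of the shorter citation list (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return len(longest_common_citation_sequence(a, b)) / min(len(a), len(b))


# Hypothetical citation sequences; the keys identify cited sources,
# so they stay the same whatever language the citing text is written in.
doc_a = ["smith2005", "lee2009", "chen2011", "kumar2008", "lee2009"]
doc_b = ["smith2005", "chen2011", "garcia2010", "kumar2008"]

shared = longest_common_citation_sequence(doc_a, doc_b)
score = citation_order_similarity(doc_a, doc_b)
print(shared, score)
```

Because the comparison looks only at which sources are cited and in what order, rewording or translating the surrounding text would not change the score in this sketch.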

CitePlag has come a long way from its humble beginnings in 2010, when we proposed the first citation-based approach to detect semantic similarity between documents for use in plagiarism detection. A year later, we developed the algorithms, and today we have a working prototype available for public use.

CitePlag now has a new user interface with improved functionality.

You can:

  1. upload your own files (PDF or text documents)
  2. examine recent plagiarism findings and examples of retracted plagiarism cases
  3. compare any two publications from the Open Access subset of PubMed’s database (200,000+ medical publications)

Test the CitePlag prototype for yourself at its new home on the web: http://citeplag.org/

If you’re curious about the project, see our related publications or the doctoral thesis of Bela Gipp, which covers all aspects of Citation-based Plagiarism Detection in detail.