When is plagiarism plagiarism?

Many users of PlagAware (and of other plagiarism-checking programs) ask how the results of a plagiarism check should be interpreted. Or, more concretely: is a submitted text with a certain percentage of copied words plagiarism - or not?

Unfortunately, there is no generally valid answer to this seemingly simple question. For this reason, systems like PlagAware often forgo a classification such as

  • No plagiarism
  • Possible plagiarism
  • Most likely plagiarism
  • Most certainly plagiarism

Instead, plagiarism-checking programs generally limit themselves to showing to what extent a text has adopted outside sources and in what way they were incorporated into the submitted work.

In the final analysis of the sources found, the personal assessment and judgement of the reviewer will always be pivotal. It depends on

  • the texts submitted for checking,
  • the experience of the reviewer and
  • the targeted (academic) level of the work.

Thus, the plagiarism review of academic homework should follow different guidelines than the plagiarism check of a Master's thesis - and different ones again than the check of a dissertation or doctoral thesis.

In the following paragraphs, we therefore give you some fundamentals and guidance to help you analyze the sources found in your work individually.

Types of plagiarism: Structure plagiarism and text plagiarism

First, it is helpful to distinguish between the two types of plagiarism - and to determine whether both are relevant to the same degree.

  • When we talk about plagiarism in general, we mostly think of text plagiarism. Text plagiarism is the unmodified adoption of multiple sentences, paragraphs or even whole pages into your own work without citing the source. Plagiarism-checking systems like PlagAware can usually identify and display text plagiarism without any problems, as long as the sources are publicly accessible in electronic form. It is up to the reviewer to assess the scope and relevance of the copy - and to decide whether the section in question is admissible. You can find tips for this in the following sections.
  • Structure plagiarism is plagiarism of the contextual composition - the structure - of a work. A submitted Master's thesis might not have copied a single sentence from previous work, yet its overall composition, from the line of argument all the way to the conclusion, could have been copied completely. Structure plagiarism is much harder to detect, since plagiarism-checking programs are usually designed to identify matching text.

Even PlagAware will rate a complete structure plagiarism with a percentage of copied words of "0% - no plagiarism". However, in addition to pure text comparison, PlagAware's algorithm can also detect texts with similar content. If the same source document is detected multiple times during a plagiarism check, this source appears in PlagAware's list of "potential sources without consistency". The (contextual) comparison of these sources with the submitted work must be carried out manually by the reviewer.

Percentage rate of copied words, interruptions and revisions

Even in text plagiarism, sentences and paragraphs are often not adopted word for word. Instead, sentences are rearranged, individual words are replaced with synonyms, or the tense is changed. This is generally done to integrate the copied passage stylistically into the author's own work, to disguise the plagiarism, or both. Revised text plagiarism is therefore also called "disguised plagiarism" or "paraphrased plagiarism".

In many cases, PlagAware's plagiarism check can detect and display even heavily revised, disguised plagiarism. The graphical text comparison is especially helpful here, as it allows a direct analysis of the paraphrasing:

Graphical text comparison of a plagiarism check with PlagAware

The scope of the plagiarized sections and the degree of revision are described by three indicators: the percentage of matching words, the number of matching words (MW: Matching Words) and the number of matching phrases (MP: Matching Phrases).

  • Number of matching words. This value shows how many words were adopted unchanged from one or more sources. The number of sources and the distribution of the copied passages across the text play no role. For example, if 250 words of a text of any length are found in other sources, the number of matching words is 250.
  • Percentage of matching words. This is the share of matching words relative to the total number of words in the checked text. For a text of 1000 words, the above example of 250 matching words results in a percentage of 25% - a full quarter of the text was therefore copied from other sources. Here too, the number of sources and the distribution of the copied passages within the work are irrelevant.
  • Number of matching phrases. This value reflects how many separate phrases from the found sources appear in the checked text. A phrase is defined as a contiguous section copied from a source, such as a sentence or a sequence of several words. For example: 100 words were copied from a Wikipedia article, 20 words from the introduction and 80 words from the main part of the article. In the checked work, the author's own text was inserted between the two phrases. The number of matching phrases here is two, and the number of matching words is 100.
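
The relationship between the three indicators can be sketched in a few lines of Python. This is only an illustration and not PlagAware's actual algorithm: the `MatchedPhrase` type, the `summarize` function and the representation of phrases as non-overlapping word positions are assumptions made for this example, as is the total length of 1000 words used for the Wikipedia case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchedPhrase:
    """One contiguous copied section, given as word positions in the checked text.

    Hypothetical helper type for this sketch; phrases are assumed not to overlap.
    """
    start: int  # index of the first copied word (inclusive)
    end: int    # index after the last copied word (exclusive)


def summarize(total_words: int, phrases: list[MatchedPhrase]) -> dict:
    """Return the three indicators for one checked text."""
    # Matching words: sum of the lengths of all copied sections.
    matching_words = sum(p.end - p.start for p in phrases)
    # Percentage: matching words relative to the length of the checked text.
    percentage = 100.0 * matching_words / total_words if total_words else 0.0
    return {
        "matching_words": matching_words,
        "percentage": percentage,
        "matching_phrases": len(phrases),
    }


# Wikipedia example: 20 copied words, then own text, then 80 copied words.
# The total length of 1000 words is an assumption for illustration.
wikipedia_example = summarize(
    total_words=1000,
    phrases=[MatchedPhrase(0, 20), MatchedPhrase(150, 230)],
)
```

With these inputs the sketch yields 100 matching words in two matching phrases. A single phrase of 250 words in the same 1000-word text would give the 25% from the percentage example.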

Tips and guidelines for analyzing the plagiarism controlling

As mentioned before, the analysis of a plagiarism check remains largely at the discretion of the reviewer. However, a few tips can help make the analysis comparable and objective:

  • Exclude correctly cited sources from the plagiarism check. Leaving them in would distort the overall assessment, since proper citations are of course permissible and should not be counted in the analysis. Incidentally, PlagAware always lists manually excluded sources in the result report.
  • For the analysis of individual sources, the total number of matching words is more important than the percentage. In the end, it does not matter how long the checked work is: an uncited source of "significant" scope is and remains a violation of good scientific practice. Opinions differ on what makes a copied source "significant". As a guideline we recommend:
    • less than 50 matching words: Not significant
    • between 50 and 100 matching words: Discretionary
    • more than 100 matching words: Significant
  • For the analysis of the complete work, the percentage of matching words should be taken into consideration. Opinions on significance naturally differ here as well. Note again that correctly cited sources should be excluded so that the plagiarism check does not produce a false positive result. As a guideline we recommend:
    • less than 1 percent: Not significant
    • between 1 and 5 percent: Discretionary
    • more than 5 percent: Significant
  • Like other programs for online plagiarism checking, PlagAware can (with exceptions) only draw on sources that are publicly available. However, summaries and abstracts of paid or protected content are often freely available. You should be suspicious when PlagAware reports findings from such scientific abstracts or from summaries in homework portals. If necessary, it makes sense to obtain the relevant full text and add it to your library. You can then run the plagiarism check including those texts for free.
  • It is a matter of discretion whether an uncited source indicates fraudulent intent - or not. The paraphrasing, as well as the degree and manner of disguise, gives a hint of fraudulent intent. The parameter "number of matching phrases" is a key indicator for this purpose. Combined with the graphical overview of the text comparison, it is easy to determine whether the source was knowingly and intentionally revised to disguise the origin of the text. Give more weight to such disguised plagiarism than to sources that were possibly left uncited unintentionally or cited incompletely.
  • If you conduct the plagiarism check to detect copyright violations: remember that copyright law only applies to concrete formulations. In terms of copyright, a paraphrased plagiarism is not plagiarism but a new formulation of the original text, and is therefore generally not legally actionable.
  • Discuss your thresholds for clearly acceptable and clearly unacceptable values with your colleagues to make plagiarism checks comparable across your institution. Use concrete examples and result reports for this.
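
The two guideline scales from the tips above can be written down as a small sketch, for example as a starting point when agreeing on thresholds within an institution. The function names `rate_source` and `rate_work` are hypothetical. The sketch reads the upper bound of the discretionary band for single sources as 100 matching words and assigns the boundary values themselves (50 and 100 words, 1 and 5 percent) to "discretionary"; both readings are assumptions. The result is only a first orientation and does not replace the reviewer's judgement.

```python
def rate_source(matching_words: int) -> str:
    """Rate a single uncited source by its absolute number of matching words."""
    if matching_words < 50:
        return "not significant"
    if matching_words <= 100:  # boundary handling is an assumption
        return "discretionary"
    return "significant"


def rate_work(percentage: float) -> str:
    """Rate a complete work by its percentage of matching words.

    Correctly cited sources are assumed to be excluded beforehand.
    """
    if percentage < 1:
        return "not significant"
    if percentage <= 5:  # boundary handling is an assumption
        return "discretionary"
    return "significant"


# A source with 80 matching words falls into the discretionary band,
# while a work with 25% matching words counts as significant.
source_rating = rate_source(80)
work_rating = rate_work(25.0)
```

Keeping the thresholds in one place like this makes it easy to adjust them to the level of the work being checked, for instance stricter values for a dissertation than for homework.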


