21

How can I tell whether two pieces of source code (regardless of their language: C, Java, Lisp, ...) show strong indications of being plagiarized from each other?

Background: I am going to give my first seminar on computer languages. We have prepared small exercises for major programming languages such as C/C++, Python, Java, ... but also OCaml, Haskell, ... to give the students a practical introduction (including to programming paradigms). We expect about 300 students with more than 50 programming tasks per person, so a single person cannot check all the homework.

I guess anti-plagiarism techniques used for natural language (essays, papers, book chapters, etc.) will not work for source code, right? Also, solutions to these programming tasks will be inherently similar because of the required interface.

I've done a little searching and found MOSS, mentioned in "Checking for code plagiarism with JavaScript" and "Variable renaming for plagiarism detection for C/C++".
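To make the idea concrete, here is a rough sketch of a language-independent, token-level comparison that ignores variable renaming. It is not how MOSS is implemented, only a simplified illustration of comparing normalized token k-grams. The `KEYWORDS` set, the function names and the choice of `k=4` are all assumptions made for this example; a real tool would use a proper lexer per language.

```python
import re

# Hypothetical, deliberately tiny keyword set (an assumption for this sketch).
# A real checker would use a proper lexer for each language.
KEYWORDS = {"if", "else", "for", "while", "return", "def", "int", "float", "void"}

# Identifiers/keywords, numbers, or any other single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def normalize(source):
    """Tokenize source text and replace identifiers and numbers with placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)        # keep keywords as they are
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")       # any identifier, so renaming has no effect
        elif tok[0].isdigit():
            tokens.append("NUM")      # any numeric literal
        else:
            tokens.append(tok)        # operators and punctuation
    return tokens


def kgrams(tokens, k=4):
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a, b, k=4):
    """Jaccard similarity of the normalized token k-gram sets of two sources."""
    ga, gb = kgrams(normalize(a), k), kgrams(normalize(b), k)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


original = "int sum(int a, int b) { return a + b; }"
renamed = "int add(int x, int y) { return x + y; }"
unrelated = "while (n > 0) { n = n - 1; }"

print(similarity(original, renamed))    # identical after normalization
print(similarity(original, unrelated))  # no shared k-grams
```

A high score is only a hint that a pair deserves a manual look. Because all solutions share the required interface, any handed-out template code would probably have to be excluded before comparing, otherwise every pair would look similar.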

math
  • Nice idea. I guess you can make a PhD on it :) – gefei Apr 25 '12 at 09:39
  • Recently someone tested some plagiarism detection software on scientific homework (so this is not checking source code): http://plagiat.htw-berlin.de/software-en/test2013/ but it may still be useful for other homework. – math Oct 08 '13 at 06:20
  • There are a few papers on source code plagiarism detection found here: http://www.ics.heacademy.ac.uk/resources/assessment/plagiarism/research_sourcecode.html – ElFik Jan 08 '14 at 11:34
  • I could share a proper solution as an answer, which is not bulletproof but almost does the job. So please remove the hold. – math Jul 19 '15 at 19:39
  • There are various tools; e.g., https://theory.stanford.edu/~aiken/moss/ and http://simicheck.com – Luca Jan 06 '16 at 20:17
  • I'm also very interested in this! From my extensive searching online, there is no FOSS, local (not server-side) software that does this for any language. There are a lot of Python projects targeting only Python code... The two most serious solutions I found are: a) use the Compilatio plugin for the popular LMS Moodle, and ask students to submit files as `file.c.txt`, since Compilatio only accepts plain text files (or PDF); b) download and use https://github.com/a-nikolaev/study-in-scarlet, which seems to work well for multi-language files! It's FOSS and written in Ruby, but language-agnostic!! – Næreen Feb 21 '21 at 08:03

1 Answer

7

Award a small prize for detecting it. Given the possibility of a couple of beers, students will pore over the net for hours, looking for matches with other students' submissions.

With large fines for offences, it's self-financing and rewards students who do their own work - they want beer and are not going to leave themselves open to revenge by plagiarising work themselves!

Martin James
  • Careful, students will pair up with each other and "detect" each other's plagiarism, depending on the severity of the penalty and the students' apathy/attitude – Gareth Apr 30 '12 at 01:33
  • There are always groups of students who don't like each other, so I guess what Martin said is more than a given. I would just be careful not to disclose the name of the guy who found it, but at the end of the day, you have to ask yourself what kind of personality those students would be building up. – MeTitus Jun 04 '15 at 16:27