A System for Detecting Software Similarity

--------

What is Moss?

Moss (for a Measure Of Software Similarity) is an automatic system for determining the similarity of programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind Moss is a significant improvement over other cheating-detection algorithms (at least, over those known to us).

--------

What is Moss Not?

Moss is not a fully automatic plagiarism detector. A finding of plagiarism is a claim that someone deliberately copied code without attribution, and while Moss automatically detects program similarity, it has no way of knowing why programs are similar. It is still up to a human to look at the parts of the code that Moss highlights and decide whether there is plagiarism. One way of thinking about what Moss provides is that it saves teachers and teaching staff a lot of time by pointing out the parts of programs that are worth a more detailed examination. But once someone has looked at those portions of the programs, it shouldn't matter whether the suspect code was first discovered by Moss or by a human; the case that there was plagiarism should stand on its own.

In particular, it is a misuse of Moss to rely solely on the similarity scores. These scores are useful for judging the relative amount of matching between different pairs of programs and for seeing more easily which pairs of programs stand out with unusual amounts of matching. But the scores are certainly not proof of plagiarism. Someone must still look at the code.

--------

Languages

Moss can currently analyze code written in the following languages:

C, C++, Java, C#, Python, Visual Basic, JavaScript, FORTRAN, ML, Haskell, Lisp, Scheme, Pascal, Modula2, Ada, Perl, TCL, Matlab, VHDL, Verilog, Spice, MIPS assembly, a8086 assembly, HCL2.

--------

An Internet Service

Moss is being provided as an Internet service. The service has been designed to be very easy to use: you supply a list of files to compare and Moss does the rest.

The current Moss submission script is for Linux.

In response to a query, the Moss server produces HTML pages listing pairs of programs with similar code. Moss also highlights individual passages in programs that appear the same, making it easy to compare the files quickly. Finally, Moss can automatically eliminate matches to code that one expects to be shared (e.g., libraries or instructor-supplied code), thereby eliminating false positives that arise from legitimate sharing of code.

--------

Registering for Moss

Moss is being provided in the hope that it will benefit the educational community. Moss is fast, easy to use, and free. In the past, access has been restricted to instructors and staff of programming courses. This is no longer the case, and anyone may obtain a Moss account.

However, Moss is for non-commercial use. If you are interested in commercial uses of Moss, contact Similix Corporation.

To obtain a Moss account, send an email message to moss@moss.stanford.edu. The body of the message should appear exactly as follows:

registeruser
mail username@domain

where username@domain is your email address.
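As an illustration only, the registration message could be assembled with Python's standard library as sketched below. The function name `build_registration_message` is hypothetical, and the sketch only builds the message; sending it is left to whatever mail client or server you normally use. No subject line is set, since none is specified above.

```python
from email.message import EmailMessage

# Address given in the registration instructions.
MOSS_ADDRESS = "moss@moss.stanford.edu"


def build_registration_message(user_email: str) -> EmailMessage:
    """Build (but do not send) a Moss registration email.

    Hypothetical helper for illustration; the body follows the
    two-line format given in the registration instructions.
    """
    msg = EmailMessage()
    msg["From"] = user_email
    msg["To"] = MOSS_ADDRESS
    # The body must contain exactly these two lines.
    msg.set_content(f"registeruser\nmail {user_email}\n")
    return msg


if __name__ == "__main__":
    # Replace the placeholder with your own address before use.
    message = build_registration_message("username@domain")
    print(message.get_content())
```

Sending the message as plain text matters here: a mail client that adds HTML formatting or a signature would change the body from the format shown.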

If you already have an account, the latest submission script can be downloaded here.

--------

Security and Legal Stuff

If you use Moss, the results will contain copies of the code you submitted and these will be accessible to anyone with the result URL. Stanford can assume no liability whatsoever for your submissions.

Reasonable precautions have been taken to protect the confidentiality of the code you submit. Only you receive the result URL, and it contains a random integer, so it is not easy to guess. Furthermore, the directory with the results cannot be browsed or indexed by robots. Finally, submissions are not retained indefinitely on the server; typically results are deleted after 14 days, though they may be deleted earlier to free up disk space when the server is particularly busy. If some results you need have been deleted, you can simply resubmit your job.

--------

Community Contributions

A number of Moss users have contributed versions of the submission script:

Andrew Cain has written a Ruby gem for Moss.

Hjalti Magnusson has written a summarization/visualization script.

--------

How Does it Work?

A paper on the ideas behind Moss can be found here.

--------

Back to Alex Aiken's Homepage