Introduction 


Welcome to nomoremarking.com, the online platform for Comparative Judgment. This guide will help you set up a simple Comparative Judgment session. We have only set out the basics here. Once you have mastered the basics, there are a great number of options to explore which will make your assessment easier. For example, you can optimise the efficiency of judging by using bar-coded response sheets and images rather than .PDFs. You can set standards and then monitor them over time using linked judging sessions. We recommend first, however, that you get to grips with a basic judging session. 


Different ways of using nomoremarking.com 

 

There are many different ways to use nomoremarking.com. Here are some of the most popular methods:

 

Candidates produce work on paper. You scan the images as .PDFs and upload them to nomoremarking.com to judge.

 

Candidates type in their work online. You set up a task and send links to the candidates to type their answers into our online interface. When they have finished you can judge their responses.

 

Candidates type their responses into an online test platform. You export their responses into an Excel file and then upload them onto nomoremarking.com to judge (a sketch for tidying such an export follows this list).

 

High-volume assessment. For high-volume assessment you can produce bar-coded response sheets so that you can track pupils through the system.
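
If your candidates typed their responses into an online test platform, you may want to tidy the exported spreadsheet before uploading it. Below is a minimal sketch using pandas; the file and column names ("responses.xlsx", "candidate", "response") are assumptions for illustration, not a format required by nomoremarking.com.

    # A minimal sketch for tidying an exported spreadsheet of typed responses.
    # The file and column names ("responses.xlsx", "candidate", "response")
    # are assumptions for illustration, not a required upload format.
    import pandas as pd

    responses = pd.read_excel("responses.xlsx")
    responses = responses.dropna(subset=["candidate", "response"])          # drop incomplete rows
    responses["response"] = responses["response"].astype(str).str.strip()   # trim stray whitespace
    responses.to_excel("responses_clean.xlsx", index=False)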

 

Setting up a simple .PDF judging task 

 

The PDF task is the simplest to set up as you can accept most of the default options.

 

Setting up the task 

 

1.  Log in to nomoremarking.com.

2.  Click on the + Add Task button.

 

3.  Click on the Name of the new task.

4.  Change the Name to be more meaningful.

5.  Change the Judgments per Judge (the number of judgments each judge will be asked to make).

6.  Click Save Changes.

7.  Click on Add Candidates.


 

8.  Click on the Choose Files button and browse to the .PDFs you have saved.

9.  Select your files and click Open.

10.  Order by Uploaded File Size and make sure that no files have 0 KB. If any have failed to upload, click on Review Candidates and delete them. You can then try to upload them again. (A sketch for spotting empty files before you upload is given after these steps.)

11.  Click on Add Judges.


 

12.  Type in the email addresses of the judges.

13.  Click on Do Judging.


 

14.  Try the judging by clicking on the Do Judgments link.

15.  Click on Select All Emails.

16.  Click on Send Emails. Judges will now have their unique judging links and can start judging.
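
Before you upload your scanned .PDFs, you can check for empty files locally, which mirrors the 0 KB check in step 10. This is a minimal sketch; the folder name "scans" is an assumption for illustration.

    # A minimal sketch that flags empty (0 KB) .PDFs before upload,
    # mirroring the check in step 10. The folder name "scans" is an assumption.
    from pathlib import Path

    for pdf in sorted(Path("scans").glob("*.pdf")):
        size = pdf.stat().st_size
        status = "EMPTY - rescan before uploading" if size == 0 else f"{size / 1024:.0f} KB"
        print(f"{pdf.name}: {status}")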

 

Monitoring the judging 

 

Once your judging has started, you may want to monitor it live. 


1.  From the dashboard, click the Refresh button to update the scores and reliability.



2.  Click on the task name and the Judge Feedback tab to see how your judges are doing.


 

3.  Click on the task name and the Candidate Feedback tab to see how your candidates are being judged.


Getting your results 


Make sure you click on the Refresh button to generate your scores before downloading your feedback!




You can view your results online, or you can download them for further analysis.


1.  From the dashboard, click on the task name and the Downloads button.

2.  Click Download judges to get the feedback on judges.

3.  Click Download candidates to get the feedback on candidates.

4.  Click Download decisions to get every decision for further analysis (see the sketch after this list).
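
The downloads are ordinary spreadsheet files, so you can load them into any analysis tool. Below is a minimal sketch using pandas; the file names and column names ("candidates.csv", "decisions.csv", "scaledScore", "infit") are assumptions for illustration, so check the headers in your own downloads before running it.

    # A minimal sketch for exploring the downloads. File and column names
    # ("candidates.csv", "decisions.csv", "scaledScore", "infit") are
    # assumptions; check the headers in your own files.
    import pandas as pd

    candidates = pd.read_csv("candidates.csv")
    decisions = pd.read_csv("decisions.csv")

    print(candidates.sort_values("scaledScore", ascending=False).head(10))  # highest-scoring candidates
    print(candidates[candidates["infit"] > 1.2])                            # candidates judges disagreed about
    print(len(decisions), "decisions in total")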

 

Understanding your results


Candidate: The piece of work or script that has been judged. The candidate is usually in the form of a .PDF, script, image, or video.

True Score: The score we estimate for a candidate, based on the results of the judgments.

Scaled Score: The True Score on a more meaningful scale, for example with a mean of 100 and a standard deviation of 15.
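
As a sketch of the arithmetic only (the platform does this for you), mapping True Scores onto a scale with a mean of 100 and a standard deviation of 15 looks like this; the example scores are made up for illustration.

    # A minimal sketch of the linear rescaling described above: mapping
    # True Scores onto a scale with mean 100 and standard deviation 15.
    import numpy as np

    true_scores = np.array([-1.2, -0.4, 0.0, 0.3, 1.1])   # illustrative True Scores
    scaled = 100 + 15 * (true_scores - true_scores.mean()) / true_scores.std()
    print(scaled.round(1))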

Infit: A measure of consistency. We provide an infit measure for judges and candidates. Infit values of 1 or below represent high consistency, while values above 1.2 suggest inconsistency. For a judge, high infit (> 1.2) combined with a low median judgment time suggests poor-quality judging through carelessness. For a candidate, high infit (> 1.2) suggests that different judges have differing opinions about the quality of the candidate's work. Low infit values (< 1.0) are of no concern.

Reliability: A measure of the consistency of your results. Values range from 0 (all noise) to 1 (perfect repeatability); values over 0.8 suggest that your scale is likely to be stable and repeatable.

Inter-rater: The correlation between the scale produced by half of your judges and the scale produced by the other half of your judges. Values go from -1 (perfect disagreement) to 1 (perfect agreement). We repeat this with four random splits of the judges and report the mean and standard deviation across the four replications.
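
To make the split-half idea concrete, here is a rough sketch using the decisions download. It correlates simple win rates rather than refitting True Scores for each half, and the column names ("judge", "won", "lost") are assumptions for illustration, so it illustrates the idea rather than reproducing the reported statistic.

    # A rough illustration of the split-half idea behind the Inter-rater
    # statistic, using simple win rates instead of fitted True Scores.
    # Column names ("judge", "won", "lost") are assumptions for illustration.
    import numpy as np
    import pandas as pd

    decisions = pd.read_csv("decisions.csv")

    def win_rates(df):
        wins = df["won"].value_counts()
        losses = df["lost"].value_counts()
        totals = wins.add(losses, fill_value=0)
        return wins.reindex(totals.index, fill_value=0) / totals

    rng = np.random.default_rng(0)
    judges = decisions["judge"].unique()
    correlations = []
    for _ in range(4):                      # four random splits, as described above
        half = rng.choice(judges, size=len(judges) // 2, replace=False)
        in_half = decisions["judge"].isin(half)
        both = pd.concat([win_rates(decisions[in_half]),
                          win_rates(decisions[~in_half])],
                         axis=1, keys=["a", "b"]).dropna()
        correlations.append(both["a"].corr(both["b"]))

    print(np.mean(correlations), np.std(correlations))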