One of the most powerful aspects of Comparative Judgement is the way you can link judging sessions together with anchors. Anchoring allows you to place the results from different judging sessions on the same scale. For example, we recently took samples from six schools and judged them together, then used the results from that session to place all the work from the six schools on the same scale. In another experiment, we judged the work from two separate mock exams together to see whether pupils had improved between the two exams.


So how do anchors work? Anchors are scripts with fixed values for their True Score. Consider the pupils here:




We can add an anchor manually to the session by typing in a new True Score value for any of these Candidates. For example, if Pupil 1 had a True Score of 4.12 in a previous session, we can link this session to the previous one by entering 4.12 as the anchor value. To enter an anchor score manually, go to Review Candidates, type in the new True Score, then press the + symbol.
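To make the idea of a fixed True Score concrete, here is a minimal sketch of how an anchored script can constrain a comparative-judgement scale. It assumes a simple Bradley-Terry model fitted by maximum likelihood on made-up comparison data; it illustrates the principle only, and is not the exact method the software uses.

```python
# A minimal sketch (not the software's actual algorithm) of how an anchor
# constrains a comparative-judgement scale: fit a simple Bradley-Terry model
# but hold the anchored script at its True Score from the previous session.
import numpy as np
from scipy.optimize import minimize

# Hypothetical judging data: each pair is (winner_index, loser_index).
comparisons = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
n_scripts = 4

# Anchor: Pupil 1 (index 0 here) is fixed at 4.12, its previous True Score.
anchor_index, anchor_value = 0, 4.12

def neg_log_likelihood(free_scores):
    # Re-insert the fixed anchor value so it is never re-estimated.
    scores = np.insert(free_scores, anchor_index, anchor_value)
    ll = sum(scores[w] - np.logaddexp(scores[w], scores[l]) for w, l in comparisons)
    return -ll

result = minimize(neg_log_likelihood, x0=np.zeros(n_scripts - 1))
scores = np.insert(result.x, anchor_index, anchor_value)
print(scores)  # the anchored script stays at 4.12; the rest are scaled around it
```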



Then go to the General tab and set Include Anchor to Yes.


Then click Save Changes and Refresh to generate new True Scores.



When you go back to the Candidate Feedback screen you will see that the True Score of Pupil 1 is set to the anchor value you typed in.



The other scores have been re-scaled, and are now lower than they were previously. Although anchoring one Candidate has had some effect, you need to anchor more Candidates that also appeared in the previous task before the two tasks are truly on the same scale. The easiest way to do this is to import the scores from a previous task using the import anchors selection box on the General tab.



If you select the task where the scripts were previously judged, the scores of any Candidates with a matching name in that task will be automatically imported as anchor values. Note that the import requires the scripts to have already been added to the new task as Candidates - the import function only brings in the True Scores, not the Candidates themselves. It is hard to specify exactly how many anchors you need, as the stability of the linking depends on how consistently the anchors are judged between sessions. Empirically, however, anchoring around 20 per cent of scripts (1 in 5 scripts in the new task) has resulted in very stable links.
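Conceptually, the import works like a merge on Candidate names. The sketch below shows the idea with pandas; the file and column names are assumptions for illustration, not the software's actual export format.

```python
# Rough sketch of the "import anchors" idea: match Candidates by name and
# carry over their previous True Scores as anchor values. File and column
# names are assumed for illustration only.
import pandas as pd

previous = pd.read_csv("previous_task_candidates.csv")  # columns: Name, True Score, ...
new_task = pd.read_csv("new_task_candidates.csv")       # columns: Name, ...

anchors = previous[["Name", "True Score"]].rename(columns={"True Score": "Anchor Score"})
new_task = new_task.merge(anchors, on="Name", how="left")  # unmatched names get no anchor

print(new_task["Anchor Score"].notna().mean())  # fraction of scripts that are anchors (aim for ~0.2)
```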


Once you have imported your anchor scores, press Refresh and the new task will be on the same scale as the old task. You can check that the anchoring has worked by downloading the Candidates spreadsheet and checking that the anchored Candidates' True Scores match their Anchor Scores.
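If you prefer to run that check programmatically, something like the following would do it, assuming the downloaded spreadsheet has "True Score" and "Anchor Score" columns (the actual column names may differ):

```python
# Quick check that anchoring has worked: anchored Candidates' True Scores
# should equal their Anchor Scores. Column names are assumptions.
import pandas as pd

candidates = pd.read_csv("candidates.csv")
anchored = candidates.dropna(subset=["Anchor Score"])
mismatches = anchored[(anchored["True Score"] - anchored["Anchor Score"]).abs() > 1e-6]
print(f"{len(anchored)} anchored scripts, {len(mismatches)} with True Score != Anchor Score")
```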



If you take an average of the True Scores, you will now see that it is no longer 0, and the average of the Scaled Scores is no longer 100. The results have shifted onto the same scale as the other task. In the example below, a lower average score shows that the Candidates in this later session were judged to be worse overall than the Candidates in the earlier session.
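To see that shift in your own data, a quick sketch like this could compute both averages from the Candidates download (again assuming "True Score" and "Scaled Score" column names):

```python
# Compare session averages to see the scale shift after anchoring.
# Before anchoring, True Scores average 0 and Scaled Scores average 100.
import pandas as pd

candidates = pd.read_csv("candidates.csv")
print("Mean True Score:  ", round(candidates["True Score"].mean(), 2))
print("Mean Scaled Score:", round(candidates["Scaled Score"].mean(), 2))
```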