The Justice Syndicate: Kris's methodology / by Rachel Briscoe

*** SPOILER ALERT: If you have not yet seen The Justice Syndicate, we advise not reading this ***

Friday 8 February 2019.

During each session of The Justice Syndicate, the system anonymously logs information about how people vote, how long it takes them to vote, etc.

At the date of writing, 167 people have participated across a total of 15 sessions. That’s a sufficiently large number to do some early analysis of guilty/not-guilty voting data, but for other questions we’re interested in, we will need to play more sessions with more participants.

We presented some of these results in the newspaper that participants receive at the end. This blog explains some of the calculations behind those results.

Overall Results

Of the 15 sessions, 2 so far ended with a guilty verdict, meaning 10 or more people voting guilty in the final vote. Of the 13 that didn’t reach a guilty verdict, 4 reached outright ‘not-guilty’ verdicts (10 or more people voting not-guilty), whereas 9 sessions remained divided.

Of the 167 participants, 75 (45%) ended with a guilty vote and 89 (55%) with a not-guilty vote. In the first vote (about 10 minutes into the play), the split is roughly reversed: 55% voted guilty and 45% not-guilty. As some participants told us, the first vote feels less consequential than the last one, when “reasonable doubt” plays a stronger role in their deliberations.

What was surprising, looking at these first results, is how close the overall voting numbers are to 50/50. That’s what we would expect to see if people voted randomly, without thinking. Despite the almost 50/50 split, there are clear patterns in the voting data that speak very much against the idea of random voting. These patterns are caused by 2 competing psychological factors at work in The Justice Syndicate: the consistency effect, and the pressure for group uniformity.

Consistency


People have a tendency to want to be consistent in their beliefs and actions. This is a well-established principle in social psychology (see, e.g., Robert Cialdini’s book “Influence”). In The Justice Syndicate, the consistency principle is triggered when you are first asked to vote, or when you take a public stance in the group discussions. If you found yourself thinking “I just voted X, I’d better come up with good arguments to support my view”, then that’s consistency at work.

We expected the voting record to show clear signs of consistency, meaning that we’d expect more people to vote the same throughout their session than people flipping back and forth.

Our first 167 participants do not disappoint: 39% of them never changed their vote from the first one. That 39% breaks down into 22% of all participants who voted consistently guilty and 17% who voted consistently not-guilty. Whether this difference is statistically significant is too early to say - we need more data for that. A further 45% of participants are nearly consistent, meaning that they cast only one vote that is different from their first, or they flip only once and then stick to that new vote for the rest of the play. That leaves a mere 16% of people who change their vote more than once.

These consistent voting patterns are highly unusual. If people voted randomly, then we’d expect only 3% of the voting records to be all-guilty or all-not-guilty - not 39%. Here is how to calculate that: most sessions have 6 voting rounds. If we assign a 1 to every guilty vote and a 0 to every not-guilty vote, then the voting record of each individual can be written down as a sequence of 0s and 1s. With 6 voting rounds, there exist 2^6 = 64 unique voting records. Only 2 of these are consistent (one with all 0s, and one with all 1s), meaning there is a 2/64 ~ 3% probability of encountering all 0s or all 1s by chance. There are a further 20 sequences (32%) which are near-consistent, leaving 42 sequences (65%) which are not consistent. The 16% of actual inconsistent voters is far less than the 65% we’d expect with pure random voting. As psychologists say, this is a pretty strong “effect size”.
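
For readers who want to check the counting themselves, here is a minimal Python sketch that enumerates all 64 possible voting records. The `classify` function and its category labels are our own illustrative names, not part of the show's system, and the near-consistent rule is a literal reading of the definition above (exactly one vote differing from the first, or exactly one switch between consecutive rounds).

```python
from itertools import product

ROUNDS = 6  # most sessions have 6 voting rounds


def classify(record):
    """Label one voting record (tuple of 0s and 1s); hypothetical helper."""
    first = record[0]
    if all(vote == first for vote in record):
        return "consistent"
    # Condition 1: exactly one vote differs from the first vote.
    one_off = sum(vote != first for vote in record) == 1
    # Condition 2: exactly one switch between consecutive rounds,
    # i.e. the voter flips once and then sticks with the new vote.
    switches = sum(a != b for a, b in zip(record, record[1:]))
    if one_off or switches == 1:
        return "near-consistent"
    return "inconsistent"


counts = {"consistent": 0, "near-consistent": 0, "inconsistent": 0}
for record in product((0, 1), repeat=ROUNDS):
    counts[classify(record)] += 1

total = 2 ** ROUNDS
for label, n in counts.items():
    print(f"{label}: {n}/{total} = {n / total:.0%}")
```

Under this literal reading the enumeration finds 2 consistent and 18 near-consistent records, slightly fewer than the 20 quoted above. The difference arises because a record such as 111110 satisfies both conditions and is counted only once here. Either way, random voting would produce far more inconsistent records than the 16% observed.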

Pressure for Group Uniformity

The large effect of consistency is all the more surprising if we take into account the other effect we expected to see, which can counteract it: the pressure for group uniformity (see Leon Festinger’s 1950 paper “Informal Social Communication”). If we exclude the 2 of the 15 sessions which had fewer than 11 participants, then 6 out of the remaining 13 sessions (45%) ended with a near-uniform verdict. By that we mean that only 0, 1 or 2 jurors differ from the majority. Two of those were guilty verdicts, whereas 4 were not-guilty. Whether this is a significant difference is too early to say.

What is interesting, however, is that the proportion of near-uniform verdicts is higher than we would expect without social influence. In that case, we’d expect about 4% of verdicts to be near-uniform - not 45%. Here’s how to calculate that: remember that the overall split between guilty and not-guilty votes is close to 50/50. Voting without group influence is similar to dividing those guilty and not-guilty votes randomly across different groups. If we assume (for simplicity) that each group has 12 participants, and if we again assign 1s to guilty and 0s to not-guilty votes, then there are 2^12 = 4096 sequences of 0s and 1s that stand for unique final group votes. Of these, there are only 79 which have 10 or more 1s, and 79 which have 10 or more 0s. The chance of achieving a near-uniform vote randomly (without influence) is therefore (79+79)/4096 ~ 4%.
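
The figure of 79 can be checked by adding up binomial coefficients. The short Python sketch below does that under the same simplifying assumptions as above (12 jurors per group, each vote an independent 50/50 coin flip). The constant and variable names are ours, chosen for illustration.

```python
from math import comb

JURORS = 12          # simplifying assumption: every group has 12 participants
MAX_DISSENTERS = 2   # near-uniform: at most 2 jurors differ from the majority

# Number of final group votes with 10, 11 or 12 guilty votes.
one_sided = sum(
    comb(JURORS, k) for k in range(JURORS - MAX_DISSENTERS, JURORS + 1)
)

total = 2 ** JURORS  # all possible final group votes

# The not-guilty side is symmetric, and the two sides cannot overlap
# because 10 guilty and 10 not-guilty votes do not fit into 12 jurors.
p_near_uniform = 2 * one_sided / total

print(f"{one_sided} sequences per side out of {total}")
print(f"chance of a near-uniform vote: {p_near_uniform:.1%}")
```

The sum is 66 + 12 + 1 = 79 sequences per side, so the chance of a near-uniform vote is 158/4096, a little under 4%.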

Thirteen sessions are not enough to draw strong conclusions from, and it’s too early to say whether the 45% of near-uniform final votes will hold up. However, the initial vote, which is cast before people have a chance to talk to each other, is close to the no-influence condition. So far it has never been near-uniform and is usually more evenly split. That makes it more likely that the high proportion of near-uniform final votes is indeed a consequence of pressure for group uniformity.

Demographic Differences

One question we have been asked a lot is whether there are demographic differences. The tablet asks for gender and age group at the start. With the number of participants we have had to date, the only demographic measure worth looking at is whether different genders vote differently. So far there is almost no difference: the gap is too small to say whether it is statistically significant or simply due to chance. As more data comes in, we will be able to analyse those effects, and also whether there are connections between, e.g., gender and consistency, or age group and social influence.