Lev Yashin makes a save in the 1966 World Cup semifinal between the Soviet Union and West Germany

I Expect You to Save: Introducing a new model to rate shot-stopping in the SPFL.

A life-long member of the Goalkeepers Union, Christian Wulff uses advanced stats to introduce a new method of measuring the shot-stopping performances of the goal custodians in the SPFL Premiership.

 

In a past life I was a goalkeeper. I was OK. Slow, not a great jump and quite the liability at crosses. But my distribution was decent, I had very good reactions and, together with the ability to shout at my defenders constantly for 90 minutes, that got me a Norwegian F.A. Youth Cup winners’ medal (as an injury-time substitute in the final) and to the dizzy heights of the fourth tier of the Norwegian league pyramid.

We always had our own team within the team; the goalies – and if you were lucky enough to have one – the goalkeeping coach. Other coaches and players would always have an opinion on what a goalkeeper should do, what was a good save and what wasn’t, what was difficult and what was easy. They were mostly wrong.

But the Goalkeepers’ Union knew, and even if you had a rival for the starting spot you always backed each other up, helped prepare for a game, always ready with a quiet word at half-time. You were in the trenches together, and nobody but a fellow comrade in gloves knew what it was like in there.

Goalkeepers are always suspicious of non-union members trying to judge a goalkeeping performance. Shot-stopping, crosses, distribution and organisation are the four columns a goalkeeper’s performance is built on, with speed, strength, agility, technique, timing, concentration, awareness, confidence and mental strength its components.


With the continuing emergence of advanced statistics within football, there are now also efforts to objectively measure goalkeeping performance down to the smallest detail. It’s a difficult task but, true to form, it’s one of our own union members who is at the forefront of advanced goalkeeping statistics. Sam Jackson only played at youth level, but today he is the head of research and analytics at World in Motion, a football agency just for goalkeepers. Some of Sam’s work can be found through his Twitter account (@sam_jackson94), and he has developed (and continues to improve) advanced stats that cover shot-stopping, crosses and distribution.

What, then, can the available advanced statistics tell us about the shot-stoppers in the Scottish Premiership? First of all, we’re restricted to exactly that: shot-stopping. There simply isn’t the readily available data for Scottish football to judge distribution or cross management.

Luckily, the good people at Stratabet offer us access to a very detailed look at all the chances created in the Premiership, which forms the basis for our shot-stopping analysis. Last year, Matt Rhein (of @thebackpassrule.com) and I created the first publicly available expected goals (xG) model adapted to the Scottish Premiership. Full details can be found in Matt’s article, but essentially we’ve ignored data from Europe’s five top leagues, instead creating a model based on the 16 other leagues Strata cover, all a tier (or maybe two) below the best. Going back to 2016, we now have a dataset of over 120,000 shots and headers from this ‘League of Average Leagues’ to create our model.

An expected goals model tries to measure and attribute a value to the quality of the chances a team creates (and concedes). In most models, chances are grouped together based on the location of the attempt and whether it was a shot, header, direct free-kick or penalty. Our model also takes into account the defensive situation faced by the attacker: how many defenders were between the ball and the goal, and how much pressure the attacker was under at the time. This gives us over 200 different chance categories with an expected goals (xG) value given to each one. That value is quite simply based on how many goals have been scored from each of these chance categories in our database of 120,000+ attempts.

For example, if a goal has been scored from a certain chance category 1 in 10 times, the xG value is 0.10, and so on.
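That per-category lookup is simple enough to sketch in a few lines of code. The category names and the tiny ‘history’ below are invented for illustration; the real model is of course built from the 120,000+ Strata attempts:

```python
# Sketch of a per-category xG lookup: xG = historical goals / attempts.
# Category keys and the sample history are hypothetical, for illustration only.
from collections import Counter

attempts = Counter()  # attempts seen per chance category
goals = Counter()     # goals scored per chance category

# Hypothetical historical attempts: (chance category, was it scored?)
history = [
    ("shot_box_centre_pressure2", True),
    ("shot_box_centre_pressure2", False),
    ("header_6yd_pressure4", False),
    ("shot_box_centre_pressure2", False),
]

for category, was_goal in history:
    attempts[category] += 1
    if was_goal:
        goals[category] += 1

def xg(category):
    """xG value: share of historical attempts in this category that were scored."""
    return goals[category] / attempts[category]

print(round(xg("shot_box_centre_pressure2"), 2))  # 1 goal from 3 attempts -> 0.33
```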

Perhaps slightly counter-intuitively, the term ‘expected goals’ is more a reflection of what has happened in the past; the outcomes of previous chances are used to determine the quality of current ones.

So standard xG models look at the quality of a chance through the probability of that chance being scored, based on a range of factors *at the point of* a shot or header being taken; a shot that is blasted far over the goal by an attacker in loads of space seven yards out is a bigger chance than a thunderbolt from 30 yards that hits the underside of the bar, even though the latter is closer to being scored.

(Photo by Bob Thomas/Getty Images)


Just as we compare an attacker’s goal tally to the quality of chances they have had (goals vs expected goals), we can do the reverse for goalkeepers: the quality of the chances faced by a goalkeeper compared to the goals they’ve conceded (sometimes referred to as expected saves).

But while standard xG models determine the quality of a chance based on the factors at the point of a shot or header being taken, using such a model to judge goalkeepers is likely to skew the measurement of their shot-stopping abilities. This is because of two factors:

First, a standard xG model includes shots and headers that did not go on target, i.e. ones that were not scored or saved by the goalkeeper.

As an example: say 2000 shots have been taken from 10 yards out in the middle of the penalty box. Of those 2000 shots, 400 have been scored (1 in 5), giving that category of chance an xG value of 0.20.

However, only half (1000) of those shots were on target. Since 400 of the shots were scored, the expected goals value for a shot from that location that hits the target is 0.40 (400 goals from 1000 shots, or 4 in 10).

Adding such ‘post-attempt’ information to the model gives us a more accurate picture of how many saves a goalkeeper can be ‘expected’ to make from that location.
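Using the worked numbers above, the difference between the standard (pre-shot) value and the on-target value is a one-line calculation:

```python
# The worked example from the text: the same location gives two different xG
# values depending on whether we condition on the shot being on target.
shots_total = 2000
shots_on_target = 1000
goals = 400

xg_all = goals / shots_total            # standard, pre-shot model
xg_on_target = goals / shots_on_target  # 'post-attempt' model, on target only

print(xg_all, xg_on_target)  # 0.2 0.4
```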

This first factor is an objective one; the shot is either on target or it isn’t.

But there is another – major – factor that would influence the expected goals value of a shot on target; the quality of the strike.

Before we look closer at it, a slight (even more) nerdish detour might be necessary, as we’ve now arrived at one of the most contentious points around expected goals models. Almost all the best-known public xG models today use Opta data (such as the xG values on Match of the Day, which come from Opta’s own model, and models belonging to analysts such as Sander Tegen and Michael Caley, whose single-game xG maps are widely shared on Twitter). Almost all of these Opta-based models include a factor that gives a significantly higher xG value to chances that Opta classify as a ‘Big Chance’.

The potential problem is that this is a subjective classification determined by the Opta analysts who gather all the raw data from each game. And while these analysts will have a certain set of criteria to make that decision from, there is a distinct risk of confirmation bias: if there’s an attempt that is borderline ‘Big Chance’, will whether a goal was scored from that attempt subconsciously sway the analysts in determining whether it was a ‘Big Chance’, and therefore give it a higher xG value in the models?

The model Matt and I built on Strata data does not include a ‘Big Chance’ factor (Strata’s closest equivalent is ‘Superb Chance’). However, there is one smaller factor in our standard xG model which is subjective: the ‘defensive pressure’ part.

Strata’s analysts judge the defensive pressure on the attacker at the time the attempt is made, on a scale of 1 to 5. Again, this score is based on pre-defined criteria and is routinely checked by Strata for adherence, but it is still subjective. This score makes up half of the ‘defensive situation’ factor in our model (the other half being the far more objective ‘players between ball and goal’). As another aside, the man we call ‘The Godfather of Scottish Football Analytics’, Seth Dobson, is developing a far more mathematically advanced xG model for Scottish football, using only ‘objective’ factors from Strata. But then he has a bigger brain than all of us.

As mentioned before we went wandering down the tangent, Strata also have another subjective metric in their data: they rank the quality of each shot or header on a scale from 1 to 5, where 5 is the highest. Again, there is a risk of bias in this metric; if in doubt, analysts may be likely to choose one shot quality score over another based on whether a goal was scored or not.

However, adding shot quality into our expected saves model has obvious benefits when judging goalkeepers. With this information we now not only know if a shot was on target, we know how good the quality of the strike was.


To come back to our hypothetical shot from 10 yards out: if 200 of those 1000 shots on target were of shot quality 4 and 160 of them were scored, this would give an expected goals value of 0.80 for that chance (as 4 in 5 shots have been scored).

If 200 of those attempts had a shot quality of 2 and 20 of them were scored (1 in 10), the expected goals value is 0.10.

Without using shot quality, the expected goals value of the chance category above was 0.40, i.e. we ‘expect’ the goalkeeper to save this shot 6 out of 10 times. With shot quality added, we expect the keeper to save this shot either 2 out of 10 times (quality 4) or 9 out of 10 times (quality 2).
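Sticking with the hypothetical 10-yard numbers, the quality-conditioned lookup looks like this:

```python
# Extending the 10-yard example: condition the on-target xG on Strata's
# 1-5 shot quality score. Counts come from the hypothetical example above.
on_target = {  # quality score -> (attempts on target, goals scored)
    4: (200, 160),
    2: (200, 20),
}

def xg_by_quality(quality):
    """On-target xG for this location, given the shot quality score."""
    attempts, goals = on_target[quality]
    return goals / attempts

print(xg_by_quality(4))  # 0.8 -> keeper expected to save only 2 in 10
print(xg_by_quality(2))  # 0.1 -> keeper expected to save 9 in 10
```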

It’s a very clear example of how adding in shot quality can make a huge difference in judging the shot-stopping ability of goalkeepers.

As above, when writing and discussing advanced statistics it’s always important to be aware of, and acknowledge, the potential issues with any of the metrics and models we use. But with this xG model including such ‘post-attempt’ data (on target or not, and shot quality), we can be reasonably confident that it will more accurately measure the shot-stopping ability of the goalkeepers in the Scottish Premiership this season.

Using the above model, we can compare the expected goals value of all the chances the goalkeepers have faced this season to the number of goals they have conceded (penalties excluded).

Essentially we’re comparing the performance of the Premiership goalkeepers to all the previous goalkeeping actions in our database. If they have conceded fewer goals than the expected goals value of the shots they’ve faced, they are overperforming against this historical average.

If they have conceded more goals than the expected goals value faced, they are underperforming.

There is one more metric we add to our main table and visualisation for rating shot-stopping, and that is the average expected goals value per shot faced (the total xG value of all chances faced divided by the total number of chances faced).
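As a sketch with invented numbers (the chance values and minutes below are hypothetical), the two headline metrics for a single goalkeeper would be computed like this:

```python
# Two metrics for one goalkeeper, on made-up data:
# (1) goals conceded minus total xG faced, per 90 minutes played, and
# (2) average xG per shot faced.
xg_faced = [0.40, 0.10, 0.80, 0.10, 0.35, 0.25]  # on-target chances faced (hypothetical)
goals_conceded = 1
minutes_played = 180

total_xg = sum(xg_faced)  # 2.0 xG worth of chances faced
performance_per90 = (goals_conceded - total_xg) / minutes_played * 90
avg_xg_per_shot = total_xg / len(xg_faced)

# Negative = conceding fewer goals than expected (overperforming).
print(round(performance_per90, 2))  # -0.5
print(round(avg_xg_per_shot, 3))
```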

This can be a significant factor due to the concept of random variance. Again, Matt Rhein has described this principle in more detail, but simply put: if one goalkeeper faces one chance with an xG value of 0.50 and another goalkeeper faces five chances with an xG value of 0.10 each, they have both faced a total of 0.50 xG worth of chances.

However, due to random variance, the goalkeeper who faced the one chance with a 0.50 xG value is more likely to have conceded a goal than the other goalkeeper (and less expected to save that one shot than the other goalkeeper is to save all five shots).
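The arithmetic behind that is straightforward. Assuming the five small chances are independent, the probability of conceding at least one goal differs noticeably between the two keepers:

```python
# The random-variance example: both keepers face 0.50 xG in total, but the
# probability of conceding at least one goal differs.
p_one_big = 0.50                     # one chance at 0.50 xG
p_five_small = 1 - (1 - 0.10) ** 5   # five independent chances at 0.10 xG each

print(p_one_big)                 # 0.5
print(round(p_five_small, 2))    # 0.41 -> less likely to concede at all
```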

Combining these two metrics, we can start to look at which goalkeepers have statistically performed the best when it comes to shot-stopping in the Premiership this season.

As every self-respecting analyst should, we list the data per 90 minutes played. Here’s the table, listed by which goalkeeper has overperformed the most compared to the historical average, including only goalkeepers with at least nine full 90-minute games played in the league so far this season.

 

Table

Here is the same information visualised on a graph and, to complement the data, the number of shots faced (size of point) and the percentage of all shots saved (colour) have also been added.

Graph

The value listed is the number of goals conceded minus the expected goals value of shots on target faced: the more expected goals faced than goals conceded, the better the performance. There are two goalkeepers who really stand out. At the top is St. Johnstone’s Alan Mannus, who has conceded 0.28 goals fewer per 90 minutes played this season than the expected goals value of the shots against him. This means he is conceding almost 1 goal fewer than expected every three games. If he performed at this rate over an entire season, he would concede over 10 goals fewer than the historical average, which would be a major contribution to his team’s results.

Having lost his place to Zander Clark earlier in the season, Mannus didn’t play between October and the end of January. Based on these numbers, that was a strange decision by Tommy Wright. Clark’s shot-stopping stats are not terrible, but they are below the league average; he concedes 0.08 goals more than expected per 90 (3 goals over a whole season). It’s worth noting that Mannus has also faced the 5th highest average chance quality, confirming his status as the statistically best shot-stopper in the league so far.

It’ll be interesting to see whether his reintroduction into the St. Johnstone team might help them overcome their recent travails.

Mannus is followed in the table by a goalkeeper who has recently moved to his third club of the season. Celtic’s recent recruit Scott Bain lost his place to Elliot Parish earlier in the season and was never given a game in his very short spell at Hibernian. In the 12 full games he has played, Bain has conceded on average 0.21 goals fewer than expected in each one – that’s almost 8 goals over a league season. Parish himself is also among the better shot-stoppers statistically, with 0.12 fewer goals conceded per 90, although his average chance quality faced is among the lowest in the league.

Bain could have been a much-needed addition to Hibernian during the second half of the season, as their first-choice goalkeeper Ofir Marciano comes out towards the bottom of this shot-stopping table. He concedes 0.13 more goals per 90 minutes played than can be expected, which is almost 5 goals over an entire season.

Craig Gordon would probably be most people’s choice as the best goalkeeper in the league, and his shot-stopping stats are more than decent. He has the sixth best Goals Conceded vs Expected Goals Faced value, and is very close to Tomas Cerny in third place. Cerny has the best ratio in this category of the Glasgow clubs’ keepers, with Wes Foderingham 8th out of 15, conceding 0.03 more goals than xG per 90, so pretty much exactly ‘as expected’.

Ross County have been underperforming at team level when it comes to expected goals all season, and some blame probably needs to be laid at the door of their goalkeepers, with Scott Fox and Aaron McCarey propping up the shot-stopping table. They’ve both conceded 0.20 goals per full game more than can be expected. That’s one goal every five games and almost eight goals over a full season. It gets worse for McCarey especially, as on average he’s faced the second lowest chance quality. Based on these numbers, he’s been the worst shot-stopper in the SPFL Premiership this season.

It’s worth mentioning Jamie McLaughlin’s numbers as well. He actually has the highest save percentage in the league this season (stopping 79.5% of all the attempts he’s faced) – a more traditional metric for goalkeepers. However, he has faced the lowest average chance quality. So he might save a high percentage of the attempts fired at him, but the difficulty of those shots and headers is on average the lowest in the league. It’s a good example of how advanced stats like expected goals allow us to drill further into the detail and give us a more comprehensive picture of a goalkeeper’s shot-stopping abilities than more standard statistics.

Matt and I will be tracking these numbers over at @thebackpassrule and @x90Cynic on Twitter for the rest of the season to determine who will be the statistically best shot-stopper in the SPFL Premiership this season. Certified by the Goalkeepers’ Union, of course.

 

This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.

 


Christian came to Scotland in 2001 and nobody has yet managed to get rid of him. A native of Oslo, Norway, he was a huge fan of Ronny Deila before it was cool and still is now that nobody likes him anymore. Christian joined the Cynics in 2014 and is now website editor and infrequent podcaster. He has previously written for The Herald and The Scotsman, and has also contributed to STV (face) and BBC (voice).



© 2018 90 Minute Cynic. All rights reserved.