By: Shail Mirpuri
In tennis, few things are discussed more than a player's serve. The serve helps a player dictate the flow of a point, and can be crucial in determining a player’s overall success within a Grand Slam. With the rise in point-level data collection over the last few years, far more thought now goes into the serve. From the surface being played on to the level of risk taken on each serve, tactical serving rules modern men’s tennis. In this article we will analyze the change in service statistics over the 15-year period from 2000 to 2015 to see whether serve analytics has had any notable impact. We will also look at how the importance of the serve varies by surface in order to gauge any surface-specific tactical insights that tennis players should adopt. Finally, we will analyze the overall importance of the serve to a player’s success by applying unsupervised machine learning, and by comparing the serves of the Big Three (Rafael Nadal, Roger Federer, Novak Djokovic) with those of other seeded and non-seeded players.
The Dataset
In order to perform this analysis, we needed to examine a large amount of match-level data. We decided to focus only on matches played in the four Grand Slams: Wimbledon, Roland Garros, the Australian Open, and the US Open. This was essential in order to control for the quality of opponents and for the number of sets played per match. To judge the ‘success’ of a player based upon their serving statistics, we first need to define what we consider ‘success’ in tennis. In this article, we will consider two definitions. First, we define a successful player by the percentage of Grand Slam matches they played as a seeded player over the course of their career. Under this definition, successful players are those who were consistent over their whole career, rather than having a short period of success. Second, we will look at a player’s win percentage in Grand Slams. With these two definitions, we can holistically analyse how a player’s serve relates to his long-term success in the game of tennis.
The Evolution of the Serve
Nowadays, it is almost unheard of for a professional player not to use data when analysing their game. With data-centric approaches to the serve on the rise, it is interesting to consider how things have changed since the 2000s, when data was not collected as heavily as it is today.
First, we can see from the graph above that the difference between the percentage of first-serve points won by the winners and by the losers of a match has increased slightly over time. The graph is based on the yearly average percentage of first-serve points won by the winners and losers of every match. We did this to see whether the serve has become more important to a player’s success in Grand Slam tennis over recent years. In particular, if we compare the differential in 2015 to that in 2000, we see an increase of 1.5 percentage points. This difference, although relatively small, can have a large impact on a player’s chance of winning a given match. It suggests that the serve has become more important to winning Grand Slam matches, which may imply that the more data-centric approach to serving adopted over the last decade is working, since the percentage of points won on serve is increasing. Furthermore, the improvement in points won on the first serve may suggest that matches are, on average, getting shorter, since good serves tend to decrease point duration. This may become more of a priority for players in best-of-five matches, where it is beneficial to reduce energy expenditure by spending less time on court. This preservation of energy may be a crucial factor in a player’s success in the later rounds of a Grand Slam tournament.
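The yearly differential described above can be sketched in a few lines of Python. This is only an illustration of the calculation, not the code behind the graph: the record layout, the function name `yearly_first_serve_differential`, and the numbers are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical match records (made-up numbers, not from the dataset):
# (year, winner's % of first-serve points won, loser's % of first-serve points won)
matches = [
    (2000, 76.0, 68.0),
    (2000, 74.0, 67.0),
    (2015, 78.0, 68.5),
    (2015, 77.0, 68.0),
]


def yearly_first_serve_differential(rows):
    """Return {year: mean winner % minus mean loser %} in percentage points."""
    winners, losers = defaultdict(list), defaultdict(list)
    for year, winner_pct, loser_pct in rows:
        winners[year].append(winner_pct)
        losers[year].append(loser_pct)
    return {year: mean(winners[year]) - mean(losers[year]) for year in sorted(winners)}


diff_by_year = yearly_first_serve_differential(matches)
# Change in the differential between the first and last year of the period.
change = diff_by_year[2015] - diff_by_year[2000]
print(diff_by_year, change)
```

Averaging winners and losers separately within each year, and only then subtracting, mirrors the description of the graph as a comparison of yearly averages.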
On the other hand, several other factors could account for the slight rise in the first-serve differential over the years. First, the quality of ATP players over this 15-year period is not accounted for. For instance, if the gap between high-quality and low-quality players has widened, this could explain why match winners are winning slightly more of their first-serve points. Another plausible explanation is a decline in the quality of players’ returns over this period, rather than an improvement in serving through a data-driven approach. This would mean that match winners are more likely to win their service points not because their serving has improved, but because their opponents are returning poorly.
How does the serve vary on each surface?
Apart from varying throughout the years, the service has also played a different role on each surface. Changing up service strategies according to the surface played on is commonly seen in modern day Tennis. For instance, Andy Murray talks about how he maximizes the use of the slice serve when playing on grass as the ball takes off, and goes out of the receiver’s reach (BBC). We will now take a closer look at key service statistics broken down by the surface played on.
From the table above, we can see that serving aces seems to be more important to a player’s success on grass courts than on clay or hard courts. We often see at Wimbledon, the only Grand Slam played on grass, that points tend to be shorter. Using this insight, a player on grass could adopt a riskier approach to their serve, since going ‘hard’ on the serve can lead to more aces, which tend to be associated with winning the match. Another interesting finding is that even though the fewest aces are served on clay courts, there still seems to be server dominance. Matches played on clay show the highest average difference between eventual winners and losers on both first and second serves. This means that on clay a player’s serve seems to matter more than it does on grass or hard courts. It may also suggest that on grass and hard courts it is slightly easier for players to break each other’s serve. Thus, on grass a player should serve aggressively, while on clay they should focus more on getting the serve into the court and minimizing errors such as double faults.
How important is a player’s serve to his overall career success?
We’ve seen how a good serve can be a destructive weapon in the arsenal of any player, but how important is it actually to a player’s overall career success? In order to get a clearer understanding of the answer to this question, we will now consider the service statistics for over 300 players who have played at least 10 grand slam matches between 2000 and 2015. We will consider 10 different features of each player’s serve throughout their career. In order to find natural groupings among different players’ career service statistics, we applied K-Means clustering to our 10 engineered features.
The first 5 features focus on the differential between matches a player wins and matches they lose. For instance, the ace differential for a specific player is the average number of aces they serve in matches they have won minus the average number served in matches they have lost. By analysing these metrics, we can consider how much impact an improvement in service performance has on a player’s likelihood of winning a match. The second set of 5 features focuses on the career total for each metric. This allows us to distinguish between good and bad servers, as well as the consistency of their serves throughout their career. We have also transformed the double fault variable into -1 times the number of double faults, because when interpreting the model we wanted a greater value of every statistic to indicate better service performance. Since fewer double faults indicate better serving, we took the negative of the original value.
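As a rough illustration of the two kinds of feature described above, the sketch below computes a win/loss differential and a career total for one player, with double faults negated so that larger always means better. The field names, the helper functions `differential` and `career_total`, and the match data are hypothetical, and only two of the metrics are shown.

```python
from statistics import mean

# Hypothetical per-match records for a single player (made-up numbers).
player_matches = [
    {"won": True, "aces": 12, "double_faults": 2},
    {"won": True, "aces": 10, "double_faults": 4},
    {"won": False, "aces": 6, "double_faults": 5},
    {"won": False, "aces": 8, "double_faults": 3},
]


def differential(rows, stat, sign=1):
    """Average of `stat` in wins minus its average in losses.

    Pass sign=-1 for statistics where lower is better (double faults),
    so that a larger feature value always means better serving.
    """
    in_wins = [sign * row[stat] for row in rows if row["won"]]
    in_losses = [sign * row[stat] for row in rows if not row["won"]]
    return mean(in_wins) - mean(in_losses)


def career_total(rows, stat, sign=1):
    """Career total of `stat`, negated when sign=-1."""
    return sign * sum(row[stat] for row in rows)


features = {
    "ace_differential": differential(player_matches, "aces"),
    "double_fault_differential": differential(player_matches, "double_faults", sign=-1),
    "total_aces": career_total(player_matches, "aces"),
    "total_double_faults": career_total(player_matches, "double_faults", sign=-1),
}
print(features)
```

A player needs at least one win and one loss for the differential to be defined; how the original analysis handled players without both is not stated here.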
After performing feature selection and engineering, we moved onto scaling our data using Minimum-Maximum normalization. This is because we did not want one specific feature to have an overruling impact on the clustering of our data. By scaling all of our features from 0 to 1, we ensured that each variable is accounted for equally. With our model pre-processing now out of the way, let’s see what groups were formed after applying the K-Means algorithm.
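The scaling and clustering steps can be sketched as follows. This is a minimal pure-Python version under stated assumptions, not the code used for the analysis: the article does not say which library or initialisation was used, so the deterministic farthest-first initialisation here is a stand-in chosen to make the example reproducible. The players and their two features are made up, and `k=3` simply matches the three groups discussed in the article.

```python
def min_max_scale(rows):
    """Rescale every column to the range [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Assumed initialisation: start at the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def k_means(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical players: [aces per match, % of first serves in] (made-up numbers).
players = [
    [12.0, 58.0], [11.0, 57.0],
    [7.0, 62.0], [7.5, 63.0],
    [3.0, 68.0], [2.5, 69.0],
]
scaled = min_max_scale(players)
labels, centroids = k_means(scaled, k=3)
print(labels)
```

Without the scaling step, the first-serve percentage (tens of points) would dominate the distance calculation over aces per match, which is the imbalance the normalization is meant to prevent.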
Summary and interpretation of each cluster:
Group 0 - The Average Aggressive Server
We can see that group 0 tends to contain players who are more aggressive with their serves. This is because they have the lowest average total percentage of first serves that go into the court. Furthermore, they have the lowest statistic for total double faults, and as mentioned previously, this means that they are likely to serve the most double faults, since we have taken the negative of this value. On the other hand, they serve more aces on average than group 2, meaning that they may have a more aggressive serving strategy. In addition to this, these servers seem to have relatively mediocre percentages of service points won, meaning that they are quite average servers as a whole. This is because they are ranked worse than group 1 but better than group 2 in the average total percentage of 1st serves won.
Group 1 - The Best Servers
This group clearly seems to be made up of the most successful servers within the game of tennis. We see this in their outstanding average total percentage of points won on both the first and second serve. Another interesting thing to note is that this group tends to strike a balance between aggressive and defensive serving, which can be seen in their average total percentage of first serves in and average double faults.
Group 2 - The Weakest Safe Servers
Finally, we can see that group 2 comprises the weakest servers over the last 15 years. Players in this group hold the worst records in terms of average total points won on the first and second serves. Apart from being weak servers, this group seems to consist of relatively safer servers, who are more concerned with getting their serves into the court. We can clearly see this from the lack of aces, the high percentage of first serves in, and the lack of double faults served (high double fault statistic).
Now that we have interpreted what each group represents based upon the features we clustered on, let’s consider the breakdown of other features that were not used while grouping our data. We will look at the average percentage of matches seeded (%), win percentage (%), and total grand slam games. The average percentage of matches seeded (%) refers to the share of Grand Slam matches a player played as a seeded player during their career. As mentioned earlier, this is one of our metrics for success, as it provides insight into whether or not a player is able to maintain a high level consistently throughout his career.
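For concreteness, the success metrics above could be computed per player along these lines. The record layout and the function name `success_metrics` are hypothetical, the numbers are made up, and treating "total grand slam games" as a count of matches is an assumption made for this sketch.

```python
# Hypothetical career record for one player (made-up data).
career = [
    {"seeded": True, "won": True},
    {"seeded": True, "won": False},
    {"seeded": True, "won": True},
    {"seeded": False, "won": False},
]


def success_metrics(matches):
    """Percentage of matches played as a seed, win percentage, and match count."""
    n = len(matches)
    return {
        "pct_matches_seeded": 100.0 * sum(m["seeded"] for m in matches) / n,
        "win_pct": 100.0 * sum(m["won"] for m in matches) / n,
        "total_matches": n,  # assumption: "games" is read as matches played
    }


metrics = success_metrics(career)
print(metrics)
```

Averaging these per-player values within each cluster would then give a breakdown of the kind discussed next.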
From the table above we can see that for all three success metrics, the cluster consisting of the best servers is comfortably ahead of the other two clusters. Players in the best server group not only tend to boast higher win percentages and higher percentages of matches seeded throughout their careers, but they also tend to play more total grand slam tournaments, which is a testament to their longer-term success within the sport of tennis. We can also see that the safe and aggressive servers have equally poor records when it comes to win percentage and average percentage of matches seeded. This suggests that finding a balanced approach between attacking and defensive serving is the key to success within Grand Slam tennis. We often see that the Big Three (Federer, Nadal, Djokovic), who are all part of Cluster 1 (The Best Servers), know when it’s the right time to serve hard, and when it's time to ensure their serve lands in the court. We will now take a look at these three modern greats, and investigate how their serves differ from those of all other players.
The Big Three, Seeded and Non-Seeded Players
Roger Federer, Rafael Nadal, and Novak Djokovic, who make up the Big Three, have dominated the modern era of tennis. These tennis greats have won 58 out of the last 70 grand slams since 2003. The trio’s dominance, spanning more than 15 years, is unmatched within the sport. We will now compare their service statistics with those of other seeded and unseeded players to see if a player’s service strategy should vary depending on their overall quality. More specifically, we will look at the differential statistics to compare how these groups perform when winning compared to when losing. The statistics for the seeded and unseeded groups are based on match-level data, meaning that the group a player’s serve statistics belong to in a given match depends on whether he was seeded for that match.
One interesting finding when comparing these three groups was the difference in average additional aces served when winning a match. From the graph, we can clearly see that unseeded players tend to serve significantly more aces on average when they win a grand slam match especially in comparison to those in the Big Three. This may be because those in the Big Three do not tend to rely on aces when winning a particular match since the quality of their overall groundstrokes is head and shoulders above the rest. On the other hand, unseeded players tend to be of a lower quality, and therefore might struggle to sustain longer points. Thus, by relying on aces, this group of players is able to play shorter points, and potentially increase their chances of winning a match. Using this insight, unseeded players may decide to go harder on their serves in order to increase their likelihood of serving aces, and in turn their chances of winning a given match. We will now move on to considering how the quality of the serves for the Big Three has changed throughout their careers.
From the graph above we can see that the Big Three have improved both their first and second serves throughout their careers. This suggests that these three great players actually prioritised improving their serve in order to achieve more success within their career. This is particularly evident from the sharp improvement in service performance in 2008, the first year where the Grand Slams were shared among all members of the Big Three. Not only does this show the importance of the serve within the game of tennis, but it also provides insight that the domination of points won on the serve is what distinguishes Federer, Nadal, and Djokovic from their counterparts.
Key Takeaways
Overall, we have seen the pivotal role that the serve plays in the modern game of tennis. The rise in serve analytics seems to have slightly improved the overall service performance of players in recent years. We have seen that serve analytics can inform various tactical service decisions for a given match depending on the surface played on and the quality of the player (seeded or unseeded). These tactical decisions can give players a slight edge over their opponents in Grand Slam matches. We’ve also seen that a player’s serve is a great indicator of his success, both in terms of winning a large number of matches and in remaining consistent over a long period of time. With tennis being a game of tight margins, data-driven service decisions can be the difference between holding onto a set and obtaining Grand Slam glory, or getting broken back and subsequently knocked out of the tournament.
Github Repository: github.com/shailm99/GrandSlamServing