Application of AI to Tennis Match Footage Transcription
One of the best ways for tennis players to improve their game is to record and watch their own match footage, find patterns in the points they win and lose, and practice based on those observations. However, watching match footage and documenting each point shot by shot is very time-consuming. This paper investigates an AI approach to transcribing tennis match footage, combining a deep convolutional neural network (YOLOv4), a pose estimation model (MoveNet), and a long short-term memory (LSTM) deep neural network. Reading a transcript of each point is far more efficient for a player than watching the entire match footage when trying to understand how they win and lose points and to analyze patterns in their game. The LSTM model in this project achieved accuracies of 73.33% and 79.31% when classifying shot type (forehand, forehand volley, forehand slice, backhand, backhand volley, backhand slice, overhead/smash, and serve) for players on the close side and opposite side of the net, respectively, and 55.17% and 60.00% when classifying the direction of a shot (cross-court, down the line, down the middle, inside in, inside out, out wide, down the T, and body) for players on the close side and opposite side of the net, respectively.
Mentored by Eric Bradford
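As a rough illustration of the shot-type classification stage described in the abstract, the sketch below builds a small Keras LSTM that maps a fixed-length window of MoveNet pose keypoints to one of the eight shot classes. The window length, layer sizes, and training setup here are assumptions chosen for demonstration, not the paper's exact configuration.

```python
# Minimal sketch (not the paper's exact implementation): an LSTM that
# classifies a window of MoveNet pose keypoints into one of the eight
# shot types listed in the abstract. NUM_FRAMES and the layer sizes are
# assumptions for illustration.
import numpy as np
import tensorflow as tf

NUM_FRAMES = 30              # assumed number of frames per shot window
NUM_KEYPOINTS = 17           # MoveNet returns 17 body keypoints
FEATURES = NUM_KEYPOINTS * 2 # (x, y) per keypoint, flattened per frame
SHOT_CLASSES = ["forehand", "forehand volley", "forehand slice",
                "backhand", "backhand volley", "backhand slice",
                "overhead/smash", "serve"]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, FEATURES)),
    tf.keras.layers.LSTM(64),  # summarize the pose sequence over time
    tf.keras.layers.Dense(len(SHOT_CLASSES), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy batch standing in for real keypoint windows extracted per shot.
x = np.random.rand(4, NUM_FRAMES, FEATURES).astype("float32")
probs = model(x)   # shape (4, 8): class probabilities per shot window
print(probs.shape)
```

A second model of the same shape, with an eight-way output over the direction labels (cross-court, down the line, and so on), would cover the direction-classification task under the same assumptions.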