Smurfing Detection in Online Chess
Creating a model that identifies two or more usernames as the same player based on their decision-making
Timeline
January - May 2022
Deliverables
Final Presentation
Context
Pennsylvania State University Data Science Capstone Project
Team
Sarvjot Baxi
Owen Finkbeiner
Overview
Smurfing occurs when a high-level player uses multiple accounts on the same gaming website in order to play against lower-level competitors. It makes games like online chess unfair to many players. Our goal was to create a model that identifies two or more usernames as the same player based on their decision-making over the course of a number of moves. To carry out the project, we obtained online chess games from the Free Internet Chess Server (FICS) Games Database. Our dataset contains 5,000 games from 2021 and 2022, played by over 1,000 distinct players. Our approach has four main parts:
Encoding player vectors representing each of their moves
Simulating smurfing within our dataset
Utilizing principal component analysis to reduce the dimensions of the player vectors
Using hierarchical agglomerative clustering to group accounts that the model predicts belong to the same player
After introducing 7 simulated smurfs into our dataset, we used a total of 69 players as input to our agglomerative clustering algorithm. It produced 28 clusters overall, which showed us that our model has room for improvement.
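The four parts above can be sketched in Python roughly as follows. This is a minimal illustration rather than the project's actual code: the per-game feature vectors are synthetic, and the helper names (`simulate_smurfs`, `cluster_accounts`, `demo`), the use of a per-account mean as the player vector, the average linkage, and the distance threshold are all assumptions chosen for the example. Only `PCA` and `AgglomerativeClustering` from scikit-learn are real library calls.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA


def simulate_smurfs(games_by_player, smurf_names, rng):
    """Split each chosen player's games into two pseudo-accounts (hypothetical helper)."""
    accounts = {}
    for name, games in games_by_player.items():
        if name in smurf_names:
            order = rng.permutation(len(games))
            half = len(games) // 2
            accounts[name + "_main"] = games[order[:half]]
            accounts[name + "_smurf"] = games[order[half:]]
        else:
            accounts[name] = games
    return accounts


def cluster_accounts(games_by_account, n_components=2, distance_threshold=1.0):
    """Reduce account vectors with PCA, then group them hierarchically."""
    names = sorted(games_by_account)
    # Assumption: one vector per account, taken as the mean of its game vectors.
    vectors = np.vstack([games_by_account[n].mean(axis=0) for n in names])
    reduced = PCA(n_components=n_components).fit_transform(vectors)
    # Assumption: average linkage with a fixed cut-off instead of a fixed cluster count.
    model = AgglomerativeClustering(
        n_clusters=None,
        distance_threshold=distance_threshold,
        linkage="average",
    )
    labels = model.fit_predict(reduced)
    return {name: int(label) for name, label in zip(names, labels)}


def demo(seed=0):
    """Run the sketch on synthetic data with one simulated smurf."""
    rng = np.random.default_rng(seed)
    n_players, n_games, n_features = 6, 40, 8
    games = {}
    for i in range(n_players):
        # Synthetic "style" centre per player plus small per-game noise.
        center = rng.normal(scale=5.0, size=n_features)
        noise = rng.normal(scale=0.3, size=(n_games, n_features))
        games[f"player{i}"] = center + noise
    accounts = simulate_smurfs(games, {"player0"}, rng)
    return cluster_accounts(accounts)


assignments = demo()
for account, cluster in sorted(assignments.items()):
    print(account, cluster)
```

In this toy setup the two halves of `player0` share a cluster label because their mean vectors are nearly identical. On real move data the encoding and the cut-off distance would need tuning, which is where a result such as 28 clusters for 69 players would be examined.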
My Role:
Data Scientist
As a team, we worked together to choose a project topic, design a plan of execution, and tackle any challenges along the way. Some of my distinct contributions include:
Conducted secondary research on relevant literature in the field of smurfing detection
Created dendrogram visualizations in Python to show the results of hierarchical clustering
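As an illustration of the dendrogram work, here is a minimal sketch using SciPy's `linkage` and `dendrogram` functions. The function name `build_dendrogram`, the sample vectors, the account names, and the average linkage method are assumptions for the example, not the project's actual data or settings.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage


def build_dendrogram(vectors, names, method="average", ax=None):
    """Compute the merge tree and its dendrogram layout (hypothetical helper).

    When `ax` is None only the layout is computed, so no plotting
    library is needed; pass a matplotlib Axes to draw the tree.
    """
    merges = linkage(vectors, method=method)
    tree = dendrogram(merges, labels=list(names), ax=ax, no_plot=ax is None)
    return merges, tree


# Made-up reduced player vectors: two tight pairs of accounts.
vectors = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
names = ["alice", "alice_alt", "bob", "bob_alt"]

merges, tree = build_dendrogram(vectors, names)
print(tree["ivl"])  # leaf labels in left-to-right dendrogram order
```

Each row of `merges` records one merge and the distance at which it happened, so accounts joined low in the tree are the likeliest candidates for being the same player.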