Osu! is a rhythm game that I've been playing for the past few months. The goal of the game is just to simply click circles on your screen with the right timing. The game also contains a rich competitive performance system where each completed play rewards players with performance points (pp for short) and players are given global rankings based on their average pp. One cool feature that the user base has been asking for is to announce to discord channels whenever the user has made a new top play. See below image for an example. Osu provides a set of public endpoints that can be queried for information regarding a particular beatmap, player and scores.
So I wanted to create something similar to the image above and decided on a serverless and event-driven architecture for it's simplicity and scalability and to also experiment to see what it can offer since it would be the first one I've ever built.
One of the main issue is that the Osu! api is rate limited. In general, you should not exceed 1 request per second. That means with a large user base, the server will need to put players in a queue for tracking. Every player is different, some play occasionally and casually while some are competitive and always seem to be setting new top plays. Therefore, it is obvious that some players should have a higher priority in the queue and should be checked more often than others.
Bayesian Model for Player Priority
A player's priority in the queue should scale as a function of their play count over time. However, having a deterministic function will prevent a lot of users ever being checked on given a large enough user base. Therefore, we need to introduce a bit of randomness so that every users will be eventually checked at some point. I assumed that the number of plays a player sets within a time period of a day follows a poisson distribution. It then follows from some probability theory that the days between plays is distributed exponentially with a rate parameter. To model for differences in players and also introduce bayesian priors for new players, I assumed the rate parameter to be gamma distributed with alpha and beta parameter. The posterior predictive is a Lomax distribution and updating each user's hyperparameter is actually quite easy and can be done additively. Alpha is simply equal to the prior plus the number of plays they have set since being added. Beta is equal to the prior plus the time in days between each play summed up for all the plays we have observed. For example, if my priors are 1 and I've set 3 plays over the span of 2/1/3 days between each. Then the alpha = 1 + 3 and beta = 1 + 2 + 1 + 3. Whenever a user is added to the queue, a random sample from the posterior is taken as their new priority #. Users that are more inactive will sample at a lower priority than more active players but will still be added to the queue at a lower probability. Furthermore, when a user makes a new play, then their time between this and the previous play is calculated and their alpha/betas are updated accordingly.
The process starts from the left with a cloudwatch scheduled event. Every minute, the scheduler will invoke the queueing handler in aws lambda that will query for a list of users by highest priority and then send them to a queue managed by aws's message service called SQS. One message is sent to a kezuru-handler which handles all checking of user top plays. Another is sent to priority handler which is responsible for doing all the rerolling for a user's queue priority. When the kezuru handler detects a new top play, it gets sent to an announcement handler which publishes to the user's specified channels.