Since May last year I’ve been working on what’s become an International Ranking System for the highest level of TF2 6v6 teams. I’ll say straight away that this is not meant to be the be-all and end-all of TF2 ranking systems, and some of its judgements show this. It’s entirely logs-based, and you don’t need me to tell you that there’s an awful lot more to being good at TF2 than just producing big numbers on logs.tf. However, I do still feel that the data maintains a good level of accuracy that at least makes for interesting reading, and it did correctly predict the finishing order of both i58 and ESA Rewind.
How It Works
I’ve been keeping a private record of top-level TF2 matches for almost a year. Each entry logs which players were playing each role (including pocket and flank scout), what their overall DPM was (except for medics), and what their overall KA/D was. For medics, I log Healing per Minute rather than DPM. Why HPM and not ubers? Because of the Vaccinator. I also log other stuff, but it’s these stats that matter for the Rankings. Because of the international scope of this system, any data I track must, of course, be available from both logs.tf and the ESEA logs.
If a player, the Pocket for example, has a higher DPM and higher KA/D than his counterpart on the other team, he gets ‘gilded’. Whenever a player makes an appearance in one of these record entries, their presence and whether or not they were gilded is recorded. Each player in the rankings therefore has two numbers next to their name: their number of appearances, and the number of times they’ve been gilded. To keep the Rankings current, these two numbers are based on only the last 300 matches in the Records. This covers a 7-ish month period, enough to span roughly two seasons for each region. At this particular moment, this period spans back to partway through the opening day of i58. To give an example, in the past 300 matches (at the time of writing) Saam has 25 entries and was gilded 4 times, written as 25/4 (appearances first, then gildings).
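The gilding rule and the 300-match window can be sketched in a few lines of Python. This is only an illustration, not the actual record-keeping behind the Rankings: the `Line` class, its field names, and the `tally` function are hypothetical, and it assumes each match is stored as a list of counterpart pairs (one pair per role), oldest match first.

```python
from collections import Counter
from dataclasses import dataclass

WINDOW = 300  # only the most recent 300 record entries count


@dataclass
class Line:
    """One player's stat line in a record entry (hypothetical field names)."""
    player: str
    role: str      # e.g. "pocket", "flank scout", "medic"
    output: float  # DPM, or HPM for medics
    kad: float     # KA/D


def is_gilded(mine: Line, theirs: Line) -> bool:
    """Gilded means beating the opposing counterpart on BOTH stats."""
    return mine.output > theirs.output and mine.kad > theirs.kad


def tally(matches):
    """Count appearances and gildings over the last WINDOW matches.

    matches: list of matches, oldest first; each match is a list of
    (line_a, line_b) pairs of counterparts on the same role.
    """
    appearances, gildings = Counter(), Counter()
    for match in matches[-WINDOW:]:
        for a, b in match:
            for mine, theirs in ((a, b), (b, a)):
                appearances[mine.player] += 1
                if is_gilded(mine, theirs):
                    gildings[mine.player] += 1
    return appearances, gildings
```

Note that a player who wins on only one of the two stats is not gilded, so it is quite possible for neither counterpart on a role to be gilded in a given match.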
These two figures per player in the Rankings then go through a bit of processing to produce two more figures for each player. These are what I call the ‘hit-rate’ and the ‘mileage’. The hit-rate is very simple – the percentage of appearances in which the player was gilded. For Saam, this is 16%. The mileage is the number of entries and number of gildings added together, which for Saam is 29.
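For anyone who prefers to see the arithmetic written out, here is a minimal Python sketch of the two derived figures. The function names are made up for illustration, and returning 0 for a player with no appearances is an assumption to avoid dividing by zero.

```python
def hit_rate(appearances: int, gildings: int) -> float:
    """Percentage of appearances in which the player was gilded."""
    if appearances == 0:
        return 0.0  # assumption: no appearances means a hit-rate of zero
    return 100.0 * gildings / appearances


def mileage(appearances: int, gildings: int) -> int:
    """Appearances and gildings added together."""
    return appearances + gildings


# Saam's figures from the text: 25 appearances, 4 gildings
print(hit_rate(25, 4))  # 16.0
print(mileage(25, 4))   # 29
```

Since a player can be gilded at most once per appearance, mileage can never exceed twice the number of appearances.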
Each player’s hit-rate and mileage are now compared to everyone else’s hit-rate and mileage to determine their score, the metric by which they are all ranked. The score is two numbers added together: it basically counts how many players in the Rankings have a worse hit-rate than you, and adds that to the number of people who have a worse mileage than you. If you were on a list of 100 people and you had the 10th-best hit-rate and the 50th-best mileage, your score would be 140.
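The scoring step can be sketched like this. It is a rough illustration rather than the real spreadsheet logic: the `scores` function and its input layout are hypothetical, and it assumes that "worse" means strictly lower, so players tied on a figure don't count as worse than each other.

```python
def scores(players):
    """Score every player against the rest of the list.

    players: dict mapping name -> (hit_rate, mileage).
    Returns a dict mapping name -> score, where score is the number of
    players with a strictly lower hit-rate plus the number of players
    with a strictly lower mileage.
    """
    result = {}
    for name, (hr, mil) in players.items():
        worse_hr = sum(1 for other_hr, _ in players.values() if other_hr < hr)
        worse_mil = sum(1 for _, other_mil in players.values() if other_mil < mil)
        result[name] = worse_hr + worse_mil
    return result
```

On a list of 100 players with no ties, the 10th-best hit-rate has 90 players below it and the 50th-best mileage has 50 below it, which gives the 140 from the example above.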
This applies to every player in the Rankings, except for one small divergence regarding people who are 1/0 (one appearance, never gilded). Because these Rankings only cover the last 300 matches on record, there are a bunch of players at the bottom who are now 0/0, and rightly have a score of 0. However, these people aren’t included when determining people’s scores. This means that people who are 1/0 would normally also get a score of 0 because there’s ‘nobody’ worse than them. 1/0 is obviously better than 0/0, so people who are 1/0 have a special score that’s halfway between 0 and whatever 2/0’s score is.
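Here is one way the 1/0 exception could be expressed in Python. Treat it as a sketch under assumptions: the function name and data layout are invented, records are stored as (appearances, gildings) pairs, and if nobody in the list happens to be 2/0 the scores are simply left alone, since the text doesn’t say what happens in that case.

```python
def apply_one_zero_rule(records, scores):
    """Give 1/0 players a score halfway between 0 and the 2/0 score.

    records: dict mapping name -> (appearances, gildings).
    scores:  dict mapping name -> score, computed only over players
             with at least one appearance (0/0 players excluded).
    """
    two_zero = [scores[n] for n, rec in records.items()
                if rec == (2, 0) and n in scores]
    if not two_zero:
        return dict(scores)  # assumption: nothing to interpolate against
    # Every 2/0 player has the same hit-rate and mileage, hence the same score.
    special = two_zero[0] / 2
    return {n: (special if records.get(n) == (1, 0) else s)
            for n, s in scores.items()}
```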
You can probably imagine how these player rankings get turned into team rankings – the six team members have their scores averaged out to determine the team score. Some of the Asian teams and one or two of the Aussie teams seem to turn up to every match with a different roster, and in these cases I’ve generally determined their six players to be the ones most often seen on each role. Only top-level (i.e. Prem/Invite/equivalent) teams are included in the team rankings simply because I can’t be bothered to keep track of the rosters of lower-level teams. As of today, this system ranks a total of 425 players and 27 teams. These numbers will go up and down as new players are added and old ones become unranked having not taken part in any of the past 300 matches. It’s likely that, due to aliasing, I’ve listed a few individuals under more than one name without realising they’re the same person.
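The team score is just a plain average, which might look like the sketch below. The `team_score` function and the roster format are hypothetical, and rejecting rosters that aren’t exactly six players is an assumption made for the sake of the example.

```python
from statistics import mean


def team_score(roster, player_scores):
    """Average the scores of a team's six players.

    roster: list of six player names (one per role).
    player_scores: dict mapping name -> individual score.
    """
    if len(roster) != 6:
        raise ValueError("a 6v6 roster needs exactly six players")
    return mean(player_scores[name] for name in roster)
```

For teams with rotating line-ups, the roster passed in would be the six players most often seen on each role, as described above.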
Interplay Between Regions
This system ranks players from all four regions together in one single list. You might be wondering how I factor in the skill differential between the four different regions. Early on I tried a variety of techniques to do this, but in the end I realised that this system actually polices itself in this regard to a good extent. Most would say that of the four, AsiaFortress is the scene with the lowest skill standard. It also just so happens to be the region that features the fewest matches. The Asian scene doesn’t appear to really feature LANs and secondary tournaments, and its top division in this and the last season featured only 7 and 4 teams respectively rather than 8. This basically puts a natural cap on Asian player mileage, restraining them in the Rankings even if their hit-rate is north of 80%.
OzFortress, meanwhile, has a healthier prem division of 8 teams but unlike Europe there aren’t many secondary tournaments and LANs, again restricting the heights Aussies can reach by limiting their potential mileage.
I think many would say that North America and Europe are at least somewhat equal in quality, and it just so happens that this, too, is reflected in the level of activity in each region. ETF2L has pre-season Premiership playoffs and secondary tournaments, and European LANs come around reasonably often. North America is generally a bit less rich in secondary activity, but this is offset by all the ESEA-I teams playing each other more than once in the group stage, unlike in ETF2L. This leaves these two regions with a higher mileage ceiling than the other two.
With all that in mind, I’ve not imposed any artificial restraints on the heights that AsiaFortress or OzFortress players can reach. All the numbers are pure and unmodified.
What Matches Count Toward the Rankings?
I started out making records for every top-level ETF2L, ESEA, OzFortress, and AsiaFortress official, plus basically anything else that got a TFTV stream. Nowadays, I use a much more mature system of guidelines to determine whether or not a match should have a record entry. These are the matches that get recorded:
- Any official ETF2L Prem, ESEA Invite, OzFortress Prem, or AsiaFortress Div-1 match, be they in the group stage or playoffs.
- The grand final of the second division down in these same leagues (e.g. ETF2L High, ESEA-IM).
- Any ETF2L Pre-Season Premiership Playoffs match that results in a team being promoted to Prem.
- Every match, group stage and playoffs, of the invite tournament of a major international LAN like Insomnia or ESA Rewind, featuring only top-level teams.
- The playoff matches, but not the group stage, of smaller-scale LANs featuring more than two top-level teams (such as Dreamhack Winter Battle for the North).
- The grand final of LANs featuring perhaps just one or two top-level teams, possibly including the upcoming Gamers Assembly LAN.
- The playoff matches, but not the group stage, of secondary tournaments that include many top-level players (such as the ETF2L 6v6 Nations Cup or TFTV New Map Cup).
- The grand final of any other secondary tournaments that only feature a handful of top-level teams/players (such as some of the FACEIT tourneys that happened a little while ago).
- Any showmatch featuring top-level teams.
Matches that meet these criteria may still be excluded from the records if they’re plainly unrepresentative (e.g. a 5v6, or half the players being pyros and spies the whole time).
I’ll reiterate that this is not a perfect system. For example, Stark, widely regarded as one of the greatest players ever to grace TF2, peaked at a mere 14th-best in the world by these metrics.
With all that said, the current standings can be viewed in full via Dropbox here.
I plan on updating this blog often, but not on a fixed schedule. If something comes up that I want to talk about, then I’ll talk about it. I hope to not go more than a week with no posts, and there may be times when updates come out quite rapidly. At the very least I should provide a weekly update with commentary on any significant changes in the rankings, at least while the scene is active. Other than that, I’m also likely to post:
- Special reports about how specific matches have influenced the rankings.
- Analysis of the top-ranked players within their region/class.
- General discussion on the accuracy of certain players’ or teams’ placements within the rankings.
- Analysis of how and why individual players or teams are ranked the way they are, and how their position has changed over time.
- Speculation about how newly-formed teams will perform based on these rankings.
- Toying about by creating completely fictitious fantasy teams.
- Comparisons between similarly ranked teams.
- Comparisons between actual match results and those predicted by the system.
- Explanations for the inevitable outliers in the system.
At the time of writing, the last match to go on record was the ETF2L Playoffs match between SE7EN and Arctic Foxes.
Thanks for reading.