Elo rating
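As I mentioned in an earlier blog post, I've been learning about ranking systems, and the first one I came across was the Elo rating system, which is best known for ranking chess players.
The Elo rating system uses the following formula to work out a player's/team's ranking after they've taken part in a match:
R' = R + K * (S - E)
- R' is the new rating
- R is the old rating
- K is the maximum amount by which a rating can increase or decrease (Elo uses 16 or 32)
- S is the score for a game
- E is the expected score for a game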
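For reference, E is derived from the gap between the two ratings. The standard Elo expected-score formula, which is also what the `expected` function in this post computes, can be written as follows (R_opp, the opponent's rating, is notation introduced here for illustration):

```latex
E = \frac{1}{1 + 10^{(R_{\mathrm{opp}} - R)/400}}
```

With S equal to 1 for a win and 0 for a loss, a win against a much stronger opponent (small E) moves the rating by almost the full K.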
I converted that formula into the following Clojure functions:
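;; `math` is assumed to be an alias for clojure.math.numeric-tower,
;; e.g. (:require [clojure.math.numeric-tower :as math]) in the ns form.

;; Probability-like expected score for `my-ranking` against `opponent-ranking`.
;; Defined first so that the two functions below can refer to it.
(defn expected [my-ranking opponent-ranking]
  (/ 1.0
     (+ 1 (math/expt 10 (/ (- opponent-ranking my-ranking) 400)))))

;; A win counts as S = 1.
(defn ranking-after-win
  [{ranking :ranking opponent-ranking :opponent-ranking importance :importance}]
  (+ ranking (* importance (- 1 (expected ranking opponent-ranking)))))

;; A loss counts as S = 0.
(defn ranking-after-loss
  [{ranking :ranking opponent-ranking :opponent-ranking importance :importance}]
  (+ ranking (* importance (- 0 (expected ranking opponent-ranking)))))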
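The functions above cover only wins and losses. In standard Elo a draw is scored as S = 0.5, so a minimal sketch of a draw update might look like the following. `ranking-after-draw` is a hypothetical name that isn't part of the original code, and `expected` is redefined here with `Math/pow` only so that the snippet runs on its own.

```clojure
;; Self-contained copy of the expected-score function, using Java's Math/pow
;; so that no extra library is required.
(defn expected [my-ranking opponent-ranking]
  (/ 1.0
     (+ 1 (Math/pow 10 (/ (- opponent-ranking my-ranking) 400.0)))))

;; Hypothetical helper (not in the original post): a draw counts as S = 0.5.
(defn ranking-after-draw
  [{:keys [ranking opponent-ranking importance]}]
  (+ ranking (* importance (- 0.5 (expected ranking opponent-ranking)))))

;; The underdog gains points from a draw and the favourite loses the same amount.
(ranking-after-draw {:ranking 1200 :opponent-ranking 1500 :importance 32}) ; roughly 1211.17
(ranking-after-draw {:ranking 1500 :opponent-ranking 1200 :importance 32}) ; roughly 1488.83
```

Whether a draw should move ratings at all is a design choice; the code later in this post leaves both teams unchanged after a draw.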
We can work out the new ranking for a 1200-rated side that beats a 1500-rated side like this:
> (ranking-after-win {:ranking 1200 :opponent-ranking 1500 :importance 32})
1227.1686541692377
The way it works is that we first work out the likelihood that we should win the match by calling expected:
> (expected 1200 1500)
0.15097955721132328
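This tells us that we have a 15% chance of winning the match, so if we do win, our ranking should increase substantially because we weren't expected to win. In this case a win gives us a points increase of '32 * (1 - 0.15)', which is ~27 points.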
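To see how lopsided the reward is, the same arithmetic can be run from the favourite's point of view. This is a small sketch rather than code from the original post: `points-for-win` is a hypothetical helper, and `expected` is repeated with `Math/pow` so the snippet is self-contained.

```clojure
(defn expected [my-ranking opponent-ranking]
  (/ 1.0
     (+ 1 (Math/pow 10 (/ (- opponent-ranking my-ranking) 400.0)))))

;; Hypothetical helper: points gained by winning, K * (1 - E).
(defn points-for-win [k my-ranking opponent-ranking]
  (* k (- 1 (expected my-ranking opponent-ranking))))

(points-for-win 32 1200 1500) ; underdog wins: about 27.17 points
(points-for-win 32 1500 1200) ; favourite wins: about 4.83 points
```

The two expected scores sum to 1, so whatever the winner gains the loser gives up.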
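I kept things simple by always setting the importance/maximum value of increase or decrease to 32. The World Football Rankings take a different approach: they vary it based on the importance of a match and the margin of victory.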
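One way to vary the importance would be to compute it per match and pass the result in as `:importance`. The sketch below is purely illustrative: the `:stage` key, the base values and the goal-margin multipliers are all made-up assumptions, not the weights used by the World Football rankings.

```clojure
;; Hypothetical base K values per competition stage (illustrative numbers only).
(def base-importance
  {:qualifier 16
   :group     24
   :knockout  32})

;; Hypothetical multiplier that grows with the margin of victory.
(defn margin-multiplier [home-score away-score]
  (let [margin (Math/abs (long (- home-score away-score)))]
    (cond
      (<= margin 1) 1.0
      (= margin 2)  1.5
      :else         1.75)))

;; Assumes each match map also carries a :stage key, which the data in this
;; post does not have. Unknown stages fall back to 32.
(defn match-importance [{:keys [stage home_score away_score]}]
  (* (get base-importance stage 32)
     (margin-multiplier home_score away_score)))

(match-importance {:stage :group :home_score 5 :away_score 0}) ; => 42.0
```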
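I decided to try out the algorithm on the 2002/2003 Champions League season. I was able to grab the data from The Rec Sport Soccer Statistics Foundation, and I've previously written about how I scraped it using Enlive.
With a lot of help from Paul Bostrom I ended up with the following code to reduce over the matches, updating the team rankings after each match: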
;; Update both teams' points for a single match. The opponent's points are
;; read from the original `ts`, so both updates use the pre-match rankings.
;; Defined before top-teams, which refers to it.
(defn process-match [ts match]
  (let [{:keys [home away home_score away_score]} match]
    (cond
      ;; home win
      (> home_score away_score)
      (-> ts
          (update-in [home :points]
                     #(ranking-after-win {:ranking % :opponent-ranking (:points (get ts away)) :importance 32}))
          (update-in [away :points]
                     #(ranking-after-loss {:ranking % :opponent-ranking (:points (get ts home)) :importance 32})))
      ;; away win
      (> away_score home_score)
      (-> ts
          (update-in [home :points]
                     #(ranking-after-loss {:ranking % :opponent-ranking (:points (get ts away)) :importance 32}))
          (update-in [away :points]
                     #(ranking-after-win {:ranking % :opponent-ranking (:points (get ts home)) :importance 32})))
      ;; draw: leave both rankings unchanged
      (= home_score away_score) ts)))

;; Start every team on 1200 points, replay all matches, return the best `number` teams.
(defn top-teams [number matches]
  (let [teams-with-rankings
        (apply array-map (mapcat (fn [x] [x {:points 1200}]) (extract-teams matches)))]
    (take number
          (sort-by (fn [x] (:points (val x)))
                   >
                   (seq (reduce process-match teams-with-rankings matches))))))
The matches parameter that we pass to top-teams looks like this:
> (take 5 all-matches)
({:home "Tampere", :away "Pyunik Erewan", :home_score 0, :away_score 4}
 {:home "Pyunik Erewan", :away "Tampere", :home_score 2, :away_score 0}
 {:home "Skonto Riga", :away "Barry Town", :home_score 5, :away_score 0}
 {:home "Barry Town", :away "Skonto Riga", :home_score 0, :away_score 1}
 {:home "Portadown", :away "Belshina Bobruisk", :home_score 0, :away_score 0})
and calling extract-teams gives us a set of all the teams involved:
> (extract-teams (take 5 all-matches))
#{"Portadown" "Tampere" "Pyunik Erewan" "Barry Town" "Skonto Riga"}
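The post doesn't show `extract-teams` itself, so the following is only a guess at how it could be written: collect every home and away name and put them in a set.

```clojure
;; Assumed implementation; the original function isn't shown in the post.
(defn extract-teams [matches]
  (set (mapcat (juxt :home :away) matches)))

(extract-teams [{:home "Tampere" :away "Pyunik Erewan" :home_score 0 :away_score 4}
                {:home "Pyunik Erewan" :away "Tampere" :home_score 2 :away_score 0}])
;; => #{"Tampere" "Pyunik Erewan"} (set order is not guaranteed)
```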
We then mapcat over that set to get a flat sequence of team names, each followed by its default points map:
> (mapcat (fn [x] [x {:points 1200}]) (extract-teams (take 5 all-matches)))
("Portadown" {:points 1200} "Tampere" {:points 1200} "Pyunik Erewan" {:points 1200} "Barry Town" {:points 1200} "Skonto Riga" {:points 1200})
before calling array-map to turn the result into a map:
> (apply array-map (mapcat (fn [x] [x {:points 1200}]) (extract-teams (take 5 all-matches))))
{"Portadown" {:points 1200}, "Tampere" {:points 1200}, "Pyunik Erewan" {:points 1200}, "Barry Town" {:points 1200}, "Skonto Riga" {:points 1200}}
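If the insertion order that `array-map` preserves isn't needed, the same starting map can be built more directly. This is an alternative sketch, not what the original code does; `initial-rankings` is a hypothetical name.

```clojure
;; Hypothetical helper: give every team the default rating of 1200.
(defn initial-rankings [teams]
  (zipmap teams (repeat {:points 1200})))

(initial-rankings #{"Portadown" "Tampere"})
;; => {"Portadown" {:points 1200}, "Tampere" {:points 1200}} (key order may differ)
```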
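We then reduce over all the matches, calling process-match on each iteration to update the team rankings appropriately. The final step is to sort the teams by their ranking so that we can list the top teams: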
> (top-teams 10 all-matches)
(["CF Barcelona" {:points 1343.900393287903}]
["Manchester United" {:points 1292.4731214788262}]
["FC Valencia" {:points 1277.1820905112208}]
["Internazionale Milaan" {:points 1269.8028023141364}]
["AC Milan" {:points 1257.4564374787687}]
["Juventus Turijn" {:points 1254.2498432522466}]
["Real Madrid" {:points 1248.0758162475993}]
["Deportivo La Coruna" {:points 1235.7792317210403}]
["Borussia Dortmund" {:points 1231.1671952364256}]
["Sparta Praag" {:points 1229.3249513256828}])
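To make the reduce step concrete, here is a condensed, self-contained sketch that replays two matches from the sample data. It isn't the original code: `update-rankings` is a hypothetical stand-in for `process-match`, `expected` uses `Math/pow`, and draws leave both ratings untouched, as in the post.

```clojure
(defn expected [my-ranking opponent-ranking]
  (/ 1.0
     (+ 1 (Math/pow 10 (/ (- opponent-ranking my-ranking) 400.0)))))

;; Hypothetical stand-in for process-match with K fixed at 32.
(defn update-rankings [ts {:keys [home away home_score away_score]}]
  (if (= home_score away_score)
    ts ; a draw changes nothing, mirroring the post's behaviour
    (let [home-pts    (get-in ts [home :points])
          away-pts    (get-in ts [away :points])
          home-result (if (> home_score away_score) 1 0)
          away-result (- 1 home-result)]
      ;; Both updates use the pre-match points captured above.
      (-> ts
          (assoc-in [home :points]
                    (+ home-pts (* 32 (- home-result (expected home-pts away-pts)))))
          (assoc-in [away :points]
                    (+ away-pts (* 32 (- away-result (expected away-pts home-pts)))))))))

(def sample-matches
  [{:home "Tampere" :away "Pyunik Erewan" :home_score 0 :away_score 4}
   {:home "Pyunik Erewan" :away "Tampere" :home_score 2 :away_score 0}])

(def final-rankings
  (reduce update-rankings
          {"Tampere" {:points 1200} "Pyunik Erewan" {:points 1200}}
          sample-matches))

;; Highest-rated team first: Pyunik Erewan ends above 1200, Tampere below.
(sort-by (comp :points val) > final-rankings)
```

Because each win transfers points from the loser to the winner, the total number of points across the two teams stays at 2400.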
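Interestingly, the finalists Juventus are only in 6th place (the eventual winners, AC Milan, are 5th), and the top four places are occupied by teams that were knocked out before the final. I wrote the following functions to investigate what was going on: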
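;; Describe one match from `team`'s point of view: the opponent and the
;; score with `team`'s goals first. Defined before show-matches, which uses it.
(defn show-opposition [team match]
  (if (= team (:home match))
    {:opposition (:away match) :score (str (:home_score match) "-" (:away_score match))}
    {:opposition (:home match) :score (str (:away_score match) "-" (:home_score match))}))

;; All matches that `team` played in, home or away.
(defn show-matches [team matches]
  (->> matches
       (filter #(or (= team (:home %)) (= team (:away %))))
       (map #(show-opposition team %))))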
If we call it with Juventus we can see how they performed in their matches:
ranking-algorithms.parse> (show-matches "Juventus Turijn" all-matches)
({:opposition "Feyenoord", :score "1-1"}
{:opposition "Dynamo Kiev", :score "5-0"}
{:opposition "Newcastle United", :score "2-0"}
{:opposition "Newcastle United", :score "0-1"}
{:opposition "Feyenoord", :score "2-0"}
{:opposition "Dynamo Kiev", :score "2-1"}
{:opposition "Deportivo La Coruna", :score "2-2"}
{:opposition "FC Basel", :score "4-0"}
{:opposition "Manchester United", :score "1-2"}
{:opposition "Manchester United", :score "0-3"}
{:opposition "Deportivo La Coruna", :score "3-2"}
{:opposition "FC Basel", :score "1-2"}
{:opposition "CF Barcelona", :score "1-1"}
{:opposition "CF Barcelona", :score "2-1"}
{:opposition "Real Madrid", :score "1-2"}
{:opposition "Real Madrid", :score "3-1"})
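Although I'm missing the final (I need to fix the parser to pick up that match, and it was a draw anyway), they actually won only 8 of their matches outright. Barcelona, on the other hand, won 13 matches, although 2 of those were qualifiers.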
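Counting results by eye is error-prone, so a small helper can tally them instead. The sketch below is an assumption about how this could be done rather than code from the post; `results-for` is a hypothetical name, and it works on the same match maps shown earlier.

```clojure
;; Hypothetical helper: count wins, draws and losses for one team.
(defn results-for [team matches]
  (frequencies
   (for [{:keys [home away home_score away_score]} matches
         :when (or (= team home) (= team away))
         :let [[ours theirs] (if (= team home)
                               [home_score away_score]
                               [away_score home_score])]]
     (cond
       (> ours theirs) :win
       (< ours theirs) :loss
       :else           :draw))))

(results-for "Pyunik Erewan"
             [{:home "Tampere" :away "Pyunik Erewan" :home_score 0 :away_score 4}
              {:home "Pyunik Erewan" :away "Tampere" :home_score 2 :away_score 0}
              {:home "Portadown" :away "Belshina Bobruisk" :home_score 0 :away_score 0}])
;; => {:win 2}
```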
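The next step is to take the importance of each match into account rather than applying an importance of 32 across the board, and to add some value for winning a tie even if it's on penalties or away goals.
If you'd like to play around with it, or have suggestions for something else I could try, the code is on GitHub.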