Gathering player metrics
I’m thinking about which metrics to provide: goals, expected goals (xG), assists, critical contributions, and more. I should define "关键球员" ("key players"), which likely refers to players who make clutch contributions at key moments of a game. I need to ask clarifying questions, such as which league and season the user is interested in, their preferred language, and whether they want a live ranking. If network access is an issue, I might suggest an offline approach using a CSV file and a pandas script for the calculations.

Planning player ranking tasks
The instruction to use a planning tool feels unnecessary, since this is quite straightforward. I should keep my answer concise by asking just a couple of questions and offering some default suggestions. Instead of creating a list with hypothetical names, which could be misleading, I'll provide a template with sample placeholder rows. I also aim to generate a reusable notebook or Python script that calculates the ranking from a CSV file of match events, including the required columns, a weighting scheme, and recommended visualizations.

Clarifying user intent for ranking
The user only provided a phrase, so they might be looking for a ranked list. I should ask some clarifying questions: Which league or season are they interested in? Which metrics should determine "key contributions"? Do they want an automated ranking script or a manually curated list/template? If they don’t respond, I could suggest the EPL 2023-24 season as a default and provide a formula combining expected points added, non-penalty G+A per 90, and other relevant metrics, along with an implementation offer and an example code snippet.

Creating a pandas function for player metrics
I need to develop a small pandas function that expects several columns, including player statistics like minutes, goals, xG, and more. Computing expected points (xP) usually requires event-level timelines, which might not always be available, so we can approximate with go-ahead goals instead. I'll write code to read a CSV with the necessary columns, compute per 90 rates, and create a contribution index based on player position with corresponding weights.
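The go-ahead-goal approximation mentioned above could be sketched roughly as follows. This is a minimal illustration, not the method used by any data provider: the `(minute, team, scorer)` tuple layout and the function name `count_go_ahead_goals` are hypothetical, own goals are ignored, and only goals (not assists) are counted.

```python
from collections import Counter


def count_go_ahead_goals(goals):
    """Count go-ahead goals per scorer for a single match.

    `goals` is a list of (minute, team, scorer) tuples (hypothetical layout).
    A goal is treated as "go-ahead" when the score was level just before it.
    """
    score = Counter()      # running goals per team
    go_ahead = Counter()   # go-ahead goals per scorer
    for minute, team, scorer in sorted(goals):
        opponent_goals = sum(n for t, n in score.items() if t != team)
        if score[team] == opponent_goals:  # level before this goal
            go_ahead[scorer] += 1
        score[team] += 1
    return dict(go_ahead)


# Placeholder match: 1-0, 1-1, 2-1, 3-1
match = [
    (12, "Home", "Player A"),
    (40, "Away", "Player B"),
    (75, "Home", "Player A"),
    (88, "Home", "Player C"),
]
print(count_go_ahead_goals(match))  # {'Player A': 2}
```

Summing these per-match counts over a season would give a column comparable in spirit to the `go_ahead_ga` input the script accepts, though that column presumably also includes assists.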
I'll normalize the metrics using z-scores or min-max scaling and generate an overall "key index" (关键指数), outputting the top 20 players. Since the instructions specify "ASCII only," I’ll keep the code clear and the explanation concise.

Providing player ranking methodology
I’ll start by saying, “Sure! Here’s a definition, calculation method, and runnable script for a 'key player performance ranking'.” I’ll quickly ask clarifying questions.

I’ll include bullet points defining the components of the "key index." I'll provide a Python script in a code block for practical use and explain how to run it: the user saves their CSV and runs the script, which writes key_players_rank.csv.

I'll invite them to share the league and season or their dataset, or ask if I can fetch data if allowed. I'll make sure to keep the style concise!
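Which league and season do you want the ranking for? And which definition of "key" do you prefer?
If you'd rather not specify yet, I can give you a generic "key index" calculation and a runnable script; feed it your data CSV and it produces the ranking.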
Key index (generic version, weighted by position group)
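As a rough worked example of how the index combines sub-scores, here is a minimal sketch for a single outfield player. The weights are read off the script below; the `WEIGHTS` table and the `key_index` function are illustrative names, and I am assuming the `0.5` in the script's availability line is an exponent (a square root). Note that the forward weights as written sum to 1.05 rather than 1.

```python
import math

# Position-group weights as they appear in the script's np.where branches.
WEIGHTS = {
    "FWD": {"scoring": 0.50, "creation": 0.25, "progression": 0.15,
            "impact": 0.10, "clutch": 0.05},
    "MID": {"scoring": 0.30, "creation": 0.35, "progression": 0.20,
            "defensive": 0.10, "impact": 0.05},
    "DEF": {"scoring": 0.15, "creation": 0.20, "progression": 0.25,
            "defensive": 0.30, "impact": 0.10},
}


def key_index(group, subscores, minutes, full_minutes=900.0):
    """Weighted sum of z-scored sub-scores, scaled by an availability factor."""
    base = sum(w * subscores.get(name, 0.0) for name, w in WEIGHTS[group].items())
    # Assumed square-root penalty: 1.0 at 900+ minutes, 0.5 at 225 minutes.
    availability = math.sqrt(min(1.0, max(minutes, 0.0) / full_minutes))
    return base * availability


# Placeholder z-scores for a midfielder
z = {"scoring": 1.0, "creation": 2.0, "progression": 0.0,
     "defensive": -1.0, "impact": 0.0}
print(round(key_index("MID", z, minutes=900), 3))  # 0.9
print(round(key_index("MID", z, minutes=225), 3))  # 0.45
```

The square root makes the penalty soft: a player with a quarter of the 900-minute threshold keeps half of their score rather than a quarter.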
Runnable script (feed it your season-aggregate CSV)
Save as rank_key_players.py and adjust the input/output paths as needed:
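# rank_key_players.py
import pandas as pd
import numpy as np


def robust_z(s: pd.Series) -> pd.Series:
    """Median/MAD-based z-score; falls back to mean/std when the MAD is zero."""
    s = s.astype(float)
    med = s.median()
    mad = (s - med).abs().median()
    if mad == 0:
        std = s.std(ddof=0)
        return (s - s.mean()) / (std if std else 1.0)
    # 0.6745 rescales the MAD so the result is comparable to a standard z-score.
    return 0.6745 * (s - med) / mad


def per90(df, col, minutes_col="minutes"):
    """Rate per 90 minutes; players with 0 minutes get NaN instead of inf."""
    mins = df[minutes_col].replace(0, np.nan)
    return df[col] / (mins / 90.0)


def safe_col(df, name, default=0.0):
    """Return df[name], or a constant column if it is missing."""
    return df[name] if name in df.columns else pd.Series(default, index=df.index, dtype=float)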


def compute_index(df: pd.DataFrame) -> pd.DataFrame:
    """Rank players by a position-weighted key index built from season aggregates."""
    df = df.copy()
    # Coerce the stat columns that are present to numeric; bad values become 0.
    for c in ["minutes", "goals", "pens_made", "xg", "npxg", "assists", "xa", "sca", "gca",
              "prog_passes", "prog_carries", "pressures", "tackles", "interceptions",
              "on_off_xg_diff", "gk_psxg", "gk_ga", "gk_cross_stops",
              "sweeper_actions", "pass_launch_cmp", "match_winners", "go_ahead_ga"]:
        if c in df.columns:
            df[c] = pd.to_numeric(df[c], errors="coerce").fillna(0.0)

    # Per90s
    df["npg"] = (df["goals"] - safe_col(df, "pens_made")).clip(lower=0)
    df["npg90"] = per90(df, "npg")
    df["npxg90"] = per90(df, "npxg") if "npxg" in df.columns else per90(df, "xg")
    df["xa90"] = per90(df, "xa")
    df["sca90"] = per90(df, "sca")
    df["gca90"] = per90(df, "gca")
    df["prog_pass90"] = per90(df, "prog_passes")
    df["prog_carry90"] = per90(df, "prog_carries")
    df["press90"] = per90(df, "pressures")
    df["tack_int90"] = per90(df, "tackles") + per90(df, "interceptions")
    df["impact90"] = per90(df, "on_off_xg_diff") if "on_off_xg_diff" in df.columns else 0.0
    df["clutch90"] = 0.0
    if "match_winners" in df.columns:
        df["clutch90"] += per90(df, "match_winners")
    if "go_ahead_ga" in df.columns:
        df["clutch90"] += 0.5 * per90(df, "go_ahead_ga")

    # Sub-scores (robust z)
    scoring = robust_z(0.55 * df["npxg90"] + 0.45 * df["npg90"])
    creation = robust_z(0.5 * df["xa90"] + 0.3 * df["sca90"] + 0.2 * df["gca90"])
    progression = robust_z(0.6 * df["prog_pass90"] + 0.4 * df["prog_carry90"])
    defensive = robust_z(0.6 * df["press90"] + 0.4 * df["tack_int90"])
    impact = robust_z(df["impact90"])  # all zeros when the column is absent
    clutch = robust_z(df["clutch90"])

    # Position group
    pos = df["position"].fillna("").astype(str).str.upper()
    is_gk = pos.str.contains("GK")
    is_def = pos.str.contains("CB|RB|LB|RWB|LWB|DF|FB")
    is_mid = pos.str.contains("DM|CM|AM|RM|LM|MF|MID")
    # The bare "W" also matches wing-backs (RWB/LWB), so exclude them explicitly.
    is_fwd = pos.str.contains("FW|ST|CF|LW|RW|W") & ~pos.str.contains("RWB|LWB")

    # Outfield index by weights.
    # NOTE: the forward weights sum to 1.05 as written; rescale if they should sum to 1.
    outfield_idx = np.where(
        is_fwd, 0.50 * scoring + 0.25 * creation + 0.15 * progression + 0.10 * impact + 0.05 * clutch,
        np.where(
            is_mid, 0.30 * scoring + 0.35 * creation + 0.20 * progression + 0.10 * defensive + 0.05 * impact,
            np.where(
                is_def, 0.15 * scoring + 0.20 * creation + 0.25 * progression + 0.30 * defensive + 0.10 * impact,
                0.30 * scoring + 0.30 * creation + 0.20 * progression + 0.15 * defensive + 0.05 * impact)))

    # Goalkeeper index
    if "gk_psxg" in df.columns and "gk_ga" in df.columns:
        psxg_ga = df["gk_psxg"] - df["gk_ga"]  # positive is good
        gk_core = robust_z(psxg_ga)
    else:
        gk_core = pd.Series(0.0, index=df.index)
    gk_cross = robust_z(safe_col(df, "gk_cross_stops"))
    if "sweeper_actions" in df.columns:
        gk_sweeper = robust_z(per90(df, "sweeper_actions"))
    else:
        gk_sweeper = pd.Series(0.0, index=df.index)
    gk_pass = robust_z(safe_col(df, "pass_launch_cmp"))
    gk_idx = gk_core + 0.25 * gk_cross + 0.25 * gk_sweeper + 0.20 * gk_pass
    base_idx = np.where(is_gk, gk_idx, outfield_idx)

    # Availability penalty for very low minutes
    mins = df["minutes"].clip(lower=0)
    avail = np.minimum(1.0, mins / 900.0) ** 0.5  # soft penalty under 900'
    df["key_index"] = base_idx * avail

    # Secondary sort helpers
    df["g_a90"] = per90(df, "goals") + per90(df, "assists")
    id_cols = ["player", "team", "position", "minutes"]
    metric_cols = ["key_index", "g_a90", "npg90", "npxg90", "xa90", "sca90",
                   "prog_pass90", "prog_carry90", "press90", "tack_int90", "impact90"]
    out = df[id_cols + metric_cols].copy()
    out = out.sort_values(["key_index", "g_a90", "minutes"], ascending=[False, False, False])
    return out


if __name__ == "__main__":
    import argparse

    ap = argparse.ArgumentParser()
    ap.add_argument("--in", dest="inp", required=True, help="Input CSV with season aggregates")
    ap.add_argument("--out", dest="outp", default="key_players_rank.csv", help="Output CSV")
    ap.add_argument("--top", dest="topk", type=int, default=50, help="Rows to print")
    args = ap.parse_args()

    df = pd.read_csv(args.inp)
    ranked = compute_index(df)
    ranked.to_csv(args.outp, index=False)
    print(ranked.head(args.topk).to_string(index=False, float_format=lambda x: f"{x:,.3f}"))
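
To show why the script's robust_z helper uses the median and MAD rather than the mean and standard deviation, here is a small standalone comparison. It is a sketch in plain Python (no pandas) that mirrors the helper's logic; the per-90 values are made-up placeholders with one extreme outlier.

```python
from statistics import mean, median, pstdev


def standard_z(values):
    """Classic z-score using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / (sigma or 1.0) for v in values]


def robust_z(values):
    """Median/MAD z-score, falling back to standard_z when the MAD is zero."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return standard_z(values)
    return [0.6745 * (v - med) / mad for v in values]


# Placeholder npxG-per-90 values; the last player is an outlier.
rates = [0.10, 0.12, 0.15, 0.18, 0.20, 0.95]
for name, fn in (("standard", standard_z), ("robust", robust_z)):
    print(name, [round(z, 2) for z in fn(rates)])
```

With the standard z-score, the outlier inflates the standard deviation and squeezes the other five players into a narrow band; the robust version keeps their spread and scores the outlier as far more extreme.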
Usage
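If you don't have data on hand yet, a tiny placeholder file is enough to try the script. The sketch below prints a sample CSV containing the columns the script reads without a fallback; the player names and numbers are made up for illustration only, and `build_sample_csv` is a hypothetical helper name.

```python
import csv
import io

# Columns read by rank_key_players.py; pens_made and npxg are optional there.
COLUMNS = ["player", "team", "position", "minutes", "goals", "pens_made", "xg", "npxg",
           "assists", "xa", "sca", "gca", "prog_passes", "prog_carries",
           "pressures", "tackles", "interceptions"]

# Placeholder rows, not real players or real statistics.
ROWS = [
    ["Player A", "Team X", "FW", 1800, 12, 2, 11.5, 9.9, 4, 3.8, 60, 9, 40, 55, 210, 12, 6],
    ["Player B", "Team Y", "CM", 2400, 4, 0, 3.9, 3.9, 9, 8.1, 95, 12, 180, 70, 480, 55, 40],
    ["Player C", "Team Z", "CB", 2700, 2, 0, 1.8, 1.8, 1, 0.9, 20, 2, 120, 30, 300, 60, 70],
]


def build_sample_csv():
    """Return the sample table as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(ROWS)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_sample_csv(), end="")
```

Save the printed text as sample_season.csv, then run the script with the flags it defines:

    python rank_key_players.py --in sample_season.csv --out key_players_rank.csv --top 20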
Would you like me to:
