r/fantasyfootballcoding


NFL WR Rookie Model - Looking for Feedback/Critique

I’ve been working on a Python/XGBoost model that tries to project which rookie wide receivers are most likely to produce at least one top-12 fantasy WR season within their first four NFL seasons. The goal is not to create a perfect ranking system, but to build a structured prospect model that combines NFL draft capital, college production, PFF receiving data, athletic testing, competition/context adjustments, and historical fantasy outcomes.

The model trains on historical drafted WR classes from 2014 onward, using NFL fantasy production to label whether each player eventually produced a top-12 WR season within his first four years. I also added top-24/top-36 outcome tracking and season-count columns, so the sheet can separate “ever hit top-12” from “how many top-12/top-24/top-36 seasons did this player actually produce.”

Data used

The model currently uses:

  • NFL draft data: draft year, round/pick, team, college, player IDs
  • NFL weekly fantasy production: used to calculate WR season ranks and first-four-year outcomes
  • PFF college WR production: yards, receptions, targets, touchdowns, routes, YPRR, receiving grades, route grades, drop rate, yards after catch, target share, dominator-style metrics, etc.
  • PFF context splits: man/zone performance, slot/screen/concept usage, and receiving production by depth of target
  • Combine/athletic data: height, weight, forty, vertical, broad jump, cone, shuttle, bench, derived speed/burst/size metrics
  • Competition/context adjustments: conference/team strength, competition-adjusted production, low-competition risk, screen dependency, downfield production, and trajectory flags
  • Historical validation: leave-one-draft-year-out backtesting and walk-forward testing

Main target

The primary label is:

Did this WR produce at least one top-12 fantasy WR season within his first four NFL seasons?

I also track:

  • top24_first4
  • top36_first4
  • top12_seasons_first4
  • top24_seasons_first4
  • top36_seasons_first4
  • best WR rank in first four seasons
  • best PPR season in first four seasons
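For anyone curious how these columns fall out of the weekly data, here is a minimal sketch. The column names (`player_id`, `ppr_points`, `draft_year`) are placeholders for whatever the real sheet uses, not the actual schema:

```python
import pandas as pd

def label_outcomes(weekly: pd.DataFrame, draft: pd.DataFrame) -> pd.DataFrame:
    """Derive first-four-year outcome labels from weekly fantasy data.

    Assumes `weekly` has columns player_id, season, ppr_points and
    `draft` has player_id, draft_year; names are illustrative.
    """
    # Season-level PPR totals and WR rank within each season
    seasons = (weekly.groupby(["player_id", "season"], as_index=False)
                     ["ppr_points"].sum())
    seasons["wr_rank"] = (seasons.groupby("season")["ppr_points"]
                                 .rank(ascending=False, method="min"))

    # Restrict to each player's first four NFL seasons
    merged = seasons.merge(draft, on="player_id")
    first4 = merged[(merged["season"] >= merged["draft_year"])
                    & (merged["season"] < merged["draft_year"] + 4)]

    return first4.groupby("player_id").agg(
        top12_first4=("wr_rank", lambda r: int((r <= 12).any())),
        top24_first4=("wr_rank", lambda r: int((r <= 24).any())),
        top36_first4=("wr_rank", lambda r: int((r <= 36).any())),
        top12_seasons_first4=("wr_rank", lambda r: int((r <= 12).sum())),
        top24_seasons_first4=("wr_rank", lambda r: int((r <= 24).sum())),
        top36_seasons_first4=("wr_rank", lambda r: int((r <= 36).sum())),
        best_wr_rank_first4=("wr_rank", "min"),
        best_ppr_first4=("ppr_points", "max"),
    ).reset_index()
```

This keeps the "ever hit" flags and the season-count columns in one pass, so the sheet can separate the two views cleanly.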

Model approach

The main model is an XGBoost classifier. It uses a reduced, cleaned feature set so the model is not fed duplicate raw and competition-adjusted versions of the same stat. It also uses scale_pos_weight because top-12 WR hits are relatively rare.

I run two main validation views:

  1. Leave-one-year-out backtest: the model trains on every completed draft class except the test year, then scores that held-out class.
  2. Walk-forward historical test: the model scores each class using only information from prior classes. For example, the 2023 class is scored using training data through 2022. This is meant to test whether the model would have identified players like Puka Nacua before knowing their NFL breakout.

In the latest walk-forward test, Puka was flagged as a “late-pick outlier profile — model likes more than NFL did,” which is exactly the type of signal I wanted the model to surface. It does not mean the model viewed him as safe; it means his underlying receiving/context profile was stronger than a typical late-round WR.
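Both validation views reduce to split generators over draft classes. A sketch, assuming a `draft_year` column on the training frame:

```python
import pandas as pd

def loo_year_splits(df: pd.DataFrame):
    """Leave-one-draft-year-out: train on every class except the held-out year."""
    for year in sorted(df["draft_year"].unique()):
        yield df[df["draft_year"] != year], df[df["draft_year"] == year]

def walk_forward_splits(df: pd.DataFrame, first_test_year: int):
    """Walk-forward: each class is scored with training data from prior
    classes only (e.g. the 2023 class trains on data through 2022)."""
    for year in sorted(df["draft_year"].unique()):
        if year >= first_test_year:
            yield df[df["draft_year"] < year], df[df["draft_year"] == year]
```

The walk-forward view is the stricter of the two, since the model never sees a later class than the one it is scoring.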

Final rookie score

The final rookie score is a blended score. It is not just the XGBoost probability. It combines:

  • model probability of a top-12 outcome
  • prospect grade
  • contextual production profile
  • calibration/priors
  • draft-capital-vs-model interpretation flags

I removed the extra standalone draft-capital weight from the final score because draft capital is already represented inside both the model and prospect grade. Draft capital is still included as a feature and displayed in the output, but I wanted to avoid triple-counting it.
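The blend might look something like the sketch below. The weights and component names are illustrative, not the actual ones; the only structural claim from the post is that draft capital has no standalone term:

```python
def final_rookie_score(p_top12: float, prospect_grade: float,
                       context_score: float, prior: float,
                       weights=(0.40, 0.35, 0.15, 0.10)) -> float:
    """Blend the components onto a 0-100 scale. Weights are illustrative.
    Draft capital is deliberately absent from the blend because it
    already feeds both the model probability and the prospect grade."""
    parts = (100 * p_top12, prospect_grade, context_score, 100 * prior)
    return sum(w * x for w, x in zip(weights, parts))
```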

Recent model checks

The most recent run used 295 eligible training rows and 34 top-12 hits. The leave-one-year-out backtest produced an overall average precision around 0.70 and ROC AUC around 0.89. The model is not perfect, but the historical validation has been useful for identifying where the model is too aggressive, too draft-capital-heavy, or too reliant on certain production signals.
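Scoring the pooled out-of-fold predictions is a one-liner with scikit-learn; a sketch, with the caveat that the baseline average precision here is only the hit rate (~34/295 ≈ 0.115), which is why AP ~0.70 is the more telling number:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def backtest_metrics(y_true: np.ndarray, y_prob: np.ndarray) -> dict:
    """Score pooled held-out predictions across all draft classes."""
    return {
        "average_precision": average_precision_score(y_true, y_prob),
        "roc_auc": roc_auc_score(y_true, y_prob),
    }
```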

2023 walk-forward example

The walk-forward test for 2023 ranked the class like this:

| Rank | Player | Pick | Walk-forward score | Walk-forward probability | Actual best WR rank so far | Interpretation |
|---|---|---|---|---|---|---|
| 1 | Jaxon Smith-Njigba | 20 | 73.11 | 0.729 | 2 | Model and draft capital agree |
| 2 | Puka Nacua | 177 | 44.44 | 0.507 | 1 | Late-pick outlier profile |
| 3 | Jordan Addison | 23 | 43.28 | 0.006 | 24 | Model and draft capital agree |
| 4 | Zay Flowers | 22 | 40.07 | 0.012 | 8 | Model and draft capital agree |
| 5 | Quentin Johnston | 21 | 38.94 | 0.055 | 37 | Model and draft capital agree |

This was encouraging because Puka was not being rewarded with hindsight: the model trained only through 2022 and still flagged him as a late-pick profile stronger than his draft capital suggested.

Current 2026 model output

Here are the current 2026 WR rankings by final_rookie_score:

| Rank | Player | School | Draft pick | Final rookie score | Top-12 probability | Prospect grade |
|---|---|---|---|---|---|---|
| 1 | Jordyn Tyson | Arizona St. | 8 | 54.53 | 0.240 | 79.51 |
| 2 | Carnell Tate | Ohio St. | 4 | 53.15 | 0.411 | 62.98 |
| 3 | Makai Lemon | USC | 20 | 52.00 | 0.180 | 79.82 |
| 4 | KC Concepcion | Texas A&M | 24 | 46.23 | 0.140 | 72.60 |
| 5 | Elijah Sarratt | Indiana | 115 | 43.31 | 0.169 | 64.92 |
| 6 | Omar Cooper | Indiana | 30 | 42.61 | 0.140 | 66.02 |
| 7 | Denzel Boston | Washington | 39 | 40.39 | 0.090 | 66.07 |
| 8 | De'Zhaun Stribling | Mississippi | 33 | 36.30 | 0.090 | 58.64 |
| 9 | Germie Bernard | Alabama | 47 | 34.97 | 0.090 | 56.22 |
| 10 | Chris Brazzell II | Tennessee | 83 | 34.84 | 0.040 | 60.06 |

What I’m looking for feedback on

I’m especially looking for critiques on:

  1. Whether top-12 within first four seasons is the right primary label
  2. Whether top-24/top-36 or best WR rank should be modeled separately
  3. Whether the competition adjustment is too aggressive or not aggressive enough
  4. Whether draft capital is still being over-weighted
  5. Whether the model is overfitting to certain PFF production/context metrics
  6. Whether late-pick outlier profiles like Puka should be handled differently
  7. Whether the final rookie score should be more probability-based or more prospect-grade-based
  8. Whether I should add another model type for comparison, such as logistic regression, random forest, or a regression model for best WR rank

I know this is not perfect, and I’m not trying to claim it is predictive gospel. I’m mainly trying to build a transparent, testable framework for WR prospect analysis and would appreciate any feedback on the methodology, assumptions, feature engineering, and validation approach.

u/schnarfdogg — 1 day ago

I built a review system for fantasy players to help separate the good actors from the bad ones.

Do you ever wish you could know what kind of players you recruit into your public league online?

Ever had a few bad apples that you wish you never let into your league?

Have you wanted a way to show that you’re an active league member who knows ball?

Well, I want to make that easier for you. I built FanCred reviews to help the fantasy community more easily recruit the right folks for their league.

Reviews are based on a few factors like League Engagement, Ball Knowledge, and Responsiveness. Once a user has enough reviews, an AI summary is generated so you can quickly get an idea of what people are saying about a particular username on a platform (e.g. "Commishner" on Sleeper, which is my username).

Hoping to get a good number of folks reviewing! Try to get your league to review you so you have proof of how good a leaguemate you are!

Check it out at https://fancred.app/review

u/fancredfounder — 7 days ago