Posted (edited)

There's been a bunch of chat following Winelands and 99er about "low betas", seeding calcs, how to move up, etc, etc.

I've done some reading on the PPA website about how seeding works, and I thought some worked examples, with explanations, might be helpful, so here goes.

If you want to argue with me, fine, but please do it with maths, not with "I feel like 'x'". I'm basing this purely off the logic as laid out by PPA, which I assume they follow. I'm also 99% sure my maths is right, but correct it if you see a glaring error.

I'll work through the PPA page point by point, noting where they no longer use certain points.

1. Ride weighting 

This seems to have fallen out of use. From everything I've seen since I started riding in 2020, your best index result, after penalties, contributes 100% to your seeding.

2. Establish a base event

Unclear exactly how this contributes, but as they state that this is the last CTCT, we'll just go with it (it becomes more obvious in the Tadej example below).

3. Calculate adjusted winning times and beta for a race

This is the crux of the matter, and where most of the discussion lies. There's a false perception that both of these steps are subjective, and the main point of this post is to show that they are not. I also strongly believe that PPA does themselves a massive disservice by labelling beta as "difficulty", since this suggests some subjectivity, which it very clearly does not have. It's all too easy for an anecdotal "I felt this race was harder than that race" to enter the discussion, when it has no basis in the actual calcs, as we will show below.

The goal of this step is to say "on average, a rider coming into an event with an index of 'x' should get an index of 'x' for the event" (read that a few times, it's important).

Obviously, hundreds/thousands of riders do an event, and it's not mathematically possible to fit a line such that everyone's index remains the same. (Also, then your index would never change, which is obviously undesirable). That's where the linear regression comes in. Sounds complicated, really isn't, you just let Excel do it for you.

An important point to note is that, in a race with a beta of 1 and where the winner of the race had an index of 0 before the race, your index will represent the percentage over the winning time. (Eg: winner takes 200 minutes, you take 220 minutes, you did 10% more, your index is 10. If you take 300 minutes, you did 50% more, your index is 50).

Let's work through some examples to see how it plays out.

For all examples, we're going to use the fictitious "Tour de Bikehub" as our event. Assume only 6 riders, since that makes the maths easier to follow. The indexes shown in the tables represent seeding indexes prior to the event. Times are given in minutes to keep the maths simple.

The base case

In this case, the winner had an index of 0, and did 150 minutes.

Everyone else, conveniently, did exactly their index worse (so the person with an index of 10 did 10% worse, or 15 minutes worse = 165 mins).

image.png.7242b64632062e43166ee7d31b8bdc4b.png

We can plug those times and indexes into Excel and draw a chart, with index along the x-axis and time on the y-axis.

image.png.ee245a46d5bfda0a83c045164e1466da.png

Excel then allows us to draw a linear trendline through those points, and to plot the equation of "y = mx + c" (high school maths reminder: m = the gradient/slope of the line, c = the y-intercept, which in this case is the "predicted winning time of a 0-indexed rider")

image.png.89aa371d1b290dd3bf973985054b7292.png

We can also pull those values out using the "LINEST" formula in Excel.

image.png.d46e46886efc08f459c0582f15739639.png

For our purposes, "c" represents the "adjusted winning time" we're familiar with. 

We need to do a little maths on "m" to get to "beta", but it's not hard: beta = m/c * 100 (which in this case equates to 1).

So we now, for this event, have a beta of 1 and an adjusted winning time of 150 mins.
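For anyone who'd rather not use Excel, here is a minimal Python sketch of the same trendline fit. The function name `fit_event` is my own, and the six indexes and times are an assumption reconstructed from the description of the base case (the table itself is an image), so treat it as an illustration rather than PPA's actual code.

```python
def fit_event(indexes, times):
    """Least-squares line time = m * index + c (the same fit as an Excel trendline)."""
    n = len(indexes)
    mean_x = sum(indexes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in indexes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indexes, times))
    m = sxy / sxx              # slope: extra minutes per index point
    c = mean_y - m * mean_x    # intercept: adjusted winning time
    beta = m / c * 100         # beta as defined in the post
    return m, c, beta

# Assumed base-case field: pre-race indexes 0..50, each rider exactly "index %" slower
indexes = [0, 10, 20, 30, 40, 50]
times = [150, 165, 180, 195, 210, 225]

m, c, beta = fit_event(indexes, times)
print(m, c, beta)  # 1.5 150.0 1.0
```

The intercept comes straight out of the fit, which is why the adjusted winning time does not need the actual winner to have an index of 0.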

We can then proceed to step 4, and calculate new indexes for all riders:

New index = (Time / WinningTime - 1) / Beta * 100

If we do that for each rider, we get:

image.png.0e010a22b5e66b576349abf96b15e5fe.png

Nothing changed. As expected, because this was the base case, where the best possible seeded rider won and everyone else performed to expectation.
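The step 4 formula can be sketched in a few lines of Python. The rider letters and times below are assumed from the base case described above (the table is an image), and `new_index` is just a name I picked for the formula.

```python
def new_index(time, adjusted_winning_time, beta):
    """New index = (Time / WinningTime - 1) / Beta * 100, as given in the post."""
    return (time / adjusted_winning_time - 1) / beta * 100

# Assumed base-case times in minutes, rider A being the 0-index winner
base_case = {"A": 150, "B": 165, "C": 180, "D": 195, "E": 210, "F": 225}

results = {rider: round(new_index(t, 150, 1.0), 2) for rider, t in base_case.items()}
for rider, idx in results.items():
    print(rider, idx)
```

With a beta of 1 and an adjusted winning time of 150 mins, each rider gets back the index they arrived with.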

But hopefully now we understand the maths, so let's go to example 2.

Example 2: Missing elites

Let's keep the same cohort as the last example, and the same times, but this time let's say the rider seeded 0 doesn't rock up.

So the rider seeded 10 wins, and does it in the same time they did in example 1:

image.png.3c064c418faaaea6d478170fe3c78978.png

Doing the same excel gymnastics as last time, we get this chart:

image.png.096e90480a39f7c40d647b07bae157e2.png

And these params:

image.png.429bfc1053958502a4988466f6dccfad.png

Which all looks quite familiar? Again, this is expected: the winning time is adjusted down to what it would have been if a 0 index rider had shown up. The beta, however, remains unchanged at 1, since the performances relative to that time are in line with the index expectations.

So we, again, don't have any index changes:

image.png.cd33a2a64c1510262913ab20331f496b.png
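A quick Python check of the "missing elites" case, again assuming the field is riders indexed 10 to 50 with their base-case times (my reconstruction, since the table is an image). `fit_event` is a hypothetical helper doing the same least-squares fit as the Excel trendline.

```python
def fit_event(indexes, times):
    """Least-squares fit of time = m * index + c, returning (c, beta)."""
    n = len(indexes)
    mean_x = sum(indexes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in indexes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indexes, times))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return c, m / c * 100

# Assumed field without the 0-index rider: the rider seeded 10 wins in 165 mins
indexes = [10, 20, 30, 40, 50]
times = [165, 180, 195, 210, 225]

adjusted_winning_time, beta = fit_event(indexes, times)
print(adjusted_winning_time, beta)  # 150.0 1.0
```

The actual winning time was 165 mins, but the line extrapolates back to 150 mins at index 0, so nobody's index moves.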

Example 3: The slow day

Let's assume the race was just really slow, and the times look like this:

image.png.19da657ee6d37aeb8719558cff7213ef.png

Chart:

image.png.dc07c4ac40bacc4a1de086ede0e37672.png

As you can see, some dots are above the line, some are below. In general, those above the line won't improve their index, those below the line will.

image.png.04390f2eecf23d641709a89d52fa30bb.png

We end up with a huge beta of 1.87 (since, in general, riders were well over the "x% more than the winning time" index heuristic), and riders C and E improved their indexes by finishing "below the trendline".

Example 4: The monster in Group D

Let's now assume the monster rider A decides to drop back to D, and pulls all his mates to a faster finishing time.

image.png.e572eef804d6788e75c573f27697ba17.png

They improve the D time from 195mins to 175mins (they catch C, who started 5 mins before them), everyone else remains the same (so B wins again).

Our chart and params now look like this (note how far some points are from the trendline now):

image.png.75d09dd5070bc34418f1fdb813ddab01.png

image.png.89b4c3c7f4328092bd05721a81c5a971.png

As you can see, this leads to a slightly adjusted winning time, but a very low beta (shades of 2024 Tour de PPA here).

This helps B, C, and D, but penalises E & F.
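Here is a rough Python sketch of this scenario. The numbers are my reconstruction from the text only (A rides with D and both finish in 175 mins, everyone else keeps their base-case time), so they may not match the table image exactly. `fit_event` and `new_index` are hypothetical helper names.

```python
def fit_event(indexes, times):
    """Least-squares fit of time = m * index + c, returning (c, beta)."""
    n = len(indexes)
    mean_x = sum(indexes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in indexes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indexes, times))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return c, m / c * 100

def new_index(time, adjusted_winning_time, beta):
    return (time / adjusted_winning_time - 1) / beta * 100

# Assumed data: rider -> (pre-race index, finish time in minutes)
field = {"A": (0, 175), "B": (10, 165), "C": (20, 180),
         "D": (30, 175), "E": (40, 210), "F": (50, 225)}

c, beta = fit_event([i for i, _ in field.values()], [t for _, t in field.values()])
after = {r: new_index(t, c, beta) for r, (_, t) in field.items()}

print(round(c, 1), round(beta, 2))
for rider, (before, _) in field.items():
    print(rider, before, "->", round(after[rider], 1))
```

Under these assumed times the fit gives a beta of roughly 0.67 and an adjusted winning time of roughly 161 mins. B, C and D come out with lower indexes than they started with, while E and F come out higher.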

Fun example: Tadej comes to town

It's also possible for the winning time to be adjusted up. Imagine Tadej (who doesn't have a Racetec chip, or a PPA seeding index) decides to come do the Tour de Bikehub. He obviously smashes everyone, including the 0 index rider.

image.png.4387d6ae815de94c29e17eea2cc167b5.png

In this case, we actually just don't include Tadej's time (since he doesn't have an index and thus can't be included in the calcs).

Chart and params:

image.png.8a3175cde61c5e2ecd941123567a0caf.png

image.png.a1f06ac8b2fc5643f3be45a2a78b9368.png

Winning time is adjusted up from 135mins to 150mins (ie, what the actual 0 index rider did).

Tadej now has an index of -10, unlikely but possible (until the next CTCT he wins, at which point he becomes the "baseline" or "0 index rider"). 
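A small Python sketch of the Tadej case. The seeded riders use the assumed base-case indexes and times (my reconstruction, since the tables are images), and an unseeded rider is represented with an index of `None`, which is purely my own convention for the example.

```python
def fit_event(indexes, times):
    """Least-squares fit of time = m * index + c, returning (c, beta)."""
    n = len(indexes)
    mean_x = sum(indexes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in indexes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indexes, times))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return c, m / c * 100

# (rider, pre-race index or None if unseeded, finish time in minutes)
field = [("Tadej", None, 135), ("A", 0, 150), ("B", 10, 165), ("C", 20, 180),
         ("D", 30, 195), ("E", 40, 210), ("F", 50, 225)]

# Riders without an index are left out of the regression
seeded = [(idx, t) for _, idx, t in field if idx is not None]
c, beta = fit_event([idx for idx, _ in seeded], [t for _, t in seeded])

# They still get an index afterwards from the usual formula
tadej_index = (135 / c - 1) / beta * 100
print(c, beta, round(tadej_index, 2))  # 150.0 1.0 -10.0
```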


Conclusion

That was long, but I hope it's helpful to understand just how little subjectivity there is in calculating the seeding numbers. Obviously with thousands of riders doing all sorts of performances the linear regression becomes less easy to interrogate, but it should "average out".

If I've missed anything, or anything is unclear, please let me know so I can update this post.

If you want to play around with the calcs, make a copy of this Google sheet and go nuts

 

image.png

Edited by MongooseMan
added google sheet
Posted

This was so cool to read! Thanks for the time and effort. 👏

Posted

Hi, thanks for the explanation.

So would I be correct to say that once you have reached the 'B' seeding, finishing towards the front of the group in a normal race won't improve your seeding any more than it is, and you have to ask/apply for a re-seeding to join the A bunch? 

At the same time, roughly which indexes get placed into which start groups at CTCT?

Posted
9 minutes ago, James2233 said:

Hi, thanks for the explanation.

So would I be correct to say that once you have reached the 'B' seeding, finishing towards the front of the group in a normal race won't improve your seeding any more than it is, and you have to ask/apply for a re-seeding to join the A bunch? 

At the same time, roughly which indexes get placed into which start groups at CTCT?

A few years back (when there were still more road races around) I finished at the sharp end of the B bunch 4 races in a row without improving my road seeding to A. 1 month later, a little MTB race in the Boland with a massive beta halved my road seeding index despite an unspectacular result. At that point I gave up on the illusion that the system is working (despite the impressive maths behind it).

Posted
7 minutes ago, Skubarra said:

A few years back (when there were still more road races around) I finished at the sharp end of the B bunch 4 races in a row without improving my road seeding to A. 1 month later, a little MTB race in the Boland with a massive beta halved my road seeding index despite an unspectacular result. At that point I gave up on the illusion that the system is working (despite the impressive maths behind it).

The maths is not that impressive. It's just a simple curve fit with the assumption that the 0 index rider is the winner of the CTCT. Blame that guy😆

The gap here is that it rewards one-hit wonders with a great seeding, while consistency can be detrimental.

Posted
37 minutes ago, Skubarra said:

A few years back (when there were still more road races around) I finished at the sharp end of the B bunch 4 races in a row without improving my road seeding to A. 1 month later, a little MTB race in the Boland with a massive beta halved my road seeding index despite an unspectacular result. At that point I gave up on the illusion that the system is working (despite the impressive maths behind it).

 

Same happened with my first Vines and Views.  Placed me in the A batch for 99er in 2023 ... 😬

 

And now that I consistently chip away at my times, my seeding is all over the show.... purely academic for me, as it only determines if I start 10 000th or 15 000th at the World Champs .... I do feel sorry for the good riders struggling to bridge from C to B to A ....

Posted

@MongooseMan do you mind running the following times in your spreadsheet, based on the 99er:

Beta 0,76

winner 2:08:52

 

- Finish time of 3:01 ... seeding ?

- Finish time of 3:08:52  (PPA has this seeding as 61,16)

Posted (edited)
6 minutes ago, ChrisF said:

@MongooseMan do you mind running the following times in your spreadsheet, based on the 99er:

Beta 0,76

winner 2:08:52

 

- Finish time of 3:01 ... seeding ?

- Finish time of 3:08:52  (PPA has this seeding as 61,16)

3:01:00 = 53.23

3:08:52 = 61.26 (likely some rounding error)

 

Edit: Time to index conversion for 99er:

image.png.79d40019f532316325738dff488169d9.png
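For anyone wanting to repeat this conversion themselves, here is a minimal Python sketch using the index formula from the first post. It assumes the 2:08:52 winner's time is also the adjusted winning time for the 99er, and the helper names are my own.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def event_index(finish, winning_time, beta):
    """New index = (Time / WinningTime - 1) / Beta * 100."""
    return (to_seconds(finish) / to_seconds(winning_time) - 1) / beta * 100

# Figures quoted in the thread; winner's time assumed to be the adjusted winning time
BETA = 0.76
WINNER = "2:08:52"

for finish in ("3:01:00", "3:08:52"):
    print(finish, round(event_index(finish, WINNER, BETA), 2))
# 3:01:00 53.23
# 3:08:52 61.26
```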

Edited by MongooseMan
Posted
1 minute ago, MongooseMan said:

3:01:00 = 53.23

3:08:52 = 61.26 (likely some rounding error)

 

Thank you

 

In practical terms ....

 

7 minute technical:

- drop about 120 positions

- significant drop in seeding
