Reward structures

Saturday 14th September 2019, 1645hrs. Looking at my GPS route, it seems I have accidentally ended up on the track leading to the summit of Mount Bruce (1,630m). I thought I was on the Cass Lagoon Saddle Track. Not that I'd know; I don't have a map and haven't looked at my phone to work out where I am.

It's been hard to get to this point. I started at the other end but turned around after multiple knee-deep river crossings, constant rain and being unable to find the 'trail' (Strava link). On this attempt, I went through relentless bogs (knee-deep, of course). Then the snow started. By this point, I've lost most of the feeling in my feet and I've lost sight of the 'track' again. I look back and I'm not sure how I got to this point. I trace back my steps through the snow (Strava link). I posted on Strava:

"Being out there by myself with no map, no one knowing where I was, on a track I hadn't been on before with the weather closing in, each step up the mountain was one step further away from safety and one step closer to being a problem. Maybe this is why I am just so unreasonable with my expectations at work. Your comfortable office job is a fucking walk in the park compared to this and you find that difficult?"

I'd never thought about the concept of reward structures at this point in my life; I hardly rode my bike. What led me to the track and to this point can now be seen through the lens of thinking I might not be 'here' again. That’s not the reason I’m standing in the snow late on a Saturday afternoon when I could be inside. I may never know the reason. That’s okay because ultimately it led me to where I am now.

Posted to Strava 2023-05-31

Last updated 2023-05-31

Word count: ~4,000. Reading time: 17 minutes.

I've previously written on mental toughness many times. My thoughts generally echo those of others - you need to do hard things. Only you know if what you do is hard. As a starting point, let's assume it isn't. Unfair? Probably not.

On Wednesday, I spoke with a well-regarded member of Australian bikepack racing. They've won races, and they've set fast times. I'll say no more to preserve their anonymity. Our discussion centred on what it takes to finish long, hard races. Based on their recent races, I had a few practical suggestions that I've drawn from ultra running. The broader point was the concept of reward structures.

They quoted the popular phrase 'finish or die'. Whilst this is better than some approaches (i.e. no approach), I term this a weak reward structure.

I'll go further and say that bikepack racing is a metaphor for our lives. These races are long enough that we can't plan them out in detail. The more detailed and rigid your plans, the less likely you'll be successful or enjoy the experience. Your experience is largely determined by your attitude and how you handle difficulty. Good gear helps, but beyond a certain point gear has no bearing. Just as in life, money helps but has no bearing beyond a certain point.

When you listen to successful endurance athletes, they talk about how they reach a point where they want to quit in most races. By standing on the start line, we accept that we will face difficulty. This is true in bikepack racing. This is also true in life. Avoidance is not a strategy.

I think any long-term success anyone has in bikepack racing is related to the strength of their reward structures. You can draw your own conclusions about life.

Reward structures are what we use to convince ourselves not to quit when we encounter difficulty.

The strength of your reward structure determines the level of difficulty and the duration you can tolerate the experience. Delayed gratification could be seen as a form of reward structure in our lives.

Weak reward structures can easily be broken. Strong reward structures don't break down. The best reward structures get stronger under load, but they are hard to find.

A weak reward structure will work as long as the level of difficulty is low, or the duration of a high level of difficulty is short. If you choose events that limit difficulty or duration, you'll be fine. This is why most people can finish a marathon, and even a 100km trail race. As slow as you might think you are, the event is over within the day. Bikepack racing can require 10+ days to complete. Weak reward structures are not going to work here. The duration is too long and the difficulty is too high.

Figure 1: The relationship between time and difficulty is expressed as an x-y plot. The ‘area under the curve’ can be thought of as an integral. The reward structure is your function. The area is a measure of how much difficulty you can sustain, and for how long. If we measure time in seconds (s) and tension in newtons (N), the area has units of N·s (newton-seconds). The newton-second is the unit of impulse. Impulse is the integral of a force over a time interval.
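The impulse analogy in the caption can be written out as a short formula. This is only a sketch of the analogy: the symbols J (impulse), F(t) (difficulty treated as a force at time t) and the interval from t_1 to t_2 are labels assumed here for illustration and do not appear in the original figure.

```latex
% Impulse as the area under a force-time curve.
% J, F(t), t_1 and t_2 are assumed symbols, not taken from the figure.
J = \int_{t_1}^{t_2} F(t)\,\mathrm{d}t
\qquad
[J] = \mathrm{N \cdot s}
```

Read this way, the same area can come from a low difficulty held for a long time or a high difficulty held briefly.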

Weak reward structures will fail you when you most need them. Weak reward structures are like platitudes. They're nice to have but mean nothing. Once you see a weak reward structure as insipid, trite and ultimately meaningless, you'll be less surprised that they don't work.

The process of accepting this idea will be difficult.

Once a reward structure is broken, you don't have a way to return to a positive point of view. Weak reward structures are easily broken because they are in the form of a binary outcome. That outcome is any easily definable feature. All binary outcomes fit this model. The easiest version to criticise is 'I am going to race until it is not fun'. As soon as it's not fun, you'll give up. You could replace fun with winning, any time or place goal, or a reward food (alcohol, processed carbohydrates). Why bother finishing if you won't win? Why bother finishing if you won't achieve a specific time? Why bother finishing if you won't make it to the pub?

An idea in ultra running is to have an A, B & C goal for a race. The usual A goal would be your best possible time or place. Your usual B goal is what you should be able to achieve, and your usual C goal is just to finish. Ultra running is obsessed with finishing. The concept is that you start out chasing your A goal until such time as that becomes unlikely, then you move to your B goal, then your C goal. Of course, you can go in the other direction as well. The best stories are of moving from your C goal to your A goal during a race.

This approach is a more complicated version of a weak reward structure. Each of your A, B & C goals is a binary outcome. They're weak because they're easy to break and, once broken, it's hard to return to a positive point of view. The stories of success seem to have nothing to do with the goals themselves; the outcome of the goal is the by-product. I don't think the idea translates to multi-week races; ultra running races are usually over in a day.

Some weak reward structures work for a while, even when my model suggests they might not. Their apparent success masks their inherent weakness.

Relying on charities as your motivation is a weak reward structure. Having a charity as your motivation is another form of a binary model. There's no difference between competing to support a charity and competing whilst you enjoy the race. What happens if the charity ceases to exist (it will at some point)? What if the charity does something you disagree with? What if the charity doesn't like you? What if the charity doesn't take your money? What do you do then, find another charity mid-race? You give up and go home.

Sponsors are also a weak reward structure for the same reasons. Both sponsors and charities are external structures that don't need you. You project a need they have never expressed. If Nike can sell shoes without Michael Jordan, whoever sponsors you will be just fine without you. Not convinced? Walk away and see how long it takes before they approach you with a better offer. You could be waiting a while.

The more you look at what people say, the more obvious it is that most people use weak reward structures. These people are surprised that when they fall apart, their reward structures also fall apart.

Family is a weak reward structure like charities. Using your family as a reward structure can work for a while. When your partner leaves you because you keep going away, how will that help you in a race? Maybe you race in anger to prove that you can succeed without them. Sounds fun. I hope you enjoy that experience. See how long this works for you. The obvious answer is to find a new partner. I've seen this. How many partners do you try? At some point, the answer is to get a pet.

Inspiring people is a weak reward structure; it's just a bit more make-believe than the others. Last time I checked, no one is sitting around waiting to be inspired by you. Don't believe me – go out and ask. Be specific and ask someone, 'Are you waiting for me to inspire you?'

When you look at the world, you see these repeating patterns. I often describe how I see the world as the way musicians hear music. It's just so obvious and clear. To continue the metaphor, it's like a loud drum being beaten endlessly.

I come back to the central idea of applying the scientific method.

The scientific method requires you to reject the null hypothesis only when there is overwhelming evidence against it, which is what lends support to the alternative hypothesis. You set the alternative hypothesis, then set the null hypothesis.

If the alternative hypothesis is that your family loves you because you finished a race, the null hypothesis or default position is that they don't love you if you don't finish the race. If you believe this is true, you probably need significant life changes and to file for divorce.

Find me a hypothesis that when the data support the hypothesis, the conclusion is that people are waiting for you to inspire them. Find me a hypothesis with data to support the idea that you're the reason the charity exists. Find me a hypothesis that might have data supporting the reason your family loves you is that you finished a race.

You can't.

You never will.

Most people will continue to use weak reward structures. They'll act surprised when they don't work at some point. That point is when you most need them. It's at the point when you're tired, sore and your gear choices are possibly not working that you're going to want to quit. It's when you want to quit that you need your reward structures to work.

If weak reward structures don't work, what does? Strong reward structures.

Unlike weak reward structures, strong structures don't break down. Strong reward structures are not binary; they have nuance and complexity. Strong reward structures are not trite and deterministic; they have meaning and, if you reflect upon them, allow you to see the situation you find yourself in differently.

The best reward structures are those that inspire you to try harder when you face difficulty. As a result of the difficulty, you emerge stronger and more resilient. This differs from the oft-quoted and incorrect cliché 'what doesn't kill you makes you stronger'. Talk to any victim of a car crash and you'll quickly realise how stupid this phrase sounds. Talk to anyone that's had cancer treatment - see if having your body pumped full of chemotherapy drugs has made them stronger. Yes, they survived, but they're not stronger. I've been fortunate enough to have had neither but have cared for people who have been in both these scenarios.

Weak and strong reward structures can be considered as akin to negative and positive feedback loops. As I use the term here, a negative feedback loop results in a decrease in output based on a signal. In this analogy, the signal is your reward structure, and the output is your capability (or probability) to finish. As you decrease the signal, you're less likely to finish. What makes these reward structures weak is that it is easy for the signal to be reduced. A positive feedback loop is the opposite. The signal increases the output. We all have positive feedback loops; few people link these to reward structures.

A classic positive feedback loop is that as you catch people, your performance improves. The signal is catching people; the output is your performance. The more people you catch, the more confidence you gain, the more you improve. What happens when you stop catching people? What happens when someone overtakes you? What happens if all the people you caught passed you? You're now exactly where you were but in a much worse mental state because you tied your reward structure to something that is weak. This example shows that once the input changes so does the output. A strong reward structure would not be influenced by your position in the race.

You'd ideally have more than one reward structure. You could consider reward structures as a multiple linear regression model. The reward structures are the independent variables (the right-hand side X1, X2 etc). The dependent variable is your likelihood of completing the race (the left-hand side Y). Weak reward structures would have a negative coefficient; strong reward structures would have a positive coefficient. The size of the coefficient reflects the influence. You could build your own model and estimate your own coefficients. Your model would predict what level of difficulty, and for what duration, you could sustain a positive mindset.
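A minimal sketch of what such a model could look like, assuming two reward structures scored between 0 and 1 across six past races. Every number, and the names `weak`, `strong` and `y`, are invented for illustration. The outcome is built from known coefficients so the fit is easy to check. A likelihood is bounded between 0 and 1, so a logistic model might be the more usual choice; the sketch stays linear to match the text.

```python
import numpy as np

# Hypothetical scores (0 to 1) for how heavily a rider leaned on each
# reward structure in six past races. All numbers are invented.
weak = np.array([0.9, 0.7, 0.8, 0.2, 0.1, 0.4])    # X1, e.g. "finish or die"
strong = np.array([0.1, 0.3, 0.5, 0.8, 0.9, 0.6])  # X2, e.g. "never here again"

# Invented outcome Y: a completion-likelihood score constructed from known
# coefficients (0.5, -0.3, +0.4) so the fitted values can be verified.
y = 0.5 - 0.3 * weak + 0.4 * strong

# Design matrix with an intercept column: Y = b0 + b1*X1 + b2*X2
X = np.column_stack([np.ones_like(weak), weak, strong])

# Ordinary least squares estimate of the coefficients
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef

print(f"intercept={b0:.2f}, weak={b1:.2f}, strong={b2:.2f}")
```

With real data the signs would not be known in advance; the point is that a negative coefficient marks a structure that lowers the likelihood of finishing and a positive coefficient marks one that raises it.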

A common point among those who pursue endurance athletics is the idea of 'pushing your limit(s)'. I consider this a weak reward structure. Go back to the definition of the feedback loops - does approaching your limit result in more output or less? Unsure? Go to the edge case. You'll stop if you push your limit to the point you can't continue. Pushing your limit is a negative feedback loop. Any reward structure based on the concept of fighting (yourself, others, the elements) is ultimately futile and will break down. The wind will not stop just because you have run out of energy hoping it will. I went through this stage. Many people sadly seem to get stuck here. The misguided belief is that you need to fight harder. Fighting might work occasionally but it's unsustainable. Your one-off example of success proves my point.

Another version of pushing your limit is comparing yourself to people who undertook hard activities based on the equipment and knowledge from an earlier time. Endurance cyclists in Australia refer to the 'overlanders' as a group to look up to. Undoubtedly, some of these trips are incredible, some of which have probably not been replicated. In making a comparison, we don't consider their reward structures. For many, their choice was a way to escape abject poverty. They were willing to give up their lives because what they had was not worth coming home to. In some cases, the early explorers simply had to continue. If you're going to come home to a reasonably comfortable life, you can't look at these people as a comparison because you have different reward structures.

I've previously stated that "My generosity is not to solve the problem for you but to lead you far enough to solve it yourself." Though most people want stories, I'd rather provide a model. Stories are weak; models are strong. Tips and hacks are weak; ideas are strong.

I rarely divulge my approach. You can use my approach as an example, not a template. Those that mimic my approach will find it doesn't work for them and then complain. I despise few aspects of society more than people living comfortable lives in rich countries complaining about things they could easily fix.

I have many reward structures that work for me. One which I will provide is the idea that I will probably never be here again.

In a difficult situation, I work through the mathematics of the likelihood of being here again and conclude it's probably zero. I have a PhD in applied statistics; I trust my judgement of numbers. You don't have this qualification, which is part of the reason this approach might not work for you. I will often try to make a reasonable calculation based on current plans. The more effort I put into the calculation, the more likely I'll end up at a probability approximating zero.
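The text does not spell out the calculation, so the following is only one way to see why more detail pushes the answer towards zero. It assumes that 'here' can be broken into conditions that recur independently. The conditions, their probabilities, and the function name `p_here_again` are all invented for illustration.

```python
from math import prod

# Invented probabilities that each condition recurs on some future outing.
factors = {
    "same place": 0.5,
    "same weather": 0.2,
    "same fitness and gear": 0.3,
    "same state of mind": 0.1,
}

def p_here_again(probabilities):
    """Joint probability, assuming the conditions are independent."""
    return prod(probabilities)

# Joint probability as each additional condition is taken into account.
values = list(factors.values())
running = [p_here_again(values[:n]) for n in range(1, len(values) + 1)]

for name, p in zip(factors, running):
    print(f"after adding '{name}': {p:.4f}")
```

Each extra condition multiplies in a number below one, so the more carefully 'here' is specified, the smaller the joint probability becomes.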

Then I ask myself whether I want to waste this moment I'll never get again. The answer is usually no.

This reward structure is strong for me because I believe in this idea the same way people believe in religion. I can't know if the statement is true, but I am willing to engage in the idea because it improves my life. This statement moves from being a reward structure to a guiding principle.

On Wednesday night, I was in my K1 despite the fact I am still not happy with the setup. I'd missed my start time and gone off after everyone else as I had been fiddling with my seat again. As I head up the dark end of the river, I become frustrated that my leg drive is not engaging correctly. For the first time, my foot is going numb. This is not the night I wanted.

At that moment, I, like you, have a choice.

I could be annoyed and spend the rest of the night in a bad mood. I've done this before and will no doubt again. Or, I could implement my reward structure. I could ask myself whether I will be 'here' again.

What is 'here'?

'Here' is a combination of a range of factors. 'Here' is the obvious – being frustrated by things not going right. 'Here' is more than just the simple view; 'here' is less obvious too. 'Here' is being by myself – I am not wasting energy racing, I can focus on every stroke. 'Here' is a dark river where I have no external cues, just the feel of the stroke and the moment of the boat. 'Here' is a Wednesday night in late May when I am not on the full distance course, and no one will care about my time. 'Here' is more than just these points.

The word 'here' means a lot to me. I choose this word carefully. That's why my reward structures are strong, not just because of the thought that goes into the wording but because of how I apply them.

You can train harder than me, you can have better genetics than me, but few people will ever come close to the effort I've put into forming a mental game in which you can't compete with me.

The idea that you think we're competing means I've already won. I am not competing with you, nor anyone else, and never will be. Sometimes I compete in what looks and feels like a race – we have numbers, I've paid an entry fee, and there are rules. I am still not competing against you. Grasp this idea in its complexity, and your life will change.

I've offered my reward structure of 'being here' as an example. I've suggested this is an example, not a template. Let's assume this works perfectly for you - will this be enough? No. You need a few reward structures, which, if expressed mathematically, would be linearly independent vectors (ideally orthogonal ones). Choosing reward structures that are linearly independent means that none of the reward structures (vectors) can be written as a linear combination of the others. Each reward structure makes a unique contribution and is not a repetition of the others.

You might limit yourself to three reward structures based on our three-dimensional view of the world. Let's go beyond three dimensions and think in an n-dimensional space when constructing reward structures. I've already suggested that you could build your own multiple linear regression model; you could also now calculate the dot product of your reward structures. For those interested, linear independence alone would not let us assume your vectors are free of multicollinearity: vectors can be linearly independent and still be highly correlated.
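A minimal sketch of the dot product check, assuming each reward structure can be scored on a few dimensions. The three vectors, the dimensions they sit on, and the helper `cosine` are hypothetical. A normalised dot product (cosine) of zero means two structures are orthogonal; a value near one means they largely repeat each other. The matrix rank shows how many of them are linearly independent.

```python
import numpy as np

# Hypothetical reward structures scored on three invented dimensions.
here_again = np.array([1.0, 0.0, 0.0])
second = np.array([0.0, 1.0, 0.0])
borrowed = np.array([0.9, 0.1, 0.0])  # mostly a copy of here_again

def cosine(u, v):
    """Dot product of u and v, normalised by their lengths."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print("here_again vs second:  ", cosine(here_again, second))
print("here_again vs borrowed:", cosine(here_again, borrowed))

# Rank = number of linearly independent vectors in the set.
# borrowed is 0.9*here_again + 0.1*second, so it adds nothing new.
rank = np.linalg.matrix_rank(np.vstack([here_again, second, borrowed]))
print("independent reward structures:", rank)
```

Here three structures are listed but only two are independent, which is the situation the text warns about when structures are borrowed from people in the same group.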

One of the obvious points of failure when developing reward structures is not only to use those of other people but to use them from a variety of people who would all be part of the same group e.g. bikepack racers in Australia. The natural tendency will be for these reward structures to be quite similar in ways that might not be obvious. Mathematically we'd test with the dot product of vectors. An easier tool of assessment is to consider the requirements for the concept of wisdom of the crowds. One of the properties required is independence, which is only possible if others don't know your opinion. In this group, opinions are well known so there is no independence.

The more your reward structures are unique to you and not based on other people, the more likely you'll have something useful. If another one of my reward structures was based around the probability of being at a specific point in time this would be a lot less useful than one that was different. I've lost count of how many I have. More than three.

The way I found my reward structures was not by looking for these directly, but by wanting to understand my capability. The reward structures I now use came out of those experiences: I'd look back on them and ask what led me to these points.

As just one example, I embraced the experience of being cold, alone and scared outside the heads in winter on a surf ski with 3m+ waves. No one asked me to go there, no one set the course, no one measured my time and gave me a participation medal. I didn't do this once or twice; I did this week after week for a year. I had this experience two years ago. I've mostly forgotten the experience because so much has happened since.

What I've left off is that I am not comfortable in the water. I still struggle to relax when swimming in a pool. Imagine how I feel with waves towering over me at sea.

Strong reward structures compound in ways we can't imagine.

Returning to the Lane Cove River and the question I posed myself: 'will I ever be here again?'

The answer is, hopefully, no. Now I have the choice to see this moment as a gift. I choose to accept this gift. The gift is the reminder, for some point in the future when my boat is perfect and I glide effortlessly through the water, that I had the chance to experience difficulty. I had the chance to work through this myself.

We see brightness, not because of light, but because of the contrast to the darkness. Our ability to be happy is directly proportional to our ability to be sad. By avoiding sadness, you rob yourself of what you often chase.

I've often said we'd all be happier if we simply accepted the gifts we're offered. Instead, I see people complaining about the gifts on offer and wanting different (better) gifts.

Imagine a world where you're grateful for anything offered to you.

The obvious metaphor is that of a young child. Had I spent the evening on the Lane Cove River complaining about my numb foot and poor leg drive, I would have been no different to the child having a tantrum because they didn't get the present they wanted at Christmas.

We scold the child but not ourselves. The hypocrisy is almost unbelievable.

I've never seen this idea of reward structures - accepting gifts - written about or discussed in endurance athletics. If you want a strong reward structure, you will have to work hard to find something meaningful to you.

Considering the time some people spend riding their bikes, I wonder what they do with this time if they don't think about ideas.

The Hard Way