Recap
Rebalancing
Testing
Getting To Know Your Inputs & Outputs
Adding assert() Calls
Renaming Things
Integrating Error Threshold
Rebalancing Conditionally
What changed
Next Up

Recap

In last week's article we corrected the calculation of transaction fees paid and incorporated a very simplistic funding fee example to showcase how to sum up different layers of costs, expressed as a risk-adjusted metric, for our backtest.

This week we're going to have a look at how to set an error threshold of 10% so we're not rebalancing our position each day but only if the difference between the ideal risk-scaled position and our actual position is bigger than the threshold. This should bring down trading costs further.

A quick note: going forward, we're going to change the "posting schedule" a bit. I'll still try to keep it at 1 article per week but I don't want to end up sacrificing focus on actual trading research and development just for the sake of "pushing content". I also don't want to end up spewing lots of words so the articles can be long just for the sake of it.

Back to topic!

Rebalancing

Right now, we update our position every day based on the current volatility level and forecast of the instrument we're trading. This leads to a lot of additional trading and therefore additional trading costs. To keep our trading volume in check - in the hopes of reducing trading costs - we can use other rebalancing approaches. But first of all, what is rebalancing?

To quote Investopedia: "Rebalancing refers to the process of returning the values of a portfolio's asset allocations to the levels defined by an investment plan. Those levels are intended to match an investor's tolerance for risk and desire for reward."

So how does our investment plan look?

That's easy, it's our C.O.R.E. Recipe! Let's have a look at it again, I've already added the rebalancing threshold:

Testing

Before we start messing with the code, I think it's worth taking a step back. Our current code is scattered across about 300 lines and is getting messier by the minute. It's hard to keep track of all the moving parts; small changes at the top of the file can cause changes in all kinds of places due to high coupling.

From now on, every time we make a change, we want to confirm that the rest of our application is still working correctly! For this we're going to use something called testing.

Testing is a fundamental development technique that helps ensure the application under development (or maintenance for that matter) keeps working as expected while you change it.

There's all kinds of testing and different agendas require different techniques. To be able to effectively (and correctly) test your application, you usually need to know the outputs of your program based on given inputs. After changing things and running your code again, you can then verify if the outputs are still correct and you didn't actually break things. If 10+2 in the context of your program should always equal 12, it's worth implementing a check that verifies that it still evaluates to 12 after you've made changes.
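
As a minimal sketch of that idea (the function name `add` is purely illustrative and not part of our backtest), a check for the 10+2 example could look like this:

```python
def add(a, b):
    """Toy stand-in for a piece of program logic."""
    return a + b

# Known input -> known output; this fails loudly if a later change breaks add()
assert add(10, 2) == 12
```

If someone later changes `add()` so that it no longer returns 12 for these inputs, the script stops with an AssertionError instead of silently producing wrong numbers.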

Getting To Know Your Inputs & Outputs

You might be asking yourself "what inputs is he talking about?" Well, in the context of our application we're mainly concerned with price data. So the current state of the historical EOD prices our datalab spits out is our input.

For what we're about to do next, we're going to use a very simplistic testing technique: using assert() calls to verify our code still works. It's probably worth noting that this technique goes down the drain very fast. We're relying on one specific set of inputs to verify their outputs. This means that our inputs need to stay the same for the assert() calls to be worth anything. If you want to, you can update the historical prices now, following what we did in last week's article. You don't have to. It doesn't really matter right now if our data is up to date.

What's really important to understand is that the historical EOD prices need to be frozen in one state while we use them as inputs for the assert() calls.
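
One way to notice that the frozen inputs have changed is to fingerprint them. The following is only a sketch under assumptions: the helper `fingerprint` and the sample rows are hypothetical and not part of the article's code, and in practice the bytes would come from whatever file or table holds the historical prices.

```python
import hashlib

def fingerprint(raw_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the raw price data."""
    return hashlib.sha256(raw_bytes).hexdigest()

# Hypothetical frozen EOD data (made-up rows, for illustration only)
frozen_prices = b"date,close\n2024-01-01,100.0\n2024-01-02,101.5\n"
reference = fingerprint(frozen_prices)

# Same bytes -> same fingerprint, so the asserts still compare like with like
assert fingerprint(frozen_prices) == reference

# One appended row -> different fingerprint, so the expected values are stale
updated_prices = frozen_prices + b"2024-01-03,102.0\n"
assert fingerprint(updated_prices) != reference
```

If the fingerprint changes, a failing assert() on the performance metrics says nothing about the code; the reference values simply need to be regenerated.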

Don't worry. We're going to ditch the asserts later and use more sophisticated and stable testing techniques. Right now they work just fine for our scenario and are very easy to use without any additional overhead.

But what about the outputs? What do we actually want to verify?

We're about to restructure and rename a bunch of things in our code. What we want to verify is that the performance metrics we've worked so hard to code up over the last few weeks don't change.

# Strategy Total Return 967.9519969221997
# Strategy Avg. Annual Return 66.38528351683632
# Strategy Daily Volatility 1.7778739911853931
# Strategy Sharpe Ratio 1.95444928453439

# Fees paid 1185.3404298579717
# Funding paid 3130.3644113437113

# Strategy Ann. Turnover 33.34622029131843

# Pre-Cost SR 2.043151770812106
# Post-Cost SR 1.9544492845343906
# Risk-Adjusted Costs 0.08870248627771549
# [...]

Adding assert() Calls

All we really have to do is wrap some assert() calls around the referenced variables and values at the end of our file:

assert (strat_tot_return == 967.9519969221997)
assert (strat_mean_ann_return == 66.38528351683632)
assert (strat_std_dev.iloc[-1] == 1.7778739911853931)
assert (strat_sr.iloc[-1] == 1.95444928453439)

assert (df['daily_usd_fees'].sum() == 1185.3404298579717)
assert (df['funding_paid'].sum() == 3130.3644113437113)

assert (ann_turnover == 33.34622029131843)

assert (rolling_pre_cost_sr.iloc[-1] == 2.043151770812106)
assert (rolling_post_cost_sr.iloc[-1] == 1.9544492845343906)
assert (strat_rolling_trading_costs_sr.iloc[-1] == 0.08870248627771549)
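
Exact `==` comparisons on floats can be brittle: a harmless reordering of arithmetic or a different library version may change the last few digits. As a hedged alternative (not what the article's script uses), a tolerance-based check with `math.isclose` from the standard library could look like this. The "recomputed" value below is made up to simulate floating-point noise.

```python
import math

expected_total_return = 967.9519969221997  # value printed by the script above

# Hypothetical result that differs only by floating-point noise
recomputed_total_return = expected_total_return + 1e-10

# An exact comparison trips over noise in the last digits ...
assert recomputed_total_return != expected_total_return

# ... while a tolerance-based comparison still passes
assert math.isclose(recomputed_total_return, expected_total_return, rel_tol=1e-9)

# A genuinely different result is still caught
assert not math.isclose(967.7822415004347, expected_total_return, rel_tol=1e-9)
```

The tolerance `rel_tol=1e-9` is an arbitrary choice for this sketch; pick whatever precision you actually care about.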

That's it! We're ready to go and change things up.

Renaming Things

This is a simple 2-step refactoring. We're going to first rename and reorder some things so it's easier to understand how we need to change the rebalancing behaviour to reflect our goal of a 10% error threshold.

Right now, we're assuming that each day our position needs to be rebalanced to match the ideal position, so let's start by renaming the variables involved in our trading simulation logic block:

# The current implementation
[...]
# simulate trading the signals
annual_cash_risk_target = trading_capital * annual_perc_risk_target
daily_cash_risk = annual_cash_risk_target / np.sqrt(trading_days_in_year)
for index, row in df.iterrows():
    notional_per_contract = (row['close'] * 1 * contract_unit)
    df.at[index, 'notional_per_contract'] = notional_per_contract

    daily_usd_vol = notional_per_contract * row['instr_perc_returns_vol']
    df.at[index, 'daily_usd_vol'] = daily_usd_vol

    units_needed = daily_cash_risk / daily_usd_vol
    df.at[index, 'units_needed'] = units_needed

    forecast = row['capped_forecast']
    pos_size_contracts = units_needed * forecast / 10
    df.at[index, 'pos_size_contracts'] = pos_size_contracts

    notional_pos = pos_size_contracts * notional_per_contract
    df.at[index, 'notional_pos'] = notional_pos

    if df.index.get_loc(index) > 0:
        prev_idx = df.index[df.index.get_loc(index) - 1]
        prev_contracts = df.at[prev_idx, 'pos_size_contracts']
        df.at[index, 'prev_contracts'] = prev_contracts

        if np.isnan(prev_contracts):
            prev_contracts = 0

        contract_diff = (pos_size_contracts - prev_contracts)
        df.at[index, 'contract_diff'] = contract_diff

        notional_traded = contract_diff * notional_per_contract
        df.at[index, 'notional_traded'] = notional_traded
        daily_usd_fees = abs(notional_traded) * fee
    else:
        daily_usd_fees = 0

    df.at[index, 'daily_usd_fees'] = daily_usd_fees
[...]

When renaming things, the end result depends solely on your creativity. A good rule of thumb is that names should carry inherent meaning and make it easier to understand what's happening, so we can grok it at a quick glance rather than following multiple lines of code - or even worse, having to jump around the whole codebase due to lots of indirection.

A quick little proxy for name generation is to just sketch out what you're interested in, be that variable names or flow of logic, and then think about whether you could be more specific to make it more understandable.

To work out when and how much we need to trade to rebalance according to our C.O.R.E. Recipe, we're interested in the following values:

the ideal position calculated based on volatility and forecast
our current position
how much our current position is deviating from the ideal position

In addition we're also interested in our notional exposure and notional traded for cost calculations.
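
To make these three values concrete, here is a small sketch that mirrors the arithmetic of the loop body above. The function `ideal_position` and all numbers are made up for illustration; only the formulas are taken from the existing code.

```python
def ideal_position(daily_cash_risk, close, contract_unit, perc_vol, forecast):
    """Ideal position in contracts, based on volatility and forecast."""
    notional_per_contract = close * contract_unit
    daily_usd_vol = notional_per_contract * perc_vol
    units_needed = daily_cash_risk / daily_usd_vol
    return units_needed * forecast / 10

# Made-up numbers for illustration only
ideal = ideal_position(daily_cash_risk=100.0, close=50.0, contract_unit=1,
                       perc_vol=0.02, forecast=10)
current = 90.0               # the position we currently hold
deviation = ideal - current  # how far we are from the ideal position

print(ideal, current, deviation)
```

Naming the three quantities explicitly like this is what the renaming below is aiming for.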

Currently the ideal position size is denoted in number of contracts and named pos_size_contracts. Selecting all occurrences in VS Code is easy: just select one occurrence and press CTRL + SHIFT + L, then start typing the new name ideal_pos_contracts (or whatever makes sense to you). While we're at it we can do the same for notional_pos and rename it to ideal_pos_notional.

Now run the script again. If everything works like before the terminal output shouldn't change:

# Strategy Total Return 967.9519969221997
# Strategy Avg. Annual Return 66.38528351683632
# Strategy Daily Volatility 1.7778739911853931
# Strategy Sharpe Ratio 1.95444928453439

# Fees paid 1185.3404298579717
# Funding paid 3130.3644113437113

# Strategy Ann. Turnover 33.34622029131843

# Pre-Cost SR 2.043151770812106
# Post-Cost SR 1.9544492845343906
# Risk-Adjusted Costs 0.08870248627771549

# Instr Daily Mean Return 0.003815858387315305
# Instr Daily Volatility 0.031071300570486668
# Instr Ann. Sharpe Ratio 2.346276814584308

If at any point a change introduces a bug that leads to different results, you'll see an AssertionError instead:

# [...]
# Traceback (most recent call last):
# [...]
#    assert (strat_tot_return == 967.9519969221997)
# AssertionError

Let's move on. To speed things up and keep reading time in check, I'm now going to rename everything and just present the end result. I will discuss any problems encountered while doing so afterwards. Here's our new loop:


# Renamed 'pos_size_contracts' to 'ideal_pos_contracts'
# Renamed 'notional_pos' to 'ideal_pos_notional'
# Renamed 'prev_contracts' to 'current_pos_contracts'
# Renamed 'daily_usd_fees' to 'fees_paid'
#
# Added some comments to segment logical blocks
#
# Moved all the df.at[index] insertions to the end of the loop for better readability of flow of trading logic
# kept the ones coupled to local if-clause scope within the if block but moved them to its end
# we're going to tackle these later
#
# simulate trading the signals
annual_cash_risk_target = trading_capital * annual_perc_risk_target
daily_cash_risk = annual_cash_risk_target / np.sqrt(trading_days_in_year)

df['rebalanced_pos_contracts'] = np.nan

for index, row in df.iterrows():
    # calculate units needed based on volatility and risk-target
    notional_per_contract = (row['close'] * 1 * contract_unit)
    daily_usd_vol = notional_per_contract * row['instr_perc_returns_vol']
    units_needed = daily_cash_risk / daily_usd_vol

    # adjust units needed based on forecast
    forecast = row['capped_forecast']
    ideal_pos_contracts = units_needed * forecast / 10
    ideal_pos_notional = ideal_pos_contracts * notional_per_contract

    # simulate trading
    if df.index.get_loc(index) > 0:
        prev_idx = df.index[df.index.get_loc(index) - 1]
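        # pure rename at this stage: still reads yesterday's ideal position;
        # we switch to 'rebalanced_pos_contracts' in the next section
        current_pos_contracts = df.at[prev_idx, 'ideal_pos_contracts']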

        if np.isnan(current_pos_contracts):
            current_pos_contracts = 0

        contract_diff = ideal_pos_contracts - current_pos_contracts
        notional_traded = contract_diff * notional_per_contract
        fees_paid = abs(notional_traded) * fee

        df.at[index, 'current_pos_contracts'] = current_pos_contracts
        df.at[index, 'contract_diff'] = contract_diff
        df.at[index, 'notional_traded'] = notional_traded
    else:
        fees_paid = 0

    df.at[index, 'notional_per_contract'] = notional_per_contract
    df.at[index, 'daily_usd_vol'] = daily_usd_vol
    df.at[index, 'units_needed'] = units_needed
    df.at[index, 'ideal_pos_contracts'] = ideal_pos_contracts
    df.at[index, 'ideal_pos_notional'] = ideal_pos_notional
    df.at[index, 'fees_paid'] = fees_paid

Integrating Error Threshold

The new implementation needs to keep track of our current position separately from the ideal position, because we're only going to change our exposure when we exceed the threshold. To do that we introduce a new column, rebalanced_pos_contracts, and base our calculations on it.

[...]
# simulate trading
if df.index.get_loc(index) > 0:
    prev_idx = df.index[df.index.get_loc(index) - 1]
    current_pos_contracts = df.at[prev_idx, 'rebalanced_pos_contracts']

    if np.isnan(current_pos_contracts):
        current_pos_contracts = 0

    contract_diff = ideal_pos_contracts - current_pos_contracts

    rebalanced_pos_contracts = ideal_pos_contracts
    notional_traded = contract_diff * notional_per_contract

    fees_paid = abs(notional_traded) * fee

    [...]
    df.at[index, 'rebalanced_pos_contracts'] = rebalanced_pos_contracts
    [...]
else:
    fees_paid = 0


# Calculating Turnover
[...]
actual_positions = df['rebalanced_pos_contracts'].resample(resample_period).last()
[...]

# Calculating Performance
strat_raw_usd_returns = df['rebalanced_pos_contracts'].shift(1) * df['close'].diff()
[...]

Rebalancing Conditionally

Next, we're going to introduce our conditional logic to only trade if we exceed the threshold. To calculate how much we're deviating from the ideal position, we can simply divide the absolute difference in contracts between our current and ideal position by the absolute ideal position and express it as a percentage:

[...]
rebalance_err_threshold = 10  # percentage

[...]
contract_diff = ideal_pos_contracts - current_pos_contracts
contract_deviation = abs(contract_diff) / abs(ideal_pos_contracts) * 100
df.at[index, 'contract_deviation'] = contract_deviation
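
A quick worked example of the deviation formula, as a standalone sketch. The function name `contract_deviation` and the handling of a zero ideal position are assumptions of this sketch and are not in the article's code. With plain Python floats, dividing by a zero ideal position raises a ZeroDivisionError, while NumPy floats return inf or nan with a warning, so the edge case is worth deciding on explicitly.

```python
def contract_deviation(ideal_pos_contracts, current_pos_contracts):
    """Deviation of the current from the ideal position, in percent of the ideal."""
    contract_diff = ideal_pos_contracts - current_pos_contracts
    if ideal_pos_contracts == 0:
        # Assumption for this sketch: any open position deviates "infinitely"
        # from a flat ideal position, a flat position not at all
        return float("inf") if contract_diff != 0 else 0.0
    return abs(contract_diff) / abs(ideal_pos_contracts) * 100

print(contract_deviation(2.0, 1.9))  # about 5 percent: inside a 10% threshold
print(contract_deviation(2.0, 1.5))  # 25.0: outside a 10% threshold
print(contract_deviation(0.0, 1.5))  # inf: ideal is flat but we still hold contracts
```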

Remember to run your script after every change and check that the assert() calls still pass! If they don't, you messed something up. Nothing a few CTRL + Z cycles won't fix though.

The only thing left to do now is checking if we exceed the threshold and then trade conditionally:

[...]
# simulate trading
if df.index.get_loc(index) > 0:
    prev_idx = df.index[df.index.get_loc(index) - 1]
    current_pos_contracts = df.at[prev_idx, 'rebalanced_pos_contracts']

    if np.isnan(current_pos_contracts):
        current_pos_contracts = 0

    contract_diff = ideal_pos_contracts - current_pos_contracts
    contract_deviation = abs(contract_diff) / abs(ideal_pos_contracts) * 100

    if contract_deviation > rebalance_err_threshold:
        rebalanced_pos_contracts = ideal_pos_contracts
        notional_traded = contract_diff * notional_per_contract
    else:
        rebalanced_pos_contracts = current_pos_contracts
        notional_traded = 0 * notional_per_contract

    fees_paid = abs(notional_traded) * fee
    [...]
else:
    fees_paid = 0
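
To see the effect of the threshold in isolation, here is a stripped-down sketch of the same rule applied to a short, made-up series of ideal positions. The function `simulate` and the sample path are hypothetical; the real loop works on the DataFrame and also tracks notional traded and fees.

```python
def simulate(ideal_positions, threshold_pct):
    """Return (held positions, number of trades) for a series of ideal positions."""
    current = 0.0
    held, trades = [], 0
    for ideal in ideal_positions:
        diff = ideal - current
        if ideal == 0:
            # Assumption: a flat ideal position always triggers closing out
            deviation = float("inf") if diff != 0 else 0.0
        else:
            deviation = abs(diff) / abs(ideal) * 100
        if deviation > threshold_pct:
            current = ideal  # rebalance to the ideal position
            trades += 1
        held.append(current)
    return held, trades

ideal_path = [1.00, 1.05, 1.08, 1.20, 1.22, 1.00]  # made-up ideal positions
held, trades = simulate(ideal_path, threshold_pct=10)
print(held)    # [1.0, 1.0, 1.0, 1.2, 1.2, 1.0]
print(trades)  # 3
```

Rebalancing daily on the same path would trade on all six days, so on this toy series the 10% threshold halves the number of trades.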

Run your script again and... wait a minute, we get an AssertionError:

# [...]
# Traceback (most recent call last):
#   [...]
#     assert (strat_tot_return == 967.9519969221997)
# AssertionError

This is fine!

The assert() calls can only shield us from outputs changing for a fixed set of inputs and unchanged logic. They don't test the relationships between inputs and outputs. Since we deliberately changed the trading logic, the outputs changed too! assert() calls are a nice way to protect yourself from breaking things when you're solely renaming or reordering code and have no test suite in place. We'll have a look at how to build a proper test harness for our application later. For now you can either update the values checked in the assert() calls or simply comment them out.
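
To illustrate what testing a relationship rather than a fixed value could look like, here is a small sketch. It is not the test harness we'll build later; the single-step helper `rebalance` and the sample path are hypothetical, and the behaviour for a zero ideal position is an assumption.

```python
def rebalance(ideal, current, threshold_pct):
    """Return the position held after applying the threshold rule once."""
    if ideal == 0:
        return ideal  # assumption: always go flat when the ideal position is flat
    deviation = abs(ideal - current) / abs(ideal) * 100
    return ideal if deviation > threshold_pct else current

ideal_path = [1.00, 1.05, 1.08, 1.20, 1.22, 1.00]  # made-up ideal positions

# Relationship 1: with a threshold of 0 we rebalance on any deviation,
# which reproduces the old daily rebalancing exactly
current = 0.0
for ideal in ideal_path:
    current = rebalance(ideal, current, threshold_pct=0)
    assert current == ideal

# Relationship 2: with a 10% threshold the held position never drifts
# more than 10% away from the ideal position
current = 0.0
for ideal in ideal_path:
    current = rebalance(ideal, current, threshold_pct=10)
    assert abs(ideal - current) / abs(ideal) * 100 <= 10
```

Checks like these keep passing when the prices are updated, because they don't depend on one frozen set of outputs.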

What changed

# [...]
# Strategy Total Return 967.7822415004347
# Strategy Avg. Annual Return 66.37364114010873
# Strategy Daily Volatility 1.811027584367656
# Strategy Sharpe Ratio 1.918333652233208

# Fees paid 1038.6238698915147
# Funding paid 3130.3644113437113

# Strategy Ann. Turnover 37.672650094739545

# Pre-Cost SR 1.9914208916281093
# Post-Cost SR 1.918333652233208
# Risk-Adjusted Costs 0.07308723939490136

# Instr Daily Mean Return 0.003815858387315305
# Instr Daily Volatility 0.031071300570486668
# Instr Ann. Sharpe Ratio 2.346276814584308

At first glance we can see that our turnover, pre-cost SR and post-cost SR changed, and that fees paid dropped from ~1185 to ~1039. Turnover changed because we're using our avg_positions to calculate it, and we changed how we construct our positions. What's more interesting to note is that we brought our Risk-Adjusted Costs (pre-cost SR - post-cost SR) down from ~0.0887 to ~0.0731 SR units, an improvement of roughly 17.6%!
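
The relative improvement can be recomputed from the unrounded script outputs before and after the change. The helper `pct_reduction` is just a convenience for this sketch:

```python
costs_before = 0.08870248627771549  # Risk-Adjusted Costs, daily rebalancing
costs_after = 0.07308723939490136   # Risk-Adjusted Costs, 10% threshold
fees_before = 1185.3404298579717    # Fees paid, daily rebalancing
fees_after = 1038.6238698915147     # Fees paid, 10% threshold

def pct_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100

print(round(pct_reduction(costs_before, costs_after), 1))  # 17.6
print(round(pct_reduction(fees_before, fees_after), 1))    # 12.4
```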

The current implementation is kind of "hacky" and ugly. There are other, better ways to do this and we'll come to that. Right now the code works and that's all we need.

The full code can be found in this week's GitHub repository.

Next Up

This is it for this week's article. We still have a lot on our list. Next we'll tackle things like incorporating slippage & bid-ask spread and backfilling funding, while adding more rigorous testing, before we can finally use our backtest for real research, which is going to be a lot of fun.

So long, happy trading!
