Analysing and Refactoring a Recursive Function

At SQL Bits the wonders of networking in the SQL community showed it benefits, in a conversation with Jen Stirrup and Chris Testa-O’Neill mentioned that I was a math expert (his words not mine); later on Jen was talking to Chris Webb and he mentioned a problem he was with the performance of a recursive formula in Analysis Services and she then pointed him in my direction.

I was able to transform the recursive formula to one just based on Product and Sum of the values. Since I believe that a lot of Analytic systems start life in an excel spread sheet, so I decided to write-up some of the steps in the process of analysing and transforming these formulas.

The recursive formula in question is \text{CRoR}_t=\text{RoR}_t+\text{CRoR}_{t-1}+\text{LS}_t\text{RoR}_t\text{CRoR}_{t-1} where \text{LS}_t is either 1 for a Long Position or -1 for a Short Position for the period t and \text{RoR}_t is the percentage Profit or Loss for the period t.

Before I start the analysis it going to be helpful to give some examples of the values

Example 1: Long Position with 10% growth per period

Period 1 2 3 4
LS 1 1 1 1
RoR 0.1 0.1 0.1 0.1
CRoR 0.1 0.21 0.331 0.4641

Example 2: Short Position with 10% profit per period

Period 1 2 3 4
LS -1 -1 -1 -1
RoR 0.1 0.1 0.1 0.1
CRoR 0.1 0.19 0.271 0.3439

Example 3: Swapping Long and Short Positions

Period 1 2 3 4
LS 1 -1 1 -1
RoR 0.1 0.1 0.1 0.1
CRoR 0.1 0.19 0.309 0.3781
Period 1 2 3 4
LS -1 1 -1 1
RoR 0.1 0.1 0.1 0.1
CRoR 0.1 0.21 0.289 0.4179

As with all recursive formulas the end points have to be defined and they’re \text{RoR}_1 = 0 \text{ , } \text{CRoR}_1=0.

The first step was to factor the formula to reduce the number of terms giving \text{CRoR}_t=\text{RoR}_t+\text{CRoR}_{t-1}(\text{1}+\text{LS}_t\text{RoR}_t)

The next step is to shorten the names of the terms to save space when expanding the formulas.

\mathcal{C}_t = \text{CRoR}_t

\mathcal{R}_t = \text{RoR}_t

\mathcal{X}_t = (\text{1}+\text{LS}_t\text{RoR}_t)

Now we will look at the expansion of the first couple of terms to get an idea of the shape of the formula.

\mathcal{C}_1 = 0

\mathcal{C}_2 = \mathcal{R}_2 + \mathcal{C}_1 = \mathcal{R}_2

\mathcal{C}_3 = \mathcal{R}_3 + \mathcal{C}_2 \mathcal{X}_3 = \mathcal{R}_3 + \mathcal{R}_2 \mathcal{X}_3

\mathcal{C}_4 = \mathcal{R}_4 + \mathcal{C}_3 \mathcal{X}_4 = \mathcal{R}_4 + (\mathcal{R}_3 + \mathcal{R}_2 \mathcal{X}_3)\mathcal{X}_4 = \mathcal{R}_4 + \mathcal{R}_3 \mathcal{X}_4 + \mathcal{R}_2 \mathcal{X}_3 \mathcal{X}_4

A pattern has emerged so it easy to see that for \mathcal{C}_5 we will have \mathcal{R}_5 followed by \mathcal{R}_4\mathcal{X}_5 until \mathcal{R}_2 which will have the \mathcal{X} term from 3 to 5.

This is starting to look like the expansion of a simple series since the common term is (\text{1}+\text{LS}_t\mathcal{R}_t) a good place to start finding solution will be to set \text{LS}_t = 1 to allow the analysis on the simplified formula.

The product of our simplified term for period 4 is \displaystyle\prod_{t=2}^4 (1+\mathcal{R}_t) = (1 + \mathcal{R}_2)(1 + \mathcal{R}_3)(1 + \mathcal{R}_4) expanding and collecting the terms gives us 1+R_4 + \mathcal{R}_3 (1 + \mathcal{R}_4) + \mathcal{R}_2 (1 + \mathcal{R}_3)(1 + \mathcal{R}_4) substituting the approximated \text{X}_t = (1 + \mathcal{R}_t) gives us back 1 + \mathcal{R}_4 + \mathcal{R}_3 \text{X}_4 + \mathcal{R}_2 \text{X}_3 \text{X}_4 which is the same pattern as the formula for \mathcal{C}_4 except for the 1 at the start of the formula and the fact that we’ve ignored \text{LS}_t by setting it to 1 for all values.

Since the full definition for \mathcal{X}_t = (\text{1}+\text{LS}_t\text{RoR}_t) means that there will be no terms with \mathcal{R}_t without a matching \mathcal{LS}_t. We will need to find a function to apply to the generated formula to remove the leading 1 and the \text{LS}_t where required.

There are two methods we can use either we choose simple values to place in the formula which would be the numerical analysis of the formula or we can perform the same algebraic analysis we have used to get to this point.

Using the \mathcal{N}_p = \displaystyle\prod_{t=2}^p (\text{1}+\text{LS}_t\text{RoR}_t) function for the first couple of terms gives us

\mathcal{N}_1 = 0

\mathcal{N}_2 = (\text{1}+\text{LS}_2\text{RoR}_2)

\mathcal{N}_3 = (\text{1}+\text{LS}_2\text{RoR}_2)(\text{1}+\text{LS}_3\text{RoR}_3)=

\mathcal{N}_4 = (\text{1}+\text{LS}_2\text{RoR}_2)(\text{1}+\text{LS}_3\text{RoR}_3)(\text{1}+\text{LS}_4\text{RoR}_4)

To get \mathcal{N}_2 = \mathcal{C}_2 we need to divide by itself and times by \text{RoR}_2 which is

\mathcal{C}_2 = \mathcal{N}_2 \left(\frac{\text{RoR}_2}{\mathcal{C}_2}\right) = (\text{1}+\text{LS}_2\text{RoR}_2) *\left(\frac{\text{RoR}_2}{ (\text{1}+\text{LS}_2\text{RoR}_2)}\right) = \text{RoR}_2