## Wednesday, 26 December 2018

## Friday, 21 December 2018

### Forward Initial Margin and multiple layers of AD

In the previous blog, I presented the application of AD to the question of Initial Margin (or Capital) attribution between desks in risk-weight based measures. In this installment, I incorporate this feature into a Monte Carlo forward IM computation mechanism. The Monte Carlo forward IM is one of the approaches to compute the Margin Valuation Adjustment (MVA). The full MVA also requires the introduction of the cost of funding (the IM) and the discounting; the funding will be the focus of the next installment.

The steps to obtain the forward IM in a Monte Carlo approach for interest rates in risk-weight based measures are the following:

- Calibrate a multi-curve framework and a dynamic model
- Evolve the curves to a sample of future dates using random scenarios
- For each date and scenario, compute the sensitivities (market quotes deltas or bucketed PV01) of the portfolio
- Compute the IM for each counterparty based on the sensitivities and apply the sub-portfolio attribution (see last blog)

### Calibration

The calibration of a multi-curve framework from market quotes is a standard procedure. I refer to my book on the multi-curve framework, Chapter 5, for the details. Note that Algorithmic Differentiation (AD) is already important at this stage. The calibration procedure is often done using a root-finding algorithm of the Newton type. This requires the computation of the gradient of the market quotes function. This is done efficiently with AD. A multi-curve dynamic model is required and similarly it needs to be calibrated to the market.
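To make the Newton step concrete, here is a minimal sketch under strong simplifying assumptions: a single curve instead of a multi-curve framework, annual par swaps as the only market quotes, and discount factors as the curve parameters. The names `par_rates`, `par_rates_jacobian` and `calibrate` are hypothetical, the quotes are made up, and the Jacobian is written by hand where a real implementation would obtain it by AD.

```python
import numpy as np

def par_rates(p):
    """Par rates of annual swaps maturing in 1..n years (toy single-curve setup).

    p[m] is the discount factor for year m+1 and the quote is
    S_m = (1 - P_m) / A_m with annuity A_m = P_1 + ... + P_m.
    """
    return (1.0 - p) / np.cumsum(p)

def par_rates_jacobian(p):
    """Jacobian dS_m/dP_j, written by hand where AD would be used in practice."""
    n = len(p)
    annuity = np.cumsum(p)
    rates = (1.0 - p) / annuity
    # Through the annuity: dS_m/dP_j = -S_m / A_m for every j <= m.
    jac = np.tril(np.tile((-rates / annuity)[:, None], (1, n)))
    # Through the numerator: an extra -1 / A_m on the diagonal.
    jac[np.diag_indices(n)] -= 1.0 / annuity
    return jac

def calibrate(quotes, tol=1e-12, max_iter=20):
    """Newton iteration p <- p - J(p)^{-1} (S(p) - quotes)."""
    p = np.full(len(quotes), 0.95)  # arbitrary starting guess
    for _ in range(max_iter):
        residual = par_rates(p) - quotes
        if np.max(np.abs(residual)) < tol:
            break
        p = p - np.linalg.solve(par_rates_jacobian(p), residual)
    return p

# Made-up market quotes for maturities 1 to 5 years.
quotes = np.array([0.020, 0.022, 0.025, 0.027, 0.030])
discount_factors = calibrate(quotes)
print(discount_factors)
print(par_rates(discount_factors) - quotes)  # residuals after calibration
```

The point of the sketch is only that each Newton step needs the full Jacobian of the quote function, which is the quantity AD provides cheaply when the number of curve nodes grows.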

For this blog, I'm using a hybrid multi-curve model, as described in the recent working paper Hybrid Model: A Dynamic Multi-Curve Framework. This is a relatively simple model that can be calibrated to the term structure of volatilities and includes a stochastic basis between LIBOR rates and OIS rates. This feature will be important when discussing cost of funding in the next instalment.

### Evolution

To evolve the curves, we use very standard techniques. The model describes the curves (OIS discount factors and LIBOR processes) at a future date in an explicit way based on Gaussian distributions. It is easy to obtain the value of discount factors and LIBOR processes at those forward dates.

### Sensitivities

The next step, obtaining the sensitivities of the portfolio with respect to the market quotes in the forward scenario, is probably less standard in derivative pricing. Most of derivative valuation is based on values and cash flows, not on risk measures. The technical requirements are not very different: we have a model-evolved curve and we want to compute a result that depends on those curves. But there is an extra catch: what we need for the risk-weight based measure is the sensitivity with respect to the market quotes selected by the methodology, not to arbitrary model parameters. Those market quotes are not provided directly by the model. Fortunately, even if this is not something that we do explicitly in many places, it is something we do implicitly. With a variation of AD techniques this can be implemented efficiently. This can be obtained by a mixture of chapters 5 and 6 ("Derivatives to non-inputs and non-derivatives to inputs" and "Calibration") from my Algorithmic Differentiation in Finance Explained book.
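The idea of moving from parameter sensitivities to market quote sensitivities can be illustrated with a small sketch. It is only a toy version under my own assumptions: a single curve described by discount factors, annual par swap rates as the methodology's market quotes, and a portfolio of fixed cash flows. All names and numbers are hypothetical, and the hand-written Jacobian stands in for what AD would provide. The sensitivities to quotes are obtained from the chain rule, by solving a linear system with the transposed market quote Jacobian.

```python
import numpy as np

def par_rates(p):
    """Market quotes implied by the curve: S_m = (1 - P_m) / (P_1 + ... + P_m)."""
    return (1.0 - p) / np.cumsum(p)

def par_rates_jacobian(p):
    """dq/dp, hand-written here; in practice this is where AD would be used."""
    n = len(p)
    annuity = np.cumsum(p)
    rates = (1.0 - p) / annuity
    jac = np.tril(np.tile((-rates / annuity)[:, None], (1, n)))
    jac[np.diag_indices(n)] -= 1.0 / annuity
    return jac

def bootstrap(quotes):
    """Rebuild the curve from quotes, used only to check the result by bumping."""
    p = np.empty(len(quotes))
    annuity = 0.0
    for m, rate in enumerate(quotes):
        p[m] = (1.0 - rate * annuity) / (1.0 + rate)
        annuity += p[m]
    return p

# Curve produced by the model in one scenario (made-up discount factors).
p = np.array([0.980, 0.955, 0.930, 0.900, 0.865])
# Made-up portfolio of fixed cash flows, so that PV = cash_flows . p.
cash_flows = np.array([2.0, 2.0, -3.0, 4.0, 104.0])

dpv_dp = cash_flows                      # sensitivities to curve parameters
dq_dp = par_rates_jacobian(p)            # market quote Jacobian
# Chain rule: dPV/dq = dPV/dp . (dq/dp)^{-1}, i.e. solve (dq/dp)^T x = dPV/dp.
dpv_dq = np.linalg.solve(dq_dp.T, dpv_dp)

# Check by bump-and-rebuild on the first quote.
quotes = par_rates(p)
bump = 1e-6
bumped = quotes.copy()
bumped[0] += bump
fd_first = (cash_flows @ bootstrap(bumped) - cash_flows @ p) / bump
print(dpv_dq[0], fd_first)
```

The quotes themselves are never an input of the model: they are implied from the evolved curve, and only their Jacobian is needed to translate the sensitivities.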

### Attribution

Once the sensitivities are computed for each trade at each date and for each scenario, the measure and its attribution by portfolio are obtained by simply applying the techniques described in the previous blog.

### Figures

A picture is worth a thousand words. So let's put the above ideas in pictures.

First we take only one Monte Carlo scenario and look at the attribution. I have selected a small portfolio with one counterparty containing 30 swaps split between 6 sub-portfolios. The attribution is done on the different sub-portfolios.

The total IM is represented in red. The attribution is done using the Euler method described in the previous blog. With this attribution method, the offsets between positions are taken into account. This explains why some desks (Desk 2 and Desk 6) have negative attributions.

**Figure 1: Forward IM attribution between desks.**

What we can also see is that the relation between the different attributions changes through time. Today, desk 1 is the largest, while through time, desk 2 becomes the largest and even the only meaningful contributor after 8 years. This emphasises that an MVA attribution based only on today's spot IM attribution would not provide a fair representation of the contributions. The attribution along the path is really required.

Once we have done the attribution on one path, we can look at how the forward IM behaves along the different paths. In this case, we have kept the IM methodology unchanged through the life of the portfolios. In practice, the model parameters are reviewed on a regular basis (at least annually in the case of SIMM). We should introduce changes of methodology along the paths. This is not done here and may be the subject of another blog at a later stage.

An example with 7 paths is proposed in the graph below. We use only a small number of paths to avoid overloading the pictures. The performance analysis will be done with more paths.

**Figure 2: Forward IM along different paths**

The least we can say is that the graph is underwhelming. This can be explained easily, as our portfolio contains only vanilla swaps, the present values of which are almost linear in the underlying market quotes. As the IM methodology is sensitivity (first order derivatives) based, the IM numbers do not change significantly from one path to another. This does not mean that multiple paths are unnecessary for MVA, as we will discuss in the next blog.

Obviously a financial institution will have more than one counterparty. The next figure reproduces the example of the forward computation with three different counterparties.

**Figure 3: Forward IM for different counterparties**

### Performance

What is the performance of such an implementation and where are the bottlenecks?

We have run the above approach on a portfolio of 90 swaps split between 3 counterparties and 6 sub-portfolios. The horizon is 11 years with semi-annual dates and 101 paths. The total computation time (1) was 18 s. The split is:

- Calibration in 340 ms.
- Loaded trades in 88 ms.
- Path random variables in 11 ms.
- Paths fixings in 348 ms.
- IMs in 17430 ms.

The first line is the original calibration of the curves from market data stored in CSV files. The second line is the loading of the portfolio from a CSV file. The model is a two-factor model based on Gaussian distributions; generating the underlying random variables took 11 ms. As the trades we want to value age through the different dates, for each path we need to generate a full time series of fixings consistent with the model used, not only on the path dates but on all intermediary dates, as a swap can have a fixing at any date. As we have OIS in the portfolio, in practice every single date is required. That is also relatively quick (348 ms).

As we expect, the bulk of the time is spent computing the sensitivities and combining them in the IM. One of the time-consuming tasks is computing the market quote Jacobian matrices required to obtain the sensitivities to market quotes, even if the model does not provide the market quotes directly. In our example we use two curves (OIS and LIBOR) with 12 nodes each. Computing the Jacobian is similar to computing the parameter sensitivities of 24 swaps, each with respect to 24 nodes, and inverting a matrix. The inversion itself is almost irrelevant in terms of computation time. We are left with the parameter sensitivities. In our implementation, this is done by AD and it takes around 6 PV times, while a finite difference would take 25 PV times (one base valuation plus 24 bumps). This is a gain of a factor 4 for this task.

Then there are the sensitivities of the 90 swaps in the portfolio. The computation time was around twice the Jacobian computation time. The swaps in the portfolio are not all long term, so the ratio with the Jacobian is of the right order. As for the previous element, the gain here is probably a factor 4 thanks to AD. This emphasises that, for computational efficiency reasons, it is better to run the simulation for all counterparties in one run: one of the time-consuming tasks, the Jacobian computation, is common to all counterparties. Note that the representation of the swap here is the full representation with all the conventions, holidays and idiosyncratic details.

We come finally to the object of the previous blog, which was attribution. The computation of the IM itself, the marginal of each exposure and the attribution to 6 sub-portfolios took around 10 times less computation time than the Jacobian. The use of AD here has probably brought a gain of a factor 2 or 3, but this is almost inconsequential as the IM computation time from the sensitivities is dwarfed by the computation time of the sensitivities.

### Conclusions

On the performance side, the computation of forward IM using a risk-weight based methodology through a Monte Carlo approach is feasible in reasonable time. The AD implementation brings real benefits. The more curves are involved, the more benefits it will bring. The measure computation from the sensitivities is itself relatively fast and improvements to that computation are almost invisible in the final computation time.

On the business side, doing the attribution at each forward date is important to attribute the MVA correctly. A simple attribution based on the spot IM would provide unreliable results.

In forthcoming blogs we will look at the cost of funding, the change of the IM methodology parameters through time and the computation of marginal MVA.

**(1)** Time computed on the author's laptop (MacBook Pro 13", 3.1 GHz Intel Core i5). Personal Java code on a single thread.

## Wednesday, 12 December 2018

### Initial margin and double AD

The use of Algorithmic Differentiation (AD) in finance has become more popular in the last 5 to 10 years. AD can be described as "the art of calculating the differentiation of functions with a computer". An introduction to AD in finance can be found in my book with the same title.

The efficient computation of derivatives has been traditionally used in finance to compute the "*greeks*" associated to financial instruments and in particular the deltas or bucketed PV01.

Recent regulations have pushed in the direction of more computation of cost of capital for market risk (FRTB) and Initial Margin for uncleared trades (Uncleared Margin Regulation - UMR). The method used by most financial institutions already under the UMR is the ISDA proposed SIMM™ approach. The approach is very similar to FRTB capital computation with some small twists. The base idea of both is to compute a VaR-like number based on conventional risk weights and correlations. This is equivalent to a delta-normal VaR computation in the RiskMetrics style but with variance-covariance matrix in a stylized format with prescribed values. I will use the generic term of risk-weight based measure for those capital or IM methodologies.

The "*delta*" part of those methodologies relies on the computation of PV01. This is where AD has been traditionally used in finance. This is the first layer of AD related to risk-weight based measure methodologies. As this is relatively standard, I will not focus on this aspect in this blog.

### Marginal measure

A second topic for which Algorithmic Differentiation can bring significant improvements is the topic of marginal risk measure and measure attribution. The marginal measure is the increase in the measure coming from adding a small sensitivity (or trade) to the existing portfolio. This is the derivative of the measure with respect to an increase in the sensitivity/exposure. This marginal measure can be computed at the single sensitivity level, at the trade level or at the level of any combination of trades. In the rest of the blog, I will consider the marginal measure at the most atomic level of our problem, the level of a single sensitivity. Obviously if the marginal measure is available at the lowest level, the marginals can be combined to obtain the marginals at any level above that. From a computational perspective, the lowest level of marginals is the most expensive and if we can solve it cheaply, then we can solve any other combination cheaply also.
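As a minimal sketch of what "marginal at the single sensitivity level" means, take a stylised risk-weight based measure of the delta-normal type, the square root of a quadratic form in the weighted sensitivities. This is my simplification for illustration, not the SIMM formula, and the weights, correlations and sensitivities are made up. The gradient is written by hand where AD would be used; it is compared with one-sided finite differences, which need one extra measure evaluation per exposure, and then combined into the marginal of a hypothetical new trade.

```python
import numpy as np

def measure(s, weights, corr):
    """Stylised risk-weight based measure: sqrt((w*s)^T C (w*s))."""
    ws = weights * s
    return float(np.sqrt(ws @ corr @ ws))

def measure_and_marginals(s, weights, corr):
    """Measure and its gradient with respect to every single sensitivity."""
    ws = weights * s
    c_ws = corr @ ws
    value = float(np.sqrt(ws @ c_ws))
    # d(measure)/d(s_j) = w_j * (C (w*s))_j / measure, since C is symmetric.
    return value, weights * c_ws / value

n = 6
corr = np.full((n, n), 0.3) + 0.7 * np.eye(n)   # made-up correlations
weights = np.array([50.0, 55.0, 60.0, 65.0, 70.0, 75.0])
s = np.array([1200.0, -800.0, 500.0, 300.0, -150.0, 900.0])

value, marginals = measure_and_marginals(s, weights, corr)

# Finite difference: n additional measure computations for n marginals.
eps = 1e-3
fd = np.array([(measure(s + eps * np.eye(n)[j], weights, corr) - value) / eps
               for j in range(n)])

# Marginal of a new trade: its sensitivities weighted by the atomic marginals.
trade = np.array([10.0, 0.0, -5.0, 0.0, 0.0, 2.0])
trade_marginal = float(marginals @ trade)

print(marginals)
print(fd)
print(trade_marginal)
```

Once the atomic marginals are available, the marginal of any trade or group of trades is a dot product, which is why solving the lowest level cheaply solves the levels above it.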

### Euler attribution

The marginal measure is also closely linked to one standard method of attribution, the attribution method called "*Euler attribution*".

In general, a measure (Capital or IM) attribution between sub-portfolios is a way to divide, in an additive way, the total measure of a portfolio between different sub-portfolios.

The Euler attribution is based on Euler's homogeneous function theorem. The theorem provides an equality for positively homogeneous functions (of degree one). The standard approaches to capital, IM and VaR are in most cases positively homogeneous. This is the case for FRTB, SIMM (below the concentration risk threshold) and Delta-Normal VaR.

What are we trying to do with attribution? We start with a portfolio made of sub-portfolios. We have $k$ sub-portfolios denoted $P_i$ and the total portfolio is $P$:

$$P = \sum_{i=1}^{k} P_i = \sum_{i=1}^{k} 1 \times P_i$$

We want to split the measure for the total portfolio in an additive way between the different sub-portfolios. We cannot use directly the measure of each sub-portfolio as the measure itself is not additive.

The following equality, called *Euler's homogeneous function formula*, is satisfied for positively homogeneous functions:

$$f(x) = \sum_{i=1}^{k} x_i \, D_i f(x)$$

We have a function which represents the measure $\mu$ on portfolios:

$$f(X) = f\big((X_i)_{i=1,\ldots,k}\big) = \mu\Big(\sum_{i=1}^{k} X_i \times P_i\Big)$$

The measure applied to the total portfolio is

$$\mu(P) = f(1,1,\ldots,1)$$

Euler's theorem suggests an attribution based on

$$\mu(P) = \sum_{i=1}^{k} 1 \times D_i f(1,1,\ldots,1)$$

One of the reasons this attribution is used is that it takes into account the offsets between sub-portfolios.

If you have the derivatives of the function $f$ with respect to each individual sensitivity in the sub-portfolios, obtaining $D_i f(1,1,\ldots,1)$ is simply the question of adding numbers for the sensitivities in the sub-portfolio.

### Performance example

What is the performance in practice of this method combined with AD? For this I have used a simple portfolio with 20 sub-portfolios and 500 exposures each. This is a total of 10,000 exposures. The measure selected is an IM computed using the SIMM methodology.

If we compute (1) a single IM for the portfolio (10,000 exposures), the computation time is 3.4 ms. If we were to compute the marginal IM of each exposure by finite difference, it would multiply the computation time by 10,000 (34,000 ms). If we were to compute the marginal IM for each sub-portfolio by finite difference, it would multiply the computation time by 21 (71.4 ms).

What do we obtain by Algorithmic Differentiation? For the above portfolio, the time required for the measure, all the 10,000 marginal IMs and the attribution to the 20 sub-portfolios is 10.3 ms. Obtaining all those 10,000+ figures multiplies the computation time only by 3. This is in line with the theory (on the good side of the range). This is more than 3,000 times faster than by finite difference!
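The structure of that computation (measure, all marginals and the Euler attribution in a single pass) can be sketched on a tiny example. This is an illustration under my own assumptions, not the SIMM code that was timed: the measure is a stylised square root of a quadratic form, the function name `euler_attribution` and all numbers are hypothetical, and the gradient is hand-written where AD would be used. Each sub-portfolio receives the sum of sensitivity times marginal over its own exposures, which is $D_i f(1,1,\ldots,1)$.

```python
import numpy as np

def euler_attribution(s, weights, corr, groups, n_groups):
    """Measure, marginals per exposure and Euler attribution per sub-portfolio."""
    ws = weights * s
    c_ws = corr @ ws
    total = float(np.sqrt(ws @ c_ws))
    marginals = weights * c_ws / total          # d(measure)/d(s_j)
    # D_i f(1,...,1): add s_j * marginal_j over the exposures of sub-portfolio i.
    attribution = np.bincount(groups, weights=s * marginals, minlength=n_groups)
    return total, marginals, attribution

# Two sub-portfolios with two exposures each; the second offsets the first.
s = np.array([100.0, 50.0, -20.0, -10.0])
weights = np.ones(4)
corr = np.full((4, 4), 0.5) + 0.5 * np.eye(4)   # made-up correlations
groups = np.array([0, 0, 1, 1])                 # sub-portfolio of each exposure

total, marginals, attribution = euler_attribution(s, weights, corr, groups, 2)
print(total)
print(attribution, attribution.sum())
```

In this example the attributions add up to the total measure, and the second sub-portfolio receives a negative attribution because its exposures offset those of the first one.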

**Savings from full marginal IM**: 3,000 times shorter computation time

**Savings from full IM attribution**: 7 times shorter computation time

### Conclusion

Using AD at two levels for risk-weight and correlation based risk measures significantly improves the computation time for marginal measures and attribution.

In a forthcoming blog, we will combine that with other uses of AD in MVA computations. We will add a third layer of AD. But that will probably be after the Christmas period.

**(1)** We have run all computations described 100 times in a loop and the figures reported are the averages by IM computation. If we run it only once, the times are too small. All times reported were measured on the author's laptop running personal Java code.

Material similar to the one described in this blog was presented at the WBS xVA conference in March 2017 and at a Thalesians seminar in April 2017, the seminar that led The Wall Street Journal to use my picture (incorrectly in my opinion) in the article "The Quants Run Wall Street Now".

## Saturday, 8 December 2018

### Copenhagen Risk conference and workshop - 23-24 January 2019

Marc Henrard will present a seminar at the conference

CFA Society Denmark Risk Conference

which will take place on Wednesday 23 January 2019. The agenda of the conference can be found on the organizer web site:

On the next day, Thursday 24 January 2019, he will present the workshop

The future of LIBOR:
Quantitative perspective on benchmarks, overnight, fallback and regulation.

The agenda of the workshop and registration details can be found on the organizer web site:

Marc will be in Copenhagen from 22 to 24 January. Don't hesitate to reach out if you want to meet during that time.

## Wednesday, 5 December 2018

### Event

Marc Henrard will attend the conference

Annual Forecast Event

which will take place at The Hotel Brussels on Monday 10 December 2018. The agenda of the conference can be found on the organizer web site:

Don't hesitate to reach out to Marc at the conference.
