Capital Compass - Neoteric
Building an op risk database
Purpose of this section:
1) Show how the components are integrated
2) Show how such integration is connected using database relationships
3) Take a look at a database example
4) Show how to connect the various interface layers to the database
[Diagram: return on investment of a risk framework occurs by integrating the components, from loss capture through to board and government reporting of exposures to business units, including credit losses.]

Requirements captured in the diagram:
1) Loss tracking – the bank must track op losses, including those related to credit activities, and ensure no double counting (Loss Incident Database, with an interface for capture)
2) Process mapping – mapping of income and business activities across business lines (interface for management, e.g. ARIS)
3) Control – internal loss data clearly linked to activities, with an interface & process for reconciling the Loss Database against the general ledger, plus an interface for control & manual entry/adjustment
4) Risk – volumes & stats needed for indicator triggering & correlation context; must capture key environment & internal control factors (data importation / real-time link, with an interface for management & a workflow for approval)
5) Indicators / controls – involve the use of indicators, thresholds & limits or scorecards, with an interface for control, management & alarm [workflow] providing early warning
6) Event data – use relevant external data & ensure its relevance; must use scenario analysis of expert opinion in conjunction with external data to evaluate exposure to high-severity events
7) Scenario planning – if the bank uses correlations it should assess the impact of deviations embedded in the framework; a risk measurement system must have certain key features, including scenario analysis
Each component of the operational risk framework, from control self-assessment to scenario analysis, is used to build a different picture of loss. The components are connected and correlated mathematically so that hypotheses can be proven or discarded.
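As a sketch of how two components might be correlated mathematically, the hypothetical R fragment below tests whether a key risk indicator series moves with monthly loss counts. The variable names and figures are illustrative only, not part of the Impellent product:

# Hypothetical monthly KRI readings (e.g. failed trades) and loss counts
kri  <- c(12, 15, 9, 22, 30, 28, 18, 25, 33, 40, 21, 27)
loss <- c( 1,  2, 0,  3,  4,  4,  2,  3,  5,  6,  2,  3)

# Pearson correlation between the indicator and the loss experience
cor(kri, loss)

# A formal test: is the correlation significantly different from zero?
cor.test(kri, loss)

A significant correlation would support the hypothesis that the indicator is predictive of loss; an insignificant one would see the hypothesis discarded.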
The component layers of the Impellent system and how they benefit the bank
Component Based
The Impellent system (Causal Capital) is a component-based product, divided into discrete areas which each uniquely contribute to a different aspect of risk measurement. This allows the business to pick and choose which parts it wishes to have, and allows the bank to stagger its roll-out plan.
Adding to what is present
If the business already has a risk system but that system is lacking in a specific area, it is possible to fill such gaps by purchasing only a select group of modules from the Impellent suite.
This is the Causal Capital Impellent product, but how do the parts hang together?
Office Integration
The Impellent system has its own library which allows the software to be directly integrated with Microsoft Office. From within Microsoft Excel, risk teams can connect to the database through the Impellent API. The API of course ensures that database integrity is maintained, but it also provides a tight and easy-to-use interface for staff, so there is no need to understand SQL Server or connection strings. All complex relationships in the repository are also hidden from the user, yet directly accessible in single-line statements or formulas.
Secondary Elements
Risk & Event Centre
An actual look at a database
This is a risk database in the making, showing loss relationships. The system consists of five tables:
1) RiskEventClassification – holds the event classifications of Basel II
2) BusinessUnit – allows business units to be assigned to specific losses
3) RiskRegistration – holds registered risks that the business units have selected as critical watch lists
4) Loss – stores a summary of a loss
5) LossPacket – holds the discrete packets of loss that make up a final loss registered in the main loss database
This database is a relationship-based system which allows interaction between the elements in a logical manner. The database structure could be Access, SQL Server or Oracle.
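As a sketch, the five tables could be created from R through RODBC with ANSI SQL. The table names and the Loss columns follow the text, while the data types and the various ID key columns are assumptions for illustration:

library(RODBC)
chan <- odbcConnect("AORM")

# Event classifications of Basel II
sqlQuery(chan, "CREATE TABLE RiskEventClassification
                (RiskEventClassID INTEGER PRIMARY KEY, EventClassName VARCHAR(100))")

# Business units that losses can be assigned to
sqlQuery(chan, "CREATE TABLE BusinessUnit
                (BusinessUnitID INTEGER PRIMARY KEY, UnitName VARCHAR(100))")

# Registered risks on the business units' critical watch lists
sqlQuery(chan, "CREATE TABLE RiskRegistration
                (RiskRegID INTEGER PRIMARY KEY,
                 RiskEventClassID INTEGER, BusinessUnitID INTEGER)")

# Summary of a loss
sqlQuery(chan, "CREATE TABLE Loss
                (LossID INTEGER PRIMARY KEY, RiskRegID INTEGER,
                 ExpectedLoss DOUBLE, ActualLoss DOUBLE)")

# Discrete packets of loss that make up a final registered loss
sqlQuery(chan, "CREATE TABLE LossPacket
                (LossPacketID INTEGER PRIMARY KEY, LossID INTEGER,
                 PacketValue DOUBLE)")

odbcClose(chan)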
A risk system has many different layers:
1) Interface – in our example we have built the interface in Excel, which is ideal for reporting or creating user-entry interfaces that allow the user to change the database
2) Program Layer – the program layer binds the user interfaces to the connectivity layer, allowing data to be passed to Microsoft Access or SQL Server
3) Statistical Layer – as Excel is not powerful enough to run some of the more complex routines required for risk management, the bank is able to use R-Project, one of the most popular statistical engines, which supports the extreme value theory required for the Loss Data Approach to risk management
4) Connectivity Layer – the connectivity layer is part of the operating system and connects the program layer to the database
5) Repository Layer – a store of data; Excel, again, is not suitable for storing large datasets and it doesn't support multiple users
Like R-Project, Microsoft Excel needs to be able to see the database.
In Excel:
1) Spreadsheet interface that can be used as a dumping ground for data
2) Main macro code that calls upon the class wrapper to access the database
3) Class wrapper for object-oriented programming

'Main_Connection is declared at module level, e.g.
'Public Main_Connection As ADODB.Connection

'Used to open the connection
Sub Connect_Database()
    Dim ConnectString As String
    On Error GoTo Sub_Err
    'Point a variable at the ODBC connection layer
    Set Main_Connection = New ADODB.Connection
    'Build up the connection string to the settings that have been established in ODBC
    ConnectString = "AORM"
    'Set the connection string
    Main_Connection.ConnectionString = ConnectString
    'Open the connection
    Main_Connection.Open
    Exit Sub
Sub_Err:
    Exit Sub
End Sub
As with R-Project Excel needs to be able to communicate with the common repository so that both applications can share data.
The Visual Basic code shown to the left allows the user to connect to the database through an ADODB driver, which means the destination database can be Access, SQL Server or many of the other popular database systems in the marketplace. Once the connection has been made, the database can be accessed through the Main_Connection variable, and all tables can be read into Excel or written to from Excel. Importantly, this allows R-Project to be the high-end statistical engine for advanced functionality such as Monte Carlo simulation, extreme value theory and curve fitting, where the repository is used to interchange data between the two systems.
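A minimal sketch of that interchange from the R side, assuming the AORM DSN and the Loss table described earlier; the LossQuantiles result table is a hypothetical name used for illustration:

library(RODBC)
chan <- odbcConnect("AORM")

# Read the loss table that Excel has populated through ADODB
loss <- sqlQuery(chan, "SELECT * FROM [Loss]")

# Run high-end statistics in R, e.g. tail quantiles of actual losses
result <- data.frame(Quantile = c(0.5, 0.95, 0.99),
                     Value = quantile(loss$ActualLoss, c(0.5, 0.95, 0.99)))

# Write the result back to the repository for Excel to pick up
sqlSave(chan, result, tablename = "LossQuantiles", rownames = FALSE)
odbcClose(chan)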
Controlling R-Project from within Excel
R-Project source code can be loaded and executed from Excel. If the user has installed the R (D)COM server (RSrv135) or the rcom package in R-Project, the entire framework can be built up across the three packages, interchanging data between the systems with ease.
How to connect R-Project to a database through RODBC
1) RODBC – as most statistical work requires large datasets, R-Project allows the user to connect directly to any type of data source supported by the ODBC database connectivity available in Microsoft Windows.
In the example shown to the left, we start this process by setting up a connection to the main database through ODBC; please note this is a one-off exercise. We are then able to load the RODBC library and make a connection to the AORM database, assigning it to a chan variable. Once the active connection to our AORM database is established, we can reach specific tables within the database by using ANSI SQL statements. In the example to the left we have chosen to connect to the loss database created in Microsoft Access and to load the entire loss table into a data frame bound to the name Loss.
R-Project connection

library(RODBC)
# 'case' is set to "nochange" so column names such as LossID keep their
# capitalisation ("tolower" would lowercase them and break the lookups below)
chan = odbcConnect("AORM", uid = "admin", case = "nochange")
PipeUnit = sqlQuery(chan, "Select * From [Loss]")
Loss.LossID = PipeUnit[['LossID']]
Loss.RiskRegID = PipeUnit[['RiskRegID']]
Loss.ExpectedLoss = PipeUnit[['ExpectedLoss']]
Loss.ActualLoss = PipeUnit[['ActualLoss']]
odbcClose(chan)
Loss = data.frame(Loss.LossID, Loss.RiskRegID, Loss.ExpectedLoss, Loss.ActualLoss)
length(Loss)
Testing the connection
> length(Loss)
[1] 4
> Loss.LossID[1]
[1] 1
> Loss.LossID[2]
[1] 2
> Loss.LossID[3]
[1] 3
Some pointers to assist you in understanding the common command language of R
q() will close R-Project
Assignment can be achieved using <-
rnorm(1) generates a random variable
To list variables: ls()
Help is achieved by typing help(function name)
Large numbers can be entered using scientific notation: X<-2.1e23
Loading data can be accomplished by mydata<-read.table("data.dat")
Create a sequence: seq(1,5,0.5) gives you a sequence of 1 to 5 going up in increments of 0.5
dev.cur() returns the currently active graphics device
dev.off(2) will close the second device window
dev.list() will list all open graphics devices
X11() will open a new graphics device
table() will build a table: x=c("Yes","No","No","Yes","Yes"); table(x)
factor() will show the factor levels in an array
barplot() creates a bar graph
hist() creates a histogram
pie() builds a pie chart
boxplot() builds a box chart
summary() shows a summary of a specific dataset
library() will display the available libraries
density() will compute kernel density estimates against a specific curve type
Example of a loop in R-Project

Count = 0
for (i in 1:10) {
    Count = Count + 1
}
The algorithm used in density.default disperses the mass of the empirical distribution function over a regular grid of at least 512 points and then uses the fast Fourier transform to convolve this approximation with a discretized version of the kernel and then uses linear approximation to evaluate the density at the specified points.
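A short sketch of that kernel density estimate, run on simulated loss data (the lognormal parameters here are illustrative):

# Simulated skewed losses (lognormal, as op risk data often appears)
set.seed(42)
losses <- rlnorm(1000, meanlog = 10, sdlog = 1)

# Kernel density estimate; density() uses the FFT-based algorithm described above
d <- density(losses)

# Plot the histogram with the density curve overlaid
hist(losses, breaks = 50, freq = FALSE, main = "Loss density")
lines(d)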
The world changed and the model broke, what was the solution?
As we discovered when we ran the model, the upgrade to Office 2010 and Windows 7 left us with a system that didn't work because ODBC would not play ball. The consequence was that R-Project would not communicate with Microsoft Excel.
Solution – To evolve The solution to this is to evolve and put Excel at the centre of both the communication between the systems and the database.
This new approach has some benefits
We are able to use Excel's rich environment to run reports, manage data in the database and print outcomes, and at the same time we have access to R-Project's advanced statistical library set. We can even embed R-Project's advanced tools into an Excel spreadsheet.
We are able to connect Excel to R-Project through macros
1) Run an R-Project function and pass data to R using Rinterface.Rrun
2) Plot an R-Project chart in Excel: Rinterface.InsertCurrentRPlot will allow the most recent R-Project graphic plot to be captured through the Windows clipboard and inserted into Excel
3) Pull data from R to Excel
The example model below fits insurance data using R-Project.
The table of data above shows losses across various risk categories. The charts to the right show binned loss values and the corresponding boxplot, rendered in Excel but generated from R-Project. The data shows massive skew, and it is possible the loss values do not fit a Gaussian distribution.
Our data is not fitting the normal experience. The QQ-plot against the normal distribution in R-Project shows significant clustering, over and above the deviation in the upper tail.
In the case above, the single left tail forces a smaller estimate, and hence an underestimate of potential loss, if the normal distribution is used. The upper clustering suggests a shifted distribution, which is quite normal in itself, but the entire loss experience needs to be modelled with a heavy-tailed (type II) curve.
Goodness-of-fit tests also return low p-values, confirming that such clustering will lead to an underestimate of the potential loss because the Gaussian function can't represent the data.
H0: Sample data come from the stated distribution
HA: Sample data do not come from the stated distribution
Many risk analysts select the wrong curve: all too often the normal or lognormal curve is used when in reality clustering and tails point to a different experience.
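The QQ-plot and goodness-of-fit test can be sketched in R as follows, with simulated skewed data standing in for the real loss table:

set.seed(1)
losses <- rlnorm(500, meanlog = 10, sdlog = 1.2)   # heavy right skew

# QQ-plot against the normal distribution: the upper tail bends away
qqnorm(losses)
qqline(losses)

# Kolmogorov-Smirnov goodness of fit against a fitted normal
# H0: the sample comes from the stated distribution
ks.test(losses, "pnorm", mean = mean(losses), sd = sd(losses))

# The same test on the log scale, against a fitted normal of the logs,
# should not reject: the data is lognormal, not Gaussian
ks.test(log(losses), "pnorm", mean = mean(log(losses)), sd = sd(log(losses)))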
The curve fitting approach used in R-Project

Step 1 – Clean / Stratify
Extract the data from the database, bin the data and look for natural clustering, growth or decay. If the model can't be represented by a single distribution, re-bin the data to change the shape of the candidate curve and model it as stratified. Describe the curve estimates.

Step 2 – Curve Choice
The type of model or function is argued through a hypothesis tagging the nature of the data, often from histograms and other techniques. Candidate curves include the normal, lognormal, Weibull and exponential distributions.

A scary thought: curve fitting frames the analyst to look at the data as continuous but not changing, and randomness conceals internal growth and leverage functions in the data, such as the addition of skewness over time.

Step 3 – Fit the data
Parameters can be estimated by 1) analogic methods 2) the method of moments 3) maximum likelihood. The chi-square test is a classic goodness-of-fit test; it may be thought of as a formal comparison of a histogram with the fitted density:

chi-square = sum over bins i of (Oi - Ei)^2 / Ei

where Oi is the observed frequency for bin i and Ei is the expected frequency for bin i, calculated from the cumulative distribution function.
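The three steps can be sketched end to end with fitdistr from the MASS package, which R-Project provides for maximum-likelihood fitting; the data here is simulated for illustration:

library(MASS)

# Step 1 - clean / stratify: simulated positive losses, binned for inspection
set.seed(7)
losses <- rlnorm(1000, meanlog = 10, sdlog = 1)
hist(losses, breaks = 50)          # look for clustering, growth or decay

# Step 2 - curve choice: the histogram's right skew suggests a lognormal candidate

# Step 3 - fit by maximum likelihood
fit <- fitdistr(losses, "lognormal")
fit$estimate                       # meanlog and sdlog parameters

# Chi-square style check: compare observed bin counts with expected
# frequencies from the fitted cumulative distribution function
breaks   <- quantile(losses, probs = seq(0, 1, 0.1))
observed <- table(cut(losses, breaks, include.lowest = TRUE))
expected <- diff(plnorm(breaks, fit$estimate["meanlog"],
                        fit$estimate["sdlog"])) * length(losses)
chisq    <- sum((observed - expected)^2 / expected)
chisq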
Additionally, limited loss experience makes it difficult to identify tail-shifting behaviour in the data. Too many models are probably fitted to historical outcomes, so that shifts appear in the model's predictions more slowly than they are being realized in practice.
R-Project has an entire family of functions that can be used for fitting