Capital Compass - Neoteric
Building an op risk database
Purpose of this section:
1) Show how the components are integrated
2) Show how such integration is connected using database relationships
3) Take a look at a database example
4) Show how to connect the various interface layers to the database
[Diagram: return on investment of a risk framework occurs by integrating the components, from loss capture through to board and government reporting of exposures to business units, including credit losses.]

Requirements captured in the diagram:
1) Loss tracking – the bank must track op losses, including those related to credit activities, and ensure no double counting (Loss Incident Database, with an interface for capture)
2) Process mapping – mapping of income and business activities across business lines (interface for management, e.g. ARIS)
3) Control – internal loss data clearly linked to activities, with an interface & process for reconciling the Loss Database against the general ledger, plus an interface for control & manual entry/adjustment
4) Risk – volumes & stats needed for indicator triggering & correlation context; must capture key environment & internal control factors (data importation / real-time link, with an interface for management & a workflow for approval)
5) Indicators / controls – involve the use of indicators, thresholds & limits or scorecards, with an interface for control, management & alarm [workflow] providing early warning
6) Event data – use relevant external data & ensure its relevance; must use scenario analysis of expert opinion in conjunction with external data to evaluate exposure to high-severity events
7) Scenario planning – if the bank uses correlations it should assess the impact of deviations embedded in the framework; a risk measurement system must have certain key features, including scenario analysis
Each component of the operational risk framework, from control self-assessment to scenario analysis, is used to build a different picture of loss. The components are connected and correlated mathematically so that hypotheses can be proven or discarded.
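As a sketch of how two components might be correlated mathematically, the hypothetical R fragment below tests whether a key risk indicator series moves with monthly loss counts. The variable names and figures are illustrative only, not part of the Impellent product:

# Hypothetical monthly KRI readings (e.g. failed trades) and loss counts
kri  <- c(12, 15, 9, 22, 30, 28, 18, 25, 33, 40, 21, 27)
loss <- c( 1,  2, 0,  3,  4,  4,  2,  3,  5,  6,  2,  3)

# Pearson correlation between the indicator and the loss experience
cor(kri, loss)

# A formal test: is the correlation significantly different from zero?
cor.test(kri, loss)

A significant correlation would support the hypothesis that the indicator is predictive of loss; an insignificant one would see the hypothesis discarded.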
The component layers of the Impellent system and how they benefit the bank
Component Based
The Impellent system (Causal Capital) is a component-based product, divided into discrete areas which each uniquely contribute to a different aspect of risk measurement. This allows the business to pick and choose which parts it wishes to have, and allows the bank to stagger its roll-out plan.
Adding to what is present
If the business already has a risk system but that system is lacking in a specific area, it is possible to fill such gaps by purchasing only a select group of modules from the Impellent suite.
This is the Causal Capital Impellent product, but how do the parts hang together?
Office Integration
The Impellent system has its own library which allows the software to be directly integrated with Microsoft Office. From within Microsoft Excel, risk teams can connect to the database through the Impellent API. The API of course ensures that database integrity is maintained, but it also provides a tight and easy-to-use interface for staff, so there is no need to understand SQL Server or connection strings. All complex relationships in the repository are also hidden from the user, yet directly accessible in single-line statements or formulas.
Secondary Elements
Risk & Event Centre
An actual look at a database
This is a risk database in the making, showing loss relationships. The system consists of five tables:
1) RiskEventClassification – holds the event classifications of Basel II
2) BusinessUnit – allows business units to be assigned to specific losses
3) RiskRegistration – holds registered risks that the business units have selected as critical watch lists
4) Loss – stores a summary of a loss
5) LossPacket – holds the discrete packets of loss that make up a final loss registered in the main loss database
This database is a relationship-based system which allows interaction between the elements in a logical manner. The database structure could be Access, SQL Server or Oracle.
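As a sketch, the five tables could be created from R through RODBC with ANSI SQL. The table names and the Loss columns follow the text, while the data types and the various ID key columns are assumptions for illustration:

library(RODBC)
chan <- odbcConnect("AORM")

# Event classifications of Basel II
sqlQuery(chan, "CREATE TABLE RiskEventClassification
                (RiskEventClassID INTEGER PRIMARY KEY, EventClassName VARCHAR(100))")

# Business units that losses can be assigned to
sqlQuery(chan, "CREATE TABLE BusinessUnit
                (BusinessUnitID INTEGER PRIMARY KEY, UnitName VARCHAR(100))")

# Registered risks on the business units' critical watch lists
sqlQuery(chan, "CREATE TABLE RiskRegistration
                (RiskRegID INTEGER PRIMARY KEY,
                 RiskEventClassID INTEGER, BusinessUnitID INTEGER)")

# Summary of a loss
sqlQuery(chan, "CREATE TABLE Loss
                (LossID INTEGER PRIMARY KEY, RiskRegID INTEGER,
                 ExpectedLoss DOUBLE, ActualLoss DOUBLE)")

# Discrete packets of loss that make up a final registered loss
sqlQuery(chan, "CREATE TABLE LossPacket
                (LossPacketID INTEGER PRIMARY KEY, LossID INTEGER,
                 PacketValue DOUBLE)")

odbcClose(chan)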
A risk system has many different layers:
1) Interface – in our example we have built the interface in Excel, which is ideal for reporting or creating user-entry interfaces that allow the user to change the database
2) Program Layer – the program layer binds the user interfaces to the connectivity layer, allowing data to be passed to Microsoft Access or SQL Server
3) Statistical Layer – as Excel is not powerful enough to run some of the more complex routines required for risk management, the bank is able to use R-Project, one of the most popular statistical engines, which supports the extreme value theory required for the Loss Data Approach to risk management
4) Connectivity Layer – the connectivity layer is part of the operating system and connects the program layer to the database
5) Repository Layer – a store of data; Excel, again, is not suitable for storing large datasets and it doesn't support multiple users
Like R-Project, Microsoft Excel needs to be able to see the database.
In Excel:
1) Spreadsheet interface that can be used as a dumping ground for data
2) Main macro code that calls upon the class wrapper to access the database
3) Class wrapper for object-oriented programming

'Main_Connection is declared at module level, e.g.
'Public Main_Connection As ADODB.Connection

'Used to open the connection
Sub Connect_Database()
    Dim ConnectString As String
    On Error GoTo Sub_Err
    'Point a variable at the ODBC connection layer
    Set Main_Connection = New ADODB.Connection
    'Build up the connection string to the settings that have been established in ODBC
    ConnectString = "AORM"
    'Set the connection string
    Main_Connection.ConnectionString = ConnectString
    'Open the connection
    Main_Connection.Open
    Exit Sub
Sub_Err:
    Exit Sub
End Sub
As with R-Project Excel needs to be able to communicate with the common repository so that both applications can share data.
The Visual Basic code shown to the left allows the user to connect to the database through an ADODB driver, which means the destination database can be Access, SQL Server or many of the other popular database systems in the marketplace. Once the connection has been made, the database can be accessed through the Main_Connection variable, and all tables can be read into Excel or written to from Excel. Importantly, this allows R-Project to be the high-end statistical engine for advanced functionality such as Monte Carlo simulation, extreme value theory and curve fitting, where the repository is used to interchange data between the two systems.
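A minimal sketch of that interchange from the R side, assuming the AORM DSN and the Loss table described earlier; the LossQuantiles result table is a hypothetical name used for illustration:

library(RODBC)
chan <- odbcConnect("AORM")

# Read the loss table that Excel has populated through ADODB
loss <- sqlQuery(chan, "SELECT * FROM [Loss]")

# Run high-end statistics in R, e.g. tail quantiles of actual losses
result <- data.frame(Quantile = c(0.5, 0.95, 0.99),
                     Value = quantile(loss$ActualLoss, c(0.5, 0.95, 0.99)))

# Write the result back to the repository for Excel to pick up
sqlSave(chan, result, tablename = "LossQuantiles", rownames = FALSE)
odbcClose(chan)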
Controlling R-Project from within Excel
R-Project source code can be loaded and executed from Excel. If the user has installed the R (D)COM server (RSrv135) or the rcom package in R-Project, the entire framework can be built up across the three packages, interchanging data between the systems with ease.
How to connect R-Project to a database through RODBC
1) RODBC – as most statistical work requires large datasets, R-Project allows the user to connect directly to any type of data source supported by the ODBC database connectivity available in Microsoft Windows.
In the example shown to the left, we start this process by setting up a connection to the main database through ODBC; please note this is a one-off exercise. We are then able to load the RODBC library and make a connection to the AORM database, assigning it to a chan variable. Once the active connection to our AORM database is established, we can reach specific tables within the database by using ANSI SQL statements. In the example to the left we have chosen to connect to the loss database created in Microsoft Access and to load the entire loss table into a data frame bound to the name Loss.
R-Project connection

library(RODBC)
# 'case' is set to "nochange" so column names such as LossID keep their
# capitalisation ("tolower" would lowercase them and break the lookups below)
chan = odbcConnect("AORM", uid = "admin", case = "nochange")
PipeUnit = sqlQuery(chan, "Select * From [Loss]")
Loss.LossID = PipeUnit[['LossID']]
Loss.RiskRegID = PipeUnit[['RiskRegID']]
Loss.ExpectedLoss = PipeUnit[['ExpectedLoss']]
Loss.ActualLoss = PipeUnit[['ActualLoss']]
odbcClose(chan)
Loss = data.frame(Loss.LossID, Loss.RiskRegID, Loss.ExpectedLoss, Loss.ActualLoss)
length(Loss)
Testing the connection
> length(Loss)
[1] 4
> Loss.LossID[1]
[1] 1
> Loss.LossID[2]
[1] 2
> Loss.LossID[3]
[1] 3
Some pointers to assist you in understanding the common command language of R
q() will close R-Project
Assignment can be achieved using <-
rnorm(1) generates a random variable
To list variables: ls()
Help is achieved by typing help(function name)
Large numbers can be entered using scientific notation: X<-2.1e23
Loading data can be accomplished by mydata<-read.table("data.dat")
Create a sequence: seq(1,5,0.5) gives you a sequence of 1 to 5 going up in increments of 0.5
dev.cur() returns the currently active graphics device
dev.off(2) will close the second device window
dev.list() will list all open graphics devices
X11() will open a new graphics device
table() will build a table: x=c("Yes","No","No","Yes","Yes"); table(x)
factor() will show the factor levels in an array
barplot() creates a bar graph
hist() creates a histogram
pie() builds a pie chart
boxplot() builds a box chart
summary() shows a summary of a specific dataset
library() will display the available libraries
density() will compute kernel density estimates against a specific curve type
Example of a loop in R-Project

Count = 0
for (i in 1:10) {
    Count = Count + 1
}
The algorithm used in density.default disperses the mass of the empirical distribution function over a regular grid of at least 512 points and then uses the fast Fourier transform to convolve this approximation with a discretized version of the kernel and then uses linear approximation to evaluate the density at the specified points.
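A short sketch of that kernel density estimate, run on simulated loss data (the lognormal parameters here are illustrative):

# Simulated skewed losses (lognormal, as op risk data often appears)
set.seed(42)
losses <- rlnorm(1000, meanlog = 10, sdlog = 1)

# Kernel density estimate; density() uses the FFT-based algorithm described above
d <- density(losses)

# Plot the histogram with the density curve overlaid
hist(losses, breaks = 50, freq = FALSE, main = "Loss density")
lines(d)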
The world changed and the model broke, what was the solution?
As we discovered when we ran the model, the upgrade to Office 2010 and Windows 7 left us with a system that didn't work because ODBC would not play ball. The consequence was that R-Project would not communicate with Microsoft Excel.
Solution – To evolve The solution to this is to evolve and put Excel at the centre of both the communication between the systems and the database.
This new approach has some benefits
We are able to use Excel's rich environment to run reports, manage data in the database and print outcomes, and at the same time we have access to R-Project's advanced statistical library set. We can even embed R-Project's advanced tools into an Excel spreadsheet.
We are able to connect Excel to R-Project through macros
1) Run an R-Project function and pass data to R using Rinterface.Rrun
2) Plot an R-Project chart in Excel: Rinterface.InsertCurrentRPlot will allow the most recent R-Project graphic plot to be captured through the Windows clipboard and inserted into Excel
3) Pull data from R to Excel
The example model below fits insurance data using R-Project.
The table of data above shows losses across various risk categories. The charts to the right show binned loss values and the corresponding boxplot, rendered in Excel but generated from R-Project. The data shows massive skew, and it is possible the loss values do not fit a Gaussian distribution.
Our data is not fitting the normal experience. The QQ-plot against the normal distribution in R-Project shows significant clustering, over and above the deviation in the upper tail.
In the case above, the single left tail forces a smaller estimate, and hence an underestimate of potential loss, if the normal distribution is used. The upper clustering suggests a shifted distribution, which is quite normal in itself, but the entire loss experience needs to be modelled with a heavy-tailed (type II) curve.
Goodness-of-fit tests also return low p-values, confirming that such clustering will lead to an underestimate of the potential loss because the Gaussian function can't represent the data.
H0: Sample data come from the stated distribution
HA: Sample data do not come from the stated distribution
Many risk analysts select the wrong curve: all too often the normal or lognormal curve is used when in reality clustering and tails point to a different experience.
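The QQ-plot and goodness-of-fit test can be sketched in R as follows, with simulated skewed data standing in for the real loss table:

set.seed(1)
losses <- rlnorm(500, meanlog = 10, sdlog = 1.2)   # heavy right skew

# QQ-plot against the normal distribution: the upper tail bends away
qqnorm(losses)
qqline(losses)

# Kolmogorov-Smirnov goodness of fit against a fitted normal
# H0: the sample comes from the stated distribution
ks.test(losses, "pnorm", mean = mean(losses), sd = sd(losses))

# The same test on the log scale, against a fitted normal of the logs,
# should not reject: the data is lognormal, not Gaussian
ks.test(log(losses), "pnorm", mean = mean(log(losses)), sd = sd(log(losses)))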
The curve fitting approach used in R-Project

Step 1 – Clean / Stratify
Extract the data from the database, bin the data and look for natural clustering, growth or decay. If the model can't be represented by a single distribution, re-bin the data to change the shape of the candidate curve and model it as stratified. Describe the curve estimates.

Step 2 – Curve Choice
The type of model or function is argued through a hypothesis tagging the nature of the data, often from histograms and other techniques. Candidate curves include the normal, lognormal, Weibull and exponential distributions.

A scary thought: curve fitting frames the analyst to look at the data as continuous but not changing, and randomness conceals internal growth and leverage functions in the data, such as the addition of skewness over time.

Step 3 – Fit the data
Parameters can be estimated by 1) analogic methods 2) the method of moments 3) maximum likelihood. The chi-square test is a classic goodness-of-fit test; it may be thought of as a formal comparison of a histogram with the fitted density:

chi-square = sum over bins i of (Oi - Ei)^2 / Ei

where Oi is the observed frequency for bin i and Ei is the expected frequency for bin i, calculated from the cumulative distribution function.
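The three steps can be sketched end to end with fitdistr from the MASS package, which R-Project provides for maximum-likelihood fitting; the data here is simulated for illustration:

library(MASS)

# Step 1 - clean / stratify: simulated positive losses, binned for inspection
set.seed(7)
losses <- rlnorm(1000, meanlog = 10, sdlog = 1)
hist(losses, breaks = 50)          # look for clustering, growth or decay

# Step 2 - curve choice: the histogram's right skew suggests a lognormal candidate

# Step 3 - fit by maximum likelihood
fit <- fitdistr(losses, "lognormal")
fit$estimate                       # meanlog and sdlog parameters

# Chi-square style check: compare observed bin counts with expected
# frequencies from the fitted cumulative distribution function
breaks   <- quantile(losses, probs = seq(0, 1, 0.1))
observed <- table(cut(losses, breaks, include.lowest = TRUE))
expected <- diff(plnorm(breaks, fit$estimate["meanlog"],
                        fit$estimate["sdlog"])) * length(losses)
chisq    <- sum((observed - expected)^2 / expected)
chisq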
Additionally, limited loss experience makes it difficult to identify tail-shifting behaviour in the data. Too many models are probably fitted to historical outcomes, so that shifts appear in the model's predictions more slowly than they are being realized in practice.
R-Project has an entire family of functions that can be used for fitting