Internal ratings validation survey

Charles A. Andrews

The International Swaps and Derivatives Association, RMA, and the British Bankers’ Association recently sponsored a global survey of 26 firms’ validation techniques for internal ratings systems. This article offers six key findings of the survey.

The game’s afoot as banks perfect their methods of rating credits within their own institutions. The payback can be significant from both a risk-prevention and a risk-benefit perspective. The better the understanding and monitoring of a credit, the more efficient the use of capital and the greater the certainty of action. Even without the impetus supplied by the capital requirements of complying with Basel II, banks are well advised to use everything within their reach to better manage their risks.

So where are financial institutions on the internal risk-ratings course? ISDA, RMA, and the BBA asked 26 institutions from North America, Europe, and Asia in a survey conducted by PricewaterhouseCoopers.

The survey consisted of two parts. First, each participating institution completed a questionnaire for the different asset classes (see sidebar) and a set of additional questions relating to group-wide policy. In total, this amounted to over 130 questions.

Second, a smaller number of institutions were interviewed to explore their validation approaches in further detail. The interviews were designed to ensure adequate coverage by geography, asset class, and the use of internal ratings.

While messages emerging from the survey generally apply to all asset classes, their relative importance differs considerably. This article presents key issues as well as some additional insight into other aspects of financial institutions’ internal ratings systems. (1)

1. Banks employ a wide range of techniques and diversity of practice with respect to internal ratings validation.

Key differences can be seen with respect to the techniques used for corporate and retail ratings.

Corporate and middle-market asset class: Model types (see Figure 1) and validation techniques differ widely among institutions within these asset classes. In general, banks use statistical models for parts of their portfolios where the quantity of default data allows robust estimation. Expert judgment models are more common for those portfolios where default data is scarce. In some cases, hybrid models are used and/or vendor models complement the picture. Statistical validation techniques are more widespread in the middle-market asset class, whereas expert judgment remains an important validation technique in the large corporate area.

The techniques used to validate internal ratings and the assignment of default probabilities to individual ratings classes also vary widely across institutions and geographies. In part this is due to the different levels of experience and sophistication of various institutions. Generally, institutions that have been using ratings for a long period of time and have built up internal data histories tend to use more quantitative/statistical techniques–if the data permits this approach. A large number of banks are beginning to address data inadequacies to support the validation regime for their internal ratings.

Retail asset class: For the retail asset class, statistical modeling techniques are used extensively to assign a score. The greater availability of internal data history allows for more robust statistical validation tests, which are more widely used in the retail area than in other asset classes. Furthermore, the survey participants rely more heavily on the statistical tests in this asset class and less on expert judgment. A small number of survey participants have even set strict thresholds and triggers for monitoring of model performance and model redevelopment.

Other asset classes: Most of the respondents have a method for establishing internal ratings for both bank and sovereign exposures, but these generally are not “modeled” in the same fashion as corporate or retail exposures. Validation for these asset classes is done mostly by benchmarking against external ratings as well as by using expert judgment. Published default statistics are widely used for default probability estimation.

Despite the relatively small magnitude of assets categorized as specialized lending, most banks surveyed do have a ratings system in place. This is in almost all cases a model incorporating a significant amount of expert judgment. Validation techniques are also almost exclusively based on expert judgment with some level of benchmarking to external sources.

2. Ratings validation is not an exact science.

Even where banks employ statistical techniques to assess model performance during development or after implementation, with the exception of a small number of banks for retail models, they do not tend to use absolute triggers or thresholds. In other words, there is no absolute GINI coefficient, COC, or ROC (2) measure or similar statistic that models need to reach in order to be considered adequate. In fact, some banks see absolute performance measures as counterproductive.
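For readers less familiar with the statistics named above, the following is a minimal sketch of how a GINI coefficient is commonly derived from the area under the ROC curve (GINI = 2 x AUC - 1). The survey does not describe how any participant computes these measures; the function names and the convention that a higher score means higher risk are assumptions made for illustration only.

```python
def auc(scores, defaulted):
    """Area under the ROC curve, computed pairwise.

    Interpreted as the probability that a randomly chosen defaulter
    carries a higher risk score than a randomly chosen non-defaulter
    (ties count one half). Assumes higher score = higher risk.
    """
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def gini(scores, defaulted):
    """GINI coefficient (accuracy ratio) as 2 * AUC - 1."""
    return 2.0 * auc(scores, defaulted) - 1.0
```

A model that ranks every defaulter above every non-defaulter yields a GINI of 1, while a model with no discriminatory power yields a value near 0. With few defaults in the sample the statistic becomes very noisy, which is one reason a fixed cut-off can mislead.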

In cases where not enough internal default data exists, banks frequently resort to the use of external data–in particular, default statistics published by the major ratings agencies. How firms use these default statistics (for example, what time period to consider, which agency’s data to use, whether it is appropriate to smooth the raw data, and if so, how this is achieved) differs considerably from institution to institution and essentially depends on each bank’s assessment of the most appropriate use of this external data. Benchmarking against external ratings also raises issues with regard to the unknown quality of external ratings as well as methodology differences, such as the time horizon under consideration, the default definition, or the inclusion of loss-given-default (LGD) elements in the external rating.

3. Expert judgment is of critical importance.

In a number of asset classes–particularly large corporate but also specialized lending, banks, and sovereigns–data scarcity makes it practically impossible to develop statistically based internal ratings models. Some banks use statistical techniques to establish the suitability of particular factors to risk rate customers, but stop short of determining the relative weights of the factors through statistical techniques. Even where weights are specified, there is usually a judgmental overlay to allow a modification of the model rating based on the assessment of a ratings expert (account officer, credit analyst, or the like). Only in retail models is this judgmental overlay generally not applied to modify ratings but rather to modify the credit decision generated by the model.
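The combination of fixed factor weights and a judgmental overlay described above can be sketched as follows. This is a purely hypothetical illustration: the factor names, the 0-to-1 factor scale, the ten-grade scale, and the two-notch override limit are all assumptions and are not taken from any surveyed institution.

```python
def model_grade(factor_scores, weights, n_grades=10):
    """Map a weighted average of factor scores to a rating grade.

    factor_scores: dict of factor -> score in [0, 1], higher = riskier
                   (hypothetical scale).
    weights:       dict of factor -> non-negative weight.
    Returns a grade from 1 (lowest risk) to n_grades (highest risk).
    """
    if set(factor_scores) != set(weights):
        raise ValueError("factors and weights must cover the same keys")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(factor_scores[k] * weights[k] for k in weights) / total
    return min(n_grades, int(score * n_grades) + 1)


def final_grade(grade, override_notches=0, max_override=2, n_grades=10):
    """Apply an expert's override, limited to max_override notches."""
    if abs(override_notches) > max_override:
        raise ValueError("override exceeds the permitted number of notches")
    return max(1, min(n_grades, grade + override_notches))
```

Logging the model grade and the final grade separately would allow the frequency and direction of overrides to be tracked, which is one way an expert-judgment overlay could itself be reviewed.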

Survey participants are concerned that much of the discussion around internal ratings validation is centered on statistical techniques and absolute trigger ratios. However, large proportions of banks’ exposures are, and will be for the foreseeable future, covered by expert-judgment-type ratings systems, and there is a feeling that not enough time has been spent on discussing acceptable validation techniques for these types of systems.

4. Data issues center around quantity, not quality.

While a large number of banks commented on the inadequacy of their internal data with respect to internal ratings validation, data quality was not seen as the major obstacle to validation in the long term. Most banks have initiated projects to collect the necessary data in a consistent manner across the entire institution, which should provide sufficiently robust data for validation purposes going forward.

Unlike data quality, the quantity of data–in particular, default data–poses a real problem for most institutions. This is particularly the case for the corporate, bank, sovereign, and specialized lending exposure classes, whereas default data for retail and middle-market exposures is generally considered to be adequate. For some exposure classes–notably, banks and sovereigns–it is likely that there will never be enough default data to allow robust statistical estimates of default rates with a granular rating scale. Techniques other than statistical analysis will therefore be necessary to assess the adequacy of banks’ rating systems and default probability estimates.
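The scale of the data-quantity problem can be illustrated with a small calculation. The sketch below assumes independent defaults over a single period, which is a simplification and not a method reported in the survey. It gives the largest default probability that is still consistent, at a chosen confidence level, with observing no defaults at all among a group of obligors. The function name and parameters are illustrative.

```python
def pd_upper_bound_zero_defaults(n_obligors, confidence=0.95):
    """One-sided upper bound on the default probability when zero
    defaults are observed among n_obligors independent obligors.

    Solves (1 - p) ** n_obligors = 1 - confidence for p.
    """
    if n_obligors <= 0:
        raise ValueError("n_obligors must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_obligors)
```

Under these assumptions, 50 obligors with no observed defaults are still consistent with a default probability of roughly 5.8% at the 95% level. That is far too wide a range to separate adjacent grades on a granular rating scale for low-default classes such as banks and sovereigns.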

A number of institutions have recently joined data-pooling initiatives, both for default probability and LGD data. It is not yet possible, however, to evaluate the applicability of these data pools for ratings validation going forward. Most industry participants remain skeptical as to the ultimate usefulness of these initiatives.

5. Definite regional differences exist with respect to internal ratings and their validation.

The structure of ratings systems and the resulting validation techniques show definite regional differences. This is true both for corporate and retail exposure classes.

While statistical scorecards are widely used for retail exposures across all institutions, these tend to be product specific in the U.S. and UK, while the focus in Continental Europe is clearly on customer scores/ratings. In addition, scorecards in the U.S./UK tend to be redeveloped much more often–using the most recent available data–than those on the Continent, where robustness of ratings and the long-term stability of factors and their respective weights are of higher priority to institutions. This often has direct implications for the statistical measures used to assess model performance (for example, the GINI coefficient), as the longer-term, more stable models tend to show lower GINIs than those models using the latest data available.

A similar divide can be observed for corporate ratings. While both North American and European banks tend to use expert judgment models for the assessment of their large corporate portfolios, the structural underpinnings to these ratings methodologies (and consequently the validation techniques) differ significantly. In North America, among survey participants, there are generally no fixed weightings for the factors to be assessed by the experts, whereas most European banks (in this case including most of the UK banks) set specific weights for each of the factors to be considered.

While vendor models based on equity market information (like KMV Credit Monitor) or balance-sheet information (like Moody’s RiskCalc) are widely used by participants for their corporate and middle-market portfolios, the application differs between North America and the rest of the world.

In North America these models tend to be an integral part of the ratings assignment process and are often used in a hybrid approach in conjunction with expert judgment. In Europe, such external vendor models are more often used as a benchmark or validation of the rating derived by the internal ratings model. Market-based methodologies do not generally lend themselves as sources for ratings assessments outside of North America, due to the absence of deeply developed capital markets and a historical reliance on bank financing.

6. Further work is necessary with respect to defining standards for stress testing.

Stress testing is widely practiced within the industry; however, no uniform approach exists regarding the type of stress testing undertaken, its frequency, or actions undertaken in response to stress-testing results.

Most banks currently undertake stress testing at a portfolio level, with risk ratings being a key input into stress-testing scenarios for economic capital requirements. There is, however, uncertainty as to the level of additional stress testing potentially required under Basel II, as the QIS 3 Technical Guidance as well as the Third Consultative Paper indicate a potential stress-testing requirement for rating model inputs. This form of stress testing is not currently undertaken (or indeed planned) by a large number of institutions, and there is some debate in the industry as to its potential relevance. Further clarification on this issue from the regulators as well as debate within the industry are therefore considered necessary.

In Summary

The survey has shown that ratings validation is not an exact science and that banks continue to see expert judgment as a key component of the validation process. As a result, banks use a wide variety of methods for ratings validation, depending on the availability of default data and additional information such as external ratings. Data is a key concern for most survey participants; however, the problem is more the scarcity of default data than the quality and integrity of data in general. (3)

Figure 1

Model Types by Asset Class

Model Type         Corporate   Middle Market   Retail

Statistical            7             4           23
Expert judgment       15            11            8
External vendor        7             2           17
Hybrid                10             7            5


(1.) It is important to note that the banks participating in the survey represented to a certain extent the “cutting edge” of internal ratings development in their respective geographies. Consequently, the results of the survey may tend to overstate the actual level of sophistication across the industry–a point readers, industry bodies, and regulators will need to keep in mind when addressing the implementation of the Basel II Accord.

(2.) COC = coefficient of concordance; ROC = receiver operating characteristic.

(3.) These key conclusions, as well as the detailed survey results, need to be considered in the context of the standards set out by the Basel Committee on Banking Supervision within both the Second Consultative Paper issued in January 2001 and the technical guidance issued in relation to the Third Quantitative Impact Study (QIS) in October 2002. It should be noted that while the study was undertaken on the basis of the requirements as set out in the QIS Technical Guidance, the answers are equally applicable to the requirements set out in the Third Consultative Paper issued on April 29, 2003.

RELATED ARTICLE: Asset Class Definitions

Corporate exposures–a debt obligation of a corporation, partnership, or proprietorship. For the purposes of this survey, this category encompassed what most institutions would refer to as “large corporate” and did not include borrowers classified as Middle Market or SME, unless there were no specific rating tools associated with these smaller companies.

Middle-market / SME exposures–a subcategory of corporate exposures. In the proposed Basel Accord, SMEs are defined as corporate where the reported sales of the consolidated group of which the firm is a part are less than $50 million. In practice, banks may have a different definition of what they consider Middle Market and/or SME and this was taken into account in the survey.

Retail–encompasses such loans to individuals as revolving credits and lines of credit (e.g., credit cards, overdrafts) as well as personal term loans (e.g., installment loans, auto loans, etc.) regardless of exposure size, as well as residential mortgages to individuals. Loans extended to small businesses and managed as retail exposures can be included in retail if the exposures are less than $1 million. As with SME exposures, banks may internally use a different definition for retail and were asked to use this for the purposes of the survey.

Specialized lending–(within the Basel Accord) referred to as a subcategory of the corporate asset class. Within this subsection, five sub-classes of specialized lending are identified: Project Finance, Object Finance, Commodities Finance, Income-Producing Real Estate, and High-Volatility Commercial Real Estate. Detailed definitions of these subcategories can be found in sections 189 to 196 of the Third Consultative Paper issued in October 2002.

Bank exposures–exposures to banks and securities firms, provided these are subject to supervisory and regulatory arrangements comparable to those of the New Basel Accord, specifically with respect to risk-based capital requirements.

Sovereigns–all exposures to sovereigns and their central banks as well as claims on certain domestic PSEs if the national regulator has determined this to be acceptable. This could apply to certain regional governments and local authorities; however, it is unlikely to include administrative bodies to governments (central or otherwise) without independent revenue-raising powers or commercial undertakings, which are owned by central, regional, or local governments.

© 2003 by RMA. Charles Andrews and Monika Mars are senior managers in the Global Risk Management Solutions group of PricewaterhouseCoopers; Andrews is in PWC’s New York office and Mars is in the Amsterdam office.

COPYRIGHT 2003 The Risk Management Association
