The influence of bureau scores, customized scores and judgemental review on the bank underwriting decision-making process
Collins, M Cary
Abstract
In recent years commercial banks have moved toward automated forms of underwriting. This study employs unique bank loan-level data from a scoring lender to determine whether automated underwriting exhibits a potential “disparate impact” across income strata. The findings indicate that strict application of this custom scoring model leads to higher denial rates for low- to moderate-income borrowers when compared with both a naive judgmental system and a bureau scoring approach. These results suggest that financial regulators should focus more resources on the evaluation and study of customized scoring models.
Introduction
Statistically-based credit decision-making systems were pioneered during the late 1950s but only became commonplace during the 1990s.1 These statistically-based techniques are commonly referred to as “credit scoring” models.2 In recent years the use of credit scoring models has become widespread in the mortgage lending industry.3 In addition to its use in the underwriting process, credit scoring is also employed by secondary market purchasers of mortgage loans, including the government-sponsored enterprises (GSEs), and by providers of private mortgage insurance. For example, two GSEs (Fannie Mae and Freddie Mac) issued advisory letters in 1995 encouraging mortgage originators to consider credit scores from the major credit bureaus in their underwriting decisions (Fannie Mae, 1995; Freddie Mac, 1995). In addition, the three national credit bureaus have developed scoring systems designed specifically for the mortgage market.
Proponents of credit scoring, and of the “automated underwriting” process that relies on it, argue that scoring lowers the overall cost of making credit available to consumers, while simultaneously increasing the speed and objectivity of underwriting decisions.4 Detractors of credit scoring models argue, however, that the underwriting variables employed and the weights assigned to each variable are based on the payment performance of traditional consumers.5 As such, scores generated by these models may not accurately portray the creditworthiness of underrepresented groups in the applicant pool, such as low-income and minority applicants. In particular, scoring models typically omit certain nontraditional indicators of credit performance, such as rent and utility payment histories, which are important components of credit performance for many low-income applicants.6
A primary conjecture of this study is that custom credit scoring systems yield a disparity in low-income denials relative to upper income denials, since these scoring systems neglect compensating factors, or creditworthiness-related attributes, that are more common for low-income applicants.7 Specifically, the following hypothesis is tested:
H0: Judgmental systems reduce the denial disparity between low-income and upper-income applicants over a custom credit scoring system, ceteris paribus.8
This study employs unique data on unsecured home-improvement loans from a large lender that uses an overlay system of both custom and credit bureau scores in its underwriting process to test whether the use of credit scoring models has a “disparate impact” on low- to moderate-income (LMI) applicants. Thus, these data provide a unique opportunity for examining the effects of several possible underwriting scenarios on applicant outcomes. The use of a customized scoring model results in a significantly higher disparity in denial frequencies between applicant income segments as compared with a machine-replicated “judgmental” model or when compared with a decision model based solely on generic credit bureau scores. The results suggest that medium- and upper-income applicants benefit disproportionately from implementation of this single custom scoring model, and that inclusion of alternative creditworthiness variables for LMI applicants reduces the disparate impact of credit scoring.
The remainder of the paper is outlined as follows. The next section describes the credit scoring process and provides the theoretical underpinnings. The following section describes the data sources, provides descriptive statistics and details the empirical methods. The next section provides the results of comparisons of the outcomes for the “judgmental” model with those of the credit scoring models to assess whether credit scoring has a disparate impact on LMI applicants. The section also identifies the variables in the custom scoring model that appear to drive the denial rate disparities.9 The final section is the conclusion.
Credit Scoring Process, Types and Potential Disparate Impact
Credit scores are statistically derived measures of creditworthiness that rank-order credit applicants according to their degree of credit or default risk.10 A score is typically associated with an odds ratio, addressing the question: How many applicants are likely to exhibit payment streams that become delinquent (or default) at the corresponding score? Although the models do not predict the absolute level of risk nor which borrowers within a score range are likely to perform poorly, the literature has shown them to be effective tools for rank-ordering the risk of applicants (see Avery, Bostic, Calem and Canner, 1996; Freddie Mac, 1996; and Pierzchalski, 1996). For example, Avery et al. (1996) find that borrowers in the lowest of three credit score groupings comprised only 2% of seasoned, conventional fixed-rate mortgages, but represented 32% of those that became delinquent.
Scoring systems are typically implemented in three ways. In the first approach, banks employ scoring models to eliminate the tails of the credit distribution from further consideration. In this scenario, borrowers with very high scores are approved immediately, those with the lowest scores are rejected immediately, and the remaining applicants in the middle of the distribution, so-called “marginal” applicants, are underwritten judgmentally.11 In the second approach, the bank relies solely on the score in choosing whether to underwrite the credit application.12 The latter method is more common at commercial banks, which reason that this approach minimizes the possibility of disparate-treatment issues, particularly for marginal applicants.13 As defined by the courts, disparate treatment occurs when similarly situated persons are treated differently on the basis of race.14 Disparate impact is another form of discrimination recognized by the courts. Disparate impact occurs when a policy has a disproportionate adverse impact on applicants from a protected group, unless the policy can be justified as a business necessity that cannot reasonably be achieved as well by means that have less of an impact on the protected class. This is an important area of discrimination that lenders must consider when designing scoring models, since certain variables may be found to have a disparate impact on protected applicants.
Finally, scores are often employed as one of several critical elements in the underwriting process for complex loans. For example, banks typically will pull mortgage scores when underwriting these large loans and use this information in addition to several other critical variables, such as debt-to-income ratios, major delinquencies and previous bankruptcies.
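The first of the three approaches above, in which the score is used only to screen out the tails of the distribution, can be sketched in a few lines. This is a minimal illustration rather than any lender's actual rule: the function name and the two cutoff values are hypothetical, and the paper does not report the thresholds banks use for this purpose.

```python
def screen_by_score(score, low_cutoff, high_cutoff):
    """Route an application using only the tails of the score distribution.

    Scores at or above high_cutoff are approved outright, scores below
    low_cutoff are denied outright, and everything in between (the
    "marginal" applicants) is passed to a human underwriter.
    """
    if low_cutoff >= high_cutoff:
        raise ValueError("low_cutoff must be below high_cutoff")
    if score >= high_cutoff:
        return "approve"
    if score < low_cutoff:
        return "deny"
    return "judgmental review"


# Hypothetical cutoffs, chosen only for illustration.
for s in (150, 200, 240):
    print(s, screen_by_score(s, low_cutoff=180, high_cutoff=220))
```

Under the second approach the middle band disappears: the two cutoffs collapse into one, and the score alone decides every application.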
Application (custom) scorecards and credit bureau scorecards are the two most common types of scoring tools employed in screening credit applicants. There are two important distinctions between these two types of scoring tools. The first distinction is that credit bureau scorecards consider information related to an applicant’s experience with debt repayment. These scorecards do not consider the non-credit-related characteristics of the applicant, such as income and employment history, included in a mortgage or home improvement credit application. Application, or custom, scorecards employ both the credit bureau information and information on these additional characteristics of the applicant. The second distinction between these two scoring approaches is that application scorecards are typically based on the historical performance of the bank’s approved applicants,15 while credit bureau scores are built off national samples that are less specific to the bank’s applicant population.16 Thus, given the differing scope and purposes of these two scoring tools, the techniques may lead to different underwriting decisions.
Previous literature assessing the influence of credit scoring in the underwriting process is sparse and focuses primarily on the role of bureau scores in that process. For example, Avery, Bostic, Calem and Canner (2000) examine several statistical issues related to credit scoring, using aggregate data, and show that omitted variables in the construction and use of bureau scores can create two problems. First, Avery et al. argue that the omission of valid predictors of creditworthiness can create inappropriate rank orderings of likely default or delinquency. Second, they argue that failure to develop a bureau scoring model using a population representative of the target population can compromise the model’s effectiveness. Specifically, they find significant variation in bureau scores across a number of economic, geographic and demographic groups, suggesting that the omitted variables and under-representation issues warrant further attention.
This paper extends the Avery et al. (2000) research by examining the loan-level data and underwriting decisions of a bank employing a custom scoring model. The underwriting decisions derived from customized, credit bureau, and machine-replicated judgmental approaches are evaluated to demonstrate how these outcomes vary by income group.17 The findings indicate that the custom-scorecard decisions lead to even larger disparities in high income versus LMI denial rates than those disparities created using either the credit bureau score or the machine-replicated “judgmental” model approach. These results suggest that the issues of both omitted variable bias and the under-representation of certain sub-populations (e.g., LMI) in model development may be even greater for some customized models.18
Data Description and Empirical Methods
This study analyzes 1996 data on 2,266 unsecured, home-improvement loan applications drawn from a large regional lender’s activities in a single MSA. As such, the pool of credits is relatively homogeneous and the underwriting standards relatively stable across time. The application-level data also include information on the income from the application, the bureau and customized credit scores, and the score attributes, or individual score loadings, for all applicants. Low- to moderate-income individuals are defined as those with incomes below the U.S. Department of Housing and Urban Development’s 1996 MSA median income for this geography, while upper income are defined as those with incomes at or above the MSA median income.19
Descriptive Statistics
Exhibit 3 contains the breakdown of the 2,266 applications by income group. There are 1,698 applications from LMI applicants, accounting for 74.9% of the sample and 568 applications from upper-income applicants, accounting for 25.1% of the sample. As such, the sample has a reasonable balance and sufficient representation for both groups to perform hypothesis testing.20 Exhibit 4 describes the data overall, and then stratified by income group. From the mean difference test on application income, it can be seen that LMI applicants have significantly lower incomes than upper-income applicants ($23,176 vs. $81,267), as expected. The mean difference tests for credit bureau scores and for custom credit scores are more revealing, however, as LMI applicants overall have significantly lower credit scores relative to upper-income applicants for both the custom (189 vs. 218) and bureau (661 vs. 678) score measures.21 The Kolmogorov-Smirnov tests for differences in the second and higher moments of these two distributions show that the credit bureau score distribution for lower-income applicants is significantly different from the distribution of scores for upper-income applicants. This characteristic holds true for the custom credit score and income distributions, as well.
Exhibit 4 also provides the difference of means tests and distributional comparisons for the individual attributes, or factor loadings, of the custom scorecard. These attributes include: time at current address, number of bank trade lines, finance company credit inquiries, overall credit inquiries, number of times 30-60 days late, applicant income, trade lines opened in less than 1 year, highest revolving credit limit, number of satisfactory credits and age of the credit bureau trade file in months. The two groups, LMI and upper income, reveal significant statistical differences across each of these attributes except finance company credit inquiries and overall credit inquiries. The only scorecard attribute on which LMI applicants fare better is time at current address. This longer stay in residence may indicate a lack of upward mobility by these applicants, providing some evidence that risk characteristics may not be the same across income strata.
Exhibit 5 contains the Pearson correlation coefficients for the credit bureau score, custom credit score and the variables included in the custom scoring model. Not surprisingly, there is a relatively high level of correlation among many of the credit attributes. Exhibit 6 contains the frequencies of application outcomes shown three different ways: actual outcomes, credit bureau-scored outcomes and custom-scored outcomes. Panel A reveals the breakdown of actual outcomes into approvals and denials, by income group as rendered by the lender. Of 1,698 LMI applicants, 890 were approved, representing an approval rate of 52.41%. Of 568 upper-income applicants, 431 were approved, resulting in an approval rate of 75.88%. The difference between these two approval rates, representing a disparity of 23.47% in favor of upper-income applicants, is statistically significant at the 99% level (Chi-square statistic of 96.40).
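The Panel A figures can be checked directly from the counts given in the text. The sketch below assumes the reported statistic is an ordinary Pearson chi-square on the 2x2 table of income group by outcome, without a continuity correction; the paper does not state which variant was used, but this one is consistent with the counts above. The function and variable names are illustrative.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts taken from the text (Exhibit 6, Panel A).
lmi_total, lmi_approved = 1698, 890
upper_total, upper_approved = 568, 431
lmi_denied = lmi_total - lmi_approved        # 808
upper_denied = upper_total - upper_approved  # 137

lmi_rate = lmi_approved / lmi_total
upper_rate = upper_approved / upper_total
chi2 = pearson_chi_square_2x2(lmi_approved, lmi_denied, upper_approved, upper_denied)

print(f"LMI approval rate:   {lmi_rate:.2%}")
print(f"Upper approval rate: {upper_rate:.2%}")
print(f"Disparity: {(upper_rate - lmi_rate) * 100:.2f} percentage points")
print(f"Chi-square: {chi2:.2f}")
```

With one degree of freedom, the 1% critical value is about 6.63, so a statistic near 96 is far beyond the threshold for significance at the 99% level.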
Panel B of Exhibit 6 details the approval and denial breakdown scenario should the applicants have been judged solely on the merits of the credit bureau score, using the cutoff score of 651 to create a denial rate for the group, which is identical to the overall actual denial rate for the sample, 41.7%. In this instance, the disparity in denial rates narrows from the original case of 23.47% to a disparity of 11.4%. This difference in proportions between the two groups is, however, still significant at the 99% level.
The final panel of Exhibit 6, Panel C, contains the approval and denial breakdown scenario should the applicants have been judged solely on the merits of the custom score, using the cutoff score of 191 to recreate the overall actual denial rate of 41.7% in the sample. In this instance, the difference in denial rates between LMI and upper-income applicants widens relative to the two previously calculated disparities. In fact, the LMI and upper income denial rates are 49.59% and 18.84%, respectively. This difference is more than 30 percentage points and represents the widest disparity among these underwriting scenarios.
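The procedure behind Panels B and C (pick the score cutoff that reproduces a target overall denial rate, then compare denial rates across income groups) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: applicants are assumed to be denied when their score falls strictly below the cutoff, the helper names are hypothetical, and the ten scores are invented toy data rather than values from the study.

```python
def cutoff_for_denial_rate(scores, target_rate):
    """Return the observed score whose 'deny if below' rule comes closest to target_rate."""
    def denial_rate(cutoff):
        return sum(s < cutoff for s in scores) / len(scores)

    candidates = sorted(set(scores))
    return min(candidates, key=lambda c: abs(denial_rate(c) - target_rate))


def denial_rates_by_group(scores, groups, cutoff):
    """Share of applicants in each group whose score falls below the cutoff."""
    totals, denials = {}, {}
    for score, group in zip(scores, groups):
        totals[group] = totals.get(group, 0) + 1
        denials[group] = denials.get(group, 0) + (score < cutoff)
    return {group: denials[group] / totals[group] for group in totals}


# Invented toy data, for illustration only.
scores = [150, 170, 185, 190, 200, 210, 215, 225, 230, 240]
groups = ["LMI", "LMI", "LMI", "upper", "LMI", "LMI", "upper", "upper", "LMI", "upper"]

cutoff = cutoff_for_denial_rate(scores, target_rate=0.40)
rates = denial_rates_by_group(scores, groups, cutoff)
disparity = rates["LMI"] - rates["upper"]
print(cutoff, rates, disparity)
```

Holding the overall denial rate fixed is what makes the scenarios comparable: any change in the gap between groups then reflects how each score ranks applicants, not how strict the lender is overall. For the reported custom-score scenario, the same subtraction gives 49.59 - 18.84 = 30.75 percentage points.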
To recap, the results from Exhibit 6 show that the actual bank decisions strike a middle ground between the decisions dictated by the strict application of a custom credit score and the strict application of a credit bureau score. The most noticeable result, however, is the lower disparity that emerges through the strict application of the credit bureau score. This result may reflect the fact that credit bureaus do not have similar information sets relative to the banks (e.g., income, time at address) or alternatively, credit bureaus may be more concerned with disparate impact issues.22 This is an important finding since community groups have often criticized the use of generic bureau scores, claiming they result in a bias against accepting LMI and minority applicants.
Empirical Methods
Since marginal applicants are most likely the ones impacted by changes in underwriting techniques, a similar set of tests for a group of marginal applicants is examined. These marginal applicants represent those that have either characteristics or scores that are closer to the cutoff than the scores of the general applicant population. Marginal applicants are applicants receiving an aggregate custom score of between 195 and 210. This group includes 444 applicants, or 20% of the overall sample. After comparing the disparities across the two underwriting approaches, three factors in the custom score model (the number of finance company inquiries, length of credit history, and applicant income) are identified as driving the denial rate disparity between LMI and upper-income applicants in a custom scoring approach.23
Results
Exhibit 7 contains the results for the judgmental underwriting model for the full sample. The model includes factors that would appear in a judgmental review: (1) a comprehensive measure of creditworthiness provided by the credit bureau score; (2) the number of major credit delinquencies; (3) the number of minor credit delinquencies; and (4) a binary variable for whether the borrower has a prior relationship with this lender.
Panel A of Exhibit 7 contains the factor coefficients derived from the logistic regression, as well as the Chi-square statistics for significance. The panel shows that three of the four coefficients are significant at the 99% level and with the proper sign. The results indicate that having a prior relationship with the bank, a higher bureau score and fewer major credit derogatories lower the likelihood of applicant denial. Panel B contains the summary statistics for this logistic regression. The judgmental model has a concordance, or goodness of fit, of 90.8%, indicating that based on a probability of denial of 41.7%, the model outcomes agree with the actual outcomes 90.8% of the time. Thus, the overall fit of the model is comparatively high and two of the three additional judgmental factors in the model are highly significant. Note that minor credit derogatories and major credit derogatories are highly correlated, likely dampening their individual influence in the regressions.
Using the probability of approval derived from the judgmental model for each applicant and the cutoff of 41.7%, each applicant is classified as either an approval or a denial in Panel C of Exhibit 7, separating the applicants into two groups based on income. The resulting disparity is 13.0%, substantially smaller than the 30.8% disparity presented in Exhibit 6, where the custom score was used as the sole underwriting criterion.
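The classification step can be illustrated with a small logistic scoring sketch. Everything numeric here is an assumption made for illustration: the coefficients are hypothetical and are not the Exhibit 7 estimates, the sign on minor delinquencies is assumed, and the 41.7% cutoff is read as a threshold on the predicted probability of denial. The agreement measure follows the description above (share of cases where the model outcome matches the actual outcome); some statistical packages instead define concordance over pairs of observations, which would give a different number.

```python
import math

# Hypothetical coefficients, for illustration only. They are NOT the Exhibit 7
# estimates; the signs follow the direction of effects described in the text,
# and the positive sign on minor delinquencies is an assumption.
INTERCEPT = 20.0
COEFS = {
    "bureau_score": -0.03,
    "major_delinquencies": 0.8,
    "minor_delinquencies": 0.1,
    "prior_relationship": -1.0,
}
DENIAL_CUTOFF = 0.417  # assumed to apply to the predicted probability of denial


def denial_probability(applicant):
    """Logistic transform of the linear index built from the four factors."""
    z = INTERCEPT + sum(coef * applicant[name] for name, coef in COEFS.items())
    return 1.0 / (1.0 + math.exp(-z))


def classify(applicant, cutoff=DENIAL_CUTOFF):
    """Deny when the predicted denial probability reaches the cutoff."""
    return "deny" if denial_probability(applicant) >= cutoff else "approve"


def agreement_rate(predicted, actual):
    """Share of applications where the model decision matches the actual one."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Two invented applicants.
strong = {"bureau_score": 700, "major_delinquencies": 0,
          "minor_delinquencies": 0, "prior_relationship": 1}
weak = {"bureau_score": 620, "major_delinquencies": 2,
        "minor_delinquencies": 1, "prior_relationship": 0}

for name, applicant in (("strong", strong), ("weak", weak)):
    print(name, round(denial_probability(applicant), 3), classify(applicant))
```

Once every applicant has a model decision, the denial rates by income group follow from simple counts, which is how the disparity in Panel C is formed.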
Exhibit 8 contains the logistic regression results from using the ten factors from the custom score underwriting model. When interpreting the factor coefficients shown in Panel A, remember that the variables are scorecard points so that all attributes should have a negative sign. For example, the higher the Number of 30-to-60-day late payments, the fewer the points the applicant receives on that attribute and the higher the likelihood of denial. The only factor in the model that is insignificant is Time at current address, for which LMI applicants have higher scores. The summary statistics in Panel B reveal that the model has a concordance, or goodness of fit, of 91.4%.
Panel C of Exhibit 8 contains the denial rate outcomes. For LMI applicants, there are 814 denials, a denial rate of 47.9%. For upper-income applicants, there are 132 denials, a denial rate of 23.2%. The resulting denial disparity between these two groups of 24.7% is roughly double the judgmental model disparity reported in Exhibit 7.
In sum, both approaches (judgmental and scorecard) result in significant disparities between LMI and upper-income applicants. On further review, however, the results show that the smaller of the two disparities is for the machine-replicated judgmental credit underwriting approach, affirming the null hypothesis that judgmental systems reduce the denial disparity between low-income and upper-income applicants over a custom credit scoring system, ceteris paribus.
As previously discussed, the merits of credit scoring are best examined by reviewing marginal applicants. These are the applicants with credit scores that are at or near the cutoff for denial. For the tests on the marginal sample, the breakdown is shown across income groups in Exhibit 9. In this sub-sample of 444 applicants, there are 317 LMI applicants and 127 upper-income applicants. For the LMI group, the actual denial percentage from the decision file is 23.7%, and for the upper-income group, the denial percentage is 22.1%. As expected for the marginal sample, these two proportions across income groups are not statistically different at the weakest permissible statistical level of 90%.
Exhibit 10 provides the difference of means tests and distributional comparisons for the custom and bureau scores for the 444 marginal applicants stratified by LMI versus upper income. As expected, differences between these marginal LMI and upper-income applicants are not as pronounced as those differences found for the full sample. The two income groups, however, are ranked differently by the two credit scoring methods. LMI marginal applicants have significantly higher credit bureau scores, but significantly lower custom scores when compared with upper-income applicants.
Exhibit 10 also provides the difference of means tests and distributional comparisons for the individual custom scorecard attributes by income grouping. Of these differences, marginal LMI applicants actually fare better than upper-income applicants on a few of the metrics, including time at current address, number of credit inquiries, number of credit derogatories and age of the trade file in months.
The same machine-replicated judgmental underwriting model constructed for the overall sample in Exhibit 7 is applied to the sub-population of marginal applicants (see Exhibit 11). Panel A of Exhibit 11 contains the information on parameter estimates and significance for this subsample. These variables display different strengths and significance levels when compared with the same coefficients and tests for the full sample. For example, the number of minor derogatory credits is not a significant determinant of credit denial for these marginal applicants, while the credit bureau score and the existence of a prior relationship with the lender remain significant. Panel B contains the summary statistics from the logistic regression, revealing a concordance of 70.6% for this model.
Using the lender’s cutoff of 23.4% for the entire marginal sample, each of the applicants is classified as either an approval or a denial in Panel C of Exhibit 11. Of the 317 LMI applicants, 63 are denied using this rule, resulting in a denial rate of 19.9%. Forty-one of the 127 upper-income applicants are denied using this rule, resulting in a denial rate of 32.3%. The stated disparity is 12.4% in favor of LMI applicants, and is statistically significant.
Exhibit 12 shows the results for the ten-factor custom score model for the marginal sample. Here the disparity remains in favor of LMI applicants, but the size of the disparity is reduced by more than 60% to 4.7%. The LMI denial rate increases to 22.1% from 19.9% in the custom model, while the upper-income denial rate decreases to 26.8% from 32.3%. This result provides further support for the null hypothesis that judgmental systems result in lower denial disparity between low-income and upper-income applicants compared to custom systems.
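The arithmetic behind the marginal-sample comparison can be laid out explicitly. The sketch below uses only figures reported in the text: the judgmental-model denial counts (63 of 317 LMI applicants and 41 of 127 upper-income applicants) and the custom-model denial rates (22.1% and 26.8%). The variable names are illustrative.

```python
# Judgmental model, marginal sample: denial counts reported in the text.
lmi_judgmental = 63 / 317     # about 19.9%
upper_judgmental = 41 / 127   # about 32.3%
disparity_judgmental = (upper_judgmental - lmi_judgmental) * 100

# Ten-factor custom model, marginal sample: denial rates reported in the text.
lmi_custom, upper_custom = 22.1, 26.8
disparity_custom = upper_custom - lmi_custom

# Relative shrinkage of the gap that favors LMI applicants.
reduction = (disparity_judgmental - disparity_custom) / disparity_judgmental

print(f"Judgmental gap: {disparity_judgmental:.1f} points in favor of LMI")
print(f"Custom gap:     {disparity_custom:.1f} points in favor of LMI")
print(f"Reduction:      {reduction:.0%}")
```

In both models the upper-income denial rate is the higher one in this sub-sample, so the gap is expressed as upper-income minus LMI.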
Exhibit 13 shows the outcomes for a seven-factor custom model that eliminates the three types of variables argued by credit-scoring detractors as likely to result in disparate impact. These variables include Applicant income, the number of finance company inquiries and the highest revolving credit limit. Detractors have argued that income and credit limits are not robust predictors of creditworthiness and therefore should be scaled in a manner that better reflects the borrower’s ability to pay, such as the debt-to-income ratio and the current debt-to-credit limit. Finally, given that low-income and minority applicants are more likely to use nontraditional financial providers (e.g., finance companies), the inclusion of the number of finance company inquiries has also been attacked as having the potential to disparately impact these groups by creating abnormally high rates of incidence.
The custom score factor model from Exhibit 12 is re-estimated in Exhibit 13 after omitting these three variables. Panel A of Exhibit 13 contains the outcomes for the full sample of credit applicants, while Panel C contains the outcomes for the marginal group. For both samples, the seven-factor model results in significantly lower disparities than those derived from the full ten-factor custom model, confirming the hypothesis that the use of these three excluded variables increases the likelihood of denial for LMI applicants. For example, the full sample disparity of 14.4% in the seven-factor model compares with a denial disparity of 24.7% from application of the ten-factor model. Similarly, the marginal group disparity is 10.5% in favor of LMI applicants in the seven-factor model, which is more than twice the 4.7% favorable disparity of the ten-factor model. When comparing these custom model disparities with the outcomes of the judgmental model in both the full sample and the marginal sample, however, the judgmental results are still more favorable to LMI applicants.
Conclusion
As a result of the underwriting evolution toward the use of credit scoring in mortgage lending, scoring is at the forefront of the policy debate surrounding fair lending and potential disparate impact. Using 1996 loan application data on home improvement loans from a large commercial bank, a framework is developed to examine whether this system of credit scoring leads to more significant denial disparities between LMI applicants and upper-income applicants when compared with disparities observed from a judgmental underwriting approach. Custom credit scoring systems are hypothesized to result in larger disparities for LMI applicants, since these models neglect compensating or nontraditional credit factors that are more common for LMI applicants. The findings confirm this hypothesis. For example, the results from the logistic regression models of denial for the entire sample indicate LMI applicants fare significantly worse in a customized scoring environment as compared to a machine-replicated judgmental regression model. When restricting the sample to marginal applicants with credit scores around the denial cutoff level, the LMI applicants fare better in the judgmental system than in the custom credit score system.
These findings are important for the current policy debate over the impact of credit scoring on LMI applicants. Proponents of credit scoring technology point out that scoring improves the objectivity of the loan decision and lowers the overall cost and time required to underwrite loans. Scoring detractors, however, are concerned that these models lack sufficient flexibility and often omit information important to the credit profile of LMI and minority applicants. The findings lend support to the latter argument. In sum, use of a custom credit score as the sole criterion in underwriting home improvement loan applications results in larger denial disparities between LMI and upper-income applicants, ceteris paribus.
Finally, the results have important implications for bank supervision. Currently, bank supervisors, financial regulators and researchers focus their fair lending concern on dealing with disparate treatment. The results suggest that the next generation of fair lending research, however, should begin to address issues related to the potential disparate impact issues of credit scoring. This need is especially high for internally developed scoring models that have not been subject to much external scrutiny.
Although this study focuses on the implementation of a single credit scoring model and the resulting underwriting disparities, future research must extend these results with loan performance data. Such research would determine if the inclusion of potentially discriminatory variables resulted from business necessity, in that these variables significantly influence the likelihood of default or default loss. If the potentially discriminatory variable(s) show a strong relation to delinquency or default, research should assess the adequacy of alternative credit scoring variables that have a smaller adverse impact on certain segments of the population while maintaining or improving the predictiveness of delinquency or default.
Endnotes
1 See Lewis (1994) for a history of credit scoring models. More recently, Mays (2001) examines development, validation and implementation issues related to scoring that includes a section on generic and customized models.
2 Various statistical methods are employed in scoring systems, including linear programming, neural networks, logit analysis and discriminant analysis. The statistical methodology employed depends on the expertise of the scorecard developer.
3 For an excellent review of the adoption of credit scoring in the mortgage lending industry, see Straka (2001).
4 Fair Isaac, one of the primary developers of scoring models employed by banks, estimates that when a bank changes from a judgmental to a scoring system they have a 20%-30% increase in the number of applicants accepted with no increase in the loss rate. This is in addition to the reduction in processing costs and faster turnaround time.
5 Traditional consumers include upper income individuals with fairly extensive and long lasting credit histories.
6 Banks and the credit bureaus do not collect or employ nontraditional forms of creditworthiness, such as information on payment history of utility bills and rent payments, in their scoring models. Thus, it is argued by detractors of credit scoring that these models may not gauge adequately the true risk of this segment of the population.
7 In recent years, these issues have been compounded by the fact that some sub-prime lenders, which typically serve nontraditional groups, have neglected to report positive information on payment histories to credit bureaus to keep these profitable, but “high” risk, customers captive. See “Credit Bureaus Move Against Lenders that Withhold Information,” American Banker, December 30, 1999.
8 One of the motivations for adopting scoring systems is to reduce the likelihood of disparate treatment. The cultural affinity hypothesis suggests that marginal applicants may be more stringently evaluated if their background differs from that of the underwriter (Hunter and Walker, 1996). This occurs in a judgmental system if the “judgment” is not applied uniformly across protected classes. Discretion in overriding a credit scored outcome, however, when based on additional qualitative information, also offers the prospect for improving the likelihood of repayment. This augmented information set will be especially important for applicants who have positive attributes not captured by the scoring model.
9 Given that the authors do not possess loan performance data for the sample, an assessment of whether the inclusion of these variables results from “business necessity” cannot be made.
10 This study is solely concerned with the influence of scoring on the approval/denial decision. Scoring type models are used by banks for various other functions including increasing or decreasing credit lines or loan rates and in the loan monitoring process.
11 Banks typically use small business scoring systems in this manner.
12 Banks typically use home improvement scoring systems in this manner.
13 If banks implementing scoring models permit “overrides” of decisions, however, then it is still possible for disparate treatment issues to arise.
14 Religion, national origin, sex, marital status and receipt of public assistance are also prohibited as a basis for lending decisions under the Equal Credit Opportunity Act.
15 If the bank does not have historical performance data to build a customized model, the developers can pull data from one of the credit bureaus using criteria that will match the overall demographics of the bank’s applicant pool.
16 In fact, no single generic “bureau” model scores all applicants. The credit depositories have approximately fifteen bureau models for different segments of the population. For example, there is a thick file sub-population, a thin file sub-population and a highly delinquent sub-population. The scores of each of these cards are risk adjusted so they have the same odds of being “good” or “bad.”
17 Van Order and Zorn (2000) examine mortgage loan default and loss rates by income levels and find that lower income neighborhoods experience somewhat higher loss and default rates. Mills and Lubuele (1994), using a limited data set, conclude that LMI mortgages perform better than their high income counterparts. Neither of these studies, however, controls for applicant credit history.
18 While low-income applicants are not a protected class under Title VIII of the Civil Rights Act, the duties of a financial institution in supporting the broader community, including low- and moderate-income applicants, are covered under the 1977 Community Reinvestment Act and Regulation B, which implements the Equal Credit Opportunity Act (ECOA). Community activist groups frequently cite alleged CRA violations in protests of merger applications by financial institutions. The authors also conducted an analysis of disparities by applicant race. The income results are far stronger and suggest more significant disparities in credit scoring than the race-based analysis would indicate.
19 The median MSA income is not reported to protect the identity of the lender.
20 The data represent applications in a large urban geography with a high proportion of LMI applicants.
21 Bureau scores can range anywhere from 400 to more than 800, while the custom card under investigation is scaled in a different manner and ranges from 50 to 276.
22 Fair lending exams at commercial banks typically analyze disparate treatment issues, with very little focus on disparate impact.
23 Detractors of scoring models have argued for the use of debt-to-income, rather than income, as a measure of ability to pay. Separately, other detractors have argued that both finance company inquiries and income should not be used in models due to their high correlation with applicant race.
References
Automated Underwriting: Making Mortgage Lending Simpler and Fairer for America’s Families, Freddie Mac Report, 1996.
Avery, R., R. W. Bostic, P. Calem and G. Canner, Credit Risk, Credit Scoring, and the Performance of Home Mortgages, Federal Reserve Bulletin, 1996, 82:7, 621-48.
Avery, R., R. W. Bostic, P. Calem and G. Canner, Credit Scoring: Statistical Issues and Evidence, Real Estate Economics, 2000, 28:3, 523-47.
Credit Bureaus Move Against Lenders That Withhold Information, American Banker, December 30, 1999.
Ferguson, M. F. and S. Peters, What Constitutes Evidence of Discrimination in Lending, Journal of Finance, 1995, 50, 739-48.
Hunter, W. C., and M. B. Walker, The Cultural Affinity Hypothesis and Mortgage Lending Decisions, Journal of Real Estate Finance and Economics, 1996, 13, 57-70.
Lewis, E. M., An Introduction to Credit Scoring, The Athena Press, 1994.
Mays, E., Editor, Handbook of Credit Scoring, Glenlake Publishing Company, 2001.
Measuring Credit Risk: Borrower Credit Scores and Lender Profiles, Fannie Mae Letter, 1995, LL09-95.
Mills, E. S. and L. S. Lubuele, The Performance of Residential Mortgages in Low and Moderate Income Neighborhoods, Journal of Real Estate Finance and Economics, 1994, 9:3, 245-60.
Pierzchalski, L., Guarding Against Risk, Mortgage Banker, June 1996, 38-45.
The Predictive Power of Selected Credit Scores, Freddie Mac Letter, 1995, McLean, Virginia: Freddie Mac.
Straka, J. W., A Shift in the Mortgage Landscape: The 1990s Move to Automated Credit Evaluations, Journal of Housing Research, 2001, 11, 207-232.
Van Order, R. and P. Zorn, Income, Location and Default: Some Implications for Community Lending, Real Estate Economics, 2000, 28:3, 385-404.
The opinions expressed are those of the authors and do not necessarily represent those of the Office of the Comptroller of the Currency. The authors thank William Lang, David Nebhut and Gary Whalen for useful comments.
Authors M. Cary Collins, Keith D. Harvey and Peter J. Nigro
M. Cary Collins, University of Tennessee, Knoxville, Tennessee 37996 or MCollin6@utk.edu.
Keith D. Harvey, Boise State University, Boise, Idaho 83706 or kharvey@boisestate.edu.
Peter J. Nigro, The Office of the Comptroller of the Currency, Washington, D.C. 20219 or peter.nigro@occ.treas.gov.
Copyright American Real Estate Society Sep/Oct 2002