By Meurig Chapman

What is the optimum sample size to build an application scorecard?

Determining the optimum sample size for building an application scorecard is a critical step in credit risk assessment. We explore the factors that influence the choice of sample size and how to strike the right balance to ensure a robust and reliable scorecard.

Building an application scorecard is a crucial step in credit risk assessment and lending decisions. A well-constructed scorecard relies on a sample of historical data to accurately predict an applicant’s creditworthiness. However, determining the optimum sample size is a critical consideration in this process. It involves striking a balance between statistical precision, data quality, available resources, and model complexity. While there are guidelines and statistical approaches to guide this process, it’s important to continuously monitor model performance and adjust sample sizes if necessary. Ultimately, a well-calibrated sample size ensures the development of a robust and reliable scorecard that supports informed lending decisions and risk management practices.

Understanding the application scorecard

An application scorecard is a statistical model used by lenders to evaluate the creditworthiness of individuals or businesses applying for credit. It assigns a numerical score or probability to each applicant based on their application information, such as income, credit history, employment status, and other relevant factors. This score helps lenders make informed decisions about whether to approve or decline credit applications.
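To make the idea concrete, here is a toy sketch of a points-based scorecard. The attributes, bins, and point values below are invented purely for illustration; a real scorecard derives them statistically from historical data.

```python
# Toy points-based application scorecard.
# All bins and point values below are invented for illustration only.
SCORECARD = {
    "income": [(0, 20_000, 10), (20_000, 50_000, 25), (50_000, float("inf"), 40)],
    "years_employed": [(0, 1, 5), (1, 5, 20), (5, float("inf"), 35)],
    "prior_defaults": [(0, 1, 40), (1, 3, 15), (3, float("inf"), 0)],
}

def score_applicant(applicant: dict) -> int:
    """Sum the points for the bin each application attribute falls into."""
    total = 0
    for attribute, bins in SCORECARD.items():
        value = applicant[attribute]
        for lower, upper, points in bins:
            if lower <= value < upper:
                total += points
                break
    return total

applicant = {"income": 42_000, "years_employed": 3, "prior_defaults": 0}
print(score_applicant(applicant))  # 25 + 20 + 40 = 85
```

A lender would then compare the resulting score to a cut-off to approve or decline the application.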

Factors influencing sample size

The choice of sample size for building an application scorecard is influenced by several factors.

The total number of applicants in the population plays a role in determining the sample size. A larger population may require a larger sample to ensure representativeness. The expected default rate in the population is a critical factor. If the expected default rate is low, a larger sample may be needed to capture a sufficient number of defaults for model development.

The desired level of confidence in the model's accuracy affects the sample size. A higher confidence level typically requires a larger sample. The acceptable margin of error in the model's predictions is a consideration. A smaller margin of error may necessitate a larger sample.
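These four inputs (population proportion, confidence level, margin of error) can be combined with a standard formula for estimating a proportion, often attributed to Cochran: n = z²·p(1−p)/e². This is a general statistical sizing formula, not a rule specific to scorecards, and it sizes the sample for estimating the default rate rather than for fitting a full model.

```python
import math
from statistics import NormalDist

def required_sample_size(default_rate: float, confidence: float,
                         margin_of_error: float) -> int:
    """Cochran's formula for estimating a proportion: n = z^2 * p(1-p) / e^2."""
    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * default_rate * (1 - default_rate) / margin_of_error ** 2
    return math.ceil(n)

# Example: 5% expected default rate, 95% confidence, 1% margin of error.
print(required_sample_size(0.05, 0.95, 0.01))  # 1825
```

Halving the margin of error roughly quadruples the required sample, which is why tighter precision targets drive sample sizes up so quickly.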

The number of predictor variables used in the scorecard also impacts the sample size: models with more variables, or more elaborate scoring logic, generally require larger samples for stable and reliable estimation.
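One widely cited heuristic for linking model complexity to sample size (not stated in this article, so treat it as an assumption) is the "events per variable" rule: the rarer outcome class, here defaults, should supply roughly ten observations per predictor. A sketch under that assumption:

```python
import math

def sample_size_for_epv(num_predictors: int, default_rate: float,
                        events_per_variable: int = 10) -> int:
    """Minimum sample so the expected number of defaults supports the model.

    Uses the common 'events per variable' heuristic: the rarer class
    (defaults) should supply roughly `events_per_variable` observations
    for each predictor variable in the model.
    """
    required_defaults = num_predictors * events_per_variable
    return math.ceil(required_defaults / default_rate)

# 12 predictors at a 4% expected default rate -> 120 defaults needed.
print(sample_size_for_epv(12, 0.04))  # 3000
```

Note how a low default rate inflates the total sample: the defaults, not the overall count, are usually the binding constraint.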

The quality and completeness of the data available for model development also matter. In cases of limited data quality, a larger sample may be needed to mitigate potential biases.

Determining the optimum sample size

Finding the optimum sample size for building an application scorecard involves a balance between precision and practicality. Here are steps to guide the process:

  • Conduct a power analysis: A power analysis helps estimate the sample size needed to achieve a desired level of statistical power. This analysis considers factors like population size, expected default rate, confidence level, and margin of error.

  • Assess data quality: Evaluate the quality, completeness, and reliability of the available data. Poor data quality may necessitate a larger sample to compensate for potential biases.

  • Consider resource constraints: Assess the available resources, including time, budget, and personnel. Practical constraints may influence the feasible sample size.

  • Conduct sample size sensitivity analysis: Perform sensitivity analyses to assess how variations in sample size affect the model's performance. This helps strike a balance between sample size and model stability.

  • Monitor model performance: Continuously monitor the model's performance during development. If the model's performance is unsatisfactory, consider adjusting the sample size or other model parameters.

  • Validate the model: After model development, validate its performance on an independent dataset to ensure that the chosen sample size has resulted in a reliable scorecard.
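The sensitivity-analysis step above can be sketched with synthetic data. Everything below is invented for illustration: the "true" income-to-default relationship is made up, and the statistic tracked (the observed default rate) stands in for a full model-performance metric, since the point is simply to watch an estimate stabilise as the sample grows.

```python
import random

def default_probability(income: float) -> float:
    # Synthetic "true" relationship: lower income, higher default risk.
    return 0.4 if income < 30_000 else 0.1

def simulate_applicants(n: int, rng: random.Random) -> list:
    """Generate n synthetic (income, defaulted) records."""
    data = []
    for _ in range(n):
        income = rng.uniform(10_000, 80_000)
        defaulted = 1 if rng.random() < default_probability(income) else 0
        data.append((income, defaulted))
    return data

def observed_default_rate(data) -> float:
    return sum(d for _, d in data) / len(data)

def sensitivity_analysis(sample_sizes, rng) -> dict:
    """Estimate how a simple statistic stabilises as sample size grows."""
    return {n: observed_default_rate(simulate_applicants(n, rng))
            for n in sample_sizes}

rng = random.Random(42)
for n, rate in sensitivity_analysis([200, 1000, 5000, 20000], rng).items():
    print(f"n={n:>6}: observed default rate = {rate:.3f}")
```

In practice the same loop would refit the scorecard at each sample size and track a discrimination metric such as Gini or AUC instead of a raw default rate.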

Common approaches to sample size

While the optimum sample size varies depending on the specific circumstances, some common approaches are used in practice.

A common rule of thumb is to have at least 500 to 1,000 observations for binary classification tasks (approve/decline), but this may vary based on the factors mentioned earlier.

Conducting a power analysis, as mentioned earlier, provides a more data-driven approach to determining sample size based on desired statistical power.

Cross-validation techniques, such as k-fold cross-validation, help assess how the model performs on different subsets of the data, providing insights into sample size requirements.
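As a sketch of how k-fold splitting partitions a dataset, here is a hand-rolled version rather than any particular library's implementation:

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for each of k folds."""
    # Spread any remainder across the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# With 10 samples and 5 folds, each fold holds out 2 samples.
for train, test in k_fold_indices(10, 5):
    print(test)  # [0, 1], then [2, 3], ..., then [8, 9]
```

If performance varies widely across folds, the sample is likely too small (or too unevenly distributed) for the model's complexity.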
