Here at P2P Analytics, we’ve received some emails asking the following:
1. What strategy does the Lending Club selection algorithm implement to rank notes?
2. And what exactly is the score that we see in the distribution? What does this score signify?
To answer the first question:
The first step in constructing the algorithm to estimate default probability was to find the top 12 factors that most strongly influenced default rates:
Housing status (Home, Mortgage, Rent)
Job time length
Borrower Text description (and whether it was entered)
Total Credit Lines
Type of Loan (Consolidation, Green, Car, etc)
Earliest Credit Line
Number of Credit Inquiries
Months since last delinquency
The second step was to check the effect of each of these variables within each grade. For example, housing status was checked for its effect on default rates across all A, B, C, etc. loans, and a table was built describing how much each factor influenced default rates (for better or for worse). Each time a variable was checked within a grade, all of the notes ever issued in that grade (except the very recent notes) were used to make the determination.
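To make the second step concrete, here is a minimal Python sketch of how such a table could be built for one factor. It is an illustration only, not the actual P2P Analytics code: the function name `factor_effects`, the loan field names (`grade`, `defaulted`, `housing`), and the choice to express the effect as the difference from the grade's overall default rate are all assumptions made for this example.

```python
from collections import defaultdict


def factor_effects(loans, factor):
    """Map (grade, factor value) to its default rate minus the grade's overall rate.

    `loans` is a list of dicts with hypothetical keys: 'grade', 'defaulted'
    (1 or 0), and the factor being examined (for example 'housing').
    A positive result means that value defaults more often than the grade as a whole.
    """
    grade_counts = defaultdict(lambda: [0, 0])  # grade -> [defaults, total]
    cell_counts = defaultdict(lambda: [0, 0])   # (grade, value) -> [defaults, total]

    for loan in loans:
        grade = loan["grade"]
        cell = (grade, loan[factor])
        grade_counts[grade][0] += loan["defaulted"]
        grade_counts[grade][1] += 1
        cell_counts[cell][0] += loan["defaulted"]
        cell_counts[cell][1] += 1

    effects = {}
    for (grade, value), (defaults, total) in cell_counts.items():
        grade_defaults, grade_total = grade_counts[grade]
        effects[(grade, value)] = defaults / total - grade_defaults / grade_total
    return effects


# Tiny made-up sample; real tables would be built from thousands of issued notes.
sample_loans = [
    {"grade": "A", "housing": "RENT", "defaulted": 1},
    {"grade": "A", "housing": "RENT", "defaulted": 0},
    {"grade": "A", "housing": "MORTGAGE", "defaulted": 0},
    {"grade": "A", "housing": "MORTGAGE", "defaulted": 0},
]
housing_effects = factor_effects(sample_loans, "housing")
```

In this sketch, very recent notes would simply be filtered out of `loans` before calling the function, since they have not had time to default.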
A few things to keep in mind about the Lending Club algorithm:
Each variable’s effect was computed from all loans ever issued in each grade. In other words, thousands of samples were used to determine how strongly each variable correlates with default. Using twelve correlating factors also increases the accuracy of the algorithm’s determinations. Imagine if the algorithm took into account only 3 or 4 factors: it would probably still be a better strategy than just guessing when selecting notes, but it would be a very coarse selection method.
To answer the second question, the score is the output of the algorithm. It’s a rough estimate of the probability that a particular note will default. The lower this number is, the better. As the algorithm currently stands, more tweaking is needed before the score closely matches actual default rates, but overall the algorithm is very good at comparing notes that are currently in funding.
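As a rough illustration of how a score like this can be used to compare notes, here is a short Python sketch. The way the per-factor effects are combined is not spelled out above, so the additive rule below (grade baseline plus each factor's adjustment) is purely an assumption for the example, as are the names `score_note` and `rank_notes`, the field names, and every number shown.

```python
def score_note(note, grade_baseline, effects):
    """Rough default-probability estimate for one note (lower is better).

    Assumed rule: start from the grade's baseline default rate and add the
    adjustment for each factor value. This is an illustrative guess, not the
    actual combination used by the algorithm.
    """
    grade = note["grade"]
    score = grade_baseline[grade]
    for factor, table in effects.items():
        # A (grade, value) pair missing from the table contributes no adjustment.
        score += table.get((grade, note[factor]), 0.0)
    # Clamp so the result can still be read as a probability.
    return min(1.0, max(0.0, score))


def rank_notes(notes, grade_baseline, effects):
    """Return notes ordered from lowest (best) score to highest."""
    return sorted(notes, key=lambda n: score_note(n, grade_baseline, effects))


# Hypothetical tables and notes, invented for this example.
grade_baseline = {"B": 0.25}
effects = {"housing": {("B", "RENT"): 0.125, ("B", "MORTGAGE"): -0.125}}
notes = [
    {"id": 1, "grade": "B", "housing": "RENT"},
    {"id": 2, "grade": "B", "housing": "MORTGAGE"},
]
ranked_ids = [n["id"] for n in rank_notes(notes, grade_baseline, effects)]
```

Even if the absolute values are off, a score built this way can still order notes sensibly, which matches the point that the score is more reliable for comparing notes than for predicting exact default rates.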