The risk posed by a credit applicant is the probability of default on repayment of a credit facility extended by a commercial bank. Credit scoring models are developed to improve the efficiency of decision making on credit risk. The objectives of this research are to classify credit applicants using cluster analysis, Artificial Neural Network and K-Nearest Neighbour techniques and to compare their predictive accuracy. The dataset was first split, with 70% of the data used for training and the remaining 30% used for testing. Finally, the ability of the developed models to forecast trends was investigated. Here we assume that a cluster is homogeneous if it contains members that have a high degree of similarity. The analysis is based on credit data provided by commercial banks in Kenya, which was used to test the effectiveness of the cluster analysis, K-Nearest Neighbour (K-NN) and artificial neural network (ANN) models. A confusion matrix was used to determine the best model in terms of classification accuracy, and the chi-square test was used to test goodness of fit. From the results of the study, the researchers concluded that ANN was better at predicting the classification of credit applicants than K-NN and cluster analysis.
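The evaluation workflow described in the abstract (a 70/30 train/test split, a K-NN classifier, and a confusion matrix summarised by an overall accuracy rate) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic two-feature applicants, the "good"/"bad" labels, the Euclidean distance, the choice of k = 5 and all function names are assumptions made purely for illustration, since the Kenyan bank data and the model settings are not given here.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def confusion_matrix(actual, predicted, labels):
    """Rows index the actual class, columns index the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of cases on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def make_synthetic_applicants(n=200, seed=1):
    """Hypothetical stand-in data: two Gaussian clouds, one per class."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = "good" if i % 2 == 0 else "bad"
        centre = 2.0 if label == "good" else -2.0
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        rows.append((features, label))
    return rows


rows = make_synthetic_applicants()
train, test = train_test_split(rows)          # 70% training, 30% testing
actual = [label for _, label in test]
predicted = [knn_predict(train, x, k=5) for x, _ in test]
matrix = confusion_matrix(actual, predicted, ["good", "bad"])
accuracy = overall_accuracy(matrix)
print(matrix, accuracy)
```

The same confusion-matrix and accuracy functions could be reused to compare any of the three classifiers on the held-out 30%, which is the kind of comparison the abstract reports.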
Published in | American Journal of Theoretical and Applied Statistics (Volume 5, Issue 4) |
DOI | 10.11648/j.ajtas.20160504.14 |
Page(s) | 186-191 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2016. Published by Science Publishing Group |
Cluster Analysis, ANN: Artificial Neural Network, K-NN: K-Nearest Neighbour, Credit Risk, Overall Accuracy Rate, SSE: Sum of Square Errors
APA Style
Mutua Jennifer Ndanu, Gichuhi Anthony Waititu, Wanjoya Anthony Kiberia, Muia Patricia Nthoki. (2016). Cluster Analysis, K-Nearest Neighbour and Artificial Neural Network Applied to Credit Data to Classify Credit Applicants. American Journal of Theoretical and Applied Statistics, 5(4), 186-191. https://doi.org/10.11648/j.ajtas.20160504.14
Publication date | 2016/06/07 |
Publisher | Science Publishing Group |
ISSN | 2326-9006 |