Feature selection plays a significant part in medical data processing and mining: it can reduce the dimensionality of datasets, enhance classifier performance, and substantially aid clinical decision support. At present, clinical decision support relies largely on physicians' subjective judgment based on clinical knowledge, which may hinder diagnosis and treatment. This paper outlines the performance of the GCFS (Genetic Correlation-based Feature Selection) algorithm in processing and mining medical data, using medical datasets from the UCI repository to demonstrate how feature selection improves data classification. GCFS is compared with CFS (Correlation-based Feature Selection) and GA (Genetic Algorithm), with ensemble learning methods employed as the test classifiers. The results show that, in almost all cases, GCFS improves the performance of the test classifiers more than CFS or GA.
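As a rough illustration of the idea behind combining a genetic search with a correlation-based criterion, the following minimal Python sketch uses the standard CFS merit heuristic, k * r_cf / sqrt(k + k * (k - 1) * r_ff), as the fitness function of a simple genetic algorithm over feature bitmasks. It is not the authors' implementation: the abstract does not specify GCFS in detail, so the function names (pearson, cfs_merit, ga_select), the use of absolute Pearson correlation as the correlation measure, truncation selection, one-point crossover, and all parameter values are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Absolute Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / math.sqrt(vx * vy))


def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean feature-class correlation and r_ff the mean
    feature-feature correlation (Pearson here, as an assumption).
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(pearson(columns[i], labels) for i in subset) / k
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    if pairs:
        r_ff = sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def ga_select(columns, labels, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Hypothetical GA wrapper: evolve bitmasks, scoring each by CFS merit.

    Requires at least two feature columns (for one-point crossover).
    """
    rng = random.Random(seed)
    n = len(columns)

    def fitness(mask):
        return cfs_merit(columns, labels, [i for i, bit in enumerate(mask) if bit])

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover position
            child = a[:cut] + b[cut:]
            # flip each bit independently with probability p_mut
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit]
```

Calling ga_select(columns, labels) with columns given as a list of per-feature value lists returns the indices of the selected features. In the evaluation described above, the selected features would then be passed to the ensemble classifiers; that step is omitted here.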
Published in | American Journal of Software Engineering and Applications (Volume 3, Issue 6) |
DOI | 10.11648/j.ajsea.20140306.11 |
Page(s) | 68-73 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2014. Published by Science Publishing Group |
Keywords | Feature Selection, GCFS, Ensemble Learning |
APA Style
Xiao Yu Chen, Bo Liu, Zhe Feng Zhang, Xin Xia. (2014). The Analysis of GCFS Algorithm in Medical Data Processing and Mining. American Journal of Software Engineering and Applications, 3(6), 68-73. https://doi.org/10.11648/j.ajsea.20140306.11
ACS Style
Xiao Yu Chen; Bo Liu; Zhe Feng Zhang; Xin Xia. The Analysis of GCFS Algorithm in Medical Data Processing and Mining. Am. J. Softw. Eng. Appl. 2014, 3(6), 68-73. doi: 10.11648/j.ajsea.20140306.11
AMA Style
Xiao Yu Chen, Bo Liu, Zhe Feng Zhang, Xin Xia. The Analysis of GCFS Algorithm in Medical Data Processing and Mining. Am J Softw Eng Appl. 2014;3(6):68-73. doi: 10.11648/j.ajsea.20140306.11
@article{10.11648/j.ajsea.20140306.11,
  author = {Xiao Yu Chen and Bo Liu and Zhe Feng Zhang and Xin Xia},
  title = {The Analysis of GCFS Algorithm in Medical Data Processing and Mining},
  journal = {American Journal of Software Engineering and Applications},
  volume = {3},
  number = {6},
  pages = {68-73},
  doi = {10.11648/j.ajsea.20140306.11},
  url = {https://doi.org/10.11648/j.ajsea.20140306.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajsea.20140306.11},
  year = {2014}
}
TY  - JOUR
T1  - The Analysis of GCFS Algorithm in Medical Data Processing and Mining
AU  - Xiao Yu Chen
AU  - Bo Liu
AU  - Zhe Feng Zhang
AU  - Xin Xia
Y1  - 2014/12/05
PY  - 2014
N1  - https://doi.org/10.11648/j.ajsea.20140306.11
DO  - 10.11648/j.ajsea.20140306.11
T2  - American Journal of Software Engineering and Applications
JF  - American Journal of Software Engineering and Applications
JO  - American Journal of Software Engineering and Applications
SP  - 68
EP  - 73
PB  - Science Publishing Group
SN  - 2327-249X
UR  - https://doi.org/10.11648/j.ajsea.20140306.11
VL  - 3
IS  - 6
ER  -