A performance comparison using principal component analysis and differential evolution on fuzzy c-means and k-harmonic means
Keywords:
principal component analysis, fuzzy c-means, k-harmonic means, differential evolution
Abstract
Several clustering studies have attempted to optimize the choice of initial clusters in order to avoid local optima. However, such optimization may not significantly improve accuracy; on the contrary, it usually incurs substantial additional runtime. In addition, it may itself lead to local traps rather than providing a proper cluster initialization. An alternative is to focus on the problems of high dimensionality, noise, and outliers hidden in real-world data. These difficulties can seriously impair many types of learning, including clustering. Feature reduction is one approach to relieving such problems. This paper therefore presents a performance comparison of principal component analysis (PCA) and differential evolution (DE) applied to fuzzy clustering. The purpose is to evaluate the effect of feature reduction against that of optimizing the clustering environment. Two fuzzy clustering approaches, fuzzy c-means (FCM) and k-harmonic means (KHM), are examined. FCM and KHM are soft clustering algorithms that retain more information from the original data than crisp (hard) clustering algorithms do. PCA, the feature reduction method, is employed as a preprocessing step for FCM and KHM to relieve the curse of high-dimensional, noisy data. The performance of FCM and KHM based on PCA feature extraction, called PCAFCM and PCAKHM, is compared with that of related algorithms, including FCM and KHM optimized by DE. Comparison tests are performed on 7 well-known real-world benchmark data sets. Within the scope of this study, the results indicate that feature reduction using PCA is superior to DE optimization for FCM and KHM.
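The idea of using PCA as a preprocessing step for FCM can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `pca_reduce` and `fuzzy_c_means`, the fuzzifier `m = 2.0`, the random membership initialization, and the stopping tolerance are all assumptions chosen for illustration, and the standard textbook FCM update rules are used.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its leading principal components."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM; returns (centers, U) where U[i, j] is the membership
    of sample i in cluster j and each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))       # hypothetical choice: random init
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)          # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Usage on synthetic data (an assumption; the paper uses benchmark data sets).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (30, 6)), rng.normal(8, 1, (30, 6))])
Z = pca_reduce(X, n_components=2)         # feature reduction first
centers, U = fuzzy_c_means(Z, c=2)        # then soft clustering
labels = U.argmax(axis=1)                 # hard labels, if needed
```

The same reduced matrix `Z` could be passed to a KHM routine instead of `fuzzy_c_means`; only the clustering step changes, while the PCA preprocessing stays the same.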
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.