Electronic Science and Technology ›› 2019, Vol. 32 ›› Issue (8): 70-74. doi: 10.16180/j.cnki.issn1007-7820.2019.08.015

Design of Data Mining System Based on Cloud Computing

LAN Jiman   

  1. Huizhou Engineering Vocational College, Huizhou 516001, China
  • Received: 2018-10-15 Online: 2019-08-15 Published: 2019-08-12


To address the processing demands of exponentially growing data and to improve data storage and computing capacity, this paper proposes a design for a data mining system based on cloud computing. The paper first analyzes the components and operating mechanism of Spark, a mainstream cloud computing platform, and examines the programming principles of its computing architecture. Spark is then used to parallelize the C4.5 algorithm and the K-medoids clustering algorithm, which improves the running speed, convergence speed, and stability of the algorithms. Tests show that, when analyzing and processing massive data, the proposed cloud computing platform effectively improves both the computing speed of the whole system and the classification efficiency.
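
The abstract does not say how C4.5 is parallelized, so the following is only an illustrative sketch and not the paper's implementation. It computes the gain ratio, the split criterion C4.5 uses, in plain Python. The counting step is written as a count by key, which is the part that could plausibly be expressed as a keyed aggregation (for example `reduceByKey`) on a platform such as Spark. The function names `entropy` and `gain_ratio` and the row layout (a tuple with the class label last) are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def gain_ratio(rows, attr_index, label_index=-1):
    """Gain ratio of splitting `rows` on the attribute at `attr_index`.

    Hypothetical layout: each row is a tuple of categorical values,
    with the class label at `label_index`.
    """
    # Count (attribute value, label) pairs. This is the step that maps
    # naturally onto a distributed count-by-key.
    pair_counts = Counter((row[attr_index], row[label_index]) for row in rows)

    # Derive the marginal counts from the pair counts.
    label_counts = Counter()
    value_counts = Counter()
    for (value, label), n in pair_counts.items():
        label_counts[label] += n
        value_counts[value] += n

    total = len(rows)
    base = entropy(label_counts.values())

    # Weighted entropy of the labels within each attribute value.
    conditional = 0.0
    for value, n_v in value_counts.items():
        subset = [n for (v, _), n in pair_counts.items() if v == value]
        conditional += (n_v / total) * entropy(subset)

    info_gain = base - conditional
    split_info = entropy(value_counts.values())
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    sample = [("sunny", "no"), ("sunny", "no"), ("rain", "yes"), ("rain", "yes")]
    print(gain_ratio(sample, attr_index=0))
```

In a tree builder, this score would be evaluated for every candidate attribute at a node and the attribute with the highest gain ratio chosen for the split. Only the pair counts depend on the full data set, so the remaining arithmetic is cheap once the counts have been aggregated.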

Key words: cloud computing, data mining, Spark, C4.5 algorithm, K-medoids clustering algorithm

CLC Number: 

  • TN99