A Dynamic Integrated Classification Algorithm Based on Big Data Environment
Abstract— With the development of big data applications, classification algorithms have been extended
from single datasets to distributed datasets. This paper therefore proposes a dynamic integrated
classification algorithm for big data environments. The algorithm obtains integrated classifiers with high
classification accuracy for each local dataset, and dynamically generates the recognition model according
to the distribution characteristics of the local samples to be tested. In practice, classifier performance
gradually degrades as large numbers of new samples join the datasets. To address this problem, the
algorithm retrains the classification model as the datasets expand dynamically. Experimental results show
that the proposed algorithm achieves high classifier training performance and classification accuracy, and
that it adapts well to dynamically changing distributed datasets.
Index Terms— Classification Algorithm; Integrated Algorithm; Big Data; DIC
Dan Ma, Ji-chun Jiang
College of Computer Science & Technology, Guizhou University, Guiyang, China
Guizhou Gas Group Corporation Ltd, Guiyang, China
Cite: Dan Ma, Ji-chun Jiang, Wei Wang, "A Dynamic Integrated Classification Algorithm Based on Big Data Environment," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, pp. 630-637, Hong Kong, 15-17 June, 2019.