Performance Prediction of Compulsory Subjects and Recommendation of Subjects Options for China’s New College Entrance Examination

Publisher

Universiti Malaysia Sarawak

Abstract

Description

This study addresses a critical gap in Educational Data Mining by concurrently predicting performance in the compulsory subjects of China’s New College Entrance Examination (NCEE) and recommending personalized three-subject combinations drawn from six optional subjects. Drawing on Bronfenbrenner’s ecological framework, we collected data from 1,127 students and 88 teachers at an urban high school across four dimensions: individual, family, school, and social. Continuous predictors were normalized, and categorical variables were encoded as numerical values. The dataset was split 80/20 into training and test sets. Four machine learning algorithms, Naïve Bayes (NB), Decision Tree (DT), Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs), were evaluated using accuracy, precision, recall, F1-score, and Matthews Correlation Coefficient (MCC). Pearson correlations quantified inter-subject dependencies. Feature importance analyses revealed that motivation level dominated Chinese performance prediction, followed by teaching method, gender, past Chinese performance, and teacher’s self-efficacy. The leading Mathematics predictors were test anxiety, parents’ education levels, socioeconomic status (SES), and peer relationships, while English performance hinged on annual family income, parental involvement, and past English performance. NB outperformed the other three algorithms, attaining accuracies of 95.1% for Chinese, 96.4% for Mathematics, and 90.7% for English. Correlation coefficients indicated a weak Chinese-Mathematics association (r = 0.124–0.267), a moderate Chinese-English link (r = 0.308–0.416), and a moderate Mathematics-English relationship (r = 0.365–0.402). From the DT outputs, we distilled rules mapping student profiles to optional-subject trios. For example, high self-efficacy and strong peer relationships paired with quality Chemistry instruction yielded a “Physics–Chemistry–Biology” recommendation, whereas robust SES and moderate Biology performance suggested “History–Politics–Geography.” These DT rules can help students choose their optional subjects. Limitations include single-school sampling and potential regional biases. Future work should replicate the study across diverse contexts, explore ensemble methods to enhance both accuracy and interpretability, and implement longitudinal follow-up.
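
As an illustration of the evaluation measures named above, the following minimal sketch computes accuracy, precision, recall, F1-score, and MCC from predicted labels, plus a Pearson correlation between two score lists. It is not the study's code: the function names `binary_metrics` and `pearson_r`, the binary (pass/fail style) labeling, and all sample values are assumptions made only for demonstration.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute common classification metrics for a binary labeling (assumed setup)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews Correlation Coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# Made-up subject scores for illustration only.
chinese_scores = [1.0, 2.0, 3.0, 4.0]
english_scores = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(chinese_scores, english_scores)
```

MCC is worth reporting alongside accuracy because it uses all four cells of the confusion matrix, so it stays informative when one class is much more common than the other.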
