Evaluation of machine learning algorithms on academic big dataset by using feature selection techniques
Date
2022
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Institution of Engineering and Technology
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
This research focuses on identifying the most accurate methods for forecasting students’ academic achievement. Educational institutions worldwide are concerned about student attrition. Their goal is to increase student retention and graduation rates, which is possible only if at-risk students are identified early. Because of inherent classifier constraints and the use of too few student features, most commonly used prediction models are inefficient and inaccurate. Data mining techniques such as classification, clustering, regression, and association rule mining are used to uncover hidden patterns and relevant information in big datasets of student academic performance. Naïve Bayes, random forest, decision tree, multilayer perceptron (MLP), decision table (DT), JRip, and logistic regression (LR) are among the algorithms that can be applied. A student academic performance big dataset comprises many features, not all of which are relevant or play a significant role in the mining process. Features with a variance close to 0 are therefore removed from the dataset because they have no impact on the mining process. To determine the influence of various attributes on the class level, feature selection (FS) techniques such as the correlation attribute evaluator (CAE), information gain attribute evaluator (IGAE), and gain ratio attribute evaluator (GRAE) are used. In this study, the authors investigate the performance of various data mining algorithms on the big dataset, as well as the effectiveness of various FS techniques. In conclusion, every classification algorithm, when built with FS methods, achieves better overall predictive performance. © The Institution of Engineering and Technology 2022.
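The filtering and ranking steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the variance threshold, and the toy student records are all hypothetical, and it covers only near-zero-variance removal, information gain, and gain ratio for categorical features.

```python
import math
from collections import Counter
from statistics import pvariance


def drop_near_zero_variance(columns, threshold=1e-3):
    """Keep numeric columns whose population variance exceeds the threshold.

    The threshold value is an assumption chosen for illustration.
    """
    return {name: vals for name, vals in columns.items()
            if pvariance(vals) > threshold}


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no split information
    return information_gain(feature, labels) / split_info


# Hypothetical toy records, invented purely for this example.
numeric = {"credits": [30, 30, 30, 30], "gpa": [2.0, 3.5, 1.5, 4.0]}
kept = drop_near_zero_variance(numeric)

labels = ["pass", "pass", "fail", "fail"]
attendance = ["high", "high", "low", "low"]   # perfectly separates the labels
section = ["a", "b", "a", "b"]                # unrelated to the labels

ranking = sorted(
    {"attendance": gain_ratio(attendance, labels),
     "section": gain_ratio(section, labels)}.items(),
    key=lambda item: item[1],
    reverse=True,
)
```

In this toy data the constant `credits` column is dropped, and `attendance` ranks above `section` because it fully determines the label. A correlation-based ranking would follow the same pattern with a correlation coefficient as the score.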
Description
Keywords
Big Data; Feature Selection; Classification; Correlation Attribute Evaluator; Data Mining; Gain Ratio Attribute Evaluator; Information Gain Attribute Evaluator
Source
Intelligent Network Design Driven by Big Data Analytics, IoT, AI and Cloud Computing
WoS Q Value
Scopus Q Value
N/A