Evaluation of machine learning algorithms on academic big dataset by using feature selection techniques

Date

2022

Publisher

Institution of Engineering and Technology

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

This research focuses on identifying the most accurate methods for forecasting students' academic achievement. Student attrition is a concern for educational institutions worldwide. Institutions aim to increase student retention and graduation rates, which is possible only if at-risk students are identified early. Because of inherent classifier constraints and the small number of student features they incorporate, most commonly used prediction models are inefficient. Data mining techniques such as classification, clustering, regression, and association rule mining are used to uncover hidden patterns and relevant information in big datasets of student academic performance. Applicable algorithms include Naïve Bayes, random forest, decision tree, multilayer perceptron (MLP), decision table (DT), JRip, and logistic regression (LR). A student academic performance big dataset comprises many features, not all of which are relevant or play a significant role in the mining process. Features with a variance close to 0 are therefore removed from the dataset because they have no impact on the mining process. To determine the influence of individual attributes on the class label, several feature selection (FS) techniques are used: the correlation attribute evaluator (CAE), the information gain attribute evaluator (IGAE), and the gain ratio attribute evaluator (GRAE). The authors investigate the performance of various data mining algorithms on the big dataset, as well as the effectiveness of the FS techniques. They conclude that combining a classification algorithm with an FS method improves its overall predictive performance. © The Institution of Engineering and Technology 2022.
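
The two filtering ideas named in the abstract, dropping near-zero-variance features and ranking the rest by information gain or gain ratio, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `threshold` value, and the toy student data are all hypothetical and chosen only for illustration, and the features are assumed to be numeric for the variance filter and discrete for the two gain measures.

```python
from collections import Counter
from math import log2
from statistics import pvariance


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def drop_near_zero_variance(columns, threshold=1e-3):
    """Keep only numeric columns whose population variance exceeds threshold."""
    return {name: col for name, col in columns.items() if pvariance(col) > threshold}


# Hypothetical toy data, not taken from the study.
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]
numeric = {
    "attendance": [0.9, 0.8, 0.3, 0.4, 0.95, 0.2],
    "campus_id": [1, 1, 1, 1, 1, 1],  # constant column, variance is 0
}
categorical = {
    "tutoring": ["yes", "yes", "no", "no", "yes", "no"],
    "gender": ["f", "m", "f", "m", "m", "f"],
}

# Step 1: remove features that carry (almost) no variation.
kept = drop_near_zero_variance(numeric)

# Step 2: rank the discrete features by information gain, highest first.
ranking = sorted(
    categorical,
    key=lambda name: information_gain(categorical[name], labels),
    reverse=True,
)
```

In this toy data `tutoring` separates the two classes completely, so it ranks above `gender`, while the constant `campus_id` column is discarded by the variance filter. Gain ratio divides the same gain by the feature's own entropy, which penalises features with many distinct values.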

Keywords

Big Data; Feature Selection; Classification; Correlation Attribute Evaluator; Data Mining; Gain Ratio Attribute Evaluator; Information Gain Attribute Evaluator

Source

Intelligent Network Design Driven by Big Data Analytics, IoT, AI and Cloud Computing

Scopus Q Value

N/A
