The effect of data transformation methods on the performance of classification algorithms in data mining

Date

2020

Journal Title

Journal ISSN

Volume Title

Publisher

Trakya Üniversitesi, Sağlık Bilimleri Enstitüsü

Access Rights

info:eu-repo/semantics/openAccess

Abstract

In this thesis, a simulation study was performed to investigate the effects of normalization and unsupervised discretization methods on the naive Bayes (NB), C5.0, and support vector machine (SVM) algorithms. The effects of normalization and discretization were found to vary across the three algorithms. Normalization methods were generally ineffective in improving the performance of the C5.0 decision tree algorithm and the NB algorithm, whereas the performance measures of the SVM algorithm increased with normalization. When the most effective normalization method was investigated, the answer was observed to depend on the distribution of the data, the number of observations, and the distribution of observations across the classes. Unsupervised discretization methods generally did not improve the performance of the C5.0 algorithm, but helped to achieve better results with NB and SVM. Unsupervised discretization increased NB performance only when classifying data generated from the F distribution, whereas SVM performance increased for data generated from all sampling distributions. In the study, C5.0 was the algorithm least affected by data transformations, while SVM was the most affected. In terms of overall performance, NB performed better in classifying data generated from the normal and F distributions, whereas SVM performed better than the other methods in classifying data generated from the chi-square distribution.
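
For readers unfamiliar with the two families of transformations studied, the following is a minimal Python sketch. The abstract does not state which specific normalization or discretization methods were used, so min-max scaling, z-score standardization, equal-width binning, and equal-frequency binning are assumed here purely as common examples. The function names and the sample values are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the [0, 1] interval (assumed example method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def equal_width_bins(values, k):
    """Unsupervised discretization into k intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / k
    # The maximum value would fall into bin k, so clamp it to the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Unsupervised discretization into k bins with roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


# Hypothetical feature values, for illustration only.
sample = [2.0, 4.0, 6.0, 8.0, 10.0]
print(min_max(sample))
print(z_score(sample))
print(equal_width_bins(sample, 2))
print(equal_frequency_bins(sample, 2))
```

Neither discretization function looks at class labels, which is what makes them unsupervised. In a study like the one described, the transformed feature would then be passed to a classifier such as NB, C5.0, or SVM in place of the raw values.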

Description

Keywords

Data mining, Classification, Normalization, Discretization

Source

WoS Q Value

Scopus Q Value

Volume

Issue

Citation