Activity classification of drug molecules using deep learning

Kanberiz, Hatice

dc.contributor.advisor	Korkmaz, Selçuk
dc.contributor.advisor	Süt, Necdet
dc.contributor.author	Kanberiz, Hatice
dc.date.accessioned	2020-07-24T11:58:05Z
dc.date.available	2020-07-24T11:58:05Z
dc.date.issued	2020
dc.date.submitted	2020
dc.department	Institutes, Institute of Health Sciences, Department of Biostatistics and Medical Informatics	en_US
dc.description.abstract	In the early stages of drug development, molecules that show activity are identified from among thousands of candidates in order to reduce the time and cost of development. To this end, high-throughput screening experiments are performed and molecules are classified as active or inactive. The data obtained from these experiments are uploaded to the PubChem database. Using these data, classification models can be built with machine learning algorithms so that active molecules can be detected faster and at lower cost. In this study, deep neural networks (DNN) were trained on 5 data sets with different degrees of class imbalance obtained from the PubChem database. The performance of the trained DNN was compared with support vector machines (SVM) and random forest (RF), two algorithms frequently used in the literature. Balanced accuracy, sensitivity, positive predictive value, F1 score and MCC were used in the comparison. On every criterion except positive predictive value, and in particular on F1 score and MCC, which are among the most important criteria for evaluating performance on imbalanced data sets, the DNN performed better than SVM and RF. In conclusion, because the DNN performs better than the other machine learning algorithms on imbalanced data, it is a good choice for reducing the time and cost spent on drug development studies.	en_US
dc.identifier.uri	https://hdl.handle.net/20.500.14551/5147
dc.identifier.yoktezid	614306	en_US
dc.language.iso	tr	en_US
dc.publisher	Trakya University, Institute of Health Sciences	en_US
dc.relation.publicationcategory	Thesis	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	PubChem	en_US
dc.subject	Deep Learning	en_US
dc.subject	Unbalanced Data	en_US
dc.subject	Support Vector Machines	en_US
dc.subject	Random Forest	en_US
dc.subject	Virtual Screening	en_US
dc.title	Derin öğrenme ile ilaç moleküllerinin sınıflandırılması	en_US
dc.title.alternative	Activity classification of drug molecules using deep learning	en_US
dc.type	Master Thesis	en_US
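
The abstract compares the classifiers on balanced accuracy, sensitivity, positive predictive value, F1 score and MCC. As a minimal sketch of how these five criteria follow from a binary confusion matrix, the Python function below uses their standard definitions. The function name, the choice of "active" as the positive class, the zero-denominator convention and the example counts are all assumptions made for illustration; none of them come from the thesis.

```python
from math import sqrt


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the five criteria named in the abstract from confusion-matrix counts.

    Assumption of this sketch: 'active' is the positive class.
    Convention of this sketch: a ratio with a zero denominator is reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # share of actives found
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # share of inactives found
    ppv = tp / (tp + fp) if (tp + fp) else 0.0          # positive predictive value
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = (2 * ppv * sensitivity / (ppv + sensitivity)) if (ppv + sensitivity) else 0.0
    # Matthews correlation coefficient uses all four cells of the matrix.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "ppv": ppv,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical imbalanced example: 50 actives among 1000 molecules.
metrics = classification_metrics(tp=40, fp=60, tn=890, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, plain accuracy would be 0.93 even though only 40% of the molecules predicted active are truly active, which illustrates why F1 score and MCC tend to be more informative on imbalanced data.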

Files

Original bundle

Name: 0170062.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format

Collection

Institute of Health Sciences Thesis Collection