Performance comparison of text search methods used in databases
Date
2022
Publisher
Trakya Üniversitesi Fen Bilimleri Enstitüsü
Access Rights
info:eu-repo/semantics/openAccess
Abstract
This thesis examines Full Text Search (FTS) methods in relational and NoSQL databases, which are typically carried out with special indexes called text indexes, and compares their performance. In a text index, words or terms are mapped to the documents in which they occur; where needed, the index can also store how many times and at which positions each word appears in a document. When a word or term is searched in the database, using the text index is much faster than scanning all documents. In recent years, many database management systems have begun to offer full text search support, and search engines built specifically for this purpose, such as Elasticsearch, have emerged. For the performance comparison, a large number of text records, both small (such as article abstracts) and large (such as books), were loaded into MSSQL Server, MySQL, MongoDB, and Elasticsearch. After text indexes were created with the relevant methods, a set of test words was searched on these indexes and the query times were recorded. The same words were also searched with Regex/Like text search queries, and those results were compared with the results obtained using the indexes. The study identifies which Full Text Search or Regex/Like method performs better on which database management system.
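The text index described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not reflect how MSSQL Server, MySQL, MongoDB, or Elasticsearch build their indexes internally; the function names (`tokenize`, `build_text_index`, `index_search`, `scan_search`) and the sample documents are hypothetical, chosen only to show the idea of mapping terms to documents and positions, next to a regex scan over every document.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_text_index(docs):
    """Map each term to {doc_id: [token positions]} (a simple inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def index_search(index, term):
    """Look the term up directly in the index; no document is scanned."""
    return sorted(index.get(term.lower(), {}))


def scan_search(docs, term):
    """Regex-style search: test the pattern against every document."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sorted(doc_id for doc_id, text in docs.items() if pattern.search(text))


# Hypothetical sample data, for illustration only.
docs = {
    1: "Full text search uses a text index.",
    2: "A regex query scans every document.",
    3: "The index maps each term to documents; the index also stores positions.",
}
index = build_text_index(docs)

print(index_search(index, "index"))  # documents containing the term
print(index["index"])                # per-document token positions
print(scan_search(docs, "index"))    # same documents, found by scanning
```

Both searches return the same documents; the difference is that the index lookup touches only the entry for the searched term, while the scan must read every document, which is the cost the abstract contrasts.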
Keywords
Full text search, Database, MSSQL Server, MySQL, MongoDB, Elasticsearch, RDBMS, NoSQL