Generating headlines for Turkish news texts with transformer architecture based deep learning method

Date

2024

Journal Title

Journal ISSN

Volume Title

Publisher

Gazi Univ, Fac Engineering Architecture

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Nowadays, the Internet is a medium that people can access easily and on which they can just as easily produce content without any oversight. In parallel with this situation, extracting information from the raw data that makes up big data has become more complex. Headlines that contain uncontrolled and misleading elements make it difficult to reach accurate information, yet headlines are important for people trying to find the information they need in limited time. This study aims to generate headlines that fit the news content, instead of potentially misleading ones. To this end, an application that generates headlines for Turkish news with a deep learning method has been developed. The SuDer news corpus is used as the dataset. For model training, the Transformer architecture, which is widely preferred in natural language processing studies today, is combined with the abstractive summarization method in order to obtain more human-like news headlines. To compare the performance of the Transformer model, models based on the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures are also prepared and trained. After 25 epochs of training on the corpus, the LSTM, GRU, and Transformer architectures reach loss values of 1.03, 0.55, and 2.49, respectively. In experiments on the validation data, performance is measured with the ROUGE-1, ROUGE-2, and ROUGE-L metrics. The measurements show that the Transformer architecture performs moderately well according to the metric values produced. Moreover, when the headlines generated by these architectures are examined, the Transformer architecture is observed to produce headlines that are somewhat more suitable for the news content than the other architectures.
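As a rough illustration of the kind of scoring the evaluation relies on (not the authors' actual evaluation code), ROUGE-1 compares the unigram overlap between a reference headline and a generated one; the example headlines below are invented for demonstration:

```python
from collections import Counter

def rouge_1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: unigram overlap between a reference and a candidate headline.

    A minimal sketch; published evaluations typically also apply
    language-specific tokenization and stemming, which are omitted here.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each candidate unigram is counted at most as many
    # times as it occurs in the reference.
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference vs. generated headline (Turkish, whitespace-tokenized):
score = rouge_1_f1("ankara da hava bugün güneşli", "ankara hava güneşli")
print(round(score, 2))  # precision 1.0, recall 0.6 -> F1 0.75
```

ROUGE-2 and ROUGE-L follow the same precision/recall pattern but count bigram overlaps and the longest common subsequence, respectively.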

Description

Keywords

Turkish Natural Language Processing, Automatic Headline Generation, Abstractive Text Summarization, Deep Learning, Transformers

Source

Journal Of The Faculty Of Engineering And Architecture Of Gazi University

WoS Q Value

N/A

Scopus Q Value

Q2

Volume

39

Issue

1

Citation