WCSE 2016
ISBN: 978-981-11-0008-6 DOI: 10.18178/wcse.2016.06.098

Automatic Summarization from Indonesian Hashtag on Twitter Using TF-IDF and Phrase Reinforcement Algorithm

Willyh Hariardi, Novita Latief, David Febryanto, Derwin Suhartono

Abstract— The objective of this research is to produce a summary of what is currently happening under an Indonesian hashtag on Twitter. A combination of TF-IDF (term frequency-inverse document frequency) and the Phrase Reinforcement Algorithm is used to perform the automatic summarization. The final summary consists of two sentences, which contain the essential information given by the Twitter data. At the end of this paper, we describe the evaluation, which analyzes the results using Precision and ROUGE. Based on the results, we conclude that TF-IDF and the Phrase Reinforcement Algorithm can successfully generate a summary, and that the approach works reasonably well on hashtags whose words do not have many variants. Overall, the quality of the summaries is quite low because the data still contains too much noise. The precision is 0.327 and ROUGE-1 is 0.3087.

Index Terms— summary, hashtag, Twitter, phrase reinforcement algorithm, automatic summarization, TF-IDF.
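
To make the two named building blocks concrete, the following is a minimal Python sketch of TF-IDF weighting over a set of tweets and of a ROUGE-1 recall score. It is an illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the per-tweet averaging of weights, and the function names tokenize, tfidf_scores, and rouge1_recall are all hypothetical, and the Phrase Reinforcement step is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; the paper's
    # preprocessing for Indonesian tweets is not described on this page).
    return text.lower().split()


def tfidf_scores(tweets):
    """Return one score per tweet: the mean TF-IDF weight of its distinct tokens."""
    docs = [tokenize(t) for t in tweets]
    n_docs = len(docs)

    # Document frequency: number of tweets that contain each token.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        # Standard tf * log(N / df) weighting, summed over distinct tokens.
        total = sum((tf[w] / len(doc)) * math.log(n_docs / df[w]) for w in tf)
        scores.append(total / len(tf))
    return scores


def rouge1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[w]) for w, count in ref.items())
    return overlap / sum(ref.values())
```

In this sketch, a token that appears in every tweet (such as the hashtag itself) receives a weight of zero, so tweets are ranked by their less common terms. Whether the paper uses this exact weighting or a recall-based ROUGE-1 variant is not stated in the abstract.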

Willyh Hariardi, Novita Latief, David Febryanto, Derwin Suhartono
Bina Nusantara University, School of Computer Science, INDONESIA



Cite: Willyh Hariardi, Novita Latief, David Febryanto, Derwin Suhartono, "Automatic Summarization from Indonesian Hashtag on Twitter Using TF-IDF and Phrase Reinforcement Algorithm," Proceedings of 2016 6th International Workshop on Computer Science and Engineering, pp. 575-579, Tokyo, 17-19 June, 2016.