Sangam: A Confluence of Knowledge Streams

A Classifier to Detect Informational vs. Non-Informational Heart Attack Tweets

dc.contributor Computer Science
dc.creator Karajeh, Ola
dc.creator Darweesh, Dirar
dc.creator Darwish, Omar
dc.creator Abu-El-Rub, Noor
dc.creator Alsinglawi, Belal
dc.creator Alsaedi, Nasser
dc.date 2021-01-22T18:04:21Z
dc.date 2021-01-16
dc.date 2021-01-22T15:47:02Z
dc.date.accessioned 2023-03-01T18:35:20Z
dc.date.available 2023-03-01T18:35:20Z
dc.identifier Karajeh, O.; Darweesh, D.; Darwish, O.; Abu-El-Rub, N.; Alsinglawi, B.; Alsaedi, N. A Classifier to Detect Informational vs. Non-Informational Heart Attack Tweets. Future Internet 2021, 13, 19.
dc.identifier http://hdl.handle.net/10919/102013
dc.identifier https://doi.org/10.3390/fi13010019
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/279777
dc.description Social media sites are among the most important sources of data in many fields, such as health, education, and politics. While surveys provide explicit answers to specific questions, social media posts contain the same answers implicitly in their text. This research aims to develop a method for extracting implicit answers from large tweet collections, and to demonstrate this method on an important concern: heart attacks. The approach is to collect tweets containing “heart attack” and then select those that carry useful information. Informational tweets are those that describe real heart attack events, e.g., “Yesterday morning, my grandfather had a heart attack while he was walking around the garden.” Non-informational tweets, by contrast, include ones such as “Dropped my iPhone for the first time and almost had a heart attack.” The starting point was to manually classify around 7000 tweets as either informational (11%) or non-informational (89%), yielding a labeled dataset for training a machine-learning classifier that can be applied to our large collection of over 20 million tweets. Tweets were cleaned and converted to a vector representation suitable as input to several machine-learning algorithms: deep neural networks, support vector machine (SVM), J48 decision tree, and naïve Bayes. Our experiments aimed to find the best algorithm for building a high-quality classifier. The labeled dataset was split, with 2/3 used to train the classifier and 1/3 used for evaluation; cross-validation was also used. The deep neural network (DNN) classifier obtained the highest accuracy (95.2%). It also obtained the highest F1-scores: 73.6% for the informational class and 97.4% for the non-informational class.
dc.description Published version
dc.format application/pdf
dc.language en
dc.publisher MDPI
dc.rights Creative Commons Attribution 4.0 International
dc.rights http://creativecommons.org/licenses/by/4.0/
dc.subject Machine learning
dc.subject classification
dc.subject support vector machine
dc.subject deep neural networks
dc.subject tweets
dc.subject heart attack
dc.subject health
dc.title A Classifier to Detect Informational vs. Non-Informational Heart Attack Tweets
dc.title Future Internet
dc.type Article - Refereed
dc.type Text
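
The evaluation protocol summarized in the description (a 2/3 train and 1/3 test split, with per-class F1-scores reported alongside accuracy) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label strings "info" and "non", the stratified form of the split, and the majority-class baseline are all assumptions made for illustration. Only the 11% / 89% class ratio and the 2/3 to 1/3 split come from the record.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Split indices into train/test parts, preserving class proportions.

    Stratification is an assumption; the record only states a 2/3 vs 1/3 split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


def f1_for_class(y_true, y_pred, positive):
    """F1-score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy labels mirroring the reported 11% / 89% class ratio.
labels = ["info"] * 11 + ["non"] * 89
train_idx, test_idx = stratified_split(labels)
y_test = [labels[i] for i in test_idx]

# Majority-class baseline (hypothetical): label every test tweet "non".
y_pred = ["non"] * len(y_test)

print("accuracy:", accuracy(y_test, y_pred))
print("F1 (info):", f1_for_class(y_test, y_pred, "info"))
print("F1 (non):", f1_for_class(y_test, y_pred, "non"))
```

With classes this imbalanced, a baseline that never predicts the informational class still scores high accuracy while its informational F1 is zero, which is one reason per-class F1-scores are worth reporting next to accuracy.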


Files in this item

Files Size Format View
futureinternet-13-00019.pdf 2.026Mb application/pdf View/Open
