SCIENCE
AN APPROACH FOR CLUSTERING SOCIAL MEDIA TEXT MESSAGES RETRIEVED FROM CONTINUOUS DATA STREAMS
A new approach to handling evolving topics and discussions in social media environments is proposed, based on the k-means clustering algorithm. Different segmentation techniques and applications for handling large volumes of data are explored. Relevant works that use fading functions and half-life weight measurements to remove inactive clusters are discussed. A set of rules and a controlling variable called time to recover are introduced as a simple means of managing cluster lifecycles. A short case study is conducted with Twitter data retrieved between the 19th and 22nd of January 2018.
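To make the idea of a fading function with a half-life concrete, the following minimal sketch shows one common exponential form, in which a cluster's weight halves after each half-life period without updates. It is an illustration only and not the method proposed here: the function names, the dictionary layout mapping cluster ids to last-update timestamps, and the `min_weight` threshold are all assumptions made for the example.

```python
def fading_weight(age_seconds: float, half_life_seconds: float) -> float:
    """Exponential fading: the weight halves every `half_life_seconds`.

    Generic form often seen in stream clustering; assumed here for illustration.
    """
    if half_life_seconds <= 0:
        raise ValueError("half_life_seconds must be positive")
    return 2.0 ** (-age_seconds / half_life_seconds)


def prune_inactive(last_update, now, half_life_seconds, min_weight):
    """Keep only clusters whose faded weight is still at least `min_weight`.

    `last_update` is a hypothetical mapping: cluster id -> last update time (s).
    """
    return {
        cluster_id: updated_at
        for cluster_id, updated_at in last_update.items()
        if fading_weight(now - updated_at, half_life_seconds) >= min_weight
    }


if __name__ == "__main__":
    # Two hypothetical clusters: "a" last updated long ago, "b" recently.
    clusters = {"a": 0.0, "b": 9000.0}
    active = prune_inactive(clusters, now=10000.0,
                            half_life_seconds=3600.0, min_weight=0.25)
    print(sorted(active))
```

With a one-hour half-life and a threshold of 0.25, a cluster is dropped once it has gone more than two hours without an update, which is why only the recently updated cluster survives in the usage example.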