Spatiotemporal and multilingual Semantic Machine Learning Analysis of Social Media Data for the recent protests in Europe – based on Twitter data –
The perception of inherent tensions between justice and injustice (or of a disproportion between good and bad) often presses a group of people, or even a whole society, to seek change concerning politics and power, for example in the form of protests. In the last few decades, the advent and rapid expansion of internet-based communication technologies have transformed the way change is sought through connective action: besides the two main elements, the people and their intentions, the role of information, along with its spread and accessibility, has gained more and more significance. Social media platforms such as Twitter are considered a new mediator of collective action, in which various forms of civil movements unite around public posts, often using a common hashtag, thereby strengthening the movements. The data-driven analytical approach, relying on social media posts and activities, has many strengths, especially its high temporal resolution and the rapid response of users to news and information. Twitter data serves as a unique and useful source of information for the analysis of civil movements, as the analysis can reveal important spatiotemporal and sentiment patterns, which may also help to understand protest escalation over space and time. Investigating social media in the case of events such as the murder of Ján Kuciak or the protests in Belarus appears to be an important tool for tracking and understanding the immediate reaction of people, in a way that other methods or sources of information do not allow. The methodological workflow developed in this doctoral research combines time series clustering with semantic topic modeling and sentiment analysis, performed on georeferenced social media data, which provides multi-modal insights into the public’s reactions to a specific political event.
The proposed approach includes multilingual corpus translation, as well as location and sentiment extraction, and uses machine-learning topic modelling methods to reveal the hidden interests and motivators of collective action. Through this, the approach has a distinct advantage over prior investigations, which primarily either focused on hashtag activism (ignoring the spatial dimension) or, on the contrary, used only location-specific hashtags. Moreover, by applying machine learning algorithms and techniques that are almost entirely automatable, the analysis can cover a much wider range of input data than existing studies in which researchers evaluate posts solely by hand. Overall, with this mixed-method approach, this work overcomes the limitations of contemporary research on social movements, which mainly focuses on one language and a restricted area. The social media data analyzed in this dissertation were obtained using the Streaming Application Programming Interface (API) of Twitter, the US-based social networking and microblogging service. The first dataset starts on the date of the first official report of the murder of Ján Kuciak and ends on the date of the earliest statement of the resignation of Prime Minister Robert Fico (26 February and 15 March 2018). The second dataset starts on the day of the Belarusian presidential election and ends on the day of the formal inauguration of the president of Belarus, Alexander Lukashenko (9 August and 23 September 2020). Both datasets consist of the content of the tweets and additional attributes such as user name, user location, and the timestamp when the tweet was posted.
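As a rough illustration of how the two collection windows and the listed tweet attributes could be represented before any clustering or topic modelling, the following minimal Python sketch filters tweets to a study window and counts them per day. It is not the dissertation's code: the `Tweet` record layout, the window keys, the function names, the inclusive treatment of both boundary dates, and the use of timezone-naive timestamps are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime

# Study windows taken from the dates given in the abstract.
# Treating both boundary dates as inclusive is an assumption.
STUDY_WINDOWS = {
    "slovakia_2018": (date(2018, 2, 26), date(2018, 3, 15)),
    "belarus_2020": (date(2020, 8, 9), date(2020, 9, 23)),
}


@dataclass
class Tweet:
    # Hypothetical record layout mirroring the attributes named in the abstract.
    text: str
    user_name: str
    user_location: str
    timestamp: datetime  # assumed timezone-naive for simplicity


def in_window(tweet: Tweet, window: str) -> bool:
    """Return True if the tweet was posted inside the named study window."""
    start, end = STUDY_WINDOWS[window]
    return start <= tweet.timestamp.date() <= end


def daily_counts(tweets: list[Tweet], window: str) -> Counter:
    """Count tweets per calendar day within the named study window."""
    return Counter(t.timestamp.date() for t in tweets if in_window(t, window))
```

A per-day count series of this kind is one plausible input for time series clustering; the abstract does not specify the temporal granularity actually used.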