Emotion Detection in Public Space: A Multilanguage Comparison in Barcelona

Authors

DOI:

https://doi.org/10.5821/ctv.8515

Keywords:

Mass emotion, Twitter sentiment, Public space, Smart city

Abstract

Sentiment analysis of location-based social network (LBSN) data has been a popular topic in urban studies since the boom in social media applications, with work addressing topics such as work stress, the emotions of railway passengers, and sentiment mapping. Although variations in mass emotion are difficult to measure precisely, emotion and the spatial environment are correlated to some extent. Understanding mass emotion can therefore help improve the allocation of urban facilities and enhance the urban environment. However, most research is limited to English texts or to a single language, either because of the study area or because of the technical difficulty of analyzing multiple languages. Yet immigrants and visitors usually make up a significant share of the population in international metropolises, so an analysis based on a single language cannot reveal how people who use other languages perceive the same city. Moreover, apart from cultural differences, mass emotion may vary across urban spaces, for example between local and tourist spaces. Because the local language usually differs from that of visitors, multilanguage sentiment analysis can reflect these differences to some degree. This study therefore aims to detect differences in mass emotion between people who use different languages in the same public space. In addition, previous studies mainly focus on a single type of land use, such as tourist attractions or green parks. To fill this gap, the ultimate goal of the research is to explore the relationship between the urban environment and mass emotion.

This study uses 30 months of Twitter data to analyze mass emotions in Barcelona. English, Spanish, and Catalan are compared in the case study because tweets written in these three languages account for about 90% of the dataset. The analysis consists of a high-frequency word analysis and a sentiment analysis of plazas. The sentiment analysis is implemented with two commonly used algorithms: SentiStrength, which estimates sentiment in short informal texts, and VADER, which specifically targets social media text. Based on the sentiment class (positive, neutral, negative) given by the algorithms, a comprehensive sentiment score is assigned to each tweet. In brief, the process includes: 1) cleaning the data and removing non-individual tweets; 2) translating Spanish and Catalan tweets into English through the Google Translate API; 3) calculating the sentiment score of each tweet with SentiStrength and VADER; 4) comparing the sentiment classifications produced by the two tools; 5) checking a sample of the sentiment results through manual evaluation; 6) comparing sentiment differences among the three language groups in twenty public spaces of Barcelona.
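
Steps 4 and 6 of the process can be illustrated with a minimal sketch. It assumes that each tool's output has already been reduced to one of three class labels per tweet. The function names, the label strings, and the use of Cohen's kappa as an agreement measure are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

# Assumed label vocabulary; the study reports positive, neutral and negative classes.
LABELS = ("positive", "neutral", "negative")


def agreement_rate(labels_a, labels_b):
    """Share of tweets on which the two tools assign the same class."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Agreement between the two tools, corrected for chance agreement."""
    observed = agreement_rate(labels_a, labels_b)
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Probability that both tools pick the same class by chance.
    expected = sum(count_a[c] * count_b[c] for c in LABELS) / (n * n)
    if expected == 1.0:
        return 1.0  # both tools always emit the same single class
    return (observed - expected) / (1.0 - expected)


def sentiment_shares(records):
    """Map (language, label) pairs to the share of each label per language."""
    counts = {}
    for language, label in records:
        counts.setdefault(language, Counter())[label] += 1
    return {
        language: {c: counter[c] / sum(counter.values()) for c in LABELS}
        for language, counter in counts.items()
    }
```

Applied per plaza, `sentiment_shares` would give the proportion of positive, neutral, and negative tweets for each language group, which is the kind of quantity compared in step 6.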

The results confirm differences in high-frequency words among the three languages, although some words are shared. High-frequency words in Catalan tweets included more names of local places, whereas English tweets contained more tourism-related words; Spanish tweets fell in between. In terms of sentiment, the proportion of positive emotion was generally higher than that of negative emotion.
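
A comparison of high-frequency words across languages could be computed along the following lines. This is a minimal sketch: the tokenization rule, the stopword handling, and the function names are assumptions, not the procedure used in the study.

```python
import re
from collections import Counter

# Runs of Unicode letters, so accented words such as "plaça" stay intact.
TOKEN = re.compile(r"[^\W\d_]+")
# URLs and @mentions are dropped before counting (an assumed cleaning rule).
NOISE = re.compile(r"https?://\S+|@\w+")


def top_words(tweets, stopwords=frozenset(), k=10):
    """Return the k most frequent words in a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        cleaned = NOISE.sub(" ", text.lower())
        counts.update(w for w in TOKEN.findall(cleaned) if w not in stopwords)
    return [word for word, _ in counts.most_common(k)]


def shared_words(top_by_language):
    """Words that appear in the top list of every language."""
    sets = [set(words) for words in top_by_language.values()]
    return set.intersection(*sets) if sets else set()
```

Calling `top_words` once per language and passing the results to `shared_words` yields the words common to all groups; the remaining words in each list are those specific to one language.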

Author Biographies

Carlos Marmolejo Duarte, Universidad Politécnica de Catalunya

Associate Professor, Department of Architectural Technology; Researcher at the CPSV.

Pablo Martí Ciriquián, Universidad de Alicante

Full Professor.

Published

2020-04-28