Automatic classification of Aurora-related tweets using machine learning methods
Abstract: The constant flow of information through social media provides valuable information about all sorts of events at high temporal and spatial resolution. Over the past few years, as part of the GeoSocial project, we have been analyzing geological hazards and phenomena such as earthquakes, volcanic eruptions, landslides, floods, and the aurora in real time by geo-locating keyword-filtered tweets on a web map. To date, however, only keyword-based filtering has been applied, which does not always remove tweets that are unrelated to hazard events. This work therefore explores five learning-based classification techniques for automatic hazard-event classification based on tweets about aurora sightings: a Linear SVM and four Deep Neural Networks (DNNs), namely a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), an RNN with Long Short-Term Memory (RNN-LSTM), and an RNN with Gated Recurrent Units (GRU). In addition, for the DNNs we also trained the algorithms using pre-trained word2vec word embeddings. Finally, we evaluate the algorithms on two datasets, one from the Aurorasaurus application and one manually labeled at the BGS. We show that the DNNs, and especially the CNN, perform better on both datasets and that there is potential for improvement. Our code is also available online.
2019 2nd International Conference on Geoinformatics and Data Analysis (ICGDA 2019)
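
The limitation of keyword-based filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The tweets, the keyword list, and all function names are invented for illustration, and a simple perceptron over bag-of-words counts stands in for the Linear SVM, since both learn a linear decision boundary over word features.

```python
from collections import Counter

# Hypothetical keyword list; the actual GeoSocial keywords are not given in the abstract.
KEYWORDS = {"aurora"}

# Invented toy tweets: label +1 = aurora sighting, -1 = unrelated use of the word.
TRAIN = [
    ("aurora visible tonight northern lights sky", 1),
    ("shooting in aurora colorado", -1),
    ("green aurora glowing over the lake", 1),
    ("princess aurora costume for sale", -1),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def keyword_filter(text):
    """Keyword-based filtering: accept any tweet containing a keyword."""
    return any(tok in KEYWORDS for tok in tokenize(text))


def score(weights, bias, text):
    """Linear score over bag-of-words counts; unseen words contribute 0."""
    counts = Counter(tokenize(text))
    return bias + sum(weights[tok] * n for tok, n in counts.items())


def train_perceptron(samples, epochs=10):
    """Perceptron training: update the weights only on misclassified samples."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        mistakes = 0
        for text, label in samples:
            if label * score(weights, bias, text) <= 0:
                for tok, n in Counter(tokenize(text)).items():
                    weights[tok] += label * n
                bias += label
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


def predict(model, text):
    """Return +1 (sighting) or -1 (unrelated) for a tweet."""
    weights, bias = model
    return 1 if score(weights, bias, text) > 0 else -1


model = train_perceptron(TRAIN)

for tweet in ("aurora visible over the lake tonight", "aurora colorado costume sale"):
    print(tweet, "| keyword filter:", keyword_filter(tweet), "| classifier:", predict(model, tweet))
```

Both held-out tweets pass the keyword filter because they contain "aurora", while the learned classifier separates them using the surrounding words. A real pipeline would need far more labeled data and proper tokenization. The DNN variants named in the abstract would replace the word-count features with learned or pre-trained embeddings.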