Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Quality Assessment of Crowdsourced Data (CSD) Using Semantics and Geographical Information Retrieval (GIR) Techniques

Version 1: Received: 26 April 2018 / Approved: 26 April 2018 / Online: 26 April 2018 (10:19:02 CEST)

A peer-reviewed article of this Preprint also exists.

Koswatte, S.; McDougall, K.; Liu, X. Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques. ISPRS Int. J. Geo-Inf. 2018, 7, 256.

Abstract

Crowdsourced Data (CSD) generated by citizens is becoming more popular, as its currency and availability make it useful in a growing number of applications. However, the quality of CSD, including its relevance, is often questioned because the data is neither generated by professionals nor collected following standard data collection procedures. The quality of CSD can be assessed according to a range of attributes, including its relevance. Information relevance has been explored in the Geographic Information Retrieval (GIR) domain, where techniques exist to identify relevant information. This research tested a relevance assessment approach for CSD by adapting relevance assessment techniques available in the GIR domain. Thematic and geographic relevance were assessed using Term Frequency-Inverse Document Frequency (TF-IDF), the Vector Space Model (VSM) and Natural Language Processing (NLP) techniques. The thematic and geographic specificities of the queries were calculated as 0.44 and 0.67 respectively, which indicates that the queries used were more geographically specific than thematically specific. A Spearman's rho value of 0.62 indicated that the final ranked relevance lists showed reasonable agreement with a manually classified list, and confirmed the potential of the approach for CSD relevance assessment in other crowdsourced data analyses.
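The abstract does not describe the implementation, so the following is only a minimal sketch of the general techniques it names: TF-IDF weighting, cosine similarity in a vector space model, and Spearman's rho for comparing two rankings. The example messages, the query, the whitespace tokenisation and all function names are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenised document.

    Uses relative term frequency and idf = ln(N / df); this is one common
    TF-IDF variant, assumed here for illustration only.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same items, assuming no ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical crowdsourced messages and query (not data from the study).
messages = [
    "flood water rising near river bridge",
    "road closed due to flood",
    "great coffee this morning",
]
query = "flood river"

# Weight the messages and the query together so they share one vocabulary.
tokenised = [m.split() for m in messages] + [query.split()]
vectors = tf_idf_vectors(tokenised)
query_vec = vectors[-1]
scores = [cosine(vec, query_vec) for vec in vectors[:-1]]

# Rank message indices from most to least similar to the query.
ranked = sorted(range(len(messages)), key=lambda i: scores[i], reverse=True)
```

In a workflow of this kind, a ranking such as `ranked` would be converted to rank positions and compared with a manually classified ranking via `spearman_rho`; values closer to 1 indicate stronger agreement between the two lists.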

Keywords

crowdsourced data; relevance; semantics; geographic information retrieval; natural language processing

Subject

Environmental and Earth Sciences, Other
