Preprint Article, Version 1. This version is not peer-reviewed.

Quality Assessment of Crowdsourced Data (CSD) Using Semantics and Geographical Information Retrieval (GIR) Techniques

Version 1: Received: 26 April 2018 / Approved: 26 April 2018 / Online: 26 April 2018 (10:19:02 CEST)

A peer-reviewed article of this Preprint also exists.

Koswatte, S.; McDougall, K.; Liu, X. Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques. ISPRS Int. J. Geo-Inf. 2018, 7, 256.

Journal reference: ISPRS Int. J. Geo-Inf. 2018, 7, 256
DOI: 10.3390/ijgi7070256

Abstract

Crowdsourced Data (CSD) generated by citizens is becoming more popular, and its currency and availability are increasing its potential use in many applications. However, the quality of CSD, including its relevance, is often questioned because the data is neither generated by professionals nor collected following standard data collection procedures. The quality of CSD can be assessed according to a range of attributes, including its relevance. Information relevance has been explored in the Geographic Information Retrieval (GIR) domain, where techniques exist to identify relevant information. This research tested a relevance assessment approach for CSD by adapting relevance assessment techniques available in the GIR domain. Thematic and geographic relevance were assessed using Term Frequency-Inverse Document Frequency (TF-IDF), the Vector Space Model (VSM) and Natural Language Processing (NLP) techniques. The thematic and geographic specificities of the queries were calculated as 0.44 and 0.67 respectively, indicating that the queries used were more geographically specific than thematically specific. A Spearman's rho value of 0.62 indicated that the final ranked relevance lists showed reasonable agreement with a manually classified list, confirming the potential of the approach for CSD relevance assessment and potentially for other crowdsourced data analyses.
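To make the techniques named in the abstract concrete, the following is a minimal sketch of TF-IDF weighting, Vector Space Model ranking by cosine similarity, and Spearman's rho for comparing two rankings. It is not the authors' implementation: the function names, the smoothed IDF formula (log(N/df) + 1), the whitespace tokenizer and the sample messages are all assumptions made for illustration, and the geographic component and NLP processing are not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, keeping alphabetic tokens only (assumed tokenizer)."""
    return [t for t in text.lower().split() if t.isalpha()]


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumed IDF variant: log(N / df) + 1, so terms present in every document keep some weight.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_relevance(query, docs):
    """Rank document indices by cosine similarity to the query (most relevant first)."""
    vectors, idf = tf_idf_vectors(docs)
    q_tf = Counter(tokenize(query))
    q_len = sum(q_tf.values())
    # Query terms that never occur in the collection have no IDF and are ignored.
    q_vec = {t: (c / q_len) * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same items, assuming no tied ranks."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - (6.0 * d_squared) / (n * (n * n - 1))


# Hypothetical crowdsourced messages, invented for this example.
messages = [
    "flood water rising near the bridge",
    "traffic jam on the highway",
    "great coffee this morning",
]
order, scores = rank_by_relevance("flood bridge", messages)
print(order, scores)
```

In a setting like the one the abstract describes, `spearman_rho` would be applied to the rank positions from the automated list and the manually classified list. Values near 1 indicate strong agreement, and the closed-form formula used here is exact only when neither list contains ties.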

Subject Areas

crowdsourced data; relevance; semantics; geographic information retrieval; natural language processing
