Preprint, Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data

Version 1 : Received: 8 February 2020 / Approved: 9 February 2020 / Online: 9 February 2020 (16:02:03 CET)

How to cite: Yin, J.; Afa Michael, I.; Afa, I.J. Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data. Preprints 2020, 2020020108. https://doi.org/10.20944/preprints202002.0108.v1

Abstract

Machine learning plays a key role in present-day crime detection, analysis and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We first visualize and analyze crime statistics from recent years in the city of Boston. We then carry out a comparative study of two supervised learning algorithms, decision tree and random forest, based on the accuracy and processing time of the models. The models make predictions from geographical and temporal information, with the data split into training and test sets. The results show that random forest, as expected, achieves 1.54% higher accuracy than decision tree, although this comes at a cost of at least 4.37 times the processing time. The study opens the door to applying similar supervised methods in crime data analytics and other fields of data science.
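The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (their models are written in R and provided in the supplementary material); it is a standard-library Python toy in which everything is an assumption made for illustration: the synthetic features (hour, weekday, district), the three severity labels, the labelling rule, the 10% label noise, and the tree and forest settings. It trains a single Gini-based decision tree and a small bagged forest of such trees on a training split, then reports test accuracy and wall-clock time for each.

```python
import random
import time
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, depth=0, max_depth=4, n_feats=None):
    """Grow a small classification tree; a leaf is a label, a node is a tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    feats = list(range(len(X[0])))
    if n_feats is not None:
        feats = rng.sample(feats, n_feats)  # random feature subset per split
    best = None
    for f in feats:
        # Skip the largest value: splitting there would leave the right side empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no sampled feature had two distinct values
        return majority
    _, f, t, left, right = best
    kids = [build_tree([X[i] for i in idx], [y[i] for i in idx], rng,
                       depth + 1, max_depth, n_feats) for idx in (left, right)]
    return (f, t, kids[0], kids[1])

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, rng, n_trees=15, max_depth=4):
    """Bagging: each tree sees a bootstrap sample and 2 random features per split."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, max_depth=max_depth, n_feats=2))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def make_data(n, rng):
    """Synthetic stand-in for crime records (hypothetical rule, not Boston data)."""
    X, y = [], []
    for _ in range(n):
        hour, weekday, district = rng.randrange(24), rng.randrange(7), rng.randrange(12)
        label = 1 if hour >= 18 else (2 if district < 4 else 3)
        if rng.random() < 0.1:  # label noise
            label = rng.choice([1, 2, 3])
        X.append((hour, weekday, district))
        y.append(label)
    return X, y

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

rng = random.Random(0)
X, y = make_data(800, rng)
X_train, y_train, X_test, y_test = X[:600], y[:600], X[600:], y[600:]

start = time.perf_counter()
tree = build_tree(X_train, y_train, rng)
acc_tree = accuracy([predict_tree(tree, r) for r in X_test], y_test)
time_tree = time.perf_counter() - start

start = time.perf_counter()
forest = fit_forest(X_train, y_train, rng)
acc_forest = accuracy([predict_forest(forest, r) for r in X_test], y_test)
time_forest = time.perf_counter() - start

print(f"decision tree: accuracy={acc_tree:.3f}, time={time_tree:.3f}s")
print(f"random forest: accuracy={acc_forest:.3f}, time={time_forest:.3f}s")
print(f"time ratio (forest / tree): {time_forest / time_tree:.2f}x")
```

The numbers printed by this toy depend on the synthetic data and settings and are not comparable to the 1.54% and 4.37x figures reported above. The sketch only shows the shape of the trade-off: a forest trains many trees, so its processing time grows roughly with the number of trees, while any accuracy gain over a single tree depends on the data.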

Supplementary and Associated Material

http://www.dropbox.com/s/7r05fag4z4vhsh9/Boston_Crime.zip?dl=0: Shapefile codes, modeling codes in R, and data folder

Keywords

machine learning; decision tree; random forest; crime data analytics

Subject

Computer Science and Mathematics, Information Systems

Comments (0)

Comment 1
Received: 16 April 2020
The commenter has declared there is no conflict of interests.
Comment: Hello,

I am a student in forensic science. I hope it is not too late, but I may have some useful information about your visualisations.
You could normalise the data so that 2015 and 2018 can be compared with the other years.
The bar graphs in Figure 3 could be improved by showing one bar for the yearly mean and twelve further bars for the monthly means. Since you only have 4 years, it should not take too much space. You could apply the same type of transformation to part b. I think it would be more explicit.
Figure 6 would be easier to analyse if each type of crime were put on a separate figure and labelled as Part 1, 2 and 3 of the UCR category. You could also use density maps.
For Figure 8, you should be able to get more useful information by weighting the types of offences.

I hope this is as useful as I meant it to be.

Regards