Hartgerink, C.H.J. 688,112 Statistical Results: Content Mining Psychology Articles for Statistical Test Results. Data 2016, 1, 14.
Journal reference: Data 2016, 1, 14 DOI: 10.3390/data1030014
Abstract
In this data deposit, I describe a dataset that results from content mining 167,318 published articles for statistical test results. The mining extracted 688,112 results from 50,845 articles. To make the dataset comprehensive, the statistical results are supplemented with metadata from the articles in which they were reported. The dataset is provided as a comma-separated values (CSV) file in long format. For each of the 688,112 results, 20 variables are included, of which seven are article metadata and 13 pertain to the individual statistical result (e.g., reported and recalculated p-value). A five-pronged approach was taken to generate the dataset: (i) collect journal lists, (ii) spider journal pages for articles, (iii) download articles, (iv) add article metadata, and (v) mine articles for statistical results.
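To illustrate what a long-format file with a reported and a recalculated p-value looks like in practice, the following minimal Python sketch may help. It is not the procedure used to build the dataset. The column names, the sample rows, the DOIs, and the fixed tolerance are all hypothetical, and only a two-tailed Z test is handled so that the example needs nothing beyond the standard library.

```python
import csv
import io
import math

# Hypothetical long-format sample: one row per extracted result, with the
# article metadata (doi, journal) repeated on every row. These column names
# and values are illustrative assumptions, not the dataset's actual variables.
SAMPLE_CSV = """doi,journal,statistic,value,reported_p
10.0000/example.1,Example Journal A,Z,1.96,0.05
10.0000/example.1,Example Journal A,Z,2.58,0.03
10.0000/example.2,Example Journal B,Z,0.50,0.62
"""


def two_tailed_p_from_z(z):
    """Return the two-tailed p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def recalculate(rows, tolerance=0.005):
    """Add a recalculated p-value and a consistency flag to each row.

    The fixed tolerance is a simplifying assumption for this sketch; a
    careful check would account for how the reported value was rounded.
    """
    checked = []
    for row in rows:
        recalculated = two_tailed_p_from_z(float(row["value"]))
        reported = float(row["reported_p"])
        checked.append({
            **row,
            "recalculated_p": recalculated,
            "consistent": abs(recalculated - reported) <= tolerance,
        })
    return checked


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
results = recalculate(rows)

# Long format means several results can share one article.
articles = {row["doi"] for row in results}

for row in results:
    print(row["doi"], row["value"], row["reported_p"],
          round(row["recalculated_p"], 4), row["consistent"])
```

Because each row carries its own article metadata, counting distinct articles reduces to collecting the unique values of the identifier column, which is the same relationship as the 688,112 results drawn from 50,845 articles described above.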
Keywords
NHST; p-values; APA; content mining; TDM; errors
Subject
BEHAVIORAL SCIENCES, Other
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.