Differences between preprints and journal articles: Trial using bioRxiv data
In this paper, we sought insight into how research is conducted, and in particular how journal articles are produced, by comparing preprints with the journal articles that are finally published. First, thanks to the recent trend toward open journals, we were able to secure a certain amount of full-text XML for both preprints and journal articles, and we verified that comparing the two is technically feasible. However, within the scope of this trial, which examined external criteria such as the number of references and the number of words, together with simple document similarity, we found no clear difference between preprints and journal articles, or between preprints that became journal articles and those that did not. Even with a machine learning method, classification accuracy was low, at about 47%. The absence of a significant difference between preprints and journal articles has been shown in previous studies; this trial replicates it on a larger and relatively recent dataset. The new findings of this paper are that the differences in many external criteria, such as the number of authors, are small, and that preprints that did not become journal articles also differ little from those that did.
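As an illustration of the kind of external criteria and simple document similarity mentioned above, the following is a minimal sketch, not the method used in the paper. It assumes plain text has already been extracted from the full-text XML, and the function names (`tokenize`, `word_count`, `cosine_similarity`) and the bag-of-words cosine measure are our own assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_count(text):
    """Number of tokens, used here as a stand-in for the 'number of words' criterion."""
    return len(tokenize(text))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words term counts of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Dot product over the terms the two texts share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no defined direction; treat it as dissimilar.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical example texts, not taken from the bioRxiv data.
    preprint = "We compare preprints and journal articles using simple features."
    article = "We compare preprints with published journal articles using simple features."
    print(word_count(preprint), word_count(article))
    print(round(cosine_similarity(preprint, article), 3))
```

A similarity close to 1 for a preprint and its published version would be consistent with the abstract's observation that the two differ little, although the paper's actual similarity measure may be different.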