Author: Camron Assadi

A mine of information – the PLOS Text Mining Collection

The growth of Open Access has increased the pool of digital information that is available for Text Mining. This relatively new interdisciplinary field emerged in the 1980s and combines techniques from linguistics, computer science and statistics to build tools that can efficiently retrieve, extract and analyze information from digital text.

PLOS has actively promoted the field of text mining by publishing reviews, opinions, tutorials and dozens of primary research articles in PLOS Biology, PLOS Computational Biology and PLOS ONE. Furthermore, PLOS is one of the few publishers that enable and encourage text mining research by providing an open API for mining our journal content.

In order to raise the profile of this field and PLOS’ contribution to it, we’re delighted to announce the PLOS Text Mining Collection, which gathers together key works published by PLOS in the field of Text Mining, including three new research articles published today in PLOS ONE.

Whilst the promise of Text Mining is yet to be fully realized, a world in which all scientific literature is Open Access, allowing Text Mining tools to compare and contrast the full body of published findings, now seems possible. Text Mining has the potential to enable new discoveries and open up new fields of research, but it will always depend on the literature that is available. As part of our mission to lead a transformation in research communication, we’re delighted to showcase the research from this field and encourage debate within the community and beyond.

This collection is open for new submissions across all PLOS journals. New articles will be added to the Collection page periodically. For more information you can also visit the EveryONE blog.

Category: Alt-Metrics, PLoS Biology, PLoS Collections, PLoS Computational Biology, PLoS ONE, Publishing, Technology

Call for Papers: PLoS Text Mining Collection

The Public Library of Science (PLoS) seeks submissions in the broad field of text-mining research for a collection to be launched across all of its journals in 2013. All articles submitted before October 30th, 2012 will be considered for the launch of the collection. Please read the following post for further information on how to submit your article. This post was authored by Casey M. Bergman, Lawrence E. Hunter and Andrey Rzhetsky.

The scientific literature is exponentially increasing in size, with thousands of new papers published every day. Few researchers are able to keep track of all new publications, even in their own field, reducing the quality of scholarship and leading to undesirable outcomes like redundant publication. While social media and expert recommendation systems provide partial solutions to the problem of keeping up with the literature, systematically identifying relevant articles and extracting key information from them can only come through automated text-mining technologies.

Research in text mining has made incredible advances over the last decade, driven by community challenges and increasingly sophisticated computational technologies. However, the promise of text mining to accelerate and enhance research has largely not yet been fulfilled, primarily because the vast majority of the scientific literature is not published under an Open Access model. As Open Access publishing yields an ever-growing archive of unrestricted full-text articles, text mining will play an increasingly important role in drilling down to essential research and data in the scientific literature of the 21st-century scholarly landscape.

As part of its commitment to realizing the maximal utility of Open Access literature, PLoS is launching a collection of articles dedicated to highlighting the importance of research in the area of text mining. The launch of this Text Mining Collection complements related PLoS Collections on Open Access and Altmetrics (forthcoming), as well as the recent release of the PLoS Application Programming Interface, which provides an open API to PLoS journal content.

As part of this Text Mining Collection, we are calling for high-quality submissions that advance the field of text-mining research, including:

  • New methods for the retrieval or extraction of published scientific facts
  • Large-scale analysis of data extracted from the scientific literature
  • New interfaces for accessing the scientific literature
  • Semantic enrichment of scientific articles
  • Linking the literature to scientific databases
  • Application of text mining to database curation
  • Approaches for integrating text mining into workflows
  • Resources (ontologies, corpora) to improve text mining research

Please note that all articles submitted before October 30th, 2012 will be considered for the launch of the collection (expected early 2013); articles submitted after this date will still be considered for the collection, but may not appear in it at launch.

Submission Guidelines
If you wish to submit your research to the PLoS Text Mining Collection, please consider the following when preparing your manuscript:

  • All articles must adhere to the submission guidelines of the PLoS journal to which you submit.
  • Standard PLoS policies and relevant publication fees apply to all submissions.
  • Submission to any PLoS journal as part of the Text Mining Collection does not guarantee publication.

When you are ready to submit your manuscript to the collection, please log in to the relevant PLoS manuscript submission system and mention the Collection’s name in your cover letter. This will ensure that the staff is aware of your submission to the Collection. The submission systems can be found on the individual journal websites.

Please contact Samuel Moore if you would like further information about how to submit your research to the PLoS Text Mining Collection.

Casey Bergman (University of Manchester)
Lawrence Hunter (University of Colorado-Denver)
Andrey Rzhetsky (University of Chicago)

Update (05/23/2012): The organizers would like to point out that the Collection is open to all forms of content mining (images, diagrams, audio, etc.) and not just text. We will accept a broad interpretation of what constitutes text mining.

Category: Alt-Metrics, Open Access, PLoS Collections, Publishing, Technology