

Table of contents

LG-covid19-HOTP analysis is organized in the following way.


Maps & self-navigated surf-search

We provide below two maps of the articles in the current data collection. On each map, a cluster indicates closer relationships in concepts, themes or methodologies.

New features since April 11: Self-navigated surf-search

Clicking on each map will activate self-navigated surf-search, in the following three modes with seamless transition.

Here are the two maps. Enjoy the surf-search!

Statistical neighbor embedding of 485,097 articles by co-reference similarity, i.e., the similarity between outCitation lists, computed with SG-t-SNE [link]. The closer two article dots are, the more articles their reference lists share according to our literature graph. A dot cluster indicates closer relationships in concepts, themes or methodologies. Click on the map to activate self-navigated surf-search.
Statistical neighbor embedding of 485,097 articles by co-citation similarity, i.e., the similarity between inCitation lists according to our literature graph, computed with SG-t-SNE [link]. The closer two article dots are, the more articles cite both of them. A dot cluster indicates closer relationships in concepts, themes or methodologies. Click on the map to activate self-navigated surf-search.
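
To make the two notions of similarity concrete, here is a minimal sketch on toy data. The Jaccard overlap, the function names and the record IDs are assumptions made for illustration; the page does not state which similarity measure is fed to SG-t-SNE.

```python
def jaccard(a, b):
    """Jaccard overlap of two ID collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def invert(out_citations):
    """Build inCitation lists (who cites each article) from outCitation lists."""
    in_citations = {}
    for citing, refs in out_citations.items():
        for cited in refs:
            in_citations.setdefault(cited, []).append(citing)
    return in_citations


# Hypothetical toy records: article ID -> outCitation (reference) list.
out_citations = {
    "A": ["R1", "R2", "R3"],
    "B": ["R2", "R3", "R4"],
    "C": ["R5"],
}
in_citations = invert(out_citations)

# Co-reference similarity: overlap between the reference lists of A and B.
co_reference_AB = jaccard(out_citations["A"], out_citations["B"])

# Co-citation similarity: overlap between the sets of articles citing R2 and R3.
co_citation_R2R3 = jaccard(in_citations["R2"], in_citations["R3"])
```

In this toy example, A and B share two of four distinct references, while R2 and R3 are cited by exactly the same articles, so the second pair would sit closer together on a co-citation map.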

Abstract

Parallel to the CORD-19 dataset of scholarly articles, we provide the literature graph LG-covid19-HOTP (10.5281/zenodo.3728215), composed not only of articles (graph nodes) relevant to the study of coronavirus, but also of in- and out-citation links (directed graph edges) that support navigation and search among the articles. The article records are related and connected, not isolated. The graph has been updated weekly since March 26, 2020. The current graph includes 42,279 hot-off-the-press (HOTP) articles published since January 2020. It contains 485,097 articles and 4,259,944 links. The link-to-node ratio is remarkably higher than that of some other existing literature graphs. In addition to the dataset, we provide more functionality at lg-covid-19-hotp.cs.duke.edu, such as new articles, weekly metadata analysis of publication growth over time, ranking by citation, and statistical near-neighbor embedding maps by similarity in co-citation and in co-reference. Since April 11, we have enabled a novel functionality: self-navigated surf-search over the maps. At the site we also welcome input on COVID-19 articles that are missing from the current collection. The dataset is also available through Kaggle.
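
As a quick check on the link-to-node ratio, the arithmetic can be reproduced from the two counts reported in the abstract. The variable names are chosen here for illustration only.

```python
num_articles = 485_097   # graph nodes, as reported in the abstract
num_links = 4_259_944    # directed citation edges, as reported in the abstract

# Average number of citation links per article record.
link_to_node_ratio = num_links / num_articles
print(f"{link_to_node_ratio:.2f} links per article")
```

With these counts the ratio comes to roughly 8.8 links per article.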

New (most recent articles)


Data description

Graph data are composed not only of datum records (nodes) but also of relations (edges) among them. The literature graph LG-covid19-HOTP is generated as follows. We start with 14,584 seed articles. We make a forward span by searching for the articles that cite the seed articles, and name the resulting set, which includes the seed articles, the foreground HOTP-FG. We then make a backward span from HOTP-FG by tracing all the reference (outCitation) lists. See the Venn diagram in Fig. 1. We complete the graph with the citation links among the collected article records.

Fig.1 - LG-covid19-HOTP node set formation: from 14,584 seed articles to the foreground set of 41,404 articles by a forward span, and to the final set of 485,097 articles by a backward span from HOTP-FG.
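
The node set formation described above can be sketched on toy data as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the function names and record IDs are hypothetical, citation data is assumed to be available as a mapping from each article to its outCitation list, and the backward span is assumed to go one level deep.

```python
def forward_span(seeds, out_citations):
    """Seeds plus every article whose reference list cites at least one seed."""
    seeds = set(seeds)
    citing = {a for a, refs in out_citations.items() if seeds & set(refs)}
    return seeds | citing


def backward_span(foreground, out_citations):
    """Foreground plus every article found on a foreground reference list."""
    result = set(foreground)
    for article in foreground:
        result.update(out_citations.get(article, []))
    return result


def citation_links(nodes, out_citations):
    """Directed edges (citing, cited) with both endpoints in the node set."""
    return {
        (a, r)
        for a in nodes
        for r in out_citations.get(a, [])
        if r in nodes
    }


# Hypothetical toy records: article ID -> outCitation (reference) list.
out_citations = {
    "seed1": ["old1", "old2"],
    "new1": ["seed1", "old3"],
    "new2": ["old1"],  # cites no seed, so it is not picked up by the forward span
}

hotp_fg = forward_span(["seed1"], out_citations)  # foreground set
hotp = backward_span(hotp_fg, out_citations)      # final node set
links = citation_links(hotp, out_citations)       # edges among collected records
```

Here the foreground is {seed1, new1}, the final node set adds old1, old2 and old3, and four citation links connect the collected records.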

We will keep updating the literature graph. We welcome input from the research community on seminal or notable articles that are missing from the current collection.

Basic metadata analysis & visualization

Fig.2 - Weekly histogram of 41,404 HOTP articles in 2020 (as of May 7, 2020).
Fig.3 - Histograms of article counts by publication year: in red, the HOTP-ForeGround dataset (41,404 in total); in blue, the HOTP-BackGround dataset (485,097 in total); and in gold, CORD-19 version 12 (59,000 in total).
Fig.4 - Geographic map of 1,657 institutes with publications in the LG-covid19-HOTP data set. The map is a contribution of Konstantinos Kitsios. Click on the image for interactive visualization.
Fig.5 - Rank-size distributions of in-citation counts for the 14,584 seed articles from 5 data sources: (1) LG-covid19-HOTP-Background (HOTP-BG) with inCitation lists and counts (list lengths) available; (2) Crossref with counts available; (3) S2-API with lists and counts available; (4) S2-ORC, as of Nov. 2019, with lists and counts available; (5) Scopus with lists and counts available. The 14,584 seed papers are highly cited, according to HOTP and Crossref.

Fig.7 - Rank-size distribution of outCitation counts over HOTP (41,404 articles) from two data sources: left, HOTP (including the seed articles, forward span and backward span) with outCitation lists and counts available; right, Crossref with counts available. The well-matched profiles suggest that the HOTP dataset captures most of the outCitation links.
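
For readers unfamiliar with rank-size plots, the following minimal sketch shows how such a distribution can be derived from a table of citation counts. The function name and the toy counts are hypothetical and are not taken from the dataset.

```python
def rank_size(counts):
    """Return (rank, size) pairs, with sizes sorted in descending order."""
    sizes = sorted(counts, reverse=True)
    return list(enumerate(sizes, start=1))


# Hypothetical toy data: article ID -> citation count (list length).
in_citation_counts = {"A": 3, "B": 10, "C": 0, "D": 3}

distribution = rank_size(in_citation_counts.values())
```

Plotting size against rank, typically on log-log axes, gives one curve per data source, so the profiles from different sources can be compared directly.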

Top lists

[A] Top 10 HOTP-FG articles (within HOTP-FG, v1.0.0: March 26, 2020)

  1. W. Li, Bats are natural reservoirs of SARS-like coronaviruses, Science, 310:5748 pp676-679, 2005
  2. F Carrat, A Flahault, Influenza vaccine: The challenge of antigenic drift, Vaccine, 25:39-40 pp6852-6862, 2007
  3. Aaron R. Everitt, Simon Clare, Thomas Pertel et al., IFITM3 restricts the morbidity and mortality associated with influenza, Nature, 484:7395 pp519-523, 2012
  4. Na Zhu, Dingyu Zhang, Wenling Wang et al., A novel coronavirus from patients with pneumonia in China, 2019, New England Journal of Medicine, 382:8 pp727-733, 2020
  5. B. Zhou, M. E. Donnelly, D. T. Scholes et al., Single-reaction genomic amplification accelerates sequencing and vaccine production for classical and swine origin human Influenza A viruses, Journal of Virology, 83:19 pp10309-10313, 2009
  6. Qun Li, Xuhua Guan, Peng Wu et al., Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia, New England Journal of Medicine ppNEJMoa2001316, 2020
  7. Y. Fujii, H. Goto, T. Watanabe et al., Selective incorporation of influenza virus RNA segments into virions, Proceedings of the National Academy of Sciences, 100:4 pp2002-2007, 2003
  8. Chaolin Huang, Yeming Wang, Xingwang Li et al., Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China, The Lancet, 395:10223 pp497-506, 2020
  9. C. H. Calisher, J. E. Childs, H. E. Field et al., Bats: Important reservoir hosts of emerging viruses, Clinical Microbiology Reviews, 19:3 pp531-545, 2006
  10. Xing-Yi Ge, Jia-Lu Li, Xing-Lou Yang et al., Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor, Nature, 503:7477 pp535-538, 2013

[B] Top 10 HOTP-FG articles (within HOTP, v1.0.0: March 26, 2020)

  1. W. Li, Bats are natural reservoirs of SARS-like coronaviruses, Science, 310:5748 pp676-679, 2005
  2. Sun-Woo Yoon, Richard J. Webby, Robert G. Webster, Evolution and Ecology of Influenza A Viruses, Influenza Pathogenesis and Control - Volume I, 385: pp359-375, 2014
  3. F Carrat, A Flahault, Influenza vaccine: The challenge of antigenic drift, Vaccine, 25:39-40 pp6852-6862, 2007
  4. Aaron R. Everitt, Simon Clare, Thomas Pertel et al., IFITM3 restricts the morbidity and mortality associated with influenza, Nature, 484:7395 pp519-523, 2012
  5. C. H. Calisher, J. E. Childs, H. E. Field et al., Bats: Important reservoir hosts of emerging viruses, Clinical Microbiology Reviews, 19:3 pp531-545, 2006
  6. V. Stalin Raj, Huihui Mou, Saskia L. Smits et al., Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC, Nature, 495:7440 pp251-254, 2013
  7. Na Zhu, Dingyu Zhang, Wenling Wang et al., A novel coronavirus from patients with pneumonia in China, 2019, New England Journal of Medicine, 382:8 pp727-733, 2020
  8. B. Zhou, M. E. Donnelly, D. T. Scholes et al., Single-reaction genomic amplification accelerates sequencing and vaccine production for classical and swine origin human Influenza A viruses, Journal of Virology, 83:19 pp10309-10313, 2009
  9. S. Tong, Y. Li, P. Rivailler et al., A distinct lineage of influenza A virus from bats, Proceedings of the National Academy of Sciences, 109:11 pp4269-4274, 2012
  10. Jianhua Sui, William C Hwang, Sandra Perez et al., Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses, Nature Structural & Molecular Biology, 16:3 pp265-273, 2009

[C] Top 10 HOTP-FG articles (by Crossref counts, v1.0.0: March 26, 2020)

  1. William M. Schneider, Meike Dittmann Chevillotte, Charles M. Rice, Interferon-stimulated genes: A complex web of host defenses, Annual Review of Immunology, 32:1 pp513-545, 2014
  2. W. Li, Bats are natural reservoirs of SARS-like coronaviruses, Science, 310:5748 pp676-679, 2005
  3. Jianhua Sui, William C Hwang, Sandra Perez et al., Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses, Nature Structural & Molecular Biology, 16:3 pp265-273, 2009
  4. C. H. Calisher, J. E. Childs, H. E. Field et al., Bats: Important reservoir hosts of emerging viruses, Clinical Microbiology Reviews, 19:3 pp531-545, 2006
  5. Suxiang Tong, Xueyong Zhu, Yan Li et al., New world bats harbor diverse influenza A viruses, PLoS Pathogens, 9:10 ppe1003657, 2013
  6. Finlay McNab, Katrin Mayer-Barber, Alan Sher et al., Type I interferons in infectious disease, Nature Reviews Immunology, 15:2 pp87-103, 2015
  7. S. Tong, Y. Li, P. Rivailler et al., A distinct lineage of influenza A virus from bats, Proceedings of the National Academy of Sciences, 109:11 pp4269-4274, 2012
  8. V. Stalin Raj, Huihui Mou, Saskia L. Smits et al., Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC, Nature, 495:7440 pp251-254, 2013
  9. Chaolin Huang, Yeming Wang, Xingwang Li et al., Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China, The Lancet, 395:10223 pp497-506, 2020
  10. Mark von Itzstein, The war against influenza: discovery and development of sialidase inhibitors, Nature Reviews Drug Discovery, 6:12 pp967-974, 2007

Seed lists

Sources & tools

Change log

Acknowledgements

Thaleia-M. Passia, Konstantinos Kitsios, Xi Chen, Tony Goldish