Project Summary: To build a public open dataset of chest X-ray and CT images of patients who are positive for or suspected of having COVID-19 or other viral and bacterial pneumonias (MERS, SARS, and ARDS). Data will be collected from public sources as well as through indirect collection from hospitals and physicians. All images and data will be released publicly in this GitHub repo.
This project is approved by the University of Montreal's Ethics Committee #CERSES-20-058-D.
The labels are arranged in a hierarchy:
Current stats of PA, AP, and AP Supine views. Labels 0=No or 1=Yes. Data loader is here
COVID19_Dataset num_samples=481 views=['PA', 'AP']
{'ARDS': {0.0: 465, 1.0: 16},
'Bacterial': {0.0: 445, 1.0: 36},
'COVID-19': {0.0: 162, 1.0: 319},
'Chlamydophila': {0.0: 480, 1.0: 1},
'E.Coli': {0.0: 481},
'Fungal': {0.0: 459, 1.0: 22},
'Influenza': {0.0: 478, 1.0: 3},
'Klebsiella': {0.0: 474, 1.0: 7},
'Legionella': {0.0: 474, 1.0: 7},
'Lipoid': {0.0: 473, 1.0: 8},
'MERS': {0.0: 481},
'Mycoplasma': {0.0: 476, 1.0: 5},
'No Finding': {0.0: 467, 1.0: 14},
'Pneumocystis': {0.0: 459, 1.0: 22},
'Pneumonia': {0.0: 36, 1.0: 445},
'SARS': {0.0: 465, 1.0: 16},
'Streptococcus': {0.0: 467, 1.0: 14},
'Varicella': {0.0: 476, 1.0: 5},
'Viral': {0.0: 138, 1.0: 343}}
COVID19_Dataset num_samples=173 views=['AP Supine']
{'ARDS': {0.0: 170, 1.0: 3},
'Bacterial': {0.0: 169, 1.0: 4},
'COVID-19': {0.0: 41, 1.0: 132},
'Chlamydophila': {0.0: 173},
'E.Coli': {0.0: 169, 1.0: 4},
'Fungal': {0.0: 171, 1.0: 2},
'Influenza': {0.0: 173},
'Klebsiella': {0.0: 173},
'Legionella': {0.0: 173},
'Lipoid': {0.0: 173},
'MERS': {0.0: 173},
'Mycoplasma': {0.0: 173},
'No Finding': {0.0: 170, 1.0: 3},
'Pneumocystis': {0.0: 171, 1.0: 2},
'Pneumonia': {0.0: 26, 1.0: 147},
'SARS': {0.0: 173},
'Streptococcus': {0.0: 173},
'Varicella': {0.0: 173},
'Viral': {0.0: 41, 1.0: 132}}
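The label hierarchy and the 0/1 counts above can be illustrated with a small sketch. It assumes (this is not stated in the text here) that each image's finding is written as a slash-separated path such as `Pneumonia/Viral/COVID-19`, so that a COVID-19 image also counts as Viral and as Pneumonia. The sample findings and the helper names `expand` and `label_stats` are hypothetical, and this is not the project's data loader.

```python
from collections import Counter

# Hypothetical findings, one per image. The slash-separated path format is an assumption.
findings = [
    "Pneumonia/Viral/COVID-19",
    "Pneumonia/Viral/COVID-19",
    "Pneumonia/Viral/SARS",
    "Pneumonia/Bacterial/Streptococcus",
    "No Finding",
]

def expand(finding):
    """Return every label on the path, e.g. {'Pneumonia', 'Viral', 'COVID-19'}."""
    return {part.strip() for part in finding.split("/") if part.strip()}

def label_stats(findings):
    """For each label seen, count images where it is present (1.0) or absent (0.0)."""
    per_image = [expand(f) for f in findings]
    all_labels = sorted(set().union(*per_image))
    stats = {}
    for label in all_labels:
        counts = Counter(1.0 if label in labels else 0.0 for labels in per_image)
        stats[label] = dict(sorted(counts.items()))
    return stats

stats = label_stats(findings)
for label, counts in stats.items():
    print(label, counts)
```

Because a parent label is set whenever any of its children is set, parent counts are always at least as large as child counts, which matches the pattern in the statistics above (Pneumonia ≥ Viral ≥ COVID-19).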
Lung Bounding Boxes and Chest X-ray Segmentation (license: CC BY 4.0) contributed by General Blockchain, Inc.
Pneumonia severity scores for 94 images (license: CC BY-SA) from the paper Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning
Generated Lung Segmentations (license: CC BY-SA) from the paper Lung Segmentation from Chest X-rays using Variational Data Imputation
Brixia score for 192 images (license: CC BY-NC-SA) from the paper End-to-end learning for semiquantitative rating of COVID-19 severity on Chest X-rays
Lung and other segmentations for 517 images (license: CC BY) in COCO and raster formats by v7labs
Submit data directly to the project. View our research protocol. Contact us to start the process.
We can extract images from publications. Help identify publications which are not already included using a GitHub issue (DOIs we have are listed in the metadata file). There is a searchable database of COVID-19 papers here, and a non-searchable one (requires download) here.
Submit data to these sites (we can scrape the data from them):
Provide bounding box/masks for the detection of problematic regions in images already collected.
See SCHEMA.md for more information on the metadata schema.
Formats: For chest X-rays, dcm, jpg, or png are preferred. For CT, nifti (in gzip format) is preferred, but dcm files are also accepted. Please contact us with any questions.
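As a rough illustration of the format preferences above, a contributor could pre-check file names before submitting. This is only a sketch: the helper names, the modality strings, and the "preferred"/"accepted"/"other" result values are assumptions made for the example, not part of the project's tooling.

```python
from pathlib import Path

# Extensions taken from the format note above; the grouping is illustrative.
XRAY_PREFERRED = {".dcm", ".jpg", ".png"}
CT_PREFERRED = {".nii.gz"}   # nifti in gzip format
CT_ACCEPTED = {".dcm"}

def extension(filename):
    """Return the lowercased extension, treating '.nii.gz' as a single unit."""
    name = Path(filename).name.lower()
    if name.endswith(".nii.gz"):
        return ".nii.gz"
    return Path(name).suffix

def check_format(filename, modality):
    """Classify a file as 'preferred', 'accepted', or 'other' for 'X-ray' or 'CT'."""
    ext = extension(filename)
    if modality == "X-ray":
        return "preferred" if ext in XRAY_PREFERRED else "other"
    if modality == "CT":
        if ext in CT_PREFERRED:
            return "preferred"
        return "accepted" if ext in CT_ACCEPTED else "other"
    raise ValueError(f"unknown modality: {modality!r}")

print(check_format("patient01.PNG", "X-ray"))
print(check_format("scan.nii.gz", "CT"))
print(check_format("scan.dcm", "CT"))
```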
In the context of the COVID-19 pandemic, we want to improve prognostic predictions to triage and manage patient care. Data is the first step to developing any diagnostic/prognostic tool. While there exist large public datasets of more typical chest X-rays from the NIH [Wang 2017], Spain [Bustos 2019], Stanford [Irvin 2019], MIT [Johnson 2019] and Indiana University [Demner-Fushman 2016], there is no collection of COVID-19 chest X-rays or CT scans designed to be used for computational analysis.
The 2019 novel coronavirus (COVID-19) presents several unique features [Fang 2020], [Ai 2020]. While the diagnosis is confirmed using polymerase chain reaction (PCR), infected patients with pneumonia may present on chest X-ray and computed tomography (CT) images with a pattern that is only moderately characteristic for the human eye [Ng 2020]. In late January, a Chinese team published a paper detailing the clinical and paraclinical features of COVID-19. They reported that patients present abnormalities in chest CT images, with most having bilateral involvement [Huang 2020]. Bilateral multiple lobular and subsegmental areas of consolidation constitute the typical findings in chest CT images of intensive care unit (ICU) patients on admission [Huang 2020]. In comparison, non-ICU patients show bilateral ground-glass opacity and subsegmental areas of consolidation in their chest CT images [Huang 2020]. In these patients, later chest CT images display bilateral ground-glass opacity with resolved consolidation [Huang 2020].
Our goal is to use these images to develop AI-based approaches to predict and understand the infection. Our group will work to release these models using our open source Chester AI Radiology Assistant platform.
Using chest X-ray or CT images as input (X-ray preferred), the prediction tasks are as follows:
Healthy vs Pneumonia (prototype already implemented in Chester with ~74% AUC, validation study here)
Bacterial vs Viral vs COVID-19 Pneumonia (not relevant enough for the clinical workflows)
Prognostic/severity predictions (survival, need for intubation, need for supplemental oxygen)
Tool impact: This would give physicians an edge and allow them to act with more confidence while they wait for the analysis of a radiologist by having a digital second opinion confirm their assessment of a patient's condition. Also, these tools can provide quantitative scores to consider and use in studies.
Data impact: Image data linked with clinically relevant attributes in a public dataset that is designed for ML will enable parallel development of these tools and rapid local validation of models. Furthermore, this data can be used for completely different tasks.
PI: Joseph Paul Cohen. Postdoctoral Fellow, Mila, University of Montreal
Second Paper available here and source code for baselines
COVID-19 Image Data Collection: Prospective Predictions Are the Future
Joseph Paul Cohen and Paul Morrison and Lan Dao and Karsten Roth and Tim Q Duong and Marzyeh Ghassemi
arXiv:2006.11988, https://github.com/ieee8023/covid-chestxray-dataset, 2020
@article{cohen2020covidProspective,
title={COVID-19 Image Data Collection: Prospective Predictions Are the Future},
author={Joseph Paul Cohen and Paul Morrison and Lan Dao and Karsten Roth and Tim Q Duong and Marzyeh Ghassemi},
journal={arXiv 2006.11988},
url={https://github.com/ieee8023/covid-chestxray-dataset},
year={2020}
}
Paper available here
COVID-19 image data collection, arXiv:2003.11597, 2020
Joseph Paul Cohen and Paul Morrison and Lan Dao
https://github.com/ieee8023/covid-chestxray-dataset
@article{cohen2020covid,
title={COVID-19 image data collection},
author={Joseph Paul Cohen and Paul Morrison and Lan Dao},
journal={arXiv 2003.11597},
url={https://github.com/ieee8023/covid-chestxray-dataset},
year={2020}
}
Each image has its license specified in the metadata.csv file. Licenses include Apache 2.0, CC BY-NC-SA 4.0, and CC BY 4.0.
The metadata.csv, scripts, and other documents are released under a CC BY-NC-SA 4.0 license. Companies are free to perform research. Beyond that, contact us.
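Since licenses are recorded per image, users who need a particular set of licenses have to filter on the metadata. The sketch below shows one way to do that with the standard library. The column names `filename` and `license`, the sample rows, and the helper `files_with_license` are assumptions for illustration; check SCHEMA.md and the actual metadata.csv for the real columns and values.

```python
import csv
import io

# Hypothetical excerpt standing in for metadata.csv; column names are assumed.
SAMPLE = """filename,license
img_001.jpg,CC BY-NC-SA 4.0
img_002.png,Apache 2.0
img_003.jpg,CC BY 4.0
"""

def files_with_license(csv_text, allowed):
    """Return filenames whose license value is in the `allowed` set."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["filename"] for row in reader if row["license"].strip() in allowed]

selected = files_with_license(SAMPLE, {"Apache 2.0", "CC BY 4.0"})
print(selected)
```

To run this against the real file, the in-memory string can be replaced with the contents of metadata.csv read from disk.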