COVID-CT

COVID-CT-Dataset: A CT Scan Dataset about COVID-19
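License: Readme
Development language: Python
Category: Neural Networks/Artificial Intelligence, Machine Learning/Deep Learning
Software type: Open source software
Region: Unknown
Submitted by: 澹台承
Operating system: Cross-platform
Open source organization:
Intended audience: Unknown
Software Overview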

The utility of this dataset has been confirmed by a senior radiologist in Tongji Hospital, Wuhan, China, who has performed diagnosis and treatment of a large number of COVID-19 patients during the outbreak of this disease between January and April.

After releasing this dataset, we received feedback expressing concerns about its usability. The major concerns are summarized as follows. First, when the original CT images are put into papers, their quality is degraded, which may make diagnostic decisions less accurate. The degradation includes loss of the Hounsfield unit (HU) values, a reduced number of bits per pixel, and reduced image resolution. Second, an original CT scan contains a sequence of CT slices, but only a few key slices are selected for a paper, which may also have a negative impact on diagnosis.
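To make the first concern concrete, here is a minimal sketch of how HU values are typically collapsed into 8-bit gray levels when a CT slice is rendered as a figure. The function name `window_to_8bit` and the window settings (center -600, width 1500) are illustrative assumptions, not values taken from the dataset or from the source papers.

```python
def window_to_8bit(hu_values, center=-600.0, width=1500.0):
    """Map Hounsfield units to 8-bit gray levels using a display window.

    The default center/width are assumed values chosen for illustration only.
    """
    low = center - width / 2.0
    high = center + width / 2.0
    gray = []
    for hu in hu_values:
        # Values outside the window are clipped, so their HU is unrecoverable.
        clipped = min(max(hu, low), high)
        # Rescale the window to 0..255; several HU values share one gray level.
        gray.append(int((clipped - low) / (high - low) * 255 + 0.5))
    return gray
```

With these assumed settings, one gray level covers roughly 6 HU, and everything outside the window saturates to 0 or 255, which is why the original HU values cannot be recovered from a published image.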

We consulted the aforementioned radiologist at Tongji Hospital regarding these two concerns. According to the radiologist, the issues raised do not significantly affect the accuracy of diagnostic decision-making. First, experienced radiologists are able to make an accurate diagnosis from low-quality CT images. For example, given a smartphone photo of the original CT image, experienced radiologists can make an accurate diagnosis just by looking at the photo, even though the CT image in the photo has much lower quality than the original. Likewise, the quality gap between CT images in papers and the original CT images will not substantially hurt the accuracy of diagnosis. Second, while it is preferable to read a sequence of CT slices, a single CT slice often contains enough clinical information for accurate decision-making.

Data Description

The COVID-CT-Dataset has 349 CT images containing clinical findings of COVID-19 from 216 patients. They are in ./Images-processed/CT_COVID.zip

Non-COVID CT scans are in ./Images-processed/CT_NonCOVID.zip
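The two archives above can be unpacked and turned into a labeled file list along the following lines. This is a minimal sketch: the helper name `extract_and_label`, the output directory, the accepted image suffixes, and the label convention (1 = COVID, 0 = non-COVID) are all assumptions rather than part of the dataset.

```python
import zipfile
from pathlib import Path

# Assumed image formats; adjust if the archives contain other types.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def extract_and_label(covid_zip, noncovid_zip, out_dir):
    """Extract both archives and return a list of (image_path, label) pairs.

    Label 1 marks COVID images and 0 marks non-COVID images (our convention).
    """
    out_dir = Path(out_dir)
    samples = []
    for archive, label in ((covid_zip, 1), (noncovid_zip, 0)):
        # Each archive is unpacked into its own subfolder of out_dir.
        target = out_dir / Path(archive).stem
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        # Walk recursively because the internal folder layout is not assumed.
        for path in sorted(target.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


# Example call (paths as given in this README):
# samples = extract_and_label("./Images-processed/CT_COVID.zip",
#                             "./Images-processed/CT_NonCOVID.zip",
#                             "./extracted")
```

The returned list can then be filtered against whichever split files you use.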

We provide a data split in ./Data-split. For details of the split, see "README for DenseNet_predict.md".

The meta information (e.g., patient ID, patient information, DOI, image caption) is in COVID-CT-MetaInfo.xlsx

The images are collected from COVID-19-related papers from medRxiv, bioRxiv, NEJM, JAMA, Lancet, etc. CTs containing COVID-19 abnormalities are selected by reading the figure captions in the papers. All copyrights of the data belong to the authors and publishers of these papers.

The dataset details are described in this preprint: COVID-CT-Dataset: A CT Scan Dataset about COVID-19

If you find this dataset and code useful, please cite:

@article{zhao2020COVID-CT-Dataset,
  title={COVID-CT-Dataset: a CT scan dataset about COVID-19},
  author={Zhao, Jinyu and Zhang, Yichen and He, Xuehai and Xie, Pengtao},
  journal={arXiv preprint arXiv:2003.13865}, 
  year={2020}
}

Baseline Performance

We developed two baseline methods for the community to benchmark against. The code is in the "baseline methods" folder, and the details are in the readme files under that folder. The methods are described in Sample-Efficient Deep Learning for COVID-19 Diagnosis Based on CT Scans

If you find the code useful, please cite:

@Article{he2020sample,
  author  = {He, Xuehai and Yang, Xingyi and Zhang, Shanghang and Zhao, Jinyu and Zhang, Yichen and Xing, Eric and Xie, Pengtao},
  title   = {Sample-Efficient Deep Learning for COVID-19 Diagnosis Based on CT Scans},
  journal = {medrxiv},
  year    = {2020},
}

Contribution Guide

  • To contribute to our project, please email your data to jiz077@eng.ucsd.edu with the corresponding meta information (Patient ID, DOI and Captions).
  • We recommend you also extract images from publications or preprints. Make sure the original papers you crawled have different DOIs from those listed in COVID-CT-MetaInfo.xlsx.
  • In COVID-CT-MetaInfo.xlsx, images with the form of 2020.mm.dd.xxxx are crawled from bioRxiv or medRxiv. The DOIs for these preprints are 10.1101/2020.mm.dd.xxxx.
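The naming rule for preprint images can be applied programmatically when checking a candidate paper against the DOIs already listed. The sketch below assumes that the xxxx part consists of digits and that the identifier may be embedded in a longer file name; the helper name `preprint_doi` and the example names are hypothetical.

```python
import re

# Matches identifiers of the form 2020.mm.dd.xxxx (xxxx assumed to be digits).
PREPRINT_ID = re.compile(r"2020\.\d{2}\.\d{2}\.\d+")


def preprint_doi(image_name):
    """Return the bioRxiv/medRxiv DOI for a 2020.mm.dd.xxxx name, else None."""
    match = PREPRINT_ID.search(image_name)
    if match is None:
        return None  # Not a bioRxiv/medRxiv-style identifier.
    return "10.1101/" + match.group(0)


# Hypothetical example names, not taken from the dataset:
# preprint_doi("2020.03.01.12345678-p5-fig2.png") gives "10.1101/2020.03.01.12345678"
# preprint_doi("some-journal-figure.jpg") gives None
```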