scATAC-benchmarking

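License: MIT License
Language: Shell
Category: Application tools, terminal/remote login
Software type: Open source
Region: Unknown
Submitted by: 贲绪
Operating system: Cross-platform
Open source organization:
Target audience: Unknown
 Software overview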

scATAC-benchmarking

Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, because of its low copy number (diploid in humans), leads to inherent data sparsity (1-10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10-45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level.
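To make the sparsity figures above concrete, the following is a minimal sketch of how a per-cell detection rate could be computed from a cells-by-peaks count matrix. The function name `detection_rate` and the toy matrix are hypothetical illustrations, not part of the benchmarking code.

```python
def detection_rate(cell_counts):
    """Return the fraction of features (peaks or genes) with at least one count."""
    if not cell_counts:
        raise ValueError("cell has no features")
    detected = sum(1 for count in cell_counts if count > 0)
    return detected / len(cell_counts)


# Hypothetical toy matrix: 3 cells (rows) x 10 peaks (columns).
toy_matrix = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# One detection rate per cell; values near 0.01-0.10 would match the
# 1-10% range quoted for scATAC-seq.
rates = [detection_rate(row) for row in toy_matrix]
print(rates)
```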

We present a benchmarking framework that was applied to 10 computational methods for scATAC-seq on 13 synthetic and real datasets from different assays, profiling cell types from diverse tissues and organisms. Methods for processing and featurizing scATAC-seq data were evaluated by their ability to discriminate cell types when combined with common unsupervised clustering approaches. We rank evaluated methods and discuss computational challenges associated with scATAC-seq analysis including inherently sparse data, determination of features, peak calling, the effects of sequencing coverage and noise, and clustering performance. Running times and memory requirements are also discussed.
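As an illustration of scoring how well a clustering discriminates known cell types, the sketch below computes the adjusted Rand index (ARI) between reference labels and cluster assignments in plain Python. ARI is assumed here as one common agreement score; the text above does not name the metrics used, and the function name is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: the labelings agree trivially

    # Count pairs of cells that share a label in both, in truth, and in prediction.
    contingency = Counter(zip(labels_true, labels_pred))
    pairs_both = sum(comb(n, 2) for n in contingency.values())
    pairs_true = sum(comb(n, 2) for n in Counter(labels_true).values())
    pairs_pred = sum(comb(n, 2) for n in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs  # agreement expected by chance
    maximum = (pairs_true + pairs_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (pairs_both - expected) / (maximum - expected)


# Hypothetical example: cluster IDs differ from cell-type names but match perfectly.
cell_types = ["B", "B", "T", "T"]
clusters = [1, 1, 0, 0]
print(adjusted_rand_index(cell_types, clusters))
```

A score of 1 means the clusters reproduce the reference cell types exactly, while values near 0 indicate agreement no better than chance.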

Single Cell ATAC-seq Benchmarking Framework

Our benchmarking results highlight SnapATAC, cisTopic, and Cusanovich2018 as the top-performing scATAC-seq analysis methods for clustering across all datasets and metrics. Methods that preserve information at the peak level (cisTopic, Cusanovich2018, Scasat) or bin level (SnapATAC) generally outperform those that summarize accessible chromatin regions at the motif/k-mer level (chromVAR, BROCKMAN, SCRAT) or over the gene body (Cicero, Gene Scoring). In addition, methods that implement a dimensionality reduction step (BROCKMAN, cisTopic, Cusanovich2018, Scasat, SnapATAC) generally show advantages over methods that lack this step. SnapATAC is the most scalable method; it was the only method capable of processing more than 80,000 cells. Cusanovich2018 is the method that best balances analysis performance and running time.

All the analyses performed are illustrated in Jupyter Notebooks.

Within each dataset folder, the folder 'output' stores all the output files in five sub-folders: 'feature_matrices', 'umap_rds', 'clusters', 'metrics', and 'figures'.
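The layout described above can be sketched with a small helper that creates or checks the five sub-folders. Only the folder names come from the description; the helper names and the dataset folder name are hypothetical, and the example runs in a temporary directory.

```python
import tempfile
from pathlib import Path

# Sub-folder names as listed in the description of the 'output' folder.
OUTPUT_SUBFOLDERS = ("feature_matrices", "umap_rds", "clusters", "metrics", "figures")


def make_output_layout(dataset_dir):
    """Create 'output' and its five sub-folders inside a dataset folder."""
    output_dir = Path(dataset_dir) / "output"
    for name in OUTPUT_SUBFOLDERS:
        (output_dir / name).mkdir(parents=True, exist_ok=True)
    return output_dir


def missing_subfolders(dataset_dir):
    """List the expected sub-folders that are absent from 'output'."""
    output_dir = Path(dataset_dir) / "output"
    return [name for name in OUTPUT_SUBFOLDERS if not (output_dir / name).is_dir()]


# Usage with a throwaway directory standing in for a dataset folder.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "example_dataset"  # hypothetical dataset folder name
    before = missing_subfolders(dataset)
    make_output_layout(dataset)
    after = missing_subfolders(dataset)
    print(before, after)
```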

Real Data

Synthetic Data

Extra


Citation: Please cite our paper if you find this benchmarking work helpful to your research. Huidong Chen, Caleb Lareau, Tommaso Andreani, Michael E. Vinyard, Sara P. Garcia, Kendell Clement, Miguel A. Andrade-Navarro, Jason D. Buenrostro & Luca Pinello. Assessment of computational methods for the analysis of single-cell ATAC-seq data. Genome Biology 20, 241 (2019).

Credits: H Chen, C Lareau, T Andreani, ME Vinyard, SP Garcia, K Clement, MA Andrade-Navarro, JD Buenrostro, L Pinello
