I am using Hail in a Jupyter notebook with the following commands:
import hail as hl
mt = hl.import_vcf("s3a://test-env/tmp/zzq/20200317072357")
mt.show()
These run successfully and display the following data:
Initializing Spark and Hail with default parameters...
Running on Apache Spark version 2.4.5
SparkUI available at http://10.1.39.244:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.33-5d8cae649505
LOGGING: writing to /tmp/hail-20200319-0942-0.2.33-5d8cae649505.log
2020-03-19 09:42:33 Hail: WARN: expected input file 's3a://test-env/tmp/zzq/20200317072357' to end in .vcf[.bgz, .gz]
2020-03-19 09:42:38 Hail: INFO: Coerced sorted dataset
2020-03-19 09:42:39 Hail: INFO: Coerced sorted dataset
locus          alleles      A01.CEL.GT  A02.CEL.GT  A03.CEL.GT  A04.CEL.GT
locus<GRCh37>  array<str>   call        call        call        call
1:564936       ["T","C"]    0/0         0/0         0/0         0/0
1:565433 [