How Trinity Works and Common Errors (Part 1)
商经业
2023-12-01
The Trinity run process
1. Checking the required software and input files
1) Checking that the input files are valid
Left read files: $VAR1 = [
'/lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L1_1.fq',
'/lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L2_1.fq'
];
Right read files: $VAR1 = [
'/lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L1_2.fq',
'/lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L2_2.fq'
];
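If a file does not exist, or the file names on the command line are separated with the wrong delimiter (comma versus space), Trinity reports: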
Error, cannot locate file: /lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L1_1.fq,/lustre/02.work/wanglihui/project/Populus_tremula_170417/clean_data/KD_L2_1.fq at /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110//Trinity.pl line 1762.
main::create_full_path(ARRAY(0xe985d8), 1) called at /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110//Trinity.pl line 933
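This particular error occurred because the older version expects multiple read files to be separated by spaces, but the shell script separated them with commas, so the whole comma-joined string was treated as a single file name.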
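The delimiter problem can be caught before a long job is submitted. Below is a minimal Python sketch, not part of Trinity: the helper names `split_read_list` and `missing_files` are hypothetical, and the file names in the usage section are placeholders. It assumes the read files arrive as one string that may use commas, spaces, or both.

```python
import os
import re


def split_read_list(arg):
    """Split a read-file argument on commas and/or whitespace."""
    return [p for p in re.split(r"[,\s]+", arg.strip()) if p]


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    # Placeholder file names, used only for illustration.
    left = split_read_list("KD_L1_1.fq,KD_L2_1.fq")
    print(" ".join(left))       # the space-separated form
    print(missing_files(left))  # any file that cannot be found
```

Running a check like this first makes it clear whether the failure is a missing file or simply the wrong separator.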
2) Checking that the required tools work
Trinity needs bowtie, samtools, and java.
Older version:
Paired mode requires bowtie. Found bowtie at: /lustre/02.work/liuxiaoshuang/biosoft/bowtie/1.1.1/bowtie
Found samtools at: /lustre/00.tools/Bins/samtools
Java version error:
Error, Trinity requires access to Java version 1.6 or 1.7. Currently installed version is: java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
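However, if the run is split into steps by adding --no_run_butterfly --no_run_quantifygraph, this error does not appear. The likely reason is that Inchworm and Chrysalis are written in C++ while Butterfly is written in Java, so Java availability only needs to be checked when Butterfly is going to run.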
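The version test behind this message can be illustrated with a short Python sketch. This is not Trinity's own code; the function names are hypothetical, and the only facts taken from the log are the version-string format and the accepted versions 1.6 and 1.7.

```python
import re


def java_major_minor(version_output):
    """Extract (major, minor) from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    minor = int(match.group(2)) if match.group(2) else 0
    return int(match.group(1)), minor


def accepted_by_old_trinity(version_output):
    """True for Java 1.6 or 1.7, the versions named in the error message."""
    return java_major_minor(version_output) in {(1, 6), (1, 7)}


if __name__ == "__main__":
    reported = 'java version "1.8.0_91"'
    print(java_major_minor(reported), accepted_by_old_trinity(reported))
```

With the string from the log above, the parsed version is 1.8, which falls outside the accepted set and so triggers the error.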
Newer version:
Monday, May 15, 2017: 22:15:39
CMD: java -Xmx64m -XX:ParallelGCThreads=5 -jar /lustre/02.work/liufei/tools/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/ExitTester.jar 0
Monday, May 15, 2017: 22:15:43
CMD: java -Xmx64m -XX:ParallelGCThreads=5 -jar /lustre/02.work/liufei/tools/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/ExitTester.jar 1
3) Checking the Trinity version
Trinity version: Trinity-v2.4.0
-ERROR: couldn't run the network check to confirm latest Trinity software version.
This error has no effect on the assembly. Passing --no_version_check skips the check, so the message is not shown.
2. If none of the checks above fail, the input is converted from fq to fa
The commands run are:
Tuesday, May 23, 2017: 10:03:01
CMD: mkdir -p /lustre/02.work/liufei/project/Noref/20170517/Analysis/Basic_Analysis/Assembly/Trinity_assembly/All_Combination/All_Combination_Trinity/chrysalis
CMD finished (0 seconds)
Converting input files. (in parallel)Tuesday, May 23, 2017: 10:03:02
CMD: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /lustre/02.work/liufei/project/Noref/20170517/data/S1/S1_S1_L001_R1_001.fastq >> left.fa
Tuesday, May 23, 2017: 10:03:02
CMD: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /lustre/02.work/liufei/project/Noref/20170517/data/S1/S1_S1_L001_R2_001.fastq >> right.fa
CMD finished (506 seconds)
Tuesday, May 23, 2017: 10:11:28
CMD: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /lustre/02.work/liufei/project/Noref/20170517/data/S1/S1_S1_L002_R1_001.fastq >> left.fa
CMD finished (900 seconds)
Tuesday, May 23, 2017: 10:18:02
CMD: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /lustre/02.work/liufei/project/Noref/20170517/data/S1/S1_S1_L002_R2_001.fastq >> right.fa
CMD finished (484 seconds)
-conversion of 27714705 from FQ to FA format succeeded.
-conversion of 27712988 from FQ to FA format succeeded.
Thursday, December 15, 2016: 11:59:35
CMD: touch left.fa.ok right.fa.ok
Thursday, December 15, 2016: 11:59:35
CMD: cat left.fa right.fa > both.fa
Thursday, December 15, 2016: 12:21:15
CMD: touch both.fa.ok
This produces left.fa, right.fa, and both.fa.
Note: this step only converts fq to fa, so fastool can also be used on its own as an fq2fa tool:
/lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /lustre/02.work/liufei/test/fq_len_filter/test.fq >> out.fa
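For readers who want to see what the conversion amounts to, here is a minimal Python sketch of a plain FASTQ to FASTA conversion. It is an illustration only: it assumes strict 4-line FASTQ records and does not reproduce whatever header handling fastool applies with --illumina-trinity. The function name `fastq_to_fasta` is hypothetical.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Convert 4-line FASTQ records to FASTA; return the number of records."""
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # skip stray blank lines between records
        sequence = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality line, ignored
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


if __name__ == "__main__":
    src = io.StringIO("@read1\nACGT\n+\nIIII\n")
    dst = io.StringIO()
    fastq_to_fasta(src, dst)
    print(dst.getvalue(), end="")
```

The returned record count plays the same role as the "conversion of N from FQ to FA format succeeded" lines in the log.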
3. Running Jellyfish
building a k-mer catalog from reads
Commands run:
* Running CMD: /data/tools/trinityrnaseq-2.2.0/trinity-plugins/jellyfish/bin/jellyfish count -t 20 -m 25 -s 3726277687 both.fa
* Running CMD: /data/tools/trinityrnaseq-2.2.0/trinity-plugins/jellyfish/bin/jellyfish dump -L 2 mer_counts.jf > jellyfish.kmers.fa
# The -L value is taken from Trinity's overall min_kmer_cov setting
* Running CMD: /data/tools/trinityrnaseq-2.2.0/trinity-plugins/jellyfish/bin/jellyfish histo -t 20 -o jellyfish.kmers.fa.histo mer_counts.jf
# Builds a histogram of the k-mer frequencies
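The three Jellyfish calls (count, dump -L, histo) can be pictured with the following Python sketch. It is a conceptual illustration, not how Jellyfish is implemented, and it ignores strand handling such as --both-strands. All function names are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every k-mer in the given sequences (like `jellyfish count -m k`)."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def filter_min_count(counts, min_count):
    """Keep k-mers seen at least min_count times (like `jellyfish dump -L`)."""
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


def count_histogram(counts):
    """Map each frequency to the number of distinct k-mers having it."""
    return dict(sorted(Counter(counts.values()).items()))


if __name__ == "__main__":
    counts = count_kmers(["ACGTACGT"], k=4)
    print(filter_min_count(counts, 2))
    print(count_histogram(counts))
```

In the real run k is 25 and the input is both.fa, which is why memory use grows quickly with the number of reads.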
If Jellyfish does not have enough memory it fails with an error; increasing the memory parameter resolves it.
Tuesday, May 23, 2017: 12:25:21
CMD: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/jellyfish/bin/jellyfish count -t 5 -m 25 -s 5450651468 --both-strands both.fa
Error, cmd: /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110/trinity-plugins/jellyfish/bin/jellyfish count -t 5 -m 25 -s 5450651468 --both-strands both.fa died with ret 135 at /lustre/02.work/liuxiaoshuang/biosoft/trinity/r20131110//Trinity.pl line 1793.
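The number in "died with ret 135" can be unpacked to see how the process ended. The sketch below assumes that this value is the raw wait status returned by Perl's system() call, which is an assumption about Trinity.pl rather than something shown in the log. Under that assumption, 135 decodes to exit code 0, signal 7, with the core-dump bit set; on Linux x86 signal 7 is SIGBUS. The function name is hypothetical.

```python
def decode_wait_status(status):
    """Split a 16-bit wait status into (exit_code, signal_number, core_dumped)."""
    exit_code = (status >> 8) & 0xFF    # high byte: normal exit code
    signal_number = status & 0x7F       # low 7 bits: terminating signal
    core_dumped = bool(status & 0x80)   # bit 7: core-dump flag
    return exit_code, signal_number, core_dumped


if __name__ == "__main__":
    print(decode_wait_status(135))
```

A signal-terminated process points to a resource problem in the run rather than to a malformed command line.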
The parameter that matters most in this step is -s; if its value is too small, the step fails with this error.
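In the main program this value is defined as: my $jelly_hash_size = int( ($max_memory - $read_file_size)/7); # decided upon by Rick Westerman
Here $max_memory is the value set with the JM parameter; $read_file_size is presumably the size of the input reads file.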
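The formula is easy to evaluate by hand when choosing a memory setting. Here is a small Python sketch that mirrors the Perl expression. The example inputs (50 GiB of memory, a 10 GiB reads file) are hypothetical, and treating both values as byte counts is an assumption.

```python
def jelly_hash_size(max_memory_bytes, read_file_size_bytes):
    """Mirror Perl's int(($max_memory - $read_file_size) / 7).

    Perl's int() truncates toward zero, so negative differences are
    handled separately from Python's floor division.
    """
    diff = max_memory_bytes - read_file_size_bytes
    return diff // 7 if diff >= 0 else -((-diff) // 7)


if __name__ == "__main__":
    gib = 1024 ** 3
    # Hypothetical inputs: 50 GiB memory limit, 10 GiB reads file.
    print(jelly_hash_size(50 * gib, 10 * gib))
```

The sketch makes the dependency visible: for a fixed reads file, a larger memory setting gives a larger -s value, and a memory setting close to the file size gives a very small one.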
This produces jellyfish.kmers.fa, a catalog of k-mers of length 25.