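A workflow is mainly used to connect multiple processes.
The unnamed workflow is the main workflow and serves as the script's entry point (similar to the main function in C).
Calling a process is simple and works just like calling a function, e.g.: <process_name>(<input_ch1>, <input_ch2>, ...)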
//workflow_01.nf
nextflow.enable.dsl=2

// build a salmon index from the transcriptome file
process INDEX {
    input:
    path transcriptome
    output:
    path 'index'
    script:
    """
    salmon index -t $transcriptome -i index
    """
}

// quantify each read pair against the index
process QUANT {
    input:
    each path(index)
    tuple(val(pair_id), path(reads))
    output:
    path pair_id
    script:
    """
    salmon quant --threads $task.cpus --libType=U -i $index -1 ${reads[0]} -2 ${reads[1]} -o $pair_id
    """
}

workflow {
    transcriptome_ch = channel.fromPath('data/yeast/transcriptome/*.fa.gz', checkIfExists: true)
    read_pairs_ch = channel.fromFilePairs('data/yeast/reads/*_{1,2}.fq.gz', checkIfExists: true)
    // the INDEX process takes 1 input channel as an argument
    index_ch = INDEX(transcriptome_ch)
    // the QUANT process takes 2 input channels as arguments
    QUANT(index_ch, read_pairs_ch).view()
    // equivalent nested call:
    // QUANT(INDEX(transcriptome_ch), read_pairs_ch).view()
}
You can also use the out attribute to access a process's output, for example:
[..truncated..]
workflow {
    transcriptome_ch = channel.fromPath('data/yeast/transcriptome/*.fa.gz')
    read_pairs_ch = channel.fromFilePairs('data/yeast/reads/*_{1,2}.fq.gz')
    // call the INDEX process
    INDEX(transcriptome_ch)
    // INDEX process output accessed using the `out` attribute
    QUANT(INDEX.out, read_pairs_ch)
    QUANT.out.view()
}
When a process has two or more outputs, you can access them by index with out[0], out[1], and so on, or by output name with out.xxxName.
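As a minimal sketch of index-based access, the script below uses a hypothetical process, SPLIT_GREETING, that declares two outputs. The process name, the file names upper.txt and lower.txt, and the input value are assumptions made up for illustration and do not come from the salmon pipeline in this tutorial. The outputs are numbered in the order they are declared in the output block.

```nextflow
// out_index_sketch.nf (hypothetical file name)
nextflow.enable.dsl=2

// Hypothetical process with two unnamed outputs
process SPLIT_GREETING {
    input:
    val greeting

    output:
    path 'upper.txt'   // first declared output  -> out[0]
    path 'lower.txt'   // second declared output -> out[1]

    script:
    """
    echo '$greeting' | tr '[:lower:]' '[:upper:]' > upper.txt
    echo '$greeting' | tr '[:upper:]' '[:lower:]' > lower.txt
    """
}

workflow {
    SPLIT_GREETING(channel.of('Hello'))
    // each output is a separate channel, selected by its position
    SPLIT_GREETING.out[0].view { f -> "upper: $f" }
    SPLIT_GREETING.out[1].view { f -> "lower: $f" }
}
```

Index access depends on the declaration order, so reordering the output block silently changes what out[0] refers to; this is one reason to prefer named outputs in longer pipelines.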
We can name a process's outputs with the emit keyword. For example:
//workflow_02.nf
nextflow.enable.dsl=2

process INDEX {
    input:
    path transcriptome
    output:
    // name this output so it can be accessed as INDEX.out.salmon_index
    path 'index', emit: salmon_index
    script:
    """
    salmon index -t $transcriptome -i index
    """
}

process QUANT {
    input:
    each path(index)
    tuple(val(pair_id), path(reads))
    output:
    path pair_id
    script:
    """
    salmon quant --threads $task.cpus --libType=U -i $index -1 ${reads[0]} -2 ${reads[1]} -o $pair_id
    """
}

workflow {
    transcriptome_ch = channel.fromPath('data/yeast/transcriptome/*.fa.gz')
    read_pairs_ch = channel.fromFilePairs('data/yeast/reads/*_{1,2}.fq.gz')
    // call the INDEX process
    INDEX(transcriptome_ch)
    // access the named output of INDEX
    QUANT(INDEX.out.salmon_index, read_pairs_ch).view()
}
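Naming becomes more useful once a process has several outputs. The following is a rough sketch under assumed names: the COUNT process, its lines and words output names, and the input text are all hypothetical and only serve to show two emit names on one process.

```nextflow
// emit_sketch.nf (hypothetical file name)
nextflow.enable.dsl=2

// Hypothetical process that writes a line count and a word count
process COUNT {
    input:
    val text

    output:
    path 'lines.txt', emit: lines
    path 'words.txt', emit: words

    script:
    """
    echo '$text' | wc -l > lines.txt
    echo '$text' | wc -w > words.txt
    """
}

workflow {
    COUNT(channel.of('hello nextflow world'))
    // select each output channel by the name given with emit
    COUNT.out.lines.view { f -> "lines file: $f" }
    COUNT.out.words.view { f -> "words file: $f" }
}
```

Because the channels are selected by name, the workflow block keeps working even if the order of the output declarations changes.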
A workflow can access any variable or parameter defined in the outer scope:
//workflow_03.nf
[..truncated..]
params.transcriptome = 'data/yeast/transcriptome/*.fa.gz'
params.reads = 'data/yeast/reads/ref1*_{1,2}.fq.gz'

workflow {
    transcriptome_ch = channel.fromPath(params.transcriptome)
    read_pairs_ch = channel.fromFilePairs(params.reads)
    INDEX(transcriptome_ch)
    QUANT(INDEX.out.salmon_index, read_pairs_ch).view()
}