snakemake study notes 3

Goal

This time, I want to implement the workflow shown in this diagram.

[Image: target workflow diagram]

Goal description

  • First: generate 1.txt, 2.txt, 3.txt
  • Second: add the string "add a" to each file, naming the results 1_add_a.txt, 2_add_a.txt, 3_add_a.txt
  • Third: add "add b" to those files, naming the results 1_add_a_add_b.txt, 2_add_a_add_b.txt, 3_add_a_add_b.txt
  • Fourth: add "add c" to those files, naming the results 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt
  • Fifth: merge 1_add_a_add_b.txt, 2_add_a_add_b.txt, 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt into the file hebing.txt

1. Generate three files

(snake_test) [dengfei@localhost ex4]$ ls *txt
1.txt  2.txt  3.txt
(snake_test) [dengfei@localhost ex4]$ cat *txt
this is 1.txt
this is 2.txt
this is 3.txt
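
The post does not show how the three input files were created. A minimal sketch of one way to do it follows; the helper name make_inputs and the use of a temporary directory are assumptions for illustration, not part of the original workflow.

```python
from pathlib import Path
import tempfile


def make_inputs(workdir, names=("1", "2", "3")):
    """Write <name>.txt files, each containing 'this is <name>.txt'."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        path = workdir / f"{name}.txt"
        path.write_text(f"this is {name}.txt\n")
        created.append(path)
    return created


if __name__ == "__main__":
    # Use a throwaway directory so the sketch does not touch real data.
    with tempfile.TemporaryDirectory() as tmp:
        for p in make_inputs(tmp):
            print(p.name, "->", p.read_text().strip())
```

To reproduce the listing above, point make_inputs at the directory that holds the Snakefile instead of a temporary directory.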

2. Add "add a" to each file

The corresponding Snakefile is as follows:

rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"

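Preview the commands with a dry run: snakemake -np {1,2,3}_add_a.txt

Note: the target files {1,2,3}_add_a.txt must be named on the command line for the command to run. The output of rule adda contains the wildcard {file}, so snakemake cannot tell which files to build unless targets are given. The shell expands {1,2,3}_add_a.txt into 1_add_a.txt 2_add_a.txt 3_add_a.txt before snakemake sees it.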
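
To make the note concrete, here is a rough Python sketch of the two steps involved: the shell turning the brace expression into three target names, and a target name being matched against an output pattern to recover the wildcard value. The functions brace_expand and match_output are hypothetical helpers written for this note; they imitate the behaviour and are not Snakemake's API.

```python
import re


def brace_expand(items, suffix):
    """Imitate bash expanding {a,b,c}suffix into separate words."""
    return [f"{item}{suffix}" for item in items]


def match_output(pattern, target):
    """Return the wildcard values if target fits pattern, else None.

    Each {name} in the pattern is treated as '.+', which is the
    default regex Snakemake uses for an unconstrained wildcard.
    """
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        regex += re.escape(pattern[pos:m.start()]) + f"(?P<{m.group(1)}>.+)"
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


if __name__ == "__main__":
    for target in brace_expand(["1", "2", "3"], "_add_a.txt"):
        print(target, match_output("{file}_add_a.txt", target))
```

A name such as 3.txt does not fit the pattern {file}_add_a.txt, so match_output returns None for it; this mirrors why rule adda is not asked to produce the raw input files.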

(snake_test) [dengfei@localhost ex4]$ snakemake -np {1,2,3}_add_a.txt
Building DAG of jobs...
Job counts:
	count	jobs
	3	adda
	3

[Tue Apr  2 21:09:19 2019]
rule adda:
    input: 3.txt
    output: 3_add_a.txt
    jobid: 2
    wildcards: file=3

cat 3.txt |xargs echo add a >3_add_a.txt

[Tue Apr  2 21:09:19 2019]
rule adda:
    input: 2.txt
    output: 2_add_a.txt
    jobid: 0
    wildcards: file=2

cat 2.txt |xargs echo add a >2_add_a.txt

[Tue Apr  2 21:09:19 2019]
rule adda:
    input: 1.txt
    output: 1_add_a.txt
    jobid: 1
    wildcards: file=1

cat 1.txt |xargs echo add a >1_add_a.txt
Job counts:
	count	jobs
	3	adda
	3
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

Run the command:

snakemake  {1,2,3}_add_a.txt

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	adda
	3

[Tue Apr  2 21:11:09 2019]
rule adda:
    input: 3.txt
    output: 3_add_a.txt
    jobid: 0
    wildcards: file=3

[Tue Apr  2 21:11:09 2019]
Finished job 0.
1 of 3 steps (33%) done

[Tue Apr  2 21:11:09 2019]
rule adda:
    input: 1.txt
    output: 1_add_a.txt
    jobid: 1
    wildcards: file=1

[Tue Apr  2 21:11:09 2019]
Finished job 1.
2 of 3 steps (67%) done

[Tue Apr  2 21:11:09 2019]
rule adda:
    input: 2.txt
    output: 2_add_a.txt
    jobid: 2
    wildcards: file=2

[Tue Apr  2 21:11:09 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211109.153566.snakemake.log

Check the *add_a.txt files:

(snake_test) [dengfei@localhost ex4]$ ls *add_a.txt
1_add_a.txt  2_add_a.txt  3_add_a.txt
(snake_test) [dengfei@localhost ex4]$ cat *add_a.txt
add a this is 1.txt
add a this is 2.txt
add a this is 3.txt

Done.
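
The result shows that "cat {input} |xargs echo add a" puts the label in front of the file's text rather than after it. A small Python sketch of that behaviour is below. It is an approximation: add_prefix is a hypothetical helper, and it ignores xargs details such as quote handling and command-line length limits.

```python
def add_prefix(text, prefix):
    """Approximate `cat file | xargs echo <prefix>` for short, plain input.

    xargs splits its input on whitespace and appends the words after
    the fixed arguments, so echo prints the prefix first, then the
    words, all separated by single spaces.
    """
    words = prefix.split() + text.split()
    return " ".join(words) + "\n"


if __name__ == "__main__":
    print(add_prefix("this is 1.txt\n", "add a"), end="")
```

Because the words are re-joined with single spaces, any line breaks or repeated spaces in the input would be flattened, which is worth keeping in mind before reusing this shell idiom on multi-line files.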

3. Add "add b" to each file

The corresponding Snakefile is as follows:

rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"


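Preview the commands: snakemake -np {1,2,3}_add_a_add_b.txt

The log below is from the actual run (without -n). Only the three addb jobs are scheduled, because the *_add_a.txt files already exist from step 2.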

(snake_test) [dengfei@localhost ex4]$ snakemake  {1,2,3}_add_a_add_b.txt
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	addb
	3

[Tue Apr  2 21:13:57 2019]
rule addb:
    input: 2_add_a.txt
    output: 2_add_a_add_b.txt
    jobid: 0
    wildcards: file=2

[Tue Apr  2 21:13:57 2019]
Finished job 0.
1 of 3 steps (33%) done

[Tue Apr  2 21:13:57 2019]
rule addb:
    input: 1_add_a.txt
    output: 1_add_a_add_b.txt
    jobid: 1
    wildcards: file=1

[Tue Apr  2 21:13:57 2019]
Finished job 1.
2 of 3 steps (67%) done

[Tue Apr  2 21:13:57 2019]
rule addb:
    input: 3_add_a.txt
    output: 3_add_a_add_b.txt
    jobid: 2
    wildcards: file=3

[Tue Apr  2 21:13:57 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211357.666661.snakemake.log
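
The log lists only addb jobs. The following Python sketch shows the idea behind that: starting from the requested target, walk backwards through the rules and schedule a job only for files that do not exist yet. It is a simplified model under stated assumptions: the names RULES, to_regex and plan are hypothetical, and the sketch ignores modification times, which real Snakemake also takes into account.

```python
import re

# (rule name, input pattern, output pattern), copied from the Snakefile above.
RULES = [
    ("adda", "{file}.txt", "{file}_add_a.txt"),
    ("addb", "{file}_add_a.txt", "{file}_add_a_add_b.txt"),
]


def to_regex(pattern):
    """Turn an output pattern with one {file} wildcard into a regex."""
    prefix, suffix = pattern.split("{file}")
    return re.escape(prefix) + "(?P<file>.+)" + re.escape(suffix)


def plan(target, existing, jobs=None):
    """Collect (rule, output) jobs needed to build target, upstream first."""
    jobs = [] if jobs is None else jobs
    if target in existing:
        return jobs  # nothing to do for files that are already there
    for name, inp, out in RULES:
        m = re.fullmatch(to_regex(out), target)
        if m:
            plan(inp.format(file=m.group("file")), existing, jobs)
            jobs.append((name, target))
            return jobs
    raise FileNotFoundError(f"no rule produces {target}")


if __name__ == "__main__":
    raw = {"1.txt", "2.txt", "3.txt"}
    after_step2 = raw | {"1_add_a.txt", "2_add_a.txt", "3_add_a.txt"}
    print(plan("1_add_a_add_b.txt", after_step2))  # only addb is needed
    print(plan("1_add_a_add_b.txt", raw))          # adda first, then addb
```

Starting from a directory that holds only the raw inputs, the same target would pull in adda as well, so a single snakemake call on the final targets is enough to run the whole chain.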


Run the command:

snakemake  {1,2,3}_add_a_add_b.txt

View the workflow diagram

Command:

snakemake --dag {1,2,3}_add_a_add_b.txt |dot -Tpdf >a.pdf

The generated a.pdf looks like this:

4. Add "add c" to each file

Snakefile contents:

rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"

rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"

Workflow diagram:

Command:

snakemake --dag {1,2,3}_add_a_add_b_add_c.txt |dot -Tpdf >a1.pdf

[Image: workflow diagram rendered to a1.pdf]
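
Since the rendered image is not reproduced here, the sketch below builds a comparable graph description in Graphviz DOT format, which is the kind of text that gets piped into dot. It is an illustration only: pipeline_dot is a hypothetical helper, and the node labels are an assumption, not the exact text that snakemake --dag prints.

```python
def pipeline_dot(samples=("1", "2", "3"), steps=("adda", "addb", "addc")):
    """Return DOT text with one adda -> addb -> addc chain per sample."""
    lines = ["digraph pipeline {"]
    for s in samples:
        # Pair each step with the step that consumes its output.
        for upstream, downstream in zip(steps, steps[1:]):
            lines.append(
                f'    "{upstream}\\nfile: {s}" -> "{downstream}\\nfile: {s}";'
            )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(pipeline_dot())
```

Saving the printed text to a file and running dot -Tpdf on it should give three independent chains, one per input file, which is the shape expected for this step.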

5. Merge the files

rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"

rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"

rule hebing:
    input:
       a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
       b=expand("{file}_add_a_add_b.txt",file=["1","2"])
    output:"hebing.txt"
    shell:"cat {input.a} {input.b} >{output}"
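
Rule hebing relies on expand to turn one pattern into a list of concrete file names. The sketch below re-creates that behaviour in plain Python so the resulting cat command can be inspected without running Snakemake. The function expand_sketch is a hypothetical stand-in written for this note, not Snakemake's own expand.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill the pattern with every combination of the given wildcard values."""
    keys = list(wildcards)
    combos = product(*(wildcards[k] for k in keys))
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]


if __name__ == "__main__":
    a = expand_sketch("{file}_add_a_add_b_add_c.txt", file=["1", "2", "3"])
    b = expand_sketch("{file}_add_a_add_b.txt", file=["1", "2"])
    # {input.a} and {input.b} are substituted as space-separated lists.
    print("cat " + " ".join(a + b) + " >hebing.txt")
```

The files from a come first and the files from b second, which fixes the order of the lines in hebing.txt.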

Run the command:

snakemake hebing.txt

Result:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	addc
	1	hebing
	4

[Tue Apr  2 21:21:04 2019]
rule addc:
    input: 1_add_a_add_b.txt
    output: 1_add_a_add_b_add_c.txt
    jobid: 1
    wildcards: file=1

[Tue Apr  2 21:21:04 2019]
Finished job 1.
1 of 4 steps (25%) done

[Tue Apr  2 21:21:04 2019]
rule addc:
    input: 3_add_a_add_b.txt
    output: 3_add_a_add_b_add_c.txt
    jobid: 3
    wildcards: file=3

[Tue Apr  2 21:21:04 2019]
Finished job 3.
2 of 4 steps (50%) done

[Tue Apr  2 21:21:04 2019]
rule addc:
    input: 2_add_a_add_b.txt
    output: 2_add_a_add_b_add_c.txt
    jobid: 2
    wildcards: file=2

[Tue Apr  2 21:21:04 2019]
Finished job 2.
3 of 4 steps (75%) done

[Tue Apr  2 21:21:04 2019]
rule hebing:
    input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt
    output: hebing.txt
    jobid: 0

[Tue Apr  2 21:21:04 2019]
Finished job 0.
4 of 4 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T212104.719887.snakemake.log
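
The post does not show the contents of hebing.txt. As a rough check, the sketch below works out in memory what the file would be expected to contain, given the input text from step 1 and the three shell commands. The helpers step and expected_hebing are hypothetical, and the xargs behaviour is approximated by splitting and re-joining words.

```python
def step(line, label):
    """Approximate `cat file | xargs echo <label>` on a single line."""
    return " ".join(label.split() + line.split())


def expected_hebing(samples=("1", "2", "3"), b_only=("1", "2")):
    """Lines hebing.txt would hold: all add_c files, then selected add_b files."""
    base = {s: f"this is {s}.txt" for s in samples}
    a = {s: step(base[s], "add a") for s in samples}
    b = {s: step(a[s], "add b") for s in samples}
    c = {s: step(b[s], "add c") for s in samples}
    return [c[s] for s in samples] + [b[s] for s in b_only]


if __name__ == "__main__":
    print("\n".join(expected_hebing()))
```

Each later label ends up in front of the earlier ones, so the add_c lines start with "add c add b add a". Comparing this list with cat hebing.txt is a quick way to confirm the run did what the rules describe.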

Workflow diagram:
[Image: workflow diagram for the full pipeline]

Done.

Feel free to follow my official account: R-breeding
