Building a simple web-based BLAST with Python's streamlit module

streamlit references

https://docs.streamlit.io/library/get-started/create-an-app

st.button https://docs.streamlit.io/library/api-reference/widgets/st.button

st.text_area https://docs.streamlit.io/library/api-reference/widgets/st.text_area

python io https://docs.python.org/3/library/io.html

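io.StringIO: its main role here is this. When blastn is called through python subprocess, the blastn output is not saved to a file but printed to the screen (standard output). That captured text has to be wrapped with io.StringIO before a parser that expects a file handle, such as NCBIXML, can read it.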
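
As a minimal sketch of that conversion: io.StringIO wraps an in-memory string so that it behaves like an open text file. The sample text below is made up for illustration and only stands in for captured blastn output.

```python
import io

# Made-up text standing in for what blastn would print to the screen.
captured = "seq1\tref1\t100.000\nseq1\tref2\t98.500\n"

# Wrap the string so it can be read like an open text file.
handle = io.StringIO(captured)

first_line = handle.readline()                      # read one line, like a file
remaining = [line.rstrip("\n") for line in handle]  # iterate over the rest

print(first_line.rstrip("\n").split("\t"))
print(remaining)
```

Any function that accepts a file handle can be given `handle` in the same way.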

https://janakiev.com/blog/python-shell-commands/

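This link mainly explains how to call a command such as blastn through python subprocess when the output is printed to the screen rather than saved to a file, and how to store that printed output in a Python object.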
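
A minimal sketch of capturing command output into a Python string. Since blastn may not be installed everywhere, a child Python process that prints one tab-separated line is used as a stand-in here; that stand-in command is an assumption for illustration, not part of the app.

```python
import subprocess
import sys

# Stand-in for blastn: a child Python process that prints one tab-separated line.
# (Assumption for illustration; in the app the command would be the blastn call.)
cmd = [sys.executable, "-c", "print('seq1\\tref1\\t100.0')"]

# capture_output=True collects stdout/stderr instead of printing them;
# text=True decodes the bytes into str.
result = subprocess.run(cmd, capture_output=True, text=True)

stdout = result.stdout  # everything the command printed
print(result.returncode)
print(repr(stdout))
```

subprocess.run is used here instead of Popen plus communicate() only because it is shorter; both give the captured output as a string.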

https://stackabuse.com/the-python-tempfile-module/

This link mainly explains how to create temporary files (used here to store the fasta file uploaded by the user).

https://stackoverflow.com/questions/23212435/permission-denied-to-write-to-my-temporary-file

When writing to the temporary file I kept getting a permission denied error and I am not sure why; this link explains it a little.
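
The temporary-file pattern from the two links above can be sketched as follows. One commonly cited cause of the permission error on Windows is reopening a temporary file by name while the original handle is still open; whether that was the cause in this case is an assumption. The sketch therefore creates the file with delete=False, closes it, and only then reopens it by name.

```python
import os
import tempfile

fasta_text = ">seq1\nATCGA\n"  # example input, as a user might paste it

# delete=False keeps the file on disk after close(), so it can be reopened by name.
tmp = tempfile.NamedTemporaryFile(suffix=".fasta", delete=False)
try:
    tmp.write(fasta_text.encode("utf-8"))  # the file is opened in binary mode by default
    tmp.close()                            # flush and release the handle before reopening

    with open(tmp.name, "r") as fr:
        ids = [line[1:].strip() for line in fr if line.startswith(">")]
finally:
    tmp.close()           # closing an already closed file is harmless
    os.unlink(tmp.name)   # delete=False means cleanup is manual

print(ids)
```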

st.dataframe https://docs.streamlit.io/1.3.0/library/api-reference/data/st.dataframe

https://www.metagenomics.wiki/tools/blast/blastn-output-format-6

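The column headers of blastn output format 6 (the output itself has no header line, so the column names have to be supplied when reading it).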
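
As a small sketch of attaching those column names to format 6 output, using only the standard library. The hit line below is made up for illustration and is not real blastn output.

```python
import csv
import io

# The 12 default columns of blastn -outfmt 6, in order.
names = ("qseqid sseqid pident length mismatch gapopen "
         "qstart qend sstart send evalue bitscore").split()

# One made-up hit line, for illustration only.
stdout = "seq1\tref1\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-90\t350\n"

# Format 6 is tab-separated without a header, so the field names are passed in.
reader = csv.DictReader(io.StringIO(stdout), fieldnames=names, delimiter="\t")
hits = list(reader)

print(hits[0]["sseqid"], float(hits[0]["pident"]), float(hits[0]["evalue"]))
```

csv.DictReader returns every value as a string, so numeric columns need an explicit conversion such as float().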

st.file_uploader https://docs.streamlit.io/library/api-reference/widgets/st.file_uploader

Complete code

(Still far from polished; it only just runs.)

import streamlit as st
import tempfile
import os
import io
from Bio import SeqIO
import subprocess
from Bio.Blast import NCBIXML
import pandas as pd

st.title("Learn how to build web blast app using streamlit")

# abc = st.text_area(label="paste your fasta here",
#              value=">seq1\nATCGA",
#              height=200)

# #print(abc)
# if st.button('save'):
#     # for line in abc:
#     #     st.write(line)
#     with open('abc.txt','w') as fw:
#         fw.write(abc)
        
#     st.write("OK")
# result = st.button("Click Here")
# 
# # st.write(result)
# print(os.getcwd())
# if result:
#     with tempfile.TemporaryFile() as fp:
#         tmp = tempfile.NamedTemporaryFile(suffix=".fasta",delete=False)
#         st.write(tmp.name)
#         tmp.write(bytes(abc,'utf-8'))
#         tmp.seek(0)
#         with open(tmp.name,'r') as fr:
#             for line in fr:
#                 st.write(line)
#         #os.write(new_file,b'abcde')
#         #st.write("OK")
#         #os.close(new_file)
#         # with open(tmp.name,'w') as fw:
#         #     fw.write(abc)
#     st.write(":smile:")
    

# you need to change these paths to your own
blastn = "D:/Biotools/blast/ncbiblast/bin/blastn"
db = 'D:/Bioinformatics_Intro/streamlit/uploadfiles/blastdb/cpvirus'
tempfile.tempdir = "D:/Bioinformatics_Intro/streamlit/uploadfiles/temp"

fasta = st.text_area(label="you can paste your fasta here",
             value=">seq1\nATCGA",
             height=400)

runblastn = st.button("run blastn")

names = "qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore".split()

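if runblastn:
    # delete=False keeps the file on disk after close() so blastn can open it by name
    tmp = tempfile.NamedTemporaryFile(suffix=".fasta",delete=False)
    st.write(tmp.name)
    try:
        tmp.write(bytes(fasta,'utf-8'))
        # close to flush the data to disk before other readers open the file by name
        tmp.close()
        for rec in SeqIO.parse(tmp.name,'fasta'):
            st.write(rec.id)

        cmd = [blastn,'-db',db,'-query',tmp.name,'-evalue','0.0001','-outfmt','6']
        process = subprocess.Popen(cmd,stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE,
                                   universal_newlines=True)
        stdout,stderr = process.communicate()
        # for record in NCBIXML.parse(io.StringIO(stdout)):
        #     st.write(record.query)

        if process.returncode != 0:
            # show blastn's error message instead of failing silently
            st.error(stderr)
        else:
            df = pd.read_csv(io.StringIO(stdout),sep="\t",header=None,names=names)
            st.dataframe(df)
    finally:
        # always remove the temporary file, even if parsing or blastn fails
        tmp.close()
        os.unlink(tmp.name)

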
    
    
uploaded_file = st.file_uploader("or upload your fasta file here")

if uploaded_file is not None:
    bytes_data = uploaded_file.getvalue()
    #print(type(bytes_data))
    #st.write(bytes_data)
    tmp = tempfile.NamedTemporaryFile(suffix=".fasta",delete=False)
    st.write(tmp.name)
    try:
        tmp.write(bytes_data)
        tmp.seek(0)
        with open(tmp.name,'r') as fr:
            for line in fr:
                if line.startswith(">"):
                    st.write("input seq id is: %s"%(line.strip().replace(">","")))
                
        cmd = [blastn,'-db',db,'-query',tmp.name,'-evalue','0.0001','-outfmt','6']
        process = subprocess.Popen(cmd,stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
        stdout,stderr = process.communicate()
        # for record in NCBIXML.parse(io.StringIO(stdout)):
        #     st.write(record.query)
        
        df = pd.read_csv(io.StringIO(stdout),sep="\t",header=None,names=names)
        st.dataframe(df)
    finally:
        tmp.close()
        os.unlink(tmp.name)

Running the code

streamlit run main.py

How to deploy it?

I am still looking into that.

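You are welcome to follow my WeChat public account

小明的數據分析筆記本

The public account 小明的數據分析筆記本 mainly shares: 1. simple examples of data analysis and data visualization with R and python; 2. reading notes on transcriptomics, genomics, and population genetics literature related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.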
