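The transformers library is a framework published by Hugging Face. It is very easy to use: many problems that look intimidating to outsiders can be solved with a few lines of code. Let's start with a small example:
1. Sentiment analysis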
    from transformers import pipeline

    classifier = pipeline('sentiment-analysis')
    classifier('you are beautiful')
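These three lines of code are enough to tell whether the sentence "you are beautiful" carries a positive sentiment (a compliment) or a negative one (an insult). If everything goes well, you will see output similar to this:
    [{'label': 'POSITIVE', 'score': 0.9998794794082642}]
This says the sentence is positive. The score can be read as a confidence value: 0.9998, i.e. 99.98%. Also note that the first time you use the sentiment-analysis classifier, the model it depends on is downloaded from Hugging Face.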
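To make the result easier to read, the sketch below formats one prediction as a label plus a percentage. The helper name `describe` is made up for this example, and the input is the sample output quoted above rather than a live model call.

```python
# Sample output copied from the text above; a real call would be
# classifier('you are beautiful').
result = [{'label': 'POSITIVE', 'score': 0.9998794794082642}]


def describe(prediction):
    """Format one pipeline prediction as 'LABEL (xx.xx%)'."""
    return f"{prediction['label']} ({prediction['score']:.2%})"


for prediction in result:
    print(describe(prediction))
```

Rounding to two decimals prints 99.99%, whereas the 99.98% quoted above simply truncates the score.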
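The first step is always the hardest. If even this first example fails with an error like the one below:
the most likely cause is that your transformers version is too old. You can run
    import transformers
    print(transformers.__version__)
to check the installed version. If it is 2.1.1, it is too old. Open another terminal and run:
    pip install --upgrade transformers -i https://pypi.tuna.tsinghua.edu.cn/simple
to upgrade it to the latest version.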
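The version check can also be scripted. The following is a minimal sketch that reads the installed version through the standard library and compares it with a threshold. `MINIMUM_VERSION` is an arbitrary value chosen for illustration, not a requirement stated by the library, and the helper names are hypothetical.

```python
from importlib import metadata

# Assumption: an illustrative threshold, not an official minimum.
MINIMUM_VERSION = "4.0.0"


def parse_version(text):
    """Turn '4.30.0.dev0' into (4, 30, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older(installed, minimum):
    """Return True when the installed version is lower than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_transformers_version():
    """Return the installed transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif is_older(version, MINIMUM_VERSION):
        print(f"transformers {version} is older than {MINIMUM_VERSION}; consider upgrading")
    else:
        print(f"transformers {version} looks recent enough")
```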
    import transformers  # needed for transformers.__version__ below
    from transformers import pipeline

    print(transformers.__version__)
    classifier = pipeline('sentiment-analysis')
    classifier('you are beautiful')
This time it works, as shown in the screenshot below:
However, there is one line of warning text:
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english). Using a pipeline without specifying a model name and revision in production is not recommended.
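This means that no model was specified, so sentiment analysis fell back to the default model https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english. The recommendation is to name a specific model explicitly.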
    import transformers  # needed for transformers.__version__ below
    from transformers import pipeline

    print(transformers.__version__)
    model_id = "distilbert-base-uncased-finetuned-sst-2-english"
    classifier = pipeline('sentiment-analysis', model=model_id)
    classifier('you are beautiful')
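The warning mentions a revision as well as a model name. As a sketch, both can be pinned by passing `model` and `revision` to `pipeline`. The model id and the short revision hash below are the ones quoted in the warning message. The names `PINNED_SETTINGS` and `build_pinned_classifier` are hypothetical.

```python
# Settings for a pinned pipeline. The model id and the short revision hash
# are the ones quoted in the warning message above.
PINNED_SETTINGS = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_pinned_classifier(settings=PINNED_SETTINGS):
    """Create the pipeline from explicit settings (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without loading
    # transformers or downloading the model.
    from transformers import pipeline
    return pipeline(**settings)


# Usage (downloads the model on first run):
# classifier = build_pinned_classifier()
# classifier('you are beautiful')
```

Keeping the settings in one place makes it easy to see exactly which model and revision a script depends on.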
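With the model specified, the warning goes away. The default model does not handle Chinese well, so you can search Hugging Face for "sentiment chinese" (see the screenshot below):
Many models show up. We pick the one ranked first by downloads (screenshot below).
Copy its name (see the screenshot below).
Try it out: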
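    from transformers import pipeline

    model_id = "hw2942/bert-base-chinese-finetuning-financial-news-sentiment-v2"
    classifier = pipeline('sentiment-analysis', model=model_id)
    classifier([
        '這是什麼鬼天氣!',            # "What awful weather!"
        '你可真棒!',                   # "You're really great!"
        '看你那臉,拉得跟驢似的!',    # "Look at that face, as long as a donkey's!"
        '今天手氣真差,又他媽輸了!',   # "Rotten luck today, lost again, damn it!"
    ])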
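The model is downloaded on first use, and then the results are printed. Overall they are fairly reasonable, but some are questionable. For example, "這是什麼鬼天氣!" ("What awful weather!") and "看你那臉,拉得跟驢似的!" ("Look at that face, as long as a donkey's!") are clearly negative, yet both are labeled "neutral". How good the results are depends mainly on the quality of the model itself. Still, overall this is somewhat better than the default English model. Here is a comparison: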
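When several sentences are classified at once, it helps to see each sentence next to its prediction. The sketch below assumes the pipeline returns one dict with `label` and `score` per input, in input order. `pair_predictions` is a hypothetical helper name.

```python
def pair_predictions(texts, predictions):
    """Pair each input sentence with its predicted label and score."""
    if len(texts) != len(predictions):
        raise ValueError("expected one prediction per input sentence")
    return [
        (text, prediction["label"], prediction["score"])
        for text, prediction in zip(texts, predictions)
    ]


# Usage with the classifier created above:
# texts = ['這是什麼鬼天氣!', '你可真棒!']
# for text, label, score in pair_predictions(texts, classifier(texts)):
#     print(f"{text}\t{label}\t{score:.4f}")
```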
2. Zero-shot classification
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification")
    classifier(
        "This is a course about the Transformers library",
        candidate_labels=["education", "politics", "business"],
    )
What it does: given a passage and several candidate labels, it scores how well each label matches. In the example above, the closest match is education.
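To pick the best label programmatically, the following sketch assumes the result for a single input is a dict with parallel `labels` and `scores` lists, which is the shape recent transformers versions return. `best_label` is a hypothetical helper name.

```python
def best_label(result):
    """Return the (label, score) pair with the highest score."""
    pairs = zip(result["labels"], result["scores"])
    return max(pairs, key=lambda pair: pair[1])


# Usage with the classifier created above:
# result = classifier(
#     "This is a course about the Transformers library",
#     candidate_labels=["education", "politics", "business"],
# )
# label, score = best_label(result)
```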
3. Text generation
    from transformers import pipeline

    generator = pipeline("text-generation", model="distilgpt2")
    generator("once upon a time", max_length=30, num_return_sequences=2)
Put simply, you give it an opening and it makes up the rest.
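Generation is sampled, so the continuations usually differ from run to run. The sketch below fixes the random seed with `set_seed` from transformers and pulls out just the generated strings, assuming each returned item is a dict with a `generated_text` key. The helper names are hypothetical.

```python
def extract_texts(outputs):
    """Collect the generated strings from a text-generation result."""
    return [item["generated_text"] for item in outputs]


def generate_stories(prompt, seed=42):
    """Generate two continuations with a fixed seed (hypothetical helper)."""
    # Imported lazily so extract_texts can be used without transformers.
    from transformers import pipeline, set_seed

    set_seed(seed)  # makes the sampling repeatable on the same setup
    generator = pipeline("text-generation", model="distilgpt2")
    outputs = generator(prompt, max_length=30, num_return_sequences=2)
    return extract_texts(outputs)


# Usage (downloads distilgpt2 on first run):
# for story in generate_stories("once upon a time"):
#     print(story)
```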
4. Fill-mask
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="distilroberta-base")
    unmasker("I love sweet foods,such as <mask>.", top_k=2)
The <mask> part is filled in automatically by the model.
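The mask token is not the same for every model: distilroberta-base uses `<mask>`, while BERT-style models use `[MASK]`. As a sketch, the prompt can be built from a template so the right token is inserted. The `{mask}` placeholder and the `build_prompt` helper are conventions invented for this example.

```python
def build_prompt(template, mask_token):
    """Replace the '{mask}' placeholder with the model's own mask token."""
    if "{mask}" not in template:
        raise ValueError("template must contain a '{mask}' placeholder")
    return template.replace("{mask}", mask_token)


# Usage with the unmasker created above; the tokenizer exposes the
# model's mask token as unmasker.tokenizer.mask_token:
# prompt = build_prompt("I love sweet foods, such as {mask}.",
#                       unmasker.tokenizer.mask_token)
# unmasker(prompt, top_k=2)
```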
5. Reading comprehension (answer extraction)
    from transformers import pipeline

    question_answerer = pipeline("question-answering")
    question_answerer(
        question="Is it raining today?",
        context="In the evening, a large cloud drifted in the distance, and soon it began to rain",
    )
Roughly speaking, you give it a passage and ask a question, and it picks out the part of the passage that relates to the answer.
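Because the answer is extracted rather than generated, the model returns a span of the context, not a "yes" or "no". The sketch below assumes the result dict carries `start` and `end` character offsets into the context alongside `answer`, and shows how the offsets map back to the text. `answer_span` is a hypothetical helper name.

```python
def answer_span(context, result):
    """Slice the answer out of the context using the reported offsets."""
    return context[result["start"]:result["end"]]


# Usage with the question_answerer created above:
# context = ("In the evening, a large cloud drifted in the distance, "
#            "and soon it began to rain")
# result = question_answerer(question="Is it raining today?", context=context)
# print(result["answer"], answer_span(context, result))
```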
6. Translation
Chinese to English
    from transformers import pipeline

    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")
    # Input: "Today is Thursday, I'm going to eat KFC."
    translator("今天是週四,我要喫肯德基。")
English to Chinese
    from transformers import pipeline

    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")
    translator("It's Thursday. I'm gonna eat Kentucky Fried Chicken.")
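The two model names above differ only in the language codes. As a sketch, the name can be built from the codes, and the translated strings can be pulled from the result, assuming each returned item is a dict with a `translation_text` key. Not every language pair has a published model under this naming pattern, and the helper names are hypothetical.

```python
def opus_mt_model(source_lang, target_lang):
    """Build a model id following the naming pattern of the two models above."""
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


def translation_texts(outputs):
    """Collect the translated strings from a translation result."""
    return [item["translation_text"] for item in outputs]


# Usage (downloads the model on first run):
# from transformers import pipeline
# translator = pipeline("translation", model=opus_mt_model("en", "zh"))
# print(translation_texts(translator("It's Thursday.")))
```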
7. Summarization
    from transformers import pipeline

    summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
    summarizer(
        """Speaking a language is a skill, like driving a car, playing a musical instrument or learning to swim. To be a good driver, you need to practise driving. You can read a book about car mechanics. You can study the rules of the road. But nothing is as good for your driving as spending time behind the wheel of a car, actually driving. It's the same with speaking English. No matter how much you study grammar and vocabulary, if you don't practise spoken communication, it's very difficult to get good at it. So maybe you talk to yourself in English as you go about your day. Or maybe you look for opportunities to chat in English with people you meet. But however you do it, the most powerful way to improve your English speaking skills is to use them. """,
        max_length=100,
    )
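Note that `max_length` is counted in tokens, not words. To get a rough feel for how much the text was shortened, the sketch below compares word counts, assuming the result is a list whose first item is a dict with a `summary_text` key. `summary_stats` is a hypothetical helper name.

```python
def summary_stats(source_text, outputs):
    """Return the summary and its length relative to the source, in words."""
    source_words = len(source_text.split())
    if source_words == 0:
        raise ValueError("source_text must not be empty")
    summary = outputs[0]["summary_text"]
    return summary, len(summary.split()) / source_words


# Usage with the summarizer created above, where text is the passage:
# summary, ratio = summary_stats(text, summarizer(text, max_length=100))
# print(f"{ratio:.0%} of the original length: {summary}")
```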