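Transformers is a well-known library that bundles pretrained deep learning models. NLP models make up the largest share, but CV and other fields are covered as well. It supports quick use and modification of pretrained models, and models can be moved between deep learning frameworks (PyTorch/TensorFlow/JAX) with little friction. These notes are based on the official HuggingFace tutorial: https://github.com/huggingface/transformers/blob/main/README_zh-hans.md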

Task invocation

Many tasks can be run with just two lines of code. A sentiment analysis task, for example:

from transformers import pipeline
# Use the sentiment-analysis pipeline
classifier = pipeline('sentiment-analysis', 'distilbert-base-uncased-finetuned-sst-2-english')
classifier('We are very happy to introduce pipeline to the transformers repository.')

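The first argument of pipeline is the task type, and the second is the name of the pretrained weights. Within that name, distilbert-base means a base-size BERT trained with model distillation; uncased means the weights do not distinguish upper and lower case, so text is lowercased before it reaches the model (the tokenizer shipped with this checkpoint normally does this itself, so manual lowercasing should not be needed); finetuned-sst-2-english means the weights were fine-tuned on the English Stanford Sentiment Treebank (SST-2) dataset. If the name matches a directory under the current working directory, the files are read from there; otherwise the corresponding repository is downloaded from the HuggingFace site. If the automatic download fails, the weights and configuration files of distilbert-base-uncased-finetuned-sst-2-english can be downloaded as follows: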

git lfs install
git clone https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english
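The lookup order described above (a local directory first, then the HuggingFace site) can be pictured with a small sketch. `resolve_model_source` is a hypothetical helper written for illustration and is not part of Transformers; the real library also consults a local download cache, which this sketch ignores.

```python
import os
import tempfile


def resolve_model_source(name_or_path: str) -> str:
    """Hypothetical helper: report where a model name would be loaded from.

    Simplified assumption: an existing local directory wins; anything else
    is treated as a repository name to download from the HuggingFace site.
    """
    if os.path.isdir(name_or_path):
        return "local"
    return "hub"


# A directory that exists is treated as a local checkpoint folder.
with tempfile.TemporaryDirectory() as local_dir:
    print(resolve_model_source(local_dir))  # local

# A name with no matching directory falls through to the remote repository.
print(resolve_model_source("distilbert-base-uncased-finetuned-sst-2-english"))
```

This is why cloning the repository into the working directory, as above, lets the same two-line pipeline call work offline.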

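The download is a folder containing the model structure file config.json, the model weight file model.safetensors, the tokenizer configuration file tokenizer_config.json, the vocabulary file vocab.txt, and so on. The folder sometimes also contains the tokenizer file tokenizer.json, which stores the mapping from tokens to ids. vocab.txt lists the same tokens in id order, so it holds the same mapping read in the opposite direction, and the model can still run without tokenizer.json. Beyond the mapping, however, tokenizer.json usually also stores extra information about special tokens and out-of-vocabulary words, and this can affect model results.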
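The relation between the two vocabulary files can be shown on a toy example. The five tokens below are made up for illustration, and the sketch assumes the usual vocab.txt layout of one token per line with the line number serving as the id.

```python
# Toy vocabulary in vocab.txt layout: one token per line, line number = id.
# These tokens are made up; a real vocab.txt is far larger.
vocab_txt = "[PAD]\n[UNK]\n[CLS]\n[SEP]\nhello\n"

# Reading vocab.txt top to bottom gives the id -> token direction.
id_to_token = vocab_txt.splitlines()

# The same pairs keyed by token give the token -> id direction.
token_to_id = {token: idx for idx, token in enumerate(id_to_token)}

print(token_to_id["hello"])  # 4
print(id_to_token[4])        # hello
```

Either direction can be rebuilt from the other, which is why the mapping alone does not require both files.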

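If downloading the weights through git fails, individual weight files can be downloaded directly from the website and placed in the folder. Files with the suffixes h5, weights, ckpt, pth, safetensors and bin are all model weights. For example, pth is a common PyTorch weight suffix and h5 is a common TensorFlow weight suffix. The storage formats are not examined here; downloading any one file that the framework in use can load is enough. Transformers defaults to PyTorch, so pth, bin or safetensors is the usual choice.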
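As a rough sketch of that choice, the snippet below picks one weight file from a file listing and prefers the PyTorch-side suffixes. `pick_weight_file` and the preference order are assumptions made for illustration, not library behaviour.

```python
from pathlib import PurePosixPath

# Suffix groups taken from the paragraph above. Which framework the other
# listed suffixes (weights, ckpt) belong to is not stated there, so this
# sketch leaves them out.
PYTORCH_SUFFIXES = (".safetensors", ".bin", ".pth")
TENSORFLOW_SUFFIXES = (".h5",)


def pick_weight_file(filenames):
    """Hypothetical helper: choose one weight file, preferring PyTorch formats."""
    for suffix in PYTORCH_SUFFIXES + TENSORFLOW_SUFFIXES:
        for name in filenames:
            if PurePosixPath(name).suffix == suffix:
                return name
    return None  # no recognised weight file in the listing


repo_files = ["config.json", "vocab.txt", "tf_model.h5", "model.safetensors"]
print(pick_weight_file(repo_files))  # model.safetensors
```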

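From the API above and the downloaded repository files, it is clear that Transformers keeps the pretrained model, its configuration files, the tokenizer and related files together in one repository. At load time the model structure is built automatically and the matching pretrained weights are read in, so there is no need to write PyTorch structure code by hand to match the weights, which speeds up the use of pretrained models.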
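The idea of building a model structure from a configuration file can be illustrated with a toy registry. Everything here is a stand-in: `ToyClassifier`, `MODEL_CLASSES` and the config values are made up, and the sketch only assumes that the config names a model class and lists its hyperparameters.

```python
import json

# A trimmed, made-up config in the style of config.json; the values are
# illustrative and not copied from any real repository.
config_text = '{"architectures": ["ToyClassifier"], "dim": 8, "num_labels": 2}'


class ToyClassifier:
    """Stand-in for a model class; it only records its hyperparameters."""

    def __init__(self, dim, num_labels):
        self.dim = dim
        self.num_labels = num_labels


# Registry from class name to class, playing the role of the library's lookup.
MODEL_CLASSES = {"ToyClassifier": ToyClassifier}

config = json.loads(config_text)
model_cls = MODEL_CLASSES[config.pop("architectures")[0]]
model = model_cls(**config)  # remaining fields become constructor arguments
print(type(model).__name__, model.dim, model.num_labels)  # ToyClassifier 8 2
```

In the real library the weights would then be loaded into the freshly built structure; the toy class has no weights, so that step is omitted.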

Calling pretrained models

To study a model's inference rather than carry out a specific task, the following code can be used:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased") #1
model = AutoModel.from_pretrained("bert-base-uncased") #2
inp = tokenizer("Hello world!", return_tensors="pt") #3
outp = model(**inp)

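Here #1 loads the tokenizer of bert-base-uncased, and #2 loads the pretrained weights of bert-base-uncased and builds the model. If only the h5 weights were downloaded while PyTorch is the backend, from_pretrained needs the extra argument from_tf=True. #3 tokenizes the input sentence with the tokenizer and returns PyTorch tensors. With return_tensors="tf" the tokenizer returns tensors compatible with TensorFlow models, in which case the model should be instantiated with TFAutoModel.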

To process a batch, pass the tokenizer a list of texts, for example:

texts = ["Hello world!", "Hello, how are you?"]
inp = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
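What padding and truncation do to a batch can be sketched on plain lists of token ids. `pad_batch` is a toy helper with made-up ids; it assumes padding to the longest sequence in the batch and plain cutting at `max_length`, whereas the real tokenizer also takes care of special tokens when truncating.

```python
def pad_batch(batch_ids, pad_id=0, max_length=None):
    """Toy version of padding=True, truncation=True on lists of token ids."""
    if max_length is not None:
        batch_ids = [ids[:max_length] for ids in batch_ids]  # truncate
    longest = max(len(ids) for ids in batch_ids)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in batch_ids]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in batch_ids]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token ids for two sentences of different length.
batch = pad_batch([[101, 7, 8, 102], [101, 7, 9, 10, 11, 102]])
print(batch["input_ids"][0])       # [101, 7, 8, 102, 0, 0]
print(batch["attention_mask"][0])  # [1, 1, 1, 1, 0, 0]
```

Padding makes every row the same length so the batch can be stacked into one tensor, and the attention mask tells the model which positions are padding.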

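If the tokenizer is given two pieces of text, it merges them into one sequence and returns sentence type ids (token_type_ids), which are used for sentence-pair tasks such as judging sentence order. Tokens of the first sentence are marked 0 and tokens of the second sentence are marked 1: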

texts = ["Hello world!", "Hello, how are you?"]
inp = tokenizer(*texts, return_tensors="pt", padding=True, truncation=True)
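The merged layout can be sketched without the library. `encode_pair` is a toy helper that assumes the BERT-style arrangement [CLS] A [SEP] B [SEP] and works on already split tokens rather than real token ids.

```python
def encode_pair(tokens_a, tokens_b):
    """Toy BERT-style pair layout: [CLS] A [SEP] B [SEP]."""
    first = ["[CLS]"] + tokens_a + ["[SEP]"]
    second = tokens_b + ["[SEP]"]
    tokens = first + second
    # Segment 0 covers [CLS], sentence A and its [SEP]; segment 1 covers the rest.
    token_type_ids = [0] * len(first) + [1] * len(second)
    return tokens, token_type_ids


tokens, token_type_ids = encode_pair(["hello", "world", "!"], ["hello", ",", "how"])
print(tokens)
print(token_type_ids)  # [0, 0, 0, 0, 0, 1, 1, 1, 1]
```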

Custom model inference

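In config.json, the architectures field names the model structure class that the pretrained weights require, and the remaining fields turn out to be the parameters passed to that class. Together they allow a model structure matching the pretrained weights to be instantiated, after which the weights are loaded to obtain the pretrained model. Using these files and the model structure classes built into Transformers (which inherit from nn.Module), we can therefore define the model's data path ourselves. The sentiment classification pipeline from earlier breaks down as follows: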

import torch
from torch import nn
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

text = "We are very happy to introduce pipeline to the transformers repository."
model_head_name = "distilbert-base-uncased-finetuned-sst-2-english"
# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = DistilBertForSequenceClassification.from_pretrained(model_head_name).to(device)
tokenizer = DistilBertTokenizer.from_pretrained(model_head_name)
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True).to(device)

# Get the output of the DistilBERT backbone inside the model
distilbert_output = model.distilbert(**inputs)
# Use the first token [CLS] of the backbone output to compute the sentiment probability
hidden_state = distilbert_output[0]  # (bs, seq_len, dim)
pooled_output = hidden_state[:, 0]  # (bs, dim)
pooled_output = model.pre_classifier(pooled_output)  # (bs, dim)
pooled_output = nn.ReLU()(pooled_output)  # (bs, dim)
pooled_output = model.dropout(pooled_output)  # (bs, dim)
logits = model.classifier(pooled_output)  # (bs, num_labels)
# Index 1 along the label axis is read as the positive class
print("Positive rate: ", nn.Softmax(1)(logits)[0, 1].detach().cpu().numpy())
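The last step above turns the classifier logits into probabilities with a softmax. A minimal pure-Python sketch of that operation follows; the logits are made up for illustration and are not real model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits in (negative, positive) order.
probs = softmax([-1.0, 3.0])
print(round(sum(probs), 6))  # 1.0
print(probs[1] > probs[0])   # True
```

The larger logit receives the larger probability, and the probabilities sum to one, which is what allows the second entry to be read as a positive rate.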
