https://github.com/keon/awesome-nlp
https://github.com/glample/tagger
https://github.com/guoguibing/librec excellent recommender system code
https://github.com/ottokart/punctuator2 punctuation restoration code using a bidirectional LSTM with attention
https://github.com/clab/lstm-parser LSTM-based parser in C++
https://github.com/clab/stack-lstm-ner LSTM-based NER in C++
https://github.com/stanfordnlp/GloVe GloVe, word vectors
https://github.com/facebookresearch/fastText text classification: library for fast text representation and classification
https://github.com/dennybritz/cnn-text-classification-tf text classification with TensorFlow
https://github.com/Microsoft/LightLDA LDA topic modeling
https://github.com/facebookresearch/fairseq Facebook's machine translation toolkit
https://github.com/google/re2 a regular expression library developed by Google, said to be better than PCRE
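https://github.com/overtrue/pinyin Chinese character to pinyin conversion
https://github.com/cpp-netlib/cpp-netlib much easier to use than libcurl
https://github.com/mischasan/aho-corasick Aho-Corasick (AC) algorithm, designed for multi-pattern matching
https://github.com/NLPchina/ansj_seg said to offer better word segmentation and person-name recognition than HanLP
https://github.com/neubig/lamtram another interesting neural network language model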
https://github.com/RaRe-Technologies/gensim
https://github.com/yangyangwithgnu/hanz2piny Chinese character to pinyin conversion
https://github.com/uber/horovod a distributed training framework open-sourced by Uber
https://github.com/L1aoXingyu/pytorch-beginner many PyTorch tutorials for beginners
https://github.com/graykode/nlp-tutorial PyTorch tutorials by a Korean developer
https://github.com/aymericdamien/TensorFlow-Examples good tutorials for learning TensorFlow
https://github.com/FuYanzhe2/Name-Entity-Recognition a strong named entity recognition project built with BERT-BiLSTM-CRF
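https://github.com/huggingface/pytorch-pretrained-BERT BERT implemented in PyTorch
https://github.com/atpaino/deep-text-corrector text error-correction code
https://github.com/shibing624/pycorrector text error-correction code
https://github.com/wainshine/Chinese-Names-Corpus large collection of person names
https://github.com/fighting41love/funNLP large collection of useful lexicons
https://github.com/sing1ee/dict_build new word discovery, Java version; can automatically generate a dictionary from a corpus
https://github.com/pwxcoo/chinese-xinhua large collection of xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters
https://github.com/robertmartin8/PyPortfolioOpt.git a library implementing widely used classical portfolio optimization techniques
https://github.com/kudkudak/word-embeddings-benchmarks evaluates the quality of trained word vectors
https://github.com/bamtercelboo/Word_Similarity_and_Word_Analogy evaluates the quality of trained word vectors
https://github.com/CallMeJiaGu/WordSimilarityAnalogyData test sets for evaluating the quality of trained word vectors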