I've recently started an internship (cue the tears) and have no time to study, so I'm parking some resources here that look great but that I haven't had time to read, for later.
If the highest level of understanding of a technology is being able to explain it in the simplest possible way, then Igor's understanding of CPU caches has definitely reached that level. His blog post, Gallery of Processor Cache Effects (http://t.cn/hrXwvb), covers key topics such as cache lines, cache size, and false sharing with 7 extremely simple code examples. Hats off.
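As a quick reminder of the cache-line point (the post's examples are in C#), here is a rough numpy sketch of its first experiment; the array size and stride are my own choices, and Python/numpy overhead makes the timings much noisier than in compiled code.

```python
import time
import numpy as np

# Rough analogue of the post's first experiment: multiply every element
# versus every 16th element of a large array. With 4-byte ints and a 64-byte
# cache line, stride 16 still touches every cache line, so the strided pass
# is far less than 16x faster even though it does 1/16th of the arithmetic.
arr = np.zeros(64 * 1024 * 1024, dtype=np.int32)  # ~256 MB

t0 = time.perf_counter()
arr *= 3                        # touch every element
full_pass = time.perf_counter() - t0

t0 = time.perf_counter()
arr[::16] *= 3                  # touch 1/16 of the elements, but every cache line
strided_pass = time.perf_counter() - t0

print(f"full pass:    {full_pass:.3f}s")
print(f"strided pass: {strided_pass:.3f}s")
```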
Today's NAACL tutorials include Stanford's Richard Socher and Christopher Manning lecturing on applying deep learning to NLP. I skimmed the slides; they add some new material compared with last year's ACL version, so this counts as a fairly comprehensive tutorial on deep learning for language technology. "Deep Learning for NLP (without Magic)" slides: http://t.cn/zHHyKUo
Tutorials
UBC's Machine Learning 2013 course,
which covers MCMC as well as recent deep learning material:
http://www.cs.ubc.ca/~nando/540-2013/lectures.html
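Since the course covers MCMC, here is a minimal random-walk Metropolis sketch on a toy 1-D target of my own choosing, just to have the basic accept/reject loop written down; it is not course code.

```python
import numpy as np

def metropolis(log_p, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis: sample from a density known only up to a constant."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal()
        # Accept with probability min(1, p(proposal) / p(x))
        if np.log(rng.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples[i] = x
    return samples

# Toy target: equal mixture of N(-2, 1) and N(2, 1), unnormalised log-density
log_p = lambda x: np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)
draws = metropolis(log_p, x0=0.0, n_samples=20000)
print(draws.mean(), draws.std())  # roughly 0 and roughly sqrt(5)
```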
Text mining techniques
http://www.icst.pku.edu.cn/course/mining/11-12spring/index.html
RBM Java code; probably the code that suits my taste best
https://github.com/tjake/rbm-dbn-mnist
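Before digging into the Java project, a minimal numpy sketch of what a binary RBM trained with one step of contrastive divergence (CD-1) looks like; the toy data and hyperparameters are made up, and this is not the linked code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with CD-1 (minimal sketch)."""

    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data
        h0_prob = sigmoid(v0 @ self.W + self.b_h)
        h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
        # Negative phase: one Gibbs step back to a reconstruction
        v1_prob = sigmoid(h0 @ self.W.T + self.b_v)
        h1_prob = sigmoid(v1_prob @ self.W + self.b_h)
        # Gradient approximation: <v h>_data - <v h>_model
        batch = v0.shape[0]
        self.W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
        self.b_v += lr * (v0 - v1_prob).mean(axis=0)
        self.b_h += lr * (h0_prob - h1_prob).mean(axis=0)
        return np.mean((v0 - v1_prob) ** 2)  # reconstruction error

# Toy usage: random binary data standing in for MNIST
data = (rng.random((100, 20)) < 0.3).astype(float)
rbm = RBM(n_visible=20, n_hidden=8)
for epoch in range(10):
    err = rbm.cd1_update(data)
print("final reconstruction error:", err)
```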
The Stanford NLP group has a dedicated page for Deep Learning in Natural Language Processing
http://nlp.stanford.edu/projects/DeepLearningInNaturalLanguageProcessing.shtml
A top researcher's homepage (Alex Smola);
this is his teaching page, with a lot of material:
http://alex.smola.org/teaching/
http://www.cs.princeton.edu/courses/archive/spring10/cos424/w/syllabus
The Large Scale Learning class notes
http://cilvr.cs.nyu.edu/doku.php?id=courses:bigdata:slides:start
Algorithm tutorials
Homepage of a Cambridge professor; his Gaussian process PDF is very detailed and well written
http://mlg.eng.cam.ac.uk/zoubin/
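To go with the GP notes, a minimal numpy sketch of GP regression with a squared-exponential kernel, using the standard posterior mean/covariance formulas; the toy data and kernel settings are my own.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = variance * exp(-|a - b|^2 / (2 l^2))."""
    sqdist = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean and covariance of a zero-mean GP at the test inputs."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

# Toy 1-D example
X = np.array([-4.0, -2.0, 0.0, 1.0, 3.0])
y = np.sin(X)
X_star = np.linspace(-5, 5, 11)
mu, cov = gp_posterior(X, y, X_star)
print(np.round(mu, 2))
print(np.round(np.sqrt(np.diag(cov)), 2))  # predictive standard deviations
```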
A very nice variational Bayes tutorial
http://people.inf.ethz.ch/bkay/talks/Brodersen_2013_03_22.pdf
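A tiny worked example of mean-field variational Bayes, the classic unknown-mean, unknown-precision Gaussian model (as in Bishop's PRML, section 10.1.3), just to record what the coordinate updates look like; the toy data and priors are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=0.5, size=200)   # toy data
N, xbar = len(x), x.mean()

# Priors: mu | tau ~ N(mu0, (lambda0 * tau)^-1),  tau ~ Gamma(a0, b0)
mu0, lambda0, a0, b0 = 0.0, 1.0, 1.0, 1.0

# Mean-field factorisation q(mu, tau) = q(mu) q(tau); iterate the updates
E_tau = a0 / b0
for _ in range(50):
    # q(mu) = N(mu_N, 1 / lambda_N)
    mu_N = (lambda0 * mu0 + N * xbar) / (lambda0 + N)
    lambda_N = (lambda0 + N) * E_tau
    E_mu, E_mu2 = mu_N, mu_N ** 2 + 1.0 / lambda_N
    # q(tau) = Gamma(a_N, b_N)
    a_N = a0 + (N + 1) / 2.0
    b_N = b0 + 0.5 * (np.sum(x ** 2) - 2 * N * xbar * E_mu + N * E_mu2
                      + lambda0 * (E_mu2 - 2 * mu0 * E_mu + mu0 ** 2))
    E_tau = a_N / b_N

print("variational mean of mu:", mu_N)           # close to 2.0
print("estimated sigma:", (1 / E_tau) ** 0.5)    # close to 0.5
```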
Hadoop implementations of collaborative filtering and graph mining
https://code.google.com/p/hadoop-network/
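The link above is the Hadoop version; as a single-machine reference point, here is a minimal matrix-factorization collaborative-filtering sketch trained with SGD on made-up rating triples (my own toy example, not the linked code).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy (user, item, rating) triples; real data would come from a ratings file
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 2

# Latent factors for users (P) and items (Q)
P = 0.1 * rng.standard_normal((n_users, k))
Q = 0.1 * rng.standard_normal((n_items, k))

lr, reg = 0.05, 0.02
for epoch in range(200):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                   # prediction error
        P[u] += lr * (err * Q[i] - reg * P[u])  # SGD step with L2 regularisation
        Q[i] += lr * (err * P[u] - reg * Q[i])

print("predicted rating for user 0, item 2:", P[0] @ Q[2])
```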
Handy open-source tools for working with big data on a single machine
1. LibFM
2. Svdfeature
Project page: http://apex.sjtu.edu.cn/apex_wiki/svdfeature
3. Libsvm and Liblinear
libsvm project page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
liblinear project page: http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Must-read for first-time users: the practical guide
Chih-Jen Lin's notes on developing libsvm: http://www.csie.ntu.edu.tw/~cjlin/talks/kdd.pdf
4. rt-rank
Project page: http://research.engineering.wustl.edu/~amohan/
rt-rank implements random forests and gradient boosted decision trees, two methods commonly used in recommender systems, and is easy to use (a minimal scikit-learn stand-in for both appears after this list).
5. Mahout
Project page: http://mahout.apache.org/
6. MyMediaLite
Project page: http://www.ismll.uni-hildesheim.de/mymedialite/
7. GraphLab and GraphChi
GraphLab project page: http://graphlab.org/
GraphChi project page: http://graphlab.org/graphchi/
GraphChi download: https://code.google.com/p/graphchi/downloads/detail?name=graphchi_src_v0.1.2_toolkits.tar.gz
CF for GraphChi: http://bickson.blogspot.com/2012/08/collaborative-filtering-with-graphchi.html
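For the rt-rank entry above (item 4): a minimal stand-in sketch of the same two methods, random forests and gradient boosted decision trees, using scikit-learn on synthetic regression data. It illustrates the models only, not rt-rank's own interface; the dataset and hyperparameters are arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for rating/ranking targets
X, y = make_regression(n_samples=2000, n_features=20, noise=0.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "gradient boosted trees": GradientBoostingRegressor(n_estimators=200,
                                                        learning_rate=0.1,
                                                        max_depth=3,
                                                        random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name}: test MSE = {mse:.3f}")
```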
pylearn2
https://github.com/lisa-lab/pylearn2
It has a lot of features and is updated frequently:
- Training algorithms
  - A "default training algorithm" that asks the model to train itself
  - Stochastic gradient descent, with extensions including
    - Learning rate decay
    - Momentum
    - Polyak averaging
    - Early stopping
    - A simple framework for adding your own extensions
  - Batch gradient descent with line searches
  - Nonlinear conjugate gradient descent (with line searches)
- Model estimation criteria
  - Score matching
  - Denoising score matching
  - Noise-contrastive estimation
  - Cross-entropy
  - Log-likelihood
- Models
  - Autoencoders, including contractive and denoising autoencoders
  - RBMs, including Gaussian RBMs and ssRBMs, with varying levels of integration into the full framework
  - k-means
  - Local Coordinate Coding
  - Maxout networks
  - PCA
  - Spike-and-Slab Sparse Coding
  - SVMs (a wrapper around scikit-learn that makes it easy to train a multiclass SVM on dense training data in a memory-efficient way, which doesn't always happen when using scikit-learn directly)
  - Partial implementation of DBMs (contact Ian Goodfellow if you would like to complete it)
- Datasets
  - MNIST, MNIST with background and rotations
  - STL-10
  - CIFAR-10, CIFAR-100
  - NIPS Workshops 2011 Transfer Learning Challenge
  - UTLC
  - NORB
  - Toronto Faces Dataset
- Dataset preprocessing
  - Contrast normalization
  - ZCA whitening (see the sketch after this list)
  - Patch extraction (for implementing convolution-like algorithms)
  - The Coates+Lee+Ng CIFAR processing pipeline
- Miscellaneous algorithms and utilities
  - AIS
  - Weight visualization for single-layer networks
  - Can plot learning curves showing how user-configured quantities change during learning
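The preprocessing bullets mention contrast normalization and ZCA whitening; here is a generic numpy sketch of both, a simplified version of my own rather than pylearn2's implementation, with random patches standing in for CIFAR-10.

```python
import numpy as np

def global_contrast_normalize(X, eps=1e-8):
    """Subtract each example's mean and scale it to unit norm (one row = one example)."""
    X = X - X.mean(axis=1, keepdims=True)
    norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))
    return X / (norms + eps)

def zca_whiten(X, eps=1e-2):
    """ZCA whitening: decorrelate features while staying close to the original space."""
    X = X - X.mean(axis=0)                    # centre each feature
    cov = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return X @ W

# Toy usage with random "image patches" instead of CIFAR-10
rng = np.random.default_rng(0)
patches = rng.random((500, 64))
patches = global_contrast_normalize(patches)
white = zca_whiten(patches)
print(np.round(np.cov(white, rowvar=False)[:3, :3], 2))  # roughly the identity
```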