[MOOC Notes] Fast.ai 2019 || Week 3 - Multi-label Classification & Segmentation

Note: this blog is only for my own review, so it omits many details as well as points I have already recorded elsewhere. If you want to learn the fast.ai course properly, work carefully through Howard's videos and the resources on the Forum!


2019/6/28

Warm-up before class

Howard recommended Andrew Ng's ML course: parts of it are a bit dated, but most of it is still excellent. It takes a bottom-up approach, which pairs well with fast.ai's top-down approach:
combine bottom-up style and top-down style and meet somewhere in the middle.
Howard also mentioned his own ML course, roughly twice the length of the DL course (it covers a lot of fundamentals).

Study advice: if you really want to dig deeply into the material, do all of these courses together. Many people who have done so say they got more out of each one by doing the whole lot.

Production

Howard provides a simple recipe for deploying a web app: a JavaScript front end, with the back-end data as JSON; the link is under the Production menu on the left of the course navigation page.

Student project showcase!

Worth revisiting one by one on the Forum; they will give you plenty of inspiration!!! (Not necessarily as videos; some have detailed notes on GitHub!)

  • What car! (a car-model classifier)
  • Recognizing facial expressions or gestures in video
    Search the Forum and other resources to understand how to make the jump from images to video
  • yourcityfrom
    Identifies which city a satellite image shows!
  • Classifying facial expressions, building on a 2013 paper

| Multi-label prediction

dataset: Planet Amazon dataset (satellite images), from Kaggle
Many people are already using satellite images with deep learning, but we are only scratching the surface!
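To make "multi-label" concrete: each Planet tile can carry several tags at once (e.g. "agriculture clear primary"), so the target is a multi-hot vector rather than a single class index, and accuracy is computed per label after thresholding each predicted probability. A minimal pure-Python sketch; the tag names and the `encode_tags` / `accuracy_thresh` helpers are illustrative stand-ins, loosely modeled on fastai's thresholded accuracy metric:

```python
# Multi-label targets: one 0/1 entry per possible tag, several can be 1 at once.
classes = ["agriculture", "clear", "cloudy", "primary", "water"]

def encode_tags(tag_string):
    """Turn a space-separated tag string into a multi-hot vector over `classes`."""
    tags = set(tag_string.split())
    return [1.0 if c in tags else 0.0 for c in classes]

def accuracy_thresh(probs, target, thresh=0.2):
    """Fraction of labels predicted correctly after thresholding each probability."""
    preds = [1.0 if p > thresh else 0.0 for p in probs]
    return sum(p == t for p, t in zip(preds, target)) / len(target)

y = encode_tags("agriculture clear primary")
print(y)                                                # [1.0, 1.0, 0.0, 1.0, 0.0]
print(accuracy_thresh([0.9, 0.3, 0.1, 0.8, 0.05], y))   # 1.0
```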

Get dataset from Kaggle

Just follow the notebook walkthrough.

Create DataBunch - use data block API

DataBunch(train_dl: DataLoader(datasets(), batch_size=8), valid_dl: same): each layer is passed in as an argument to the next!
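The nesting described above can be sketched with toy classes; these are simplified stand-ins, not the real fastai/PyTorch classes, just to show how datasets go into DataLoaders and the two DataLoaders go into a DataBunch:

```python
# Toy sketch of the layered construction: Dataset -> DataLoader -> DataBunch.
# (Illustrative classes only, not the real fastai API.)

class Dataset:
    def __init__(self, items):
        self.items = items
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return self.items[i]

class DataLoader:
    """Groups a Dataset into batches of up to batch_size items."""
    def __init__(self, dataset, batch_size=8):
        self.dataset, self.batch_size = dataset, batch_size
    def __iter__(self):
        for i in range(0, len(self.dataset), self.batch_size):
            stop = min(i + self.batch_size, len(self.dataset))
            yield [self.dataset[j] for j in range(i, stop)]

class DataBunch:
    """Bundles a training and a validation DataLoader together."""
    def __init__(self, train_dl, valid_dl):
        self.train_dl, self.valid_dl = train_dl, valid_dl

data = DataBunch(DataLoader(Dataset(list(range(20))), batch_size=8),
                 DataLoader(Dataset(list(range(5))), batch_size=8))
batches = list(data.train_dl)
print(len(batches), len(batches[0]))  # 3 8
```

The point is the same as in the fastai data block API: each level wraps the one below, so you build the innermost piece first and pass it upward.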

Note: all fastai documentation pages are notebooks, so you can clone them and practice in Jupyter! For how to open them, see lesson 3, 29` min.

----------------up to 56` min


| Image Segmentation with CamVid

Caveat: when you transform the images, the masks must be transformed the same way, otherwise they no longer match!

Training method: progressive resizing

Start training on smaller images, then gradually increase the image size. There is no solid theory yet for how big each level should be; in Howard's experience,
going below 64x64 rarely helps.
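One common choice is to keep doubling the size from a small starting point up to the full resolution. The `size_schedule` helper below is hypothetical, just one reasonable rule of thumb consistent with the 64x64 floor mentioned above, not an established recipe:

```python
def size_schedule(final_size, start=64):
    """Sizes to train at in turn, doubling from `start` up to `final_size`.

    The 64-pixel default reflects the observation that training below
    64x64 rarely helps; the doubling rule itself is an illustrative choice.
    """
    sizes, s = [], start
    while s < final_size:
        sizes.append(s)
        s *= 2
    sizes.append(final_size)
    return sizes

print(size_schedule(256))  # [64, 128, 256]
```

In practice you would re-create the DataBunch at each size and keep training the same learner, so early epochs are cheap and later epochs fine-tune at full resolution.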

How fit_one_cycle() works

In short, the learning rate first increases and then decreases.
(figure: learning-rate schedule)
The loss surface looks something like the figure below.
The purpose of increasing it: helping the model explore the whole function surface and find areas where the loss is both low and not bumpy! This keeps it from getting stuck in a local minimum. (So what you pass to fit_one_cycle() is actually the max learning rate.)
(figure: loss surface)

So with fit_one_cycle(), the loss changes as shown below. In other words:
if you find that it just gets a little bit worse and then gets a lot better, you've found a really good maximum learning rate.
(figure: loss curve during one-cycle training)
And if you find the loss just keeps going down, you can nudge the learning rate up a bit:
if you find that the loss is kind of always going down, particularly after you unfreeze, that suggests you can probably bump your learning rate up a little.
You really want to see this kind of shape! It will train faster and generalize better.
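The schedule itself is easy to write down. This is a minimal sketch of the one-cycle idea: the learning rate ramps from a low value up to max_lr for the first part of training, then anneals back down. The cosine shape, 30% warm-up fraction, and divisor of 25 are illustrative choices, not the exact fastai implementation:

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3, div=25.0):
    """Learning rate at `step` under a sketch of the one-cycle policy."""
    start_lr = max_lr / div
    warm = int(total_steps * pct_start)
    if step < warm:
        # warm-up: cosine ramp from start_lr up to max_lr
        t = step / max(warm, 1)
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * t)) / 2
    # anneal: cosine decay from max_lr down to ~0
    t = (step - warm) / max(total_steps - warm, 1)
    return max_lr * (1 + math.cos(math.pi * t)) / 2

lrs = [one_cycle_lr(s, 100, 0.01) for s in range(101)]
print(max(lrs))  # 0.01 -- the value you pass in really is the *max* LR
```

This makes the point in the text concrete: the argument is a ceiling the schedule touches once, not a constant rate.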

Howard: there is a gap between knowing this theory and actually applying it; you need to look at a lot of these loss curves!
So after every training run, look at how the loss changed, and at what the good results and the bad results look like. Don't just tune the learning rate and epochs; watch how the curves change too!

Mixed precision training

If you run out of memory during training, try training with 16-bit floating-point numbers!
That means: instead of using single-precision floating-point numbers, you can do most of the calculations in your model with half-precision floating-point numbers.
So, 16 bits instead of 32 bits. Because this is fairly new, you may need recent hardware (CUDA drivers, etc.) for it to work. fastai provides an interface: just append to_fp16() when you create the learner.
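To see what half precision actually buys (and costs), you don't need a GPU: Python's standard `struct` module can round-trip IEEE 754 half-precision values via the `'e'` format. Each number takes 2 bytes instead of 4, but very small values underflow to zero, which is the main numerical hazard of fp16 training:

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]

print(struct.calcsize('e'), struct.calcsize('f'))  # 2 4  (half the memory)
print(to_fp16(0.1))    # ~0.09998 -- reduced precision
print(to_fp16(1e-8))   # 0.0 -- tiny values underflow in half precision
```

This is why mixed precision keeps some parts of the computation in 32-bit; the fastai `to_fp16()` call handles those details for you.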

----------------up to 94` min


| Regression with BIWI head pose dataset

Image Regression Model: find the center of the face, i.e. two float numbers.
Regression: any kind of model where your output is some continuous number or set of numbers.
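Because the output is continuous coordinates rather than a class, the natural loss is mean squared error instead of cross-entropy. A tiny sketch (the normalized-coordinate values are made up for illustration):

```python
def mse(pred, target):
    """Mean squared error between predicted and true coordinate lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# predicted vs. true face center, in normalized image coordinates
print(mse([0.52, 0.31], [0.50, 0.30]))  # ~0.00025
```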

----------------up to 101` min


| NLP quick version - IMDB

First student question: didn't quite follow it; rewatch!

----------------up to 110` min

At the end he mentioned the animations in Michael Nielsen's book Neural Networks and Deep Learning:
if you have enough little matrix multiplications each followed by a sigmoid (or ReLU), you can create arbitrary shapes. Combinations of linear functions and nonlinearities can create arbitrary shapes.

Universal approximation theorem: if you have stacks of linear functions and nonlinearities, you can approximate any function arbitrarily closely.
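A one-dimensional toy example of this: a sum of shifted ReLUs is a piecewise-linear function, and with enough pieces you can match any continuous curve. Here just two ReLUs reproduce |x|, a shape no single linear function can make:

```python
def relu(x):
    """The nonlinearity: zero out negative inputs."""
    return max(0.0, x)

def abs_via_relus(x):
    # |x| = relu(x) + relu(-x): two linear pieces glued together
    # by the nonlinearity at x = 0.
    return relu(x) + relu(-x)

print([abs_via_relus(x) for x in (-2.0, -0.5, 0.0, 3.0)])  # [2.0, 0.5, 0.0, 3.0]
```

Adding more shifted ReLUs adds more kinks, which is the intuition behind "arbitrary shapes" in the quote above.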

I dropped this book halfway through; looks like I need to pick it up again. There are plenty of details in it worth a look!


Question: if the image data has 2 or 4 channels, how do you use a 3-channel pretrained model?

Howard said this capability will be added to fastai later.
2-channel data: there are a few things you can do, but basically you can create a third channel that is either all zeros or the average of the other two channels.
4-channel data: you certainly should not throw away the 4th channel's data, so you have to modify the model instead; roughly speaking, add an extra slice to the weight tensor, initialized to zeros or random values. This will be covered in detail in a later lesson!
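Both tricks can be sketched in pure Python, with nested lists standing in for tensors of shape [channels][h][w]; the helper names are hypothetical, chosen just to mirror the two cases above:

```python
def add_mean_channel(img):
    """2-channel image -> 3 channels; the new third channel is the
    mean of the other two (all-zeros would also work)."""
    c0, c1 = img
    mean = [[(a + b) / 2 for a, b in zip(r0, r1)] for r0, r1 in zip(c0, c1)]
    return [c0, c1, mean]

def expand_conv_weights(w):
    """Per-filter conv weights [3][h][w] -> [4][h][w]: append one all-zero
    slice so the pretrained 3-channel weights are left untouched."""
    h, wd = len(w[0]), len(w[0][0])
    return w + [[[0.0] * wd for _ in range(h)]]

img2 = [[[1.0, 2.0]], [[3.0, 4.0]]]   # 2 channels, 1x2 pixels
print(add_mean_channel(img2)[2])      # [[2.0, 3.0]]

w3 = [[[0.5]], [[0.5]], [[0.5]]]      # a 3x1x1 pretrained filter
print(len(expand_conv_weights(w3)))   # 4
```

With a zero-initialized fourth slice, the expanded filter initially computes exactly what the pretrained 3-channel filter did, and fine-tuning then learns how to use the extra channel.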
