Notes on Getting an Image Caption Project Running

Original

2018-09-04 06:57

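I have been working on Image Caption recently and tried many projects from GitHub. Either my environment was not supported or the code would not run. In the end I found one that runs on Windows 10 + Python 3 + TensorFlow. I only ran forward prediction (inference) and did no training, because the dataset is far too large for my weak machine. There were some pitfalls along the way, though not very many. I am writing them down so they can be reused later, for my own benefit and for others.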

Project link

https://github.com/coldmanck/show-attend-and-tell

[Python 3] Tensorflow implementation of “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”

Usage

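Download and extract the project
Install the dependencies
- Install the dependencies by following the links in the project's instructions. nltk is rather slow to set up; try not to put it on the system drive, because it is quite large.
Download the COCO dataset annotations
- Here is a Baidu Cloud link: https://pan.baidu.com/s/1TkiFEsh2dRnX7qAsxuCQ4w. Download and extract it, and put the file captions_train2014.json in the folder train. Similarly, put the file captions_val2014.json in the folder val.
Download the pretrained model file
- https://app.box.com/s/xuigzzaqfbpnf76t295h109ey9po5t8p
- This one may fail to download, so I am giving a Baidu Cloud link as well.
Forward prediction
- Finally, the main point. Put some images in the folder test/images, then run the following in the project folder: python main.py --phase=test --model_file='./models/289999.npy' --beam_size=3. Yes, it really is 289999; that is the model's name. Just put the downloaded model 289999.npy in the models folder. The generated captions will be saved in the folder test/results.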
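
Before running the command, it can help to confirm that every file from the steps above is where the project expects it. The following is a minimal sketch, not part of the project: the helper name missing_paths is hypothetical, and the paths are simply the ones named in the steps, taken relative to the project folder.

```python
from pathlib import Path

# Paths named in the steps above, relative to the project folder.
EXPECTED = [
    "train/captions_train2014.json",
    "val/captions_val2014.json",
    "models/289999.npy",
    "test/images",
]


def missing_paths(project_root="."):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing before running main.py:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

Saved as a script in the project folder and run before main.py, it lists whatever is still missing.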

Errors and fixes

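At first there will be a lot of print errors. This is minor and expected: just add parentheses to the print calls, since the code was ported from Python 2.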
In base_model.py, change the line image_name = os.path.splitext(image_name)[0] to image_name = os.path.splitext(image_name)[0].split('/')[-1]; otherwise a file-not-found error is raised.
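
To see what this change does, here is a small sketch. It assumes the image name arrives as a relative path with forward slashes; the path ./test/images/cat.jpg is a made-up example, and result_stem is a hypothetical helper, not a function from the project. It also shows os.path.basename as a possible alternative to splitting on '/'.

```python
import os

example = "./test/images/cat.jpg"  # hypothetical image path

# Original line: only the extension is stripped, the directory part stays.
original = os.path.splitext(example)[0]

# Patched line from the post: keep only the part after the last '/'.
patched = os.path.splitext(example)[0].split('/')[-1]


def result_stem(image_path):
    """Return the file name without its directory and extension."""
    return os.path.splitext(os.path.basename(image_path))[0]


print(original)              # directory part is still there
print(patched)               # bare file name
print(result_stem(example))  # same result via os.path.basename
```

With the original line, the directory part ends up inside the name that is later joined onto the results folder, which would explain the missing-file error. On Windows, os.path.basename also splits on backslashes, so it may be the safer choice if paths are not guaranteed to use '/'.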
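Change plt.savefig(os.path.join(config.test_result_dir,image_name+'_result.jpg')) to plt.savefig(os.path.join(config.test_result_dir, image_name+'_result.png')); otherwise an error says that RGBA cannot be converted to JPG. The image being saved has four channels, so it cannot be saved as jpg. I have no idea how the author got this to run. ￣□￣｜｜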
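
The fourth channel is alpha (transparency), which JPG has no place to store, while PNG does. If a JPG were really required, the alpha channel would have to be flattened onto an opaque background first. The pure-Python sketch below only illustrates that idea for a single 8-bit pixel; flatten_pixel is a hypothetical helper, and in practice an imaging library would do this for the whole image.

```python
def flatten_pixel(rgba, background=(255, 255, 255)):
    """Composite one 8-bit RGBA pixel over an opaque background colour."""
    r, g, b, a = rgba
    # Integer form of out = c * alpha + bg * (1 - alpha), rounded to nearest,
    # with alpha scaled from the 0..255 range.
    return tuple(
        (c * a + bg * (255 - a) + 127) // 255
        for c, bg in zip((r, g, b), background)
    )


print(flatten_pixel((255, 0, 0, 255)))  # fully opaque: colour is kept
print(flatten_pixel((255, 0, 0, 0)))    # fully transparent: background shows
```

Saving as PNG, as in the fix above, avoids this step entirely because the alpha channel is preserved.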
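There were a few other miscellaneous small errors along the way that I no longer remember. These two gave me the most trouble, so they stuck with me. If I have missed anything, feel free to leave a comment.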
