FewRel Analysis

Dataset Analysis

glove.6B.50d.json

Word-to-vector lookup table.

Training set train.json and validation set val.json

  1. The validation set is split into two parts (split ratio unknown) to enable testing: sample a pair of input and standard-output files from the validation set.
  2. Format analysis
    file_name: JSON file storing the data in the following format
    {
        "P155": # relation id
        [
            {
                "token": ["Hot", "Dance", "Club", ...], # sentence
                "h": ["song for a future generation", "Q7561099", [[16, 17, ...]]], # head entity [word, id, location]
                "t": ["whammy kiss", "Q7990594", [[11, 12]]], # tail entity [word, id, location]
            },
            ...
        ],
        "P177":
        [
            ...
        ]
        ...
    }
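
To make the nesting concrete, here is a minimal sketch of walking a file in this shape. The miniature `sample` data and the helper name `iter_instances` are made up for illustration, and the key names simply follow the format shown above; adjust them if your copy of the data uses different keys.

```python
import json

# Hypothetical miniature of the data file, shaped like the format described above.
sample = json.loads("""
{
  "P155": [
    {
      "token": ["a", "b", "c", "d"],
      "h": ["a b", "Q1", [[0, 1]]],
      "t": ["d", "Q2", [[3]]]
    }
  ]
}
""")

def iter_instances(data):
    """Yield (relation_id, tokens, head_positions, tail_positions) per instance."""
    for relation_id, instances in data.items():
        for inst in instances:
            # Each entity is stored as [word, id, location].
            _head_word, _head_id, head_locs = inst["h"]
            _tail_word, _tail_id, tail_locs = inst["t"]
            # "location" is a list of position lists; this sketch takes the first.
            yield relation_id, inst["token"], head_locs[0], tail_locs[0]

rows = list(iter_instances(sample))
```

Taking only the first position list is a simplification made here; an entity that is mentioned more than once would carry several position lists.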

word_vec_file_name: JSON file storing word vectors in the following format
[
    {'word': 'the', 'vec': [0.418, 0.24968, ...]},
    {'word': ',', 'vec': [0.013441, 0.23682, ...]},
    ...
]
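
A list in this shape is usually turned into a word-to-index dictionary plus a matrix of vectors. The sketch below shows one way that could look; the function name `build_lookup`, the shortened two-dimensional vectors, and the extra `[UNK]` and `[PAD]` rows are assumptions for illustration, not something the source specifies.

```python
import json

# Hypothetical two-entry table in the format described above (vectors shortened).
word_vec_json = '[{"word": "the", "vec": [0.1, 0.2]}, {"word": ",", "vec": [0.3, 0.4]}]'

def build_lookup(entries):
    """Return (word2id, vectors), where vectors[word2id[w]] is the vector of w."""
    word2id = {}
    vectors = []
    for entry in entries:
        word2id[entry["word"]] = len(vectors)
        vectors.append(entry["vec"])
    # Assumption: append all-zero rows for unknown words and for padding.
    dim = len(vectors[0])
    for special in ("[UNK]", "[PAD]"):
        word2id[special] = len(vectors)
        vectors.append([0.0] * dim)
    return word2id, vectors

word2id, vectors = build_lookup(json.loads(word_vec_json))
```

The row index doubles as the token id, so the same dictionary can serve both the tokenization step and an embedding layer.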

max_length: The fixed length to which every sentence is extended.
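
As a rough illustration of what a fixed length implies, the sketch below pads short id sequences to max_length. Truncating sequences that are longer than max_length is an assumption added here (the description only mentions extending), and `to_fixed_length` and `pad_id` are hypothetical names.

```python
def to_fixed_length(token_ids, max_length, pad_id):
    """Return exactly max_length ids: pad short inputs, cut long ones."""
    # Assumption: inputs longer than max_length are truncated.
    kept = token_ids[:max_length]
    return kept + [pad_id] * (max_length - len(kept))

padded = to_fixed_length([5, 8, 2], max_length=5, pad_id=0)
shortened = to_fixed_length([5, 8, 2, 9, 4, 7], max_length=5, pad_id=0)
```

Here `padded` gains two trailing pad ids, while `shortened` keeps only its first five ids.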

case_sensitive: Whether data processing is case-sensitive; defaults to False.
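
One plausible reading of this flag is that tokens are lowercased before the vocabulary lookup when it is False. The sketch below assumes that behavior; `tokens_to_ids`, the tiny `vocab`, and `unk_id` are hypothetical.

```python
def tokens_to_ids(tokens, word2id, unk_id, case_sensitive=False):
    """Map tokens to ids, lowercasing first unless case_sensitive is True."""
    ids = []
    for tok in tokens:
        key = tok if case_sensitive else tok.lower()
        # Words missing from the vocabulary fall back to unk_id.
        ids.append(word2id.get(key, unk_id))
    return ids

vocab = {"hot": 0, "dance": 1}  # hypothetical lowercase vocabulary
insensitive = tokens_to_ids(["Hot", "Dance", "Club"], vocab, unk_id=2)
sensitive = tokens_to_ids(["Hot", "Dance", "Club"], vocab, unk_id=2, case_sensitive=True)
```

With a lowercase vocabulary, the case-sensitive call maps every capitalized token to the unknown id, which is why the default of False suits lowercase word-vector tables.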

reprocess: Whether to redo the pre-processing even if pre-processed files already exist; defaults to False.
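
The flag describes a common caching pattern: reuse the pre-processed files when they exist, unless a rebuild is forced. The sketch below is only an illustration of that pattern under assumed names (`load_or_process`, `cache_path`, a JSON cache file); it is not the loader's actual code.

```python
import json
import os
import tempfile

def load_or_process(cache_path, process_fn, reprocess=False):
    """Reuse cached results unless reprocess is True or no cache exists."""
    if not reprocess and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    result = process_fn()
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(result, f)
    return result

calls = []

def fake_process():
    # Stand-in for the expensive pre-processing step; counts its invocations.
    calls.append(1)
    return {"n": len(calls)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.json")
    first = load_or_process(path, fake_process)                   # no cache yet: runs
    second = load_or_process(path, fake_process)                  # cache hit: skipped
    third = load_or_process(path, fake_process, reprocess=True)   # forced rebuild
```

Only the first and third calls run the processing step; the second is served from the cache file.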

cuda: Whether to use CUDA; defaults to True.
