Reproducibility: An Essential Property of a Research Paper

Overview

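A good research paper must be reproducible; results on their own are meaningless. You need to tell others, step by step, how to arrive at the results of your analysis.
- Other researchers can check whether your results and process are rigorous and scientifically sound
- Other researchers can build on your work and extend it at particular stages
- Other researchers can follow the thread of your whole analysis and understand the content better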

The following is a summary of the Reproducible Research course on Coursera.

What a reproducible article contains

  • Title/Author list
  • Abstract
  • Body/Results
  • Supplementary Materials/the gory details
  • Code/Data/really gory details

What you need to do to ensure reproducibility

  • Are we doing good science? ##good data, a good team, focus, interest
  • Was any part of this analysis done by hand? ##do not process the data by hand
    • If so, are those parts precisely documented?
    • Does the documentation match reality?
  • Have we taught a computer to do as much as possible? ##script the data-processing steps so the computer runs them
  • Don't point and click ##avoid GUIs (graphical user interfaces)
  • Are we using a version control system? ##use version control such as GitHub to track how the work evolves
  • Have we documented our software environment? ##record your software environment (R sessionInfo)
  • Have we saved any output that we cannot reconstruct from original data + code? ##do not keep only the results
  • How far back in the analysis pipeline can we go before our results are no longer (automatically) reproducible?
    ##how the whole path from raw data to report is carried out
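
Two items in the checklist above, documenting the software environment and tying results back to the original data, can be handled by a script rather than by hand. The course notes refer to R's sessionInfo; the following is only a rough Python analogue, a minimal sketch using the standard library. The function names `describe_environment` and `sha256_of_file` and the example package name are illustrative assumptions, not part of the course material.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return a dict describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed; record that fact explicitly
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large raw data file need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # "pip" is only an example name; list the packages your analysis imports.
    print(json.dumps(describe_environment(["pip"]), indent=2))
```

Saving this output next to the report, together with a checksum of the raw data file, lets a reader confirm they are starting from the same inputs and a comparable environment.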

Where reproducibility falls short

  • Reproducible research is important, but does not necessarily solve the critical question of whether a data analysis is trustworthy
  • Reproducible research focuses on the most “downstream” aspect of research dissemination
  • Evidence-based data analysis would provide standardized, best practices for given scientific areas and questions
  • Gives reviewers an important tool without dramatically increasing the burden on them
  • More effort should be put into improving the quality of “upstream” aspects of scientific research

An example of a good reproducible paper

http://www.rpubs.com/rdpeng/13396

Detailed criteria for checking whether reproducibility has been achieved

  • Has either (1) a valid RPubs URL pointing to a data analysis document for this assignment been submitted, or (2) a complete PDF file presenting the data analysis been uploaded?
  • Is the document written in English?
  • Does the analysis include description and justification for any data transformations?
  • Does the document have a title that briefly summarizes the data analysis?
  • Does the document have a synopsis that describes and summarizes the data analysis in less than 10 sentences?
  • Is there a section titled “Data Processing” that describes how the data were loaded into R and processed for analysis?
  • Is there a section titled “Results” where the main results are presented?
  • Is there at least one figure in the document that contains a plot?
  • Are there at most 3 figures in this document?
  • Does the analysis start from the raw data file (i.e. the original .csv.bz2 file)?
  • Does the analysis address the question of which types of events are most harmful to population health?
  • Does the analysis address the question of which types of events have the greatest economic consequences?
  • Do all the results of the analysis (i.e. figures, tables, numerical summaries) appear to be reproducible?
  • Do the figure(s) have descriptive captions (i.e. there is a description near the figure of what is happening in the figure)?
  • As far as you can determine, does it appear that the work submitted for this project is the work of the student who submitted it?
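
Several of the criteria above come down to one practice: the analysis should start from the raw compressed file and produce its summaries in code. The assignment itself is done in R; the following is only a Python sketch of the same idea using the standard library. The function names, the file name, and the column names in the usage note are assumptions for illustration and should be replaced with whatever the actual data set uses.

```python
import bz2
import csv
from collections import Counter


def total_by_group(path, group_column, value_column, encoding="utf-8"):
    """Sum value_column per group_column, reading the compressed file directly."""
    totals = Counter()
    # bz2.open in text mode decompresses on the fly, so no hand-unpacked
    # intermediate copy of the raw data is needed.
    with bz2.open(path, mode="rt", newline="", encoding=encoding) as handle:
        for row in csv.DictReader(handle):
            raw = row[value_column].strip()
            # Treat an empty cell as zero; this is a choice worth documenting.
            totals[row[group_column]] += float(raw) if raw else 0.0
    return totals


def top_groups(totals, n=3):
    """Return the n groups with the largest totals, largest first."""
    return totals.most_common(n)
```

A call such as `top_groups(total_by_group("storm_data.csv.bz2", "EVTYPE", "FATALITIES"))` would then rank event types by total fatalities, assuming the file and columns carry those names. Because every step from raw file to ranking is in code, a reviewer can rerun it and check each transformation.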