Problem
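The random module generates pseudo-random numbers, so setting a seed should make the results reproducible.
On one occasion, however, the results still changed from run to run after the seed had been set.
The code was as follows: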
19 random.seed(args.seed)
20
21 # read all filenames in list files
22 with open(args.file, 'r') as f:
23     filenames = f.readlines()
24 filenames = [item.replace('\n', '') for item in filenames]
25 filenames.sort()
26 random.shuffle(filenames)
27 print('total filenames: {}'.format(len(filenames)))
28 print('==== filenames[:4]', filenames[:4])
29
30 # select patient, if None, all patients are selected
31 patient_ID = [filename[:4] for filename in filenames]
32 set_patient_ID = set(patient_ID)
33 print('total {} patients'.format(len(set_patient_ID)))
34 selected_patient_ID = list(set_patient_ID)
35 #selected_patient_ID.sort()
36 selected_patient_ID = selected_patient_ID[:args.patient_num]
37 #selected_patient_ID = list(set_patient_ID)[:args.patient_num]
38 print('select {} patients'.format(len(selected_patient_ID)))
39 print('selected patient IDs:\n', selected_patient_ID)
40 print('-'*100)
41 print('===== selected_patient_ID[:4]', selected_patient_ID[:4])
Solution
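The output of line 28 is the same on every run, but the output of line 41 is not.
The cause is line 32, where the list is converted to a set. A set is not sorted automatically: it is unordered, and its iteration order depends on the hash values of its elements. By default Python randomizes the hashes of str objects for each process (unless the PYTHONHASHSEED environment variable is set), and random.seed has no effect on this. As a result, list(set_patient_ID) on line 34 can come out in a different order on every run, even though the shuffle on line 26 is reproducible.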
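The effect can be reproduced outside the original script. The following is a minimal sketch, not the original code: the IDs `P000` to `P019` and the helper `set_order` are made up for illustration. It starts fresh interpreters with different `PYTHONHASHSEED` values and collects the order in which a set of strings is listed, with `random.seed(0)` called in every child process.

```python
import os
import subprocess
import sys

# Child script: seed `random`, then print the iteration order of a set of
# strings. The IDs are hypothetical stand-ins for the patient IDs.
CHILD = (
    "import random; random.seed(0); "
    "print(list(set(['P%03d' % i for i in range(20)])))"
)


def set_order(hash_seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Same hash seed: the set is listed in the same order both times.
    print(set_order(1) == set_order(1))
    # Different hash seeds: count how many distinct orderings appear.
    print(len({set_order(s) for s in range(1, 9)}))
```

Fixing `PYTHONHASHSEED` before the interpreter starts would also pin the order, but sorting the list does not depend on how the script is launched.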
The fix is to uncomment line 35, so that the list is sorted before it is sliced:
19 random.seed(args.seed)
20
21 # read all filenames in list files
22 with open(args.file, 'r') as f:
23     filenames = f.readlines()
24 filenames = [item.replace('\n', '') for item in filenames]
25 filenames.sort()
26 random.shuffle(filenames)
27 print('total filenames: {}'.format(len(filenames)))
28 print('==== filenames[:4]', filenames[:4])
29
30 # select patient, if None, all patients are selected
31 patient_ID = [filename[:4] for filename in filenames]
32 set_patient_ID = set(patient_ID)
33 print('total {} patients'.format(len(set_patient_ID)))
34 selected_patient_ID = list(set_patient_ID)
35 selected_patient_ID.sort()
36 selected_patient_ID = selected_patient_ID[:args.patient_num]
37 #selected_patient_ID = list(set_patient_ID)[:args.patient_num]
38 print('select {} patients'.format(len(selected_patient_ID)))
39 print('selected patient IDs:\n', selected_patient_ID)
40 print('-'*100)
41 print('===== selected_patient_ID[:4]', selected_patient_ID[:4])
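One thing to note about sorting: after line 35 the selection is always the first `patient_num` IDs in sorted order, so the shuffle no longer influences which patients are picked. If the selection should still depend on the seed, one possible variant (a sketch under assumptions, not the original code) is to remove duplicates with `dict.fromkeys`, which keeps first-seen order. The function name `select_patients` and its parameters are hypothetical.

```python
import random


def select_patients(filenames, patient_num, seed):
    """Pick patient IDs reproducibly, following the shuffled file order."""
    rng = random.Random(seed)   # local generator, leaves global state alone
    names = sorted(filenames)   # fixed starting order before shuffling
    rng.shuffle(names)
    # dict.fromkeys drops duplicates but keeps first-seen order
    # (dicts preserve insertion order in Python 3.7+), unlike set().
    patient_ids = list(dict.fromkeys(name[:4] for name in names))
    # A patient_num of None selects all patients, as in the original slice.
    return patient_ids[:patient_num]
```

Called twice with the same seed and the same set of filenames, this returns the same IDs in the same order, regardless of the hash seed of the process.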