Coursera Machine Learning, Week 9, Assignment 1

1. For which of the following problems would anomaly detection be a suitable algorithm? Answer: options 2 and 3.

Given data from credit card transactions, classify each transaction according to type of purchase (for example: food, transportation, clothing).

From a large set of primary care patient records, identify individuals who might have unusual health conditions.

In a computer chip fabrication plant, identify microchips that might be defective.

From a large set of hospital patient records, predict which patients have a particular disease (say, the flu).

2. Suppose you have trained an anomaly detection system for fraud detection, and your system flags anomalies when p(x) is less than ε, and you find on the cross-validation set that it is missing many fraudulent transactions (i.e., failing to flag them as anomalies). What should you do? Answer: option 2.

Decrease ε

Increase ε
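
To see why raising ε helps here, the following is a minimal sketch with a single Gaussian feature. The model parameters and data values are made up for illustration, and `gaussian_pdf` and `flag_anomalies` are hypothetical helper names, not part of the course material.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of a univariate Gaussian with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def flag_anomalies(xs, mu, sigma2, epsilon):
    """Return the values whose density p(x) falls below epsilon."""
    return [x for x in xs if gaussian_pdf(x, mu, sigma2) < epsilon]

# Hypothetical fitted model and data (assumed values, for illustration only).
mu, sigma2 = 0.0, 1.0
xs = [0.1, -0.5, 1.8, 2.5, -3.2]

few = flag_anomalies(xs, mu, sigma2, epsilon=0.01)   # strict threshold
more = flag_anomalies(xs, mu, sigma2, epsilon=0.1)   # looser threshold

print(few)   # only the most extreme value is flagged
print(more)  # a larger epsilon flags a superset of those values
```

Every example flagged at the smaller ε is still flagged at the larger one, so increasing ε can only catch more of the missed transactions (at the cost of possibly more false alarms).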

3. Suppose you are developing an anomaly detection system to catch manufacturing defects in airplane engines. Your model uses $p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$.

You have two features $x_1$ = vibration intensity and $x_2$ = heat generated. Both $x_1$ and $x_2$ take on values between 0 and 1 (and are strictly greater than 0), and for most "normal" engines you expect that $x_1 \approx x_2$. One of the suspected anomalies is that a flawed engine may vibrate very intensely even without generating much heat (large $x_1$, small $x_2$), even though the particular values of $x_1$ and $x_2$ may not fall outside their typical ranges of values. What additional feature $x_3$ should you create to capture these types of anomalies? Answer: option 2.

$x_3 = x_1 + x_2$

$x_3 = x_1 / x_2$

$x_3 = 1 / x_1$

$x_3 = 1 / x_2$
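
A rough sketch of why the ratio feature works: with independent per-feature Gaussians, an engine with large $x_1$ and small $x_2$ still gets a fairly ordinary density, but once $x_3 = x_1 / x_2$ is added its density collapses. The training data, the two test engines, and all function names below are assumptions made up for illustration.

```python
import math

def fit_gaussian(values):
    """Maximum-likelihood mean and variance (dividing by m) of one feature."""
    mu = sum(values) / len(values)
    sigma2 = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, sigma2

def gaussian_pdf(x, mu, sigma2):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def density(example, params):
    """p(x) as the product of the per-feature Gaussian densities."""
    p = 1.0
    for x, (mu, sigma2) in zip(example, params):
        p *= gaussian_pdf(x, mu, sigma2)
    return p

def with_ratio(x1, x2):
    """Append the extra feature x3 = x1 / x2."""
    return (x1, x2, x1 / x2)

# Hypothetical "normal" engines with x1 roughly equal to x2.
normal_engines = [(0.2, 0.22), (0.4, 0.38), (0.5, 0.5), (0.6, 0.63),
                  (0.8, 0.79), (0.9, 0.88), (0.3, 0.31), (0.7, 0.68)]
typical = (0.5, 0.5)   # an ordinary engine
flawed = (0.8, 0.25)   # strong vibration, little heat; each value is in range

# Fit one Gaussian per feature, without and with the ratio feature.
params2 = [fit_gaussian(col) for col in zip(*normal_engines)]
params3 = [fit_gaussian(col) for col in zip(*(with_ratio(*e) for e in normal_engines))]

p2_typical, p2_flawed = density(typical, params2), density(flawed, params2)
p3_typical = density(with_ratio(*typical), params3)
p3_flawed = density(with_ratio(*flawed), params3)

print("two features:  ", p2_typical, p2_flawed)
print("with x3=x1/x2: ", p3_typical, p3_flawed)
```

With only $x_1$ and $x_2$ the two densities are the same order of magnitude, so no threshold ε separates them cleanly; with $x_3$ the flawed engine's density is effectively zero.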

4. Which of the following are true? Check all that apply. Answer: options 1 and 4.

When choosing features for an anomaly detection system, it is a good idea to look for features that take on unusually large or small values for (mainly the) anomalous examples.

If you are developing an anomaly detection system, there is no way to make use of labeled data to improve your system.

If you have a large labeled training set with many positive examples and many negative examples, the anomaly detection algorithm will likely perform just as well as a supervised learning algorithm such as an SVM.

If you do not have any labeled data (or if all your data has label y=0), then it is still possible to learn p(x), but it may be harder to evaluate the system or choose a good value of ε.
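
One common way labeled data does help is in choosing ε: score each candidate threshold on a labeled cross-validation set and keep the one with the best F1. The sketch below assumes the densities p(x) have already been computed; the numbers, labels, candidate list, and function names are hypothetical.

```python
def f1_score(labels, predictions):
    """F1 for the positive (anomalous, y=1) class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(densities, labels, candidates):
    """Return the candidate epsilon with the highest F1, and that F1."""
    best_eps, best_f1 = None, -1.0
    for eps in candidates:
        predictions = [1 if p < eps else 0 for p in densities]
        score = f1_score(labels, predictions)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Hypothetical cross-validation densities p(x) and labels (1 = anomaly).
cv_densities = [0.40, 0.35, 0.30, 0.08, 0.02, 0.005, 0.28, 0.001]
cv_labels = [0, 0, 0, 0, 1, 1, 0, 1]
candidates = [0.001, 0.01, 0.05, 0.1, 0.3]

best_eps, best_f1 = select_epsilon(cv_densities, cv_labels, candidates)
print(best_eps, best_f1)  # 0.05 1.0
```

Without any y=1 examples this F1 cannot be computed, which is why evaluating the system and tuning ε become harder in the unlabeled case.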

5. You have a 1-D dataset $\{x^{(1)}, \ldots, x^{(m)}\}$ and you want to detect outliers in the dataset. You first plot the dataset and it looks like this:

[Figure: plot of the 1-D dataset (image not preserved)]

Suppose you fit the Gaussian distribution parameters $\mu_1$ and $\sigma_1^2$ to this dataset. Which of the following values for $\mu_1$ and $\sigma_1^2$ might you get? Answer: option 1.

$\mu_1 = -3$, $\sigma_1^2 = 4$

$\mu_1 = -6$, $\sigma_1^2 = 4$

$\mu_1 = -3$, $\sigma_1^2 = 2$

$\mu_1 = -6$, $\sigma_1^2 = 2$
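
For reference, fitting these parameters means taking the sample mean and the average squared deviation from it. The sketch below uses a small made-up dataset, not the one from the quiz plot, and `fit_gaussian` is a hypothetical helper name.

```python
def fit_gaussian(values):
    """Maximum-likelihood estimates: mean and variance (dividing by m)."""
    m = len(values)
    mu = sum(values) / m
    sigma2 = sum((v - mu) ** 2 for v in values) / m
    return mu, sigma2

# Hypothetical 1-D data (assumed values, for illustration only).
xs = [-6.0, -6.0, -4.0, -4.0, -3.0, -3.0, -2.0, -2.0, 0.0, 0.0]

mu, sigma2 = fit_gaussian(xs)
print(mu, sigma2)  # -3.0 4.0
```

When reading such a plot, $\mu_1$ is roughly the center of the points, and since $\sigma_1$ is the square root of $\sigma_1^2$, a variance of 4 corresponds to most points lying within about two standard deviations, i.e. within 4 units, of the center.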
