Coursera Machine Learning, Week 9 final quiz: Recommender Systems (with solution notes)

1. Suppose you run a bookstore, and have ratings (1 to 5 stars) of books. Your collaborative filtering algorithm has learned a parameter vector θ(j) for user j, and a feature vector x(i) for each book. You would like to compute the "training error", meaning the average squared error of your system's predictions on all the ratings that you have gotten from your users. Which of these are correct ways of doing so (check all that apply)? For this problem, let m be the total number of ratings you have gotten from your users. (Another way of saying this is that m = Σ_i Σ_j r(i,j), where r(i,j) = 1 if user j has rated book i and 0 otherwise.) [Hint: Two of the four options below are correct.]

Answer: A and C. This is a basic-concepts question. Pay attention to the index on theta and do not mix it up with the index on x.
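As a minimal sketch of the quantity the question asks about (not a reproduction of the quiz's answer options, which are not shown here), the average squared error over the m observed ratings could be computed as follows. The toy values of `theta`, `x`, `y`, and `r` are made up for illustration; `theta[j]` stands for user j's parameter vector and `x[i]` for book i's feature vector.

```python
def training_error(theta, x, y, r):
    """Average squared error over the rated (book i, user j) pairs.

    theta[j]: parameter vector of user j
    x[i]:     feature vector of book i
    y[i][j]:  rating of book i by user j (only used when r[i][j] == 1)
    r[i][j]:  1 if user j rated book i, else 0
    """
    total = 0.0
    m = 0  # number of ratings actually observed
    for i, x_i in enumerate(x):
        for j, theta_j in enumerate(theta):
            if r[i][j] == 1:
                # Prediction is the inner product of theta(j) and x(i).
                pred = sum(t * f for t, f in zip(theta_j, x_i))
                total += (pred - y[i][j]) ** 2
                m += 1
    return total / m


# Hypothetical toy data: 2 books, 2 users, 2 features.
x = [[1.0, 0.0], [0.0, 1.0]]
theta = [[5.0, 1.0], [2.0, 4.0]]
y = [[4.0, 0.0], [1.0, 3.0]]
r = [[1, 0], [1, 1]]

print(training_error(theta, x, y, r))
```

Note that the prediction for the pair (i, j) uses user j's theta with book i's x, which is exactly the indexing the answer warns about.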

2. In which of the following situations will a collaborative filtering system be the most appropriate learning algorithm (compared to linear or logistic regression)?

You've written a piece of software that has downloaded news articles from many news websites. In your system, you also keep track of which articles you personally like vs. dislike, and the system also stores away features of these articles (e.g., word counts, name of author). Using this information, you want to build a system to try to find additional new articles that you personally will like.

You manage an online bookstore and you have the book ratings from many users. For each user, you want to recommend other books she will enjoy, based on her own ratings and the ratings of other users.

You manage an online bookstore and you have the book ratings from many users. You want to learn to predict the expected sales volume (number of books sold) as a function of the average rating of a book.

You run an online news aggregator, and for every user, you know some subset of articles that the user likes and some different subset that the user dislikes. You'd want to use this to find other articles that the user likes.

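Answer: B and D. Collaborative filtering suits settings with many users and many items, where a large number of features and parameters have to be learned from the ratings data.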

A: The personal news-article system, in which you track which articles you like or dislike and also store features of the articles (e.g., word counts, author name), and want to find new articles you will like.

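I got this one wrong at first and thought it was a correct choice. I do not fully understand it yet and will update this note once I do. Readers who have an explanation are welcome to tell me in the comments. Thanks.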

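B: The online bookstore with ratings from many users, where you recommend books to each user based on her own ratings and the ratings of other users. This is like the movie recommendation example from the lectures. It is a recommender system.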

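C: The online bookstore where you predict expected sales volume (number of books sold) from a book's average rating. This prediction is clearly better handled by linear regression or a similar algorithm. Collaborative filtering is mainly used in recommender systems.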

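D: The online news aggregator where, for every user, you know some articles the user likes and some others the user dislikes, and you want to find other articles the user will like. This recommender system is similar to B.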

3. You run a movie empire, and want to build a movie recommendation system based on collaborative filtering. There were three popular review websites (which we'll call A, B and C) that users go to in order to rate movies, and you have just acquired all three companies that run these websites. You'd like to merge the three companies' data sets together to build a single/unified system. On website A, users rank a movie as having 1 through 5 stars. On website B, users rank on a scale of 1-10, and decimal values (e.g., 7.5) are allowed. On website C, the ratings are from 1 to 100. You also have enough information to identify users/movies on one website with users/movies on a different website. Which of the following statements is true?

It is not possible to combine these websites' data. You must build three separate recommendation systems.

You can combine all three training sets into one without any modification and expect high performance from a recommendation system.

You can merge the three datasets into one, but you should first normalize each dataset's ratings (say, rescale each dataset's ratings to a 1-100 range).

Assuming that there is at least one movie/user in one database that doesn't also appear in a second database, there is no sound way to merge the datasets, because of the missing data.

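Answer: C. The ratings from sites A, B, and C have different scales and therefore different means. To merge the three datasets into one, each dataset's ratings must first be rescaled, much like the feature scaling discussed earlier for linear and logistic regression. In the lectures the movie ratings were not rescaled because all of them were already comparable (for example, 1 to 5 stars), so their scales were similar and no scaling was needed. Note how that differs from this question.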
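One simple way to do the rescaling that option C suggests is a linear (min-max) map onto the 1-100 range. The sketch below is only an illustration under that assumption: the helper name `rescale` and the sample ratings are hypothetical, and other normalizations (for instance, mean normalization per site) would also be reasonable.

```python
def rescale(rating, lo, hi, new_lo=1.0, new_hi=100.0):
    """Linearly map a rating from the range [lo, hi] onto [new_lo, new_hi]."""
    return new_lo + (rating - lo) * (new_hi - new_lo) / (hi - lo)


# Hypothetical ratings of the same movie on the three sites.
site_a = rescale(4, 1, 5)      # site A: 1 to 5 stars
site_b = rescale(7.5, 1, 10)   # site B: 1 to 10, decimals allowed
site_c = rescale(80, 1, 100)   # site C: already on 1 to 100

print(site_a, site_b, site_c)
```

After rescaling, the endpoints of every site's scale coincide (lowest rating maps to 1, highest to 100), so ratings from the three sites can be placed in one ratings matrix.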

4. Which of the following are true of collaborative filtering systems? Check all that apply.

Suppose you are writing a recommender system to predict a user's book preferences. In order to build such a system, you need that user to rate all the other books in your training set.

For collaborative filtering, the optimization algorithm you should use is gradient descent. In particular, you cannot use more advanced optimization algorithms (L-BFGS/conjugate gradient/etc.) for collaborative filtering, since you have to solve for both the x(i)'s and θ(j)'s simultaneously.

For collaborative filtering, it is possible to use one of the advanced optimization algorithms (L-BFGS/conjugate gradient/etc.) to solve for both the x(i)'s and θ(j)'s simultaneously.

Even if each user has rated only a small fraction of all of your products (so r(i,j)=0 for the vast majority of (i,j) pairs), you can still build a recommender system by using collaborative filtering.

Answer: C and D.

A: The claim that, to build a book recommender, the user must rate all the other books in your training set. In fact, there is no need for the user to rate every book.

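B: The claim that gradient descent is the only option, and that more advanced optimization algorithms (L-BFGS/conjugate gradient/etc.) cannot be used because x(i) and θ(j) must be solved for simultaneously. This is false: more advanced algorithms can be used, as mentioned in the lecture slides.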
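To make the points behind C and D concrete, here is a rough sketch of a collaborative filtering cost that sums only over rated pairs and returns gradients for all x(i) and θ(j) at once. The function names, the toy data, and the learning rate are assumptions made for illustration. Plain gradient descent is used here only to keep the example short; an advanced optimizer such as L-BFGS or conjugate gradient would consume the same cost-and-gradient pair.

```python
def cost_and_grads(x, theta, y, r, lam=0.0):
    """Collaborative filtering cost and gradients, using rated entries only."""
    n = len(x[0])
    # Start from the regularization terms (zero when lam == 0).
    x_grad = [[lam * v for v in row] for row in x]
    theta_grad = [[lam * v for v in row] for row in theta]
    cost = 0.5 * lam * (sum(v * v for row in x for v in row)
                        + sum(v * v for row in theta for v in row))
    for i in range(len(x)):
        for j in range(len(theta)):
            if r[i][j] == 1:  # unrated pairs contribute nothing
                err = sum(theta[j][k] * x[i][k] for k in range(n)) - y[i][j]
                cost += 0.5 * err * err
                for k in range(n):
                    x_grad[i][k] += err * theta[j][k]
                    theta_grad[j][k] += err * x[i][k]
    return cost, x_grad, theta_grad


def gradient_step(x, theta, y, r, alpha=0.01, lam=0.0):
    """One simultaneous update of every x(i) and theta(j)."""
    _, x_grad, theta_grad = cost_and_grads(x, theta, y, r, lam)
    new_x = [[v - alpha * g for v, g in zip(row, g_row)]
             for row, g_row in zip(x, x_grad)]
    new_theta = [[v - alpha * g for v, g in zip(row, g_row)]
                 for row, g_row in zip(theta, theta_grad)]
    return new_x, new_theta


# Hypothetical sparse data: 3 items, 3 users, only 4 of 9 ratings observed.
x = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
theta = [[0.2, 0.1], [-0.1, 0.3], [0.4, 0.2]]
y = [[5, 0, 0], [0, 3, 0], [4, 0, 1]]
r = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]

initial_cost, _, _ = cost_and_grads(x, theta, y, r)
for _ in range(100):
    x, theta = gradient_step(x, theta, y, r)
final_cost, _, _ = cost_and_grads(x, theta, y, r)

print(initial_cost, final_cost)
```

Even though most entries of `r` are 0, the cost falls as the features and parameters are fitted together, which is the situation option D describes.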

5. Suppose you have two matrices A and B, where A is 5x3 and B is 3x5. Their product is C=AB, a 5x5 matrix. Furthermore, you have a 5x5 matrix R where every entry is 0 or 1. You want to find the sum of all elements C(i,j) for which the corresponding R(i,j) is 1, and ignore all elements C(i,j) where R(i,j)=0. One way to do so is the following code:

 

Which of the following pieces of Octave code will also correctly compute this total? Check all that apply. Assume all options are in code.

total = sum(sum((A * B) .* R))

C = (A * B) .* R; total = sum(C(:));

total = sum(sum((A * B) * R));

C = (A * B) * R; total = sum(C(:));

Answer: A and B.

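The task is to sum the elements of C at the positions where the corresponding element of R is 1, so element-wise multiplication (.*) is required. Both matrices are 5x5, so an ordinary matrix product (*) also yields a 5x5 matrix, but each of its entries mixes a whole row of C with a whole column of R and cannot pick out the element at a single position.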
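The difference can be checked numerically. The sketch below is written in plain Python rather than Octave, and the matrices are made-up test data. It compares an explicit loop over the positions where R is 1, the element-wise product used in options A and B, and the ordinary matrix product used in the other two options.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def total_loop(a, b, r):
    """Sum C(i,j) over the positions where R(i,j) == 1, using explicit loops."""
    c = matmul(a, b)
    total = 0
    for i in range(len(c)):
        for j in range(len(c[0])):
            if r[i][j] == 1:
                total += c[i][j]
    return total


def total_elementwise(a, b, r):
    """Counterpart of sum(sum((A * B) .* R)): multiply entry by entry."""
    c = matmul(a, b)
    return sum(c[i][j] * r[i][j]
               for i in range(len(c)) for j in range(len(c[0])))


def total_matrix_product(a, b, r):
    """Counterpart of sum(sum((A * B) * R)): a second matrix product."""
    return sum(v for row in matmul(matmul(a, b), r) for v in row)


# Hypothetical test data: A is 5x3, B is 3x5, R is a 5x5 matrix of 0s and 1s.
A = [[i + k + 1 for k in range(3)] for i in range(5)]
B = [[j + 1 for j in range(5)] for _ in range(3)]
R = [[1 if (i + j) % 2 == 0 else 0 for j in range(5)] for i in range(5)]

print(total_loop(A, B, R), total_elementwise(A, B, R),
      total_matrix_product(A, B, R))
```

For this data the loop and the element-wise version agree, while the matrix-product version gives a different total.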

