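With the ALS and LR models trained, the next step is to use them in the application code.
Start with ALS. Its recommendations are already stored in the database, so recall only has to read the row for the user and split the comma-separated string into shop ids.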
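import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// RecommendDO and RecommendDOMapper are the project's own classes (imports omitted).
@Service
public class RecommendService {

    @Autowired
    private RecommendDOMapper recommendDOMapper;

    // Recall: look up the stored shop id list for the given user id.
    public List<Integer> recall(Integer userId) {
        RecommendDO recommendDO = recommendDOMapper.selectByPrimaryKey(userId);
        if (recommendDO == null) {
            // No row for this user: fall back to the row stored under id 99999.
            recommendDO = recommendDOMapper.selectByPrimaryKey(99999);
        }
        String[] shopIdArr = recommendDO.getRecommend().split(",");
        List<Integer> shopIdList = new ArrayList<>();
        for (String shopId : shopIdArr) {
            // trim() so that a stray space next to a comma does not break parsing
            shopIdList.add(Integer.valueOf(shopId.trim()));
        }
        return shopIdList;
    }
}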
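The parsing step inside recall can be tried on its own, without the database. The sketch below is only an illustration: the class name RecommendParser, the method parseShopIds, and the decision to skip empty tokens are assumptions made here, not part of the project code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class RecommendParser {

    // Turns a stored string such as "12, 7,301" into the list [12, 7, 301].
    static List<Integer> parseShopIds(String recommend) {
        List<Integer> shopIds = new ArrayList<>();
        if (recommend == null) {
            return shopIds;
        }
        for (String token : recommend.split(",")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) { // assumption: empty tokens (e.g. from "1,,2") are skipped
                shopIds.add(Integer.valueOf(trimmed));
            }
        }
        return shopIds;
    }

    public static void main(String[] args) {
        System.out.println(parseShopIds("12, 7,301,")); // prints [12, 7, 301]
    }
}
```

A token that is not a number still raises NumberFormatException, which is probably the right outcome for corrupted rows.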
For LR:
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3

import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.SparkSession;
import org.springframework.stereotype.Service;

// ShopSortModel is the project's own class (import omitted).
@Service
public class RecommendSortService {

    private SparkSession spark;
    private LogisticRegressionModel lrModel;

    @PostConstruct
    public void init() {
        // Initialize the Spark runtime
        spark = SparkSession.builder()
                .master("local")
                .appName("DianpingApp")
                .getOrCreate();
        lrModel = LogisticRegressionModel.load("file:///F:/mouseSpace/project/background/lr/lrmodel");
    }

    public List<Integer> sort(List<Integer> shopIdList, Integer userId) {
        // Build the 11-dimensional feature vector x that the LR model expects, then call the prediction method
        List<ShopSortModel> list = new ArrayList<>();
        for (Integer shopId : shopIdList) {
            // Fabricated placeholder features
            Vector v = Vectors.dense(1, 0, 0, 0, 0, 1, 0.6, 0, 0, 1, 0);
            Vector result = lrModel.predictProbability(v);
            double[] arr = result.toArray();
            double score = arr[1]; // probability of label 1
            // lrModel.predict(v) would return only the label, 1 or 0
            ShopSortModel shopSortModel = new ShopSortModel();
            shopSortModel.setShopId(shopId);
            shopSortModel.setScore(score);
            list.add(shopSortModel);
        }
        // Highest score first, so the most likely shops lead the list
        list.sort((o1, o2) -> Double.compare(o2.getScore(), o1.getScore()));
        return list.stream().map(ShopSortModel::getShopId).collect(Collectors.toList());
    }
}
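The commented-out line in the loop points at the difference between the two prediction calls: predictProbability yields a probability per class, while predict yields only the label. For a binary logistic regression the two are related roughly as in the sketch below, which needs no Spark. The weights, the intercept, and the 0.5 threshold are made-up illustrative values, not those of the trained lrModel.

```java
// Illustration only: hypothetical weights, not the trained lrModel.
final class LogisticScoreSketch {

    // P(label = 1 | x) = 1 / (1 + exp(-(w . x + b)))
    static double probabilityOfOne(double[] w, double b, double[] x) {
        double z = b;
        for (int i = 0; i < w.length; i++) {
            z += w[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Hard label: 1 if the probability exceeds the threshold, otherwise 0.
    static int label(double probability, double threshold) {
        return probability > threshold ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] w = {0.8, -0.4}; // made-up weights
        double b = 0.1;           // made-up intercept
        double pA = probabilityOfOne(w, b, new double[]{1.0, 0.0}); // z = 0.9
        double pB = probabilityOfOne(w, b, new double[]{1.0, 1.0}); // z = 0.5
        // Both shops get label 1, but only the probabilities can rank A above B.
        System.out.println(label(pA, 0.5) + " " + label(pB, 0.5) + " " + (pA > pB)); // prints 1 1 true
    }
}
```

This is why the ranking uses arr[1] as the score: labels alone would tie most shops and carry no ordering.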
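The feature vector in this code is fabricated and identical for every shop, so every shop receives the same score and the resulting order is not a real ranking.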
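The effect of the fabricated features can be seen without Spark. In the sketch below the scorer is a stand-in for the model, and the class and method names are assumptions for illustration. When every shop gets the same score, a stable sort returns the recall order unchanged; only a scorer that depends on the shop can reorder the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustration only: the scorer stands in for a trained model.
final class RankSketch {

    // Orders shop ids by score, highest first. List.sort is stable, so ties keep their input order.
    static List<Integer> rank(List<Integer> shopIds, ToDoubleFunction<Integer> scorer) {
        List<Integer> ranked = new ArrayList<>(shopIds);
        ranked.sort((a, b) -> Double.compare(scorer.applyAsDouble(b), scorer.applyAsDouble(a)));
        return ranked;
    }

    public static void main(String[] args) {
        List<Integer> recalled = Arrays.asList(30, 10, 20);
        // Same score for every shop, as with one shared feature vector: order is unchanged.
        System.out.println(rank(recalled, id -> 0.7));        // prints [30, 10, 20]
        // A scorer that depends on the shop reorders the list.
        System.out.println(rank(recalled, id -> id / 100.0)); // prints [30, 20, 10]
    }
}
```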
For GBDT:
The service is almost identical to the LR one.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3

import org.apache.spark.ml.classification.GBTClassificationModel;
import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.SparkSession;
import org.springframework.stereotype.Service;

// ShopSortModel is the project's own class (import omitted).
@Service // needed so that Spring creates the bean and runs init()
public class GBDTRecommendSortService {

    private SparkSession spark;
    private GBTClassificationModel gbtClassificationModel;

    @PostConstruct
    public void init() {
        // Initialize the Spark runtime
        spark = SparkSession.builder()
                .master("local")
                .appName("DianpingApp")
                .getOrCreate();
        gbtClassificationModel = GBTClassificationModel.load("file:///F:/mouseSpace/project/background/lr/gbdtmodel");
    }

    public List<Integer> sort(List<Integer> shopIdList, Integer userId) {
        // Build the 11-dimensional feature vector x that the GBDT model expects, then call the prediction method
        List<ShopSortModel> list = new ArrayList<>();
        for (Integer shopId : shopIdList) {
            // Fabricated placeholder features
            Vector v = Vectors.dense(1, 0, 0, 0, 0, 1, 0.6, 0, 0, 1, 0);
            Vector result = gbtClassificationModel.predictProbability(v);
            double[] arr = result.toArray();
            double score = arr[1]; // probability of label 1
            // gbtClassificationModel.predict(v) would return only the label, 1 or 0
            ShopSortModel shopSortModel = new ShopSortModel();
            shopSortModel.setShopId(shopId);
            shopSortModel.setScore(score);
            list.add(shopSortModel);
        }
        // Highest score first, so the most likely shops lead the list
        list.sort((o1, o2) -> Double.compare(o2.getScore(), o1.getScore()));
        return list.stream().map(ShopSortModel::getShopId).collect(Collectors.toList());
    }
}
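A/B Test
An A/B test gives us real-world evidence for deciding which algorithm is better.
In production, suppose the current ranker is LR. Replacing it with GBDT all at once is risky, and this is where an A/B test comes in. If a response holds 10 items, 5 can be ranked by LR and 5 by GBDT. The two lists are interleaved into a single result set and sent to the front end, and the recorded click-through rate then shows which algorithm performs better.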
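The interleaving step can be sketched as follows. This is a minimal illustration, not the project's implementation: the class AbInterleaveSketch, the Slot type, and the "LR" / "GBDT" tags are assumptions, and the two input lists are assumed to be already ranked and to contain different shops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: names and the tagging scheme are assumptions.
final class AbInterleaveSketch {

    // One position in the merged result: the shop and the algorithm that supplied it.
    static final class Slot {
        final int shopId;
        final String algorithm;

        Slot(int shopId, String algorithm) {
            this.shopId = shopId;
            this.algorithm = algorithm;
        }

        @Override
        public String toString() {
            return algorithm + ":" + shopId;
        }
    }

    // Alternates the two ranked lists: LR item, GBDT item, LR item, and so on.
    // If one list is longer, its remaining items are appended in order.
    static List<Slot> interleave(List<Integer> lrRanked, List<Integer> gbdtRanked) {
        List<Slot> merged = new ArrayList<>();
        int n = Math.max(lrRanked.size(), gbdtRanked.size());
        for (int i = 0; i < n; i++) {
            if (i < lrRanked.size()) {
                merged.add(new Slot(lrRanked.get(i), "LR"));
            }
            if (i < gbdtRanked.size()) {
                merged.add(new Slot(gbdtRanked.get(i), "GBDT"));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Slot> merged = interleave(Arrays.asList(1, 2, 3), Arrays.asList(7, 8, 9));
        System.out.println(merged); // prints [LR:1, GBDT:7, LR:2, GBDT:8, LR:3, GBDT:9]
    }
}
```

Keeping the algorithm tag next to each shop id lets the click log attribute every click to LR or GBDT, so a click-through rate (clicks divided by impressions) can be computed per algorithm.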