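At the start of this semester I entered the Huawei Software Elite Challenge. After a month of work I ended up with a barely satisfactory result: 40th place in the preliminary round of the Hangzhou-Xiamen division. Scores from 20th to 40th place were extremely close, and I missed advancing by 1 point. I simply wasn't good enough; a tiny gap in score may well reflect a large gap in skill. The barrier to entry this year was fairly low. Many people called it the weakest edition of the Huawei contest, and some called it a "Huawei parameter-tuning contest". Making the division's top 64 was itself a stroke of luck.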
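The problem:
In brief: given virtual machine usage data from February and March, predict VM usage for the following 1 to 2 weeks (this part calls for machine learning modeling), then place the predicted VMs onto physical servers so that resource utilization is maximized (this part is a bin packing problem).
Full problem statement: http://codecraft.devcloud.huaweicloud.com/home/detail
General approach: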
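For prediction I used a simple multiple linear regression: the previous 50 days of data plus a time parameter are the regressors, and the model is fitted by gradient descent. Data preprocessing turned out to be a key point; in the end I used a weighted average. A few days ago I found that I had written the exponential smoothing formula incorrectly in my code, which is why exponential smoothing scored much lower than the weighted average. I have heard that triple exponential smoothing alone could reach 240+. This was a lesson for me: read the documentation carefully, and don't rush into coding after reading only half of it. The teams that made the finals probably implemented more advanced models such as LSTM or ARIMA.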
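For reference, here is a minimal sketch of first-order (simple) exponential smoothing, the recurrence that is easy to get wrong. It is not the code from my contest submission and not the triple variant mentioned above; the class name `ExponentialSmoothing`, the method `smooth`, and the sample data are made up for illustration.

```java
import java.util.Arrays;

public class ExponentialSmoothing {
    /**
     * Simple exponential smoothing:
     *   s[0] = x[0]
     *   s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
     * alpha in (0, 1]; larger alpha gives more weight to recent values.
     */
    static double[] smooth(double[] x, double alpha) {
        double[] s = new double[x.length];
        if (x.length == 0) {
            return s;
        }
        s[0] = x[0];
        for (int t = 1; t < x.length; t++) {
            s[t] = alpha * x[t] + (1 - alpha) * s[t - 1];
        }
        return s;
    }

    public static void main(String[] args) {
        double[] daily = {4, 0, 8, 2}; // made-up daily request counts
        System.out.println(Arrays.toString(smooth(daily, 0.5)));
        // prints [4.0, 2.0, 5.0, 3.5]
    }
}
```

The point to check is that each smoothed value mixes the current raw value with the previous smoothed value, not with the previous raw value.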
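For packing, I read up on first-fit, next-fit and FFD (first-fit decreasing). To raise my score I added heuristic algorithms on top of these, but during the practice phase the score never changed. Later I switched to a knapsack approach: use dynamic programming to fill one knapsack (server) as full as possible, then open a new one. When I excitedly submitted it, the score was still the same...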
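As an illustration of the FFD idea for two resources, here is a minimal sketch. It is not the version I submitted; the class name `FirstFitDecreasing`, the `{cpu, mem}` array layout, and the choice to sort by CPU first are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FirstFitDecreasing {
    /**
     * Packs VMs, each given as {cpu, mem}, into identical servers and
     * returns the number of servers opened.
     */
    static int pack(int[][] vms, int cpuCap, int memCap) {
        int[][] sorted = vms.clone();
        // "Decreasing": largest CPU first, ties broken by larger memory.
        Arrays.sort(sorted, (a, b) -> a[0] != b[0]
                ? Integer.compare(b[0], a[0])
                : Integer.compare(b[1], a[1]));
        List<int[]> free = new ArrayList<>(); // remaining {cpu, mem} per open server
        for (int[] vm : sorted) {
            if (vm[0] > cpuCap || vm[1] > memCap) {
                throw new IllegalArgumentException("VM does not fit in an empty server");
            }
            boolean placed = false;
            // "First fit": take the first open server with enough of both resources.
            for (int[] s : free) {
                if (s[0] >= vm[0] && s[1] >= vm[1]) {
                    s[0] -= vm[0];
                    s[1] -= vm[1];
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                free.add(new int[] {cpuCap - vm[0], memCap - vm[1]});
            }
        }
        return free.size();
    }

    public static void main(String[] args) {
        int[][] vms = {{4, 8}, {4, 8}, {2, 4}, {2, 4}, {8, 8}};
        System.out.println(pack(vms, 8, 16)); // prints 3
    }
}
```

With two resources there is no single natural "size" to sort by, so the sort key is a design choice; sorting by the resource being optimized is one plausible option.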
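For the official preliminary submission I had about 10 versions backed up. The FFD packing version scored 220, the simulated annealing version ran into a resource over-allocation problem (probably a bug in my code), and the knapsack version scored 237.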
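Complaints:
Halfway through the competition, Huawei changed something on the backend, and most contestants' scores dropped by 8 or 9 points. Huawei gave no meaningful explanation, saying only that "the evaluation mechanism has changed". This left most people stuck; by the end of the preliminary round I still did not know what had changed.
I recruited 2 classmates to form a team, but I ended up writing all of the code myself. Working alone is very limiting, especially for the machine learning part. Never having found like-minded computer science friends at university is also a regret.
PS: If anyone at Hangzhou Dianzi University enters the Huawei Software Elite Challenge next year, feel free to team up with me!!!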
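Code:
I suspect nobody but me can understand this program. It was written purely for this contest, and along the way even I lost track of what some of the variables mean.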
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
public class Predict {
public static String[] predictVm(String[] ecsContent, String[] inputContent) {
/** =========do your work here========== **/
// Virtual machine class
class VirtualMachine {
String vmname;
int cpu;
int mem;
int index;// position of this flavor in VMKINDList (used when counting history and when packing)
public VirtualMachine(String vmname, int cpu, int mem) {
super();
this.vmname = vmname;
this.cpu = cpu;
this.mem = mem / 1024; // the argument is expected in MB and is stored in GB
}
public VirtualMachine() {
super();
// TODO Auto-generated constructor stub
}
@Override
public VirtualMachine clone() {
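// Copy the fields directly: the three-argument constructor would divide mem by 1024 a second time.
VirtualMachine copy = new VirtualMachine();
copy.vmname = vmname;
copy.cpu = cpu;
copy.mem = mem;
copy.index = index;
return copy;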
}
}
// Physical server class
class Server {
int CPU;
int MEM;
int avariable_CPU_SIZE;
int avariable_MEM_SIZE;
Map<VirtualMachine,Integer> VM_MAP= new HashMap<>();
public Server(int cPU, int mEM) {
CPU = cPU;
MEM = mEM;
this.avariable_CPU_SIZE = CPU;
this.avariable_MEM_SIZE = MEM;
}
public Server() {
super();
// TODO Auto-generated constructor stub
}
public boolean add(VirtualMachine vm) {
if (this.avariable_CPU_SIZE >= vm.cpu && this.avariable_MEM_SIZE >= vm.mem) {
if(VM_MAP.containsKey(vm)) {
VM_MAP.put(vm,VM_MAP.get(vm)+1);
}else {
VM_MAP.put(vm, 1);
}
this.avariable_CPU_SIZE -=vm.cpu;
this.avariable_MEM_SIZE -= vm.mem;
return true;
} else {
return false;
}
}
public Server clone() {
Server copy=new Server(this.CPU,this.MEM);
copy.avariable_CPU_SIZE=this.avariable_CPU_SIZE;copy.avariable_MEM_SIZE=this.avariable_MEM_SIZE;
for(VirtualMachine vm:this.VM_MAP.keySet()) {
copy.VM_MAP.put(vm,this.VM_MAP.get(vm));
}
return copy;
}
}
// Comparators
Comparator<Server> ServerCPU_cmp = new Comparator<Server>() {
@Override
public int compare(Server o1, Server o2) {
// TODO Auto-generated method stub
if (o1.avariable_CPU_SIZE > o2.avariable_CPU_SIZE) {
return 1;
} else if (o1.avariable_CPU_SIZE == o2.avariable_CPU_SIZE) {
if (o1.avariable_MEM_SIZE > o2.avariable_MEM_SIZE) {
return 1;
} else if (o1.avariable_MEM_SIZE == o2.avariable_MEM_SIZE) {
return 0;
} else {
return -1;
}
} else {
return -1;
}
}
};
Comparator<Server> ServerMEM_cmp = new Comparator<Server>() {
@Override
public int compare(Server o1, Server o2) {
// TODO Auto-generated method stub
if (o1.avariable_MEM_SIZE > o2.avariable_MEM_SIZE) {
return 1;
} else if (o1.avariable_MEM_SIZE == o2.avariable_MEM_SIZE) {
if (o1.avariable_CPU_SIZE > o2.avariable_CPU_SIZE) {
return 1;
} else if (o1.avariable_CPU_SIZE == o2.avariable_CPU_SIZE) {
return 0;
} else {
return -1;
}
} else {
return -1;
}
}
};
Comparator<VirtualMachine> VM_CPU_cmp = new Comparator<VirtualMachine>() {
@Override
public int compare(VirtualMachine o1, VirtualMachine o2) {
// TODO Auto-generated method stub
if (o1.cpu > o2.cpu) {
return -1;
} else if (o1.cpu == o2.cpu) {
if (o1.mem > o2.mem) {
return -1;
} else if (o1.mem == o2.mem) {
return 0;
} else {
return 1;
}
} else {
return 1;
}
}
};
Comparator<VirtualMachine> VM_MEM_cmp = new Comparator<VirtualMachine>() {
@Override
public int compare(VirtualMachine o1, VirtualMachine o2) {
// TODO Auto-generated method stub
if (o1.mem > o2.mem) {
return -1;
} else if (o1.mem == o2.mem) {
if (o1.cpu > o2.cpu) {
return -1;
} else if (o1.cpu == o2.cpu) {
return 0;
} else {
return 1;
}
} else {
return 1;
}
}
};
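int N = 50; // length of each weight vector w (days of history used as features)
int RANGE = 100000; // number of training epochs
double learningrate = 0.0001; // learning rate for W and b
double daylearningrate = 0.2; // learning rate for the time parameter
Map<String, VirtualMachine> VMmap = new HashMap<>(); // flavor name -> VirtualMachine
List<String> VMKINDList = new ArrayList<String>(); // index -> flavor name
// Parse the physical server's CPU and memory capacity from the input file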
String[] CPUCong = inputContent[0].split(" ");
Integer CPUSIZE = Integer.valueOf(CPUCong[0]);
Integer MEMSIZE = Integer.valueOf(CPUCong[1]);
// Read the flavors to predict, the optimization target, and the start and end times of the prediction window
Integer VMKINDS = Integer.valueOf(inputContent[2]);
for (int i = 3; i < 3 + VMKINDS; i++) {
String[] split = inputContent[i].split(" ");
VirtualMachine virtualmachine = new VirtualMachine();
virtualmachine.vmname = split[0];
virtualmachine.cpu = Integer.valueOf(split[1]);
virtualmachine.mem = Integer.valueOf(split[2])/1024;
virtualmachine.index = i - 3; // position in VMKINDList
VMmap.put(split[0], virtualmachine);
VMKINDList.add(split[0]);
}
Set<String> VMset = VMmap.keySet();
String OPTIMIZEPARA = inputContent[4 + VMKINDS];
String STARTTIME = inputContent[6 + VMKINDS];
String ENDTIME = inputContent[7 + VMKINDS];
SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
Date starttime = null;
Date endtime = null;
try {
starttime = formatter.parse(STARTTIME);
endtime = formatter.parse(ENDTIME);
} catch (ParseException e) {
e.printStackTrace();
}
int predict_days = (int) ((endtime.getTime() - starttime.getTime()) / (24 * 60 * 60 * 1000));
// Number of days covered by the training data
String TrainStartTime = ecsContent[0].split("\t")[2].split(" ")[0];
String TrainENDTime = ecsContent[ecsContent.length - 1].split("\t")[2].split(" ")[0];
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
Date beginDate = null;
Date endDate = null;
try {
beginDate = (Date) format.parse(TrainStartTime);
endDate = (Date) format.parse(TrainENDTime);
} catch (ParseException e) {
e.printStackTrace();
}
int Days = (int) ((endDate.getTime() - beginDate.getTime()) / (24 * 60 * 60 * 1000)) + 1;
double[][] W = new double[VMKINDS][N];
double[] b = new double[VMKINDS];
double time_parameter[]=new double[VMKINDS];
double[][] history = new double[Days][VMKINDS];
Arrays.fill(time_parameter, 0.85);
// Build the history matrix: requests per day per flavor
for (int i = 0; i < ecsContent.length; i++) {
String[] array = ecsContent[i].split("\t");
String flavorName = array[1];
if (VMset.contains(flavorName)) {
String TempTime = ecsContent[i].split("\t")[2].split(" ")[0];
Date TempDate = null;
try {
TempDate = (Date) format.parse(TempTime);
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
int day = (int) ((TempDate.getTime() - beginDate.getTime()) / (24 * 60 * 60 * 1000));
int kind = VMmap.get(flavorName).index;
history[day][kind] += 1;
}
}
DataAverage(history);
// Training
int traning_day=Days-N;// number of training samples (days with N days of history before them)
for (int epoch = 0; epoch < RANGE; epoch++) {
if (epoch==80000)
learningrate *= 0.95;
for (int k = 0; k < VMKINDS; k++) {
double[] W_gradient=new double[N];
double b_gradient=0;
double day_gradient=0;
for (int i = N; i < Days; i++) {
// Linear prediction for day i from the previous N days (most recent first)
double[] predict = new double[VMKINDS];
for (int j = i - 1; j >= i - N; j--) {
predict[k] += history[j][k] * W[k][i - 1 - j];
}
double partday=((double)i)/(Days+predict_days);
predict[k] += b[k]+time_parameter[k]*partday;
predict[k] = RELU(predict[k]);
double loss= predict[k] - history[i][k];
b_gradient+=2*loss/traning_day;
day_gradient+=2*loss*partday/traning_day;
for(int j=0;j<N;j++) {
W_gradient[j]+=2*history[i-1-j][k]*loss/traning_day;
}
}
for (int i=0; i<N; i++) {
W[k][i] -= W_gradient[i]* learningrate;
}
b[k] -= b_gradient* learningrate;
time_parameter[k]-=day_gradient*daylearningrate;
}
}
// Build the prediction matrix, feeding earlier predictions back in as inputs
double[][] VMpredict = new double[predict_days][VMKINDS];
for (int day = 0; day < predict_days; day++) {
for (int k = 0; k < VMKINDS; k++) {
double predictvaule = 0;
for (int i = 0; i < N; i++) {
if (day - i - 1 >= 0) {
predictvaule += VMpredict[day - i - 1][k] * W[k][i];
} else {
predictvaule += history[Days + (day - i - 1)][k] * W[k][i];
}
}
VMpredict[day][k] = RELU(predictvaule + b[k]+time_parameter[k]*((day+Days)/(double)(Days+predict_days)));
}
}
// Build the output
List<String> resultlist = new ArrayList<String>();
double[] kind_sum_d = new double[VMKINDS];
int[] kind_sum = new int[VMKINDS];
// Count the predicted VMs of each flavor
int total_sum = 0;
List<VirtualMachine> virtualmachine_list = new ArrayList<>();
List<Server> server_list = new ArrayList<>();
for (int i = 0; i < VMKINDS; i++) {
for (int day = 0; day < predict_days; day++) {
kind_sum_d[i] +=VMpredict[day][i];
}
kind_sum[i]=(int) kind_sum_d[i];
for (int k = 0; k < kind_sum[i]; k++) {
String vmname = VMKINDList.get(i);
VirtualMachine virtualMachine = VMmap.get(vmname);
virtualmachine_list.add(virtualMachine.clone());
}
total_sum += kind_sum[i];
}
String total_number = String.valueOf(total_sum);
resultlist.add(total_number);
for (int i = 0; i < VMKINDS; i++) {
resultlist.add(VMKINDList.get(i) + " " + kind_sum[i]);
}
resultlist.add("");
// Knapsack-based packing
Server[][] Server_Record=null;
if(OPTIMIZEPARA.trim().startsWith("CP")) {
// Optimize for CPU: knapsack over memory capacity, keeping the fill that leaves the least free CPU
while(!isempty(kind_sum)) {
Server[][] Server_CPU=new Server[MEMSIZE+1][VMKINDS+1];
for(int i=0;i<MEMSIZE+1;i++) {
for(int j=0;j<VMKINDS+1;j++) {
Server_CPU[i][j]=new Server(CPUSIZE,MEMSIZE);
}
}
for(int k=1;k<VMKINDS+1;k++) {
String vm_name=VMKINDList.get(k-1);// flavor name
VirtualMachine virtualMachine = VMmap.get(vm_name);
int vm_cpu=virtualMachine.cpu;
int vm_mem=virtualMachine.mem;
for(int v=1;v<MEMSIZE+1;v++) {
List<Server> serverlist=new ArrayList<>();
for(int i=0;i<=v/vm_mem&&i<=kind_sum[k-1];i++) {
if(i*vm_cpu<=Server_CPU[v-i*vm_mem][k-1].avariable_CPU_SIZE) {
Server clone = Server_CPU[v-i*vm_mem][k-1].clone();
for(int j=0;j<i;j++) {
clone.add(virtualMachine);
}
serverlist.add(clone);
}
}
Collections.sort(serverlist,ServerCPU_cmp);
Server_CPU[v][k]=serverlist.get(0);
}
}
Server server_temp=Server_CPU[MEMSIZE][VMKINDS];
// Subtract the VMs just packed from the remaining counts
server_list.add(server_temp);
for(VirtualMachine vm:server_temp.VM_MAP.keySet()) {
kind_sum[VMmap.get(vm.vmname).index]-=server_temp.VM_MAP.get(vm);
}
}
}else {
// Optimize for MEM: knapsack over CPU capacity, keeping the fill that leaves the least free memory
while(!isempty(kind_sum)) {
Server[][] Server_MEM=new Server[CPUSIZE+1][VMKINDS+1];
for(int i=0;i<CPUSIZE+1;i++) {
for(int j=0;j<VMKINDS+1;j++) {
Server_MEM[i][j]=new Server(CPUSIZE,MEMSIZE);
}
}
for(int k=1;k<VMKINDS+1;k++) {
String vm_name=VMKINDList.get(k-1);// flavor name
VirtualMachine virtualMachine = VMmap.get(vm_name);
int vm_cpu=virtualMachine.cpu;
int vm_mem=virtualMachine.mem;
for(int v=1;v<CPUSIZE+1;v++) {
List<Server> serverlist=new ArrayList<>();
for(int i=0;i<=v/vm_cpu&&i<=kind_sum[k-1];i++) {
if(i*vm_mem<=Server_MEM[v-i*vm_cpu][k-1].avariable_MEM_SIZE) {
Server clone = Server_MEM[v-i*vm_cpu][k-1].clone();
for(int j=0;j<i;j++) {
clone.add(virtualMachine);
}
serverlist.add(clone);
}
}
Collections.sort(serverlist,ServerMEM_cmp);
Server_MEM[v][k]=serverlist.get(0);
}
}
Server server_temp=Server_MEM[CPUSIZE][VMKINDS];
// Subtract the VMs just packed from the remaining counts
server_list.add(server_temp);
for(VirtualMachine vm:server_temp.VM_MAP.keySet()) {
kind_sum[VMmap.get(vm.vmname).index]-=server_temp.VM_MAP.get(vm);
}
}
}
resultlist.add(String.valueOf(server_list.size()));
for(int i=0;i<server_list.size();i++) {
Server server = server_list.get(i);
String str=(i+1)+" ";
for(Entry<VirtualMachine, Integer> entry:server.VM_MAP.entrySet()) {
str+=entry.getKey().vmname+" "+entry.getValue()+" ";
}
resultlist.add(str);
}
String[] results = new String[resultlist.size()];
resultlist.toArray(results);
return results;
}
private static boolean isempty(int[] kind_sum) {
boolean flag=true;
for(int i=0;i<kind_sum.length;i++) {
if(kind_sum[i]!=0) {
flag=false;
break;
}
}
return flag;
}
private static void DataAverage(double[][] history) {
int Days = history.length, VMKINDS = history[0].length;
int Daylength =10; // window length of the weighted moving average
double[] weights={0.75,0.16,0.05,0.02,0.02,0.01,0.01,0.01,0.01,0.01};
double[][] M1=new double[Days][VMKINDS];
for(int i=0;i<Days;i++) {
for(int j=0;j<VMKINDS;j++) {
M1[i][j]=history[i][j];
}
}
for (int j = 0; j < VMKINDS; j++) {
for (int i = Daylength - 1; i < Days; i++) {
double sum = 0;
if(history[i][j]>8*history[i-1][j])
history[i][j]/=2.1;
if(history[i][j]<history[i-1][j]/8)
history[i][j]*=5; // dampen abnormal spikes and drops relative to the previous day
for (int day = i; day > i - Daylength; day--) {
sum += history[day][j]*weights[i-day];
}
history[i][j] = sum;
}
}
}
private static double RELU(double d) {
return d >= 0 ? d : 0;
}
}
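A note on DataAverage above: it overwrites history in place, so every window after the first reads neighbouring days that were already smoothed, and the copy M1 is never read. If a plain weighted average over the raw counts is what is wanted (an assumption, since the in-place version is what was actually submitted), a minimal sketch for a single series could look like this. The class name `WeightedAverage` and the sample numbers are made up for illustration.

```java
import java.util.Arrays;

public class WeightedAverage {
    /**
     * Returns a smoothed copy of x. For i >= w.length - 1:
     *   out[i] = w[0] * x[i] + w[1] * x[i - 1] + ... (raw values only)
     * Earlier entries are copied unchanged.
     */
    static double[] smooth(double[] x, double[] w) {
        double[] out = x.clone();
        for (int i = w.length - 1; i < x.length; i++) {
            double sum = 0;
            for (int k = 0; k < w.length; k++) {
                sum += w[k] * x[i - k]; // reads the untouched input, never out[]
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] counts = {2, 4, 8};
        double[] weights = {0.5, 0.5}; // weights[0] applies to the current day
        System.out.println(Arrays.toString(smooth(counts, weights)));
        // prints [2.0, 3.0, 6.0]
    }
}
```

Smoothing the same input in place would give 5.5 for the last entry instead of 6.0, because the window would read the already smoothed 3.0 rather than the raw 4.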