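Wine Type Recognition Based on the Random Forest Algorithm

A few words first

I wrote this post on a whim while working on a lab report: I adapted this semester's computational intelligence defense project into a blog post, so I have finally lived up to what my teacher said about the point of writing a blog. The source code is adapted from https://blog.csdn.net/cyberliferk800/article/details/90549795
Unfortunately, that author's code would not run for me at all (no offense intended: it may be a MATLAB version issue, or the sampling may really be wrong; if you happen to read this, please be kind and let's discuss it rationally). After understanding and debugging it I rewrote the whole logic, and the final accuracy turned out to be quite pleasing.
In this post, predicting the wine type only serves to satisfy the "practical significance" requirement set by my teacher and has little research value in itself, so I will focus on the random forest algorithm. On to the main topic.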

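1. Principles of the Random Forest Algorithm

1.1 Building a Decision Tree (the CART Algorithm)

The CART algorithm consists of two steps:
Tree generation: build a decision tree from the training set, growing it as large as possible;
Tree pruning: prune the generated tree with a validation set and select the optimal subtree, using minimum loss as the pruning criterion.
Generating a CART tree means recursively building a binary decision tree. CART can be used for both classification and regression; this post discusses only classification. For classification trees, CART selects features by minimizing the Gini index and produces a binary tree.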

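1.2 Gini Index

Once a decision tree has been built, the Gini index is used to judge whether it is a good tree.
The Gini index measures the impurity of the model: the smaller the Gini index, the lower the impurity and the better the feature.
Suppose there are K classes and the probability of class k is p_k. In its standard form, the Gini index of the distribution is:
$$\mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$$
Since the wine data in this post has only two classes, with p the probability of one of them, the Gini index becomes:
$$\mathrm{Gini}(p) = 2p(1 - p)$$
We also need the best split point. A split point has samples on both sides, n on the left (subset D_1) and m on the right (subset D_2), so the Gini index of the split is the weighted sum:
$$\mathrm{Gini}(D, A) = \frac{n}{n+m}\,\mathrm{Gini}(D_1) + \frac{m}{n+m}\,\mathrm{Gini}(D_2)$$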
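
To make the formulas concrete, here is a minimal MATLAB sketch that computes the Gini index of a binary label vector and the weighted Gini index of a split. The helper names `gini_of` and `gini_split` and the toy labels are assumptions for illustration; they are not the `gini_self` function used later in this post.

```matlab
% Gini index of a label vector; labels are assumed to be 0 or 1.
% max(...,1) guards against division by zero for an empty group.
gini_of = @(y) 1 - sum(([sum(y == 0), sum(y == 1)] / max(numel(y), 1)) .^ 2);

% Weighted Gini index of a split into a left and a right group.
gini_split = @(yl, yr) (numel(yl) * gini_of(yl) + numel(yr) * gini_of(yr)) ...
                       / (numel(yl) + numel(yr));

y_left  = [1 1 1 0];   % three samples of class 1, one of class 0
y_right = [0 0 0 0];   % pure node

g_left  = gini_of(y_left);              % 1 - (0.25^2 + 0.75^2) = 0.375
g_right = gini_of(y_right);             % 0 for a pure node
g_all   = gini_split(y_left, y_right);  % (4*0.375 + 4*0) / 8 = 0.1875
fprintf('left %.4f, right %.4f, split %.4f\n', g_left, g_right, g_all);
```

A pure node gives 0, and an evenly mixed binary node gives the maximum of 0.5, which is why a lower value indicates a better split.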

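1.3 Building the Random Forest

A decision tree is like an expert that classifies new data using what it has learned from the dataset. Building a random forest involves two sources of randomness: random selection of the data and random selection of the candidate features.
1. Random selection of the data:
First, sample from the original dataset with replacement to construct sub-datasets, each of the same size as the original. Elements may repeat across different sub-datasets and also within a single sub-dataset. Second, build one sub-decision-tree from each sub-dataset; feeding a data point into every subtree yields one result per subtree. Finally, when new data needs to be classified, the forest's output is obtained by voting over the subtrees' decisions. As in the figure below, suppose the forest has 3 subtrees: 2 of them answer class A and 1 answers class B, so the forest's result is class A.
[Figure]
Figure 2-3: Random selection of data in a dataset with 3 data samples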
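
The sampling step can be sketched as follows. This is only an illustration on made-up toy data, not the `select_sample_decision` function used by the code in this post; the variable names are assumptions.

```matlab
% Toy data: column 1 is a sample id, column 2 is a binary label (made up).
data = [(1:6)', [0 1 0 1 1 0]'];
n = size(data, 1);

trees_num = 3;                    % one bootstrap sub-dataset per tree
subsets = cell(1, trees_num);
for t = 1:trees_num
    idx = randi(n, n, 1);         % n row indices drawn with replacement
    subsets{t} = data(idx, :);    % same size as the original, rows may repeat
end

% Count how many distinct original rows ended up in the first sub-dataset.
unique_rows = numel(unique(subsets{1}(:, 1)));
fprintf('subset 1 holds %d distinct rows out of %d\n', unique_rows, n);
```

Because the indices are drawn with replacement, a sub-dataset usually contains some rows more than once and misses others, which is what makes the trees differ.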
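2. Random selection of the candidate features
As with the random selection of data, each split in a subtree of the random forest does not use all candidate features. Instead, a certain number of features are drawn at random from all candidates, and the best feature among those is used for the split. This makes the trees in the forest differ from one another, which increases diversity and thereby improves classification performance.
In the figure below, the blue squares represent all features that can be selected, that is, the current candidate features. The orange squares are the splitting features. On the left is the feature selection process of a single decision tree, which completes the split by picking the best splitting feature among the candidates (this post uses the CART algorithm). On the right is the feature selection process of a subtree in a random forest.
[Figure]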
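
As a rough sketch of this step, the snippet below draws a random subset of features and then searches only those features for the split with the lowest weighted Gini index. The toy matrix, the midpoint rule for candidate boundaries, and all variable names other than `decision_select` are assumptions for illustration; the post's own `generate_node` may work differently.

```matlab
% Toy data: 6 samples, 4 features, binary labels (all values are made up).
X = [1.0 5.1 0.2 7.0;
     1.2 4.9 0.4 6.5;
     0.9 5.0 0.3 6.8;
     3.1 5.2 0.1 6.9;
     3.3 4.8 0.5 6.6;
     2.9 5.0 0.2 7.1];
y = [0; 0; 0; 1; 1; 1];

gini_of = @(v) 1 - (mean(v == 0)^2 + mean(v == 1)^2);

decision_select = 2;                          % features considered per split
candidates = randperm(size(X, 2));
candidates = candidates(1:decision_select);   % random subset of feature indices

best_gini = inf; best_feature = 0; best_boundary = NaN;
for f = candidates
    values = unique(X(:, f));                 % sorted distinct values
    for k = 1:numel(values) - 1
        b = (values(k) + values(k + 1)) / 2;  % midpoint as candidate boundary
        left  = y(X(:, f) <  b);
        right = y(X(:, f) >= b);
        g = (numel(left) * gini_of(left) + numel(right) * gini_of(right)) / numel(y);
        if g < best_gini
            best_gini = g; best_feature = f; best_boundary = b;
        end
    end
end
fprintf('feature %d, boundary %.2f, weighted Gini %.4f\n', ...
        best_feature, best_boundary, best_gini);
```

Only the features in `candidates` are ever examined, so a feature that would give a perfect split can be missed by one tree and picked up by another.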

2. Dataset Source

The dataset comes from the UCI Machine Learning Repository.
[Figure]

3. Code Implementation (Core Code)

3.1 Random Forest Function

% Random forest with trees_num trees in total

function result=random_forest(sample,trees_num,data,sample_select,decision_select,sample_limit)
type1=0;
type0=0;
conclusion=zeros(1,trees_num);

% Overwrite the last row of data with the custom sample; later this will be the value passed in from the GUI
data(size(data,1),:) = sample;

for i=1:trees_num
    [path,boundary,~,result]=decision_tree(data,sample_select,decision_select,sample_limit);
    conclusion(i)=decide(path,boundary,result);
    
    if conclusion(i)==1
        type1=type1+1; 
    else
        type0=type0+1;
    end
end
if type1>type0
    result=1;
else
    result=0;
end
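
The counting loop and the final comparison in `random_forest` amount to a majority vote in which a tie goes to class 0. A compact sketch of the same rule, using made-up votes, could look like this:

```matlab
conclusion = [1 0 1 1 0];             % hypothetical votes from 5 trees
type1 = sum(conclusion == 1);         % votes for class 1
type0 = numel(conclusion) - type1;    % everything else counts as class 0
result = double(type1 > type0);       % a tie goes to class 0, as in random_forest

% With one vote each, the strict '>' comparison yields class 0.
tie_result = double(sum([1 0] == 1) > sum([1 0] ~= 1));
fprintf('result = %d, tie_result = %d\n', result, tie_result);
```

Choosing an odd `trees_num` avoids ties altogether for a two-class problem.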

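3.2 Decision Tree Generation Function

% Build a decision tree. Inputs: raw data, number of sampled rows, number of sampled decision attributes, pre-pruning sample limit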
function [path,boundary,gini,result]=decision_tree(data,sample_select,decision_select,sample_limit)
score=100;        
flag=0;
temp=inf;
%data(size(data,1),:)=sample;
% score of the evaluation function
while(score>(sample_select*0.3))   % loop until a good tree is found
    % Two counters, conclusion3_0 and conclusion3_1: if classification stops at layer 3, make sure the numbers of 0s and 1s differ
    conclusion3_0=0;
    conclusion3_1=0;
    % Two counters, conclusion4_0 and conclusion4_1: with more than one leaf node, check which of the two is larger
    conclusion4_0=0;
    conclusion4_1=0;
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%divider%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    data_new=select_sample_decision(data,sample_select,decision_select);
    % compute the initial Gini index
    gini_now=gini_self(data_new);
    % main routine
    layer=1;                            % current layer of the decision tree
    leaf_sample=zeros(1,sample_select); % number of samples in each leaf
    leaf_gini=zeros(1,sample_select);   % Gini index of each leaf
    leaf_num=0;                         % number of leaves
    path=zeros(decision_select,2^(decision_select-1));       % initialize the path
    gini=ones(decision_select,2^(decision_select-1));        % initialize gini
    boundary=zeros(decision_select,2^(decision_select-1));   % initialize the split boundaries
    result=ones(decision_select,2^(decision_select-1));      % initialize the results
    path(:)=inf;
    gini(:)=inf;
    boundary(:)=inf;
    result(1:4,1:8)=inf;
    % layer 1
    [decision_global_best,boundary_global_best,data_new1,gini_now1,data_new2,gini_now2,~]=generate_node(data_new);

    path(layer,1)=data_new(size(data_new,1),decision_global_best);
    boundary(layer,1)=boundary_global_best;
    gini(layer,1)=gini_now;
    layer=layer+1;
    gini(layer,1)=gini_now1;
    gini(layer,2)=gini_now2;
    % layer 2
    
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%layer 2, node 1%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    if  ((size(data_new1,1)-1)>=sample_limit)&&(gini(layer,1)>0)
        [decision_global_best,boundary_global_best,data_new1_1,gini_now1_1,data_new1_2,gini_now1_2,~]=generate_node(data_new1);
        path(layer,1)=data_new1(size(data_new1,1),decision_global_best);
        boundary(layer,1)=boundary_global_best;
        layer=layer+1;
        gini(layer,1)=gini_now1_1;
        gini(layer,2)=gini_now1_2;
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 1%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        if (size(data_new1_1,1)-1)>=sample_limit&&(gini(layer,1)>0)
            for i=1:size(data_new1_1,1)
                if(data_new1_1(i,end)==1)
                    conclusion3_1=conclusion3_1+1;
                else
                    conclusion3_0=conclusion3_0+1;
                end
            end
            [decision_global_best,boundary_global_best,data_new1_1_1,gini_now1_1_1,data_new1_1_2,gini_now1_1_2,~]=generate_node(data_new1_1);
            path(layer,1)=data_new1_1(size(data_new1_1,1),decision_global_best);
            boundary(layer,1)=boundary_global_best;
            layer=layer+1;

            gini(layer,1)=gini_now1_1_1;
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_1_1,1)
                if(data_new1_1_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,1)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_1_1;
            leaf_sample(leaf_num)=size(data_new1_1_1,1)-1;

            gini(layer,2)=gini_now1_1_2;
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 4, node 2%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_1_2,1)
                if(data_new1_1_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,2)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_1_2;
            leaf_sample(leaf_num)=size(data_new1_1_2,1)-1;
        else
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 1 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_1,1)
                if(data_new1_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
           if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
           end
            result(layer,1)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_1;
            leaf_sample(leaf_num)=size(data_new1_1,1)-1;

            path(layer,1)=nan;
            boundary(layer,1)=nan;
            gini(layer+1,1:2)=nan;
        end
        layer=3;
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 2%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        if (size(data_new1_2,1)-1)>=sample_limit&&(gini(layer,2)>0)
            for i=1:size(data_new1_2,1)
                if(data_new1_2(i,end)==1)
                    conclusion3_1=conclusion3_1+1;
                else
                    conclusion3_0=conclusion3_0+1;
                end
            end
            [decision_global_best,boundary_global_best,data_new1_2_1,gini_now1_2_1,data_new1_2_2,gini_now1_2_2,~]=generate_node(data_new1_2);
            path(layer,2)=data_new1_2(size(data_new1_2,1),decision_global_best);
            boundary(layer,2)=boundary_global_best;
            layer=layer+1;
            gini(layer,3)=gini_now1_2_1;
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_2_1,1)
                if(data_new1_2_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,3)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_2_1;
            leaf_sample(leaf_num)=size(data_new1_2_1,1)-1;

            gini(layer,4)=gini_now1_2_2;
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 4, node 4%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_2_2,1)
                if(data_new1_2_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,4)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_2_2;
            leaf_sample(leaf_num)=size(data_new1_2_2,1)-1;
        else
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 2 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new1_2,1)
                if(data_new1_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,2)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now1_2;
            leaf_sample(leaf_num)=size(data_new1_2,1)-1;

            path(layer,2)=nan;
            boundary(layer,2)=nan;
            gini(layer+1,3:4)=nan;
        end
    else
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 2, node 1 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
         %test
         temp1=0;
         temp2=0;
         for i=1:size(data_new1,1)
             if(data_new1(i,end)==1)
                 temp1=temp1+1;
             else
                 temp2=temp2+1;
             end
         end
         if(temp1>temp2)
             temp=1;
             conclusion4_1=conclusion4_1+1;
         elseif temp1<temp2
             temp=0;
             conclusion4_0=conclusion4_0+1;
         else
             flag=1;
         end
        result(layer,1)=temp;
        leaf_num=leaf_num+1;
        leaf_gini(leaf_num)=gini_now1;
        leaf_sample(leaf_num)=size(data_new1,1)-1;

        path(layer,1)=nan;
        boundary(layer,1)=nan;
        layer=layer+1;
        gini(layer,1:2)=nan;
        % layer 3
        path(layer,1:2)=nan;
        boundary(layer,1:2)=nan;
        % gini of the layer-4 leaves
        layer=layer+1;
        gini(layer,1:4)=nan;
    end
    layer=2;
    %%%%%%%%%%%%%%%%%%%%%%%%%layer 2, node 2%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    if (size(data_new2,1)-1)>=sample_limit&&(gini(layer,2)>0)
        [decision_global_best,boundary_global_best,data_new2_1,gini_now2_1,data_new2_2,gini_now2_2,~]=generate_node(data_new2);
        path(layer,2)=data_new2(size(data_new2,1),decision_global_best);
        boundary(layer,2)=boundary_global_best;
        layer=layer+1;
        gini(layer,3)=gini_now2_1;
        gini(layer,4)=gini_now2_2;
        % layer 3
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 3%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        if (size(data_new2_1,1)-1)>=sample_limit&&(gini(layer,3)>0)
            for i=1:size(data_new2_1,1)
                if(data_new2_1(i,end)==1)
                    conclusion3_1=conclusion3_1+1;
                else
                    conclusion3_0=conclusion3_0+1;
                end
            end
            [decision_global_best,boundary_global_best,data_new2_1_1,gini_now2_1_1,data_new2_1_2,gini_now2_1_2,~]=generate_node(data_new2_1);
            path(layer,3)=data_new2_1(size(data_new2_1,1),decision_global_best);
            boundary(layer,3)=boundary_global_best;
            layer=layer+1;

            gini(layer,5)=gini_now2_1_1;
            
             %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_1_1,1)
                if(data_new2_1_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,5)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_1_1;
            leaf_sample(leaf_num)=size(data_new2_1_1,1)-1;

            gini(layer,6)=gini_now2_1_2;
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_1_2,1)
                if(data_new2_1_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,6)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_1_2;
            leaf_sample(leaf_num)=size(data_new2_1_2,1)-1;
        else
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 3 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_1,1)
                if(data_new2_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,3)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_1;
            leaf_sample(leaf_num)=size(data_new2_1,1)-1;

            path(layer,3)=nan;
            boundary(layer,3)=nan;
            gini(layer+1,5:6)=nan;
        end
        layer=3;
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 4%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        if (size(data_new2_2,1)-1)>=sample_limit&&(gini(layer,4)>0)
            for i=1:size(data_new2_2,1)
                if(data_new2_2(i,end)==1)
                    conclusion3_1=conclusion3_1+1;
                else
                    conclusion3_0=conclusion3_0+1;
                end
            end
            [decision_global_best,boundary_global_best,data_new2_2_1,gini_now2_2_1,data_new2_2_2,gini_now2_2_2,~]=generate_node(data_new2_2);
            path(layer,4)=data_new2_2(size(data_new2_2,1),decision_global_best);
            boundary(layer,4)=boundary_global_best;
            layer=layer+1;

            gini(layer,7)=gini_now2_2_1;
%             %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_2_1,1)
                if(data_new2_2_1(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,7)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_2_1;
            leaf_sample(leaf_num)=size(data_new2_2_1,1)-1;

            gini(layer,8)=gini_now2_2_2;
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_2_2,1)
                if(data_new2_2_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,8)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_2_2;
            leaf_sample(leaf_num)=size(data_new2_2_2,1)-1;
        else
            %%%%%%%%%%%%%%%%%%%%%%%%%layer 3, node 4 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
            %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2_2,1)
                if(data_new2_2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
            result(layer,4)=temp;
            leaf_num=leaf_num+1;
            leaf_gini(leaf_num)=gini_now2_2;
            leaf_sample(leaf_num)=size(data_new2_2,1)-1;

            path(layer,4)=nan;
            boundary(layer,4)=nan;
            gini(layer+1,7:8)=nan;
        end
    else
        %%%%%%%%%%%%%%%%%%%%%%%%%layer 2, node 2 (else)%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %test
            temp1=0;
            temp2=0;
            for i=1:size(data_new2,1)
                if(data_new2(i,end)==1)
                    temp1=temp1+1;
                else
                    temp2=temp2+1;
                end
            end
            if(temp1>temp2)
                temp=1;
                conclusion4_1=conclusion4_1+1;
            elseif temp1<temp2
                temp=0;
                conclusion4_0=conclusion4_0+1;
            else
                flag=1;
            end
        result(layer,2)=temp;
        leaf_num=leaf_num+1;
        leaf_gini(leaf_num)=gini_now2;
        leaf_sample(leaf_num)=size(data_new2,1)-1;

        path(layer,2)=nan;
        boundary(layer,2)=nan;
        layer=layer+1;
        gini(layer,3:4)=nan;
        % layer 3
        path(layer,3:4)=nan;
        boundary(layer,3:4)=nan;
        % gini of the layer-4 leaves
        layer=layer+1;
        gini(layer,5:8)=nan;
    end
    if flag==1||conclusion4_1==conclusion4_0||(conclusion3_0==conclusion3_1&&conclusion4_1==0&&conclusion4_0==0) 
        score=100;
    else
        score=evaluation(leaf_num,leaf_sample,leaf_gini);
    end
    flag=0;
    result(2,:)=nan;
end
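
The same label-counting block appears for every leaf in the listing above. As a reading aid, here is a small sketch of what one such block computes, written on a made-up node. The variable names are assumptions, and unlike the listing (which sets `flag=1` on a tie and leaves `temp` at its previous value) this sketch marks a tie with NaN.

```matlab
% Toy node: the last column is the class label (values are made up).
node = [0.3 1.2 1;
        0.5 0.9 1;
        0.7 1.1 0];

ones_count  = sum(node(:, end) == 1);        % same role as temp1
other_count = size(node, 1) - ones_count;    % same role as temp2

if ones_count > other_count
    label = 1; tie = false;
elseif ones_count < other_count
    label = 0; tie = false;
else
    label = NaN; tie = true;   % tie: no majority label for this leaf
end
fprintf('label = %d, tie = %d\n', label, tie);
```

Moving this logic into one helper function would shrink `decision_tree` considerably and would make slips in a single copy of the block much harder to introduce.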

3.3 Decision Tree Decision Function

% Sample decision function: takes the sample and the decision tree, returns the classification
function conclusion=decide(path,boundary,result)
%
%disp(sample(path(1,1)));
%disp(boundary(1,1));
%sample
conclusion0=0;
conclusion1=0;
% whether the fourth layer was reached
flag=0;
if path(1,1)<boundary(1,1)
   if result(2,1)==0||result(2,1)==1
        conclusion=result(2,1);
    else
        %sample
        if path(2,1)<boundary(2,1)
            if result(3,1)==0||result(3,1)==1
                conclusion=result(3,1);
            else
                 %sample
                if path(3,1)<boundary(3,1)
                    if result(4,1)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                    flag=1;
                else
                    if result(4,2)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                     flag=1;
                end
            end
        else
            if result(3,2)==0||result(3,2)==1
                conclusion=result(3,2);
            else
                %sample
                if path(3,2)<boundary(3,2)
                    if result(4,3)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                      flag=1;
                else
                    if result(4,4)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                      flag=1;
                end
            end
        end
    end
else
    if  result(2,2)==0||result(2,2)==1
        conclusion=result(2,2);
    else
         %sample
        if path(2,2)<boundary(2,2)
            if  result(3,3)==0||result(3,3)==1
                conclusion=result(3,3);
            else
                 %sample
                if path(3,3)<boundary(3,3)
                    if result(4,5)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                      flag=1;
                else
                    if result(4,6)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                      flag=1;
                end
            end
        else
            if  result(3,4)==0||result(3,4)==1
                conclusion=result(3,4);
            else
                 %sample
                if path(3,4)<boundary(3,4)
                    if result(4,7)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                     flag=1;
                else
                    if result(4,8)==1
                        conclusion1=conclusion1+1;
                    else
                        conclusion0=conclusion0+1;
                    end
                       flag=1;
                end
            end
        end
    end
end
if flag==1
    if conclusion1>conclusion0
        conclusion=1;
    else
        conclusion=0;
    end
end
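
The nested `if` chain in `decide` walks a fixed four-layer tree by hand. For comparison, the sketch below shows an alternative layout, which is not the one used in this post: the tree is stored in heap order so that a single loop can follow the path. The arrays `feature`, `boundary` and `label` and their values are hypothetical.

```matlab
% Hypothetical tree in heap order: node i has children 2*i and 2*i+1.
feature  = [1   2   NaN NaN NaN];   % feature index tested at each internal node
boundary = [2.0 5.0 NaN NaN NaN];   % split boundary at each internal node
label    = [NaN NaN 1   0   1  ];   % class label at each leaf, NaN for internal nodes

sample = [1.5 6.0];                 % feature values of the sample to classify

node = 1;
while isnan(label(node))
    if sample(feature(node)) < boundary(node)
        node = 2 * node;            % go left when the value is below the boundary
    else
        node = 2 * node + 1;        % otherwise go right
    end
end
conclusion = label(node);
fprintf('reached node %d, conclusion = %d\n', node, conclusion);
```

Here the sample goes left at the root (1.5 < 2.0), then right at node 2 (6.0 is not below 5.0), and ends at leaf 5. The comparison `value < boundary` matches the direction used in `decide`.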

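I will put the code for the whole system in a separate blog post; feel free to grab it when you get the chance, and please be kind if it is not to your taste.

4. System Analysis

The self-comparison analysis was run while a few small bugs were still present, so the reported accuracy is generally 3%-4% too low. In other words, accuracy can be fairly high even when the variables are not well defined, which also illustrates an advantage of random forests: "the performance optimization process happens to improve the model's accuracy as well, and such a remarkable result is not common."
I am too tired today and need to rest. When I have time I will redo the comparison and share it; for now, here is one image from the old comparison (a screenshot from my course paper, a bit ugly, so take it lightly).

[Figure]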
