Halide Study Notes: Reading the Halide Tutorial Source, Part 4

Halide Introductory Tutorial 04


// Halide tutorial lesson 4: Debugging with tracing, print, and print_when

// This lesson demonstrates how to follow what Halide is doing at runtime.

// On linux, you can compile and run it like so:
// g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_04 -std=c++11
// LD_LIBRARY_PATH=../bin ./lesson_04

// On os x:
// g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -o lesson_04 -std=c++11
// DYLD_LIBRARY_PATH=../bin ./lesson_04

// If you have the entire Halide source tree, you can also build it by
// running:
//    make tutorial_lesson_04_debugging_2
// in a shell with the current directory at the top of the halide
// source tree.

#include "Halide.h"
#include <stdio.h>
using namespace Halide;

int main(int argc, char **argv) {

    Var x("x"), y("y");

    // Printing out the value of Funcs as they are computed.
    {
        // We'll define our gradient function as before.
        Func gradient("gradient");
        gradient(x, y) = x + y;

        // And tell Halide that we'd like to be notified of all
        // evaluations.
        // In other words, trace every value that the Func stores.
        gradient.trace_stores();

        // Realize the function over an 8x8 region.
        printf("Evaluating gradient\n");
        Buffer<int> output = gradient.realize(8, 8);

        // This will print out all the times gradient(x, y) gets
        // evaluated.

        // Now that we can snoop on what Halide is doing, let's try our
        // first scheduling primitive. We'll make a new version of
        // gradient that processes each scanline in parallel.
        Func parallel_gradient("parallel_gradient");
        parallel_gradient(x, y) = x + y;

        // We'll also trace this function.
        parallel_gradient.trace_stores();

        // Things are the same so far. We've defined the algorithm, but
        // haven't said anything about how to schedule it. In general,
        // exploring different scheduling decisions doesn't change the code
        // that describes the algorithm.
        // Because Halide decouples the algorithm from the schedule,
        // changing the schedule does not change the algorithm description.

        // Now we tell Halide to use a parallel for loop over the y
        // coordinate. On Linux we run this using a thread pool and a task
        // queue. On OS X we call into grand central dispatch, which does
        // the same thing for us.
        // Run the for loop over the y direction in parallel.
        parallel_gradient.parallel(y);

        // This time the printfs should come out of order, because each
        // scanline is potentially being processed in a different
        // thread. The number of threads should adapt to your system, but
        // on linux you can control it manually using the environment
        // variable HL_NUM_THREADS.
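        // Because the rows are computed in parallel, each row may run on a
        // different thread, so the output can appear out of order.
        // The environment variable HL_NUM_THREADS sets the number of
        // threads used for parallel loops.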
        printf("\nEvaluating parallel_gradient\n");
        parallel_gradient.realize(8, 8);
    }

    // Printing individual Exprs.
    {
        // trace_stores() can only print the value of a
        // Func. Sometimes you want to inspect the value of
        // sub-expressions rather than the entire Func. The built-in
        // function 'print' can be wrapped around any Expr to print
        // the value of that Expr every time it is evaluated.
        // trace_stores() prints the values of a Func; the built-in print
        // function prints the value of an Expr.

        // For example, say we have some Func that is the sum of two terms:
        Func f;
        f(x, y) = sin(x) + cos(y);

        // If we want to inspect just one of the terms, we can wrap
        // 'print' around it like so:
        // To inspect a single term of the expression, wrap that term in print.
        Func g;
        g(x, y) = sin(x) + print(cos(y));

        printf("\nEvaluating sin(x) + cos(y), and just printing cos(y)\n");
        g.realize(4, 4);
    }

    // Printing additional context.
    {
        // print can take multiple arguments. It prints all of them
        // and evaluates to the first one. The arguments can be Exprs
        // or constant strings. This can be used to print additional
        // context alongside the value:
        // If needed, extra text can be attached to the printed term.
        Func f;
        f(x, y) = sin(x) + print(cos(y), "<- this is cos(", y, ") when x =", x);

        printf("\nEvaluating sin(x) + cos(y), and printing cos(y) with more context\n");
        f.realize(4, 4);

        // It can be useful to split expressions like the one above
        // across multiple lines to make it easier to turn on and off
        // printing certain values while debugging.
        Expr e = cos(y);
        // Uncomment the following line to print the value of cos(y)
        // e = print(e, "<- this is cos(", y, ") when x =", x);
        Func g;
        g(x, y) = sin(x) + e;
        g.realize(4, 4);
    }

    // Conditional printing
    {
        // Both print and trace_stores can produce a lot of output. If
        // you're looking for a rare event, or just want to see what
        // happens at a single pixel, this amount of output can be
        // difficult to dig through. Instead, the function print_when
        // can be used to conditionally print an Expr. The first
        // argument to print_when is a boolean Expr. If the Expr
        // evaluates to true, it returns the second argument and
        // prints all of the arguments. If the Expr evaluates to false
        // it just returns the second argument and does not print.
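        // To inspect one specific intermediate result, use the conditional
        // print function, which prints only when a given condition holds.
        // print_when(bool_expr, expr, context)
        // If bool_expr == true: returns expr and prints expr followed by context.
        // Otherwise it only returns expr.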

        Func f;
        Expr e = cos(y);
        e = print_when(x == 37 && y == 42, e, "<- this is cos(y) at x, y == (37, 42)");
        f(x, y) = sin(x) + e;
        printf("\nEvaluating sin(x) + cos(y), and printing cos(y) at a single pixel\n");
        f.realize(640, 480);

        // print_when can also be used to check for values you're not expecting:
        Func g;
        e = cos(y);
        e = print_when(e < 0, e, "cos(y) < 0 at y ==", y);
        g(x, y) = sin(x) + e;
        printf("\nEvaluating sin(x) + cos(y), and printing whenever cos(y) < 0\n");
        g.realize(4, 4);
    }

    // Printing expressions at compile-time.
    {
        // The code above builds up a Halide Expr across several lines
        // of code. If you're programmatically constructing a complex
        // expression, and you want to check the Expr you've created
        // is what you think it is, you can also print out the
        // expression itself using C++ streams:
        // When writing a complex expression, you can check that it matches
        // what you intended by printing the expression itself to standard
        // output with a C++ stream.
        Var fizz("fizz"), buzz("buzz");
        Expr e = 1;
        for (int i = 2; i < 100; i++) {
            if (i % 3 == 0 && i % 5 == 0) e += fizz*buzz;
            else if (i % 3 == 0) e += fizz;
            else if (i % 5 == 0) e += buzz;
            else e += i;
        }
        std::cout << "Printing a complex Expr: " << e << "\n";
    }

    printf("Success!\n");
    return 0;
}

Compile and run:

$ g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_04 -std=c++11
$ ./lesson_04

Output:

Begin pipeline gradient.0()
Store gradient.0(0, 0) = 0
Store gradient.0(1, 0) = 1
Store gradient.0(2, 0) = 2
Store gradient.0(3, 0) = 3
Store gradient.0(4, 0) = 4
Store gradient.0(5, 0) = 5
Store gradient.0(6, 0) = 6
Store gradient.0(7, 0) = 7
Store gradient.0(0, 1) = 1
Store gradient.0(1, 1) = 2
Store gradient.0(2, 1) = 3
Store gradient.0(3, 1) = 4
Store gradient.0(4, 1) = 5
Store gradient.0(5, 1) = 6
Store gradient.0(6, 1) = 7
Store gradient.0(7, 1) = 8
Store gradient.0(0, 2) = 2
Store gradient.0(1, 2) = 3
Store gradient.0(2, 2) = 4
Store gradient.0(3, 2) = 5
Store gradient.0(4, 2) = 6
Store gradient.0(5, 2) = 7
Store gradient.0(6, 2) = 8
Store gradient.0(7, 2) = 9
Store gradient.0(0, 3) = 3
Store gradient.0(1, 3) = 4
Store gradient.0(2, 3) = 5
Store gradient.0(3, 3) = 6
Store gradient.0(4, 3) = 7
Store gradient.0(5, 3) = 8
Store gradient.0(6, 3) = 9
Store gradient.0(7, 3) = 10
Store gradient.0(0, 4) = 4
Store gradient.0(1, 4) = 5
Store gradient.0(2, 4) = 6
Store gradient.0(3, 4) = 7
Store gradient.0(4, 4) = 8
Store gradient.0(5, 4) = 9
Store gradient.0(6, 4) = 10
Store gradient.0(7, 4) = 11
Store gradient.0(0, 5) = 5
Store gradient.0(1, 5) = 6
Store gradient.0(2, 5) = 7
Store gradient.0(3, 5) = 8
Store gradient.0(4, 5) = 9
Store gradient.0(5, 5) = 10
Store gradient.0(6, 5) = 11
Store gradient.0(7, 5) = 12
Store gradient.0(0, 6) = 6
Store gradient.0(1, 6) = 7
Store gradient.0(2, 6) = 8
Store gradient.0(3, 6) = 9
Store gradient.0(4, 6) = 10
Store gradient.0(5, 6) = 11
Store gradient.0(6, 6) = 12
Store gradient.0(7, 6) = 13
Store gradient.0(0, 7) = 7
Store gradient.0(1, 7) = 8
Store gradient.0(2, 7) = 9
Store gradient.0(3, 7) = 10
Store gradient.0(4, 7) = 11
Store gradient.0(5, 7) = 12
Store gradient.0(6, 7) = 13
Store gradient.0(7, 7) = 14
End pipeline gradient.0()
Begin pipeline parallel_gradient.0()
Store parallel_gradient.0(0, 0) = 0
Store parallel_gradient.0(1, 0) = 1
Store parallel_gradient.0(2, 0) = 2
Store parallel_gradient.0(3, 0) = 3
Store parallel_gradient.0(4, 0) = 4
Store parallel_gradient.0(5, 0) = 5
Store parallel_gradient.0(6, 0) = 6
Store parallel_gradient.0(7, 0) = 7
Store parallel_gradient.0(0, 1) = 1
Store parallel_gradient.0(1, 1) = 2
Store parallel_gradient.0(2, 1) = 3
Store parallel_gradient.0(3, 1) = 4
Store parallel_gradient.0(4, 1) = 5
Store parallel_gradient.0(5, 1) = 6
Store parallel_gradient.0(6, 1) = 7
Store parallel_gradient.0(7, 1) = 8
Store parallel_gradient.0(0, 2) = 2
Store parallel_gradient.0(0, 3) = 3
Store parallel_gradient.0(1, 2) = 3
Store parallel_gradient.0(1, 3) = 4
Store parallel_gradient.0(0, 4) = 4
Store parallel_gradient.0(2, 3) = 5
Store parallel_gradient.0(2, 2) = 4
Store parallel_gradient.0(1, 4) = 5
Store parallel_gradient.0(3, 2) = 5
Store parallel_gradient.0(2, 4) = 6
Store parallel_gradient.0(3, 3) = 6
Store parallel_gradient.0(4, 2) = 6
Store parallel_gradient.0(4, 3) = 7
Store parallel_gradient.0(3, 4) = 7
Store parallel_gradient.0(5, 2) = 7
Store parallel_gradient.0(5, 3) = 8
Store parallel_gradient.0(4, 4) = 8
Store parallel_gradient.0(6, 2) = 8
Store parallel_gradient.0(6, 3) = 9
Store parallel_gradient.0(7, 2) = 9
Store parallel_gradient.0(5, 4) = 9
Store parallel_gradient.0(7, 3) = 10
Store parallel_gradient.0(0, 6) = 6
Store parallel_gradient.0(6, 4) = 10
Store parallel_gradient.0(0, 7) = 7
Store parallel_gradient.0(1, 6) = 7
Store parallel_gradient.0(1, 7) = 8
Store parallel_gradient.0(7, 4) = 11
Store parallel_gradient.0(2, 6) = 8
Store parallel_gradient.0(2, 7) = 9
Store parallel_gradient.0(3, 6) = 9
Store parallel_gradient.0(3, 7) = 10
Store parallel_gradient.0(4, 6) = 10
Store parallel_gradient.0(4, 7) = 11
Store parallel_gradient.0(5, 7) = 12
Store parallel_gradient.0(5, 6) = 11
Store parallel_gradient.0(0, 5) = 5
Store parallel_gradient.0(6, 7) = 13
Store parallel_gradient.0(6, 6) = 12
Store parallel_gradient.0(1, 5) = 6
Store parallel_gradient.0(7, 7) = 14
Store parallel_gradient.0(2, 5) = 7
Store parallel_gradient.0(7, 6) = 13
Store parallel_gradient.0(3, 5) = 8
Store parallel_gradient.0(4, 5) = 9
Store parallel_gradient.0(5, 5) = 10
Store parallel_gradient.0(6, 5) = 11
Store parallel_gradient.0(7, 5) = 12
End pipeline parallel_gradient.0()
1.000000
1.000000
1.000000
1.000000
0.540302
0.540302
0.540302
0.540302
-0.416147
-0.416147
-0.416147
-0.416147
-0.989992
-0.989992
-0.989992
-0.989992
1.000000 <- this is cos( 0 ) when x = 0
1.000000 <- this is cos( 0 ) when x = 1
1.000000 <- this is cos( 0 ) when x = 2
1.000000 <- this is cos( 0 ) when x = 3
0.540302 <- this is cos( 1 ) when x = 0
0.540302 <- this is cos( 1 ) when x = 1
0.540302 <- this is cos( 1 ) when x = 2
0.540302 <- this is cos( 1 ) when x = 3
-0.416147 <- this is cos( 2 ) when x = 0
-0.416147 <- this is cos( 2 ) when x = 1
-0.416147 <- this is cos( 2 ) when x = 2
-0.416147 <- this is cos( 2 ) when x = 3
-0.989992 <- this is cos( 3 ) when x = 0
-0.989992 <- this is cos( 3 ) when x = 1
-0.989992 <- this is cos( 3 ) when x = 2
-0.989992 <- this is cos( 3 ) when x = 3
-0.399985 <- this is cos(y) at x, y == (37, 42)
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
Evaluating gradient

Evaluating parallel_gradient

Evaluating sin(x) + cos(y), and just printing cos(y)

Evaluating sin(x) + cos(y), and printing cos(y) with more context

Evaluating sin(x) + cos(y), and printing cos(y) at a single pixel

Evaluating sin(x) + cos(y), and printing whenever cos(y) < 0
Printing a complex Expr: ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((1 + 2) + fizz) + 4) + buzz) + fizz) + 7) + 8) + fizz) + buzz) + 11) + fizz) + 13) + 14) + (fizz*buzz)) + 16) + 17) + fizz) + 19) + buzz) + fizz) + 22) + 23) + fizz) + buzz) + 26) + fizz) + 28) + 29) + (fizz*buzz)) + 31) + 32) + fizz) + 34) + buzz) + fizz) + 37) + 38) + fizz) + buzz) + 41) + fizz) + 43) + 44) + (fizz*buzz)) + 46) + 47) + fizz) + 49) + buzz) + fizz) + 52) + 53) + fizz) + buzz) + 56) + fizz) + 58) + 59) + (fizz*buzz)) + 61) + 62) + fizz) + 64) + buzz) + fizz) + 67) + 68) + fizz) + buzz) + 71) + fizz) + 73) + 74) + (fizz*buzz)) + 76) + 77) + fizz) + 79) + buzz) + fizz) + 82) + 83) + fizz) + buzz) + 86) + fizz) + 88) + 89) + (fizz*buzz)) + 91) + 92) + fizz) + 94) + buzz) + fizz) + 97) + 98) + fizz)
Success!

Key takeaways on the debugging methods Halide provides:

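1. Func.trace_stores() traces the values a Func stores at runtime
2. Func.parallel(y) computes along one dimension of the domain in parallel across multiple threads
3. print() prints the value of the expression you are interested in
4. print_when() prints the value only when the given condition is true, so it can also suppress output whenever the condition is false
5. Use a C++ output stream to print a complex expression and check that it was constructed as expected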