mit6.824-lab1 MapReduce

Ramblings

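Traditional parallel computing aims for this: add more machines and, with the data size unchanged, the computation finishes faster.

Distributed computing asks for something else: add more machines and you can handle more data.

In other words, the two start from different goals: one emphasizes high performance, the other scalability.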

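The MapReduce in this write-up is implemented in Go. This is my first time using Go, so the code style may be rough in places.
I have wanted to take this course for a long time. I recently found a little time, so I am starting this series and working through some of the labs. Before starting, be sure to read the paper and watch the lectures. Here is a link to a Chinese translation.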

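MapReduce is a software framework that makes it easier to write applications that run on very large clusters and process terabyte-scale data sets in parallel, in a reliable and fault-tolerant way.
Keywords: software framework, parallel processing, reliable and fault-tolerant, large-scale clusters, massive data sets.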

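Because this is my first time with Go, I am still unfamiliar with some of its language features (though most of them can be looked up as needed). Following the process in the lectures is easy enough, but many problems show up when turning it into code. The overall amount of code is not large; my implementation may be more complicated than it needs to be, which probably just shows how much of a beginner I am.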

Before you start

A record of how I ran the tests:

Getting the program running:

Run the coordinator

go run -race mrcoordinator.go pg-*.txt

Run the worker

go build -race -buildmode=plugin ../mrapps/wc.go
go run -race mrworker.go wc.so

Test scripts

bash test-mr.sh
bash test-mr-many.sh 200

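I did the whole lab on Linux. While debugging, an expert's blog gave very accurate analysis for locating the errors.

The archive downloaded from git contains, once unpacked, an installer for a fairly recent Go version, which you can use to set up the environment. Be careful with this step; my first attempt failed because the environment was not configured properly T_T.
Remember to follow the official lab guide and check that your environment is fully set up.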

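The key to understanding MapReduce is the figure in the paper, which I include here:

Section 3.1 of the paper, Execution Overview, describes the overall flow. Our job is to implement that flow in Go, following the paper.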

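On Linux, sort follows the current locale, and with a typical UTF-8 locale such as en_US.UTF-8 the ordering is largely case-insensitive. This lab needs case-sensitive ordering, which you can get with the following command: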

export LC_ALL=C
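
For comparison, Go's own string sorting is always byte-wise, which is the same ordering the C locale gives the sort command. A minimal sketch of that ordering; the helper name sortedCopy and the word list are made up for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy returns a copy of words in the byte-wise order used by sort.Strings.
func sortedCopy(words []string) []string {
	out := append([]string(nil), words...)
	sort.Strings(out)
	return out
}

func main() {
	// Uppercase ASCII letters have smaller byte values than lowercase ones,
	// so "About" and "Zebra" sort before "about" and "apple".
	fmt.Println(sortedCopy([]string{"apple", "Zebra", "about", "About"}))
}
```

This prints `[About Zebra about apple]`.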

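Lab walkthrough

1. Overall structure

From the lab guide, the work lives mainly in three files: coordinator.go, worker.go, and rpc.go. The file names alone give a good idea of what each one is responsible for.
There is exactly one coordinator and at least one worker. Workers run in parallel and talk to the coordinator over RPC. The coordinator assigns tasks (Map or Reduce) to workers, and each worker executes its task and writes the result to the designated files.
The coordinator also follows a scheduling rule: if a worker does not finish a task within a time limit, the task is reassigned to another worker, which keeps the system as a whole available.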
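
The reassignment rule can be sketched as a small task table with the three task states the paper describes (idle, in-progress, completed). This is only an illustration and not the structure used later in this post: the names taskState, task, nextTask, and demo are hypothetical, and the timeout is passed in as a parameter.

```go
package main

import (
	"fmt"
	"time"
)

type taskState int

const (
	idle taskState = iota
	inProgress
	completed
)

type task struct {
	state    taskState
	deadline time.Time // only meaningful while inProgress
}

// nextTask returns the index of a task that is idle, or in progress but past
// its deadline, and marks it as in progress. It returns -1 if none qualifies.
func nextTask(tasks []task, now time.Time, timeout time.Duration) int {
	for i := range tasks {
		t := &tasks[i]
		expired := t.state == inProgress && now.After(t.deadline)
		if t.state == idle || expired {
			t.state = inProgress
			t.deadline = now.Add(timeout)
			return i
		}
	}
	return -1
}

// demo hands out two tasks, finds nothing left, then retries after the deadline.
func demo() []int {
	tasks := make([]task, 2)
	start := time.Now()
	timeout := 10 * time.Second
	first := nextTask(tasks, start, timeout)
	second := nextTask(tasks, start, timeout)
	none := nextTask(tasks, start, timeout) // both tasks are in flight
	// Eleven seconds later both deadlines have passed, so a task is handed out again.
	retry := nextTask(tasks, start.Add(11*time.Second), timeout)
	return []int{first, second, none, retry}
}

func main() {
	fmt.Println(demo())
}
```

This prints `[0 1 -1 0]`: once the deadline passes, task 0 becomes eligible again even though its first worker never reported back.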

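2. RPC message structure

RPC messages fall into three groups: those related to Map tasks, those related to Reduce tasks, and those related to worker state.
The message types are split accordingly. For Map tasks there are three message formats: MapRequest, MapResponse, and MapTaskState. For Reduce tasks there are ReduceRequest, ReduceResponse, and ReduceTaskState. For worker state, only NReduce and WorkId are kept.

MapRequest and ReduceRequest do not need to carry any information: the only reason the coordinator receives one is to hand the sender a task, so it can give out any pending task without looking at the message content (for this lab, at least).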


type MapRequest struct {
	// random
}

type ReduceRequest struct {
	// random
}

type MapResponse struct {
	Filename string
	State string
}

type ReduceResponse struct {
	ReduceId int
	Filenames []string // array
	State string
}

type MapTaskState struct {
	Filename string
	WorkerId int
	TaskId int
	State string
}

type ReduceTaskState struct {
	ReduceId int
	State string
}

type WorkerInfo struct {
	NReduce int
	WorkId int
}
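
To show that an empty request struct is enough for an RPC, here is a minimal sketch using net/rpc over an in-memory pipe. It is not the lab's code: the types Demo, TaskRequest, and TaskReply, the helper askForTask, and the file name pg-example.txt are all hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// TaskRequest carries no fields: asking is the whole message.
type TaskRequest struct{}

// TaskReply is filled in by the server.
type TaskReply struct {
	Filename string
	State    string
}

// Demo is a stand-in for the coordinator.
type Demo struct{}

// Assign has the shape net/rpc requires: an exported method with two
// arguments (the second a pointer) and an error result.
func (d *Demo) Assign(req *TaskRequest, reply *TaskReply) error {
	reply.Filename = "pg-example.txt" // hypothetical input file
	reply.State = "assigned"
	return nil
}

// askForTask serves Demo on one end of an in-memory pipe and calls it from the other.
func askForTask() (TaskReply, error) {
	srv := rpc.NewServer()
	if err := srv.Register(&Demo{}); err != nil {
		return TaskReply{}, err
	}
	serverConn, clientConn := net.Pipe()
	go srv.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var reply TaskReply
	err := client.Call("Demo.Assign", &TaskRequest{}, &reply)
	return reply, err
}

func main() {
	reply, err := askForTask()
	fmt.Println(reply.Filename, reply.State, err)
}
```

The pipe only keeps the sketch self-contained; the worker in this post dials the coordinator over TCP instead.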

3. Coordinator Process

First, define the Coordinator struct:

var void interface{}
type stringArray []string // make a type

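type Coordinator struct {
	files        []string               // input files whose Map task is not yet assigned
	reduceId     int                    // next worker ID to hand out
	midFilesMap  map[int]stringArray    // reduceId -> intermediate file names
	midFilesList []int                  // reduce IDs not yet assigned
	mapSend      map[string]interface{} // in-flight Map tasks: filename -> *time.Timer
	reduceSend   map[int]interface{}    // in-flight Reduce tasks: reduceId -> *time.Timer
	nReduce      int
	Finish       bool       // true once every Reduce task is done
	mtx          sync.Mutex // guards all fields above
}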

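Some implementations online keep the sync.Mutex outside the struct, which should presumably work too. Here the mutex lives inside the Coordinator struct, so the lock travels with the Coordinator instance it protects.

Next, the method that assigns worker IDs: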

func (c *Coordinator) AssignWorkerId(i *int, woinfo *WorkerInfo) error {
    c.mtx.Lock()
    woinfo.WorkId = c.reduceId
    woinfo.NReduce = c.nReduce
    c.reduceId++
    c.mtx.Unlock()
    return nil
}
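
The pattern above, a mutex stored next to the data it guards, can be tried in isolation. The following is a minimal sketch with hypothetical names (idAllocator, uniqueIDs); it uses defer for the unlock, which is a common Go idiom rather than what the method above does.

```go
package main

import (
	"fmt"
	"sync"
)

// idAllocator keeps its mutex next to the counter it guards.
type idAllocator struct {
	mu   sync.Mutex
	next int
}

// Next returns a fresh ID. defer guarantees the unlock on every return path.
func (a *idAllocator) Next() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	id := a.next
	a.next++
	return id
}

// uniqueIDs requests n IDs from n goroutines and reports how many distinct IDs came back.
func uniqueIDs(n int) int {
	var a idAllocator
	ids := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(slot int) {
			defer wg.Done()
			ids[slot] = a.Next()
		}(i)
	}
	wg.Wait()

	seen := make(map[int]bool)
	for _, id := range ids {
		seen[id] = true
	}
	return len(seen)
}

func main() {
	fmt.Println(uniqueIDs(100)) // 100: no ID was handed out twice
}
```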

The methods that hand out Map tasks and Reduce tasks:

// return the first File's name
func (c *Coordinator) AssignMapTask(req *MapRequest, resp *MapResponse) error {
    c.mtx.Lock()
    if len(c.files) == 0 {
        resp.Filename = ""
        if len(c.mapSend) == 0 {
            resp.State = "done"
            // fmt.Println("map task done.")
        }
    } else {
        resp.Filename = c.files[0]
        c.files = c.files[1:]
        f := mapAfterFuncWrapper(c, resp.Filename)
        c.mapSend[resp.Filename] = time.AfterFunc(time.Second*20, f)
    }
    c.mtx.Unlock()
    // fmt.Printf("c.files: %v\n", c.files)
    // fmt.Printf("c.mapSend: %v\n", c.mapSend)
    return nil
}

// before reduce, return midFile's filename
func (c *Coordinator) AssignReduceTask(req *ReduceRequest, resp *ReduceResponse) error {
    c.mtx.Lock()
    if len(c.midFilesList) == 0 {
		resp.ReduceId = -1
        resp.Filenames = nil
        if len(c.reduceSend) == 0 {
            resp.State = "done"
            // fmt.Println("reduce task done")
        }
    } else {
        resp.ReduceId = c.midFilesList[0]
        c.midFilesList = c.midFilesList[1:]
        resp.Filenames = c.midFilesMap[resp.ReduceId]
        f := reduceAfterFuncWrapper(c, resp.ReduceId)
        c.reduceSend[resp.ReduceId] = time.AfterFunc(time.Second*20, f)
    }
    // fmt.Printf("c.midFiles: %v\n", c.midFilesList)
    // fmt.Printf("c.reduceSend: %v\n", c.reduceSend)
    c.mtx.Unlock()
    return nil
}

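A worker calls back after finishing a Map task so the coordinator can track progress. To guard against a worker crashing mid-task, every assignment starts a timeout (20 seconds in the code above); if it expires, the file is put back in the queue. Once all files have been processed, the coordinator holds the full list of intermediate file names.
Here is the MapTaskResp method: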

func (c *Coordinator) MapTaskResp(state *MapTaskState, resp *MapResponse) error {
    c.mtx.Lock()
    defer c.mtx.Unlock()

    // The task has reported back, so cancel its timeout timer; otherwise the
    // timer would still fire later and queue the file a second time.
    if t, ok := c.mapSend[state.Filename].(*time.Timer); ok {
        t.Stop()
    }
    delete(c.mapSend, state.Filename)

    if state.State == "done" {
        // Record the intermediate file this task produced for each reduce bucket.
        for i := 0; i < c.nReduce; i++ {
            name := fmt.Sprintf("%s-%d-%d_%d", "mr-mid", state.WorkerId, state.TaskId, i)
            // append also works for a missing key: it starts from a nil slice.
            c.midFilesMap[i] = append(c.midFilesMap[i], name)
        }
    } else {
        // Failed: put the file back so it can be assigned again.
        c.files = append(c.files, state.Filename)
    }
    return nil
}

ReduceStateResp works the same way:

func (c *Coordinator) ReduceStateResp(state *ReduceTaskState, resp *ReduceResponse) error {
    c.mtx.Lock()
    defer c.mtx.Unlock()

    // Cancel the timeout timer so a task that reported back is not queued again.
    if t, ok := c.reduceSend[state.ReduceId].(*time.Timer); ok {
        t.Stop()
    }
    delete(c.reduceSend, state.ReduceId)

    if state.State == "done" {
        // Nothing in flight and nothing left to assign: the whole job is done.
        if len(c.reduceSend) == 0 && len(c.midFilesList) == 0 {
            c.Finish = true
        }
    } else {
        // Failed: queue the reduce ID again.
        c.midFilesList = append(c.midFilesList, state.ReduceId)
    }
    return nil
}

The Done method, which reports whether the job has finished:

func (c *Coordinator) Done() bool {
    c.mtx.Lock()
    ret := c.Finish
    c.mtx.Unlock()
    return ret
}

MakeCoordinator builds the initial Coordinator struct:

func MakeCoordinator(files []string, nReduce int) *Coordinator {
    c := Coordinator{}
    c.nReduce = nReduce
    c.files = files
    c.midFilesMap = map[int]stringArray{}
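    // One pending Reduce task per bucket, 0..nReduce-1, instead of a fixed 0..9.
    for i := 0; i < nReduce; i++ {
        c.midFilesList = append(c.midFilesList, i)
    }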
    c.mtx = sync.Mutex{}
    c.mapSend = make(map[string]interface{})
    c.reduceSend = make(map[int]interface{})
    c.server()
    return &c
}

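Finally, handling crash exit.
This point had me stuck for a long time, and I only understood it after reading other people's code. The key is that a task may be assigned more than once (this has to be supported). If two workers both finish the same task, check whether its result has already been recorded and, if so, simply ignore the duplicate.

The retry side is implemented by mapAfterFuncWrapper and reduceAfterFuncWrapper (the calls to them appear above):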

func mapAfterFuncWrapper(c *Coordinator, filename string) func() {
    return func() {
        c.mtx.Lock()
        fmt.Printf("map task %v timed out, retrying\n", filename)
        c.files = append(c.files, filename)
        c.mtx.Unlock()
    }
}

func reduceAfterFuncWrapper(c *Coordinator, reduceId int) func() {
    return func() {
        c.mtx.Lock()
        fmt.Printf("reduce task %v timed out, retrying\n", reduceId)
        c.midFilesList = append(c.midFilesList, reduceId)
        c.mtx.Unlock()
    }
}
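
The duplicate check described above (ignore a completion whose result is already recorded) could look roughly like this. It is a sketch under my own assumptions, not the code from this lab: completionLog and Record are hypothetical names, and tasks are identified by their input file name.

```go
package main

import (
	"fmt"
	"sync"
)

// completionLog remembers which tasks have already been reported as done.
type completionLog struct {
	mu      sync.Mutex
	done    map[string]bool
	outputs []string
}

// Record stores the outputs for a task the first time it completes and
// returns true. A later report for the same task is ignored and returns false.
func (l *completionLog) Record(task string, files []string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.done[task] {
		return false
	}
	l.done[task] = true
	l.outputs = append(l.outputs, files...)
	return true
}

func main() {
	l := &completionLog{done: make(map[string]bool)}
	fmt.Println(l.Record("pg-a.txt", []string{"mr-mid-0-0_0"})) // true
	fmt.Println(l.Record("pg-a.txt", []string{"mr-mid-1-0_0"})) // false: duplicate
	fmt.Println(len(l.outputs))                                 // 1
}
```

Only the first worker's files are kept, so a Reduce task never reads the same Map output twice.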

4. Worker implementation

Keys are mapped to reduce tasks by hashing (a bit like how a cache picks a slot):

func ihash(key string) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32() & 0x7fffffff)
}
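
A small sketch of how ihash is used to pick a reduce bucket. The helper partition and the sample keys are made up for illustration; the point is that equal keys always land in the same bucket.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ihash is the hash function from the lab skeleton, repeated so the sketch runs on its own.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// partition groups keys by ihash(key) % nReduce.
func partition(keys []string, nReduce int) map[int][]string {
	buckets := make(map[int][]string)
	for _, k := range keys {
		b := ihash(k) % nReduce
		buckets[b] = append(buckets[b], k)
	}
	return buckets
}

func main() {
	keys := []string{"apple", "banana", "apple", "cherry"}
	for b, ks := range partition(keys, 3) {
		fmt.Println(b, ks)
	}
}
```

Both occurrences of "apple" end up in one bucket, which is what lets a single Reduce task see every value for a key.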

Next, producing the intermediate files. As the lab requires, key/value pairs are JSON-encoded when written and decoded when read back:

func WorkerMap(mapf func(string, string) []KeyValue, c *rpc.Client) {
    retryTimes := 0
    taskid := 0
    for retryTimes < 3 {
        req := MapRequest{}
        resp := MapResponse{}
        if err := c.Call("Coordinator.AssignMapTask", &req, &resp); err != nil {
            fmt.Printf("AssignMapTask: %v\n", err)
            time.Sleep(time.Second)
            retryTimes++
            continue
        }
        retryTimes = 0
        if resp.State == "done" {
            return
        }
        if resp.Filename == "" {
            // Nothing to hand out right now, but other Map tasks are still running.
            time.Sleep(time.Second)
            continue
        }

        req2 := MapTaskState{resp.Filename, workerInfo.WorkId, taskid, "done"}
        resp2 := MapResponse{}

        // log.Fatal would exit the process before the failure could be
        // reported, so log the error and tell the coordinator instead.
        f, err := os.Open(resp.Filename)
        if err != nil {
            log.Printf("cannot open %v: %v", resp.Filename, err)
            req2.State = "nosuchfile"
            c.Call("Coordinator.MapTaskResp", &req2, &resp2)
            continue
        }
        content, err := io.ReadAll(f)
        f.Close() // close now; a defer inside the loop would pile up until return
        if err != nil {
            log.Printf("cannot read %v: %v", resp.Filename, err)
            req2.State = "filereaderr"
            c.Call("Coordinator.MapTaskResp", &req2, &resp2)
            continue
        }

        kvs := mapf(resp.Filename, string(content))

        // Write to temp files first. The final name is
        // mr-mid-{workid}-{taskid}_{reduceid}, set by the rename below;
        // the Reduce side selects files by reduceid.
        midFiles := []*os.File{}
        encs := []*json.Encoder{}
        for i := 0; i < workerInfo.NReduce; i++ {
            f, err := os.CreateTemp("", "di-mp")
            if err != nil {
                log.Fatalf("cannot create temp file: %v", err)
            }
            midFiles = append(midFiles, f)
            encs = append(encs, json.NewEncoder(f))
        }

        for _, kv := range kvs {
            encs[ihash(kv.Key)%workerInfo.NReduce].Encode(kv)
        }

        for i, f := range midFiles {
            name := fmt.Sprintf("%s-%d-%d_%d", "mr-mid", workerInfo.WorkId, taskid, i)
            f.Close()
            os.Rename(f.Name(), name)
        }

        // Report completion to the coordinator.
        if err := c.Call("Coordinator.MapTaskResp", &req2, &resp2); err != nil {
            fmt.Printf("MapTaskResp: %v\n", err)
        }
        taskid++
    }
}
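
The write path above (JSON-encode into a temp file, rename to the final name, decode later) can be exercised on its own. This is a minimal sketch with hypothetical helpers writeAtomically, readAll, and roundTrip. Unlike the worker above, it creates the temp file in the destination directory, on the assumption that keeping both paths in one directory avoids a rename across filesystems.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

type KeyValue struct {
	Key   string
	Value string
}

// writeAtomically JSON-encodes kvs into a temp file in dir, then renames it
// to name, so a reader sees either no file or a complete one.
func writeAtomically(dir, name string, kvs []KeyValue) error {
	tmp, err := os.CreateTemp(dir, "mr-tmp-*")
	if err != nil {
		return err
	}
	enc := json.NewEncoder(tmp)
	for _, kv := range kvs {
		if err := enc.Encode(&kv); err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
			return err
		}
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), filepath.Join(dir, name))
}

// readAll decodes KeyValue records from the file until EOF.
func readAll(path string) ([]KeyValue, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var kvs []KeyValue
	dec := json.NewDecoder(f)
	for {
		var kv KeyValue
		if err := dec.Decode(&kv); err == io.EOF {
			return kvs, nil
		} else if err != nil {
			return nil, err
		}
		kvs = append(kvs, kv)
	}
}

// roundTrip writes kvs into a fresh temp directory and reads them back.
func roundTrip(kvs []KeyValue) ([]KeyValue, error) {
	dir, err := os.MkdirTemp("", "mr-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := writeAtomically(dir, "mr-mid-0-0_0", kvs); err != nil {
		return nil, err
	}
	return readAll(filepath.Join(dir, "mr-mid-0-0_0"))
}

func main() {
	got, err := roundTrip([]KeyValue{{"a", "1"}, {"b", "1"}})
	fmt.Println(got, err)
}
```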

Similarly, here is the WorkerReduce method:

func WorkerReduce(reducef func(string, []string) string, client *rpc.Client) {
    retryTimes := 0
    for retryTimes < 3 {
        req := ReduceRequest{}
        var resp ReduceResponse
        if err := client.Call("Coordinator.AssignReduceTask", &req, &resp); err != nil {
            fmt.Printf("AssignReduceTask: %v\n", err)
            time.Sleep(time.Second)
            retryTimes++
            continue
        }
        retryTimes = 0
        if resp.State == "done" {
            return
        }
        if resp.ReduceId == -1 {
            // Nothing to hand out right now, but other Reduce tasks are still running.
            time.Sleep(time.Second)
            continue
        }

        req2 := ReduceTaskState{resp.ReduceId, ""}
        resp2 := ReduceResponse{}

        // Collect every key/value pair from this bucket's intermediate files.
        kva := []KeyValue{}
        for _, filename := range resp.Filenames {
            f, err := os.Open(filename)
            if err != nil {
                fmt.Println("mid file open wrong:", err)
                req2.State = "nosuchfile"
                break
            }
            d := json.NewDecoder(f)
            for {
                var kv KeyValue
                if err := d.Decode(&kv); err != nil {
                    if err != io.EOF {
                        fmt.Println("json parse:", err)
                        req2.State = "jsonparseerr"
                    }
                    break
                }
                kva = append(kva, kv)
            }
            f.Close()
            if req2.State != "" {
                break
            }
        }
        if req2.State != "" {
            // Report the failure so the coordinator queues the task again.
            client.Call("Coordinator.ReduceStateResp", &req2, &resp2)
            continue
        }

        // Sort by key, then call reducef once per run of equal keys.
        sort.Sort(byKey(kva))
        outtmpfile, err := os.CreateTemp("", "di-out")
        if err != nil {
            log.Fatalf("cannot create temp file: %v", err)
        }
        i := 0
        for i < len(kva) {
            j := i + 1
            for j < len(kva) && kva[i].Key == kva[j].Key {
                j++
            }
            vv := []string{}
            for k := i; k < j; k++ {
                vv = append(vv, kva[k].Value)
            }
            s := reducef(kva[i].Key, vv)
            fmt.Fprintf(outtmpfile, "%v %v\n", kva[i].Key, s)
            i = j
        }
        outtmpfile.Close()

        // Publish the output under its final name before reporting "done",
        // so the coordinator cannot finish while the file is still missing.
        name := fmt.Sprintf("%s-%d", "mr-out", resp.ReduceId)
        os.Rename(outtmpfile.Name(), name)

        req2.State = "done"
        if err := client.Call("Coordinator.ReduceStateResp", &req2, &resp2); err != nil {
            fmt.Printf("ReduceStateResp: %v\n", err)
        }
    }
}
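
The sort-then-group loop at the heart of the Reduce step is easier to follow on in-memory data. A minimal sketch, assuming a word-count style reduce function; reduceAll and count are hypothetical names and the input pairs are made up:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

type KeyValue struct {
	Key   string
	Value string
}

// reduceAll sorts kva by key and calls reducef once per distinct key,
// passing all values collected for that key. It returns "key result" lines.
func reduceAll(kva []KeyValue, reducef func(string, []string) string) []string {
	sort.Slice(kva, func(a, b int) bool { return kva[a].Key < kva[b].Key })
	var lines []string
	for i := 0; i < len(kva); {
		// Find the end of the run of pairs that share kva[i].Key.
		j := i + 1
		for j < len(kva) && kva[j].Key == kva[i].Key {
			j++
		}
		values := make([]string, 0, j-i)
		for k := i; k < j; k++ {
			values = append(values, kva[k].Value)
		}
		lines = append(lines, kva[i].Key+" "+reducef(kva[i].Key, values))
		i = j
	}
	return lines
}

// count is a word-count style reduce function: the number of values for the key.
func count(key string, values []string) string {
	return strconv.Itoa(len(values))
}

func main() {
	kva := []KeyValue{{"b", "1"}, {"a", "1"}, {"b", "1"}}
	fmt.Println(reduceAll(kva, count)) // [a 1 b 2]
}
```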

The tests actually call the Worker function, so we wire the methods above into it:

func Worker(mapf func(string, string) []KeyValue, reducef func(string, []string) string) {
    i := 0
    c, err := rpc.DialHTTP("tcp", "127.0.0.1:1234")
    if err != nil {
        // Without this check a failed dial leaves c nil and c.Close() panics.
        log.Fatalf("dialing coordinator: %v", err)
    }
    defer c.Close()

    c.Call("Coordinator.AssignWorkerId", &i, &workerInfo)
    time.Sleep(time.Second)
    WorkerMap(mapf, c)
    WorkerReduce(reducef, c)
}

5. Results

A placeholder for now; details to be filled in later.

At first the crash exit test would not pass.

After the fixes, it was ALL PASS.
