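A Deep Dive into Go Slices (3): How append Works

Preface

The previous two articles covered the low-level process of constructing a slice with make and by slicing an array or another slice. This article covers how append produces a new slice.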

Slice append

append func

// The append built-in function appends elements to the end of a slice. If
// it has sufficient capacity, the destination is resliced to accommodate the
// new elements. If it does not, a new underlying array will be allocated.
// Append returns the updated slice. It is therefore necessary to store the
// result of append, often in the variable holding the slice itself:
//  slice = append(slice, elem1, elem2)
//  slice = append(slice, anotherSlice...)
// As a special case, it is legal to append a string to a byte slice, like this:
//  slice = append([]byte("hello "), "world"...)
func append(slice []Type, elems ...Type) []Type

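The above is the documentation of the append func in builtin/builtin.go. append returns an updated slice, so its result must be stored.

As we know, append adds one or more values to a slice, and those values are stored in the array the slice refers to.
An array has a fixed length, so it can hold only a limited amount of data. If the remaining space is large enough for the appended values, they are simply written into the array. Once the total length after appending exceeds the array length, the original array can no longer hold the new data. What happens then?

runtime/slice.go contains only the growth function growslice. It is called mainly from cmd/compile/internal/gc/walk.go, where the handling is fairly complex. Let's first look at the Append source in reflect/value.go, where the whole process is complete and simple.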
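A minimal sketch of why the result has to be stored. The helper names appendLost and appendKept are made up for this illustration. A slice is passed by value, so an append inside a function changes only the callee's copy of the slice header unless the result is returned.

```go
package main

import "fmt"

// appendLost appends to its own copy of the slice header and drops the
// result, so the caller's len never changes.
func appendLost(s []int) {
	s = append(s, 42)
	_ = s
}

// appendKept returns the updated slice header so the caller can store it.
func appendKept(s []int) []int {
	return append(s, 42)
}

func main() {
	s := make([]int, 0, 4)

	appendLost(s)
	fmt.Println(len(s)) // 0

	s = appendKept(s)
	fmt.Println(len(s), s[0]) // 1 42
}
```

In appendLost the value 42 is still written into the shared array, but the caller's slice keeps len 0 and never sees it.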

reflect Append

// Append appends the values x to a slice s and returns the resulting slice.
// As in Go, each x's value must be assignable to the slice's element type.
func Append(s Value, x ...Value) Value {
    s.mustBe(Slice)
    s, i0, i1 := grow(s, len(x))
    for i, j := i0, 0; i < i1; i, j = i+1, j+1 {
        s.Index(i).Set(x[j])
    }
    return s
}
// grow grows the slice s so that it can hold extra more values, allocating
// more capacity if needed. It also returns the old and new slice lengths.
func grow(s Value, extra int) (Value, int, int) {
    i0 := s.Len()
    i1 := i0 + extra
    if i1 < i0 {
        panic("reflect.Append: slice overflow")
    }
    m := s.Cap()
    if i1 <= m {
        return s.Slice(0, i1), i0, i1
    }
    if m == 0 {
        m = extra
    } else {
        for m < i1 {
            if i0 < 1024 {
                m += m
            } else {
                m += m / 4
            }
        }
    }
    t := MakeSlice(s.Type(), i1, m)
    Copy(t, s)
    return t, i0, i1
}

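The Append process is as follows:

1. Compute the total length i1 after appending from the current slice length i0. If i1 overflows, panic.
2. If i1 is less than or equal to the slice's cap (the length of the underlying array), return the original slice resliced to [0:i1].
3. Otherwise the current underlying array cannot hold all the appended data, and the slice has to grow:
- If the current cap is 0, use the number of appended values as the new cap.
- Otherwise grow the cap m step by step until it is no smaller than the total length i1: if the original length i0 < 1024, double m; otherwise increase m by 1/4.
- Build a new slice with length i1 and cap m.
- Copy the data of the original slice into the new slice and return the new slice.
4. Store the appended values at positions i0 to i1-1.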
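The two branches above (reslice in place versus allocate and copy) can be observed through the public reflect API. This is a small sketch. The wrapper appendInts is a hypothetical helper, and only reflect.ValueOf, reflect.Append and Value.Interface are real library calls.

```go
package main

import (
	"fmt"
	"reflect"
)

// appendInts appends extra to s through reflect.Append and converts the
// result back to []int.
func appendInts(s []int, extra ...int) []int {
	vals := make([]reflect.Value, len(extra))
	for i, x := range extra {
		vals[i] = reflect.ValueOf(x)
	}
	return reflect.Append(reflect.ValueOf(s), vals...).Interface().([]int)
}

func main() {
	s := make([]int, 3, 5)

	a := appendInts(s, 9) // 3+1 <= cap 5: resliced, same array
	fmt.Println(len(a), &a[0] == &s[0]) // 4 true

	b := appendInts(s, 1, 2, 3) // 3+3 > cap 5: new array allocated
	fmt.Println(len(b), &b[0] == &s[0]) // 6 false
}
```

The example does not print the cap of b, because the exact value depends on the growth rule of the Go release in use.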

append

The actual handling of an append call lives in cmd/compile/internal/gc/walk.go. The core code is:

// Node ops.
const (
    OXXX Op = iota
    ...
    OAPPEND       // append(List); after walk, Left may contain elem type descriptor
    ...
)
...
...
case OAPPEND:
    // x = append(...)
    r := n.Right
    if r.Type.Elem().NotInHeap() {
        yyerror("%v is go:notinheap; heap allocation disallowed", r.Type.Elem())
    }
    switch {
    case isAppendOfMake(r):
        // x = append(y, make([]T, y)...)
        r = extendslice(r, init)
    case r.IsDDD():
        r = appendslice(r, init) // also works for append(slice, string).
    default:
        r = walkappend(r, init, n)
    }
    n.Right = r
    if r.Op == OAPPEND {
        // Left in place for back end.
        // Do not add a new write barrier.
        // Set up address of type for back end.
        r.Left = typename(r.Type.Elem())
        break opswitch
    }
    // Otherwise, lowered for race detector.
    // Treat as ordinary assignment.
    }
...

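As shown, append is handled in three different ways:

1. extendslice
Handles calls of the form:

    append(x, make([]T, y)...)

That is, an append whose argument is a freshly made slice.
2. appendslice
Handles calls of the form:

    append(l1, l2...)

That is, appending one slice to another.
3. walkappend
Handles calls of the form:

    append(src, a, b, c)

That is, appending one or more individual elements.

The three differ slightly. This article uses appendslice as the example. If you are interested in the other two, you can read their source yourself.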
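For reference, here is what the three call forms look like in user code, plus the string special case from the builtin documentation. This is only a sketch. The function forms is a made-up helper, and the comments on which compiler path handles each call are an assumption based on the walk.go excerpt above.

```go
package main

import "fmt"

// forms returns the result of each append call form on the same input.
func forms() (a, b, c []int, d string) {
	s := []int{1, 2} // len 2, cap 2: every append below must allocate
	t := []int{3, 4}

	a = append(s, make([]int, 3)...) // append-of-make form (extendslice in the excerpt)
	b = append(s, t...)              // slice form (appendslice in the excerpt)
	c = append(s, 5, 6, 7)           // individual elements (walkappend in the excerpt)

	// Special case: a string may be appended to a byte slice.
	d = string(append([]byte("hello "), "world"...))
	return
}

func main() {
	a, b, c, d := forms()
	fmt.Println(a) // [1 2 0 0 0]
	fmt.Println(b) // [1 2 3 4]
	fmt.Println(c) // [1 2 5 6 7]
	fmt.Println(d) // hello world
}
```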

appendslice

// expand append(l1, l2...) to
//   init {
//     s := l1
//     n := len(s) + len(l2)
//     // Compare as uint so growslice can panic on overflow.
//     if uint(n) > uint(cap(s)) {
//       s = growslice(s, n)
//     }
//     s = s[:n]
//     memmove(&s[len(l1)], &l2[0], len(l2)*sizeof(T))
//   }
//   s
//
// l2 is allowed to be a string.
func appendslice(n *Node, init *Nodes) *Node {
    walkAppendArgs(n, init)

    l1 := n.List.First()
    l2 := n.List.Second()

    var nodes Nodes

    // var s []T
    s := temp(l1.Type)
    nodes.Append(nod(OAS, s, l1)) // s = l1

    elemtype := s.Type.Elem()

    // n := len(s) + len(l2)
    nn := temp(types.Types[TINT])
    nodes.Append(nod(OAS, nn, nod(OADD, nod(OLEN, s, nil), nod(OLEN, l2, nil))))

    // if uint(n) > uint(cap(s))
    nif := nod(OIF, nil, nil)
    nuint := conv(nn, types.Types[TUINT])
    scapuint := conv(nod(OCAP, s, nil), types.Types[TUINT])
    nif.Left = nod(OGT, nuint, scapuint)

    // instantiate growslice(typ *type, []any, int) []any
    fn := syslook("growslice")
    fn = substArgTypes(fn, elemtype, elemtype)

    // s = growslice(T, s, n)
    nif.Nbody.Set1(nod(OAS, s, mkcall1(fn, s.Type, &nif.Ninit, typename(elemtype), s, nn)))
    nodes.Append(nif)

    // s = s[:n]
    nt := nod(OSLICE, s, nil)
    nt.SetSliceBounds(nil, nn, nil)
    nt.SetBounded(true)
    nodes.Append(nod(OAS, s, nt))

    var ncopy *Node
    if elemtype.HasHeapPointer() {
        // copy(s[len(l1):], l2)
        nptr1 := nod(OSLICE, s, nil)
        nptr1.SetSliceBounds(nod(OLEN, l1, nil), nil, nil)

        nptr2 := l2

        Curfn.Func.setWBPos(n.Pos)

        // instantiate typedslicecopy(typ *type, dst any, src any) int
        fn := syslook("typedslicecopy")
        fn = substArgTypes(fn, l1.Type, l2.Type)
        ncopy = mkcall1(fn, types.Types[TINT], &nodes, typename(elemtype), nptr1, nptr2)

    } else if instrumenting && !compiling_runtime {
        // rely on runtime to instrument copy.
        // copy(s[len(l1):], l2)
        nptr1 := nod(OSLICE, s, nil)
        nptr1.SetSliceBounds(nod(OLEN, l1, nil), nil, nil)

        nptr2 := l2

        if l2.Type.IsString() {
            // instantiate func slicestringcopy(to any, fr any) int
            fn := syslook("slicestringcopy")
            fn = substArgTypes(fn, l1.Type, l2.Type)
            ncopy = mkcall1(fn, types.Types[TINT], &nodes, nptr1, nptr2)
        } else {
            // instantiate func slicecopy(to any, fr any, wid uintptr) int
            fn := syslook("slicecopy")
            fn = substArgTypes(fn, l1.Type, l2.Type)
            ncopy = mkcall1(fn, types.Types[TINT], &nodes, nptr1, nptr2, nodintconst(elemtype.Width))
        }

    } else {
        // memmove(&s[len(l1)], &l2[0], len(l2)*sizeof(T))
        nptr1 := nod(OINDEX, s, nod(OLEN, l1, nil))
        nptr1.SetBounded(true)
        nptr1 = nod(OADDR, nptr1, nil)

        nptr2 := nod(OSPTR, l2, nil)

        nwid := cheapexpr(conv(nod(OLEN, l2, nil), types.Types[TUINTPTR]), &nodes)
        nwid = nod(OMUL, nwid, nodintconst(elemtype.Width))

        // instantiate func memmove(to *any, frm *any, length uintptr)
        fn := syslook("memmove")
        fn = substArgTypes(fn, elemtype, elemtype)
        ncopy = mkcall1(fn, nil, &nodes, nptr1, nptr2, nwid)
    }
    ln := append(nodes.Slice(), ncopy)

    typecheckslice(ln, ctxStmt)
    walkstmtlist(ln)
    init.Append(ln...)
    return s
}

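The code is fairly involved, but the comment above the func provides excellent pseudocode for the process, and the func is simply an implementation of that pseudocode. Reading the two together, the overall process is:

1. Compute the total length n of the slice after appending.
2. If n is greater than the original cap, call growslice to grow the slice (the new cap is at least n; the exact rule is described under growslice).
3. Reslice the grown slice to length n to obtain the slice s that will hold all the data.
4. Depending on the element type, call the matching copy routine to copy the appended slice's data into s after the original elements (growslice has already carried the original elements over if a new array was allocated).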
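The pseudocode can be mirrored in ordinary Go. This is a rough user-level analogue, not the compiler's code. The function appendSlice is hypothetical, make plus copy stands in for growslice (which normally picks a larger cap than n), and the builtin copy stands in for memmove or typedslicecopy.

```go
package main

import "fmt"

// appendSlice follows the steps of the appendslice pseudocode for []int.
func appendSlice(l1, l2 []int) []int {
	s := l1
	n := len(s) + len(l2)
	if n > cap(s) {
		// Stand-in for growslice: allocate exactly n and copy the old data.
		grown := make([]int, len(s), n)
		copy(grown, s)
		s = grown
	}
	s = s[:n]             // s = s[:n]
	copy(s[len(l1):], l2) // write l2 after the original elements
	return s
}

func main() {
	base := make([]int, 2, 8)
	base[0], base[1] = 1, 2

	inPlace := appendSlice(base, []int{3, 4})
	fmt.Println(inPlace, &inPlace[0] == &base[0]) // [1 2 3 4] true

	grown := appendSlice([]int{1, 2}, []int{3, 4})
	fmt.Println(grown, len(grown)) // [1 2 3 4] 4
}
```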

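extendslice and walkappend also call growslice, so let's now look at growslice in detail.

growslice

growslice is called during append when the remaining space of the original slice cannot hold the appended elements. The cap argument passed in is the total length of the slice after the elements are appended.

Note: the cap passed to the func is not necessarily the final cap of the grown slice. The source shows why.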

// growslice handles slice growth during append.
// It is passed the slice element type, the old slice, and the desired new minimum capacity,
// and it returns a new slice with at least that capacity, with the old data
// copied into it.
// The new slice's length is set to the old slice's length,
// NOT to the new requested capacity.
// This is for codegen convenience. The old slice's length is used immediately
// to calculate where to write new values during an append.
// TODO: When the old backend is gone, reconsider this decision.
// The SSA backend might prefer the new length or to return only ptr/cap and save stack space.
func growslice(et *_type, old slice, cap int) slice {
    if raceenabled {
        callerpc := getcallerpc()
        racereadrangepc(old.array, uintptr(old.len*int(et.size)), callerpc, funcPC(growslice))
    }
    if msanenabled {
        msanread(old.array, uintptr(old.len*int(et.size)))
    }

    if cap < old.cap {
        panic(errorString("growslice: cap out of range"))
    }

    if et.size == 0 {
        // append should not create a slice with nil pointer but non-zero len.
        // We assume that append doesn't need to preserve old.array in this case.
        return slice{unsafe.Pointer(&zerobase), old.len, cap}
    }

    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        if old.len < 1024 {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }
    ...
    ...
    if overflow || capmem > maxAlloc {
        panic(errorString("growslice: cap out of range"))
    }

    var p unsafe.Pointer
    if et.ptrdata == 0 {
        p = mallocgc(capmem, nil, false)
        // The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
        // Only clear the part that will not be overwritten.
        memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
    } else {
        // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
        p = mallocgc(capmem, et, true)
        if lenmem > 0 && writeBarrier.enabled {
            // Only shade the pointers in old.array since we know the destination slice p
            // only contains nil pointers because it has been cleared during alloc.
            bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem)
        }
    }
    memmove(p, old.array, lenmem) // copy the old data into the newly allocated array

    return slice{p, old.len, newcap}
}

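Growth logic:

1. Double the original cap to get doublecap.
2. If the requested cap is greater than doublecap, use the requested cap. Otherwise:
- if the original length is less than 1024, use doublecap;
- otherwise grow the original cap by 1/4 repeatedly until it is no smaller than the requested cap.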
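The rule can be isolated into a small pure function. This sketch mirrors only the part of the listing quoted above. The name nextCap is made up, and the steps elided in the listing, which determine the final allocated size, are not modelled, so the cap observed at runtime may be larger. Other Go releases may also use a different threshold or formula.

```go
package main

import "fmt"

// nextCap reproduces the capacity calculation from the quoted growslice
// listing: oldLen and oldCap describe the old slice, want is the minimum
// capacity requested by append.
func nextCap(oldLen, oldCap, want int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if want > doublecap {
		return want
	}
	if oldLen < 1024 {
		return doublecap
	}
	// Grow by a quarter until large enough; 0 < newcap guards against
	// overflow and an infinite loop.
	for 0 < newcap && newcap < want {
		newcap += newcap / 4
	}
	if newcap <= 0 {
		newcap = want
	}
	return newcap
}

func main() {
	fmt.Println(nextCap(0, 0, 5))          // 5: want exceeds doublecap
	fmt.Println(nextCap(3, 5, 7))          // 10: short slice, doubled
	fmt.Println(nextCap(1024, 1024, 1025)) // 1280: grown by 1/4
	fmt.Println(nextCap(2000, 2000, 2001)) // 2500: grown by 1/4
}
```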

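Overall growth process:

1. Compute the new cap from the original slice's cap and the requested cap.
2. Allocate memory for the computed cap (create a new array).
3. Copy the original slice's data into the new memory (the new array).
4. Return the new slice. It points at the new array, its len is the original slice's len, and its cap is the grown cap.

In everyday use slices are relatively short, so append usually grows using doublecap.

Compared with reflect's Append, the main difference is that growslice grows to a minimum capacity chosen by its caller, while Append (through grow) works out the required capacity itself from the slice's current length and the number of extra values. grow also keeps doubling in a loop, whereas growslice jumps straight to the requested cap when that exceeds doublecap. Apart from that, the two follow essentially the same rules, so reflect's grow can be read as a simplified restatement of growslice.

append produces a new slice, and the result must be assigned back to the original slice variable for that variable to see the update.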

How append changes the underlying data

With the append process above in mind, what are the contents of data and slice after each of the two append calls below?

data := [10]int{}
slice := data[5:8]
slice = append(slice, 9)          // slice=? data=?
slice = append(slice, 10, 11, 12) // slice=? data=?

The answer is:

// after the first append
slice=[0 0 0 9]
data=[0 0 0 0 0 0 0 0 9 0]
// after the second append
slice=[0 0 0 9 10 11 12]
data=[0 0 0 0 0 0 0 0 9 0]

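The first append changed the original data, but the second did not. Why?

Before any append, slice has len 3 and cap 5 (data has 10 elements and the slice starts at index 5). The first append adds one element, which does not exceed the cap, so the value is written directly into the array at data[8]. The second append adds three elements. The resulting length would be 7, which exceeds the cap of 5, so the slice has to grow. Growing allocates a new array, copies the slice's four existing elements into it, and then stores the appended values. Only the new array changes, so the original array data is unaffected.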
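The switch from the shared array to a new one can be checked by comparing element addresses. This is a small sketch, and the helper sharesArray is made up for this example.

```go
package main

import "fmt"

// sharesArray reports whether s still starts at data[5], i.e. whether it
// is still backed by the original array.
func sharesArray(s []int, data *[10]int) bool {
	return len(s) > 0 && &s[0] == &data[5]
}

func main() {
	data := [10]int{}
	slice := data[5:8]
	fmt.Println(len(slice), cap(slice), sharesArray(slice, &data)) // 3 5 true

	slice = append(slice, 9) // fits in cap: written into data[8]
	fmt.Println(len(slice), sharesArray(slice, &data), data[8]) // 4 true 9

	slice = append(slice, 10, 11, 12) // 7 > cap 5: moved to a new array
	fmt.Println(len(slice), sharesArray(slice, &data), data[9]) // 7 false 0
}
```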

append can therefore modify the original data, so use it with care. When necessary, copy the original data first and then operate on the copy.
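One way to follow that advice is sketched here. The helper appendIsolated is hypothetical. It copies the slice into a fresh array sized for the result and appends to the copy, so the source array is never written to.

```go
package main

import "fmt"

// appendIsolated copies s into a new array before appending, leaving the
// array that s came from untouched.
func appendIsolated(s []int, elems ...int) []int {
	c := make([]int, len(s), len(s)+len(elems))
	copy(c, s)
	return append(c, elems...)
}

func main() {
	data := [10]int{}
	slice := data[5:8]

	safe := appendIsolated(slice, 9)
	fmt.Println(safe, data[8]) // [0 0 0 9] 0

	shared := append(slice, 9) // plain append writes through to data
	fmt.Println(shared, data[8]) // [0 0 0 9] 9
}
```

The copy costs one allocation per call, so it is worth doing only when the caller must not observe the write.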

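Summary

This article traced how append works through both the reflection and the non-reflection source code. Comparing them shows that the two follow the same logic. That suggests a useful approach: when the direct implementation is hard to follow, try reading its reflection counterpart, which may be easier to understand.

Finally, the main points of this article are:

1. A slice value is not itself a pointer. Appending changes the contents of the underlying array (or replaces the array), and changes len and possibly cap, so append has to return the new slice.
2. When appending, if the current cap can hold the new elements they are stored directly. Otherwise the slice grows: a new underlying array is created, the original elements are copied over, and then the appended elements are stored.
3. Growing the cap means reallocating memory and copying data. To keep append efficient, declare the cap up front whenever it can be estimated, so that later growth is avoided.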
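The effect of declaring cap up front can be counted directly. In this sketch the function fill is a made-up helper that appends n values and counts how often the cap changed, which happens only when append had to move to a new array.

```go
package main

import "fmt"

// fill appends 0..n-1 to a slice and counts how many appends changed the
// cap, i.e. how many times the slice had to grow.
func fill(n int, prealloc bool) (s []int, grows int) {
	if prealloc {
		s = make([]int, 0, n) // reserve room for all n elements at once
	}
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			grows++
		}
	}
	return s, grows
}

func main() {
	_, withCap := fill(10000, true)
	_, withoutCap := fill(10000, false)
	fmt.Println(withCap)        // 0
	fmt.Println(withoutCap > 0) // true
}
```

The exact number of grows without preallocation depends on the growth rule of the Go release in use, so the example prints only whether any occurred.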
