Golang: Append vs Copy

Working with data structures is part of our daily routine, and most of the time, developers do not care what is happening under the hood. We add or delete elements, sort them or iterate through them to find what we need. But let’s see how our daily code changes might impact performance.

While working on the article, I wanted to get and then share with you my findings. It’s something that most devs know, but who am I ;) a guy with no CS degree… so don’t judge me

Array vs Slice

An array has a predefined length, and it can NOT grow.

Array var x [5]int

A slice is a pointer to an array it can change the size.

Slice var x []int

Fun Fact:

You might expect that working with Arrays is faster because it sits in the known and dedicated memory segment, while slice is a pointer and when it’s growing it forces memory allocation. It is right! But the array is slightly slower if you work with local scope!

goos: darwin
goarch: arm64
BenchmarkSliceGlobal-8        926196          1257.0 ns/op         0 B/op          0 allocs/op
BenchmarkArrayGlobal-8       2110324           567.0 ns/op         0 B/op          0 allocs/op
BenchmarkSliceLocal-8        2275382           535.0 ns/op         0 B/op          0 allocs/op
BenchmarkArrayLocal-8        1802491           647.4 ns/op         0 B/op          0 allocs/op

Check the code on your local and compare the results. The code is available at StackOverflow

Append vs Copy

Now let’s compare two main functions that we can use to make a slice grow

slice = append(slice, anotherSlice...)

AND

copy(slice, anotherSlice)

Append

Most of the time, devs are using append(). It’s simple, elegant and provides us with good readability…

func mergeTwoSliceOfFloatsWithAppend(floatsToIterate []float64, floatsToAdd []float64) (int, error) {
    var newSlice []float64
    if len(floatsToIterate) == 0 || len(floatsToAdd) == 0 {
        return len(newSlice), nil
    }
    for _, _ = range floatsToIterate {
        newSlice = append(newSlice, floatsToAdd...)
    }
    return len(newSlice), nil
}

Copy

On the other hand, copy() needs some work to be done before it performs its action. First of all, the new slice’s length needs to be specified and then track the index where the new slice has to be added. Kind of complicated, right?

func mergeTwoSliceOfFloatsWithCopy(floatsToIterate []float64, floatsToAdd []float64) (int, error) {
    sliceLength := len(floatsToIterate) * len(floatsToAdd)
    newSlice := make([]float64, sliceLength)
    if len(floatsToIterate) == 0 || len(floatsToAdd) == 0 {
        return len(newSlice), nil
    }
    var i int
    for _, _ = range floatsToIterate {
        i =+ copy(newSlice[i:], floatsToAdd)
    }
    return len(newSlice), nil
}

So, let’s compare the results! Copy works not just faster but also requires only one memory allocation!

Benchmark_withCopy/slice_of_100_floats      88.00 ns/op	     896 B/op    1 allocs/op
Benchmark_withCopy/slice_of_10000_floats     4456 ns/op	   81920 B/op    1 allocs/op
Benchmark_WithAppend/slice_of_100_floats    252.5 ns/op	    2480 B/op    5 allocs/op
Benchmark_WithAppend/slice_of_10000_floats  18985 ns/op	  356224 B/op   12 allocs/op

Looks good, right? But what if the magic is in the newSlice := make([]float64, sliceLength)? Let’s see

func mergeTwoSliceOfFloatsWithAppendToSliceWithSize(floatsToIterate []float64, floatsToAdd []float64) (int, error) {
	sliceLength := len(floatsToIterate) * len(floatsToAdd)
	newSlice := make([]float64, sliceLength)
	if len(floatsToIterate) == 0 || len(floatsToAdd) == 0 {
		return len(newSlice), nil
	}
	for _, _ = range floatsToIterate {
		newSlice = append(newSlice, floatsToAdd...)
	}
	return len(newSlice), nil
}

Running our benchmark on the same set of floats

Benchmark_withCopy/slice_of_100_floats              88.00 ns/op	     896 B/op    1 allocs/op
Benchmark_withCopy/slice_of_10000_floats             4456 ns/op	   81920 B/op    1 allocs/op
Benchmark_WithAppend/slice_of_100_floats            252.5 ns/op	    2480 B/op    5 allocs/op
Benchmark_WithAppend/slice_of_10000_floats          18985 ns/op	  356224 B/op   12 allocs/op
Benchmark_withAppendToDefinedSlice/100_floats       193.8 ns/op	    2688 B/op    2 allocs/op
Benchmark_withAppendToDefinedSlice/10000_floats     23807 ns/op	  507904 B/op    4 allocs/op

Append with a predefined slice seems to be working better. It does less memory allocation. Did it help? Not really! It’s slower than regular append, and it takes 2x times more spaces in memory now than it needs. Append() do not use the slice with predefined length but add a new slice at its end, so the first part of our new slice is empty but takes space in memory. Just try to run the code with a debugger or print the function’s output.

Code is here, please feel free to modify and run your personal benchmarks

What can I say? I was hoping to get similar results, but tests never lie. copy() works faster and needs fewer memory allocations, but the readability of the code is not the selling point of copy() function. I will use append() in my code if I foresee that nobody will pass a massive chunk of data. Most developers know this function, and speaking of trade-offs, I prefer not to spend expensive dev time understanding the code, but add a few bucks on memory. But if the code expects to be performance-focused, then copy(), and some additional lines of comments would be my to-go option.

Share on: