Zoltan Madarassy

June 30, 2022

Golang code review notes

Quick summary of some of the bug classes in Go

golang

codereview

On This Page

Intro

Our aim is both to collate various resources and patterns one can find on the Internet, but also to add a couple of notes about quirks of the language that could lead to security vulnerabilities.

Preface

Go in the past years has become a language of many faces. It’s compiled and garbage collected nature with easy cross-compilation support for many architectures make it ideal for embedded and IoT projects, while the simplicity of goroutines make it ideal for web-based projects, which is further supported by the many third-party libraries. Not to mention its built-in testing framework which now natively supports fuzzing.

Because of the popularity of the language, in this blog post we wanted to highlight some code constructs for security engineers to look out for.

A quick note on Golang versions: the examples in this post were tested on Go v1.17 and v1.18. Future changes to the language might alter the described behaviour.

Getting started

Before we get to the juicy bits though, I would like to take a moment to highlight a couple of tools that come really handy when one is facing a dauntingly large code base. To get our bearings, of course we could start with the route handlers in a web project or a function dispatcher in a command line tool and go from there. However, if we want to get a quick overview of some of the hotspots, files where complexity is high, we have two tools that can come to our aid:

abcgo, which will measure the ABC metrics of each file
gocyclo can help us get a feel for each file’s cyclomatic complexity

With the information from these two tools and our preferred code review approach we can quickly home in on parts of the code base where some juicy bugs might reside.

Bug classes

Even though Go is a memory managed language with a strong focus on security be default, there are still some dangerous code constructs developers need to keep an eye on. The goal of this blog post is not to list each of the relevant bug categories, but if one is interested, a good place to start would be the various semgrep rule packs available - or check out gosec.Roughly, most common Go security bugs can be categorised as follows:

Cryptographic misconfigurations
Input validation (SQLi, RCE, etc.)
File/URL path handling
File permissions (less common as default permissions are pretty secure since ~v1.17)
Concurrency issues (goroutine leaks, race conditions, etc.)
Denial of Service
Time of check, time of use issues

Rather than going through each of these categories, we will highlight a couple of patterns that either tend to be high impact, obscure, too common, relatively unknown, just interesting or any combination of these.

Also, as a general note for the discussions below: not all bugs might have a security risk, or at least not immediately apparent. To write about these patterns in a generic way, we must abstract a lot of the context away, meaning should the reader run into any of these in a code review, they would have to figure out the impact individually. Sometimes a bug will be just a bug, but in some cases they can turn out to be serious security flaws.

Directory traversal

To ease into some of the more obscure bug classes, let’s start with something relatively simple, but all too common in Go code. Path/directory traversal is a classic vulnerability where an attacker can interact with the server’s file system by supplying malicious input. This usually takes the form of adding multiple “../” or “..”’s to control a file path.

The functions to note here both belong to the path/filepath package, and namely they are:

filepath.Join()
filepath.Clean()

Demonstrated by several bug reports, filepath.Join() is a common culprit for directory traversal vulnerabilities. The reason might be that the documentation is a little misleading. It states that:

Join joins any number of path elements into a single path, separating them with an OS specific Separator. Empty elements are ignored. The result is Cleaned.

The key word here being “Cleaned”, which is none other than the above mentioned filepath.Clean() function. From the documentation (and source code comments), here is what filepath.Clean() does:

Replace multiple Separator elements with a single one.
Eliminate each . path name element (the current directory).
Eliminate each inner .. path name element (the parent directory) along with the non-.. element that precedes it.
Eliminate .. elements that begin a rooted path: that is, replace “/..” by “/” at the beginning of a path, assuming Separator is ‘/’.

On a cursory glance this could be easily read in a way that ensures that the function protects against directory traversal attacks.

However, if we prepare a quick test program, we can see that it doesn’t exactly do that:

package main

import (
	'fmt'
	'path/filepath'
)

func main() {
	strings := []string{
		'/elttam/./../elttam',
		'elttam/../elttam',
		'..elttam/./../elttam',
		'elttam/../../../../../../../elttam/elttam',
	}

	domain := 'elttam.com'

	for _, s := range strings {
		fmt.Println(filepath.Join(domain, s))
	}
}

Output:

❯ go run path-traversal.go
elttam.com/elttam
elttam.com/elttam
elttam.com/elttam
../../../../../elttam/elttam

Now as we can see, we did clean our input strings, that is, removed anything that might not land us in elttam. However, if we look under the hood, starting on line 127 of filepath/path.go (as of Go v1.18.2, in case the file changes in the future), we can see that the filepath.Clean() function specifically has a switch-case to allow ‘../../../../’ types of input. That is, if a dotdotslash string starts without a leading separator, the string will remain intact.

If we run our previous test strings through an additional round of filepath.Clean(), we’ll see that since our last string after filepath.Join() starts with a leading ‘../’, we’ll run into the backtracking switch-case as expected, thus our path traversal payload will remain intact:

package main

import (
	'fmt'
	'path/filepath'
)

func main() {
	strings := []string{
		'/elttam/./../elttam',
		'elttam/../elttam',
		'..elttam/./../elttam',
		'elttam/../../../../../../../elttam/elttam',
	}

	domain := 'elttam.com'

	for _, s := range strings {
		fmt.Println(filepath.Clean(filepath.Join(domain, s)))
	}
}

Output:

❯ go run path-traversal-clean.go
elttam.com/elttam
elttam.com/elttam
elttam.com/elttam
../../../../../elttam/elttam

As a quick summary, it is a misconception that Go’s filepath.Clean() will save us from path traversal attacks. Additionally, we needn’t be bothered calling it directly after a filepath.Join(), as Clean() is already built in to it.

Goroutine leaks

Go’s easy-to-use concurrency¹ model is also host to one of the most subtle but prominent bug classes. The reader is probably familiar with race conditions, but might not have heard about goroutine leaks.

While there are excellent descriptions on what goroutine leaks are, for the sake of completeness, let me explain it quickly here as well.

There are two basic concepts/keywords that dictate goroutines. One is simply an inline function declaration with the leading go keyword, like so:

package main

import (
	'fmt'
	'time'
)

func print(s []string) {
	for _, v := range s {
		fmt.Println(v)
	}
}

func main() {
	strings := []string{'one', 'two', 'three', 'four', 'five'}
	

	go print(strings)

	time.Sleep(3 * time.Second)
	fmt.Println('Exiting')
}

Output:

❯ go run goroutines-test.go
one
two
three
four
five
Exiting

The interesting thing about goroutines is that the calling function doesn’t have to wait for them to return before returning itself. In the above case it usually leads to the program exiting before we see any prints on the console. This is one part of the problem for goroutine leaks.

The other important Go concept that contributes to goroutine leaks is channels. As the Go by Example website explains:

Channels are the pipes that connect concurrent goroutines.

At its basic usage, we can either send data to channels or receive data from them:

package main

import (
	'fmt'
)

func mod(min int, max int, div int, signal chan int) {
	for n := min; n <= max; n++ {
		if n%div == 0 {
			signal <- n
		}
	}
}

func main() {
	fsignal := make(chan int)
	ssignal := make(chan int)

	go mod(15, 132, 14, fsignal)
	go mod(18, 132, 17, ssignal)

	fourteen, seventeen := <-fsignal, <-ssignal
	fmt.Println('Divisible by 14: ', fourteen)
	fmt.Println('Divisible by 17: ', seventeen)
}

In this example we have blocking, unbuffered channels. The two go hand in hand, as unbuffered channels are meant to be used for synchronous operations, where the program cannot proceed until data has been received from the channel, so it blocks further execution.

Goroutine leaks happen when an unbuffered channel doesn’t have a chance to send data on its channel because its calling function has already returned. This will mean that the hanging goroutine will remain in memory because the garbage collector will always see it waiting for data. Take this example for example:

package main

import (
	'fmt'
	'time'
)

func userChoice() string {
	time.Sleep(5 * time.Second)
	return 'right choice'
}

func someAction() string {
	ch := make(chan string)
	timeout := make(chan bool)

	go func() {
		res := userChoice()
		ch <- res
	}()

	go func() {
		time.Sleep(2 * time.Second)
		timeout <- true
	}()

	select {
	case <-timeout:
		return 'Timeout occured'
	case userchoice := <-ch:
		fmt.Println('User made a choice: ', userchoice)
		return ''

	}
}

func main() {
	fmt.Println(someAction())
	time.Sleep(1 * time.Second)
	fmt.Println('Exiting')
}

This is a quick example to simulate a functionality requiring user interaction but with a very low timeout value. This means that unless our user is very quick, timeout will happen sooner than they can make a choice and thus the userChoice() function inside the anonymous goroutine will never return and thus leak.

Security implication

The security impact of this bug class heavily depends on the context, but most likely it would lead to a Denial of Service condition. It’s important to note that this will only be an issue if the program has a sufficiently long lifetime and it starts enough goroutines to exhaust memory resources. Again, this will highly depend on the use-case and environment of each Go program, heavily influencing the impact of such a bug.

Fix

The easiest fix is to use buffered channels, which will mean non-blocking (asynchronous) behaviour for goroutines:

func someAction() string {
	ch := make(chan string, 1)
	timeout := make(chan bool)

`fmt.Sprintf()`

People familiar with C will also feel familiar with Go’s fmt.Sprintf(). This string formatting function works similarly to that of C, where each formatting verb² formats successive arguments. This is easy so far and there is nothing inherently wrong with it (as opposed to C, Go’s fmt.Sprintf() is memory safe).However, it’s all too common for developers to use this function where they shouldn’t.

Creating host:port strings

All too frequently developers will do something similar to the below:

target := fmt.Sprintf('%s:%s', hostname, port)

At first glance this line looks like it’s appending a port to a hostname separated by a colon, probably to connect to a server later. At second glance though… it still does that. However, what happens if the hostname is an IPv6 address? In this case if the resulting string is used in a network connection, the first time a colon is encountered by the network library, it will assume it to be a protocol separator, which will at the least create an exception.

To avoid this issue, using the net.JoinHostPort will create a string in the following way: [host]:port, which is a generally accepted connection string.

Unescaped control characters

One of the most commonly used formatting verb with fmt.Sprintf() is the familiar %s, which represents a plain string. However, what happens if such a formatted string would be used in an REST API call, such as this one:

URI := fmt.Sprintf('admin/updateUser/%s', userControlledParam)
resp, err := http.Post(filepath.Join('https://victim.site/', URI), 'application/json', body)

The important bit to remember is that the %s formatting verb denotes a plain string. A user could inject control characters such as \0xA for a new line or \xB for a tab. This could lead to various header injection vulnerabilities in most cases.

We have two potential solutions for this:

use the %q formatting verb for example, which will create a quoted string, with encoded control characters
use strconv.Quote() which will quote the string and also encode control characters

The `unsafe` package

Go has an aptly named package: unsafe. As the documentation explains (along with warnings around its use):

Package unsafe contains operations that step around the type safety of Go programs.

A typical use of its functions from a security perspective is along with the syscall package (there are many others too though). To understand why this pairing is common, we need to understand a little what an unsafe.Pointer and a uintptr in Go is, respectively.

To put it simply, an unsafe.Pointer is a Go built-in type (just like string, map, chan, etc.), meaning it has an associated Go object in memory. Basically any Go pointer can be cast to unsafe.Pointer which will tell the compiler not to perform bounds checking on the object - ie developers can tell the Go compiler to bypass its type safety. In addition to this, an uintptr is basically just an integer representation of the memory address unsafe.Pointer is pointing to.

Now back to syscall. As the reader with kernel or C programming knowledge might expect, system calls are operate “below” a compiled Go binary, meaning they expect raw pointers when invoked - they wouldn’t know what to do with a full Go unsafe.Pointer object. So when we want to invoke a syscall from a Go program, we will need to convert an unsafe.Pointer to an uintptr to lose all the additional data a Go object has in memory. This will turn the pointer into a simple integer representation of a memory address the pointer is pointing to:

rawPointer := uintptr(unsafe.Pointer(pointer))

So far so good. Another important thing to our present discussion is the fact that Go has a non-generational concurrent, tri-color mark and sweep garbage collector. That’s a mouthful, so we’ll just focus on the fact that it’s concurrent. To simplify a little, this means that we cannot be certain when a garbage collection might occur during the runtime of a Go program.

Putting these two things together the eagle eyed reader will realise that if a garbage collection should occur between a conversion from unsafe.Pointer to uintptr and its use in a syscall, we might be passing a completely different structure in memory to the system call. This is because the garbage collector might move objects around in memory, but it won’t update an uintptr, so the address might hold completely different data compared to when we performed our conversion.

Security implication

Again, similarly to other Go-oddities, the impact of this bug really depends on context. First of all, the likelihood of memory changing because of garbage collection under an uintptr is pretty low. However, this likelihood will significantly increase the more goroutines we have and the length of time our program is running³. Most likely we’ll get an invalid pointer dereference exception, but it might be possible to turn such a bug into an exploit primitive.

Fix

Avoiding this issue is pretty easy: the conversion to uintptr and its use need to happen in an atomic step, such as below:

_, _, errno := syscall.Syscall(syscall.SYS_IOCTL, f.Fd(), request, uintptr(unsafe.Pointer(pointer)))

`os.Executable()`

The interesting thing about this function’s behaviour is that how it treats symbolic links will depend on the operating system - which is relatively weird behaviour in Go.

Let’s take this example which reads a supposed configuration file which is assumed to be in the same directory from where we are running our program:

func withoutEval() string {
	execBin, _ := os.Executable()
	path, err := filepath.Abs(filepath.Dir(execBin))
	if err != nil {
		log.Fatalf('error %v\n', err)
	}
	fmt.Println('Path without filepath.EvalSymlinks():', path)
	return path
}

func printFile(path string) {
	fname := 'test.txt'
	abspath := filepath.Join(path, fname)

	file, err := os.Open(abspath)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	fbytes := make([]byte, 16)
	bytesRead, err := file.Read(fbytes)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf('Number of bytes read: %d\n', bytesRead)
	fmt.Printf('Bytes: %s\n', fbytes)
}

When we run this program without any symlinks, we get the expected behaviour, that is, the config file will be read from the same directory where the main binary is:

loltan@ubuntu:~/Desktop/symlink$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA > test.txt
loltan@ubuntu:~/Desktop/symlink$ ./symlink 
Executable location: /home/loltan/Desktop/symlink/symlink
/home/loltan/Desktop/symlink/test.txt
Number of bytes read: 16
Bytes: AAAAAAAAAAAAAAAA

Also, let’s assume that our actual binary is in some other directory, and we are running our program via a symlink, such as:

loltan@ubuntu:~/Desktop/symlink$ echo BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB > bin/test.txt
loltan@ubuntu:~/Desktop/symlink$ mv symlink symlink.bak
loltan@ubuntu:~/Desktop/symlink$ ln -s bin/symlink symlink

Now here comes the weird behaviour. On Linux, Go follows the symlink and thus will try to read the configuration file from /home/user/Documents/:

loltan@ubuntu:~/Desktop/symlink$ ./symlink
Executable location: /home/loltan/Desktop/symlink/bin/symlink
/home/loltan/Desktop/symlink/bin/test.txt
Number of bytes read: 16
Bytes: BBBBBBBBBBBBBBBB

While on MacOS and Windows the program will not follow the symlink and will look for the configuration file in Desktop.

Security implications

As with other subtle code patterns, this will depend on the context.But to give the reader an idea, let’s assume for example that we have a long running Go binary, such as a service or similar, in a location protected from a low-privileged user with a configuration file dictating some secure options, such as host certificate verification towards a server or similar. On Windows and MacOS even with a low privileged user we could potentially create a symlink to this binary in a location we control, and add a modified configuration file there, which will be read by the program next time it runs (or a configuration reparse occurs). Effectively this gives an attacker the ability to overwrite security settings even from a low privilege user account.

Fix

The fix is relatively simple. We just need to pass the results of os.Executable() to os.EvalSymlinks(). This function will check whether the path is a symlink and if so, it will return the absolute path the link points to.

Footnotes

References

How concurrency works under the hood in Go would deserve its own blogpost. To put it simply, goroutines are not native OS threads, rather virtual threads managed by the Go runtime itself. This incurs less overhead in both memory pressure and context switching time while keeping useful features such as multi-core support. ↩
For the C programmers, yes, this is a Go term. It’s weird. ↩
This is because the garbage collector’s scheduler works based on memory pressure on the heap. As much the Go compiler prefers to use the stack, with a longer running time and with more goroutines, pressure on the heap will likely increase. ↩

Golang code review notes

Intro

Preface

Getting started

Bug classes

Directory traversal

Goroutine leaks

Security implication

Fix

fmt.Sprintf()

Creating host:port strings

Unescaped control characters

The unsafe package

Security implication

Fix

os.Executable()

Security implications

Fix

Footnotes

References

`fmt.Sprintf()`

The `unsafe` package

`os.Executable()`