Top 5 Lessons I learned while working with Go for two years

So I have been writing Go services for like two years now, both professionally and as personal projects. Using a certain language in numerous projects over an extended period of time allows you to make mistakes, fix them, realize it's still not the best way to do it, fix them again and generally get better the more you get to re-do stuff because each time you try to avoid a mistake you made the last time that caused you a headache throughout that project.

In this article, I will be discussing some of those mistakes and the lessons I have learned trying to mitigate them in future projects. This is not by any means a discussion of ideal solutions, it's just the ideas I learned and developed through my experience working with Go.

1. Take the Go highway to Concurrency

In my opinion, what makes Go very appealing as a language (other than it's simplicity and near C performance), is its ability to easily write concurrent code. Goroutines are the Go way of writing concurrent code. A goroutine is a light-weight thread or otherwise called a green thread, and yes, it's not a kernel thread. Goroutines are green-threads in the sense that their scheduling is entirely managed by the Go runtime and not the OS. The Go scheduler is responsible for multiplexing goroutines onto real kernel threads, this has the benefit of making goroutines very light weight in terms of their startup time and memory requirements which allows a go application to run millions of goroutines!

I think that the way Go handles concurrency is unique and a common mistake is dealing with Go concurrency the way you would deal with concurrency in any other language, for example Java. One of the most famous examples of how different is the Go way of dealing with Concurrency can be summarized in:

Don’t communicate by sharing memory, share memory by communicating.

A very common case is that an application will have multiple goroutines accessing a shared memory block. So let's say for example, we are implementing a connection pool where we have an array of available connections and each goroutine can either acquire or release a connection. The most common way to go about this is to use a mutex which allows only the goroutine holding the mutex to exclusively access the array of connections at a given time. So the code will look something like this (note that some details are abstracted to keep the code concise):

type ConnectionPool struct {
  Mu          sync.Mutex
  Connections []Connection
}

func (pool *ConnectionPool) Acquire() Connection {
  pool.Mu.Lock()
  defer pool.Mu.Unlock()

  //acquire and return a connection
}

func (pool *ConnectionPool) Release(c Connection) {
  pool.Mu.Lock()
  defer pool.Mu.Unlock()

  //release the connection c
}

This seems quite reasonable but what if we forget to implement the locking logic? What if we did implement it but forgot to lock in one of the many functions? What if we didn't forget to lock, but forgot to unlock? What if we lock only parts of the critical section (underlocking)? Or what if we lock parts that don't belong to the critical section (overlocking)? This seems error prone and generally it's not the Go way of dealing with concurrency.

This takes us back to the Go mantra that is "Don’t communicate by sharing memory, share memory by communicating". To understand what this means, we need to first understand what is a Go channel? A channel is the Go way of implementing communication between goroutines. It's essentially a thread-safe data pipe that allows goroutines to send or receive data among themselves without having to access a shared memory block. A Go channel can also be buffered which allows its capacity to control the number of simultaneous calls, effectively acting as a semaphore!

So revisiting our code to make it share by communicating instead of locking, we get a code that looks something like this (note that some details are abstracted to keep the code concise):

type ConnectionPool struct {
  Connections chan Connection
}

func NewConnectionPool(limit int) *ConnectionPool {
  connections := make(chan Connection, limit)

  return &{ Connections: connections }
}

func (pool *ConnectionPool) Acquire() Connection {
  <- pool.Connections

  //acquire and return a connection
}

func (pool *ConnectionPool) Release(c Connection) {
  pool.Connections <- c

  //release the connection c
}

Using Go channels didn't only reduce the size and the overall complexity of the code, but it abstracted away the need to explicitly implement thread safety. So now the data structure itself is intrinsically thread-safe so even if we forgot about that, it would still work.

The benefits of using channels are quite numerous, this example barely scratches the surface, but the lesson here is to don't write concurrent code in Go the way you write it in any other language.

2. If it can be singleton, then make it singleton, but do it right!

A Go application will probably have to access a Database or a Cache, etc. which are examples of resources that have connection pools, meaning, that there is a limit on the number of concurrent connections to that resource. From my experience, most connection objects (database, cache, etc.) in Go are built as thread-safe pool of connections that can be used by multiple goroutines concurrently as opposed to a single connection.

So let's say we have a Go application that accesses as a mysql database via a *sql.DB object which is essentially a pool of connections to the database. If the application has many goroutines, it wouldn't make sense to create a new *sql.DB object, in fact this might cause connection pool exhaustion (note that not closing connections after usage can also cause this). So it makes sense that the *sql.DB object representing the connection pool has to be singleton, so even if a goroutine tried to create a new object, it will return the same object and thus disallowing having multiple connection pools.

Making objects that can be shared during the lifetime of the application singleton is generally a good practice because it encapsulates this logic and protects against code that doesn't respect this policy. A common pitfall is implementing singleton creation logic that is not thread-safe itself. For example, consider the following code (note that some details are abstracted to keep the code concise):

var dbInstance *DB

func DBConnection() *DB {
  if dbInstance != nil {
    return dbInstance
  }

  dbInstance = &sql.Open(...)
  return dbInstance
}

The previous code checks if the singleton object is not nil (which means it was previously created) in which case it returns it, but if it's nil then it creates a new connection, assigns it to the singleton object and returns it. This, on principle, should only create a single database connection, but this is not thread-safe.

Consider the case where 2 goroutines are calling the function DBConnection() at the same time. It's possible that the first goroutine reads the value of dbInstance and finds it nil and then proceeds to create a new connection, but before the newly created instance is assigned to the singleton object, the second goroutines also performs the same check, comes to the same conclusion and proceeds to create a new connection leaving us with 2 connections instead of just 1.

This problem can be handled using locks as discussed in the previous section, but this is also not the Go way of doing this. Go supports atomic operations that are by default thread-safe, so if we can use something that guarantees thread-safety instead of implementing it explicitly, so let's do that!

So revisiting our code to make it thread-safe, we get a code that looks something like this (note that some details are abstracted to keep the code concise):

var dbOnce sync.Once
var dbInstance *DB

func DBConnection() *DB {
  dbOnce.Do(func() {
    dbInstance = &sql.Open(...)
  }

  return dbInstance
}

This code uses a Go construct called sync.Once which allows us to write a function that will only be executed once. This way, the connection creation segment is guaranteed to run exactly once even if multiple goroutines attempt to execute it concurrently.

3. Beware of Blocking Code

Sometimes your Go application will perform blocking calls, maybe requests to external services that might not respond or maybe calls to functions that block on certain conditions. As a rule of thumb, never assume that calls will return in due time, because well, sometimes they don't.

The way this is usually handled is by setting a timeout after which this call will be cancelled and the execution can proceed. It's also advisable (if possible) to perform blocking calls on separate routines instead of blocking the main routine.

So let's consider the case where a Go application needs to perform an external network request over http. The Go http client doesn't timeout by default, but we can set a timeout as follows:

&http.Client{Timeout: time.Minute}

While this works just fine, it's quite limited, because now we have the same timeout for all requests performed via the same client (this might be a singleton object). Go has a package called called context built by google which allows us to pass request-scoped values, cancellation signals, and timeouts to all the goroutines involved in handling a request. We can use it as follows:

ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
defer cancel()

req := req.WithContext(ctx)
res, err := c.Do(req)

The previous code allows to set a per-request timeout and also be able to cancel the request or pass per-request values if needed. This is particularly helpful if the goroutine sending the request spawned other goroutines that all need to exit if the request times out.

In this example, we were lucky that the http package supports contexts, but what if we are dealing with a blocking function that doesn't? There is an adhock way to implement timeout logic using Go's select statement along with Go channels (again!). Consider the following code:

func SendValue(){
  sendChan := make(chan bool, 1)

  go func() {
    sendChan <- Send()
   }()

  select {
    case <-sendChan:
    case <-time.After(time.Minute):
      return
  }

  //continue logic in case send didn't timeout
}

We have a function called SendValue() which calls a blocking function called Send() that might block forever. So we initialize a buffered boolean channel and execute the blocking function on a separate goroutine that when done sends a signal in the channel, while in the main goroutine we block waiting for the channel to yield a value (request succeeded) or wait for 1 minute and return. Note that the previous code is missing logic to cancel the Send() function and thus terminating the goroutine (otherwise this can cause a goroutine leak).

4. Graceful Termination and Clean Up

If you are writing a long running process, for example, a web server or a background jobs worker, etc. you are running the risk of being terminated abruptly. This termination could be because the process was terminated by the scheduler or even if you're releasing new code and you are rolling it out.

Generally, a long running process might have data in its memory that will be lost if the process was to be terminated or it might be holding resources that need to be released back to the resources pool. So when a process is killed we need to be able to perform graceful termination. Gracefully terminating a process means that we intercept the kill signal and perform application specific shutdown logic that guarantees that everything is in order before actually terminating.

So let's say we are building a web server which we will usually want to be able to run like so:

server := NewServer()
go server.Run()

The key idea here is that the Run() function that runs infinitely, in the sense that if we call it at the end of our main function, the process will not exit, it will only exit if the Run() function exits.

The server logic can be implemented to check for a shutdown signal and only exit if a shutdown signal was received as follows:

func (server *Server) Run() {
  for {
    //infinite loop

    select {
      case <- server.shutdown:
        return
      default:
        //do work
    }
  }
}

As we can see in the previous code the server loop checks if a signal was sent down the shutdown channel in which case it exits the server loop, otherwise it continues to perform its work.

Now the only missing piece in the puzzle is being able to intercept OS interrupts such as (SIGKILL or SIGTERM) and call Server.Shutdown() which performs shutdown logic (flush memory to disk, release resources, clean up, etc.) and sends the shutdown signal to terminate the server loop. We can achieve this as follows:

func main() {
  signals := make(chan os.Signal, 1)
  signal.Notify(signals, os.Interrupt)

  server := NewServer()
  go server.Run()

  select {
    case <-signals:
      server.Shutdown()
  }
}

The previous code creates a buffered channel of type os.Signal and sends a signal down the channel when an OS interrupt occurs. The rest of the main function runs the server and blocks waiting on the channel. When a signal is received, this means that an OS interrupt took place, instead of exiting immediately, it calls the server shutdown logic giving it a chance to gracefully terminate.

5. Go Modules FTW

It's very common that your Go application will have external dependencies, for example, you're using a mysql driver or a redis driver or any other package. When you first build your application, the build process will get the latest version of each of these dependencies, which is great. Now you built your binary, you tested it and it works, so you went to production.

A month later, you needed to add a new feature or do a hotfix which might not require a new dependency but will require re-building the application to produce the new binary. The build process will also get the latest version of the each of the needed packages, but this version might be different from the version you got on your first build and might contain breaking changes that will cause the application itself to break. So it's clear that we need a way to manage this by always getting the same version of the each dependency unless you explicitly opt to upgrade.

Go.11 introduced go.mod which is the new way to handle dependency versioning in Go. When you init a Go mod in your application and then build it, it will automatically produce a go.mod file and a go.sum file. The mod file looks something like this:

module github.com/org/module_name

go 1.14

require (
    github.com/go-sql-driver/mysql v1.5.0
    github.com/onsi/ginkgo v1.12.3 // indirect
    gopkg.in/redis.v5 v5.2.9
    gopkg.in/yaml.v2 v2.3.0
)

As we can see that it locks the Go version along with the version used for each dependency, so for example we will always get redis v5.2.9 with every build even if the redis repo released v5.3.0 as their latest version which guarantees stability. Notice that the second dependency that's labeled indirect which means that's a dependency that's not directly imported by your application but imported by one of its dependencies and it also locks its version.

There are many other benefits to using Go mods, for example:

It automatically runs go mod tidy which removes any unneeded dependencies.
It allows you to run the code from any directory (prior to go mods, a go project had to be placed in a specific directory).
It wraps the whole application into a module that can be imported into a brand new project. This particularly useful in case you decided to wrap some logic into a package for example a logger package and just import it in all your other projects to make that logging functionality available across multiple code bases.

Conclusion

Go is a very awesome language, quite frankly, I am a major fan, but while learning Go and working with it, you get to learn lots of useful lessons that make your future Go code better. In this article, I talked about some of those lessons I have learned, but there are many more, maybe I will write a part 2 some day!

Sayed Alesawy