MkUnion and state machines in golang

This document shows how to use mkunion to manage application state, using an Order Service as an example. You will learn:

  • How to model state machines in golang, and where they resemble "clean architecture"
  • How to test state machines (with fuzzing), and get mermaid diagrams for free as a bonus
  • How to persist state in a database, and how optimistic concurrency helps resolve concurrency conflicts
  • How to handle errors in state machines, and build foundations for self-healing systems

Working example

As a driving example, we will use an e-commerce inspired Order Service, where an order can be in one of the following states:

  • Pending - the order is created, and is waiting for someone to process it.
  • Processing - the order is being processed; a human is going to pick up the items from the warehouse and pack them.
  • Cancelled - the order was cancelled. There can be many reasons; one of them is that the warehouse is out of stock.
  • Completed - the order is completed, and can be shipped to the customer.

Such states have rules that govern transitions; for example, an order cannot be cancelled if it is already completed.

We need a way to trigger changes in state, like creating an order that is pending processing, or cancelling an order. We will call those triggers commands.

Some of those rules could change in the future, and we want to be able to change them without rewriting the whole application. This also tells us that our design should be open for extension.

Side note: if you want to go straight to the final code, go to the example/state/ directory and have fun exploring.

Modeling commands and states

Our example can be represented as state machine that looks like this: simple_machine_test.go.state_diagram.mmd

stateDiagram
    "*state.OrderProcessing" --> "*state.OrderCancelled": "*state.CancelOrderCMD"
    [*] --> "*state.OrderPending": "*state.CreateOrderCMD"
    "*state.OrderPending" --> "*state.OrderProcessing": "*state.MarkAsProcessingCMD"
    "*state.OrderProcessing" --> "*state.OrderCompleted": "*state.MarkOrderCompleteCMD"
    "*state.OrderProcessing" --> "*state.OrderError": "*state.MarkOrderCompleteCMD"
    "*state.OrderError" --> "*state.OrderCompleted": "*state.TryRecoverErrorCMD"

In this diagram we can see 5 states and 5 commands. The commands trigger the 6 transitions between states, shown as arrows.

Because this diagram is generated from code, the names in it are the golang types that we use in the implementation.

For example *state.CreateOrderCMD:

  • state is the package name
  • CreateOrderCMD is a struct name in that package
  • The CMD suffix is a naming convention. It's optional, but I find it makes code more readable and makes commands easier to distinguish from states.

Below is a code snippet that shows the complete model of the states and commands of the Order Service that we talked about.

Notice that we use mkunion to group commands and states. (Look for //go:tag mkunion:"Command")

This is one example of how union types can be used in golang. Historically, achieving this in golang was very hard and required a lot of boilerplate code. Here, the interface that groups those types is generated automatically, so you can focus on modeling your domain.

example/state/model.go
package state

import "time"

//go:tag mkunion:"Command"
type (
    CreateOrderCMD struct {
        OrderID OrderID
        Attr    OrderAttr
    }
    MarkAsProcessingCMD struct {
        OrderID  OrderID
        WorkerID WorkerID
    }
    CancelOrderCMD struct {
        OrderID OrderID
        Reason  string
    }
    MarkOrderCompleteCMD struct {
        OrderID OrderID
    }
    // TryRecoverErrorCMD is a special command that can be used to recover from error state
    // you can have different "self-healing" rules based on the error code or even return to previous healthy state
    TryRecoverErrorCMD struct {
        OrderID OrderID
    }
)

//go:tag mkunion:"State"
type (
    OrderPending struct {
        Order Order
    }
    OrderProcessing struct {
        Order Order
    }
    OrderCompleted struct {
        Order Order
    }
    OrderCancelled struct {
        Order Order
    }
    // OrderError is a special state that represents an error
    // during order processing. You can have different "self-healing jobs" based on the error code,
    // like retrying the order, cancelling the order, etc.
    // Treating an error as a state is a good practice in state machines; it allows you to centralise error handling.
    OrderError struct {
        // error information
        Retried   int
        RetriedAt *time.Time

        ProblemCode ProblemCode

        ProblemCommand Command
        ProblemState   State
    }
)

type (
    // OrderID, Price, and Quantity are placeholders for value objects, to ensure better data semantics and type safety
    OrderID  = string
    Price    = float64
    Quantity = int

    OrderAttr struct {
        // placeholder for order attributes
        // like customer name, address, etc.
        // like product name, price, etc.
        // for simplicity we only have Price and Quantity
        Price    Price
        Quantity Quantity
    }

    // WorkerID represent human that process the order
    WorkerID = string

    // Order everything we know about order
    Order struct {
        ID               OrderID
        OrderAttr        OrderAttr
        WorkerID         WorkerID
        StockRemovedAt   *time.Time
        PaymentChargedAt *time.Time
        DeliveredAt      *time.Time
        CancelledAt      *time.Time
        CancelledReason  string
    }
)

type ProblemCode int

const (
    ProblemWarehouseAPIUnreachable ProblemCode = iota
    ProblemPaymentAPIUnreachable
)
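
To make the "lot of boilerplate" point concrete, here is a minimal hand-written sketch of a closed union in plain golang. It is not the code that mkunion generates; the marker method isCommand and the helper Describe are names I made up for this illustration, and the command structs are trimmed down.

```go
package main

import "fmt"

// Command is a hand-written "closed" interface: the unexported marker
// method means only types in this package can implement it.
// (isCommand is a hypothetical name, not part of mkunion.)
type Command interface {
	isCommand()
}

type CreateOrderCMD struct{ OrderID string }

type CancelOrderCMD struct {
	OrderID string
	Reason  string
}

// Every variant needs its own marker method. This is the kind of
// repetitive code that a generator can write for you.
func (*CreateOrderCMD) isCommand() {}
func (*CancelOrderCMD) isCommand() {}

// Describe uses a type switch. The compiler does not warn when a variant
// is missing from the switch, so the default branch is a runtime safety net.
func Describe(cmd Command) string {
	switch c := cmd.(type) {
	case *CreateOrderCMD:
		return "create " + c.OrderID
	case *CancelOrderCMD:
		return "cancel " + c.OrderID + ": " + c.Reason
	default:
		return "unknown command"
	}
}

func main() {
	fmt.Println(Describe(&CreateOrderCMD{OrderID: "123"}))
	fmt.Println(Describe(&CancelOrderCMD{OrderID: "123", Reason: "out of stock"}))
}
```

With only two variants this is manageable, but each new command means another marker method and another visit to every type switch, which is where generation starts to pay off.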

Modeling transitions

One thing that is missing is the implementation of transitions between states. There are a few ways to do it. I will show you how to do it using a functional approach (think of a reduce or map function).

Let's name the function that we will build Transition, and define it as:

func Transition(ctx context.Context, dep Dependency, cmd Command, state State) (State, error)

Our function has a few arguments. Let's break them down:

  • ctx is the standard golang context, used to pass deadlines, cancellation signals, etc.
  • dep encapsulates dependencies like API clients, database connections, configuration, context, etc.: everything that is needed for a complete production implementation.
  • cmd is the command that we want to apply to the state. It has the Command interface, which was generated by mkunion when it was used to group commands.
  • state is the state that we want to apply our command to and change. It has the State interface, which was generated in the same way as the Command interface.

Our function must return either a new state, or an error when something went wrong during the transition, like a network error or a validation error.
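
To show what "think reduce" means in practice, here is a small self-contained sketch. It assumes a toy machine where states and commands are plain strings, and the names transition and Reduce are mine, not part of mkunion. The point is only the shape: a function of (command, state) that returns (state, error) can be folded over a list of commands.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the State and Command unions.
type State string
type Command string

const (
	Pending    State = "pending"
	Processing State = "processing"
)

var ErrInvalidTransition = errors.New("invalid transition")

// transition is a toy version of Transition: (command, state) -> (new state, error).
// The empty state plays the role of "no order yet".
func transition(cmd Command, state State) (State, error) {
	switch {
	case cmd == "create" && state == "":
		return Pending, nil
	case cmd == "process" && state == Pending:
		return Processing, nil
	}
	return state, ErrInvalidTransition
}

// Reduce folds the commands over the initial state and stops at the first error,
// returning the last valid state together with that error.
func Reduce(init State, cmds ...Command) (State, error) {
	state := init
	for _, cmd := range cmds {
		next, err := transition(cmd, state)
		if err != nil {
			return state, fmt.Errorf("applying %q to %q: %w", cmd, state, err)
		}
		state = next
	}
	return state, nil
}

func main() {
	final, err := Reduce("", "create", "process")
	fmt.Println(final, err) // processing <nil>

	_, err = Reduce("", "process")
	fmt.Println(errors.Is(err, ErrInvalidTransition)) // true
}
```

Because the transition does not mutate anything, the same list of commands applied to the same initial state always gives the same result, which is what makes this style easy to test.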

Below is a snippet of the implementation of the Transition function for our Order Service:

example/state/machine.go
//go:generate moq -with-resets -stub -out machine_mock.go . Dependency
type Dependency interface {
    TimeNow() *time.Time
    WarehouseRemoveStock(ctx context.Context, quantity Quantity) error
    PaymentCharge(ctx context.Context, price Price) error
}

func Transition(ctx context.Context, di Dependency, cmd Command, state State) (State, error) {
    return MatchCommandR2(
        cmd,
        func(x *CreateOrderCMD) (State, error) {
            if x.OrderID == "" {
                return nil, ErrOrderIDRequired
            }

            switch state.(type) {
            case nil:
                o := Order{
                    ID:        x.OrderID,
                    OrderAttr: x.Attr,
                }
                return &OrderPending{
                    Order: o,
                }, nil
            }

            return nil, ErrOrderAlreadyExist
        },
        func(x *MarkAsProcessingCMD) (State, error) {
            if x.OrderID == "" {
                return nil, ErrOrderIDRequired
            }
            if x.WorkerID == "" {
                return nil, ErrWorkerIDRequired
            }

            switch s := state.(type) {
            case *OrderPending:
                if s.Order.ID != x.OrderID {
                    return nil, ErrOrderIDMismatch
                }

                o := s.Order
                o.WorkerID = x.WorkerID

                return &OrderProcessing{
                    Order: o,
                }, nil
            }

            return nil, ErrInvalidTransition
// ...
// rest removed for brevity
// ...

You can notice a few patterns in this snippet:

  • The Dependency interface keeps dependencies well defined, which greatly helps the testability and readability of the code.
  • The generated function MatchCommandR2 matches all commands exhaustively. This is powerful: when a new command is added, you get a compile-time error if you don't handle it.
  • Validation of commands is done in the transition function. The current implementation is simple, but you can use go-validate to make it more robust, or refactor the code and introduce domain helper functions or methods on the types.
  • Each command checks the state it is being applied to using a switch statement, and ignores states it does not care about. This means that as an implementer you only have to focus on a small part of the picture, without worrying about the rest of the states. This is also an example where non-exhaustive use of a switch statement is welcome.

Simple, isn't it? The simplicity also comes from the fact that we don't have to worry about marshalling/unmarshalling data or working with a database. Those things are done in other parts of the application, keeping this part clean and focused on business logic.
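
If you wonder why a match function gives a compile-time error where a type switch does not, the sketch below may help. It is a hand-written analogue with only two commands; MatchCommand2 and Next are hypothetical names, and I am not claiming that this is how the generated MatchCommandR2 is implemented. The idea is that each variant gets its own function parameter, so a new variant changes the signature and every call site that lacks a handler stops compiling.

```go
package main

import "fmt"

type Command interface{ isCommand() }

type CreateOrderCMD struct{ OrderID string }
type CancelOrderCMD struct{ OrderID string }

func (*CreateOrderCMD) isCommand() {}
func (*CancelOrderCMD) isCommand() {}

// MatchCommand2 is a hypothetical, hand-written matcher returning two values.
// It takes one handler per variant; adding a third variant means adding a
// third parameter, which breaks all callers until they handle it.
func MatchCommand2[A, B any](
	cmd Command,
	onCreate func(*CreateOrderCMD) (A, B),
	onCancel func(*CancelOrderCMD) (A, B),
) (A, B) {
	switch c := cmd.(type) {
	case *CreateOrderCMD:
		return onCreate(c)
	case *CancelOrderCMD:
		return onCancel(c)
	}
	// Only reachable for nil or for a variant the matcher itself forgot.
	panic(fmt.Sprintf("unhandled command type %T", cmd))
}

// Next names the state a command leads to, in a heavily simplified form.
func Next(cmd Command) (string, error) {
	return MatchCommand2(
		cmd,
		func(c *CreateOrderCMD) (string, error) { return "pending", nil },
		func(c *CancelOrderCMD) (string, error) { return "cancelled", nil },
	)
}

func main() {
	fmt.Println(Next(&CreateOrderCMD{OrderID: "123"}))
	fmt.Println(Next(&CancelOrderCMD{OrderID: "123"}))
}
```

The exhaustiveness check moves from the switch, where the compiler cannot help, to the function signature, where it can.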

Note: For educational purposes the implementation is kept in one big function, but for large projects it may be better to split it into smaller functions, or to define an OrderService struct that conforms to the visitor pattern interface, which was also generated for you:

example/state/model_union_gen.go
type CommandVisitor interface {
    VisitCreateOrderCMD(v *CreateOrderCMD) any
    VisitMarkAsProcessingCMD(v *MarkAsProcessingCMD) any
    VisitCancelOrderCMD(v *CancelOrderCMD) any
    VisitMarkOrderCompleteCMD(v *MarkOrderCompleteCMD) any
    VisitTryRecoverErrorCMD(v *TryRecoverErrorCMD) any
}
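
As a rough idea of what such an OrderService could look like, here is a self-contained sketch with two commands. The visitor interface below is written by hand in the same shape as the generated one; the AcceptCommand method, the string-based state, and the Handle helper are assumptions made for this example, so check the generated file for the real names.

```go
package main

import "fmt"

// A two-variant visitor in the same shape as the generated CommandVisitor.
type CommandVisitor interface {
	VisitCreateOrderCMD(v *CreateOrderCMD) any
	VisitCancelOrderCMD(v *CancelOrderCMD) any
}

// Command and AcceptCommand are assumptions made for this sketch.
type Command interface {
	AcceptCommand(v CommandVisitor) any
}

type CreateOrderCMD struct{ OrderID string }
type CancelOrderCMD struct{ OrderID, Reason string }

func (c *CreateOrderCMD) AcceptCommand(v CommandVisitor) any { return v.VisitCreateOrderCMD(c) }
func (c *CancelOrderCMD) AcceptCommand(v CommandVisitor) any { return v.VisitCancelOrderCMD(c) }

// OrderService has one small method per command instead of one big function.
type OrderService struct {
	state string // simplified state, for illustration only
}

// Compile-time check that OrderService handles every command.
var _ CommandVisitor = (*OrderService)(nil)

func (s *OrderService) VisitCreateOrderCMD(v *CreateOrderCMD) any {
	if s.state != "" {
		return fmt.Errorf("order %s already exists", v.OrderID)
	}
	s.state = "pending"
	return nil
}

func (s *OrderService) VisitCancelOrderCMD(v *CancelOrderCMD) any {
	if s.state != "processing" {
		return fmt.Errorf("order %s must be processing to cancel", v.OrderID)
	}
	s.state = "cancelled"
	return nil
}

// Handle dispatches a command and converts the visitor's `any` result to an error.
func (s *OrderService) Handle(cmd Command) error {
	if err, ok := cmd.AcceptCommand(s).(error); ok {
		return err
	}
	return nil
}

func main() {
	svc := &OrderService{}
	fmt.Println(svc.Handle(&CreateOrderCMD{OrderID: "123"}))
	fmt.Println(svc.Handle(&CancelOrderCMD{OrderID: "123", Reason: "out of stock"}))
}
```

The trade-off compared to the single Transition function is that this version keeps state inside the struct, so it is less "functional", but each method stays short and can be tested on its own.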

Testing state machines & self-documenting

Before we go further, let's talk about testing our implementation.

Testing not only helps us ensure that our implementation is correct, it also helps us document our state machine and discover transitions that we didn't think about, and that should or shouldn't be possible.

Here is how you can test a state machine in a declarative way, using the mkunion/x/machine package:

example/state/machine_test.go
    var di Dependency = &DependencyMock{
        TimeNowFunc: func() *time.Time {
            return &now
        },
    }

    order := OrderAttr{
        Price:    100,
        Quantity: 3,
    }

    suite := machine.NewTestSuite(di, NewMachine)
    suite.Case(t, "happy path of order state transition",
        func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
            c.
                GivenCommand(&CreateOrderCMD{OrderID: "123", Attr: order}).
                ThenState(t, &OrderPending{
                    Order: Order{
                        ID:        "123",
                        OrderAttr: order,
                    },
                }).
                ForkCase(t, "start processing order", func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
                    c.
                        GivenCommand(&MarkAsProcessingCMD{
                            OrderID:  "123",
                            WorkerID: "worker-1",
                        }).
                        ThenState(t, &OrderProcessing{
                            Order: Order{
                                ID:        "123",
                                OrderAttr: order,
                                WorkerID:  "worker-1",
                            },
                        }).
                        ForkCase(t, "mark order as completed", func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
                            c.
                                GivenCommand(&MarkOrderCompleteCMD{
                                    OrderID: "123",
                                }).
                                ThenState(t, &OrderCompleted{
                                    Order: Order{
                                        ID:               "123",
                                        OrderAttr:        order,
                                        WorkerID:         "worker-1",
                                        DeliveredAt:      &now,
                                        StockRemovedAt:   &now,
                                        PaymentChargedAt: &now,
                                    },
                                })
                        }).
                        ForkCase(t, "cancel order", func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
                            c.
                                GivenCommand(&CancelOrderCMD{
                                    OrderID: "123",
                                    Reason:  "out of stock",
                                }).
                                ThenState(t, &OrderCancelled{
                                    Order: Order{
                                        ID:              "123",
                                        OrderAttr:       order,
                                        WorkerID:        "worker-1",
                                        CancelledAt:     &now,
                                        CancelledReason: "out of stock",
                                    },
                                })
                        }).
                        ForkCase(t, "try complete order but removing products from stock fails", func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
                            c.
                                GivenCommand(&MarkOrderCompleteCMD{
                                    OrderID: "123",
                                }).
                                BeforeCommand(func(t testing.TB, di Dependency) {
                                    di.(*DependencyMock).ResetCalls()
                                    di.(*DependencyMock).WarehouseRemoveStockFunc = func(ctx context.Context, quantity int) error {
                                        return fmt.Errorf("warehouse api unreachable")
                                    }
                                }).
                                AfterCommand(func(t testing.TB, di Dependency) {
                                    dep := di.(*DependencyMock)
                                    dep.WarehouseRemoveStockFunc = nil
                                    if assert.Len(t, dep.WarehouseRemoveStockCalls(), 1) {
                                        assert.Equal(t, order.Quantity, dep.WarehouseRemoveStockCalls()[0].Quantity)
                                    }

                                    assert.Len(t, dep.PaymentChargeCalls(), 0)
                                }).
                                ThenState(t, &OrderError{
                                    Retried:        0,
                                    RetriedAt:      nil,
                                    ProblemCode:    ProblemWarehouseAPIUnreachable,
                                    ProblemCommand: &MarkOrderCompleteCMD{OrderID: "123"},
                                    ProblemState: &OrderProcessing{
                                        Order: Order{
                                            ID:        "123",
                                            OrderAttr: order,
                                            WorkerID:  "worker-1",
                                        },
                                    },
                                }).
                                ForkCase(t, "successfully recover", func(t *testing.T, c *machine.Case[Dependency, Command, State]) {
                                    c.
                                        GivenCommand(&TryRecoverErrorCMD{OrderID: "123"}).
                                        BeforeCommand(func(t testing.TB, di Dependency) {
                                            di.(*DependencyMock).ResetCalls()
                                        }).
                                        AfterCommand(func(t testing.TB, di Dependency) {
                                            dep := di.(*DependencyMock)
                                            if assert.Len(t, dep.WarehouseRemoveStockCalls(), 1) {
                                                assert.Equal(t, order.Quantity, dep.WarehouseRemoveStockCalls()[0].Quantity)
                                            }
                                            if assert.Len(t, dep.PaymentChargeCalls(), 1) {
                                                assert.Equal(t, order.Price, dep.PaymentChargeCalls()[0].Price)
                                            }
                                        }).
                                        ThenState(t, &OrderCompleted{
                                            Order: Order{
                                                ID:               "123",
                                                OrderAttr:        order,
                                                WorkerID:         "worker-1",
                                                DeliveredAt:      &now,
                                                StockRemovedAt:   &now,
                                                PaymentChargedAt: &now,
                                            },
                                        })
                                })
                        })
                })
        },
    )

    if suite.AssertSelfDocumentStateDiagram(t, "machine_test.go") {
        suite.SelfDocumentStateDiagram(t, "machine_test.go")
    }
}

A few things to notice in this test:

  • We use standard go testing.
  • We use machine.NewTestSuite as a standard way to test state machines.
  • We start by describing the happy path, and use suite.Case to define a test case.
  • Most importantly, we define test cases using the GivenCommand and ThenState functions, which make the test more readable and, hopefully, self-documenting.
  • You can see the use of ForkCase, which lets you take the state declared in ThenState, apply a new command to it, and expect a new state.
  • Less visible, but still important for writing concise code, is the use of moq to generate DependencyMock for the dependencies.

I know it's subjective, but I find it very readable and easy to understand, even for non-programmers.

Generating state diagram from tests

The last bit is these lines at the bottom of the test file:

example/state/machine_test.go
if suite.AssertSelfDocumentStateDiagram(t, "machine_test.go") {
   suite.SelfDocumentStateDiagram(t, "machine_test.go")
}

This code takes all inputs provided in the test suite and fuzzes them: it applies commands to random states and records the results of those transitions.

  • SelfDocumentStateDiagram produces two mermaid diagrams that show all transitions possible in our state machine.
  • AssertSelfDocumentStateDiagram compares newly generated diagrams to the diagrams committed in the repository, and fails the test if they differ. You don't have to use it, but it's good practice to ensure that your state machine is well tested and doesn't regress without you noticing.

Two diagrams are generated.

The first is a diagram of ONLY successful transitions, which you saw at the beginning of this post.

stateDiagram
    "*state.OrderProcessing" --> "*state.OrderCancelled": "*state.CancelOrderCMD"
    [*] --> "*state.OrderPending": "*state.CreateOrderCMD"
    "*state.OrderPending" --> "*state.OrderProcessing": "*state.MarkAsProcessingCMD"
    "*state.OrderProcessing" --> "*state.OrderCompleted": "*state.MarkOrderCompleteCMD"
    "*state.OrderProcessing" --> "*state.OrderError": "*state.MarkOrderCompleteCMD"
    "*state.OrderError" --> "*state.OrderCompleted": "*state.TryRecoverErrorCMD"

The second is a diagram that also includes commands that resulted in errors:

stateDiagram
 %% error=cannot cancel order, order must be processing to cancel it; invalid transition 
    "*state.OrderCancelled" --> "*state.OrderCancelled": "❌*state.CancelOrderCMD"
 %% error=cannot cancel order, order must be processing to cancel it; invalid transition 
    "*state.OrderCompleted" --> "*state.OrderCompleted": "❌*state.CancelOrderCMD"
 %% error=cannot cancel order, order must be processing to cancel it; invalid transition 
    "*state.OrderError" --> "*state.OrderError": "❌*state.CancelOrderCMD"
 %% error=cannot cancel order, order must be processing to cancel it; invalid transition 
    "*state.OrderPending" --> "*state.OrderPending": "❌*state.CancelOrderCMD"
    "*state.OrderProcessing" --> "*state.OrderCancelled": "*state.CancelOrderCMD"
 %% error=cannot cancel order, order must be processing to cancel it; invalid transition 
    [*] --> [*]: "❌*state.CancelOrderCMD"
 %% error=cannot attemp order creation, order exists: invalid transition 
    "*state.OrderCancelled" --> "*state.OrderCancelled": "❌*state.CreateOrderCMD"
 %% error=cannot attemp order creation, order exists: invalid transition 
    "*state.OrderCompleted" --> "*state.OrderCompleted": "❌*state.CreateOrderCMD"
 %% error=cannot attemp order creation, order exists: invalid transition 
    "*state.OrderError" --> "*state.OrderError": "❌*state.CreateOrderCMD"
 %% error=cannot attemp order creation, order exists: invalid transition 
    "*state.OrderPending" --> "*state.OrderPending": "❌*state.CreateOrderCMD"
 %% error=cannot attemp order creation, order exists: invalid transition 
    "*state.OrderProcessing" --> "*state.OrderProcessing": "❌*state.CreateOrderCMD"
    [*] --> "*state.OrderPending": "*state.CreateOrderCMD"
 %% error=invalid transition 
    "*state.OrderCancelled" --> "*state.OrderCancelled": "❌*state.MarkAsProcessingCMD"
 %% error=invalid transition 
    "*state.OrderCompleted" --> "*state.OrderCompleted": "❌*state.MarkAsProcessingCMD"
 %% error=invalid transition 
    "*state.OrderError" --> "*state.OrderError": "❌*state.MarkAsProcessingCMD"
    "*state.OrderPending" --> "*state.OrderProcessing": "*state.MarkAsProcessingCMD"
 %% error=invalid transition 
    "*state.OrderProcessing" --> "*state.OrderProcessing": "❌*state.MarkAsProcessingCMD"
 %% error=invalid transition 
    [*] --> [*]: "❌*state.MarkAsProcessingCMD"
 %% error=cannot mark order as complete, order is not being process; invalid transition 
    "*state.OrderCancelled" --> "*state.OrderCancelled": "❌*state.MarkOrderCompleteCMD"
 %% error=cannot mark order as complete, order is not being process; invalid transition 
    "*state.OrderCompleted" --> "*state.OrderCompleted": "❌*state.MarkOrderCompleteCMD"
 %% error=cannot mark order as complete, order is not being process; invalid transition 
    "*state.OrderError" --> "*state.OrderError": "❌*state.MarkOrderCompleteCMD"
 %% error=cannot mark order as complete, order is not being process; invalid transition 
    "*state.OrderPending" --> "*state.OrderPending": "❌*state.MarkOrderCompleteCMD"
    "*state.OrderProcessing" --> "*state.OrderCompleted": "*state.MarkOrderCompleteCMD"
    "*state.OrderProcessing" --> "*state.OrderError": "*state.MarkOrderCompleteCMD"
 %% error=cannot mark order as complete, order is not being process; invalid transition 
    [*] --> [*]: "❌*state.MarkOrderCompleteCMD"
 %% error=cannot recover from non error state; invalid transition 
    "*state.OrderCancelled" --> "*state.OrderCancelled": "❌*state.TryRecoverErrorCMD"
 %% error=cannot recover from non error state; invalid transition 
    "*state.OrderCompleted" --> "*state.OrderCompleted": "❌*state.TryRecoverErrorCMD"
    "*state.OrderError" --> "*state.OrderCompleted": "*state.TryRecoverErrorCMD"
 %% error=cannot recover from non error state; invalid transition 
    "*state.OrderPending" --> "*state.OrderPending": "❌*state.TryRecoverErrorCMD"
 %% error=cannot recover from non error state; invalid transition 
    "*state.OrderProcessing" --> "*state.OrderProcessing": "❌*state.TryRecoverErrorCMD"
 %% error=cannot recover from non error state; invalid transition 
    [*] --> [*]: "❌*state.TryRecoverErrorCMD"

Those diagrams are stored in the same directory as the test file, and their file names are prefixed with the name passed to AssertSelfDocumentStateDiagram:

machine_test.go.state_diagram.mmd
machine_test.go.state_diagram_with_errors.mmd
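
If you are curious how a diagram can fall out of a transition function, here is a simplified sketch of the idea. It is not the mkunion/x/machine implementation: it uses a toy string-based machine, it tries every command on every known state instead of sampling random ones, and the Diagram function is a name I invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

type State string
type Command string

var errInvalid = errors.New("invalid transition")

// transition is a toy machine: [*] -Create-> Pending -Process-> Processing.
func transition(cmd Command, s State) (State, error) {
	switch {
	case cmd == "Create" && s == "[*]":
		return "Pending", nil
	case cmd == "Process" && s == "Pending":
		return "Processing", nil
	}
	return s, errInvalid
}

// Diagram applies every command to every state, records the outcome, and
// renders it as mermaid stateDiagram lines. Failed transitions are drawn as
// self-loops marked with ❌, and only when withErrors is true.
func Diagram(states []State, cmds []Command, withErrors bool) []string {
	var lines []string
	for _, s := range states {
		for _, c := range cmds {
			next, err := transition(c, s)
			switch {
			case err == nil:
				lines = append(lines, fmt.Sprintf("    %s --> %s: %s", s, next, c))
			case withErrors:
				lines = append(lines, fmt.Sprintf("    %s --> %s: ❌%s", s, s, c))
			}
		}
	}
	sort.Strings(lines) // stable output, so the file can be diffed against a committed copy
	return append([]string{"stateDiagram"}, lines...)
}

func main() {
	states := []State{"[*]", "Pending", "Processing"}
	cmds := []Command{"Create", "Process"}
	for _, line := range Diagram(states, cmds, true) {
		fmt.Println(line)
	}
}
```

Sorting the lines is what makes a check like AssertSelfDocumentStateDiagram practical: the output only changes when the behaviour of the machine changes.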

State machines builder

MkUnion provides the *machine.Machine[Dependency, Command, State] struct, which wires Transition, dependencies, and state together. It provides methods like:

  • Handle(ctx context.Context, cmd C) error, which applies a command to the state and returns an error if something went wrong during the transition.
  • State() S, which returns the current state of the machine.
  • Dep() D, which returns the dependencies that the machine was built with.

This standard makes it possible to build on top of it. For example, the testing library that we used in Testing state machines & self-documenting leverages it.

Another good practice is that every package that defines a state machine in the way described here should provide a NewMachine function that returns a bootstrapped machine with the package's types, like so:

example/state/machine.go
func NewMachine(di Dependency, init State) *machine.Machine[Dependency, Command, State] {
    return machine.NewMachine(di, Transition, init)
}
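
To get a feeling for what such a wrapper does, here is a minimal sketch of a generic machine with the three methods listed above. This is my own simplified version, written under the assumption that the state is replaced only when the transition succeeds; the real machine.Machine may differ. The counter transition and the Run helper exist only for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Machine is a hand-written sketch, not mkunion's implementation.
type Machine[D, C, S any] struct {
	dep        D
	transition func(ctx context.Context, dep D, cmd C, state S) (S, error)
	state      S
}

func NewMachine[D, C, S any](dep D, f func(context.Context, D, C, S) (S, error), init S) *Machine[D, C, S] {
	return &Machine[D, C, S]{dep: dep, transition: f, state: init}
}

// Handle applies cmd to the current state. The state is replaced only
// when the transition succeeds (an assumption of this sketch).
func (m *Machine[D, C, S]) Handle(ctx context.Context, cmd C) error {
	next, err := m.transition(ctx, m.dep, cmd, m.state)
	if err != nil {
		return err
	}
	m.state = next
	return nil
}

func (m *Machine[D, C, S]) State() S { return m.state }
func (m *Machine[D, C, S]) Dep() D   { return m.dep }

// counter is a toy transition: the dependency is the step size,
// the state is an int, and the only valid command is "inc".
func counter(_ context.Context, step int, cmd string, state int) (int, error) {
	if cmd != "inc" {
		return state, errors.New("unknown command: " + cmd)
	}
	return state + step, nil
}

// Run feeds commands to a fresh machine and returns the last valid state.
func Run(cmds ...string) (int, error) {
	m := NewMachine(2, counter, 0)
	for _, cmd := range cmds {
		if err := m.Handle(context.Background(), cmd); err != nil {
			return m.State(), err
		}
	}
	return m.State(), nil
}

func main() {
	fmt.Println(Run("inc", "inc"))   // 4 <nil>
	fmt.Println(Run("inc", "reset")) // 2 unknown command: reset
}
```

Because the machine only knows about the three type parameters, the same wrapper can serve any package that follows the Transition convention, which is what lets a shared testing library work with it.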

Conclusion

Now we have all the pieces in place, and we can start building our application.

  • We have the NewMachine constructor, which gives us an object to use in our application.
  • We have tests that ensure our state machine is correct, fuzz tests that help discover edge cases, and diagrams showing which paths we tested and covered.
  • We saw how this approach focuses on business logic and keeps it separate from other concerns like the database or API clients, which is one of the principles of clean architecture.

Next steps