How I Tested Asyncio Code in PHX Events

There are many resources on developing asyncio projects, but very few on testing them. Through trial and error and lots of research, I wrote asyncio tests with the help of pytest-asyncio and my own custom fixtures. Here’s how I used testing patterns to simply and easily test asyncio code.


Flickswitch is a software-as-a-service company that builds products to allow businesses to manage their SIM card fleets. While working at Flickswitch, I built an asyncio Python library, PHX Events, which processes real-time messages containing updates of the airtime and data balances of SIM cards managed through their SIMcontrol product.

We chose asyncio, with the ability to fall back on traditional threads, to enable high throughput for our primarily network-bound workloads.

Asyncio is really good at network-bound workloads because it allows other code to run while waiting for network responses.
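
As a minimal sketch of that interleaving (the `fetch` coroutine and its log messages are made up for illustration, and `asyncio.sleep(0)` stands in for a real network wait), two coroutines run with `asyncio.gather` take turns whenever one of them awaits:

```python
import asyncio


async def fetch(name: str, log: list[str]) -> None:
    """Hypothetical request: record when it starts and when it finishes."""
    log.append(f"{name} sent")
    # Stand-in for waiting on the network; control returns to the event loop here.
    await asyncio.sleep(0)
    log.append(f"{name} received")


async def main() -> list[str]:
    log: list[str] = []
    # Both coroutines are scheduled together and interleave at each await.
    await asyncio.gather(fetch("a", log), fetch("b", log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both requests are sent before either response is handled, which is the behaviour that makes asyncio a good fit for network-bound work.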

The PHX Events proof of concept was built to validate the capabilities of both the server and the Python client and it had no tests. Over time, the proof of concept took on more responsibility and I decided it was worth rewriting it with proper test coverage to make sure that it was maintainable even after I left.

I encountered many questions around testing for asyncio systems in Python that I couldn’t easily find answers to online, such as: How to run async tests, how to mock async methods or functions, and what patterns should be used to ensure the code runs as expected. In the process, I found a set of tools and patterns that worked for my project. I used:

Pytest-asyncio for its ability to run async tests and fixtures,
AsyncMock, a relatively new addition to the unittest mock family to test async code, and
Events, an asyncio synchronisation primitive, to track which async functions had run successfully.

The limitations of PHX Events

We use PHX Events as a high-volume event stream handler. This is how it works: the upstream Elixir server produces events and uses the Phoenix Channels protocol to send them to any subscribed consumers.

Our initial proof of concept was a success because we could handle a reasonable volume of inbound messages and the handler wasn’t spewing errors. But, we wanted to add more topics and use the data we were getting in more ways.

To do this, we had to eliminate a few glaring omissions in the proof of concept’s feature set that were quite limiting:

It didn’t handle subscription responses correctly,
It wasn’t easy to add more topics,
It assumed that there would only ever be a single handler per event type, and
There were no tests!

We needed the ability to subscribe to more topics, handle new data streams, and use the data for multiple different tasks.

Replacing code that’s already in production

We realised using a pubsub method of transferring data was very convenient for doing real-time balance updates. It was fast and let us separate the code generating the data from the code processing the data. To enable this, we needed to subscribe to different topics and be able to register more handler functions for each event type. We also needed to fix the issues we’d found when using PHX Events in production. These changes affected the fundamental structure of the existing code, so we had to do a full rewrite using the knowledge we had gained from the proof of concept.

When replacing something that’s already running successfully in production, it’s often necessary to substantiate why it’s worth taking the risk of changing from the old to the new process.

In my case, I had two objectives: I wanted to leave no room for doubt that the new version of the code was better than what I’d originally written as the proof of concept and I wanted to leave the project in a state that could easily be expanded upon without needing to study the entire codebase first. This is where I think tests shine.

In many projects, tests are seen as second-class citizens. However, in my experience, they serve an incredibly valuable purpose both for the maintainers and, in the case of a library, the users. The two key benefits I’ve seen them provide are:

The code works as intended

The simplest and most useful test is that the code both runs and runs as intended. It can’t be overstated how useful this is. Any changes you make can be validated against known test cases to give maintainers confidence that what they’ve done most likely hasn’t broken things. An important note is that tests can never give you 100% certainty. They’re a useful measure but if they’re written badly or not kept up to date, they don’t actually provide any useful guarantees.

Coverage is a metric of reliability

Code coverage is a useful metric to track. As the codebase grows, so does the number of critical paths and the potential for bugs. Having a good pool of tests requires that they test as many of the paths through your code as possible. Having a high test coverage score doesn’t automatically mean code is good, but in my experience, libraries with good coverage are often the ones worth investigating first.

Testing PHX Events

What didn’t work: pytest and unittest Mock

When I first set out to test PHX Events, I had never really tested an asyncio project before. I did, however, have years of testing traditional Python code. My first stab at writing the tests relied on the techniques and tools I already knew: pytest and unittest’s Mocks.

Asyncio code has to run on the event loop to work. As such, only async code can call other async code directly. From synchronous code, the only way to run async code is to invoke the event loop directly. That’s why pytest on its own wasn’t able to test the async portions of the library.
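
A small sketch of that constraint, using a made-up `get_balance` coroutine: calling an async function from regular code only creates a coroutine object, and nothing runs until an event loop is asked to drive it.

```python
import asyncio
import inspect


async def get_balance() -> int:
    """Hypothetical async function under test."""
    await asyncio.sleep(0)
    return 42


def call_from_sync_code() -> int:
    # Calling the async function does not run it; it only builds a coroutine object.
    pending = get_balance()
    assert inspect.iscoroutine(pending)
    # Synchronous code has to hand the coroutine to an event loop to get a result.
    return asyncio.run(pending)


if __name__ == "__main__":
    print(call_from_sync_code())
```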

Similarly, because async functions have different internals to regular Python functions, the Mock/MagicMock functionality provided by the unittest library didn’t work as I expected it to. I later realised this was down to user error on my part. I also tried to implement spy mocks to check the function invocation without preventing the function from actually being called, but the unexpected behaviour continued.

What worked: pytest-asyncio and AsyncMock

Fortunately, the world of asyncio has grown significantly since the library was first introduced with Python 3.4 in March 2014. All of the pitfalls I had stumbled into at first had existing solutions! I just had to put them together to create a custom solution for my tests.

The first and most pressing of the pitfalls is that you can’t natively use pytest to run async code. Pytest-asyncio is a necessity if you intend to write async tests in pytest. Made by the same people that made pytest, it allows you to easily call your async code and write async fixtures.

After some Googling, I found that unittest.mock already had a tool for mocking coroutines, AsyncMock. AsyncMock behaves the same as MagicMock or Mock but with explicit support for coroutines. It also provides some useful helper methods on the resulting mock objects, such as assert_awaited and assert_awaited_with.
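
Here is a minimal sketch of AsyncMock in use. The `notify` function and its arguments are hypothetical and not part of PHX Events; the mock methods are the ones provided by `unittest.mock` in Python 3.8 and later.

```python
import asyncio
from unittest.mock import AsyncMock


async def notify(send, sim_id: str) -> str:
    """Hypothetical function under test: awaits an injected async sender."""
    return await send(sim_id, status="updated")


async def check_notify() -> str:
    # Calling an AsyncMock returns an awaitable that resolves to return_value.
    send = AsyncMock(return_value="ok")
    result = await notify(send, "sim-1")
    # Fails unless the mock was awaited exactly once with these arguments.
    send.assert_awaited_once_with("sim-1", status="updated")
    return result


if __name__ == "__main__":
    print(asyncio.run(check_notify()))
```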

The final piece of the puzzle for testing PHX Events was using the Event synchronisation primitive provided by the asyncio library. This class creates an object that can be used to wait until the event is set. You can await on event.wait() to pause execution until event.set() is called somewhere else. I used this to easily track if the handler functions I set for events were called as expected.

The benefits of asyncio testing

As I mentioned, pytest-asyncio is a life-saver if you want to use pytest to test asyncio code.

It also allows you to change the event loop your tests are running on very easily if you need to test your code on the default Python event loop and on uvloop.

The main benefit I got from pytest-asyncio in the PHX Events tests was the ability to write async test functions.

This functionality is enabled by using the provided pytest.mark.asyncio pytest mark.

A mark or marker is a way of attaching metadata to a test function. They can be used to enable functionality for specific tests or just as a way of grouping tests. You can mark modules, classes, and individual functions.

An example of the mark being applied to a test module from the pytest-asyncio documentation:

import asyncio
import pytest

# All test coroutines will be treated as marked.
pytestmark = pytest.mark.asyncio

async def test_example():
    """No marker!"""

    # asyncio.sleep() no longer accepts a loop argument (removed in Python 3.10).
    await asyncio.sleep(0)

Similarly, per function:

import asyncio
import pytest

@pytest.mark.asyncio
async def test_example():
    # The coroutine runs on the event loop that pytest-asyncio provides.
    await asyncio.sleep(0)

Or per class:

import asyncio
import pytest

@pytest.mark.asyncio
class TestClass:

    async def test_example(self):
        # Test methods on a class need self as their first parameter.
        await asyncio.sleep(0)

An example from PHX Events library testing the async context manager functionality:

from unittest.mock import patch
import pytest
from phx_events.client import PHXChannelsClient

pytestmark = pytest.mark.asyncio

class TestPHXChannelsClientAEnter:

    async def test_returns_self_when_using_as_with_context_manager(self):
        phx_client = PHXChannelsClient('ws://web.socket/url/')

        async with phx_client as test_phx_client:
            assert isinstance(test_phx_client, PHXChannelsClient)
            assert phx_client == test_phx_client

The power of pytest-asyncio cannot be overstated.

Less overhead: Without the plugin, getting your tests to run as seamlessly as they do with the pytest mark would require hooking into the pytest framework, running any asynchronous tests on a separate event loop, and managing the handling of the results yourself. That is overhead you would otherwise carry in every project where you want to use asyncio.
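
For comparison, a rough sketch of the by-hand alternative, assuming a made-up `double` coroutine: every test becomes a synchronous function that starts and tears down its own event loop. It works for a single call, but it does not give you async fixtures or any shared handling of the loop.

```python
import asyncio


async def double(value: int) -> int:
    """Hypothetical coroutine under test."""
    await asyncio.sleep(0)
    return value * 2


def test_double_without_plugin() -> None:
    # A plain synchronous test that creates and closes an event loop itself.
    result = asyncio.run(double(21))
    assert result == 42


if __name__ == "__main__":
    test_double_without_plugin()
```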

Support setup: Another benefit that can easily be overlooked is that this implementation is made by the same people that made pytest, which gives pytest-asyncio the advantage in terms of support and interoperability with existing pytest plugins and pytest test suites.

Access to async fixtures: I used them to mock out the websocket connections to ensure the tests weren’t reliant on a local or remote server. Initially, I spent a lot of time implementing an async mock for an async iterator. However, while writing this article, I found a simpler way to do this is to use the side_effect property of the AsyncMock as is shown in the example below. The old version can be seen here.

Creating Async Fixtures

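As is usual when creating a fixture, you only have to decorate an async function with @pytest.fixture for it to work once you have the pytest-asyncio plugin installed. In newer pytest-asyncio releases this only holds in auto mode; strict mode, which is the default there, expects the @pytest_asyncio.fixture decorator on async fixtures instead.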


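from unittest.mock import AsyncMock, MagicMock, patch

import pytest


@pytest.fixture()
def mock_websocket_client() -> MagicMock:
    # Patch the websocket client used by phx_events for the duration of the test.
    with patch('phx_events.client.client', autospec=True) as mocked_websocket:
        yield mocked_websocket

@pytest.fixture()
async def mock_websocket_connection(mock_websocket_client) -> AsyncMock:
    # Open the mocked connection the same way the real code would.
    async with mock_websocket_client.connect('ws://web.socket/url/') as client_connection:
        yield client_connection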

That mock is used by adding the fixture to a test function. In the test itself, before the mock is used, you set the side_effect value of the mock_websocket_connection.__aiter__.

In the following example, the mocked websocket is being used to test that a close message is handled correctly:


pytestmark = pytest.mark.asyncio

class TestPHXChannelsClientProcessWebsocketMessages:
    def setup_method(self):  # pytest 8 no longer calls nose-style setup()
        self.phx_client = PHXChannelsClient('ws://web.socket/url/')
        self.topic = Topic('topic:subtopic')

    async def test_raises_exception_on_phx_close_event(self, mock_websocket_connection):
        close_message = json_handler.dumps(make_message(PHXEvent.close, self.topic))
        mock_websocket_connection.__aiter__.side_effect = lambda: async_iter(close_message)
        with pytest.raises(TopicClosedError, match=r"'topic:subtopic', 'Upstream closed'"):
            await self.phx_client.process_websocket_messages(mock_websocket_connection)
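
The same idea can be tried without a websocket library. The sketch below uses the documented `unittest.mock` shortcut of setting `__aiter__.return_value` to a list, rather than the `side_effect` approach above; the `collect_messages` function and the message strings are placeholders for illustration.

```python
import asyncio
from unittest.mock import MagicMock


async def collect_messages(connection) -> list[str]:
    """Hypothetical consumer: drains an async-iterable connection."""
    return [message async for message in connection]


async def check_collect() -> list[str]:
    connection = MagicMock()
    # unittest.mock wraps this list in an async iterator for `async for`.
    connection.__aiter__.return_value = ["first message", "close message"]
    return await collect_messages(connection)


if __name__ == "__main__":
    print(asyncio.run(check_collect()))
```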

I like having access to async fixtures, because they allow you to set up objects as you would in your normal code. In the above fixture example, creating the mocked websocket connection using an async with context manager lets you keep your mock objects as real as possible.

Having access to asynchronous data generation via fixtures is a useful way to interact with your async code in a test environment. Being able to call your async functions directly, with the control pytest-asyncio provides, is a great way to keep your test code as simple and maintainable as possible.

Pytest-asyncio provides the building blocks required to get started on testing asyncio and that’s pretty much it. It doesn’t make async code easier to reason about, nor does it abstract any of the complexities of the asyncio world. And I don’t think it should. Adding magic to an already complicated system would just make it harder to reason about any test failures.

Building on the foundation of asyncio and pytest-asyncio, I used asyncio Events as the final piece of the testing puzzle.

Using asyncio events to track async function calls

Asyncio event objects are a synchronisation primitive provided by Python. They allow you to model an event that you want your code to wait for before continuing. The example below will only continue past await event.wait() once event.set() has been called somewhere else in the code.


async def waiter(event):
    print('waiting for it ...')
    await event.wait()
    print('... got it!')

I used asyncio event objects extensively in the tests for PHX Events. They were useful because they allowed me to assert that an event had happened as expected when running the event processing code. This was necessary because it’s possible to configure any number of handlers to run for a topic or a topic and event type combination. Being able to make sure only the expected handlers ran was an essential part of testing the system.
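
A stripped-down sketch of that pattern, with a hypothetical `run_handlers` dispatcher standing in for the PHX Events processing code: each handler sets its own Event, and the test checks which Events ended up set.

```python
import asyncio


async def run_handlers(handlers, message: str) -> None:
    """Hypothetical dispatcher: awaits every registered handler in turn."""
    for handler in handlers:
        await handler(message)


async def check_only_expected_handler_ran() -> tuple[bool, bool]:
    expected_ran = asyncio.Event()
    other_ran = asyncio.Event()

    async def expected_handler(message: str) -> None:
        expected_ran.set()

    async def other_handler(message: str) -> None:
        other_ran.set()

    # Only one of the two handlers is registered for this message.
    await run_handlers([expected_handler], "balance-update")
    # wait_for bounds the wait, so a missing set() fails fast instead of hanging.
    await asyncio.wait_for(expected_ran.wait(), timeout=1)
    return expected_ran.is_set(), other_ran.is_set()


if __name__ == "__main__":
    print(asyncio.run(check_only_expected_handler_ran()))
```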

Below is an example of a test that checks that only the event handlers run if there is no topic handler associated with the topic in the message. The event handler made for the test sets the event when it runs:


async def event_handler(message: ChannelMessage, client: PHXChannelsClient) -> None:
    client.logger.info(f'{event_handler.__name__} {message=}')
    self.event_handler_event.set()

The test code waits for the event queue to be empty and then checks if the event is set. If it is, we know the handler was run.


    async def test_only_event_handlers_used_for_event_no_topic(self, event_loop, caplog):
        event_handler_config = self.phx_client._event_handler_config[self.event]
        event_message = make_message(self.event, Topic('random_topic'))
        await event_handler_config.queue.put(event_message)
        mock_loop = Mock(event_loop, autospec=True, wraps=event_loop)
        self.phx_client._loop = mock_loop

        # Set log level
        caplog.set_level(logging.INFO)
        event_loop.create_task(self.phx_client._event_processor(self.event))

        # Wait for task to process messages
        await event_handler_config.queue.join()
        assert event_handler_config.queue.empty()

        # The event handler is called with the message and client
        assert self.event_handler_event.is_set()
        assert caplog.messages[0] == f'{self.event_handler.__name__} message={event_message}'

        # We only expect the coroutine to have been called
        mock_loop.create_task.assert_called()

        # We don't expect the event_topic_handler to have been called
        mock_loop.run_in_executor.assert_not_called()
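
The waiting step in that test can be sketched on its own. This assumes a hypothetical `processor` task rather than the real `_event_processor`: `queue.join()` only returns once every message taken from the queue has been marked done, so the Event can be checked safely afterwards.

```python
import asyncio
import contextlib


async def processor(queue: asyncio.Queue, handled: asyncio.Event) -> None:
    """Hypothetical stand-in for an event processor task."""
    while True:
        await queue.get()
        handled.set()
        # join() only returns once every get() has a matching task_done().
        queue.task_done()


async def check_processor() -> bool:
    queue: asyncio.Queue = asyncio.Queue()
    handled = asyncio.Event()
    await queue.put("message")
    task = asyncio.create_task(processor(queue, handled))

    # Wait for the task to process everything that was queued.
    await queue.join()

    # Stop the endless processor loop before leaving the test.
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return queue.empty() and handled.is_set()


if __name__ == "__main__":
    print(asyncio.run(check_processor()))
```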

Tip: Python exposes other primitives that could be useful. When using something new, it’s often worth reading the full documentation to make sure you have a good idea of all the tools that you have access to before you start working. I wouldn’t have found the strategy of using Events if I hadn’t first read the documentation and learned that they existed and how they worked.

Conclusion

Since its introduction in Python 3.4, asyncio has slowly grown in adoption and matured into a fully-fledged feature. It’s been an uphill battle: Existing code is often incompatible with asyncio code, which requires rewriting existing code or changing dependencies to options with asyncio support. This has long been a self-perpetuating problem, as projects aren’t made using asyncio, because there are few projects made using asyncio.

As the Python ecosystem moves forward and more code is made using asyncio, it’s going to be increasingly important to adapt existing techniques and workflows to handle the shift. Through this process it will be necessary to onboard new plugins and libraries that make life in the async world simpler and easier to manage. However, these will never be a substitute for knowing how to write good tests.

Writing tests is an important part of the software creation process. Knowing how to write tests effectively is essential to giving other developers the peace of mind that your code can be changed and used with confidence.

The original goals of the rewrite were to make corrections based on the learnings from the proof of concept and to add test coverage to make the code more maintainable. I believe I succeeded in doing that.

Currently, PHX Events has 97% test coverage across all the Python code. The bulk of the uncovered code (3 branches) is in an async logger. The process of writing the tests brought many bugs to light that wouldn’t have otherwise been found before using it in production. I would claim that this is the power of tests: the necessary interrogation of your own code, which is essentially writing up a list of expectations that it should meet for it to be considered valid. In this, I think, PHX Events succeeded.

Resources:

You can view the full project here.

Jethro Muller is a senior full stack developer at Stitch. He primarily works in Typescript on server-side NodeJS code. He enjoys ORM query optimisation, building pipelines and tooling, optimising workflows, and playing with AI.