My team has to be able to deliver new exotic trade types for a global banking network within a few weeks - complete with risk and sensitivity reporting. For this, we've introduced a mechanism to keep both digital and physical agile scrum boards in sync. Here's how we did that.
Alchemy is an in-house trading platform built for Standard Bank South Africa's structured solutions desk. The system was introduced to give the trading desk the ability to book exotic trade types in as short a time period as possible to maintain a competitive funding advantage. This is quite a task because exotic trades are complex in nature and the quantitative models involved in integration usually take months to glue together.
That's where Alchemy comes in. Our system offers three key services:
- Integration of complex quantitative mathematical models - These models require market data such as interest rates or foreign exchange rates and trade economics as inputs and provide a trade valuation as an output.
- The ability to provide daily reports on the valuation of complex trades - This allows our traders to hedge risk against various market conditions, such as interest rate changes or currency movements.
- A temporary home for new exotic trade types - New exotic trade types that are not yet supported by other systems, and so cannot be booked there, are sometimes temporarily booked in Alchemy.
The key to maintaining a competitive advantage in this business is the ability to integrate complex new mathematical models into our system at a rapid rate.
One of our most essential tools is, perhaps surprisingly, what started as a common scrum board: It ensures that all developers know exactly what every other developer is busy with and helps us to flag early whenever delays are creeping in during an agile sprint. However, the board is a powerful tool only when it is kept up to date and synchronized with 3rd party solutions such as Jira or Rally Dev.
This synchronization can take a developer or scrum master a few hours per day. That, of course, was not an acceptable situation to deal with in an environment where time is of the essence. That's why we came up with something new: The trackable agile cards board (TRAC board).
The Trackable Agile Cards Board
We decided to synchronise our old-school scrum board via image processing. This way, we would be able to track our stories as per usual on our physical scrum board while their different states were automatically synchronised with the digital agile platform Rally Dev. That's how the TRackable Agile Cards board (TRAC board) was born.
The TRAC board workflow works as follows:
1. Image processing
A simple camera snaps an image of the physical scrum board. The tasks and their positions are then located programmatically on the image. To make the tasks easy to locate, we decided to print every task with a QR code. QR codes are simple to locate on an image using a C++ library called OpenCV (Open Source Computer Vision Library).
2. Synchronisation of the TRAC board with Rally Dev and other 3rd party software
Now, this information can be used to update any digital platforms accordingly. The TRAC board service between our physical board and a digital platform creates an abstraction layer between the two worlds. This allows us to run our board alongside any digital platform. This is super useful because it allows other teams to use whatever they are used to, should they wish to work with the TRAC board system as well.
To free up more developer time, we decided to allow the TRAC board to do more than simply update a digital platform with its state. We wanted the board to perform other tasks as well, such as initiating the deployment of code to various environments or notifying our traders of required sign-offs.
To do this, we allowed actions to be assigned to tasks when they enter a certain state. For example, when a task moves from "in-progress" to "testing required", the system automatically issues a command to our CI system which then deploys the feature to a testing environment. This removes the need for developers to manage the deployment of features to various environments which saves us quite a bit of time.
Actions may include:
- Deployment – Features can be deployed to an environment using an existing continuous deployment service like GitLab.
- Notification – Notifications such as an email can be sent to users waiting to test a new feature.
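The idea of attaching actions to state changes can be sketched as a small dispatch table. This is only an illustration under assumptions: the names `on_transition`, `handle_move`, `deploy_to_testing` and `notify_testers` are hypothetical, and the actions return strings where a real system would call a CI service or send an email.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical action signature: takes a task ID, returns a log line.
Action = Callable[[str], str]

# Maps a (from_state, to_state) transition to the actions it triggers.
ACTIONS: Dict[Tuple[str, str], List[Action]] = defaultdict(list)


def on_transition(from_state: str, to_state: str) -> Callable[[Action], Action]:
    """Register an action for one specific column change."""
    def register(action: Action) -> Action:
        ACTIONS[(from_state, to_state)].append(action)
        return action
    return register


@on_transition("in-progress", "testing required")
def deploy_to_testing(task_id: str) -> str:
    # Placeholder: a real system would call the CI service here.
    return f"deploy {task_id} to testing"


@on_transition("in-progress", "testing required")
def notify_testers(task_id: str) -> str:
    # Placeholder: a real system would send an email here.
    return f"email testers about {task_id}"


def handle_move(task_id: str, from_state: str, to_state: str) -> List[str]:
    """Run every action registered for this transition, in registration order."""
    actions = ACTIONS.get((from_state, to_state), [])
    return [action(task_id) for action in actions]
```

Keeping the mapping in one table means a new action type can be added without touching the code that watches the board.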
Building the Alchemy TRAC board
QR code tagged story cards
Our TRAC board has two unique features:
- A QR-code on every card that holds the Rally Dev task ID and a short description of the actual task in plain text and
- A simple camera, attached to a Raspberry Pi device, that is pointed at the board. This camera snaps an image of the entire board (it's an HD camera) every second.
The image is then scanned and analyzed by an algorithm that uses OpenCV to identify all the QR-codes and their positions, for example which column each code is in. OpenCV is a C++ computer vision library with a ton of methods for image processing. After identifying all the QR-codes, the algorithm updates the status of the tasks in Rally Dev to match the columns of the cards on the physical board.
Cards for the physical board are printed by a simple label printer as soon as the corresponding tasks are created in Rally Dev. Again, we wrote a simple piece of code that monitors Rally Dev for changes and prints out new tasks as they are created, complete with QR-code and text description.
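As a rough sketch of what a card's QR payload could look like, the snippet below packs a task ID and a description into one string and unpacks it again. The `|` delimiter and the names `CardLabel`, `encode_label` and `decode_label` are assumptions made for illustration; the article does not describe the actual payload format.

```python
from typing import NamedTuple

SEPARATOR = "|"  # assumed delimiter, not taken from the original system


class CardLabel(NamedTuple):
    task_id: str
    description: str


def encode_label(label: CardLabel) -> str:
    """Build the text to embed in the QR code."""
    if SEPARATOR in label.task_id:
        raise ValueError("task ID must not contain the separator")
    return f"{label.task_id}{SEPARATOR}{label.description}"


def decode_label(payload: str) -> CardLabel:
    """Split on the first separator only, so descriptions may contain it."""
    task_id, _, description = payload.partition(SEPARATOR)
    return CardLabel(task_id, description)
```

Putting the ID first keeps the part that matters for synchronisation easy to recover even if the description is long.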
During our standup meetings, the developers move their tasks to the appropriate column. This in turn is captured by the camera, processed by the software running on the Raspberry Pi and finally updated in Rally Dev.
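Because the camera produces a new snapshot every second, most snapshots will show no change at all. One plausible way to avoid redundant updates is to compare each snapshot with the last known state and only push the differences. The function below is a minimal sketch of that idea; `board_changes` and the dictionary layout (task ID to column name) are assumptions, not the team's actual code.

```python
from typing import Dict, List, Optional, Tuple

# (task_id, old_column or None for a card seen for the first time, new_column)
Change = Tuple[str, Optional[str], str]


def board_changes(previous: Dict[str, str], observed: Dict[str, str]) -> List[Change]:
    """Return one entry for every card whose column differs from the last snapshot.

    Cards missing from `observed` are ignored on purpose: they may simply be
    hidden by someone standing in front of the board.
    """
    changes: List[Change] = []
    for task_id in sorted(observed):
        old_column = previous.get(task_id)
        new_column = observed[task_id]
        if old_column != new_column:
            changes.append((task_id, old_column, new_column))
    return changes
```

Each returned change could then be turned into a single update call against the digital platform.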
Here's a demo of our first attempt at TRAC board testing:
The tricky part of all this was to identify and decode multiple QR codes in a single image.
Computer vision - recognizing multiple QR codes at once
Scanning a QR-code is relatively simple. Scanning multiple QR-codes in a single frame, however, not so much. The main challenge is the context of the task, essentially knowing which lane it is in. This relies on two things:
- A lane identifier which is another QR-code that describes a lane such as "In-Progress" and
- The lane barrier which is essentially a piece of insulation tape that outlines the lane boundaries.
Without a physical separation between lanes, it is tricky even for humans to determine which lane a story belongs to, so determining it programmatically without a clear distinction is near impossible. Separating each lane with some tape makes it easy to determine the end of one lane and the start of another.
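Once the QR codes and the tape have been located, lane assignment reduces to a lookup. The sketch below assumes vertical lanes, that each tape strip has been reduced to a single x-coordinate, and that each QR code is represented by the x-coordinate of its centre. The name `assign_lanes` and the input layout are hypothetical; the detection step itself is left out.

```python
from bisect import bisect_right
from typing import Dict, List, Optional


def assign_lanes(
    barriers_x: List[float],
    lane_markers: Dict[str, float],
    cards: Dict[str, float],
) -> Dict[str, Optional[str]]:
    """Map each card to the lane whose identifier sits between the same barriers.

    barriers_x:   x-coordinates of the tape strips
    lane_markers: lane name -> x of the lane identifier QR code
    cards:        task ID   -> x of the card's QR code
    """
    edges = sorted(barriers_x)
    # The slot index counts how many barriers lie to the left of a point.
    slot_to_lane = {bisect_right(edges, x): name for name, x in lane_markers.items()}
    # A card in a slot without a lane identifier maps to None.
    return {
        task_id: slot_to_lane.get(bisect_right(edges, x))
        for task_id, x in cards.items()
    }
```

With barriers at x = 300 and x = 600, a card at x = 580 shares a slot with a lane identifier at x = 450, even though it sits very close to the next lane.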
TRAC board: More than just tracking tasks
At some point a light bulb went off in our minds: The other reasons we had always wanted a release coordinator, or scrum master, to track the progress of tasks, were to:
- Know exactly when to call on testers or users to sign off features and
- Know when to initiate the process of deployment to production.
For these, knowing the state of any task at any point is immensely powerful and that is exactly what the TRAC board was offering. With this automated information flow it was actually quite simple to start chipping away at the bureaucratic necessities that took much of our time:
A stable staging environment
As a task moves to the "in-testing" column on the physical board, a message is sent to our CI pipeline. This message in turn triggers an automated deployment of the feature to our staging or testing environment.
The TRAC board system will also notify all testers that the feature is available for testing. Once testing is successful, the tester simply approves the feature by clicking on a link provided in the same email that notified them that the feature was ready for testing.
The approval is then sent back to the TRAC board system, which allows us to move the task from the testing column on the physical board to the "in-regression" column.
Note that while there's nothing stopping developers from physically moving the task to any column, not following the correct flow will stop the TRAC system from sending notifications to users or to our CI pipeline on GitLab.
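A flow check of this kind could look roughly like the following. The column order in `FLOW` is an assumption built from the column names mentioned in this article, and `should_trigger` is a hypothetical helper: it returns True only when a card moves exactly one step forward and, where required, an approval has already been recorded.

```python
from typing import Set

# Assumed column order, using the column names mentioned in the article.
FLOW = ["in-progress", "in-testing", "in-regression"]

# Transitions that only count once an approval has been recorded.
NEEDS_APPROVAL = {("in-testing", "in-regression")}


def should_trigger(task_id: str, from_col: str, to_col: str, approved: Set[str]) -> bool:
    """Decide whether a physical move should fire notifications or CI jobs."""
    if from_col not in FLOW or to_col not in FLOW:
        return False
    # Only a single step forward counts as following the flow.
    if FLOW.index(to_col) - FLOW.index(from_col) != 1:
        return False
    if (from_col, to_col) in NEEDS_APPROVAL and task_id not in approved:
        return False
    return True
```

The physical move is still possible either way; the check only decides whether the system reacts to it.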
Regression results for reports
Once a task has been moved into the "in-regression" column, the TRAC system again notifies our CI pipeline to run a number of reports in our staging environment.
Once the reports are completed, a simple report is sent to our traders showing a comparison between our staging environment (in terms of trade valuations) and our production environment. The traders approve these valuations by clicking an approval link in the email they received and this gets sent back to the TRAC system.
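The comparison itself can be sketched as a per-trade diff between the two environments. Everything here is illustrative: `compare_valuations`, the status labels and the default tolerance are assumptions, and the valuations are plain floats keyed by a made-up trade ID.

```python
from typing import Dict, List, Optional, Tuple

# (trade_id, production value, staging value, status)
Row = Tuple[str, Optional[float], Optional[float], str]


def compare_valuations(
    production: Dict[str, float],
    staging: Dict[str, float],
    tolerance: float = 0.01,
) -> List[Row]:
    """Build one report row per trade found in either environment."""
    rows: List[Row] = []
    for trade_id in sorted(production.keys() | staging.keys()):
        prod = production.get(trade_id)
        stag = staging.get(trade_id)
        if prod is None or stag is None:
            status = "missing"  # trade exists in only one environment
        elif abs(stag - prod) > tolerance:
            status = "changed"  # valuation moved by more than the tolerance
        else:
            status = "ok"
        rows.append((trade_id, prod, stag, status))
    return rows
```

A report built this way lets a trader focus on the rows marked "changed" or "missing" before clicking the approval link.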
Sign-offs from all users: traders, support teams and business heads and dependent systems
This is the most important feature of the system: The ability to get automated feedback from users and downstream system owners via simple communications such as email.
The system then stores and uses this feedback automatically to drive our deployment which cuts away the need for once critical processes, like change management approval and resource allocation.
These two processes were essentially set up to address any trust and accountability issues, but since the TRAC system establishes trust and accountability by logging sign-offs and who gave these sign-offs, change management approval becomes a redundant process. Resource allocation also disappears since the system will deploy to production as it does with staging, only now it has the golden key, namely sign-offs, and no other humans are required for the release.
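The "golden key" idea can be illustrated with a small sign-off log that records who approved what and only reports a task as releasable once every required party has signed. This is a sketch under assumptions: the role names are taken loosely from the heading above, and `SignOffLog`, `record` and `ready_for_production` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, FrozenSet, List, Set

# Assumed set of required roles, based on the groups named in this section.
REQUIRED_ROLES: FrozenSet[str] = frozenset(
    {"trader", "support", "business head", "downstream system"}
)


@dataclass
class SignOffLog:
    entries: List[Dict[str, object]] = field(default_factory=list)

    def record(self, task_id: str, role: str, user: str) -> None:
        """Store who signed off, in which role, and when (the audit trail)."""
        self.entries.append(
            {
                "task_id": task_id,
                "role": role,
                "user": user,
                "at": datetime.now(timezone.utc),
            }
        )

    def roles_signed(self, task_id: str) -> Set[str]:
        return {str(e["role"]) for e in self.entries if e["task_id"] == task_id}

    def ready_for_production(self, task_id: str) -> bool:
        """True only when every required role has signed off on the task."""
        return REQUIRED_ROLES <= self.roles_signed(task_id)
```

Because every entry keeps the user and a timestamp, the same log that gates the release also answers the accountability question afterwards.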
Having a physical scrum board has always been our preference. It allows for faster interaction with the board during stand-ups and is always available. Having the ability to synchronize the board with a digital platform allows for quick and efficient progress reporting on the status of our iterations without having to maintain anything besides the physical board. Our overall mission is to provide the TRAC board as a service on a cloud platform such as AWS, where teams could simply hook up a camera, configure a few connection strings to the cloud (and their preferred digital agile platform) and boom, TRAC board synchronized.
The development of the TRAC 'system' (since it's way more than just a board :) ) has accomplished two things. One, it has automated a number of processes that were put in place because, at the time, they were appropriate solutions. Two, it has changed our mindset when it comes to looking at and fixing broken processes: we now tend to ask why a process was implemented in the first place. In our case, the processes were put in place to ensure accountability, and at the time a manual process seemed the quickest way to achieve that. Taking a step back and putting a bit of extra effort into automating such processes turns out to be a far more scalable solution.
José Pita is Technical lead for the Alchemy team at Standard Bank by day, computer systems engineer by night, volunteering to bring mining safety systems into the 21st century using smart sensing solutions.