Participants in the low-code/no-code Hackathon shared their submissions with the Programmable Banking community at July’s meetup. In this demo, Vutlhari Rikhotso of Team ZATech Radio presents his idea of training AI models to deal with uncategorised transactions.
Transcript of the demo
Nick Benson 0:00
Moving along so that we don’t get bogged down with everything. VT. Your project was very cool. Do you want to talk us through that?
Vutlhari Rikhotso 0:11
I set out to … Can you hear me? I set out to solve a personal problem … I think something almost every bank doesn’t get so right – and that’s why tools like 22seven exist – is that even if you can see [your] transactions, it’s very hard to make sense of them and to get real insights. I was really excited that I could get my transactions in Excel because I have recently started playing around with pivot tables, and you know, I started [taking] an interest in data insights. I wanted to make this as simple as possible for me. If you look there, I just basically added a new script that you could call, add to a cell, [and] everything else. I also want to put a disclaimer that about two OpenAI accounts were killed in the making of this project. Because it was [laughter] it was not, it was not an easy thing for me to figure out. The first iteration, for me, was a simple one: to say, I know that places like KFC […] are fast-food places, and Uber, so I could probably go and make a small rule-based engine, something that runs locally on Excel. I can show you what that looks like. You find a few rules like this that say […], something like ‘credit interest’ might say ‘interest earned’, and ‘wages’ if you see salary […]
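The rule-based first iteration described here can be sketched roughly as follows. This is an illustration in Python rather than the Excel script shown in the demo; the function name and the categories assigned to KFC and Uber are assumptions, and only the ‘credit interest’ and ‘salary’ mappings are taken from what was said.

```python
# Hypothetical keyword rules as (keyword, category) pairs.
# The categories for "kfc" and "uber" are guesses for illustration;
# "credit interest" and "salary" follow the examples given in the demo.
RULES = [
    ("kfc", "Food and drink"),
    ("uber", "Transport"),
    ("credit interest", "Interest earned"),
    ("salary", "Wages"),
]


def categorise(description: str) -> str:
    """Return the category of the first rule whose keyword appears in the description."""
    text = description.lower()
    for keyword, category in RULES:
        if keyword in text:
            return category
    # Anything no rule matches falls into the catch-all bucket.
    return "Uncategorised"
```

A look-up like this is cheap and runs locally, but every merchant that no rule mentions ends up in the catch-all bucket.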
So, basic rules here that are just doing a look-up and getting some data. And immediately, I was able to play around with some pivot tables. Number one, I was able to go and get my categories, and then sum of debit, and start drawing nice graphs like this one. But I started being bothered by this green here. Because it seemed like my biggest expense, but when I looked at it, it’s the uncategorised stuff that’s coming up. I tried to look at another graph because graphs are great. I went on to my banking, basically extracted the month from the date, so that I could actually see […] what the trendline looks like for me […] when I’m spending the amount of money I’m spending on food or things like that. For example, if you look at the blue line over there, you can see it being very high in April, and then you know, starting to go back down in June there. [inaudible] I proved to myself that being able to label my data is very useful. But this initial implementation was very limited, because a lot of my transactions came from this uncategorised [category] here. I set out to go and try to solve that by building another tool outside of all of this, where I pulled all my transactions from Investec.
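The pivot described here (extract the month from the date, then sum debits per category) can be approximated outside Excel. The sketch below is only an illustration: the field names and the sample rows are made up, not taken from the demo or from any Investec export.

```python
from collections import defaultdict
from datetime import date


def monthly_debits(transactions):
    """Sum debit amounts per (year-month, category), much like a pivot table."""
    totals = defaultdict(float)
    for txn in transactions:
        if txn["type"] != "DEBIT":
            continue  # credits are left out of the spending view
        month = txn["date"].strftime("%Y-%m")
        totals[(month, txn["category"])] += txn["amount"]
    return dict(totals)


# Made-up sample rows, purely for illustration.
sample = [
    {"date": date(2022, 4, 3), "type": "DEBIT", "category": "Food and drink", "amount": 120.0},
    {"date": date(2022, 4, 17), "type": "DEBIT", "category": "Food and drink", "amount": 80.0},
    {"date": date(2022, 6, 2), "type": "DEBIT", "category": "Uncategorised", "amount": 500.0},
    {"date": date(2022, 6, 25), "type": "CREDIT", "category": "Wages", "amount": 1000.0},
]

totals = monthly_debits(sample)
```

Plotting each category's totals month by month gives the kind of trendline mentioned above, and a large "Uncategorised" total stands out immediately.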
You can see at the time of building this I had 1 541. I could go ahead and do things like [hook] anything that brings up things like KFC, go and classify, and put in the merchant and say, oh fine, that’s KFC, the category is income, and then the subcategory, and all of that. But then I realised that the number of merchants that one could come across grows exponentially, and they change so many times. And you also have random transactions like this one that says ‘Wife’; what does that mean? Is it an income, is it an expense? I did what everyone else would do, I went and I […] started looking at what AI could do. I also went and used GPT-3. And it was really great because I could pass it an example. I could give it a few examples, and say, fine, this transaction is KFC; this is what the category is; this is Debonairs. And that is what’s running over here. If you see, the first model’s running on this table, if you look at McDonald’s and British Airways. By the way, I’ve never flown British Airways, it is not a transaction that shows up on my transactions. It doesn’t know what this is. But with being able to pass it those examples of the prompts, it was able to figure out that McDonald’s is ‘food and drink’ and was also able to figure out that British Airways is ‘travel’. So, it was progress. But the problem with this is that it was expensive. Hence the two accounts that were killed in the making of this demo. The first one was Dan’s account, which I maxed out by trying to run all my transactions through this; it didn’t work out; it just ran out of credit.
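Passing the model a few labelled examples with every request is usually called few-shot prompting. A minimal sketch of how such a prompt might be assembled is shown below; the wording of the prompt, the function name and the example categories are assumptions, and the call to the model itself is deliberately left out.

```python
# Hypothetical labelled examples; the demo mentions KFC and Debonairs,
# but the categories shown here are assumed for illustration.
EXAMPLES = [
    ("KFC", "Food and drink"),
    ("Debonairs", "Food and drink"),
]


def build_prompt(description, examples=EXAMPLES):
    """Build a few-shot prompt that ends where the model should fill in a category."""
    lines = ["Classify each bank transaction into a category.", ""]
    for text, category in examples:
        lines.append(f"Transaction: {text}")
        lines.append(f"Category: {category}")
        lines.append("")
    # The new transaction goes last, with the category left blank.
    lines.append(f"Transaction: {description}")
    lines.append("Category:")
    return "\n".join(lines)


prompt = build_prompt("McDonald's")
```

Because the examples are resent with every transaction, the prompt grows with each example added, which is one reason this approach becomes expensive across a full transaction history.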
So, I had to go and try and figure out what to do with that. So, the last bit was like, oh, I already have this. It would be great to actually, you know, export that data, and go and train the AI, instead of passing examples every time. So, since I was able to … I mean, if you look at this, I only have about 123 of those 1 500 odd transactions to manually classify. Since I had already spent the time manually classifying my transactions, I realised that that could be great training data. This is what the training data looks like. You heard about a prompt that OpenAI could get, so there’s a prompt saying, this is Checkers, Waterfall, and the completion is that that category would be food and drink. I was able now to go and train this on my own data. And it got better and cheaper. Instead of running at about two rand to three rand per transaction, it went down to being able to run for like a cent, making it more affordable. And yeah, like a worthy idea to try and continue with. The most interesting thing that I found, which is what I’m going to continue working on, is that it doesn’t have to be limited to just the one label. With the same training set, you could tell it who the merchant is; you could tell it […] to associate an emoji with that transaction. And you could pull all of that back into Excel, and then use it, you know, the way you would use any other Excel formula. That’s it. Thank you.
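Training data of the prompt-and-completion kind described here is commonly stored as one JSON object per line (JSONL). The sketch below shows how manually classified transactions might be exported that way; the field names "prompt" and "completion" and the helper name are assumptions, and any separators or suffixes a particular fine-tuning service expects are not shown.

```python
import json


def to_training_lines(labelled):
    """Turn (description, category) pairs into JSONL prompt/completion records."""
    return [
        json.dumps({"prompt": description, "completion": category})
        for description, category in labelled
    ]


# The Checkers example comes from the demo; a real export would contain
# every manually classified transaction.
labelled = [("Checkers, Waterfall", "food and drink")]
training_lines = to_training_lines(labelled)
```

Adding further labels, such as the merchant or an emoji, would mean extending what goes into the completion rather than changing the overall shape of the file.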
Dan Davey 7:01
Yeah, just for those of you who don’t know. I gave Vu a R1000 budget on my accounts, and I went to bed, and in the morning, when I woke up, it was gone. [Laughter] So that’s what we figured out pre-training.
Cool. Pieter, any thoughts about this project?
Pieter Heyns 7:24
Thanks a lot for that submission. Also, I think one of the biggest parts of what accountants do is trying to classify transactions. And everyone sort of figures that out for themselves and there is a base of transactions that is standard. You don’t really have to think too much about it, but you still need to manually classify it. But I think this idea of being able to train an AI model to automatically classify transactions, so that if you change the transaction classification, or you add a new one in, then it updates the model, I think that’s really interesting. And that could potentially also be really powerful. Yeah, I’ve got no idea how that would work practically, or sort of for a big business. But it seems like it’s very promising. I think once you have this, and you can do automation, so you can automatically input your transactions, and it’s automatically classifying transactions, you can start to save a lot of time on doing that yourself. I think it’s a really good, really good submission. This sort of background, or the more depth, didn’t come out in your application. I wish I knew that because I think I missed sort of that in [inaudible]. I didn’t realise sort of the background things, which you’ve spoken about now. But it seems like you’ve done a lot of work there on the background stuff to actually get it to work. That’s really, really good.
Devina Maharaj 8:53
As you went through it, and, I think, especially talking through it, it was quite evident you put a lot of heart and effort into it. […] What was really interesting for me is about 10 years ago, when we had just launched our digital channel with our mobile app at Investec, we had implemented Yodlee, [which] was integrated into our banking platform. And we’re probably one of the first banks to ever do that, where we had a fully integrated, personal financial management tool integrated into banking. In hindsight, it was probably a couple of years too early, because our adoption rate kind of like slowed and kind of very steadily started increasing. But what I loved about your submission, and I think a lot of what some of the other guys have done, is this idea of training an AI model to do your categorisation. If you know anything about the big players in the space around Yodlee, who’s one of the biggest players in the market globally, they have struggled with this big uncategorised pie section for years. And it’s something that every single person who’s ever used a personal financial management tool struggles with.
And that’s why the ability to have custom categories and for people to almost label their own things […] A very practical example that we learnt at Investec is that, when you use PFM tools, they don’t really recognise names, like […] Scooters Pizza was a real example that came to mind that we had as a real use case where Yodlee didn’t recognise Scooters; it classified it as a garage or a car place, rather than a pizza place. And it was quite interesting because it needed to learn that. I think if you ask the guys in the team, they’d classify ‘Wife’ as an expense, by the way, just FYI, but [laughs] the two Barnards here would definitely do that. But I think that, the classification thing. And that little tool you built on the side is actually really powerful. Wayne, who’s not here; he actually is very interested in the idea of having like a GPT-3 model that we could use as an engine, but that could classify as a central place for all clients. Imagine if every single client started using it, and you were training this AI model with 80 000 people. That kind of big, green uncategorised section would reduce really quickly. I think you highlighted many exciting points for us. And, like Pieter, I don’t think a lot of what you said now came through in your submission. But yeah, well done.
Vutlhari Rikhotso
Thank you, Nick, can I answer one question?
Yeah, someone wanted to know if they could train with their own data. So that’s the plan. But I think the value is in actually training with more data than just one person. I think crowdsourcing people’s transactions … The tool that I built on the side, I’m thinking of actually just cleaning that up, open sourcing that and just allowing the community to basically use their own transactions, manually classify, submit that and train a model because I think that’s where the power comes in. But also, a big thing that I’ve learnt is that companies like Yodlee … This is a very complex problem that a lot of people are trying to solve. And it’s usually region specific, because it might perform very well in the US and not so much in South Africa. So this will be focused on the type of places that we have here, but, on top of that, I think just taking in the description is not enough, because there’s context of the type of transaction as well; so, in the prompt, if you pass it the description and then whether it’s a debit or credit, then it gets a bit smarter and then it becomes a bit more useful because what if the description is ‘Wife’, but it’s a debit? You know, that’s a whole different thing from it being ‘Wife’ and then being credit. [Laughter]
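The point about adding the transaction type as context can be illustrated with a small sketch. The prompt layout and the function name below are assumptions; the idea is only that the same description yields a different prompt depending on whether money went out or came in.

```python
def describe(description: str, is_debit: bool) -> str:
    """Format a prompt that carries both the description and the direction of the transaction."""
    direction = "debit" if is_debit else "credit"
    return f"Transaction: {description}\nType: {direction}\nCategory:"


outgoing = describe("Wife", is_debit=True)
incoming = describe("Wife", is_debit=False)
```

With the direction included, a model has something to distinguish the two ‘Wife’ cases by, which the description alone cannot provide.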
Get involved in the Programmable Banking Community.
If you have questions or want to say hi to the Programmable Banking Community core team, you can pop us a mail, and we will get back to you.
If you want to see more about what the community has been up to, you can:
- Join the Programmable Banking community
- Browse the community’s open-source projects
- Read the dev docs
- See more demos
- Read other programmable banking-related blog posts