Sep 14, 2023
Bonsai Turns to Aiven for Automated Event-Driven Architecture to Handle Millions of Data Points
Data challenges resulting from expanding product catalog solved with Aiven for Apache Kafka® on Google Cloud
Founded in 2016, Bonsai is an e-commerce platform trusted by elite publishers and retailers across North America and Europe. Its discovery commerce technology allows users to purchase a product within the content they love. When the Bonsai product catalog expanded from a couple of thousand products on small Shopify stores to millions of products receiving constant updates, the company turned to Aiven to build an automated, event-driven data architecture on Google Cloud using Aiven's managed services.
Alan Scott, Commercial Account Executive at Aiven, recently spoke to Brent Van Geertruy, Director of Engineering at Bonsai, as part of a webinar hosted by C2C, the Google Cloud Customer Community. They talked about the key drivers, challenges and achievements in building data pipelines for Bonsai's millions of data points. Here are some extracts from that conversation.
Alan Scott: Tell us about Bonsai.
Brent Van Geertruy: We’re a tech company providing onsite checkout so consumers can buy at the moment purchasing intent is formed. We use various integration methods, such as APIs, pre-built UI components and product feeds, to power our catalog, which currently consists of millions of products from over 400 retailers, including Best Buy. We serve partners who provide us with products, partners who use our catalog through our API or UI products, and customers who view and purchase those products.
Alan Scott: What problem were you trying to solve when you started using Aiven?
Brent Van Geertruy: Over the past couple of years, our product catalog has grown from a couple of thousand products on small Shopify stores to millions of products. This means an immense number of updates every day, from price changes to inventory levels.
We have our own custom dashboard that we built and maintain. It’s used to perform order fulfillment and manage merchants, clients and our product catalog. A couple of years ago, we often didn’t receive high-quality product information from smaller merchants. Our internal team had to check every product and manually review everything from product description to color to ensure it aligned with our design. This was simply not scalable or sustainable.
We initially started researching event-driven systems because we had an extensive monolithic application where importing a product took almost 20 synchronous steps. One of the first challenges we needed to tackle was image uploads. Every time a product gets ingested, we have to upload the image onto our servers. We can't rely on the merchant image URLs because they often get blocked or don't allow image manipulation. Imagine a feed of just a thousand products, or even a hundred thousand products! Every time a feed came in, our entire pipeline would get congested waiting for the image uploads to complete.
We started using Aiven for Apache Kafka® to make the pipeline less synchronous and take a more event-driven approach. We also leverage Kafka with an Aiven connector to MongoDB for generating feeds. Instead of manually flagging that something has changed on a product and updating a feed, we use the MongoDB connector to watch for product changes and automatically move them into our export system. Setting this up through Aiven was easy. And from an infrastructure perspective, it’s great that we can use the Aiven Terraform provider, as that’s how we like to set up all our services.
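The webinar doesn't include code, but a minimal sketch of the decoupling Brent describes might look like the following, in Python with the confluent-kafka client. The topic name, the upload_to_cdn() helper and the connection details are illustrative assumptions, not Bonsai's actual setup (a real Aiven for Apache Kafka® service would also need TLS credentials):

```python
# Sketch: decouple image uploads from product ingestion with Kafka events.
# Topic names, upload_to_cdn() and connection details are hypothetical.
import json
from confluent_kafka import Consumer, Producer

BROKER = "my-kafka.aivencloud.com:12345"  # placeholder broker address

producer = Producer({"bootstrap.servers": BROKER})

def ingest_product(product: dict) -> None:
    # Fast path: persist the product, then emit an event instead of
    # waiting for the image upload to finish.
    # ... save product to the database ...
    producer.produce(
        "product.image-requested",
        key=product["id"],
        value=json.dumps({"id": product["id"], "image_url": product["image_url"]}),
    )
    producer.flush()

def upload_to_cdn(image_url: str) -> None:
    # Placeholder for the slow download-and-rehost step.
    ...

# Slow path: a separate worker consumes the events and uploads images at
# its own pace, so a stalled upload no longer congests ingestion.
consumer = Consumer({
    "bootstrap.servers": BROKER,
    "group.id": "image-uploader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["product.image-requested"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    upload_to_cdn(json.loads(msg.value())["image_url"])
```

The connector setup isn't shown in the webinar either, but wiring a MongoDB source connector into Kafka Connect is typically a single REST call. Everything below (service URL, credentials, database, collection and topic names) is a placeholder:

```python
# Sketch: register a MongoDB source connector with the Kafka Connect REST
# API so that product changes stream into Kafka automatically.
import requests

connector = {
    "name": "products-source",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://user:password@my-mongo:27017",
        "database": "catalog",
        "collection": "products",
        "topic.prefix": "mongo",  # change events land on mongo.catalog.products
    },
}

resp = requests.post(
    "https://my-connect.aivencloud.com:443/connectors",
    json=connector,
    auth=("avnadmin", "service-password"),
)
resp.raise_for_status()
```

An export worker can then subscribe to that change topic and rebuild feeds only when products actually change, which is the automation Brent describes.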
Alan Scott: Tell us more about the shift from monolithic to event-driven architecture.
Brent Van Geertruy: Our team has been discussing event-driven architecture for years. It was seen as the holy grail to solve all problems when scaling applications. There was also a lot of buzz about it, especially when Netflix highlighted how they were using it.
However, our application already contained a large amount of logic, was very coupled and synchronous, and there was technical debt from historical experiments dating back five years. This meant that any more extensive reworks were always deprioritized in favor of incoming client customization requests.
We finally started using Kafka because of increased API usage — the product had a clear market fit and we wanted to expand further. As we were able to provide some maturity into our product offering, we could carve out time for tackling technical debt and focusing on reworking larger pieces of our ecosystem. We had several sessions with senior developers to establish what the perfect world would look like, what the bottlenecks were now and if an event-driven approach could resolve them. We chose event-driven but through a transition, not an entire rework from monolith to event.
Alan Scott: What were your criteria when you chose Aiven?
Brent Van Geertruy: We’ve often chosen tools with usage-based billing and always started with minor costs. However, when it comes to our product pipeline, where we’re processing millions of updates daily, that cost quickly balloons. Price transparency was one of the reasons we chose Aiven. Also, Google is our primary cloud provider, so it was really easy to get started on the platform through the Google Cloud Marketplace integration. The Aiven team provides excellent support, and we wanted a vendor with expertise beyond Kafka. We also use Aiven for Caching, a managed, in-memory NoSQL database. Getting it from the same provider as Kafka, and through our GCP billing, makes things much easier.
Alan Scott: What are your key learnings?
Brent Van Geertruy: The initial proposal was to turn everything into an event-driven system — then we realized how much effort it would take, especially for a start-up. Instead, we started measuring the system, identifying the bottlenecks and tackling them in a more agile way. When you fix one bottleneck, another will often pop up.
A notable example is when we switched from uploading product images synchronously to handling them asynchronously with events. We solved the problem of images blocking our pipeline but immediately ran into the issue of the product pipeline being unable to keep up with the speed of incoming products. Essentially, the slowness of uploading product images had kept our pipeline slow and steady. Once we took that out of the equation, other places started to feel the load, which we’ve now also improved. So a big takeaway was that, rather than doing a significant overhaul, it’s better to tackle things bottleneck by bottleneck while keeping the higher-level architecture plan in mind.
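Bonsai's specific follow-up fixes aren't detailed in the webinar, but one common, generic way to stop a newly unblocked stage from flooding the next is consumer-side flow control: process bounded batches and commit offsets only after the work is done, letting Kafka buffer the backlog. A rough sketch with placeholder names, not Bonsai's code:

```python
# Sketch: bound in-flight work so a fast upstream stage can't overwhelm
# a slower downstream one. All names here are placeholders.
from confluent_kafka import Consumer

MAX_BATCH = 50  # upper bound on products processed per loop iteration

def process_product(payload: bytes) -> None:
    # Placeholder for the downstream pipeline step that felt the load.
    ...

consumer = Consumer({
    "bootstrap.servers": "my-kafka.aivencloud.com:12345",
    "group.id": "product-pipeline",
    "enable.auto.commit": False,  # advance offsets only after processing
})
consumer.subscribe(["product.updated"])

while True:
    msgs = consumer.consume(num_messages=MAX_BATCH, timeout=1.0)
    for msg in msgs:
        if msg.error():
            continue
        process_product(msg.value())
    if msgs:
        consumer.commit()  # unprocessed messages simply wait in the topic
```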
Check out the full webinar: https://aiven.io/webinar/building-real-time-event-streaming-engines
To get the latest news about Aiven and our services, plus a bit of extra around all things open source, subscribe to our monthly newsletter! Daily news about Aiven is available on our LinkedIn and Twitter feeds.
If you just want to find out about our service updates, follow our changelog.