Apr 14, 2022
5 hard truths about scaling data intensive applications
Join David Esposito as he gives you the low-down on building data architectures that stand up even under the largest loads.
Let's talk about five hard truths for scaling data intensive applications.
The principles covered in this article are things that we've learned through experience and through talking to people in our network, helping customers scale data workloads in the cloud. We'll go through the whole thought process, from how to make sure you're working on the right things all the way through to how to set yourself up for success.
We want to make sure that not only you, but also the ecosystem of tooling and companies that support you, are successful.
For each of the principles, we'll talk about a mindset or a quote that helps you think about the problem differently. And we'll ask a couple of questions that you should really challenge yourself and your team with.
1. Limit the scope
You want to make sure that you're working on the right problems and focusing on customer pain. I always go back to the mindset that I had in college: get it working, keep it working. This goes right along with a concept familiar from working in a startup, the MVP, or Minimum Viable Product. Define what's needed, define the problem you're trying to solve, and focus on the customer pain. That's what's going to drive revenue.
The challenging question that I like to ask my customers or my friends with startups is: what's growing faster?
- Your code base and your product backlog?
- Or your customer base and your revenue?
Make sure you're taking a step back and not just building the fun, fancy features. Ensure you're really focusing on customer first. Don't design and build in a vacuum. Make sure that the feedback loop is tight and implemented well.
2. Know your limits and your runway
Once you've designed the solution, you really need to design the customer experience - and make sure you do it on purpose. In software engineering and software architecture, there's the concept of a runway: knowing how much time you have left with the application system or knowing how much capacity you have left.
Think of the process of building a solution as an airplane sitting on a runway. You start rolling, going to take off, and as you go faster and faster, there's a point where you need to make a decision: to step on the brakes, so that you have enough time and room to stop, pivot and turn onto the taxiway - or to fully commit to taking off.
And the important thing is you need to have all the information necessary upfront to make that decision. You don't want to wait until you're at that spot to make the decision.
And the same is true with software architecture. You want to make sure that you have all the capacity, all the observability, all the traceability and all the information before you have to make that decision. Don't be surprised by it.
And the questions I'd like to challenge my customers and friends with are:
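To make the runway idea concrete, here is a minimal sketch of how you might turn measurements into "time left". It assumes usage grows by a constant amount per day, which is a simplification, and every name and number in it (`runway_days`, `decision_status`, the lead time) is hypothetical rather than anything from a specific tool.

```python
import math

def runway_days(used, capacity, daily_growth):
    """Days until `used` reaches `capacity`, assuming constant (linear) daily growth."""
    if used >= capacity:
        return 0.0
    if daily_growth <= 0:
        return math.inf  # not growing, so the runway never ends
    return (capacity - used) / daily_growth

def decision_status(used, capacity, daily_growth, lead_time_days):
    """Compare the remaining runway with the time a scaling change would take.

    `lead_time_days` is an assumed figure: how long it takes your team to
    re-architect, migrate or provision before the limit is reached.
    """
    remaining = runway_days(used, capacity, daily_growth)
    if remaining == 0:
        return "out of runway"
    if remaining <= lead_time_days:
        return "decide now"
    return "ok"

print(runway_days(600, 1000, 5))          # 80.0
print(decision_status(900, 1000, 5, 30))  # decide now
```

The point of the sketch is the comparison, not the arithmetic: the decision has to be made while the remaining runway is still longer than the time the change takes.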
- How many customers are too many?
- When is a customer too big?
When it comes to how many customers are too many, I like to think in orders of magnitude, that is, multiples of 2X, 10X, 100X overnight. If your customer base doubled in size, what would break?
When you start talking about 10X overnight, the question usually becomes what's not going to break: which system, which infrastructure is actually going to stand up. And you start some interesting conversations about indexing strategies, sharding strategies or auto-scaling strategies.
And you realize that you may have certain blind spots. And then again, if your customers increase 100X overnight, if you think about the system that can handle that load - what does that even look like? Are you even headed in the right direction to be able to scale to that? Do you have the patterns and processes in place to be able to scale to that volume? That's an example of horizontal scaling.
On the other hand, how big a customer is too big? Is there a case where a customer has too much data, or too many user interactions with your application? How do you define "too big" from the application point of view? How's that going to affect your database tables, your infrastructure decisions... What is going to break? If you take your largest customer and imagine that customer at 10X, then 100X, what's going to happen in those cases?
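One way to run the 2X, 10X, 100X exercise is as a back-of-the-envelope script. The sketch below uses made-up component names, usage figures and limits, and it assumes that load grows in direct proportion to the multiplier, which real systems rarely do exactly. Treat it as a conversation starter, not a capacity model.

```python
# Hypothetical figures: current usage and known hard limit for each component.
COMPONENTS = {
    "db_connections": {"current": 120, "limit": 500},
    "disk_gb": {"current": 300, "limit": 2000},
    "requests_per_sec": {"current": 800, "limit": 50_000},
}

def what_breaks(components, multiplier):
    """Return the names of components whose projected usage exceeds their limit."""
    return sorted(
        name
        for name, c in components.items()
        if c["current"] * multiplier > c["limit"]
    )

for m in (2, 10, 100):
    broken = what_breaks(COMPONENTS, m)
    print(f"{m}X: {broken if broken else 'nothing breaks'}")
# 2X: nothing breaks
# 10X: ['db_connections', 'disk_gb']
# 100X: ['db_connections', 'disk_gb', 'requests_per_sec']
```

The same function works for the "too big a customer" question: feed it your largest customer's figures instead of the totals.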
3. Observability!
Once you define a scaling strategy, you need to make sure that you have systems in place to observe that. Make sure that you mind your blind spots. Do not make assumptions that everything's okay.
If you don't know, assume the worst. Know where you are by taking measurements. And remember, capacity and stability are not Boolean states. They're a continuum. You're rolling down the runway. When are you at the point where you have to make a decision, and do you have all the information necessary?
Don't be surprised when you're at the end of the runway and you're not going fast enough or you don't have the flaps out. Make sure you have everything in place and know where you are.
The difficult questions that I really like to ask, honestly, go back to intelligence:
- Do you know what you don't know?
- Where are your blind spots?
- What are you not measuring?
And then, once you've identified some of those blind spots, or maybe even while you're trying to think of them, ask which looming decisions would be the most disruptive.
Make sure you're measuring those, so that you have all the information necessary and know where you are on that runway.
And it's really hard to get right. But don't let it seem like too big of a task.
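A small first step on "what are you not measuring?" is to write down what you think you should be measuring and compare it with what you actually collect. The sketch below does that with plain sets. The component and signal names are hypothetical, and it can only surface gaps you already know to list, so it won't find the unknown unknowns.

```python
# Hypothetical inventory: signals we believe each component should expose.
EXPECTED = {
    "database": {"connections", "disk_usage", "replication_lag", "slow_queries"},
    "api": {"latency_p99", "error_rate", "requests_per_sec"},
}

# Hypothetical: signals the monitoring stack actually collects today.
COLLECTED = {
    "database": {"connections", "disk_usage"},
    "api": {"latency_p99", "error_rate", "requests_per_sec"},
}

def blind_spots(expected, collected):
    """Map each component to the sorted list of expected signals that are not collected."""
    gaps = {}
    for component, signals in expected.items():
        missing = signals - collected.get(component, set())
        if missing:
            gaps[component] = sorted(missing)
    return gaps

print(blind_spots(EXPECTED, COLLECTED))
# {'database': ['replication_lag', 'slow_queries']}
```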
4. Integrations and tooling are underrated
There's a whole ecosystem of tooling and integrations around almost every technology, every program and framework, every cloud. There's a lot of really smart people out there that want to make this easier. They like solving difficult problems, and your problems aren't new.
If you're a developer, you've been on Stack Overflow, and you've seen the posts from two or three years ago asking about the exact thing you're searching for. This goes back to standing on the shoulders of giants. Use what's there - don't feel like you always have to come up with your own solution, doing things the hard way.
Remember the most expensive part of the cloud is between the keyboard and the chair.
So the difficult question, which I always like to ask, is: How do you make money? What does your company do that pulls in revenue? Is it running your own Kubernetes cluster? Is it managing your own on-premises data center? Is it building your own logging framework?
I ask those silly questions because I've worked at companies that have done all of those things. Did that give us an advantage in the end? I don't know. You really have to answer that question for yourself.
And then, if those things aren't earning you money - why do you do them? What advantage do they give you? Are you even getting an advantage? Are you paying an opportunity cost by not using those engineering resources better?
Remember, there's a time and a place to choose build versus buy, use frameworks or outsource or build your own solution. But you need to know when that time is, and that place.
Keep that in mind and set yourself up for success. Above all, remember, it's an ecosystem. You live in the tech community.
5. Mutual success above all
There's a quote floating around the internet: if you want to go fast, go alone. If you want to go far, go together. And that's also true with tech products. You can leverage the expertise of others by hiring new expertise and bringing it in-house, by relying on a consulting firm's professional services, or by using a managed service provider. Whichever you choose, make sure that you know what's going to give you an advantage and how you can best use it.
Realize that, at the end of the day, your vendors or your managed service providers provide a service in exchange for money. So the questions I like to ask are:
- How do they make that money?
- How do they grow?
- What are their goals with you?
Are their goals to just grow your account a hundred percent year over year, or are there different mechanisms to define mutual success? Think about how that affects your relationship.
Aiven works on the right things
At Aiven, we strive every day to work on the right things, knowing where we are. We try to put all these principles that we talked about in place so that not only are we successful, but also our customers, our vendors, all the supporting platforms we build on top of, are successful, too.
Wrapping up
Aiven's a Database as a Service platform, and we're built by developers for developers to make developers' lives better. Just like out in the industry, there are a lot of smart people working at Aiven. There are a lot of smart people using Aiven services.
To partner with smart people to solve difficult problems, start a free trial and reach out to the experts at Aiven. Let's work on the really hard problems together.