The Upside Marketplace is our core product: it is where all our business happens, across every channel, for users, merchants, and internal operators. As we grow, the Marketplace accumulates tech debt (some of it strategic), along with system complexity we no longer need. But as at most tech companies, addressing that debt is often deprioritized in favor of product improvements that directly affect our quarterly goals.
So we put our heads together and created a structure and a team to prioritize work on the Marketplace and measure its impact.
If your company is looking for best practices on how to prioritize internal work and improve the way your system runs, read on.
Before we prioritized this work internally, the Upside Marketplace was high-functioning but costly.
We’ve always had high uptime and operated well at our current transaction levels, but the associated operational and engineering costs began to scale alongside our product.
Unfortunately, these costs (beyond the financial ones) are often hidden from view because we track no numbers that expose their toll on the team.
Thinking in terms of “costs” makes it easier to bring maintenance-related issues to light and prioritize platform improvements in your roadmap. How can you do this with your team? Ensure your roadmap prioritization considers these three things.
The Google Site Reliability Engineering ebook defines toil as “the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows.”
While you can’t fully eliminate toil—especially if you’re a smaller startup or organization—it’s important to track it because too much toil can be harmful to the business.
The costs that can help you track toil are:
Once you define these costs and find a way to track and measure them on an ongoing basis, you can ensure that work gets prioritized to keep toil below a level that you and your business are willing to accept.
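As a rough illustration of what that tracking could look like, here is a minimal Python sketch that computes the share of engineering time spent on toil each week and flags the weeks that exceed a budget. The record fields, the sample numbers, and the 30% budget are all hypothetical, not figures from Upside; substitute whatever costs and limits fit your team.

```python
from dataclasses import dataclass


@dataclass
class WeekLog:
    """Hours logged by a team in one week (hypothetical fields)."""
    week: str
    toil_hours: float   # manual, repetitive operational work
    total_hours: float  # all engineering hours logged


def toil_ratio(log: WeekLog) -> float:
    """Fraction of the week's engineering time spent on toil."""
    if log.total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return log.toil_hours / log.total_hours


def weeks_over_budget(logs: list[WeekLog], budget: float) -> list[str]:
    """Return the weeks whose toil ratio exceeds the agreed budget."""
    return [log.week for log in logs if toil_ratio(log) > budget]


# Made-up sample data; the 0.30 budget is an assumption for illustration.
logs = [
    WeekLog("week-1", toil_hours=40, total_hours=200),
    WeekLog("week-2", toil_hours=90, total_hours=200),
]
over = weeks_over_budget(logs, budget=0.30)
print(over)
```

A list of over-budget weeks like this is one simple signal that toil-reduction work should move up the roadmap.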
Product and engineering teams often think that quality control means having some type of QA process to test code before it is deployed. However, it’s imperative for quality control to be baked into whatever product or platform is being built.
When we look at how to prioritize work to improve our quality control, we measure against the following costs:
At Upside, we often prioritize work that will achieve clear quality control outcomes. One initiative we continue to work on is what we call “auditability.” Are the pipelines and algorithms that are core to our business observable and monitorable across our company? Do we have the tools in place to ensure that when the inevitable quality control issue arises, we find it proactively, before our customers do?
According to Gartner, web application downtime can cost companies $5,600 per minute and up to $300,000 per hour.
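To make that figure concrete, the sketch below estimates the cost of a single outage from its duration. It assumes the per-minute rate quoted above applies uniformly, which is a simplification; the real cost depends on your business and traffic patterns, and the 45-minute outage is a made-up example.

```python
def downtime_cost(minutes: float, cost_per_minute: float = 5_600.0) -> float:
    """Estimate outage cost as duration times a flat per-minute rate.

    The default rate is the per-minute figure quoted in the text; replace it
    with a number derived from your own revenue and traffic.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# Hypothetical 45-minute outage at the quoted per-minute rate.
estimate = downtime_cost(45)
print(f"${estimate:,.0f}")
```

Putting a number like this next to the engineering time needed to prevent the outage makes the tradeoff easier to discuss during roadmap planning.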
It’s critical to work on ways to prevent outages and to help ensure business continuity. However, as you grow your business, things can get lost in the shuffle. Upside is working to hire 30+ engineers this year, so it’s inevitable that certain code might get into production without the “right” pair of eyes reviewing it first.
The costs associated with the risks of outages include:
As product teams develop roadmaps, it’s critical to implement processes that reduce the chance of potential outages — from improved testing coverage to scaling services for projected growth.
Our platform prioritization rubric is just one of the many examples of how Upside Engineering is leading the way in process and organizational structure — not to mention the tech we’re actually building. If you’re interested in learning more or joining the team, reach out. We’re hiring.