
Architecture Decisions You Can't Undo


Some technical choices compound for years. From Cantoo's distributed voice infrastructure to OneRagtime's data pipeline: lessons on the decisions that define a product's trajectory.

I have made architecture decisions that paid off for years. I have also made ones that haunted me for just as long. The difference between the two was rarely about picking the "right" technology. It was about understanding which choices lock you in and which ones you can revisit later.

At Cantoo, our cloud telecom platform, and at OneRagtime, where I built a VC data pipeline from scratch, I learned this the hard way. Here is what I know now about the technical decisions that compound over time.

The Decisions That Lock You In

Not all architecture choices are equal. Some you can change next quarter. Others become load-bearing walls. The trick is knowing which is which before you commit.

At Cantoo, the single biggest decision was how we handled voice routing. We built a multi-dimensional routing engine that optimized for cost, quality, and latency simultaneously. The alternative was simpler: pick the cheapest route and move on. But telecom margins are razor-thin, and a few milliseconds of latency or a slight dip in audio quality could lose us an operator client permanently.
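A routing engine like that boils down to scoring each candidate route across several dimensions and picking the minimum. Here is a minimal sketch of the idea; the weights, normalization ceilings, and field names are illustrative assumptions, not Cantoo's actual values.

```python
from dataclasses import dataclass

@dataclass
class Route:
    carrier: str
    cost_per_min: float   # EUR per minute
    latency_ms: float     # round-trip latency
    mos: float            # mean opinion score (audio quality), 1.0-5.0

def score(route: Route, w_cost=0.5, w_latency=0.3, w_quality=0.2) -> float:
    # Normalize each dimension to roughly [0, 1]; lower total score wins.
    # Ceilings (0.10 EUR/min, 400 ms) are illustrative assumptions.
    cost = route.cost_per_min / 0.10
    latency = route.latency_ms / 400.0
    quality_penalty = (5.0 - route.mos) / 4.0  # invert MOS: higher MOS is better
    return w_cost * cost + w_latency * latency + w_quality * quality_penalty

def best_route(routes):
    return min(routes, key=score)
```

The point is not the specific formula but the shape: because every dimension flows through one scoring function, tuning the cost/quality trade-off per client later is a parameter change, not a rewrite.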

That routing engine became the core of our competitive advantage. Every feature we built after it (the real-time billing pipeline, the auto-scaling infrastructure, the fraud detection layer) sat on top of that routing logic. If we had gotten it wrong, rewriting it would have meant rewriting everything.

The lesson: identify which component everything else will depend on. That is your load-bearing wall. Spend extra time on it. Get it reviewed by someone who has built something similar. Do not rush it because you are eager to ship features.

When Simplicity Wins

At OneRagtime, I designed a data pipeline called Deepdive. Its job was to discover, qualify, and rank startups for the investment team. We screened over 4,000 deals per year, and the pipeline needed to pull data from dozens of sources, normalize it, score it, and surface recommendations.

The temptation was to build something sophisticated from day one: a microservices architecture with event sourcing, a graph database for relationship mapping, ML models for scoring. I have seen teams go down that road. Most of them spend six months building infrastructure and never ship the actual product.

Instead, we started with a monolithic Django app, a PostgreSQL database, and a set of Python scripts that ran on a schedule. It was not elegant. But it worked within weeks, not months. The investment team was using it to make real decisions while our competitors were still debating their tech stack.
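The core of such a batch pipeline fits in a handful of functions: normalize raw records, score them, rank them. A minimal sketch of that shape is below; the field names and the scoring heuristic are invented for illustration, not Deepdive's actual logic.

```python
# Hypothetical sketch of a daily batch step: normalize, score, rank.
def normalize(raw: dict) -> dict:
    # Coerce messy source fields into one canonical shape.
    return {
        "name": raw.get("company_name", "").strip().lower(),
        "sector": raw.get("sector", "unknown"),
        "monthly_growth": float(raw.get("growth", 0.0)),
        "funding_eur": float(raw.get("funding", 0.0)),
    }

def score(startup: dict) -> float:
    # Crude illustrative heuristic: reward growth, penalize heavy funding
    # (later-stage deals would be out of scope for an early-stage fund).
    return startup["monthly_growth"] * 10 - startup["funding_eur"] / 1e6

def run_pipeline(raw_records):
    startups = [normalize(r) for r in raw_records]
    return sorted(startups, key=score, reverse=True)
```

Running this once a day from cron or a Django management command is unglamorous, but it puts a ranked list in front of the investment team in week one.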

We added complexity only when the data proved we needed it. The web scraping layer got its own service when it started slowing down the main app. The scoring model became a separate module when the team wanted to iterate on it independently. Each evolution was driven by a real bottleneck, not a hypothetical one.

The Database Question

If I could give one piece of advice to early-stage CTOs, it would be this: think harder about your data model than your framework choice.

At Cantoo, we used PostgreSQL for transactional data and ClickHouse for analytics. That split was deliberate. Voice traffic generates enormous volumes of event data: millions of CDRs (call detail records) per day. Trying to run real-time billing queries and historical analytics on the same database would have been a disaster.

But we did not start with ClickHouse. We started with PostgreSQL for everything and migrated the analytics workload when query times started hurting. The key was that we designed our data model with that eventual split in mind. The tables were clean. The boundaries between transactional and analytical data were well-defined. When the migration happened, it took weeks, not months.
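"Designed with the split in mind" mostly means keeping transactional and analytical writes behind separate boundaries in code, so the analytics store can be swapped out later without touching billing logic. A minimal sketch of that boundary, with in-memory stands-ins for the two databases (class and field names are illustrative):

```python
class TransactionalStore:
    """Row-level writes that billing depends on (think PostgreSQL)."""
    def __init__(self):
        self.rows = []

    def insert_call(self, cdr: dict):
        self.rows.append(cdr)

class AnalyticsStore:
    """Append-only, batched event sink (think ClickHouse)."""
    def __init__(self, batch_size=1000):
        self.batch, self.batch_size, self.flushed = [], batch_size, []

    def append(self, cdr: dict):
        self.batch.append(cdr)
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        self.flushed.extend(self.batch)
        self.batch = []

def record_call(cdr: dict, tx: TransactionalStore, olap: AnalyticsStore):
    tx.insert_call(cdr)  # needed immediately and consistently for billing
    olap.append(cdr)     # eventually consistent is fine for analytics
```

Because `record_call` is the only place that knows both stores exist, migrating the analytics side to a real column store is a change to one class, not to every call site.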

Contrast this with what I saw during technical due diligence at OneRagtime: startups with tangled data models, where user data, billing data, and product data were interleaved in ways that made it nearly impossible to scale any single dimension independently. That kind of mess does not happen because of bad technology choices. It happens because nobody stopped to think about data boundaries early on.

What I Got Wrong

I am not immune to this. At Cantoo, I over-invested in Kubernetes early. We were a two-person engineering team running a platform that did not yet have enough traffic to justify the operational overhead of K8s. I spent weeks configuring auto-scaling, setting up monitoring, and debugging networking issues that would not have existed on a simpler deployment.

The platform eventually grew into it. Kubernetes became essential once we had real operator clients and needed carrier-grade reliability. But for the first six months, a few VMs behind a load balancer would have been just fine. I let my excitement about the technology override my judgment about what we actually needed at that stage.

At OneRagtime, I underestimated how much the Deepdive pipeline would need to evolve. I built it as a batch processing system because that matched the initial requirement: run the scraping and scoring once a day. But the investment team quickly wanted real-time alerts when a startup hit certain thresholds. Retrofitting real-time capabilities onto a batch system is painful. If I had designed it with an event-driven layer from the start, even a simple one, the transition would have been smoother.
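The "simple event-driven layer" can be as small as an in-process event bus: batch steps emit events, and subscribers (alert handlers today, real-time consumers tomorrow) react without the batch code knowing they exist. A minimal sketch, with invented names and an invented threshold:

```python
from collections import defaultdict

class EventBus:
    """Tiny in-process publish/subscribe bus."""
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)

bus = EventBus()
alerts = []
# An alert handler the investment team might want; purely illustrative.
bus.subscribe("threshold_crossed", lambda payload: alerts.append(payload))

def score_startup(startup: dict, bus: EventBus) -> float:
    s = startup["growth"] * 10  # placeholder scoring heuristic
    if s > 3:  # illustrative threshold
        bus.emit("threshold_crossed", {"name": startup["name"], "score": s})
    return s
```

Even when everything still runs in a nightly batch, routing notable results through an event bus from day one means "send a real-time alert" later becomes a new subscriber, not a retrofit of the pipeline.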

A Framework for Architecture Decisions

After years of making these calls, I have developed a mental framework that helps me evaluate architecture decisions. It comes down to three questions.

How reversible is this decision? If you can change it in a sprint, do not overthink it. If changing it later means rewriting half the system, invest the time now to get it right.

What will this look like at 10x scale? You do not need to build for 10x today. But you need to make sure that what you build today does not actively prevent you from getting there. Avoid designs that create hard ceilings.

Who else needs to understand this? Architecture that only the original author can maintain is a liability. I have seen brilliant systems that became bottlenecks because they were too clever. The best architecture is the one your team can reason about, debug, and extend without you in the room.

The Compounding Effect

Good architecture decisions compound like good investments. The routing engine at Cantoo enabled features we had not even imagined when we built it. The clean data model at OneRagtime made it possible to add new data sources in days instead of weeks.

Bad decisions compound too, but in the other direction. Every shortcut you take in your data model, every boundary you blur between services, every premature optimization that adds complexity, all of these create drag that slows you down a little more each month.

The difference between teams that ship fast and teams that get stuck is rarely about talent or tools. It is about the accumulated weight of their past architecture decisions. Choose carefully. The decisions you make this quarter will still be shaping your product years from now.