SaaS Stack Architecture: How to Stop Building a Spaghetti Tech Stack
There is a pattern I have encountered consistently across organisations of different sizes, industries and maturity levels. The trigger is usually a new CTO, a post-acquisition integration, or a scaling event that suddenly makes an invisible problem very visible. A team sits down to map their technology estate - the actual systems they are running, not the approved list - and what they find does not resemble a stack. It resembles a plate of spaghetti: dozens of tools, dozens of integrations, half of them undocumented, a handful of critical business processes running through connections that nobody fully understands and a growing pile of workarounds built on top of workarounds built on top of systems that were never supposed to do what they are now doing.
The immediate instinct is to call this a technology problem. It is not. Or rather: it is a technology symptom of an organisational problem - a procurement dynamic, a governance gap and an architectural philosophy (or the absence of one) that has been compounding quietly for years. And because the root causes are organisational, the standard technology response - replace everything with a more modern stack - tends to reproduce the original problem in a cleaner environment, on a slightly longer timeline.
The question worth asking is not “which tools should we be using?” It is “how did we end up here and what would prevent us from arriving at the same place again?”
How the Spaghetti Forms: The Accumulation Problem
No organisation deliberately builds a spaghetti tech stack. Every tool in the estate was, at some point, a reasonable response to a real problem. Understanding why reasonable decisions compound into architectural dysfunction requires looking not at individual choices but at the decision pattern they collectively form.
The typical accumulation trajectory follows a recognisable sequence. An organisation in its early growth phase adopts tools rapidly; the priority is velocity, not architecture. The CRM is stood up. A marketing automation platform is added. Finance gets a dedicated system. HR needs something for payroll. The product team wants a project management tool. Each decision is made by the function closest to the problem, evaluated primarily on feature fit and integrated opportunistically - a native connector here, a Zapier flow there, a developer writing a quick sync script on a Friday afternoon. At this stage, the estate is manageable because it is small. The team can hold it in their heads.
The problem does not announce itself. It accumulates. The marketing team adds a webinar platform. The sales team adds a sales intelligence tool. The customer success function wants a dedicated CS platform. The finance team buys a forecasting add-on. The product team adopts a roadmapping tool. Each addition is justified. Each creates new integration surface. And because each integration was built to solve the immediate connection problem - not to become part of a coherent architecture - the overall topology grows in ways that no individual decision-maker can see or is accountable for.
What makes this dynamic so persistent is that it is self-reinforcing. Once an estate reaches a certain density of point-to-point connections, adding a new tool to the existing web becomes easier than replacing any existing tool, because replacement requires untangling the dependencies that have accumulated around the old system. The cost of switching becomes high not because any individual tool is deeply embedded - most SaaS tools are relatively shallow individually - but because the integration web around them is not. The stack ossifies around its connections, not around its core systems.
By the time the problem becomes visible - usually when a major integration breaks, or when someone is asked to produce a report that requires data from four systems that do not agree with each other - the estate typically has a decade of compounding decisions encoded in it. The spaghetti is not a metaphor for messiness. It is a structural description of a connectivity topology that has lost coherence.
The Hidden Costs: Operational and Governance
The visible cost of a spaghetti tech stack is the maintenance burden. Engineering teams in organisations with high architectural debt spend a disproportionate share of their capacity on reactive work - fixing broken integrations, resolving data discrepancies, responding to incidents triggered by changes in one system propagating unexpectedly to others. This cost is real but at least partially legible: it shows up in ticket queues and sprint retrospectives and it creates visible pressure on delivery capacity.
The less visible costs are more damaging precisely because they are harder to quantify and easier to defer.
The governance cost is structural opacity. When the integration topology of a technology estate is not documented, owned, or periodically reviewed, leadership loses the ability to make reliable architectural decisions. Every proposal to change, replace, or extend any part of the stack requires a discovery exercise - mapping what the proposed change would affect, which downstream systems depend on it, what failure modes the transition might introduce. In organisations with clean architecture, this is a short exercise. In organisations with spaghetti stacks, it is often weeks of archaeology that produces a map that nobody is confident is complete. The cost of this opacity is not just the time spent on discovery. It is the decisions not taken because the cost and risk of change appear too high - the architectural conservatism that gradually narrows strategic options.
The operational risk is compounding fragility. Spaghetti stacks are not uniformly fragile - they are unevenly fragile in ways that are difficult to predict from the outside. A seemingly minor integration between two peripheral systems may turn out to be load-bearing: the single source of truth for a data field that half a dozen downstream processes depend on, built by an engineer who has since left, documented nowhere and vulnerable to any change in either connected system. These hidden load-bearing connections are discovered not through audit but through incident. And the incidents, when they occur, have a way of arriving at the worst possible moment - during an acquisition due diligence, in the middle of a peak trading period, or precisely when the organisation is trying to demonstrate operational maturity to investors.
The data quality cost is the one that most directly affects business decisions. In an estate where the same entity - a customer, an order, a product - is represented in multiple systems that are not reliably synchronised, data quality degrades predictably. The finance system says one number; the CRM says another; the data warehouse, built to reconcile them, says a third. Teams learn to distrust automated reports and fall back on manually compiled spreadsheets, which introduces human error at the point where it is least visible. Decisions get made on data that is approximately right, with informal adjustments applied by whoever is closest to the underlying reality. This is not a data strategy failure. It is an architectural failure manifesting as a data governance problem.
Diagnosing the Stack: The Coherence Assessment
Before any rationalisation work can be meaningfully planned, the organisation needs an honest assessment of what it is actually managing. I call this a coherence assessment - a structured investigation not into what the estate contains, but into how coherently the parts relate to each other.
A coherence assessment has four dimensions. The first is the dependency map: a complete picture of every active integration between systems, including the mechanism of connection, the data flowing through it, the frequency of exchange and the named owner. This is almost always harder to produce than expected, because in most organisations with architectural debt, the integration estate is partially documented in configuration systems, partially in developer memory and partially in no record at all. The exercise of producing the map is itself diagnostic - the integrations that cannot be mapped without significant archaeological effort are the ones most likely to be creating hidden risk.
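One way to make the dependency map concrete is to record each integration as a structured entry and let the gaps surface on their own. The sketch below is illustrative only: the field names, the example systems and the `unmapped_fields` helper are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Integration:
    """One row of a dependency map (hypothetical schema)."""
    source: str                       # system the data leaves
    target: str                       # system the data arrives in
    mechanism: Optional[str] = None   # e.g. "native connector", "script"
    data: Optional[str] = None        # what flows through the connection
    frequency: Optional[str] = None   # e.g. "hourly", "on event"
    owner: Optional[str] = None       # named accountable team or person


def unmapped_fields(integration: Integration) -> list[str]:
    """Return the names of fields nobody could fill in during mapping."""
    return [f.name for f in fields(integration)
            if getattr(integration, f.name) is None]


# Illustrative entries, not real systems.
dependency_map = [
    Integration("crm", "billing", "native connector", "customer accounts",
                "on event", "revenue-ops"),
    Integration("billing", "warehouse", "script"),
]

for row in dependency_map:
    gaps = unmapped_fields(row)
    if gaps:
        print(f"{row.source} -> {row.target}: unknown {', '.join(gaps)}")
```

Rows that come back with several unknown fields are the ones that needed archaeology to map, which makes them a reasonable first place to look for hidden risk.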
The second dimension is the ownership audit: for every system and every integration in the estate, is there a named team or individual who is accountable for its health, its evolution and its eventual decommissioning? Systems and integrations without owners are orphans. They tend to be the ones nobody is maintaining proactively, nobody is monitoring consistently and nobody will notice is degrading until it fails completely.
The third dimension is the data authority map: for every data entity that matters to the business - customer, order, product, contract, transaction - which system is the authoritative source and how does that authority propagate to other systems that also hold versions of the same data? In clean architecture, data authority is intentional and explicit. In spaghetti stacks, it is often contested, ambiguous, or undefined - producing the data quality dysfunction described earlier.
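A data authority map can be checked mechanically once the claims are written down. The following is a minimal sketch, assuming each claim is recorded as an (entity, system, is-authoritative) triple; the entities, system names and the three status labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical claims gathered during the assessment:
# (entity, system holding the data, whether it is treated as authoritative).
claims = [
    ("customer", "crm", True),
    ("customer", "billing", True),
    ("customer", "warehouse", False),
    ("order", "erp", True),
    ("order", "warehouse", False),
    ("product", "cms", False),
    ("product", "erp", False),
]


def authority_status(rows):
    """Classify each entity as 'explicit', 'contested' or 'undefined'."""
    authorities = defaultdict(list)
    for entity, system, is_authority in rows:
        systems = authorities[entity]  # registers the entity even if no authority
        if is_authority:
            systems.append(system)
    status = {}
    for entity, systems in authorities.items():
        if len(systems) == 1:
            status[entity] = "explicit"    # exactly one system of record
        elif systems:
            status[entity] = "contested"   # several systems claim authority
        else:
            status[entity] = "undefined"   # nobody claims authority
    return status


print(authority_status(claims))
```

In this made-up data, "customer" is contested between two systems and "product" has no authority at all, which are the two conditions most likely to produce conflicting numbers downstream.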
The fourth dimension is the change exposure analysis: given the current integration topology, what are the highest-risk points of change? If vendor A updates its API, how many downstream integrations are affected and how quickly would the organisation know? If system B were to be decommissioned, what would break and what would be the remediation path? This analysis is not primarily about current risk. It is about understanding the architectural constraints that the current stack imposes on future decisions.
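Given a dependency map, the "what would break" question reduces to a graph traversal. This is a sketch under stated assumptions: the topology is invented, and `change_exposure` is a hypothetical helper that follows direct consumers breadth-first to find everything downstream of a changed system.

```python
from collections import deque

# Hypothetical topology: system -> systems that consume its data directly.
downstream = {
    "crm": ["billing", "marketing"],
    "billing": ["warehouse"],
    "marketing": ["warehouse"],
    "warehouse": ["dashboards"],
    "dashboards": [],
}


def change_exposure(graph, changed):
    """Return every system reachable downstream of `changed`."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in graph.get(current, []):
            # Skip systems already counted, and the origin itself in a cycle.
            if consumer not in seen and consumer != changed:
                seen.add(consumer)
                queue.append(consumer)
    return seen


# Rank systems by how much of the estate a change to them could touch.
ranked = sorted(downstream,
                key=lambda s: len(change_exposure(downstream, s)),
                reverse=True)
print(ranked[0], sorted(change_exposure(downstream, ranked[0])))
```

The ranking is only as good as the map behind it: an unmapped integration simply does not appear, which is one more reason the dependency map comes first.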
Together, these four dimensions produce a coherence profile - a structured view of where the estate is well-governed and where it is not, which is the prerequisite for any meaningful rationalisation strategy.
The Rationalisation Framework: CORE
When rationalisation decisions need to be made - which tools to keep, which to retire, which to replace - the instinct is often to evaluate tools on their individual merits: capability, cost, vendor stability, user satisfaction. These factors matter, but they are insufficient as a basis for architectural decisions, because they evaluate components in isolation rather than as parts of a system.
I use a framework I call CORE - Criticality, Ownership, Replaceability, Ecosystem fit - as a structured basis for rationalisation decisions. Each dimension captures a different aspect of how a tool functions within the architecture, rather than how it performs as an isolated product.
Criticality is the degree to which the tool is load-bearing - not just useful, but structurally necessary for core business processes to function. A tool can be widely used but not critical: if it disappeared tomorrow, workflows would be disrupted but not broken. A tool can be narrowly used but highly critical: if it disappeared, a specific process that everything else depends on would fail. Criticality drives protection decisions - which tools warrant investment in resilience, redundancy and documented failover - rather than procurement decisions.
Ownership is whether the tool has clear, active governance: a named owner, documented integration points, an established change management process and a lifecycle plan. Unowned tools - tools that are running, paid for and depended upon, but that nobody is actively responsible for - are the primary source of hidden architectural risk. The rationalisation question for unowned tools is not whether to keep them but whether the organisation is prepared to establish genuine ownership; if not, the tool should be scheduled for replacement with something that will be owned.
Replaceability is the architectural cost of switching - not the functional difficulty of finding an alternative, but the integration cost of unpicking the current tool from the estate and reconnecting its replacement. Highly embedded tools with many integration points and poorly documented connection logic are expensive to replace regardless of how commoditised their functionality has become. Understanding replaceability is essential for realistic prioritisation: the tools that are most frustrating to use are not always the right starting point for rationalisation if they are also the most expensive to replace.
Ecosystem fit is the degree to which the tool integrates naturally with the core platform strategy - whether it supports standard integration patterns (REST APIs, webhooks, standard authentication), participates in the organisation’s chosen data flow architecture and aligns with the direction the stack is evolving. Tools with poor ecosystem fit create architectural drag: they require bespoke integration work, resist standardisation and tend to generate disproportionate maintenance overhead relative to their functional value.
Applying CORE across the estate produces a structured view of where rationalisation effort is most warranted: tools with low criticality, unclear ownership, low switching cost and weak ecosystem fit are the highest-priority candidates for replacement, because they add risk and overhead yet are cheap to unpick. Tools with high criticality and clear ownership that score poorly on replaceability, meaning they would be expensive to unpick, may need architectural investment - better integration documentation, more resilient connection design - rather than replacement.
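A CORE profile can be turned into a coarse triage rule. The sketch below is one possible reading, not a definitive scoring model: the 1-5 scale, the thresholds and the recommendation labels are all assumptions, and replaceability is expressed as `switching_cost` (the architectural cost of unpicking the tool), so a low value favours replacement.

```python
from dataclasses import dataclass


@dataclass
class CoreScore:
    """Hypothetical 1-5 scores for one tool (5 = high)."""
    name: str
    criticality: int      # 5 = load-bearing for core processes
    ownership: int        # 5 = named owner, documented, lifecycle plan
    switching_cost: int   # 5 = very expensive to unpick from the estate
    ecosystem_fit: int    # 5 = fits standard integration patterns


def recommend(tool: CoreScore, low: int = 2, high: int = 4) -> str:
    """Map a CORE profile to a coarse recommendation (illustrative rules)."""
    if (tool.criticality <= low and tool.ownership <= low
            and tool.switching_cost <= low and tool.ecosystem_fit <= low):
        return "replace or retire first"
    if (tool.criticality >= high and tool.ownership >= high
            and tool.switching_cost >= high):
        return "invest in documentation and resilience"
    if tool.ownership <= low:
        return "establish ownership or schedule replacement"
    return "keep under review"


# Invented examples.
estate = [
    CoreScore("webinar platform", 1, 1, 2, 2),
    CoreScore("erp", 5, 5, 5, 3),
    CoreScore("forecasting add-on", 3, 1, 3, 3),
    CoreScore("crm", 5, 4, 3, 5),
]
for tool in estate:
    print(f"{tool.name}: {recommend(tool)}")
```

The value of writing the rules down is less the output than the argument it forces: every threshold is a decision someone has to defend.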
Building Intentional Architecture: The Three Layers
Rationalisation addresses the existing estate. The more important challenge is architectural intentionality going forward - building a stack that is coherent by design rather than by accident.
Intentional stack architecture requires clarity about three distinct layers: the system of record layer, the integration layer and the workflow layer. Most spaghetti stacks suffer from the absence of this distinction - tools perform functions across multiple layers without it being clear which layer they belong to, and the result is a topology without structure.
The system of record layer is where authoritative data lives. Every business-critical data entity has one system of record: the single source of truth to which all other systems defer. CRM is the authoritative record for customers and opportunities. ERP or accounting software is the authoritative record for financial transactions. HRIS is the authoritative record for employees. These designations need to be explicit, documented and enforced through integration design - data should flow from systems of record to dependent systems, not be maintained independently in multiple places.
The integration layer is the connective tissue - the mechanisms through which data flows between systems of record and the tools that consume or contribute to that data. The critical architectural decision at this layer is whether to manage a topology of bilateral point-to-point connections, or to invest in integration infrastructure - an iPaaS, an event bus, an API gateway - that provides a managed, observable and governable connection layer. For organisations with more than fifteen to twenty active integrations, the integration infrastructure model typically becomes more efficient to maintain, because it constrains the failure surface and enables consistent monitoring and documentation practices. Below that threshold, well-governed bilateral connections may be adequate if ownership and documentation disciplines are applied consistently.
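The arithmetic behind the hub model is simple to illustrate. In the worst case, where every system exchanges data with every other, n systems need n(n-1)/2 bilateral links, whereas a shared hub needs one link per system. Real estates are rarely fully meshed, so the sketch below shows an upper bound rather than a forecast; the function names are hypothetical.

```python
def point_to_point_links(systems: int) -> int:
    """Maximum number of bilateral links among `systems` (full mesh)."""
    return systems * (systems - 1) // 2


def hub_links(systems: int) -> int:
    """Links needed when each system connects once to a shared hub."""
    return systems


for n in (5, 10, 20):
    print(f"{n} systems: up to {point_to_point_links(n)} bilateral links, "
          f"{hub_links(n)} via a hub")
# 5 systems: up to 10 bilateral links, 5 via a hub
# 10 systems: up to 45 bilateral links, 10 via a hub
# 20 systems: up to 190 bilateral links, 20 via a hub
```

The bilateral count grows quadratically while the hub count grows linearly, which is why the maintenance balance tips as the estate grows.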
The workflow layer is where business processes execute - the automation tools, AI agents, dashboards and collaborative applications that orchestrate work across systems of record. This layer should be treated as a consumer of the integration layer, not a builder of its own connections. The mistake many organisations make is allowing workflow tools - process automation platforms, no-code builders, reporting tools - to develop direct, undocumented integrations with systems of record, effectively bypassing the integration governance that the middle layer is meant to provide. When this happens, the workflow layer begins to reproduce the spaghetti pattern and the integration infrastructure investment is undermined.
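If every component is assigned to a layer, connections that bypass the integration layer can be flagged automatically. This is a minimal sketch with invented component names; the layer labels and the `bypasses` helper are assumptions for illustration.

```python
# Hypothetical layer assignment for each component.
layers = {
    "crm": "record",
    "erp": "record",
    "ipaas": "integration",
    "no-code-builder": "workflow",
    "reporting-tool": "workflow",
}

# Hypothetical connections as (from, to) pairs.
connections = [
    ("crm", "ipaas"),
    ("ipaas", "erp"),
    ("no-code-builder", "ipaas"),
    ("reporting-tool", "erp"),   # direct link that skips the integration layer
]


def bypasses(layer_of, links):
    """Return links joining a workflow tool directly to a system of record."""
    direct = {"workflow", "record"}
    return [(a, b) for a, b in links
            if {layer_of[a], layer_of[b]} == direct]


print(bypasses(layers, connections))
```

A check like this only catches connections that have been recorded; undocumented direct links still have to be found through the dependency mapping exercise.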
Maintaining clarity across these three layers is not a one-time architectural decision. It is an ongoing governance discipline - the practice of evaluating every proposed new tool, integration, or automation against the architectural model and resisting the organisational pressure to make expedient exceptions.
Common Mistakes When Trying to Fix It
The most costly mistake in stack rationalisation is the big bang migration - the attempt to solve an architectural accumulation problem through a single, comprehensive replacement programme. The appeal is understandable: if the spaghetti is the problem, replace everything with a clean, integrated platform and start fresh. In practice, big bang migrations have a structural failure mode that is almost universal. The programme is scoped in a moment of architectural optimism, before the full complexity of the existing estate is understood. As the project proceeds, the complexity of untangling the old estate and rebuilding its functionality in the new one consistently exceeds initial estimates. Timescales slip, costs escalate and the organisation ends up running parallel stacks - the old estate that operational teams are reluctant to abandon and the new platform that is perpetually six months from being fully ready. By the time the migration is complete, or more often by the time the programme is quietly scaled back to something deliverable, the architectural principles that motivated it have been compromised by the practical pressure to keep things running.
The alternative is incremental rationalisation, guided by the coherence assessment and CORE analysis: identify the highest-risk, lowest-ownership, lowest-ecosystem-fit components of the estate; build a replacement sequence that addresses them in order of priority; and establish the integration and governance infrastructure before - not after - migrating the data and workflows that depend on it. This approach is slower in headline terms but faster in practice, because it does not carry the catastrophic risk of a failed big bang and because each completed step improves the architectural foundation on which the next step builds.
The second common mistake is tool proliferation as a rationalisation strategy - adding new integration or orchestration tools to manage the complexity of existing tools, without addressing the underlying architectural dysfunction. An iPaaS platform deployed on top of a poorly governed estate does not fix the governance problem; it adds another layer to the spaghetti and creates the illusion of control without the substance of it. Integration infrastructure is only as effective as the governance practices that operate it. Without those practices, the platform becomes another undermanaged component in an estate that already has too many of them.
The third mistake is treating rationalisation as a project rather than a practice. Stack architecture degrades continuously because the organisational dynamics that produced the original accumulation do not change unless they are explicitly addressed. Completing a rationalisation programme and then reverting to the previous procurement and governance patterns will produce the same result on a predictable timeline. The structural governance changes - the integration cost accounting, the ownership requirements, the review cycles - are not the output of the rationalisation project. They are the mechanism by which the project’s outcomes are sustained.
The Governance Model for Ongoing SaaS Decisions
Sustainable stack architecture requires a governance model that is simple enough to operate without dedicated overhead, robust enough to prevent the accumulation dynamic from reasserting itself and authoritative enough that exceptions genuinely require justification rather than just approval.
The governance model I have found most effective in practice has three components: a procurement gate, an ownership requirement and an integration review cycle.
The procurement gate is a lightweight evaluation process applied to every proposed new tool before adoption. It does not need to be bureaucratic. It needs to answer five questions: What problem is this solving and is that problem not already addressed by something in the current estate? What is the integration cost - initial build and ongoing maintenance - and which team owns it? What is the data authority model - does this tool create, consume, or duplicate data that already exists in a system of record? What is the exit strategy if the tool needs to be replaced? And does this tool align with the integration layer architecture, or does it require a bespoke connection? These questions do not prevent adoption. They prevent adoption without acknowledgement of the obligations it creates.
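The gate can be as light as a five-field form that cannot be submitted with blanks. The sketch below assumes free-text answers and hypothetical field names; it deliberately checks only that each question has been answered, not whether the answer is a good one, since the purpose is acknowledgement rather than veto.

```python
from dataclasses import dataclass, fields


@dataclass
class ProcurementGate:
    """Answers to the five gate questions; an empty string means unanswered."""
    problem_and_overlap: str = ""          # problem solved, and existing overlap
    integration_cost_and_owner: str = ""   # build and maintenance cost, owning team
    data_authority_model: str = ""         # creates, consumes or duplicates data
    exit_strategy: str = ""                # how the tool would be replaced
    integration_layer_alignment: str = ""  # standard pattern or bespoke connection


def unanswered(gate: ProcurementGate) -> list[str]:
    """Return the questions that still have no answer."""
    return [f.name for f in fields(gate)
            if not getattr(gate, f.name).strip()]


# Invented proposal with two of five questions answered.
proposal = ProcurementGate(
    problem_and_overlap="Webinar hosting; nothing in the estate covers it",
    exit_strategy="Export attendee data as CSV, cancel at renewal",
)
print(unanswered(proposal))
```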
The ownership requirement is the rule that no tool or integration goes live without a named owner who accepts accountability for its ongoing health. Ownership is not a role - it is a commitment by a team to include the tool or integration in their operational remit, to monitor it proactively, to respond to incidents involving it and to plan for its evolution or decommissioning as part of their regular work. This requirement is easy to apply to new additions. It is more difficult, and more important, to retroactively establish ownership for the orphaned components of the existing estate. An amnesty process - a defined period during which teams are invited to accept ownership of unowned components, with the understanding that unowned components after a certain date will be scheduled for decommissioning - is often the most effective mechanism for closing the orphan gap.
The integration review cycle is a scheduled, periodic assessment of the integration estate - ideally quarterly - that asks three questions about every active integration: Is it still necessary? Is it operating within acceptable parameters? Is its documentation current and its ownership confirmed? Integrations that fail the first question are immediately scheduled for decommissioning. Integrations that fail the second are escalated to engineering as a prioritised remediation item. Integrations that fail the third are treated as governance actions before anything else proceeds. The review cycle is not an audit. It is a maintenance practice - the organisational equivalent of code review applied to the integration estate as a whole.
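The three review questions map naturally onto a small triage function. The sketch below is one reading of the rules above, with hypothetical names: an unnecessary integration is simply decommissioned, and where an integration fails both remaining questions the governance action is listed first, on the assumption that "before anything else proceeds" implies that ordering.

```python
from dataclasses import dataclass


@dataclass
class ReviewAnswers:
    """Answers to the three review questions for one integration."""
    name: str
    still_necessary: bool
    within_parameters: bool
    documented_and_owned: bool


def triage(item: ReviewAnswers) -> list[str]:
    """Return the follow-up actions for one integration, in priority order."""
    if not item.still_necessary:
        return ["schedule decommissioning"]
    actions = []
    if not item.documented_and_owned:
        actions.append("governance action")        # handled before anything else
    if not item.within_parameters:
        actions.append("escalate to engineering")
    return actions


# Invented review results.
review = [
    ReviewAnswers("crm->billing", True, True, True),
    ReviewAnswers("legacy export", False, False, False),
    ReviewAnswers("billing->warehouse", True, False, False),
]
for item in review:
    print(item.name, triage(item) or ["no action"])
```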
Together, these three components do not require a dedicated architecture team or a significant ongoing investment. They require a consistent cultural commitment to treating the integration estate as a managed asset rather than an emergent phenomenon.
The Strategic Reflection
There is a question behind all of this that does not often get asked at the right level of the organisation: what does the state of your technology estate say about the quality of your governance?
A spaghetti tech stack is not primarily evidence of poor vendor selection or inadequate engineering. It is evidence of a governance model that allowed consequential decisions - decisions that accumulate architectural obligation over years - to be made without adequate visibility of their full implications. The accumulated dysfunction is the sum of thousands of individually rational, locally optimised decisions made without a systems view.
The organisations that maintain coherent stack architecture over time are not the ones with the best tools. They are the ones where the decision architecture for technology adoption is mature enough to make the long-term costs of each decision visible at the point of decision, where ownership of the consequences is clear and accepted and where the review mechanisms exist to catch drift before it compounds into crisis.
That is a governance question before it is an architecture question. And governance questions are always, ultimately, leadership questions. The CTO who can articulate not just what the organisation’s stack contains but how it is governed - who owns what, what the architectural principles are and how decisions are made and reviewed - is demonstrating something more valuable than technical knowledge. They are demonstrating that the organisation has the structural clarity to make its technology estate a strategic asset rather than a compounding liability.
A clean tech stack is not the goal. An understood, owned and governable one is. The difference between those two things is the difference between a one-time project and a sustained organisational capability.


