LLM hosting readiness for UK businesses

The procurement question dressed up as an architecture question. Self-host an LLM, use SaaS, or use a hosted enterprise tier — the readiness framework that decides.

The short answer. LLM hosting readiness is the assessment of whether a UK business has the data, security, and operational foundations to self-host a large language model — or whether the better answer is to use an enterprise-tier SaaS AI with proper contractual posture. For most UK SMEs and mid-market businesses, SaaS with UK or EEA data residency is the correct answer; for businesses with material sovereign-data, regulatory, or sensitivity requirements, self-hosting becomes the right answer at a defined capability threshold. Both paths need the same readiness foundations underneath.

The three deployment models

Enterprise SaaS AI. Use a vendor's enterprise tier, such as ChatGPT Enterprise, Claude for Work, Microsoft 365 Copilot, or Gemini for Google Workspace. Customer data does not train the model; UK or EEA data residency is typically available; the vendor supplies a standard DPA, audit logs, and security evidence. The right answer for most UK businesses.

Hosted enterprise model. Use a vendor's API tier (OpenAI API, Anthropic API, Azure OpenAI Service, AWS Bedrock) inside applications the business builds or operates. Customer data does not train the model; data residency configurable; more operational responsibility on the business than full SaaS. The right answer for businesses building AI-into-product use cases.

Self-hosted LLM. Run an open-weight or open-source model (Llama, Mistral, Mixtral, or one of the specialised UK-hosted variants) inside the business's own cloud tenancy or on-premises. Customer data does not leave the business's perimeter; full data residency control; substantial operational responsibility and capability requirements. The right answer for a defined set of use cases.

When self-hosting is the right answer

Self-hosting an LLM is the right answer when at least two of the following are true:

  • Sovereign-data requirement. The business processes data that contractually or legally cannot leave a specific jurisdiction. UK government suppliers, certain financial services activities, certain healthcare contexts.
  • High-sensitivity use case. The data the AI processes is so sensitive that even a contractual no-train, no-retain commitment from a SaaS vendor does not satisfy the risk owner.
  • Material vendor concentration risk. The business has decided not to add another critical SaaS dependency at the AI layer.
  • Specialised model requirement. The business needs a model fine-tuned on internal data in a way SaaS vendors don't support, or a domain-specialised model that is only available open-source.
  • Sufficient internal capability. The business has, or is willing to build, internal capability to operate the LLM stack — ML engineering, infrastructure, monitoring, model lifecycle.

Without at least two of these, the operational cost of self-hosting typically exceeds the value over enterprise SaaS.
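The two-of-five rule above can be written down as a small check. This is a minimal sketch rather than a formal scoring method: the class and field names are illustrative, and each flag still has to be answered honestly by the risk owner.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SelfHostingCriteria:
    """One flag per criterion in the list above (names are illustrative)."""
    sovereign_data_requirement: bool = False
    high_sensitivity_use_case: bool = False
    vendor_concentration_risk: bool = False
    specialised_model_requirement: bool = False
    sufficient_internal_capability: bool = False


def criteria_met(c: SelfHostingCriteria) -> list[str]:
    """Return the names of the criteria that are true, in declaration order."""
    return [f.name for f in fields(c) if getattr(c, f.name)]


def self_hosting_indicated(c: SelfHostingCriteria, threshold: int = 2) -> bool:
    """Apply the 'at least two of the following' rule."""
    return len(criteria_met(c)) >= threshold


# Hypothetical example: a sovereign-data contract plus in-house ML capability.
example = SelfHostingCriteria(
    sovereign_data_requirement=True,
    sufficient_internal_capability=True,
)
print(criteria_met(example), self_hosting_indicated(example))
```

A result of True only says that self-hosting is worth evaluating; it does not replace the capability requirements described in the next section.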

The self-hosting capability requirements

Infrastructure. GPU capacity (rented or owned), at the level the model requires. For a 70B-class model serving moderate volume, this is a meaningful cost; for a 7B-class model serving internal use, much less so. Network and storage configured for the I/O patterns LLM serving creates.
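The gap between a 7B-class and a 70B-class model is easiest to see in the memory needed just to hold the weights. The sketch below uses the standard approximation of parameter count multiplied by bytes per parameter. It deliberately ignores the KV cache, activations, and runtime overhead, so treat the results as a lower bound; the function name and precision labels are illustrative.

```python
# Bytes needed to store one parameter at common serving precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, precision: str = "fp16") -> float:
    """Approximate weight memory in GB (10^9 bytes), weights only."""
    return params_billion * BYTES_PER_PARAM[precision]


for size in (7, 70):
    for precision in ("fp16", "int4"):
        print(f"{size}B @ {precision}: ~{weight_memory_gb(size, precision):g} GB")
```

At 16-bit precision a 70B-class model needs roughly 140 GB for weights alone, which means several data-centre GPUs, while a 7B-class model at about 14 GB fits on a single card. Quantisation narrows the gap but does not remove it.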

ML and inference engineering. At least one person who understands the model serving stack — quantisation, batching, KV-cache management, optimisation. For most mid-market businesses, this is a hire or a partner; for businesses with existing ML capability, it is an extension.
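To make the KV-cache point concrete, the sketch below estimates how much GPU memory the cache takes as context length and concurrency grow. It uses the usual formula for a decoder-only transformer: one key and one value tensor per layer. The model dimensions in the example are assumptions chosen to resemble a 7B-class model without grouped-query attention, not figures from any specific deployment.

```python
def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token: a key and a value vector for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, concurrent_sequences: int,
                 bytes_per_value: int = 2) -> float:
    """Total KV cache in GiB for a number of sequences at full context length."""
    per_token = kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value)
    return per_token * context_tokens * concurrent_sequences / 2**30


# Assumed dimensions: 32 layers, 32 KV heads, head dimension 128, 16-bit values.
print(kv_cache_bytes_per_token(32, 32, 128))   # bytes per token
print(kv_cache_gib(32, 32, 128, context_tokens=4096, concurrent_sequences=8))
```

Under these assumptions, eight concurrent 4,096-token sequences need about 16 GiB of cache, more than the 16-bit weights of the model itself. This is why batching and cache management are an engineering skill rather than a configuration detail.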

Security overlay. The security baseline that SaaS vendors provide as part of their offering — DLP for AI interactions, authentication and authorisation, audit logging, monitoring — has to be built or bought separately when self-hosting. Most UK businesses underestimate this layer.
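As an illustration of what "built separately" means, here is a minimal sketch of two pieces of that overlay, authorisation and audit logging, wrapped around a model call. Everything named here is hypothetical: the role allow-list, the `generate` callable standing in for the inference endpoint, and the in-memory list standing in for a real log store. DLP and monitoring are not covered.

```python
import hashlib
import time
from typing import Callable

# Hypothetical role allow-list; a real deployment would query its identity provider.
ALLOWED_ROLES = {"analyst", "engineer"}


def audited_completion(user: str, role: str, prompt: str,
                       generate: Callable[[str], str],
                       audit_log: list) -> str:
    """Authorise the caller, call the model, and record an audit entry."""
    entry = {
        "timestamp": time.time(),
        "user": user,
        "role": role,
        # Log a hash, not the raw prompt, so the log holds no sensitive text.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    if role not in ALLOWED_ROLES:
        audit_log.append({**entry, "decision": "denied"})
        raise PermissionError(f"role {role!r} may not call the model")
    response = generate(prompt)
    audit_log.append({**entry, "decision": "allowed", "response_chars": len(response)})
    return response


# Usage with a stand-in model function.
log: list = []
print(audited_completion("alice", "analyst", "Summarise Q3", lambda p: p.upper(), log))
print(log[0]["decision"])
```

Denied requests are logged before the exception is raised, so the audit trail covers refused access as well as successful calls.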

Operational lifecycle. Model updates, security patching, vulnerability management, capacity management, incident response. The lifecycle is the running cost, not just the deploy cost.

Compliance and governance. The AI usage policy, vendor due diligence, and risk register apply equally to self-hosted models — the "vendor" is now the open-source community and the business's own operations team.

Cloud-hosted LLM options for UK businesses

For businesses choosing self-hosting, the practical cloud options in 2026:

  • AWS UK-region deployment (eu-west-2 London). GPU instances available with reservation; Bedrock as a managed path for hosted open-source models.
  • Azure UK South. GPU-backed VMs; Azure ML for orchestration; Azure OpenAI Service as the hosted-enterprise alternative.
  • Google Cloud europe-west2 (London). GPU instances; Vertex AI as the managed path.
  • UK-domiciled providers. A number of smaller UK cloud providers offer GPU capacity with explicit UK sovereignty positioning. Due diligence as for any third party.
  • On-premises. Where data residency requirements require it; substantial capital cost; the right answer for a small minority of UK businesses.
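Whichever cloud option is chosen, UK residency is only real if every resource in the deployment sits in the intended region. The sketch below checks a resource inventory against the London regions named above. The inventory format is hypothetical, and `uksouth` is the programmatic name Azure uses for UK South; on-premises and smaller UK providers would need their own entries.

```python
# Region identifiers for the UK regions listed above.
UK_REGIONS = {
    "aws": {"eu-west-2"},
    "azure": {"uksouth"},
    "gcp": {"europe-west2"},
}


def outside_uk(resources: list[dict]) -> list[dict]:
    """Return the resources whose region is not a UK region for their provider."""
    return [
        r for r in resources
        if r["region"] not in UK_REGIONS.get(r["provider"], set())
    ]


# Hypothetical inventory: the logging bucket has drifted to another region.
inventory = [
    {"name": "gpu-inference-1", "provider": "aws", "region": "eu-west-2"},
    {"name": "model-weights", "provider": "aws", "region": "eu-west-2"},
    {"name": "prompt-logs", "provider": "aws", "region": "eu-west-1"},
]
print([r["name"] for r in outside_uk(inventory)])
```

Supporting resources such as log stores and backups are the usual place residency slips, so a check like this belongs in the deployment pipeline rather than in a one-off review.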

The readiness check before the architecture decision

Before the LLM hosting decision, the AI readiness foundations underneath need to be in place: governance, data, infrastructure, security, and use case. A business that self-hosts an LLM without those foundations has solved the wrong problem; it has the most exclusive infrastructure available, with the same shadow-IT risk profile underneath.

The Arx Certa AI Readiness Scorecard checks the five foundations and surfaces whether the business is ready for any AI deployment — SaaS, hosted, or self-hosted. The hosting question follows readiness, not the other way round.

Frequently asked

Should we self-host our AI for data protection reasons?

Probably not, for most UK businesses. Enterprise SaaS AI with UK or EEA data residency and proper contractual posture satisfies UK GDPR for most use cases. Self-hosting is the right answer for sovereign-data contracts, the highest-sensitivity use cases, or where vendor concentration is a board-level concern — not as a default response to data protection.

What does it cost to self-host an LLM in the UK?

The range is wide. A 7B-class model for internal use can run on a GPU instance costing roughly £4K-£8K per month. A 70B-class model serving moderate volume needs roughly £15K-£40K per month of GPU capacity, plus engineering overhead. On-premises with reserved hardware is a £100K-£500K capital project. SaaS at equivalent volume is typically a fraction of these numbers; the cost case for self-hosting is rarely about saving money.
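Annualising the figures above makes the comparison easier to take to a budget holder. The sketch below is simple arithmetic on the ranges quoted in this answer. It covers GPU capacity only and leaves out engineering salaries, power, hosting, and depreciation, all of which push the real number higher.

```python
# Monthly GPU cost ranges in GBP, taken from the figures quoted above.
MONTHLY_GPU_COST_GBP = {
    "7B-class, internal use": (4_000, 8_000),
    "70B-class, moderate volume": (15_000, 40_000),
}
ON_PREM_CAPEX_GBP = (100_000, 500_000)


def annual_range(monthly: tuple[int, int]) -> tuple[int, int]:
    """Convert a (low, high) monthly cost into a (low, high) annual cost."""
    return (monthly[0] * 12, monthly[1] * 12)


def months_to_match_capex(monthly_cost: float, capex: float) -> float:
    """Months of cloud GPU spend that add up to a given capital outlay."""
    return capex / monthly_cost


for label, monthly in MONTHLY_GPU_COST_GBP.items():
    low, high = annual_range(monthly)
    print(f"{label}: £{low:,} to £{high:,} per year")
```

On these numbers a 70B-class cloud deployment costs £180,000 to £480,000 a year in GPU capacity alone, which is the same order of magnitude as the on-premises capital project.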

Is Microsoft Copilot self-hosted?

No. Copilot is a SaaS product running on Microsoft's infrastructure, with Azure OpenAI Service underneath. It is enterprise-grade SaaS with strong contractual posture, but it is not self-hosted. What is sometimes called "self-hosted" Microsoft AI usually means deploying Azure OpenAI Service in the customer's own Azure tenant. That is more controllable than Copilot, but the model still runs on Microsoft's underlying hosting, so in the terms of this guide it is a hosted enterprise model rather than a self-hosted LLM.

Can we start with SaaS and move to self-hosting later?

Yes — and most UK businesses that end up self-hosting do exactly this. The SaaS phase is the use-case discovery and capability-building phase; the self-hosting phase is the production-scale phase where the use cases justify the operational investment. Starting with self-hosting before knowing the use cases tends to optimise the architecture for use cases that turn out not to be the ones that matter.

What's the minimum business size for self-hosting to make sense?

There is no minimum size; there is a minimum capability and a minimum use-case justification. A 50-person business with a sovereign-data contract and one ML engineer can sensibly self-host; a 2,000-person business with no internal AI engineering capability and standard use cases cannot. Capability and use case are the constraints, not headcount.

Related Arx Certa services

If the gaps the scorecard surfaces need outside help to close, these are the engagement types we run for UK firms:

  • AI services — implementation reviews, AI policy work, vendor due diligence, and pilot scoping.
  • Cybersecurity — UK GDPR, NCSC alignment, vendor risk assessment, audit-readiness.
  • Database — the data foundations AI projects depend on.
  • Infrastructure — cloud, identity, network and integration foundations.

Check the readiness foundations before the hosting decision

The Arx Certa AI Readiness Scorecard takes 4 minutes and surfaces whether the foundations are in place for safe AI deployment — at whatever hosting model fits your business.