Why Your Data Shouldn't Leave Your VPC
When you type a prompt into a public AI API (whether that is OpenAI, Anthropic's public API, or Google's Gemini), your data embarks on a journey that most organisations have never fully considered. It leaves your network, traverses the public internet, arrives at a multi-tenant data centre operated by a third party, is processed on shared infrastructure alongside data from thousands of other organisations, and may be logged, cached, or retained in ways that your data governance team has no visibility into.
For a marketing team brainstorming taglines, this might be an acceptable risk. For a wealth management firm processing client portfolio data, a law firm analysing privileged documents, or a healthcare provider handling patient records, it is not. This article explains why sensitive data should never leave your Virtual Private Cloud (VPC), and how modern cloud architecture makes it possible to use powerful AI models without that data ever touching the public internet.
The Data Exposure Risk: What Actually Happens
Let us trace what happens when a regulated business sends sensitive data to a public AI API. We will use a realistic example: a financial adviser uploading a client's pension statement for AI-assisted analysis.
The pension statement contains the client's full name, date of birth, National Insurance number, employer details, salary information, pension fund values, and beneficiary details. The adviser pastes this into an AI tool to generate a summary and identify potential consolidation opportunities.
At that moment, several things happen that the adviser may not appreciate.
- Data in transit over the public internet: Even with TLS encryption, the data is routed through public infrastructure. DNS queries, routing metadata, and connection patterns are all potentially observable. More importantly, the data is now outside the firm's network perimeter and security controls.
- Third-party processing: The data is received and processed by the AI provider. Under their terms of service, they may log the input, cache it for performance purposes, or retain it for abuse monitoring. Even providers who commit to not training on customer data typically retain inputs for some period.
- Multi-tenant infrastructure: The request is processed on shared compute infrastructure alongside requests from every other customer. While logical isolation exists, the physical infrastructure is shared. Side-channel attacks, though rare, are a documented risk in multi-tenant environments.
- Limited audit trail: The firm has no visibility into how the data was processed, which systems it touched, or how long it was retained. If a regulator asks "where did this client's data go?" the answer is "we do not know the full picture."
- Potential sub-processing: Many AI providers use sub-processors for logging, monitoring, and infrastructure management. Each sub-processor represents an additional party with potential access to your data.
What Is a VPC?
A Virtual Private Cloud is an isolated section of a public cloud provider's infrastructure that is logically dedicated to your organisation. Think of it as renting a private, walled-off section of a data centre, rather than sharing an open-plan office.
Within your VPC, you control the network topology. You define which resources can communicate with each other, which can access the internet (if any), and which are completely isolated. You set the firewall rules. You manage the encryption keys. You decide what comes in and what goes out.
When we talk about deploying AI within your VPC, we mean running AI models on infrastructure that sits inside this private, controlled environment. The model endpoint is only accessible from within your VPC: it has no public IP address, no internet gateway, and no route to the outside world. Your data goes in, the model processes it, and the results come back. Nothing leaves.
The VPC-Based AI Architecture
A properly architected VPC-based AI deployment typically looks like this.
Your application sits in a private subnet within your VPC. When it needs to call an AI model, that request travels through an AWS PrivateLink endpoint, a private network interface within your VPC that connects directly to the AI service (such as Amazon Bedrock) over AWS's internal backbone network. The request never touches the public internet. There is no internet gateway, no NAT gateway, and no public route in the subnet's route table.
The architecture looks like this in practice:
- Your application (in a private subnet) sends an API request to the AI model.
- The request is routed to a VPC endpoint (powered by AWS PrivateLink), a network interface with a private IP address within your VPC.
- The request travels over AWS's internal network backbone to the AI service (e.g., Amazon Bedrock), which processes it in an isolated compute environment.
- The response returns via the same private path back to your application.
- At no point does any data leave AWS's private network or become accessible from the public internet.
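In application code, routing requests through the private path above is typically just a matter of pointing the AWS SDK at the VPC endpoint's private DNS name instead of the public API hostname. The sketch below builds that configuration; the endpoint DNS name shown is hypothetical, and the `service_name`, `region_name`, and `endpoint_url` keys match the arguments boto3's `client()` accepts.

```python
# Sketch: client configuration for a PrivateLink-backed Bedrock endpoint.
# The VPC endpoint DNS name is illustrative; check your own endpoint's
# DNS entries in the AWS console before relying on this shape.

def private_endpoint_url(vpce_dns_name: str) -> str:
    """Return the HTTPS URL for a VPC endpoint's private DNS name."""
    return f"https://{vpce_dns_name}"

def bedrock_client_kwargs(region: str, vpce_dns_name: str) -> dict:
    """Arguments you would pass to boto3.client() so that all
    bedrock-runtime traffic resolves to the private endpoint
    rather than the public API hostname."""
    return {
        "service_name": "bedrock-runtime",
        "region_name": region,
        "endpoint_url": private_endpoint_url(vpce_dns_name),
    }

# With boto3 installed, usage would look like:
#   client = boto3.client(**bedrock_client_kwargs(
#       "eu-west-2",
#       "vpce-0abc123-example.bedrock-runtime.eu-west-2.vpce.amazonaws.com"))
#   client.invoke_model(modelId=..., body=...)
print(bedrock_client_kwargs(
    "eu-west-2",
    "vpce-0abc123-example.bedrock-runtime.eu-west-2.vpce.amazonaws.com"))
```

Because the override is a single configuration value, application code that already uses the AWS SDK needs no structural changes to adopt the private path.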
This is the foundation of our Secure AI Platform, and it is the architecture we deploy for every client operating in a regulated industry.
Key Benefits of Keeping Data in Your VPC
Data Sovereignty
Your data remains within infrastructure you control, in a geographic region you choose. For UK businesses, this means your data stays in AWS's London (eu-west-2) region, on infrastructure subject to UK law. You are not dependent on a third party's data residency commitments; you can verify it yourself through your VPC configuration.
Regulatory Compliance
For FCA-regulated firms, the ability to demonstrate exactly where client data is processed and stored is not a nice-to-have; it is an operational requirement. A VPC-based deployment gives you complete visibility and control. You can show auditors the network topology, the security group rules, the encryption configuration, and the access logs. There is no "trust the third party" element to explain. We cover FCA-specific requirements in detail in our article on FCA and AI compliance.
Complete Audit Trail
Within your VPC, you control the logging. Every API call to the AI model can be logged in CloudTrail. Every network flow can be captured in VPC Flow Logs. Every data access can be recorded. This gives you an audit trail that regulators and internal compliance teams can review -something that is simply not possible when data is sent to a public API endpoint.
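As a concrete example of what that audit trail enables, VPC Flow Log records can be scanned to confirm that no traffic left the VPC's address range. The sketch below assumes the default flow-log field order (where srcaddr and dstaddr are the 4th and 5th space-separated fields); verify the field positions against your own log format, and the sample records are illustrative.

```python
import ipaddress

def external_destinations(flow_log_lines, vpc_cidr: str):
    """Return destination addresses in the flow logs that fall outside
    the VPC's CIDR block, i.e. candidate evidence of traffic leaving
    the private network.

    Assumes the default flow-log record format, in which srcaddr and
    dstaddr are the 4th and 5th space-separated fields."""
    network = ipaddress.ip_network(vpc_cidr)
    external = set()
    for line in flow_log_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip headers or malformed records
        dst = fields[4]
        try:
            if ipaddress.ip_address(dst) not in network:
                external.add(dst)
        except ValueError:
            continue  # non-IP tokens (e.g. '-' in skipped records)
    return external

# Example: one flow that stayed inside 10.0.0.0/16, one that did not.
logs = [
    "2 123456789012 eni-0abc 10.0.1.23 10.0.2.45 443 49152 6 10 840 1600000000 1600000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.23 52.94.12.7 443 49153 6 10 840 1600000000 1600000060 ACCEPT OK",
]
print(external_destinations(logs, "10.0.0.0/16"))  # {'52.94.12.7'}
```

In a correctly isolated deployment this check should return an empty set, which is exactly the kind of evidence a compliance team can put in front of a regulator.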
Customisation and Control
When AI models run within your environment, you can implement custom guardrails, content filters, and output validation that are specific to your business requirements. You can integrate the model with your internal data sources via private network connections without exposing that data to the internet. You can tune model parameters, implement caching strategies, and optimise performance without being constrained by a third-party API's limitations.
Zero Data Exfiltration Risk
Perhaps the most compelling benefit: if the AI model endpoint has no route to the public internet, data exfiltration through that endpoint is architecturally impossible. This is not a policy control that could be misconfigured; it is a network-level constraint enforced by the infrastructure itself.
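That network-level constraint can also be verified mechanically. The sketch below inspects route-table entries shaped like the output of EC2's `describe_route_tables` API (the key names match that API; the route data itself is illustrative) and flags any route to an internet gateway or NAT gateway.

```python
def has_internet_route(route_table: dict) -> bool:
    """True if any route targets an internet gateway (igw-*) or a
    NAT gateway, i.e. the subnet is not fully isolated.

    Route entries follow the shape returned by EC2's
    describe_route_tables API (GatewayId / NatGatewayId keys)."""
    for route in route_table.get("Routes", []):
        if route.get("GatewayId", "").startswith("igw-"):
            return True
        if route.get("NatGatewayId"):
            return True
    return False

# An isolated subnet: only the local VPC route and a VPC endpoint route.
isolated = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationPrefixListId": "pl-12345", "GatewayId": "vpce-0abc123"},
]}

# A subnet with a default route to an internet gateway.
leaky = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0def456"},
]}

print(has_internet_route(isolated))  # False
print(has_internet_route(leaky))     # True
```

A check like this can run in a CI pipeline or a scheduled compliance job, turning "no route to the internet" from a design intention into a continuously verified invariant.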
Real-World Scenario: Public API vs Private VPC
Consider a wealth management firm that wants to use AI to analyse client portfolios and generate personalised investment summaries. The firm manages 2,000 client relationships with a combined asset value of over three hundred million pounds.
Scenario A: Public AI API
The firm integrates with a public AI API. Each portfolio analysis sends client names, portfolio holdings, transaction history, and risk profiles to the API provider's servers. The data traverses the public internet. The firm relies on the provider's contractual commitments regarding data handling. When the FCA asks how client data is protected during AI processing, the firm must point to a third party's terms of service and trust framework. The firm has limited visibility into data retention, sub-processors, or the specific infrastructure used. If the API provider changes their terms, suffers a breach, or is acquired by another company, the firm's data exposure profile changes without their direct control.
Scenario B: Private VPC Deployment
The same firm deploys AI models within their own AWS VPC via Amazon Bedrock, accessed through PrivateLink. Client data never leaves the firm's controlled environment. Portfolio data travels from the firm's application server to the Bedrock endpoint entirely within the private network. The firm can demonstrate to the FCA exactly where data is processed, pointing to specific VPC configurations, security groups, and network ACLs. All API calls are logged in the firm's own CloudTrail. Encryption keys are managed in the firm's own KMS. There is no third-party data processing agreement to manage because no data leaves the firm's environment.
"The difference is not just technical; it is the difference between telling your clients 'we trust our AI provider' and telling them 'your data never leaves our secure environment.' In regulated industries, that distinction matters."
AWS PrivateLink: The Private Tunnel
AWS PrivateLink is the technology that makes VPC-based AI practical without requiring you to host and manage models on your own GPU instances. It creates a private connection between your VPC and supported AWS services (including Amazon Bedrock) that stays entirely within the AWS network.
In non-technical terms, imagine two buildings connected by an underground tunnel. Even though both buildings face a public street, the tunnel means people can move between them without ever stepping outside. PrivateLink is that tunnel -it connects your private environment to the AI service without your data ever being exposed to the public internet.
The practical benefits for regulated organisations are significant. There is no need to configure an internet gateway or NAT gateway for AI traffic. Security groups and network ACLs can restrict access to the AI endpoint to specific applications or IP ranges within your VPC. All traffic is automatically encrypted. The endpoint appears as a private IP address within your VPC, making it indistinguishable from any other internal service from a networking perspective.
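To illustrate the kind of restriction described above, the sketch below builds a single ingress rule, in the shape accepted by EC2's `authorize_security_group_ingress` API, that allows only HTTPS traffic from the application subnet to reach the endpoint. The CIDR range and security group ID are hypothetical.

```python
def endpoint_ingress_rule(app_subnet_cidr: str) -> dict:
    """One ingress permission: HTTPS (port 443) only, and only from
    the application subnet's CIDR range. Anything not explicitly
    allowed is denied by the security group's implicit default."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{
            "CidrIp": app_subnet_cidr,
            "Description": "App subnet to AI endpoint only",
        }],
    }

rule = endpoint_ingress_rule("10.0.1.0/24")
# With boto3, you would attach this to the endpoint's security group:
#   ec2.authorize_security_group_ingress(
#       GroupId="sg-0abc123", IpPermissions=[rule])
print(rule["FromPort"], rule["IpRanges"][0]["CidrIp"])
```

Because the endpoint has a private IP inside the VPC, this is exactly the same security-group machinery the firm already uses for any other internal service.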
Beyond Compliance: The Business Case for Data Sovereignty
Data sovereignty is often framed purely as a compliance requirement, but there are compelling business reasons to keep your data within your own infrastructure.
Competitive differentiation: In financial services and the legal sector, the ability to tell clients that their data is processed exclusively within your secure private environment is a genuine differentiator. As AI adoption accelerates, clients are becoming more sophisticated in their questions about data handling. The firms that can provide clear, verifiable answers will win and retain clients.
Client trust: Trust is the foundation of advisory relationships. A single data incident involving a third-party AI provider could be catastrophic. By keeping data within your own environment, you remove this risk category entirely.
Insurance implications: Cyber insurance underwriters are increasingly asking about AI data handling practices. Firms that can demonstrate data sovereignty through VPC-based deployment may benefit from more favourable terms compared to those relying on public APIs for sensitive data processing.
Vendor independence: When your AI infrastructure is within your own VPC, you are not locked into a single AI provider. You can switch models, test alternatives, or run multiple models in parallel without renegotiating data processing agreements or reassessing compliance for each provider.
Implementation: Moving from Public API to Private VPC
Moving from a public AI API to a private VPC deployment is more straightforward than many organisations assume. Here is the high-level approach.
- Audit current AI usage: Identify all AI tools and APIs currently in use across the organisation, what data they process, and who uses them. You may find shadow AI usage that your IT team is not aware of.
- Classify data sensitivity: Not all AI workloads need to run in a private VPC. Internal brainstorming with non-sensitive data might be fine on public APIs. But anything involving client data, personal data, or commercially sensitive information should be migrated.
- Design the VPC architecture: Define the network topology, security groups, encryption configuration, and logging requirements. This should be done in collaboration with your compliance and security teams.
- Provision the infrastructure: Set up the VPC, subnets, PrivateLink endpoints, and IAM policies using infrastructure-as-code (Terraform or CloudFormation) for repeatability and auditability.
- Migrate workloads: Update your applications to point to the private endpoint rather than the public API. In most cases, this requires only changing the endpoint URL and authentication method: the API contracts for services like Amazon Bedrock are the same whether accessed publicly or via PrivateLink.
- Validate and test: Confirm that no traffic is leaving the VPC by reviewing VPC Flow Logs. Run penetration testing against the new architecture. Verify that all logging and audit trails are functioning correctly.
- Decommission public API access: Once workloads are migrated and validated, revoke API keys for the public endpoints and update your security policies to prevent future use of public AI APIs for sensitive data.
This is precisely the process we follow when deploying our Secure AI Platform for clients. The entire migration can typically be completed within four to six weeks, depending on the complexity of existing AI integrations.
Taking the Next Step
If your organisation is processing sensitive data through public AI APIs, the risk is real and growing. Regulators are paying closer attention, clients are asking harder questions, and the consequences of a data incident involving AI are severe.
The good news is that private VPC-based AI deployment is now mature, cost-effective, and achievable for mid-market businesses -not just the largest enterprises. You do not need to compromise on AI capability to keep your data secure.
We help regulated businesses across the UK deploy AI securely within their own cloud environments. Whether you are starting from scratch or migrating existing AI workloads, our team can guide you through the architecture, implementation, and compliance process. Book a consultation to discuss your requirements, or explore our full range of AI services.
Ready to transform your business with AI?
Book a free strategy session to discuss how Evolve AI can help your organisation harness AI safely and compliantly.
Book Strategy Session