Dynamic RAG-Based Platform for Teravera

 

About the client:

Teravera is a pioneering provider of secure AI infrastructure for developers, offering a zero-trust API that safeguards data, prevents hallucinations, and ensures AI integrity across leading platforms such as OpenAI, Google, and AWS. Teravera’s technology empowers developers to confidently integrate AI into critical enterprise application agents and workflows by tightly controlling data access and output accuracy. With a mission to bring trust and reliability to AI, Teravera serves as a cornerstone for developers and organisations adopting advanced AI solutions responsibly.


Challenge:

Teravera needed a secure, scalable way to extend its zero-trust AI approach into AWS while replicating the architecture and functionality of its existing Azure-based system. The goal was to design a dynamic, RAG-driven knowledge base and agent solution, one that could create new knowledge bases on the fly through an API. If a requested knowledge base didn’t exist, it would be instantly created; if it did, new documents would be seamlessly uploaded, ingested into the underlying vector database, and made ready for immediate querying.

A key requirement was departmental segmentation, enabling users to query specific knowledge bases and receive tailored, accurate answers. Instead of relying on OpenSearch Serverless, the solution needed to leverage Aurora PostgreSQL, using a single database with separate tables for each department’s knowledge base. This approach demanded a precise architectural design to balance flexibility, data integrity, and scalability while ensuring a smooth migration from Azure to AWS without disrupting Teravera’s established processes.
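
To make the per-department table model concrete, here is a minimal sketch of how one such table could be provisioned through the RDS Data API, assuming Aurora PostgreSQL with the pgvector extension; the cluster ARN, secret ARN, database name, and table layout are illustrative placeholders following the field mapping a Bedrock knowledge base expects from an Aurora vector store, not Teravera’s actual resources.

    # Illustrative sketch: create a per-department vector table in the shared
    # Aurora PostgreSQL database via the RDS Data API. ARNs are placeholders.
    import boto3

    rds_data = boto3.client("rds-data")

    CLUSTER_ARN = "arn:aws:rds:eu-west-2:123456789012:cluster:teravera-kb"        # placeholder
    SECRET_ARN = "arn:aws:secretsmanager:eu-west-2:123456789012:secret:kb-creds"  # placeholder

    def create_department_table(department: str) -> None:
        """One table per department keeps each knowledge base strictly segmented."""
        statements = (
            "CREATE EXTENSION IF NOT EXISTS vector",
            f"""CREATE TABLE IF NOT EXISTS kb_{department} (
                    id        uuid PRIMARY KEY,
                    embedding vector(1024),  -- dimension must match the embedding model
                    chunks    text,          -- raw document chunk
                    metadata  jsonb          -- source attributes for filtering
                )""",
        )
        for sql in statements:
            rds_data.execute_statement(
                resourceArn=CLUSTER_ARN,
                secretArn=SECRET_ARN,
                database="knowledge",
                sql=sql,
            )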


 

Solution:

To meet Teravera’s goal of replicating and enhancing their Azure-based setup on AWS, we designed and delivered a highly technical, API-driven Retrieval-Augmented Generation (RAG) solution powered by Amazon Bedrock. This architecture allows knowledge bases to be created, managed, and queried on demand, while meeting Teravera’s scalability, security, and departmental segmentation requirements.

Key components of the solution include:

  • Dynamic Knowledge Base Creation
    Through API calls, the system checks for the existence of a departmental knowledge base. If none exists, it automatically provisions:

    • A new knowledge base
    • A data source
    • An agent and agent alias

  • Amazon S3 for Data Source Management
    Documents uploaded via API are securely stored in S3 before being ingested into Aurora PostgreSQL. S3 provides scalable storage, granular IAM-based access control, and seamless integration into the ingestion pipeline.

  • Aurora PostgreSQL for Vector Storage & Segmentation
    Instead of OpenSearch, we used Aurora PostgreSQL as the vector database, implementing:

    • A single database for efficiency and manageability
    • Separate tables per department to maintain strict data segmentation


  • AWS Lambda for Orchestration
    Lambda functions power all orchestration logic (see the sketch after this component list), dynamically:

    • Creating knowledge bases, data sources, and agents when required
    • Routing documents into the correct departmental tables for ingestion
    • Invoking the correct agent alias for user queries, or returning a clean error if the department doesn’t exist



  • Amazon Bedrock for AI Capabilities
    Bedrock handles all LLM-driven reasoning and RAG workflows, including:
     
    • Generating embeddings for document retrieval
    • Interpreting user queries and augmenting them with retrieved content
    • Returning contextually accurate, department-specific responses

  • Fully API-Driven Workflow
    Orchestrated via Amazon API Gateway, the entire system enables users to upload documents, create knowledge bases dynamically, and query them in real time without manual setup.


By combining S3 for secure storage, Aurora PostgreSQL for structured vector data, Lambda for orchestration, and Bedrock for AI reasoning, the solution delivered a dynamic, on-demand knowledge base ecosystem. It not only replicated Teravera’s Azure-based architecture but also improved scalability, security, and operational efficiency through AWS-native services.
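
The check-or-create flow at the heart of this design is easiest to see in code. The sketch below is a simplified illustration of the Lambda orchestration logic using boto3’s bedrock-agent and bedrock-agent-runtime clients; the knowledge base naming convention, the agent identifiers, and the lookup helper are assumptions for illustration, not Teravera’s production implementation.

    # Simplified orchestration sketch: look up a departmental knowledge base,
    # return a clean error if it does not exist, otherwise route the query to
    # the matching Bedrock agent alias.
    import uuid

    import boto3

    agent_api = boto3.client("bedrock-agent")              # control plane: create/list
    agent_runtime = boto3.client("bedrock-agent-runtime")  # data plane: invoke

    def find_knowledge_base(department: str):
        """Return the knowledge base summary named for this department, or None."""
        pages = agent_api.get_paginator("list_knowledge_bases").paginate()
        for page in pages:
            for kb in page["knowledgeBaseSummaries"]:
                if kb["name"] == f"kb-{department}":  # naming convention is an assumption
                    return kb
        return None

    def handle_query(department: str, question: str) -> dict:
        if find_knowledge_base(department) is None:
            # Predictable error when the department has no knowledge base yet.
            return {"statusCode": 404,
                    "body": f"No knowledge base exists for department '{department}'."}

        # The agent and alias IDs would be looked up alongside the knowledge
        # base; hard-coded placeholders keep the sketch short.
        response = agent_runtime.invoke_agent(
            agentId="AGENT_ID_FOR_DEPT",
            agentAliasId="ALIAS_ID_FOR_DEPT",
            sessionId=str(uuid.uuid4()),
            inputText=question,
        )
        # invoke_agent streams the answer back as a series of chunk events.
        answer = "".join(
            event["chunk"]["bytes"].decode("utf-8")
            for event in response["completion"]
            if "chunk" in event
        )
        return {"statusCode": 200, "body": answer}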

 

 

Amazon Bedrock vs OpenAI: Performance & Latency Comparison

 

When planning Teravera’s platform, Cloud Combinator assessed multiple LLM delivery options to balance performance, security, and enterprise readiness. While OpenAI’s GPT family demonstrated strong generative capabilities, the need for AWS-native integration, enterprise compliance, and low-latency delivery made Amazon Bedrock the preferred solution.

Below is a comparison of Amazon Bedrock (using Claude 3.5 and Titan in latency-optimised mode) against OpenAI’s GPT-4/GPT-4o API models:

 

Typical Latency (Optimised)

    • Amazon Bedrock (Claude 3.5 / Titan, latency-optimised): ~1.2–1.3 seconds; latency-optimised mode reduces response time by ~40%
    • OpenAI GPT-4 / GPT-4o: ~2.5–3.0 seconds; varies by model and traffic load

Cost Model

    • Amazon Bedrock: pay-as-you-go per token; eligible for AWS credits and enterprise programs
    • OpenAI GPT-4 / GPT-4o: pay-as-you-go per token; no AWS credit integration

Infrastructure Control

    • Amazon Bedrock: fully AWS-managed, running natively within AWS for tighter security and integration
    • OpenAI GPT-4 / GPT-4o: API only, hosted on OpenAI’s external infrastructure

Scaling Approach

    • Amazon Bedrock: serverless, scaling automatically with demand
    • OpenAI GPT-4 / GPT-4o: API scaling with explicit rate limits

Privacy & Compliance

    • Amazon Bedrock: built on AWS’s enterprise compliance framework (ISO, SOC, HIPAA)
    • OpenAI GPT-4 / GPT-4o: high compliance standards, but outside the AWS ecosystem
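
Latency figures like these are easy to sanity-check with a simple probe. The sketch below times a single short completion against each provider; it is an illustrative harness, not the benchmark methodology behind the numbers above, and the model identifiers are assumptions that vary by account and region.

    # Rough latency probe: one short completion per provider, wall-clock timed.
    import time

    import boto3               # AWS SDK for Python
    from openai import OpenAI  # assumption: openai>=1.x client library

    PROMPT = "Summarise the company leave policy in one sentence."

    def time_bedrock(model_id: str = "anthropic.claude-3-5-sonnet-20240620-v1:0") -> float:
        """Round-trip seconds for one Bedrock Converse call (model ID is an assumption)."""
        client = boto3.client("bedrock-runtime")
        start = time.perf_counter()
        client.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": PROMPT}]}],
        )
        return time.perf_counter() - start

    def time_openai(model: str = "gpt-4o") -> float:
        """Round-trip seconds for one OpenAI chat completion (reads OPENAI_API_KEY)."""
        client = OpenAI()
        start = time.perf_counter()
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
        )
        return time.perf_counter() - start

    if __name__ == "__main__":
        print(f"Bedrock: {time_bedrock():.2f}s  OpenAI: {time_openai():.2f}s")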

 

 

Outcome

Automated Knowledge Base Creation – When documents are uploaded for a department without an existing knowledge base, the platform now provisions all required AWS resources automatically, including the knowledge base, data source, agent, and alias, eliminating manual setup entirely.
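
For illustration, the core of that provisioning step looks roughly like the bedrock-agent call below, which registers a knowledge base backed by an Aurora PostgreSQL table and attaches an S3 data source; every ARN, name, and the embedding model are placeholders rather than Teravera’s actual configuration.

    # Illustrative provisioning call: register a departmental knowledge base
    # backed by a table in the shared Aurora cluster. ARNs/names are placeholders.
    import boto3

    agent_api = boto3.client("bedrock-agent")

    def provision_knowledge_base(department: str) -> str:
        kb = agent_api.create_knowledge_base(
            name=f"kb-{department}",
            roleArn="arn:aws:iam::123456789012:role/kb-service-role",  # placeholder
            knowledgeBaseConfiguration={
                "type": "VECTOR",
                "vectorKnowledgeBaseConfiguration": {
                    # Embedding model is an assumption, not Teravera's choice.
                    "embeddingModelArn": "arn:aws:bedrock:eu-west-2::foundation-model/amazon.titan-embed-text-v2:0",
                },
            },
            storageConfiguration={
                "type": "RDS",
                "rdsConfiguration": {
                    "resourceArn": "arn:aws:rds:eu-west-2:123456789012:cluster:teravera-kb",                   # placeholder
                    "credentialsSecretArn": "arn:aws:secretsmanager:eu-west-2:123456789012:secret:kb-creds",   # placeholder
                    "databaseName": "knowledge",
                    "tableName": f"kb_{department}",
                    "fieldMapping": {
                        "primaryKeyField": "id",
                        "vectorField": "embedding",
                        "textField": "chunks",
                        "metadataField": "metadata",
                    },
                },
            },
        )
        kb_id = kb["knowledgeBase"]["knowledgeBaseId"]
        # Attach the S3 bucket that receives uploaded documents (placeholder ARN).
        agent_api.create_data_source(
            knowledgeBaseId=kb_id,
            name=f"{department}-documents",
            dataSourceConfiguration={
                "type": "S3",
                "s3Configuration": {"bucketArn": "arn:aws:s3:::teravera-kb-documents"},
            },
        )
        return kb_id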

Fast, Relevant Responses – Leveraging Aurora PostgreSQL for vector storage and Amazon Bedrock’s latency-optimised inference mode, the system delivers sub-second retrieval times and has reduced LLM response latency by over 40%, ensuring quick and accurate answers.

Error Handling Built In – Queries sent to non-existent departments are instantly rejected with a clear error message, providing consistent and predictable feedback for users.

Secure and Scalable by Design – The entire solution is built exclusively on AWS-managed services (S3, Lambda, Bedrock, Aurora PostgreSQL), ensuring enterprise-grade security, compliance, and automatic scalability as usage grows.

 

Impact

  • Significant Performance Gains – Teravera now benefits from a faster, more efficient retrieval system, with latency metrics that meet or exceed industry standards, dramatically improving the user experience.
  • Streamlined Operations – Automated document ingestion and intelligent query handling remove manual processes, speeding up the creation of new knowledge bases and reducing operational burden for the team.
  • Infrastructure-as-Code for One-Click Deployment – The entire solution is built and deployed via AWS CloudFormation, enabling one-click provisioning, consistent environments, and simplified updates across all stages of development and production.
  • Cost Visibility & Control – Every AWS resource is fully tagged (a short tagging sketch follows this list), making it easy for Teravera to track, attribute, and manage costs across departments and scale usage without losing financial transparency.
  • Future-Ready Architecture – Designed for growth, the platform can scale seamlessly as Teravera expands, maintaining low operational costs while supporting evolving AI and knowledge management needs.
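
As a minimal illustration of that tagging discipline, the sketch below applies a consistent cost-allocation tag set across resources with the Resource Groups Tagging API; the ARNs and tag keys are placeholders, not Teravera’s actual scheme.

    # Apply consistent cost-allocation tags across services in one sweep.
    # All ARNs and tag values below are illustrative placeholders.
    import boto3

    tagger = boto3.client("resourcegroupstaggingapi")

    tagger.tag_resources(
        ResourceARNList=[
            "arn:aws:s3:::teravera-kb-documents",                               # placeholder
            "arn:aws:lambda:eu-west-2:123456789012:function:kb-orchestrator",   # placeholder
        ],
        Tags={
            "project": "dynamic-rag",
            "department": "finance",  # drives per-department cost attribution
            "environment": "prod",
        },
    )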

 

Testimonial

Teravera’s CEO, Keith Wood, reflected on the project: "The implementation of a dynamic knowledge base architecture has already streamlined the way we manage data internally, and their expertise with services like Amazon Bedrock and OpenSearch has given us a platform we can rely on.

What stood out most was Cloud Combinator’s collaborative approach; they felt like an extension of our own team."


 

Future Engagement

Teravera expressed strong satisfaction with the solution delivered by Cloud Combinator, praising both the technical execution and the collaborative approach taken throughout the project. While they explored joining the AWS Solution Provider Program (SPP), their strict client privacy obligations meant they could not proceed, as participation would have required granting external access to their billing information.

Despite this limitation, Teravera has been clear about their intent to continue partnering with Cloud Combinator on future AWS initiatives. They recognise the value, deep technical expertise, and trusted partnership demonstrated during this engagement, and view Cloud Combinator as a key collaborator for their ongoing cloud and AI transformation.

 

 
