Case Study

Private ChatGPT for Secure Supply Chain Knowledge 

Solution

AI & ML, Data Governance & Privacy, Cloud Migration/Modernization

Industry

Healthcare

Core Technology

AWS, AI/LLM Models 

Overview

Our client, a leading global medical device manufacturer, faced a critical digital dilemma. Its global supply chain department needed the productivity gains of generative AI, yet leaking sensitive proprietary data to public LLMs was an unacceptable security risk.

To bridge the gap between innovation and security, Xavor was engaged to provide generative AI consulting services and to design and implement a fully private, on-premise AI knowledge platform that provided ChatGPT-like capabilities without any data leaving the organization’s secure environment.

Business Challenge

The department’s needs collided with real-world constraints, locking it into a “trilemma” of security, capability, and cost.

The Solution

We architected a self-contained intelligence platform in two strategic phases, ensuring the architecture could evolve as requirements matured. 

Initial Implementation 

We first deployed a local instance using PrivateGPT and Ollama within a private, self-hosted environment (implemented on AWS EC2 for development and testing). 

This environment delivered four intelligent modes for different use cases, with multiformat document support (.docx, PDF, .pptx, .md, CSV, .xlsx, .xlsm).

RAG Mode (Document Q&A)
Ask questions about uploaded documents and get answers with source citations 
System retrieves relevant information from knowledge base and provides cited responses 
Enables users to quickly find specific information without manually searching through documents 
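
As a rough illustration of the cited-answer flow described above, the sketch below numbers a set of already-retrieved passages, builds a prompt that asks the model to cite them, and returns the matching citation list. The Chunk type, the build_cited_prompt function, and the sample file names are hypothetical and are not taken from the client deployment; the model call itself is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    """A retrieved passage plus its origin (hypothetical structure)."""
    text: str
    file: str
    page: int


def build_cited_prompt(question: str, chunks: List[Chunk]) -> Tuple[str, List[str]]:
    """Number each retrieved chunk and ask the model to cite by number.

    Returns the prompt string and a parallel list of readable citations.
    """
    citations = [f"[{i}] {c.file}, p. {c.page}" for i, c in enumerate(chunks, start=1)]
    context = "\n".join(f"[{i}] {c.text}" for i, c in enumerate(chunks, start=1))
    prompt = (
        "Answer using only the numbered sources below and cite them as [n].\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )
    return prompt, citations


# Illustrative data only; these documents are invented for the example.
retrieved = [
    Chunk("Supplier audits are scheduled every 12 months.", "audit_policy.pdf", 4),
    Chunk("Critical suppliers require a backup source.", "sourcing_guide.docx", 11),
]
prompt, citations = build_cited_prompt("How often are suppliers audited?", retrieved)
print(prompt)
print(citations)
```

Keeping the citation list separate from the prompt lets the interface show file and page references even if the model's wording changes.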

Search Mode (Semantic Discovery)
Intelligent semantic search across all uploaded documents 
Goes beyond keyword matching to understand intent and context 
Returns relevant document sections with file and page references 
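
To make the idea of semantic ranking concrete, here is a minimal sketch that ranks indexed passages by cosine similarity and returns file and page references. It assumes each passage already has an embedding vector; the three-dimensional vectors, file names, and the search function are hand-written stand-ins, not output from the platform or from a real embedding model.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy index of (file, page, embedding). The vectors are invented stand-ins
# for what an embedding model would produce.
INDEX = [
    ("logistics_handbook.pdf", 12, [0.9, 0.1, 0.0]),
    ("supplier_contracts.docx", 3, [0.1, 0.8, 0.2]),
    ("quality_manual.pdf", 27, [0.0, 0.2, 0.9]),
]


def search(query_vec: Sequence[float], top_k: int = 2) -> List[Tuple[str, int, float]]:
    """Return the top_k (file, page, score) entries, most similar first."""
    scored = [(cosine(query_vec, vec), file, page) for file, page, vec in INDEX]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(file, page, round(score, 3)) for score, file, page in scored[:top_k]]


# A query vector that points mostly along the first dimension.
results = search([0.85, 0.2, 0.05])
for file, page, score in results:
    print(f"{file} (page {page}): {score}")
```

Because ranking depends on vector direction rather than shared keywords, a query can match a passage that uses different wording for the same idea.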

Basic Chat Mode (General AI Assistant)
General conversational AI capabilities for brainstorming and problem-solving 
No document retrieval – pure conversational assistance 
Helps users think through supply chain challenges 

Summarize Mode (Intelligent Data Analysis)
Automatically summarizes documents and data files 
For CSV/Excel files: Provides dataset overview, key statistics, and insights 
Answers both descriptive questions (“What does this data represent?”) and quantitative questions (“What is the variance of column X?”) 
Generates visual summaries for complex data
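
The dataset overview and quantitative answers described above could be produced along the following lines. This is a minimal sketch using only the Python standard library; the summarize_csv function and the sample columns are hypothetical, and it reports sample variance, which is an assumption about which variance a user would expect.

```python
import csv
import io
import statistics
from typing import Any, Dict


def summarize_csv(text: str) -> Dict[str, Any]:
    """Return the row count and basic statistics for every fully numeric column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary: Dict[str, Any] = {"rows": len(rows), "columns": {}}
    if not rows:
        return summary
    for name in rows[0]:
        try:
            values = [float(row[name]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip columns with text or missing values
        summary["columns"][name] = {
            "mean": statistics.mean(values),
            # Sample variance needs at least two values.
            "variance": statistics.variance(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


# Invented sample data for illustration.
SAMPLE = "part,lead_time_days,unit_cost\nA,10,2.5\nB,14,3.0\nC,12,3.5\n"
summary = summarize_csv(SAMPLE)
print(summary["rows"])
print(summary["columns"]["lead_time_days"])
```

Computing the statistics in code and passing them to the model, rather than asking the model to do arithmetic on raw rows, is one common way to keep quantitative answers exact.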

Key Features

Knowledge Sharing

Authorized user groups can upload domain-specific knowledge, making it searchable and accessible across the organization 

Data Privacy

100% on-premise – no data ever leaves the organization’s secure environment 

Source Attribution

Every answer includes citations showing exactly which documents and pages the information came from

Multi-User Access

Centralized knowledge repository accessible to the entire supply chain team

Critical Infrastructure Challenge
However, the Ollama-based initial implementation had scalability limitations: 

1. Hardware constraint:
Only small models (3B-7B parameters) could run efficiently 

2. Multi-user bottleneck:
Each concurrent user required loading additional model instances into limited GPU RAM 

3. Cost barrier:
Scaling to 256GB RAM + 64GB GPU would be prohibitively expensive for development and testing 

4. Performance limitations:
Response times of 120-150 seconds for complex queries 
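
A back-of-the-envelope calculation shows why model size and per-user instances run into GPU memory so quickly. The sketch below estimates memory for model weights only, assuming 16-bit weights (2 bytes per parameter); quantized models need less, and real usage is higher once the context cache and runtime overhead are included. The function names and the 16 GB comparison card are illustrative assumptions; the 64 GB figure is the one quoted above.

```python
def weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes).

    Weights only: the context cache and runtime overhead are ignored.
    """
    return params_billions * bytes_per_param


def instances_that_fit(gpu_gb: float, params_billions: float,
                       bytes_per_param: float = 2.0) -> int:
    """How many full copies of the weights fit in the given GPU memory."""
    return int(gpu_gb // weights_gb(params_billions, bytes_per_param))


for size in (3, 7):
    print(f"{size}B model at 16-bit: about {weights_gb(size):.0f} GB of weights")

# If each concurrent user needs a separate model instance, the number of
# copies that fit is an upper bound on concurrent users.
print("7B copies on a 16 GB GPU:", instances_that_fit(16, 7))
print("7B copies on a 64 GB GPU:", instances_that_fit(64, 7))
```

Even the 64 GB configuration, under these assumptions, holds only a handful of 7B instances, which is consistent with the concurrency bottleneck listed above.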

Phase Two: Migration to AWS Bedrock

To solve the concurrency and speed bottlenecks, we transitioned the platform to AWS Bedrock, unlocking enterprise-grade performance and scalability, while maintaining data privacy through AWS’s secure infrastructure. 

We implemented a “Dual Storage Strategy”—using AWS OpenSearch Serverless for “hot” frequently accessed data and local Qdrant for “cold” archival storage—optimizing both cost and performance. 
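
One way to picture the hot and cold tiers is a lookup that checks the hot store first and falls back to the cold store. The sketch below uses plain dictionaries as stand-ins for the two stores and does not call either product's API; the DualStore class and the rule that promotes a document after repeated cold hits are assumptions made for illustration, not a description of the delivered system.

```python
from typing import Dict, Optional, Tuple


class DualStore:
    """Hot-first lookup with a cold fallback (in-memory stand-in)."""

    def __init__(self, hot: Dict[str, str], cold: Dict[str, str],
                 promote_after: int = 2) -> None:
        self.hot = hot
        self.cold = cold
        self.promote_after = promote_after  # hypothetical promotion threshold
        self.cold_hits: Dict[str, int] = {}

    def get(self, doc_id: str) -> Tuple[Optional[str], str]:
        """Return (text, tier), where tier is 'hot', 'cold', or 'miss'."""
        if doc_id in self.hot:
            return self.hot[doc_id], "hot"
        if doc_id in self.cold:
            self.cold_hits[doc_id] = self.cold_hits.get(doc_id, 0) + 1
            text = self.cold[doc_id]
            # Copy frequently requested archival documents into the hot tier.
            if self.cold_hits[doc_id] >= self.promote_after:
                self.hot[doc_id] = text
            return text, "cold"
        return None, "miss"


store = DualStore(hot={"sop-001": "Current receiving procedure"},
                  cold={"sop-1998": "Archived receiving procedure"})
print(store.get("sop-001"))   # served from the hot tier
print(store.get("sop-1998"))  # first request is served from the cold tier
```

The trade-off is that cold lookups are slower but cheaper to keep around, so only documents with proven demand occupy the more expensive tier.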

This move enhanced platform capabilities, delivering faster response times, unlimited concurrent users, and real-time interactions with iterative exploratory workflows.
 
It also enabled advanced AI features: 

Multiple AI Models:
Access to state-of-the-art models (Claude Sonnet 4.5, Haiku 4.5, Opus 4.5) 

Superior Document Processing:
Vision Language Models for advanced OCR and image understanding 

Optimized for Different Tasks:
Fast models for simple queries, powerful models for complex analysis 
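
Routing queries to a fast or a powerful model can be as simple as a rule applied before the model is called. The following sketch is a hypothetical heuristic based on query length and a few analysis keywords; the pick_model function, the keyword list, the 30-word threshold, and the assignment of Haiku as the fast tier and Sonnet as the powerful tier are all assumptions, and the strings are display names rather than service model identifiers.

```python
FAST_MODEL = "Claude Haiku 4.5"       # assumed fast tier
POWERFUL_MODEL = "Claude Sonnet 4.5"  # assumed powerful tier

# Hypothetical hints that a query needs multi-step analysis.
ANALYSIS_HINTS = {"compare", "analyze", "variance", "forecast", "why"}


def pick_model(query: str, long_query_words: int = 30) -> str:
    """Choose a model tier from simple surface features of the query."""
    words = [w.strip(".,?!:;").lower() for w in query.split()]
    needs_analysis = any(w in ANALYSIS_HINTS for w in words)
    if needs_analysis or len(words) > long_query_words:
        return POWERFUL_MODEL
    return FAST_MODEL


print(pick_model("What is the lead time for part A?"))
print(pick_model("Compare supplier lead times and analyze the variance."))
```

A production router would more likely use a classifier or user-selected mode, but the principle is the same: spend the larger model only where the query needs it.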

Outcomes & Benefits

The transformation reshaped how the organization interacts with its own intelligence: 

95% Improved Velocity

Complex query response times dropped from over two minutes to just 5–7 seconds, enabling real-time exploratory workflows.
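
The headline percentage can be checked against the reported response times. The short calculation below takes the 120-150 second range from before the migration and the 5-7 second range after it, and computes the reduction for every combination of endpoints; the function name is illustrative.

```python
from itertools import product

BEFORE_SECONDS = (120, 150)  # reported range before the migration
AFTER_SECONDS = (5, 7)       # reported range after the migration


def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction in response time."""
    return (before - after) / before * 100.0


reductions = [reduction_pct(b, a) for b, a in product(BEFORE_SECONDS, AFTER_SECONDS)]
low, high = min(reductions), max(reductions)
print(f"Response-time reduction: {low:.1f}% to {high:.1f}%")
# prints "Response-time reduction: 94.2% to 96.7%"
```

Across the endpoints the reduction falls between roughly 94% and 97%, in line with the 95% figure.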

Improved Document and Data Accessibility

Vision-enabled processing and optimized chunking strategies improved the system’s ability to retrieve relevant information across document types. The system could also select the correct sheets in multi-tab Excel files to support both descriptive and quantitative questions.
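
The case study does not say how sheets were selected, but one plausible heuristic is to score each sheet by how many words its name and column headers share with the question. The sketch below shows that idea only; the pick_sheet and tokenize functions and the sample workbook layout are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens; underscores and punctuation act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_sheet(question: str, sheets: Dict[str, List[str]]) -> str:
    """Return the sheet whose name and headers overlap most with the question.

    sheets maps a sheet name to its list of column headers.
    Ties go to the sheet listed first.
    """
    question_tokens = tokenize(question)

    def overlap(name: str) -> int:
        sheet_tokens = tokenize(name)
        for header in sheets[name]:
            sheet_tokens |= tokenize(header)
        return len(question_tokens & sheet_tokens)

    return max(sheets, key=overlap)


# Invented workbook layout for illustration.
workbook = {
    "Inventory": ["part", "on_hand_qty", "warehouse"],
    "Shipments": ["shipment_id", "carrier", "lead_time_days"],
}
print(pick_sheet("What is the average lead time by carrier?", workbook))
print(pick_sheet("How many parts are on hand per warehouse?", workbook))
```

Word overlap is crude; an embedding-based comparison of the question against sheet headers would handle synonyms better at the cost of an extra model call.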

Mitigated Shadow AI Risk

By providing a secure internal alternative, the organization significantly reduced data leakage risk while meeting user expectations for speed and ease of use.

Institutional Memory

Knowledge that once lived in silos became a centralized, searchable asset, accelerating onboarding and cross-functional collaboration.

Tools & Tech Stack
Claude Sonnet 4.5
PrivateGPT
AWS EC2
OpenSearch Service

Conclusion

For high-stakes teams, the challenge is not access to AI, but access to AI that is fast, governable, and trustworthy. By starting with a pragmatic foundation and evolving into a scalable architecture, Xavor helped a global supply chain organization turn a data exposure risk into a durable operational advantage. 

If your organization needs to bridge the gap between AI innovation and strict data governance, Xavor’s experts can help you engineer the now. Visit www.xavor.com to learn more or to schedule a personalized consultation. 

Use generative AI to develop a scalable technology architecture

Xavor can help you design a modern AI-based architecture for your business that gives you an operational advantage.
