Case Study
Private ChatGPT for Secure Supply Chain Knowledge
Solution
AI & ML, Data Governance & Privacy, Cloud
Migration/Modernization
Industry
Healthcare
Core Technology
AWS, AI/LLM Models
Overview
Our client, a leading global medical device manufacturer, was facing a critical digital dilemma. Its global supply chain department needed the immense productivity potential of generative AI, yet the risk of leaking sensitive proprietary data to public LLMs was an unacceptable security threat.
To bridge the gap between innovation and security, Xavor was engaged to provide generative AI consulting services to design and implement a fully private, on-premise AI knowledge platform that provided ChatGPT-like capabilities without any data leaving the organization’s secure environment.
Business Challenge
The department’s needs collided with real-world constraints, locking it in a “trilemma” of security, capability, and cost.
- Data Security Risk: Employees were uploading sensitive supply chain data (CSV files, Excel spreadsheets, internal documents) to public AI platforms like ChatGPT and Claude, creating significant data leakage risks
- Knowledge Silos: Critical supply chain knowledge was fragmented across teams with no centralized, AI-powered knowledge sharing system
- Privacy vs. Capability Trade-off: Need for enterprise-grade AI capabilities without the ability to use cloud-based LLM services due to privacy concerns
- Infrastructure Constraints: Limited ability to scale on-premise AI solutions due to hardware costs and multi-user concurrency challenges
The Solution
We architected a self-contained intelligence platform in two strategic phases, ensuring the architecture could evolve as requirements matured.
Initial Implementation
We first deployed a local instance using PrivateGPT and Ollama within a private, self-hosted environment (implemented on AWS EC2 for development and testing).
This environment delivered four intelligent modes for different use cases, with multiformat document support (.docx, .pdf, .pptx, .md, .csv, .xlsx, .xlsm).
RAG Mode (Document Q&A)
Ask questions about uploaded documents and get answers with source citations
System retrieves relevant information from knowledge base and provides cited responses
Enables users to quickly find specific information without manually searching through documents
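The RAG mode described above follows a retrieve-then-cite pattern: find the most relevant chunks, then answer with their sources attached. A minimal sketch, using a toy bag-of-words similarity in place of a real embedding model; the chunk texts, file names, and page numbers here are hypothetical, not the client’s data:

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding'; a real system uses a neural embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Knowledge base: each chunk carries source metadata so answers can be cited.
chunks = [
    {"text": "Supplier lead time for titanium implants is 12 weeks.",
     "source": "procurement_sop.docx", "page": 4},
    {"text": "Cold-chain shipments require continuous temperature logging.",
     "source": "logistics_policy.pdf", "page": 9},
]

def retrieve(question, k=1):
    """Return the k most similar chunks; an LLM would then draft the answer."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c["text"])),
                  reverse=True)[:k]

hit = retrieve("What is the lead time for titanium implants?")[0]
print(f"{hit['text']}  [{hit['source']}, p.{hit['page']}]")
```

The key design point is that citation metadata travels with every chunk from ingestion onward, so source attribution costs nothing at query time.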
Search Mode (Semantic Discovery)
Intelligent semantic search across all uploaded documents
Goes beyond keyword matching to understand intent and context
Returns relevant document sections with file and page references
Basic Chat Mode (General AI Assistant)
General conversational AI capabilities for brainstorming and problem-solving
No document retrieval – pure conversational assistance
Helps users think through supply chain challenges
Summarize Mode (Intelligent Data Analysis)
Automatically summarizes documents and data files
For CSV/Excel files: Provides dataset overview, key statistics, and insights
Answers both descriptive questions (“What does this data represent?”) and quantitative questions (“What is the variance of column X?”)
Generates visual summaries for complex data
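The quantitative half of Summarize mode can be illustrated with a small sketch: computing a dataset overview and the variance of a column from an uploaded CSV. The column names and values below are invented for illustration; the production system routes such questions through an LLM rather than hard-coded statistics:

```python
import csv
import io
import statistics

# Stand-in for an uploaded CSV file (hypothetical sample data).
csv_text = """part_id,defect_rate
A100,0.02
A200,0.05
A300,0.03
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
values = [float(r["defect_rate"]) for r in rows]

summary = {
    "rows": len(rows),
    "columns": list(rows[0].keys()),
    "defect_rate_mean": statistics.mean(values),
    "defect_rate_variance": statistics.variance(values),  # sample variance
}
print(summary)
```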
Key Features
Knowledge Sharing
Authorized user groups can upload domain-specific knowledge, making it searchable and accessible across the organization
Data Privacy
100% on-premise – no data ever leaves the organization’s secure environment
Source Attribution
Every answer includes citations showing exactly which documents and pages the information came from
Multi-User Access
Centralized knowledge repository accessible to the entire supply chain team
Critical Infrastructure Challenge
However, the Ollama-based initial implementation had scalability limitations:
1. Hardware constraint: Only small models (3B-7B parameters) could run efficiently
2. Multi-user bottleneck: Each concurrent user required loading additional model instances into limited GPU RAM
3. Cost barrier: Scaling to 256GB RAM + 64GB GPU would be prohibitively expensive for development and testing
4. Performance limitations: Response times of 120-150 seconds for complex queries
Phase Two: Migration to AWS Bedrock
To solve the concurrency and speed bottlenecks, we transitioned the platform to AWS Bedrock, unlocking enterprise-grade performance and scalability, while maintaining data privacy through AWS’s secure infrastructure.
We implemented a “Dual Storage Strategy”—using AWS OpenSearch Serverless for “hot” frequently accessed data and local Qdrant for “cold” archival storage—optimizing both cost and performance.
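The dual storage idea above can be sketched as a tiered store that promotes frequently accessed documents to the hot tier. This is a minimal in-memory sketch, assuming a simple access-count promotion policy; the threshold, the dictionary-backed tiers, and the document IDs are illustrative stand-ins for OpenSearch Serverless and Qdrant, not the client’s actual policy:

```python
class DualStore:
    """Toy hot/cold tiering: hot = frequently accessed, cold = archival."""

    def __init__(self, hot_threshold=3):
        self.hot = {}            # stand-in for OpenSearch Serverless
        self.cold = {}           # stand-in for local Qdrant
        self.access_counts = {}
        self.hot_threshold = hot_threshold

    def put(self, doc_id, vector):
        self.cold[doc_id] = vector  # new documents start in cold storage
        self.access_counts[doc_id] = 0

    def get(self, doc_id):
        self.access_counts[doc_id] += 1
        # Promote frequently accessed documents to the hot tier.
        if (doc_id in self.cold
                and self.access_counts[doc_id] >= self.hot_threshold):
            self.hot[doc_id] = self.cold.pop(doc_id)
        return self.hot.get(doc_id, self.cold.get(doc_id))

store = DualStore()
store.put("sop-1", [0.1, 0.2])
for _ in range(3):
    store.get("sop-1")
print("sop-1" in store.hot)  # True: promoted after repeated access
```

In production, the same routing logic decides which backend serves a query, which is where the cost/performance trade-off is realized.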
This move enhanced platform capabilities, delivering faster response times, unlimited concurrent users, and real-time interactions with iterative exploratory workflows.
It also enabled advanced AI features:
Multiple AI Models:
Access to state-of-the-art models (Claude Sonnet 4.5, Haiku 4.5, Opus 4.5)
Superior Document Processing:
Vision Language Models for advanced OCR and image understanding
Optimized for Different Tasks:
Fast models for simple queries, powerful models for complex analysis
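The task-based model selection above can be sketched as a heuristic dispatcher. This is purely illustrative: the model labels and the complexity heuristic are assumptions, not the production routing logic:

```python
# Keywords that suggest an analytical, computation-heavy question (assumed).
ANALYTICAL_HINTS = ("variance", "forecast", "compare", "trend", "why")

def pick_model(query, has_attachment=False):
    """Route analytical or document-heavy queries to a capable model,
    everything else to a fast, cheap one. Heuristic is illustrative only."""
    q = query.lower()
    if (has_attachment
            or len(q.split()) > 30
            or any(hint in q for hint in ANALYTICAL_HINTS)):
        return "powerful-model"  # e.g. an Opus-class model
    return "fast-model"          # e.g. a Haiku-class model

print(pick_model("What is our return policy?"))            # fast-model
print(pick_model("Compare Q3 vs Q4 supplier lead times"))  # powerful-model
```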
Outcomes & Benefits
The transformation reshaped how the organization interacts with its own intelligence:
Improved Document and Data Accessibility
Vision-enabled processing and optimized chunking strategies improved the system’s ability to retrieve relevant information across document types, and it could intelligently select the correct sheets in multi-tab Excel files to support both descriptive and quantitative questions.
Mitigated Shadow AI Risk
By providing a secure internal alternative, the organization significantly reduced data leakage risk while meeting user expectations for speed and ease of use.
Institutional Memory
Knowledge that once lived in silos became a centralized, searchable asset, accelerating onboarding and cross-functional collaboration.
Tools & Tech Stack
Conclusion
For high-stakes teams, the challenge is not access to AI, but access to AI that is fast, governable, and trustworthy. By starting with a pragmatic foundation and evolving into a scalable architecture, Xavor helped a global supply chain organization turn a data exposure risk into a durable operational advantage.
If your organization needs to bridge the gap between AI innovation and strict data governance, Xavor’s experts can help you engineer the now. Visit www.xavor.com to learn more or to schedule a personalized consultation.
Use generative AI to develop a scalable technology architecture
Xavor can help you design a modern AI-based architecture for your business that gives you an operational advantage.