HTI Chat Engine

HTI Chat Engine - AI Powered

Search Less. Solve Faster. Stay Private.

Platform Overview

Built for Real Enterprise Workloads

Every feature maps directly to production requirements — from token-accurate context budgeting to JWT-guarded multi-tenant routing.

Multi-Stage RAG Pipeline

Retrieval → Query Expansion → ChromaDB Search → Cross-Encoder Reranking → Token-Budgeted Context Assembly → Mistral LLM Generation. Every stage is precision-tuned for enterprise accuracy.
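The stage chain above can be sketched as plain function composition. Everything below is an illustrative stand-in — the function names, stub logic, and whitespace "tokens" are assumptions for demonstration, not the HTI Chat Engine's actual API:

```python
# Illustrative sketch of the multi-stage pipeline; each stub stands in
# for a real component (query expander, ChromaDB, cross-encoder, LLM).

def expand_query(query: str) -> list[str]:
    # Stub: a real implementation might add synonyms or LLM rewrites.
    return [query, f"{query} (expanded)"]

def vector_search(queries: list[str], k: int = 10) -> list[str]:
    # Stub standing in for a ChromaDB similarity search.
    corpus = ["doc about billing", "doc about auth", "doc about RAG"]
    return corpus[:k]

def rerank(query: str, docs: list[str], top_n: int = 2) -> list[str]:
    # Stub standing in for a cross-encoder: score overlap, sort, cut.
    scored = sorted(docs, key=lambda d: -sum(w in d for w in query.split()))
    return scored[:top_n]

def assemble_context(docs: list[str], token_budget: int) -> str:
    # Greedy packing under a (word-count) token budget.
    out, used = [], 0
    for d in docs:
        n = len(d.split())
        if used + n > token_budget:
            break
        out.append(d)
        used += n
    return "\n".join(out)

def answer(query: str, token_budget: int = 50) -> str:
    docs = vector_search(expand_query(query))
    context = assemble_context(rerank(query, docs), token_budget)
    # A real system would now prompt the LLM with `context` + `query`.
    return f"[LLM answer grounded in {len(context.split())} context tokens]"

print(answer("How does auth work?"))
```

The design point is that each stage has a narrow, testable contract, so any one stage (say, the reranker) can be swapped without touching the others.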

Exact Token Budgeting

The Generator computes real token counts via the LLM's own tokenizer. History and context are binary-search truncated to fit within VRAM-detected n_ctx limits — no silent overflows.

Scheduled Maintenance Mode

Middleware-level 503 traffic routing with schedule-based auto-on/off. The Angular SystemService polls /api/health every 30s (backing off exponentially on failures), redirecting non-admin users to a maintenance screen.
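The two server-side decisions behind this feature — "are we in a scheduled window?" and "how long until the next poll after a failure?" — can be sketched in a few lines. The window format, the 2:00–4:00 schedule, and the backoff constants are illustrative assumptions; only the 503-for-non-admins behaviour and the 30s base interval come from the text:

```python
# Sketch of schedule-based maintenance gating and poll backoff.
from datetime import datetime, time

# Hypothetical schedule: each window is (start, end) in server-local time.
MAINTENANCE_WINDOWS = [(time(2, 0), time(4, 0))]

def in_maintenance(now: datetime, windows=MAINTENANCE_WINDOWS) -> bool:
    t = now.time()
    return any(start <= t < end for start, end in windows)

def handle_request(now: datetime, is_admin: bool) -> int:
    # Middleware decision: admins pass through, everyone else gets 503.
    if in_maintenance(now) and not is_admin:
        return 503
    return 200

def backoff_delay(failures: int, base: float = 30.0, cap: float = 300.0) -> float:
    # Health-poll interval: 30s normally, doubling per failure up to a cap.
    return min(base * (2 ** failures), cap)
```

Keeping the gate in middleware (rather than per-route checks) means every endpoint inherits the behaviour, and the schedule flips traffic on and off without a deploy.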

Live SSE Token Streaming

Responses stream token-by-token via Server-Sent Events. The Angular frontend uses @microsoft/fetch-event-source for smooth real-time rendering with stop-generation support.
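On the wire, each streamed token is one Server-Sent Events message. A minimal sketch of the server-side framing (the `token`/`done` event names and the `[DONE]` sentinel are assumptions for illustration; the SSE line format itself is standard):

```python
# Sketch of SSE framing for token-by-token streaming.

def sse_frame(data: str, event: str = "") -> str:
    # One SSE message: optional event name, data line(s), blank-line end.
    lines = []
    if event:
        lines.append(f"event: {event}")
    for part in data.splitlines() or [""]:
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"

def stream_tokens(tokens):
    # Generator a web framework could iterate as the response body.
    for tok in tokens:
        yield sse_frame(tok, event="token")
    yield sse_frame("[DONE]", event="done")

body = "".join(stream_tokens(["Hel", "lo", "!"]))
```

On the client, a library like @microsoft/fetch-event-source parses these frames back into events; stop-generation simply aborts the fetch, which closes the generator server-side.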

JWT Auth & Session Management

Secure JWT-based authentication with Angular Signals for reactive session state. Auto-refresh on expiry, bearer-token injection via HTTP interceptors, and role-scoped guards on every route.
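The bearer-token mechanics underneath this flow can be shown with a stdlib-only HS256 sketch. This is an educational stand-in, not the engine's actual implementation (which is not shown here); production code should use a vetted JWT library:

```python
# Educational HS256 JWT sign/verify using only the standard library.
import base64, hashlib, hmac, json, time

def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def _b64url_decode(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def sign_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"

def verify_jwt(token: str, secret: bytes) -> dict:
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected), sig):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")  # the auto-refresh trigger
    return payload

token = sign_jwt({"sub": "user-1", "role": "admin"}, b"secret")
claims = verify_jwt(token, b"secret")
```

An HTTP interceptor then injects `Authorization: Bearer <token>` on every request, and a `ValueError("token expired")`-style rejection is what drives the auto-refresh path.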

Document Ingestion Pipeline

Supports PDF, DOCX, TXT, and Markdown up to 100MB. Files are chunked (500–1000 chars), embedded via Nomic v1.5, and indexed in tenant-isolated ChromaDB collections with real-time progress via SSE.
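The chunking step above can be sketched as a sliding window inside the stated 500–1000 character range. The 800-char size and 100-char overlap are illustrative defaults, not the engine's actual settings:

```python
# Sketch of fixed-size chunking with overlap, in the 500-1000 char range.

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split `text` into ~`size`-char chunks, carrying `overlap` chars
    of context between neighbours (requires 0 <= overlap < size)."""
    assert 0 <= overlap < size
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

doc = "".join(str(i % 10) for i in range(2000))
parts = chunk_text(doc, size=800, overlap=100)
```

Each chunk would then be embedded (Nomic v1.5 in this stack) and written to the tenant's ChromaDB collection; the overlap keeps sentences that straddle a boundary retrievable from either side.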

Products

RAG Engine

Key Capabilities

Trial Request


Would you like to start a project with us?

At Hephzibah Technologies, we partner with you to design, develop, and deliver innovative IT solutions tailored to your business goals. Let’s create something exceptional together.