A safe, local RAG maternal-health assistant. This page describes what the prototype does today, how a request flows through it, and how it is designed to be reviewed and scaled.
MamaCare AI answers common pregnancy questions in plain language, surfaces warning signs that need medical review, stays grounded in trusted knowledge instead of guessing, and feels supportive rather than clinical. It is a digital maternal support companion for the moments between antenatal visits.
The current prototype supports pregnancy questions across trimester-specific topics, uses curated FAQ cards as the highest-trust answer layer, applies hybrid retrieval to match natural phrasing, returns citations for grounded responses, blocks unsafe medication-style answers and diagnostic claims, escalates emergency symptom patterns immediately, and operates locally without an external API.
Safety is a deterministic rule engine that runs before answer generation. It is intentionally explicit so it can be reviewed with clinicians rather than hidden inside a model. The guardrail layers are:
# Guardrail layers (deterministic, run before retrieval)
EMERGENCY        → heavy/severe bleeding, no fetal movement, convulsions,
                   chest pain, difficulty breathing, severe headache,
                   blurred vision, water breaking, high fever → escalate now
SELF_HARM_CRISIS → self-harm / crisis language → crisis escalation
SENSITIVE        → pregnancy-decision language → emotional support + referral
MEDICATION_BLOCK → doses (mg/mcg/ml), "twice daily", drug names → block dosage advice
PRIVACY          → names, long numbers, contacts → ask to avoid sharing PII
OUT_OF_SCOPE     → unrelated topics → politely decline
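The layers above can be sketched as an ordered table of patterns that is checked before any retrieval happens. This is a minimal illustration, not the project's rule set: the regular expressions, the `GUARDRAIL_RULES` name, and the first-match-wins ordering are all assumptions. SENSITIVE and OUT_OF_SCOPE are left out because the source does not say how they are detected.

```python
import re
from typing import Optional

# Hypothetical rule table. Layer names follow the list above; the patterns
# are illustrative assumptions, not the prototype's actual rules.
GUARDRAIL_RULES = [
    ("EMERGENCY", re.compile(
        r"(heavy|severe) bleeding|no fetal movement|convulsion|chest pain|"
        r"difficulty breathing|severe headache|blurred vision|"
        r"water (broke|breaking)|high fever",
        re.IGNORECASE)),
    ("SELF_HARM_CRISIS", re.compile(
        r"hurt myself|end my life|kill myself", re.IGNORECASE)),
    ("MEDICATION_BLOCK", re.compile(
        r"\b\d+\s?(mg|mcg|ml)\b|twice daily|\bdos(e|age)\b", re.IGNORECASE)),
    # A run of nine or more digits is treated as a possible phone or ID number.
    ("PRIVACY", re.compile(r"\b\d{9,}\b")),
]


def check_guardrails(message: str) -> Optional[str]:
    """Return the first matching layer name, or None if the message may proceed."""
    for layer, pattern in GUARDRAIL_RULES:
        if pattern.search(message):
            return layer
    return None
```

Because the rules are plain data, a clinician can review the table line by line, and the same input always produces the same decision.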
When a mother asks a question:
1. The app receives the message.
2. Guardrails check for emergency, self-harm, sensitive topics, medication, privacy, and scope.
3. Trimester hints are inferred from the question when possible.
4. A fast curated FAQ search runs first.
5. If the fast path is weak, semantic retrieval runs against the local index.
6. Trusted maternal guidance cards are prioritised over raw reports/tables.
7. The response layer formats a warm, grounded answer with escalation guidance and citations.
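The steps above could be wired together roughly as follows. This is a sketch under stated assumptions: `infer_trimester`, `answer`, and `FAST_PATH_THRESHOLD` are hypothetical names, the week ranges follow a common clinical convention (weeks 1 to 13, 14 to 27, 28 onward) rather than anything stated by the project, and the guardrail and search functions are passed in as plain callables so the control flow stays visible. Card prioritisation and response formatting (steps 6 and 7) are omitted.

```python
import re
from typing import Callable, Optional, Tuple

FAST_PATH_THRESHOLD = 0.6  # assumed cut-off for a "weak" FAQ match

SearchFn = Callable[[str, Optional[int]], Tuple[Optional[str], float]]


def infer_trimester(question: str) -> Optional[int]:
    """Guess the trimester (1-3) from explicit wording or a week count."""
    text = question.lower()
    names = {1: ("first trimester", "1st trimester"),
             2: ("second trimester", "2nd trimester"),
             3: ("third trimester", "3rd trimester")}
    for number, phrases in names.items():
        if any(phrase in text for phrase in phrases):
            return number
    match = re.search(r"\b(\d{1,2})\s*weeks?\b", text)
    if match:
        week = int(match.group(1))
        if 1 <= week <= 13:
            return 1
        if 14 <= week <= 27:
            return 2
        if 28 <= week <= 42:
            return 3
    return None


def answer(question: str,
           guardrail: Callable[[str], Optional[str]],
           faq_search: SearchFn,
           semantic_search: SearchFn) -> dict:
    """Run guardrails, then the fast FAQ path, then semantic retrieval if needed."""
    layer = guardrail(question)
    if layer is not None:
        # Guardrail responses short-circuit retrieval entirely.
        return {"route": "guardrail", "layer": layer}
    trimester = infer_trimester(question)
    hit, score = faq_search(question, trimester)
    route = "faq"
    if hit is None or score < FAST_PATH_THRESHOLD:
        hit, score = semantic_search(question, trimester)
        route = "semantic"
    return {"route": route, "trimester": trimester, "answer": hit}
```

Passing the search functions in as arguments keeps the routing logic testable without a vector index.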
The active retrieval stack is a local hybrid system, not a fine-tuned medical LLM: an embedding model (sentence-transformers/all-MiniLM-L6-v2), a vector database (ChromaDB), a lexical curated-FAQ matcher as a fallback, and a rule-based grounded response chain.
The project also follows an ITU-aligned ML pipeline view (trusted sources, collection, preprocessing, modelling, policy enforcement, distribution, and continuous improvement), with MLflow for experiment tracking, model registry, evaluation, and artifact management.
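To make the lexical half of the hybrid concrete, here is one simple way a curated-FAQ matcher could score cards. The source does not describe the actual matching method, so this sketch swaps in plain token-overlap (Jaccard) scoring over each card's keywords and common questions. The function names and the dictionary layout of a card are assumptions.

```python
import re
from typing import List, Optional, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def lexical_score(question: str, card: dict) -> float:
    """Jaccard overlap between the question and a card's keywords and questions."""
    question_tokens = tokenize(question)
    card_tokens = tokenize(" ".join(card["keywords"] + card["common_questions"]))
    if not question_tokens or not card_tokens:
        return 0.0
    shared = question_tokens & card_tokens
    return len(shared) / len(question_tokens | card_tokens)


def best_card(question: str, cards: List[dict]) -> Tuple[Optional[dict], float]:
    """Return the highest-scoring card and its score, or (None, 0.0) if no cards."""
    if not cards:
        return None, 0.0
    scored = [(lexical_score(question, card), card) for card in cards]
    score, card = max(scored, key=lambda pair: pair[0])
    return card, score
```

A matcher like this needs no model or index, which is what makes it useful alongside embedding search: it is cheap, transparent, and still works when the vector store is unavailable.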
The indexer can ingest curated maternal FAQ JSON cards, maternal guidance cards, and PDF, DOCX, CSV, XLSX/XLS, TXT, MD, and ZIP sources. For PDFs with little extractable text, a same-name .txt or .md sidecar (produced by OCR or text export) is used automatically during the knowledge build. Curated cards carry trimester, topic tags, keywords, common questions, the answer, "when to seek care" guidance, danger signs, and a confidence score.
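The card fields and the sidecar lookup described above might be modelled like this. It is a sketch only: the `FaqCard` class, its field names and types, and the `find_sidecar` helper are hypothetical, and checking `.txt` before `.md` is an assumed preference that the source does not specify.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class FaqCard:
    """Hypothetical shape of one curated card; field names are assumptions."""
    trimester: str
    topic_tags: List[str]
    keywords: List[str]
    common_questions: List[str]
    answer: str
    when_to_seek_care: str
    danger_signs: List[str] = field(default_factory=list)
    confidence: float = 1.0


def find_sidecar(pdf_path: Path) -> Optional[Path]:
    """Return a same-name .txt or .md file next to the PDF, if one exists."""
    for suffix in (".txt", ".md"):  # assumed order of preference
        candidate = pdf_path.with_suffix(suffix)
        if candidate.exists():
            return candidate
    return None
```

During a knowledge build, an indexer could call `find_sidecar` whenever a PDF yields too little text and index the sidecar's contents in its place.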
Python · Streamlit · ChromaDB · sentence-transformers · Pandas · PyPDF · python-docx · OpenPyXL · XLRD · LangChain-core. The full app runs as a Streamlit chat experience; this portfolio sandbox is a browser-only re-implementation of the same retrieval-and-guardrail behaviour so it can be demonstrated safely on static hosting.
Priorities for wider deployment: enrich the knowledge base with official WHO/CDC and local ministry-of-health guidance; expand curated cards for medicine safety, vaccines, breastfeeding, mental health, and newborn care; add multilingual support (English plus Swahili and other local languages); add OCR and text-cleaning for scanned PDFs; and build evaluation sets and a clinician review workflow.
For scale: an API layer for mobile/WhatsApp/web/health-worker integration, stronger semantic reranking, an answer-quality and safety evaluation dashboard, monitoring of high-risk question patterns, and offline-first deployment.