feat(rag): replace torch with onnxruntime to fit 512MB RAM

Davemaina1 · 2026-05-14 · 7a0da671

Removes torch (1.5GB) and sentence-transformers entirely. Runs the same
all-MiniLM-L6-v2 model from its ONNX export (~90MB) using onnxruntime,
tokenizers, and huggingface-hub. Drops the CrossEncoder reranker; RRF fusion
alone is sufficient for the testing phase. Estimated memory: ~340MB.
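
For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal
sketch of the general technique, not the code from this commit. The function
name `rrf_fuse`, the constant `k=60` (a commonly used default), and the two
example rankings are assumptions for illustration; the commit does not state
which retrievers are fused or what constant is used.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from two retrievers (e.g. one dense, one keyword-based).
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([dense, sparse])
print(fused)
```

Because RRF only needs rank positions, it adds no model weights to memory,
unlike a cross-encoder reranker.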

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Repository Davemaina1/iroh_
Author Davemaina1 <dmain7015@gmail.com>
Parents 2b142754
Stats 3 files changed, +121, -73
Part of RAG Python sidecar - memory-engineered for 512MB free tier
