feat(rag): replace torch with onnxruntime to fit 512MB RAM
Eliminates torch (1.5GB) and sentence-transformers entirely. Uses onnxruntime + tokenizers + huggingface-hub to run the same all-MiniLM-L6-v2 model via its ONNX export (~90MB).

Drops CrossEncoder reranker - RRF fusion alone is sufficient for the testing phase.

Estimated memory: ~340MB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| Repository | Davemaina1/iroh_ |
|---|---|
| Author | Davemaina1 <dmain7015@gmail.com> |
| Authored | |
| Parents | 2b142754 |
| Stats | 3 files changed, +121, -73 |
| Part of | RAG Python sidecar - memory-engineered for 512MB free tier |
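
The commit message relies on RRF (Reciprocal Rank Fusion) to merge retrieval results once the CrossEncoder reranker is gone. A minimal sketch of that fusion step follows. It is not taken from this commit's diff: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the two example rankings are all assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Hypothetical helper, not the commit's code. Each document scores
    sum(1 / (k + rank)) over every list it appears in, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Assumed inputs: one ranking from an embedding search, one from a keyword search.
dense_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = rrf_fuse([dense_hits, keyword_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Because RRF only needs rank positions, it adds no model weights and negligible memory, which is consistent with the stated goal of fitting in 512MB. A reranker could presumably be reintroduced after this step later without changing the fusion code.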