dropthejase spins document conversion into its own service

Word-to-PDF (and back) gets its own dedicated worker, triggered automatically whenever a file lands in storage.

infrastructure · workflow

Until now, document conversion lived inside the main application. dropthejase pulled it out into a standalone service that runs LibreOffice - the free office suite - inside a lightweight, isolated container. When a user uploads a document, the storage system fires an event and the converter picks it up on its own; the main app no longer has to babysit the process.

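One detail worth flagging for anyone copying the pattern: the converter writes its output back to the same bucket it listens to, so without a guard it converts its own outputs in an endless loop. That is what happened here: the earlier suffix-based trigger rules also matched the PDFs the converter wrote under converted-pdfs/. The fix in the last commit replaces them with a single rule scoped to the documents/ prefix and adds a check in the handler that rejects any key outside documents/. It is the kind of thing teams usually discover the expensive way.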
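The handler-side guard mentioned in the final commit of the log below (reject keys outside documents/) can be sketched roughly as follows. This is an illustration under assumptions, not the fork's actual code: the documents/ prefix comes from the commit message, while the names `isConvertibleKey` and `handleObjectCreated`, the result type, and the trimmed-down event interface are hypothetical and exist only for this example.

```typescript
// Minimal subset of the S3 "Object Created" event that EventBridge delivers.
// Only the fields this sketch reads are declared.
interface S3ObjectCreatedEvent {
  source: string;
  "detail-type": string;
  detail: {
    bucket: { name: string };
    object: { key: string };
  };
}

// Prefix taken from the commit message; everything else is illustrative.
const SOURCE_PREFIX = "documents/";

// Hypothetical helper: true only for keys the converter should process.
function isConvertibleKey(key: string): boolean {
  return key.startsWith(SOURCE_PREFIX);
}

type HandlerResult =
  | { status: "skipped"; reason: string }
  | { status: "accepted"; bucket: string; key: string };

// Hypothetical handler entry point. The real conversion step is omitted.
function handleObjectCreated(event: S3ObjectCreatedEvent): HandlerResult {
  const bucket = event.detail.bucket.name;
  const key = event.detail.object.key;

  // Defense in depth: even if the trigger rule is misconfigured,
  // never process anything the converter itself may have written.
  if (!isConvertibleKey(key)) {
    return { status: "skipped", reason: `key outside ${SOURCE_PREFIX}: ${key}` };
  }
  return { status: "accepted", bucket, key };
}
```

The point of checking in the handler as well as in the trigger rule is that the two fail independently: a later edit to the rule cannot silently reintroduce the loop.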

So what: legal-tech founders running document-heavy products should look here for a working blueprint of cheap, on-demand format conversion, and a reminder of the re-trigger trap that tends to bite teams building it.

View this fork on GitHub →


Commits in this thread

8 commits from dropthejase/louis, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
510081c6 | feat(conversion): scaffold project and copy conversion logic from backend | Jason Lee | 2026-05-06
d42ae59b | feat(conversion): add Lambda handler and production Dockerfile | Jason Lee | 2026-05-06
90c65e48 | Migrate conversion Lambda from Supabase to Aurora RDS Data API | Jason Lee | 2026-05-08
43e9d4dd | feat(infra): enable EventBridge on docs bucket, remove @supabase/supabase-js | Jason Lee | 2026-05-08
b3674db6 | infra: fix IAM permissions, EventBridge PDF rule, deploy-agent authorizer | Jason Lee | 2026-05-11
commit body
- Lambda + AgentCore roles: extend Bedrock resources to wildcard region for
  cross-region inference profiles; add inference-profile/* ARN
- API Lambda role: upgrade sessionsBucket from grantRead to grantReadWrite
  (delete session on chat delete, Phase 5)
- AgentCore role: add s3:ListBucket on sessionsBucket
- AuthStack Identity Pool: add converted-pdfs/ prefix to per-user S3 policy
  and extend ListBucket condition to cover all three prefixes
- ConversionStack: add PDF EventBridge rule (documents/*.pdf trigger)
- deploy-agent.sh: include authorizer-configuration on update-agent-runtime
  (was only set on create); inject TABULAR_AGENT_ARN / MAIN_AGENT_ARN from
  SSM into runtime env vars (best-effort, skipped if not yet deployed)
0ee216da | fix(conversion,chat): EventBridge format, PDF passthrough, delete session cleanup | Jason Lee | 2026-05-11
1a11621d | fix: remove libreoffice-convert from backend Lambda (conversion Lambda handles it) | Jason Lee | 2026-05-11
8afab465 | fix(conversion): stop EventBridge infinite loop on PDF uploads | Jason Lee | 2026-05-13
commit body
Replace 3 separate EventBridge rules (docx/doc/pdf) with single rule
scoped to documents/ prefix only. Previous rules used OR semantics on
prefix+suffix - converted-pdfs/foo.pdf matched the .pdf suffix rule,
causing Lambda to re-trigger on every PDF it wrote. Added guard in
handler to reject keys outside documents/ as defense in depth.
Also upgrades base image to node22 (26.2-node22-x86_64).
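
To make the OR-semantics bug in the last commit body concrete, here is a small simulation. It is a sketch under assumptions: `keyMatches` is a simplified stand-in for how an array of matchers on one field is evaluated (any match wins), not EventBridge's actual engine, and the exact shape of the old rule is inferred from the commit message rather than copied from the repository.

```typescript
// One matcher on an object key: either a prefix test or a suffix test.
type KeyMatcher = { prefix: string } | { suffix: string };

// Simplified stand-in for evaluating an array of matchers on a single
// field: the field matches if ANY matcher in the array matches (OR).
function keyMatches(key: string, matchers: KeyMatcher[]): boolean {
  return matchers.some((m) =>
    "prefix" in m ? key.startsWith(m.prefix) : key.endsWith(m.suffix)
  );
}

// Assumed shape of the old PDF rule: prefix and suffix in one array,
// which reads like "prefix AND suffix" but behaves as "prefix OR suffix".
const oldPdfRule: KeyMatcher[] = [{ prefix: "documents/" }, { suffix: ".pdf" }];

// Shape of the fix as described in the commit: one prefix-only rule.
const newRule: KeyMatcher[] = [{ prefix: "documents/" }];

const converterOutput = "converted-pdfs/foo.pdf";
const userUpload = "documents/contract.docx";

console.log(keyMatches(converterOutput, oldPdfRule)); // true: the loop
console.log(keyMatches(converterOutput, newRule)); // false: loop broken
console.log(keyMatches(userUpload, newRule)); // true: uploads still trigger
```

Scoping by prefix alone means file-type filtering has to happen somewhere else, which fits the commit's description of a handler-side check as a second line of defense.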

Capture this thread into my fork

Download a single Markdown prompt that tells Claude how to port every commit above into your working tree — adapting paths and structure to match your repo. Run it via claude -p < capture-thread-331.md from inside the repo you want the changes in.
