Some data can't leave the building, and shouldn't. This academic medical center wanted the speed of generative AI for a painfully manual job: turning a patient's record into clear, personalized prep and recovery instructions for a procedure. But patient records are about as sensitive as data gets, and sending them to an outside API was a non-starter for the institution and its review boards.

The instructions themselves were a real problem. Clinicians wrote them by hand or pulled from generic templates that ignored the patient's own medications, history, and language, which is exactly where instructions tend to fail.

The challenge

Could a language model personalize clinical instructions straight from the EMR without a single byte of patient data leaving the hospital network? Generic capability was not the hard part. Doing it entirely in-house, fast enough for clinical workflows and safe enough for the review board, was.

The approach

We brought the model to the data instead of the data to the model. Mindcracker stood up a small language model on an on-premises NVIDIA GPU stack, inside the hospital's own firewall. It reads the relevant record directly from the EMR, drafts a personalized set of notes and instructions, and routes them to a clinician to review and sign. Nothing is sent to any external service, ever.
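One way to make "nothing is sent to any external service" a checked rule rather than a promise is to validate the model endpoint before any request is built. The sketch below is illustrative only and is not a description of the deployed system: the `10.0.0.0/8` range, the function name `is_internal_endpoint`, and the example URLs are all assumptions, and a real deployment would typically enforce the same rule at the network layer too.

```python
import ipaddress
from urllib.parse import urlparse

# Assumption: the hospital's internal address range. Replace with the real one.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal_endpoint(url: str) -> bool:
    """Return True only if the URL's host is a literal IP inside an allowed network.

    Hostnames are rejected outright in this sketch, so no DNS lookup is
    needed and a misconfigured public hostname can never pass the check.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP address
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    # Hypothetical endpoints, for illustration only.
    for url in ("http://10.20.30.40:8000/generate", "https://api.example.com/v1"):
        print(url, is_internal_endpoint(url))
```

Rejecting hostnames is a deliberately strict choice for the sketch: it trades convenience for a check that cannot be bypassed by a DNS change.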

01
A model that runs in-house
A right-sized small language model on an on-prem NVIDIA stack: fast enough for clinical workflows, and small enough to own and govern completely.
02
Direct, governed EMR access
The model reads the record inside the network, under the same access controls and audit expectations as any other internal system.
03
Personalized, not generic
Instructions adapt to the individual: their medications, history, and reading level. A patient on metformin gets the guidance that actually applies to them.
04
Clinician in the loop
Every draft is reviewed and signed by a clinician before it reaches a patient. The model writes the first draft; the human owns the decision.
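The four pieces above can be read as a small workflow: build a prompt from the record, let the in-house model draft, and refuse to release anything a clinician has not signed. The following is a minimal sketch under stated assumptions, not the actual implementation. Every name in it (`PatientContext`, `Draft`, `build_prompt`, `draft_instructions`, `sign`, `release`, `stub_generate`) is hypothetical, the record fields are a made-up subset, and the model is a stub standing in for whatever on-prem inference call is really used.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class PatientContext:
    """Hypothetical subset of EMR fields used for personalization."""
    patient_id: str
    procedure: str
    medications: list[str]
    preferred_language: str
    reading_level: str


@dataclass
class Draft:
    patient_id: str
    text: str
    signed_by: Optional[str] = None
    audit: list[str] = field(default_factory=list)

    def log(self, event: str) -> None:
        # Append a timestamped entry to the draft's audit trail.
        self.audit.append(f"{datetime.now(timezone.utc).isoformat()} {event}")


def build_prompt(ctx: PatientContext) -> str:
    meds = ", ".join(ctx.medications) or "none recorded"
    return (
        f"Write {ctx.procedure} prep and recovery instructions in "
        f"{ctx.preferred_language} at a {ctx.reading_level} reading level. "
        f"Current medications: {meds}."
    )


def draft_instructions(ctx: PatientContext, generate: Callable[[str], str]) -> Draft:
    # `generate` stands in for the on-prem model call (assumption).
    draft = Draft(ctx.patient_id, generate(build_prompt(ctx)))
    draft.log("drafted by model")
    return draft


def sign(draft: Draft, clinician_id: str, edited_text: Optional[str] = None) -> Draft:
    # The clinician may correct the draft before signing it.
    if edited_text is not None:
        draft.text = edited_text
    draft.signed_by = clinician_id
    draft.log(f"signed by {clinician_id}")
    return draft


def release(draft: Draft) -> str:
    # Hard gate: an unsigned draft never reaches a patient.
    if draft.signed_by is None:
        raise PermissionError("draft has not been signed by a clinician")
    draft.log("released to patient")
    return draft.text


def stub_generate(prompt: str) -> str:
    # Placeholder for the model; returns a marker string, not clinical text.
    return f"[model draft for prompt: {prompt}]"
```

The design point is that `release` fails closed: the model can only ever produce a `Draft`, and the only path to a patient runs through `sign`.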

The model came to the data. The data never had to take a risk.

[Figure: on-premises architecture in which the EMR feeds a private small language model on a GPU cluster, which drafts notes for a clinician to review, all inside the firewall]
FIG. 02. Every step, from EMR to private SLM to the drafted note, runs inside the hospital firewall. Patient data never leaves the network.

The outcome

Drafting a personalized instruction set went from several minutes of clinician time to seconds, and documentation load dropped across every procedure area that went live. Patients received instructions written for them specifically, and the institution got all of it without taking on the privacy risk of an outside model.

Privacy and capability are usually a trade-off. Here they weren't. The model lives where the data lives.

Because the platform is entirely on-premises, the medical center owns it outright: the model, the data, and the audit trail. No usage meter, no egress, no third party in the loop.