OPEN SOURCE · MEDICAL MODEL · MULTIMODAL
MedGemma
Google's open-weight medical Gemma model — released as 4B and 27B variants (now all multimodal under MedGemma 1.5), tuned for medical text and image reasoning, available under the Health AI Developer Foundations license, and runnable directly through Ollama or vLLM. The buyer-relevant medical open-weight model in 2026, with real production deployments at Taiwan's NHI and Qmed Asia.
MedGemma 1.5 ships as a 4B multimodal model and a 27B model with both text-only and multimodal versions. All variants accept text and image inputs; outputs are text. Built on Gemma 3 base.
MedGemma 4B runs comfortably on a single workstation GPU (RTX 4090, A6000) or M-series Mac. The 27B variant fits on a single A100 80GB at fp16 or a workstation GPU at 4-bit quantization.
Taiwan's National Health Insurance Administration is using MedGemma to extract structured data from 30,000+ pathology reports for preoperative lung-cancer-surgery assessment — one of the largest disclosed open-medical-model deployments.
MedGemma 1.5 4B adds high-dimensional imaging support — CT, MRI, whole-slide histopathology, longitudinal chest X-ray, anatomical localization — to the original chest X-ray and dermatology coverage.
What MedGemma actually is
MedGemma is a Google DeepMind open-weight collection of medical-tuned models built on Gemma 3, intended as the foundation for private medical-AI applications. The release ships through Hugging Face (google/medgemma-4b-it, google/medgemma-27b-it, google/medgemma-27b-text-it) under the Health AI Developer Foundations license, which permits commercial use with the standard caveat that the model is not clinically validated and outputs require human review.
For a hospital, MedGemma matters for two reasons. First, it is the most credible open-weight medical model of 2025–26 — open-weight, downloadable, runnable behind your own firewall, with explicit medical-tuning Google has published in detail. Second, the multimodal variants make medical-image-plus-text reasoning runnable locally for the first time at workstation hardware scale: chest X-ray interpretation, dermatology image triage, EHR text extraction, and histopathology summarization can all be tested in a hospital environment without sending images to a vendor cloud.
What MedGemma is not: a clinical-grade replacement for radiologist or pathologist judgment. Google is explicit that outputs are preliminary, require independent verification, and are not intended to directly inform diagnosis or treatment decisions. The buyer-relevant use cases are upstream — drafting, triage, summarization, extraction — not autonomous decision-making.
Deployment posture
MedGemma is delivered as Hugging Face model weights. The deployment path is the same as any open-weight LLM: pull the weights once, serve through vLLM for production, through Ollama for pilots, or through llama.cpp for CPU / Apple Silicon. The model is small enough that the runtime cost is a single GPU rather than a multi-GPU cluster, which is what makes it usable for departmental and pilot deployments.
MedGemma 4B (multimodal) is the workstation-friendly entry point. MedGemma 27B (text and multimodal) is the higher-quality option that still fits on a single A100 80GB at fp16.
Runs in vLLM, Ollama, llama.cpp, TGI, and any other engine that supports Gemma 3 architecture. GGUF builds are available within days of upstream release.
Accepts image and text inputs, produces text outputs. Imaging coverage includes chest X-ray, CT, MRI, whole-slide histopathology, dermatology images, and ophthalmology.
Open-weight under Google's HAI-DEF license. Permits commercial use; explicit non-clinical-validation disclaimer. Buyers should read the license and the model card before production use.
Healthcare fit
MedGemma is the buyer-relevant open-weight medical model when the workflow needs medical-tuned reasoning behind a hospital firewall — particularly for image-plus-text use cases that general-purpose Llama or Mistral handle less well. Real-world deployments include Taiwan NHI's extraction of structured data from 30,000+ pathology reports, Qmed Asia's askCPG conversational interface to 150+ Malaysian clinical practice guidelines, and a growing list of hospital pilots covering preoperative assessment, radiology report drafting, and EHR text extraction.
- checkGood fit: medical-document understanding — extraction of structured fields from unstructured pathology reports, EHR text, lab reports, and discharge summaries.
- checkGood fit: medical-image triage tasks — chest X-ray reading, dermatology image classification, ophthalmology screening — where the model produces a draft for human review.
- checkGood fit: clinician-facing reasoning surfaces (patient interviewing, triaging, clinical decision support, summarization) where the model output stays preliminary and is reviewed before any clinical action.
- closeBad fit: autonomous clinical decision-making, diagnostic systems without a clinician in the loop, or any workflow that treats MedGemma output as clinical truth. Google is explicit that outputs require independent verification.
- closeBad fit: ambient-scribe-grade transcription — MedGemma is a reasoning model, not a speech model. Pair with Whisper for the audio surface.
Privacy and governance
MedGemma's open-weight nature is the privacy advantage: the model is downloadable, runnable behind a hospital firewall with no outbound API, and auditable at the weight level. That puts MedGemma in the same data-handling posture as any other open-weight model — the runtime (Ollama, vLLM, llama.cpp) determines the actual privacy boundary, not the model itself. For HIPAA / PIPEDA / PHIPA / Quebec Law 25 environments, MedGemma is one of the few medical-tuned models that is even an option.
Governance is the operator's responsibility. The Health AI Developer Foundations license is permissive but the non-clinical-validation disclaimer is binding: any clinical workflow using MedGemma needs explicit human review, a documented evaluation rubric, allowed-data-classes policy, and a stop / restart governance memo. Moneli Automation's typical use of MedGemma is for the upstream tasks where its medical tuning earns its keep (text extraction, image triage, summarization) inside a workflow where the clinician is the final reviewer.
Strengths and limitations
Most credible open-weight medical model in 2026. Multimodal variants make medical-image reasoning runnable locally. Workstation-fit footprint (4B) and A100-fit footprint (27B). Open-weight, downloadable, auditable. Production deployments at named institutions (NHI Taiwan, Qmed Asia) validate the operational story. Backed by Google DeepMind's ongoing release cadence (MedGemma 1.5 added imaging breadth in 2026).
Not clinically validated. Google's own documentation is explicit that outputs require independent verification and are not intended for direct clinical use. Smaller than frontier general-purpose models (Llama 3.1 405B, GPT-5-class); use it where medical tuning earns its keep, not as a default model. Multimodal coverage is broader than rivals but not radiologist-grade — fine for triage drafting, not for autonomous reads. License is permissive but specific; read it before commercial deployment.
Where MedGemma fits in a hospital stack
| Layer | What MedGemma contributes | What still has to be solved |
|---|---|---|
| Model choice | Open-weight medical-tuned reasoning with multimodal support — the best buyer option for image-plus-text workflows behind a firewall. | Held-out evaluation against the specific clinical workflow; comparison against general-purpose Llama 3.x on the same task. |
| Runtime | None — pair with vLLM (production), Ollama (pilot), or llama.cpp (CPU/Apple). | Capacity planning, observability, gateway-side identity. |
| Modalities | Image + text input, text output. Chest X-ray, CT, MRI, whole-slide histopathology, dermatology. | Imaging pipeline (DICOM ingestion, anonymization, slice selection); evaluation per modality. |
| Audio | None — pair with Whisper for transcription. | Speech-to-text pipeline, speaker attribution, retention policy. |
| Governance / audit | None at the model layer — every workflow needs clinician-review enforcement. | Review-required workflow, audit retention, stop conditions, evaluation harness. |
MedGemma is the model. It is the right answer when the workflow is medical-tuned reasoning on a hospital-owned runtime — and one of the few open-weight models that earns the "medical" descriptor honestly. Use it as the model layer, not as the whole solution.
Quick facts
| Publisher | Google DeepMind / Google Health AI, under the Health AI Developer Foundations program. |
| License | Health AI Developer Foundations license (open-weight, commercial-use-permitted, explicit non-clinical-validation disclaimer). |
| Variants | MedGemma 4B (multimodal). MedGemma 27B text-only. MedGemma 27B multimodal. All MedGemma 1.5 variants are multimodal. |
| Base architecture | Gemma 3. Compatible with any Gemma-3 runtime (vLLM, Ollama, llama.cpp, TGI). |
| Modalities | Text + image input; text output. Imaging: chest X-ray, CT, MRI, whole-slide histopathology, dermatology, ophthalmology. |
| Notable deployments | Taiwan National Health Insurance Administration (preoperative lung-cancer surgery assessment, 30,000+ pathology reports). Qmed Asia (askCPG, Malaysian clinical practice guidelines). |
| Typical hospital sizing | 4B: single workstation GPU or M-series Mac. 27B: single A100 80GB at fp16, or workstation GPU at 4-bit quantization. |
| Distribution | Hugging Face: google/medgemma-4b-it, google/medgemma-27b-it. GGUF builds available through community channels. |
Use MedGemma when the workflow demands medical tuning
MedGemma is the model to reach for when the workflow specifically benefits from medical pre-training and multimodal medical-image reasoning. Moneli Automation's typical pattern is MedGemma as the model layer behind vLLM or Ollama, paired with Whisper for audio and Qdrant or OpenSearch for retrieval, inside a governance wrapper that enforces clinician review.
send Request a WalledCare pilot arrow_back All open-source profiles
Further reading
- MedGemma — Google DeepMind
- MedGemma — Health AI Developer Foundations documentation
- MedGemma 1.5 + MedASR announcement (Google Research)
- MedGemma 1.5 model card
- MedGemma 27B IT on Hugging Face
- MedGemma 4B IT on Hugging Face
- Google Research overview of MedGemma
- Ollama profile — quickest path to running MedGemma locally
- vLLM profile — production-serving path for MedGemma at scale