MedGemma — Google's Open Medical Gemma for Private Clinical AI

Variants

4B + 27B

MedGemma 1.5 ships as a 4B multimodal model and a 27B model with both text-only and multimodal versions. All variants accept text and image inputs; outputs are text. Built on Gemma 3 base.

Hardware fit

Workstation+

MedGemma 4B runs comfortably on a single workstation GPU (RTX 4090, A6000) or M-series Mac. The 27B variant fits on a single A100 80GB at fp16 or a workstation GPU at 4-bit quantization.

Real deployments

NHI Taiwan

Taiwan's National Health Insurance Administration is using MedGemma to extract structured data from 30,000+ pathology reports for preoperative lung-cancer-surgery assessment — one of the largest disclosed open-medical-model deployments.

Imaging coverage

CT / MRI / WSI

MedGemma 1.5 4B adds high-dimensional imaging support — CT, MRI, whole-slide histopathology, longitudinal chest X-ray, anatomical localization — to the original chest X-ray and dermatology coverage.

What MedGemma actually is

MedGemma is a Google DeepMind open-weight collection of medical-tuned models built on Gemma 3, intended as the foundation for private medical-AI applications. The release ships through Hugging Face (google/medgemma-4b-it, google/medgemma-27b-it, google/medgemma-27b-text-it) under the Health AI Developer Foundations license, which permits commercial use with the standard caveat that the model is not clinically validated and outputs require human review.

For a hospital, MedGemma matters for two reasons. First, it is the most credible open-weight medical model of 2025–26 — open-weight, downloadable, runnable behind your own firewall, with explicit medical-tuning Google has published in detail. Second, the multimodal variants make medical-image-plus-text reasoning runnable locally for the first time at workstation hardware scale: chest X-ray interpretation, dermatology image triage, EHR text extraction, and histopathology summarization can all be tested in a hospital environment without sending images to a vendor cloud.

What MedGemma is not: a clinical-grade replacement for radiologist or pathologist judgment. Google is explicit that outputs are preliminary, require independent verification, and are not intended to directly inform diagnosis or treatment decisions. The buyer-relevant use cases are upstream — drafting, triage, summarization, extraction — not autonomous decision-making.

Deployment posture

MedGemma is delivered as Hugging Face model weights. The deployment path is the same as any open-weight LLM: pull the weights once, serve through vLLM for production, through Ollama for pilots, or through llama.cpp for CPU / Apple Silicon. The model is small enough that the runtime cost is a single GPU rather than a multi-GPU cluster, which is what makes it usable for departmental and pilot deployments.

SIZE

4B and 27B

MedGemma 4B (multimodal) is the workstation-friendly entry point. MedGemma 27B (text and multimodal) is the higher-quality option that still fits on a single A100 80GB at fp16.

RUNTIME

Any Gemma-compatible engine

Runs in vLLM, Ollama, llama.cpp, TGI, and any other engine that supports Gemma 3 architecture. GGUF builds are available within days of upstream release.

MODALITIES

Text + image

Accepts image and text inputs, produces text outputs. Imaging coverage includes chest X-ray, CT, MRI, whole-slide histopathology, dermatology images, and ophthalmology.

LICENSE

Health AI Developer Foundations

Open-weight under Google's HAI-DEF license. Permits commercial use; explicit non-clinical-validation disclaimer. Buyers should read the license and the model card before production use.

Healthcare fit

MedGemma is the buyer-relevant open-weight medical model when the workflow needs medical-tuned reasoning behind a hospital firewall — particularly for image-plus-text use cases that general-purpose Llama or Mistral handle less well. Real-world deployments include Taiwan NHI's extraction of structured data from 30,000+ pathology reports, Qmed Asia's askCPG conversational interface to 150+ Malaysian clinical practice guidelines, and a growing list of hospital pilots covering preoperative assessment, radiology report drafting, and EHR text extraction.

checkGood fit: medical-document understanding — extraction of structured fields from unstructured pathology reports, EHR text, lab reports, and discharge summaries.
checkGood fit: medical-image triage tasks — chest X-ray reading, dermatology image classification, ophthalmology screening — where the model produces a draft for human review.
checkGood fit: clinician-facing reasoning surfaces (patient interviewing, triaging, clinical decision support, summarization) where the model output stays preliminary and is reviewed before any clinical action.
closeBad fit: autonomous clinical decision-making, diagnostic systems without a clinician in the loop, or any workflow that treats MedGemma output as clinical truth. Google is explicit that outputs require independent verification.
closeBad fit: ambient-scribe-grade transcription — MedGemma is a reasoning model, not a speech model. Pair with Whisper for the audio surface.

Privacy and governance

MedGemma's open-weight nature is the privacy advantage: the model is downloadable, runnable behind a hospital firewall with no outbound API, and auditable at the weight level. That puts MedGemma in the same data-handling posture as any other open-weight model — the runtime (Ollama, vLLM, llama.cpp) determines the actual privacy boundary, not the model itself. For HIPAA / PIPEDA / PHIPA / Quebec Law 25 environments, MedGemma is one of the few medical-tuned models that is even an option.

Governance is the operator's responsibility. The Health AI Developer Foundations license is permissive but the non-clinical-validation disclaimer is binding: any clinical workflow using MedGemma needs explicit human review, a documented evaluation rubric, allowed-data-classes policy, and a stop / restart governance memo. Moneli Automation's typical use of MedGemma is for the upstream tasks where its medical tuning earns its keep (text extraction, image triage, summarization) inside a workflow where the clinician is the final reviewer.

Strengths and limitations

STRENGTHS

Why hospital stacks pick it

Most credible open-weight medical model in 2026. Multimodal variants make medical-image reasoning runnable locally. Workstation-fit footprint (4B) and A100-fit footprint (27B). Open-weight, downloadable, auditable. Production deployments at named institutions (NHI Taiwan, Qmed Asia) validate the operational story. Backed by Google DeepMind's ongoing release cadence (MedGemma 1.5 added imaging breadth in 2026).

LIMITATIONS

Where it does not fit

Not clinically validated. Google's own documentation is explicit that outputs require independent verification and are not intended for direct clinical use. Smaller than frontier general-purpose models (Llama 3.1 405B, GPT-5-class); use it where medical tuning earns its keep, not as a default model. Multimodal coverage is broader than rivals but not radiologist-grade — fine for triage drafting, not for autonomous reads. License is permissive but specific; read it before commercial deployment.

Where MedGemma fits in a hospital stack

Layer	What MedGemma contributes	What still has to be solved
Model choice	Open-weight medical-tuned reasoning with multimodal support — the best buyer option for image-plus-text workflows behind a firewall.	Held-out evaluation against the specific clinical workflow; comparison against general-purpose Llama 3.x on the same task.
Runtime	None — pair with vLLM (production), Ollama (pilot), or llama.cpp (CPU/Apple).	Capacity planning, observability, gateway-side identity.
Modalities	Image + text input, text output. Chest X-ray, CT, MRI, whole-slide histopathology, dermatology.	Imaging pipeline (DICOM ingestion, anonymization, slice selection); evaluation per modality.
Audio	None — pair with Whisper for transcription.	Speech-to-text pipeline, speaker attribution, retention policy.
Governance / audit	None at the model layer — every workflow needs clinician-review enforcement.	Review-required workflow, audit retention, stop conditions, evaluation harness.

MedGemma is the model. It is the right answer when the workflow is medical-tuned reasoning on a hospital-owned runtime — and one of the few open-weight models that earns the "medical" descriptor honestly. Use it as the model layer, not as the whole solution.

Quick facts

Publisher	Google DeepMind / Google Health AI, under the Health AI Developer Foundations program.
License	Health AI Developer Foundations license (open-weight, commercial-use-permitted, explicit non-clinical-validation disclaimer).
Variants	MedGemma 4B (multimodal). MedGemma 27B text-only. MedGemma 27B multimodal. All MedGemma 1.5 variants are multimodal.
Base architecture	Gemma 3. Compatible with any Gemma-3 runtime (vLLM, Ollama, llama.cpp, TGI).
Modalities	Text + image input; text output. Imaging: chest X-ray, CT, MRI, whole-slide histopathology, dermatology, ophthalmology.
Notable deployments	Taiwan National Health Insurance Administration (preoperative lung-cancer surgery assessment, 30,000+ pathology reports). Qmed Asia (askCPG, Malaysian clinical practice guidelines).
Typical hospital sizing	4B: single workstation GPU or M-series Mac. 27B: single A100 80GB at fp16, or workstation GPU at 4-bit quantization.
Distribution	Hugging Face: google/medgemma-4b-it, google/medgemma-27b-it. GGUF builds available through community channels.

Use MedGemma when the workflow demands medical tuning

MedGemma is the model to reach for when the workflow specifically benefits from medical pre-training and multimodal medical-image reasoning. Moneli Automation's typical pattern is MedGemma as the model layer behind vLLM or Ollama, paired with Whisper for audio and Qdrant or OpenSearch for retrieval, inside a governance wrapper that enforces clinician review.

send Request a WalledCare pilot arrow_back All open-source profiles