MedGemma 1.5 is an open-source AI model from Google that analyzes three-dimensional medical scans like CT and MRI images. Released on January 13, 2026, this updated version represents a major step forward in medical AI technology.
Unlike earlier medical AI tools that could only analyze flat X-rays or single images, MedGemma 1.5 is the first publicly released open multimodal large language model that can interpret high-dimensional medical data while retaining the ability to interpret general 2D images and text. This means doctors and developers can build tools that understand complete CT scans, MRI volumes, and tissue samples.
This guide explains what MedGemma 1.5 does, how it works, and how healthcare developers can start using it. Whether you work in radiology, pathology, or healthcare software development, you'll learn how this AI model can help analyze medical images more efficiently.
What is MedGemma 1.5?
MedGemma 1.5 is a specialized version of Google's Gemma 3 AI model that has been trained specifically on medical images and text. The model was trained on chest X-rays, histopathology, dermatology, and fundus images; CT and MRI volumes; medical text and documents; and electronic health record (EHR) data.
The "1.5" version brings several major improvements over the original MedGemma 1 model. It can now process entire 3D medical scans instead of just single slices. It understands medical reports and patient records, and it can even compare how a patient's condition has changed over time.
Key Capabilities
MedGemma 1.5 4B expands support for high-dimensional medical imaging, including interpretation of three-dimensional volume representations of computed tomography (CT) and magnetic resonance imaging (MRI), whole-slide histopathology imaging (WSI), longitudinal medical imaging, and anatomical localization.
The model comes in a 4 billion parameter size, which is small enough to run on a single GPU but powerful enough for real medical applications. This makes it practical for hospitals and research institutions that want to test AI tools without massive computing resources.
How MedGemma 1.5 Works: The Technology Behind Medical AI
MedGemma 1.5 uses a multimodal architecture, which means it can understand both images and text at the same time. This is important because medical diagnosis requires looking at scans while also reading patient history and lab reports.
The model has two main components:
Image Encoder: This part processes medical images using a technology called SigLIP. It has been specifically trained on medical images to understand what normal and abnormal anatomy looks like. When you feed it a CT scan, it converts the visual information into a format the AI can understand.
Language Model Decoder: This component takes the visual information from the image encoder and combines it with any text you provide (like questions or patient history). It then generates human-readable responses explaining what it sees in the images.
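To make the two-part design concrete, here is a minimal sketch of inspecting those components after loading the model's configuration from Hugging Face. It assumes the checkpoint follows the standard transformers multimodal layout (a vision encoder config plus a text decoder config); the model id comes from the access section later in this guide.

```python
# Minimal sketch: inspect the encoder/decoder split of the checkpoint.
# Assumes the standard transformers multimodal config layout applies.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/medgemma-1.5-4b-it")
print(type(config.vision_config).__name__)  # SigLIP-style image encoder config
print(type(config.text_config).__name__)    # Gemma-style language decoder config
```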
The Training Process
MedGemma is based on Gemma 3 and has been further trained on medical images and text. Google used a combination of publicly available medical datasets and licensed, de-identified patient data from hospitals and diagnostic centers.
The training included:
- Thousands of CT scans from different body parts (head, chest, abdomen)
- Multi-parametric MRI studies
- Whole slide histopathology images
- Electronic health records in FHIR format
- Medical question-answer pairs
- Radiology reports and lab documents
This diverse training allows the model to recognize patterns across different imaging types and medical specialties.
Performance Improvements Over MedGemma 1
The 1.5 update brought significant accuracy gains across multiple medical tasks. These improvements make the model more reliable for real-world healthcare applications.
| Task Type | MedGemma 1 | MedGemma 1.5 | Improvement (absolute) |
|---|---|---|---|
| CT Disease Classification (accuracy) | 58% | 61% | +3 points |
| MRI Disease Findings (accuracy) | 51% | 65% | +14 points |
| Histopathology Analysis (ROUGE-L) | 0.02 | 0.49 | +0.47 |
| Anatomical Localization (IoU) | 3% | 38% | +35 points |
| Longitudinal Chest X-ray Review (accuracy) | 61% | 66% | +5 points |
| Medical Text Reasoning (MedQA, accuracy) | 64% | 69% | +5 points |
| EHR Question Answering (accuracy) | 68% | 90% | +22 points |
On internal benchmarks, the baseline absolute accuracy of MedGemma 1.5 improved by 3 percentage points over MedGemma 1 on classification of disease-related CT findings and by 14 points on classification of disease-related MRI findings. The histopathology improvement is particularly dramatic, jumping from nearly zero to matching specialized models built only for tissue analysis.
What Medical Tasks Can MedGemma 1.5 Handle?
MedGemma 1.5 excels at several specific medical imaging and analysis tasks. Understanding what it can and cannot do helps developers build appropriate applications.
3D Medical Imaging Analysis
The model can process entire CT and MRI scan volumes. Instead of looking at one slice at a time, it sees multiple slices together and understands the three-dimensional structure of organs and abnormalities.
For example, if you give it a chest CT scan with 150 slices, it can analyze all the slices to identify lung nodules, measure their size, and describe their characteristics. This mimics how radiologists actually read scans.
Whole Slide Histopathology
Pathologists examine tissue samples under microscopes, and these samples are often digitized into extremely large images. MedGemma 1.5 can analyze multiple patches from these whole slide images simultaneously.
On an internal diverse benchmark of histopathology slides and associated findings, the fidelity of MedGemma 1.5's predictions (measured by ROUGE-L) improved by 0.47 over MedGemma 1, matching the score achieved by the task-specific PolyPath model.
Longitudinal Image Comparison
One of the most clinically useful features is the ability to compare how a patient's condition has changed over time. MedGemma 1.5 can interpret chest X-rays in the context of prior images, comparing current versus historical scans.
This is exactly what doctors need when they want to know if a lung condition is getting better, staying stable, or worsening.
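In code, a longitudinal comparison can be expressed by passing both studies in a single message, reusing the pipeline and chat format shown in the implementation guide below. The interleaved prior/current labeling here is an assumption about effective prompting, not a documented requirement.

```python
# Hypothetical longitudinal prompt: prior and current studies interleaved
# with text labels. `pipe` is the pipeline built in the implementation guide;
# prior_image and current_image are PIL images loaded as in the basic example.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Prior chest X-ray:"},
            {"type": "image", "image": prior_image},
            {"type": "text", "text": "Current chest X-ray:"},
            {"type": "image", "image": current_image},
            {"type": "text", "text": "Has the condition improved, worsened, or remained stable?"},
        ],
    }
]
output = pipe(text=messages, max_new_tokens=500)
```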
Anatomical Localization
The model can perform bounding box-based localization of anatomical features and findings in chest X-rays. This means it can point to specific areas in an image and say "the abnormality is located here."
Medical Document Understanding
Beyond images, the model can extract structured information from medical documents. Lab report extraction improved, with retrieval scores rising to 78 percent from 60 percent. It can pull out test values, units, and reference ranges from unstructured PDF reports.
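A simple way to exploit this is to prompt for structured output and parse it. The JSON schema below is an illustrative assumption, not a format MedGemma is documented to emit, so the parse should always be validated; `pipe` is the pipeline built in the implementation guide below.

```python
# Hypothetical structured-extraction prompt over a scanned lab report image.
# The requested JSON schema is an assumption; validate real outputs.
import json

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": lab_report_image},  # PIL image of the report
            {"type": "text", "text": (
                "Extract every test as JSON: "
                '[{"test": ..., "value": ..., "unit": ..., "reference_range": ...}]. '
                "Return only JSON."
            )},
        ],
    }
]
output = pipe(text=messages, max_new_tokens=1000)
try:
    results = json.loads(output[0]["generated_text"][-1]["content"])
except json.JSONDecodeError:
    results = None  # model output was not valid JSON; handle gracefully
```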
Electronic Health Record Analysis
Beyond imaging, MedGemma can analyze electronic health records. Tasks covered in MedGemma's training include visual question answering on medical images, document understanding, and answering textual medical questions; as the benchmark table above shows, EHR question-answering accuracy rose from 68% to 90% in version 1.5.
The model understands FHIR (Fast Healthcare Interoperability Resources) format, which is the standard for exchanging healthcare information electronically.
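Because FHIR resources are JSON, they can be passed to the model as plain text. The following sketch uses a made-up Observation resource and reuses the pipeline built in the implementation guide below.

```python
# Hypothetical text-only question over a FHIR Observation (values invented).
import json

observation = {
    "resourceType": "Observation",
    "code": {"text": "Hemoglobin A1c"},
    "valueQuantity": {"value": 6.8, "unit": "%"},
    "referenceRange": [{"high": {"value": 5.7, "unit": "%"}}],
}
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "Is this result above the reference range?\n"
                                 + json.dumps(observation, indent=2)}
    ]}
]
output = pipe(text=messages, max_new_tokens=200)
```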
How to Use MedGemma 1.5: Implementation Guide
Getting started with MedGemma 1.5 requires some technical knowledge, but Google has made it accessible through multiple platforms.
Where to Access the Model
You can download and use MedGemma 1.5 in three ways:
- Hugging Face: The model is available at `google/medgemma-1.5-4b-it` for download and local use
- Google Cloud Vertex AI: For production deployment with enterprise support
- Model Garden: Google's managed platform with specialized tools for handling large medical images
The model is free for both research and commercial use under the Health AI Developer Foundations terms of use.
Basic Implementation Example
Here's a simple Python code example to get started with MedGemma 1.5 for analyzing a chest X-ray:
```python
from transformers import pipeline
from PIL import Image
import requests
import torch

# Create the image-text-to-text pipeline
pipe = pipeline(
    "image-text-to-text",
    model="google/medgemma-1.5-4b-it",
    torch_dtype=torch.bfloat16,
    device="cuda",  # Use "cpu" if no GPU is available
)

# Load a medical image
image_url = "https://example.com/chest_xray.png"
image = Image.open(requests.get(image_url, stream=True).raw)

# Create a question about the image
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe any abnormal findings in this chest X-ray"},
        ],
    }
]

# Get the AI's analysis
output = pipe(text=messages, max_new_tokens=2000)
print(output[0]["generated_text"][-1]["content"])
```
This code loads the model, feeds it a chest X-ray image, and asks it to describe what it sees. The model will generate a text response describing any findings.
Working with CT and MRI Volumes
For 3D scans like CT or MRI, you need to process multiple slices. Google Cloud deployments now support full DICOM files, which is a big deal for real hospital imaging systems.
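One common pattern is to subsample a DICOM series into a manageable set of 2D slices and pass them to the model as multiple images. The sketch below uses pydicom; the slice count and lung-window values are assumptions, and Google's CT notebook is the authoritative reference for how slices should actually be prepared.

```python
# Hypothetical CT preprocessing: sort a DICOM series, evenly subsample it,
# and window each slice to an 8-bit image. Window values assume a lung window.
from pathlib import Path
import numpy as np
import pydicom
from PIL import Image

def load_ct_slices(dicom_dir, num_slices=16, center=-600.0, width=1500.0):
    files = sorted(
        Path(dicom_dir).glob("*.dcm"),
        key=lambda p: int(pydicom.dcmread(p, stop_before_pixels=True).InstanceNumber),
    )
    picks = np.linspace(0, len(files) - 1, num_slices).astype(int)
    slices = []
    for i in picks:
        ds = pydicom.dcmread(files[i])
        hu = ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
        lo, hi = center - width / 2, center + width / 2
        arr = np.clip((hu - lo) / (hi - lo), 0.0, 1.0) * 255.0
        slices.append(Image.fromarray(arr.astype(np.uint8)).convert("RGB"))
    return slices

# Each slice can then be added to the message as its own {"type": "image"} entry.
```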
Google provides specialized tutorial notebooks for:
- CT interpretation
- Histopathology analysis
- Anatomical localization
- Longitudinal imaging
- Fine-tuning with reinforcement learning
These notebooks show best practices for handling large medical image files efficiently.
System Requirements
The 4 billion parameter model is designed to be compute-efficient. You can run it on:
- A single NVIDIA GPU with at least 16GB of memory
- CPU mode (slower but works without a GPU)
- Google Cloud infrastructure for production scale
For testing and development, a single modern GPU is sufficient. For hospital-wide deployment, cloud infrastructure is recommended.
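The 16GB figure follows from simple arithmetic: 4 billion parameters at 2 bytes each in bfloat16 is roughly 8GB for the weights alone, leaving headroom for activations and the KV cache.

```python
# Back-of-envelope VRAM estimate for a 4B-parameter model in bfloat16.
params = 4e9
bytes_per_param = 2                          # bfloat16
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~8 GB, before activations/KV cache
```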
Important Limitations and Safety Considerations
MedGemma 1.5 is a powerful tool, but it has important limitations that every user must understand.
Not a Diagnostic Tool
MedGemma is not intended to be used without appropriate validation, adaptation, and meaningful modification by developers for their specific use case. The outputs generated by these models are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications.
This is critical. The model is a starting point for developers, not a finished product for doctors to use directly with patients.
Requires Validation and Fine-Tuning
The baseline model provides good performance, but developers should validate their adapted model's performance and make necessary improvements before deploying in a production environment.
This means:
- Testing on your specific use case with your own data
- Measuring accuracy on real cases from your institution (a minimal harness sketch follows this list)
- Fine-tuning the model with domain-specific examples
- Having medical professionals review outputs
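As a starting point, a validation harness can be a short loop over labeled cases from your own institution. This is a minimal sketch: the binary normal/abnormal task and the substring-match scoring are assumptions you would replace with your real task and metrics.

```python
# Minimal validation-loop sketch over locally held, labeled cases.
# `cases` is a hypothetical list of (PIL.Image, "normal" | "abnormal") pairs.
def evaluate(pipe, cases):
    correct = 0
    for image, expected in cases:
        messages = [{"role": "user", "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Answer with one word: is this study normal or abnormal?"},
        ]}]
        out = pipe(text=messages, max_new_tokens=10)
        answer = out[0]["generated_text"][-1]["content"].strip().lower()
        correct += int(expected in answer)
    return correct / len(cases)
```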
Data Contamination Concerns
When evaluating the generalization capabilities of a large model like MedGemma in a medical context, there is a risk of data contamination, where the model might have inadvertently seen related medical information during its pre-training.
To address this, developers should test the model on internal datasets that were never made publicly available. This ensures the model is truly generalizing rather than just memorizing training examples.
Privacy and Compliance
MedGemma itself is not HIPAA compliant. HIPAA compliance depends on your entire system, including how you handle patient data, access controls, encryption, and audit logging.
Running the model locally can help with privacy because patient data doesn't need to leave your environment, but you still must implement proper safeguards and de-identification procedures.
Practical Use Cases for Healthcare Developers
Healthcare organizations are adopting AI rapidly: with an adoption rate of 22%, healthcare is moving 2.2 times faster than the broader economy. Here are realistic applications where MedGemma 1.5 can add value.
Radiology Workflow Assistance
Radiologists can use MedGemma-based tools to:
- Generate preliminary descriptions of scans for quality control
- Compare current scans with prior studies automatically
- Flag potential abnormalities for closer review
- Extract structured data from radiology reports
These applications don't replace radiologists but help them work more efficiently.
Pathology Research and Labeling
Research institutions can use the model to:
- Label histopathology datasets for training other AI models
- Extract findings from pathology reports at scale
- Identify tissue samples with specific characteristics
- Build searchable databases of tissue images
Taiwan's National Health Insurance Administration applied MedGemma to analyze 30,000 pathology reports for lung cancer surgical decision-making, demonstrating real-world policy-level implementation.
Medical Education Tools
MedGemma can be used to build tools that help medical students sharpen their chest X-ray interpretation skills. Students can practice with the AI providing feedback and explanations.
Clinical Decision Support Systems
Developers can build tools that:
- Summarize patient charts from EHR data
- Answer questions about medical guidelines
- Generate pre-visit reports by analyzing patient history
- Extract relevant information from previous visits
Triage and Prioritization
Emergency departments and imaging centers can use AI to help prioritize which cases need urgent attention. The model can analyze incoming scans and flag those with concerning findings for immediate review.
Customization and Fine-Tuning Options
MedGemma 1.5 is designed as a foundation model that you adapt to your specific needs. Google provides several approaches for customization.
Prompt Engineering
For certain use cases, MedGemma's baseline performance may be sufficient after careful prompting, potentially including few-shot examples of desirable example responses within the prompt.
This means you can improve results just by writing better questions and providing example answers in your prompts. No additional training required.
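In practice this just means packing an example exchange into the prompt. The sketch below assumes the same chat format as the basic example above; the sample answer text is illustrative only.

```python
# Hypothetical few-shot prompt: one worked example precedes the real question.
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": (
            "Example question: Describe any abnormal findings in this chest X-ray.\n"
            "Example answer: No focal consolidation, pleural effusion, or "
            "pneumothorax. Cardiomediastinal silhouette within normal limits.\n\n"
            "Now answer the same question for the following image."
        )},
        {"type": "image", "image": image},  # PIL image, loaded as in the basic example
    ]}
]
output = pipe(text=messages, max_new_tokens=500)
```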
Fine-Tuning with LoRA
LoRA (Low-Rank Adaptation) is an efficient method for fine-tuning large models. MedGemma can be fine-tuned for improved performance on the existing tasks it's been trained on, or to add additional tasks to its repertoire.
Google provides tutorial notebooks showing how to fine-tune MedGemma using LoRA with your own medical data.
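As a rough sketch of what that looks like with the PEFT library: the rank, alpha, and target module names below are common defaults for Gemma-style attention layers, not values taken from Google's notebooks.

```python
# Minimal LoRA setup sketch with PEFT. Target module names are assumptions
# for a Gemma 3-style decoder; check Google's fine-tuning notebook for the
# recommended configuration.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "google/medgemma-1.5-4b-it", torch_dtype=torch.bfloat16
)

lora_config = LoraConfig(
    r=16,                      # rank of the low-rank update matrices
    lora_alpha=32,             # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 4B weights
```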
Reinforcement Learning
For complex tasks, reinforcement learning can teach the model to perform better without compromising its existing abilities. Google has released tutorials on this advanced customization technique.
Component-Specific Fine-Tuning
Users can specifically fine-tune the language model decoder component to help the model better interpret the visual tokens produced by the image encoder, or fine-tune both.
This flexibility lets you optimize either how the model understands images or how it generates text responses.
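A decoder-only run can be approximated by freezing the vision encoder's parameters before training. The `vision_tower` substring below is an assumption based on common transformers naming; inspect `model.named_parameters()` to confirm it for your version.

```python
# Hypothetical decoder-only setup: freeze parameters belonging to the image
# encoder so only the language decoder (and projector) receive updates.
# `model` is the model loaded in the LoRA sketch above.
for name, param in model.named_parameters():
    if "vision_tower" in name:
        param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```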
Common Mistakes to Avoid
When implementing MedGemma 1.5, watch out for these common pitfalls:
Skipping Validation: Never deploy the baseline model directly in clinical workflows. Always test on your specific data first.
Treating Outputs as Final Answers: The model generates draft descriptions that need clinical review. Don't present them as finished diagnoses.
Ignoring Domain Shift: Medical images from different hospitals or machines can look different. The model may perform differently on your data than on Google's benchmarks.
Over-Relying on Prompt Engineering: While prompting helps, some use cases require fine-tuning with domain-specific data to achieve clinical-grade accuracy.
Neglecting Privacy Controls: Even though the model can run locally, you still need proper data handling procedures, access controls, and audit trails.
Using for Multi-Turn Conversations: MedGemma has not been evaluated or optimized for multi-turn applications. It works best for single-question, single-answer interactions.
Expecting Perfect Accuracy: The model is impressive but not infallible. Performance benchmarks highlight baseline capabilities, but inaccurate model output is possible.
MedGemma 1.5 vs Other Medical AI Models
How does MedGemma 1.5 compare to alternatives like GPT-4 Vision, Claude, or Gemini?
Advantages of MedGemma 1.5:
- Specifically trained on medical data
- Can run locally for privacy
- Free for commercial use
- Supports 3D medical imaging natively
- Smaller size makes it practical for institutional deployment
- Open-source allows customization
Where General Models May Be Stronger:
- Broader medical knowledge and reasoning
- Better at multi-turn conversations
- More polished dialogue and explanations
- Stronger performance on complex text reasoning
For Image Classification Without Text: Google recommends using MedSigLIP instead of MedGemma. For medical image-based applications that do not involve text generation, such as data-efficient classification, zero-shot classification, or content-based or semantic image retrieval, the MedSigLIP image encoder is recommended.
The MedGemma Impact Challenge: $100,000 in Prizes
Google announced the MedGemma Impact Challenge, a Kaggle-hosted hackathon with $100,000 in prizes. This competition encourages developers to build innovative healthcare applications using MedGemma and other Health AI Developer Foundations models.
The hackathon is open to all developers and offers an opportunity to showcase how AI can transform healthcare. Google wants to see creative applications beyond what they originally envisioned.
Getting Support and Resources
Google provides multiple resources for developers working with MedGemma 1.5:
Official Documentation: The Health AI Developer Foundations site at developers.google.com/health-ai-developer-foundations/medgemma contains model cards, benchmarks, and guidelines.
Tutorial Notebooks: Google has released notebooks covering CT interpretation, histopathology analysis, anatomical localization, longitudinal imaging, and fine-tuning techniques.
GitHub Repository: The MedGemma GitHub repository contains code examples and implementation guides.
HAI-DEF Forum: A dedicated technical support forum where developers can ask questions and share experiences.
Model Garden: For production deployments, Model Garden on Google Cloud provides managed infrastructure and specialized tools for handling large medical images stored in DICOM format.
Conclusion: The Future of Medical AI with MedGemma 1.5
MedGemma 1.5 represents an important step in making advanced medical AI accessible to healthcare developers. By combining 3D imaging analysis, medical text understanding, and open-source availability, it enables new applications that were previously impossible or required massive resources.
The model's improvements over the previous version show rapid progress in medical AI capabilities. The 14-point jump in MRI accuracy and the 35-point gain in anatomical localization demonstrate that specialized medical training produces real benefits over general-purpose AI models.
For developers, MedGemma 1.5 offers a practical starting point. The 4 billion parameter size makes it deployable on modest hardware, while the performance is strong enough for real applications. The open-source nature means you can customize it for your specific needs without licensing restrictions.
However, success requires more than just downloading the model. You need to validate it on your data, fine-tune it for your use case, and build proper safeguards around it. Medical AI is not plug-and-play, and patient safety must always come first.
As the healthcare AI market grows toward projected valuations of $928 billion by 2035, tools like MedGemma 1.5 will become increasingly important. They democratize access to advanced medical AI and allow smaller institutions to build sophisticated diagnostic support tools.
Whether you're building radiology workflow tools, pathology research systems, or medical education applications, MedGemma 1.5 provides a solid foundation. Start with the tutorial notebooks, test on your data, and build responsibly. The future of medical AI is open-source, and MedGemma 1.5 is an excellent place to begin your journey.
