RAM and device memory

When an AI model runs directly on a device (a phone, a tablet, or a workstation), it must be loaded into the device’s active memory while it is working. This active memory is called RAM (Random Access Memory). RAM is a physical component of your device and is distinct from storage, the space where your photos and files are saved: RAM is the working space your device uses to run active processes. The larger and more capable an AI model is, the more RAM it requires to operate. If your device does not have enough RAM to load a given model, that model cannot run on your device. This is a hardware constraint, not a software limitation that can be worked around.
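The constraint described above can be sketched as a simple check. This is a minimal illustration, not how any particular app decides: the function name `fits_in_ram` and the 2 GB reserve for the operating system and other apps are assumptions chosen for the example.

```python
def fits_in_ram(model_gb: float, device_ram_gb: float, reserved_gb: float = 2.0) -> bool:
    """Return True if the model fits in the RAM left over after the reserve.

    reserved_gb is a hypothetical allowance for the operating system and
    other running apps; real devices vary.
    """
    available_gb = device_ram_gb - reserved_gb
    return model_gb <= available_gb


# Illustrative numbers only: an 8 GB device with the assumed 2 GB reserve.
print(fits_in_ram(model_gb=4.5, device_ram_gb=8.0))  # True
print(fits_in_ram(model_gb=7.0, device_ram_gb=8.0))  # False
```

The reserve matters because a model never gets the whole of a device's RAM to itself.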

Understand how model size maps to RAM

AI models vary significantly in size. A small, specialized model designed for a single task, such as transcribing speech or extracting medication names from a note, requires relatively little RAM and can run comfortably on a modern smartphone. A larger model capable of complex multi-step reasoning across long documents requires substantially more RAM and is better suited to a tablet, a laptop, or an on-premises server. In Isa, the model picker lists each variant with its size in GB and a recommended iPhone, so you can pick one that fits your device. As devices become more powerful (newer iPhones now ship with 8 GB of RAM, and high-end iPad Pro variants reach 16 GB), the range of capable on-device models expands accordingly.
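As a rough rule of thumb, the memory taken by a model's weights is its parameter count multiplied by the number of bits stored per weight. The sketch below works through that arithmetic. The function name and the 1.2 overhead factor for runtime buffers are assumptions made for illustration, and the sizes shown in Isa's model picker remain the figures to rely on.

```python
def estimate_model_ram_gb(n_params: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Estimate RAM in GB (1 GB = 1e9 bytes) needed to hold a model.

    overhead is a hypothetical multiplier for runtime buffers on top of
    the raw weights; actual overhead depends on the runtime and workload.
    """
    weight_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return weight_bytes * overhead / 1e9


# A hypothetical 3-billion-parameter model at two precisions.
print(f"{estimate_model_ram_gb(3e9, 16):.1f} GB")  # 16-bit weights: about 7.2 GB
print(f"{estimate_model_ram_gb(3e9, 4):.1f} GB")   # 4-bit weights: about 1.8 GB
```

The same model needs about a quarter of the memory when its weights are stored in 4 bits instead of 16.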

Why it matters for clinicians

Device selection: When a hospital or department is procuring new devices for clinical AI use, RAM is one of the most important specifications to consider. A device with more RAM can run more capable models, supporting more complex clinical workflows.
Future-proofing: AI models are improving rapidly. A device with more RAM today is more likely to remain capable of running more advanced models as they become available, which can delay the need for hardware replacement.
Performance: Sufficient RAM helps the AI run smoothly alongside your other clinical applications, reducing the risk of slowdowns or crashes during active patient encounters.
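For procurement, the considerations above can be turned into a quick comparison of candidate devices. The sketch below is illustrative only: the model catalogue, device names, RAM figures, and the 3 GB headroom left for other clinical applications are all hypothetical values, not Isa's actual requirements.

```python
# Hypothetical catalogue: model variant -> RAM needed in GB.
MODELS_GB = {"small": 1.5, "medium": 4.0, "large": 9.0}

# Hypothetical candidate devices: name -> installed RAM in GB.
DEVICES_GB = {"phone-8gb": 8.0, "tablet-16gb": 16.0}


def largest_model_that_fits(device_ram_gb, models_gb, headroom_gb=3.0):
    """Return the name of the largest model that fits, or None if none do.

    headroom_gb is an assumed allowance for the operating system and other
    clinical applications running at the same time.
    """
    budget_gb = device_ram_gb - headroom_gb
    candidates = [(size, name) for name, size in models_gb.items() if size <= budget_gb]
    return max(candidates)[1] if candidates else None


for device, ram_gb in DEVICES_GB.items():
    print(device, "->", largest_model_that_fits(ram_gb, MODELS_GB))
```

Raising the headroom value shows how quickly a device drops to a smaller model once other applications are taken into account.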

Next

Quantization

How models are compressed to fit in less RAM without losing meaningful accuracy.

Hardware requirements

Which devices Isa runs on today, and what’s coming next.

Choose a model

Pick the right model for your device: on-device vs cloud, model sizes, and quantization.
