Open vs. Closed AI Models
Open-weights models let you download and run them yourself; closed models are only accessible through APIs controlled by their creators.
What does "open" mean for AI models?
The AI industry uses "open" more loosely than the traditional open-source software world does. In practice, models fall on a spectrum:
- Closed/Proprietary: GPT-5.1, Claude 4.5. You can only access them through APIs. The weights, training data, and architecture details are secret.
- Open weights: Llama 3.3, Mistral. You can download the trained model and run it yourself. But the training code and data are often not released.
- Fully open: Some research models release weights, training code, and data. Rare at the frontier.
Most "open-source AI" is actually open-weights: you get the result of training, not the recipe to reproduce it.
What can you do with open-weights models?
With Llama or Mistral, you can:
- Run locally: No API calls, no usage fees, complete privacy
- Fine-tune: Adapt the model for your specific domain or task
- Modify: Remove safety filters, change behavior (with all the responsibility that implies)
- Deploy anywhere: Your servers, your cloud, your rules
- Inspect: Study how the model works, run interpretability research
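"Run locally" can be made concrete with Ollama, covered later in this piece, which serves downloaded models over a local HTTP endpoint. The sketch below assumes Ollama is running on its default port (11434) and that a model tagged `llama3.3` has already been pulled; the model name and prompt are just placeholders.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running model; nothing leaves your machine."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, `generate("llama3.3", "Summarize open weights in one sentence.")` returns a completion without any external API call, usage fee, or data leaving the machine.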
What are the trade-offs?
Open models offer:
- Control over your data (nothing sent to external servers)
- Customization (fine-tune for your use case)
- Cost predictability (hardware costs, not per-token fees)
- Independence (no API changes, no service shutdowns)
Closed models offer:
- State-of-the-art capability (GPT-5.1, Claude 4.5 still lead)
- No hardware management
- Continuous improvements (providers update models behind the API)
- Safety infrastructure (moderation, filtering, monitoring)
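The cost-predictability trade-off comes down to simple arithmetic: metered APIs scale linearly with usage, while self-hosting is roughly flat. The sketch below uses illustrative, assumed numbers (a $2,500 GPU amortized over 36 months, 350 W draw, $0.15/kWh, $5 per million tokens); plug in your own.

```python
def api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Monthly spend on a metered API: scales linearly with usage."""
    return tokens_per_month / 1_000_000 * price_per_million

def self_host_cost(hardware_price: float, lifetime_months: int,
                   power_watts: float, kwh_price: float,
                   hours_per_month: float = 730) -> float:
    """Monthly spend on owned hardware: amortized purchase plus electricity,
    roughly flat regardless of how many tokens you generate."""
    amortized = hardware_price / lifetime_months
    electricity = power_watts / 1000 * hours_per_month * kwh_price
    return amortized + electricity

# Assumed numbers: $2,500 GPU over 36 months, 350 W, $0.15/kWh -> ~$108/month
fixed = self_host_cost(2500, 36, 350, 0.15)

# Usage level at which a $5-per-million-token API costs the same (~21.5M tokens/month)
break_even_tokens = fixed / 5 * 1_000_000
```

Below the break-even volume, the API is cheaper; above it, the hardware pays for itself. This ignores engineering time and the capability gap, which often dominate the decision in practice.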
The major open model families
- Llama (Meta): The flagship open model family. Llama 3.1 405B approaches frontier closed-model capability, and Llama 3.3 70B delivers similar quality at a fraction of the size. Permissive license for most uses.
- Mistral (Mistral AI): French company, strong models, competitive with larger Llama variants. Some models fully open, some commercial.
- Qwen (Alibaba): Strong multilingual performance, especially Chinese. Various sizes and specializations.
- Gemma (Google): Smaller models for research and development. More restricted license than Llama.
- Phi (Microsoft): Small but capable, designed to prove that smaller models can perform well.
Running open models yourself
The ecosystem for running open models has matured:
- Ollama: One-command setup for running models locally on Mac, Windows, Linux
- llama.cpp: Efficient C++ implementation that runs on consumer hardware
- vLLM: High-performance inference for server deployments
- Text Generation WebUI: Browser interface for local models
With a decent GPU (16GB+ VRAM), you can comfortably run models in the 7-13B parameter range, especially with quantization. For larger models, you need multiple GPUs or cloud instances.
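The VRAM guidance above follows from a back-of-the-envelope rule: memory is roughly parameters times bytes per weight, plus some headroom for the KV cache and activations. The 20% overhead factor below is a crude assumption, not a measured figure.

```python
def vram_gb(params_billion: float, bits_per_weight: int,
            overhead: float = 1.2) -> float:
    """Rough VRAM needed to run a model: parameters x bytes per weight,
    plus ~20% headroom for KV cache and activations (assumed)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 13B model: ~31 GB at 16-bit precision, but ~8 GB at 4-bit quantization
fp16 = vram_gb(13, 16)  # far beyond a 16GB card
q4 = vram_gb(13, 4)     # fits comfortably in 16GB
```

This is why quantized 7-13B models fit on a single consumer GPU while a 70B+ model, even at 4 bits, forces you onto multiple GPUs or a cloud instance.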