From Using to Contributing
Hugging Face
2025-12-09
“Let’s build a chatbot for our collections!”
But is this the best starting point?
“A frontier model like GPT-4 is like a Ferrari. It’s an obvious triumph of engineering, designed to win races. But it takes a special pit crew just to change the tires.
In contrast, a smaller specialized model is like a Honda Civic. It’s engineered to be affordable, reliable, and extremely useful. And that’s why they’re absolutely everywhere.”
— Adapted from “Finally, a Replacement for BERT” https://huggingface.co/blog/modernbert
flowchart LR
W[Weights<br/>Learned numbers] --> M((Model))
C[Code<br/>Instructions] --> M
M --> T[Does tasks]
The weights are the model's “brain”: patterns learned from training data
flowchart TD
D[(Training Data<br/>Books, websites, images)] --> L[Learning Process]
L --> W[Weights File<br/>.safetensors]
W --> E["Billions of numbers:<br/>[0.023, -0.891, 0.442, ...]"]
These numbers encode everything the model “knows”
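To see that weights really are just numbers, here is a minimal sketch, assuming the `safetensors` package and a hypothetical local `model.safetensors` file downloaded from the Hub:

```python
# Peek inside a weights file: names, shapes, and raw values.
# "model.safetensors" is a hypothetical local file.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    for name in list(f.keys())[:3]:       # first few weight tensors
        tensor = f.get_tensor(name)
        print(name, tensor.shape, tensor.flatten()[:5])
```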
Open Weights
flowchart TD
O[Download weights] --> R[Run anywhere]
R --> I[Inspect & modify]
Closed Weights
flowchart TD
A[API only] --> B[Black box]
B --> V[Vendor controlled]
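To make “download and run anywhere” concrete, a minimal sketch using `huggingface_hub`; the repo id is illustrative:

```python
# Download open weights once; the result is an ordinary local folder
# of weights + config files you can inspect, copy, or version-pin.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("HuggingFaceTB/SmolLM2-360M-Instruct")
print(local_dir)
```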
“fully” open source AI includes:

- Open weights
- Open training data
- Open training code
- An open license
Examples of “fully” open models:

- OLMo (Allen Institute for AI)
- Pythia (EleutherAI)
- SmolLM2 (Hugging Face)
| Benefit | Why it matters |
|---|---|
| Choice | 1M+ models vs ~100 closed APIs |
| Control | Run where you want, pin versions |
| Flexibility | Fine-tune for your domain |
| Cost | Often cheaper at scale |
| Privacy | Data never leaves your infra |
| Transparency | Inspect model & training data |
AI/ML is more than LLMs!
Many tasks don’t need a large language model (see the sketch below):

- Text classification
- Named entity recognition
- OCR / handwriting recognition
- Image classification and tagging
- Embeddings for semantic search
Finding models: Hugging Face Hub - 1M+ open models, filterable by task, language, size
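A sketch of task-specific models in action, assuming the `transformers` package; the model ids are illustrative picks from the Hub:

```python
# Small, task-specific models via the transformers pipeline API.
from transformers import pipeline

# Named entity recognition with an illustrative Hub model.
ner = pipeline("ner", model="dslim/bert-base-NER",
               aggregation_strategy="simple")
print(ner("The British Library is in London."))

# Text classification; falls back to a small default model.
classifier = pipeline("text-classification")
print(classifier("This catalogue record describes a 19th-century map."))
```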
Open models != local models
A few approaches to running inference with open models:
| Approach | Setup | Best For |
|---|---|---|
| Pay per token | Minimal | Prototyping, low volume |
| Local hardware | Medium | Privacy, offline use |
| Rent hardware | Higher | Production, scale |
Setup: Minimal - use via an OpenAI-compatible client

Pros

- No hardware to buy or manage; start in minutes
- Pay only for what you use

Cons

- Per-token costs add up at high volume
- Your data leaves your infrastructure
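A minimal sketch of the pay-per-token route, assuming the `openai` client; the `base_url`, API key, and model id are provider-specific placeholders:

```python
# Pay-per-token inference through an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # provider-specific
    api_key="hf_...",                              # your provider API key
)
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",      # illustrative model id
    messages=[{"role": "user", "content": "Summarise this catalogue record."}],
)
print(response.choices[0].message.content)
```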
Setup: Medium - install a runtime + download the model

Pros

- Data never leaves your machine; works offline
- No per-token costs

Cons

- Limited by your hardware
- You handle setup and performance tuning
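A minimal local-inference sketch, assuming `transformers` and `torch`; the model id is an illustrative small open model:

```python
# Run a small open model entirely on local hardware.
from transformers import pipeline

generator = pipeline("text-generation",
                     model="HuggingFaceTB/SmolLM2-360M-Instruct")
result = generator("Libraries can use open models to", max_new_tokens=40)
print(result[0]["generated_text"])
```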
Setup: Higher - provision and configure a cloud GPU instance

Pros

- Scales to production workloads
- Full control over model versions and hardware

Cons

- Most operational complexity
- Idle hardware still costs money
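A minimal serving sketch for rented GPU hardware, assuming the `vllm` package and a CUDA GPU; the model id is illustrative:

```python
# High-throughput batched inference with vLLM on a cloud GPU.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(max_tokens=64)
outputs = llm.generate(["Describe this archival photograph:"], params)
print(outputs[0].outputs[0].text)
```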
mindmap
root((Local model ecosystem))
Python Libraries
PyTorch / TensorFlow / JAX
Transformers
Diffusers
Sentence Transformers
ONNX Runtime
Optimum
JavaScript
Transformers.js
Inference Frameworks
GPU-poor focused
llama.cpp
LM Studio
Ollama
MLX Apple Silicon
mlx-lm
mlx-vlm
ExLlamaV2
Jan
Enterprise Focused
vLLM
SGLang
Triton Inference Server
Quantization Formats
GGUF
GPTQ
AWQ
UIs
Gradio
Open WebUI
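One branch of this map in practice: a sketch loading quantized GGUF weights with `llama-cpp-python`; the file path is hypothetical:

```python
# CPU-friendly inference from a quantized GGUF file via llama.cpp bindings.
from llama_cpp import Llama

llm = Llama(model_path="smollm2-360m-instruct-q4_k_m.gguf")  # hypothetical file
out = llm("Q: What is OCR? A:", max_tokens=32)
print(out["choices"][0]["text"])
```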
Switch to notebook
How can GLAMs contribute to Open Source AI?
GLAMs hold unique assets for AI training:

- Digitised collections: text, images, audio
- Expert-created metadata and catalogue records
- Multilingual and historical material underrepresented on the web
Move beyond “vibe checks” with domain expertise:

- Build evaluation datasets from real collection tasks
- Benchmark models against expert judgements
Share trained models back to the community:

- Publish fine-tuned models and datasets on the Hugging Face Hub
- Document training data, intended use, and limitations in a model card
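A minimal sketch of sharing back, assuming `transformers` and a Hub login (`huggingface-cli login`); the local path and repo id are illustrative:

```python
# Upload a fine-tuned model + tokenizer so others can reuse it.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("./my-finetuned-model")
tokenizer = AutoTokenizer.from_pretrained("./my-finetuned-model")

model.push_to_hub("my-glam-org/catalogue-classifier")      # weights + config
tokenizer.push_to_hub("my-glam-org/catalogue-classifier")  # tokenizer files
```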
danielvanstrien.xyz | Open Source AI for GLAMs