Finding Illustrations in Digitized Collections with a 2.5MB Browser Model

IIIF Community Call

Daniel van Strien

Hugging Face

2026-02-11

Daniel van Strien — Machine Learning Librarian @ Hugging Face

Demo

huggingface.co/spaces/small-models-for-glam/iiif-illustration-detector

What just happened?

  • Model: MobileNetV2 — 2.5MB, quantized ONNX
  • Runs: entirely in your browser via transformers.js
  • Privacy: no data leaves your institution
  • IIIF: supports Presentation API v2 and v3
  • Output: W3C Web Annotations — plugs into your existing infrastructure
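Under the hood, the flow is small enough to sketch. A minimal example, assuming a hypothetical model repo id and an "illustrated" label name; the manifest walking is simplified (real manifests can nest Choice bodies or paint multiple images per canvas):

```ts
import { pipeline } from "@huggingface/transformers";

// Pull one image URL per canvas, handling Presentation API v2 and v3.
function imageUrls(manifest: any): string[] {
  if (manifest.items) {
    // v3: manifest.items -> canvases -> annotation pages -> painting annotations
    return manifest.items
      .map((canvas: any) => canvas.items?.[0]?.items?.[0]?.body?.id)
      .filter(Boolean);
  }
  // v2: manifest.sequences -> canvases -> images
  return (manifest.sequences?.[0]?.canvases ?? [])
    .map((canvas: any) => canvas.images?.[0]?.resource?.["@id"])
    .filter(Boolean);
}

// "q8" asks transformers.js for the quantized ONNX weights.
const classify = await pipeline(
  "image-classification",
  "your-org/iiif-illustration-detector", // hypothetical model id
  { dtype: "q8" }
);

const manifestUrl = "https://example.org/iiif/book1/manifest.json"; // any IIIF manifest
const manifest = await (await fetch(manifestUrl)).json();
for (const url of imageUrls(manifest)) {
  const [top] = await classify(url); // [{ label, score }, ...], best first
  if (top.label === "illustrated") console.log(url, top.score); // label name assumed
}
```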

You just saw the model, the dataset, and the tool — all open, all on Hugging Face.
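On the output side, the W3C Web Annotation data model provides a standard envelope. The object below is an illustrative guess at what one exported tagging annotation could look like, not the tool's confirmed schema:

```ts
// A tagging annotation pointing at the IIIF canvas the prediction applies to.
const annotation = {
  "@context": "http://www.w3.org/ns/anno.jsonld",
  type: "Annotation",
  motivation: "tagging",
  body: {
    type: "TextualBody",
    purpose: "tagging",
    value: "illustrated", // label name assumed
  },
  target: "https://example.org/iiif/book1/canvas/p42", // canvas id from the manifest
};
```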

How this got made

  • 2022: Hand-labeled 1,896 pages as “illustrated” or “not”
  • Data sat for 3 years
  • One evening in 2025:
    • Train MobileNetV2 on that data
    • Convert to ONNX, quantize to 2.5MB
    • Build browser tool with transformers.js

One evening with modern tooling turned 3-year-old data into a working browser tool.

Training models is getting easier

  • Fine-tuning a classifier: an afternoon, not a PhD
  • Expanding training data with AI: $15, not $15,000
  • This illustration detector: one evening of work

Running models is getting easier

  • 2.5MB — runs in any browser, no GPU needed
  • No server to maintain, no API keys, no cloud costs
  • Ship a model as a URL — nothing to install
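"A URL" is close to literal. Something like the following, in a single script module on a static page, is the whole deployment story (CDN import follows the pattern in the transformers.js docs; the version and model id here are placeholders, so pin whichever release you actually test against):

```ts
// No build step, no server: the library and the weights both arrive over HTTP.
const { pipeline } = await import(
  "https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0"
);

const classify = await pipeline(
  "image-classification",
  "your-org/iiif-illustration-detector", // hypothetical model id, as above
  { dtype: "q8" }
);

console.log(await classify("https://example.org/iiif/image/full/512,/0/default.jpg"));
```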

Small models can be very powerful

  • One model finds illustrations across any IIIF collection
  • Imagine an ecosystem: illustration detection, handwriting recognition, layout analysis, map detection…
  • Each model small, focused, composable

An ecosystem of task-specific models built by and for the cultural heritage community.

So what’s the bottleneck?

Data.

  • AI can help prepare training data (VLMs, zero-shot labeling; see the sketch after this list)
  • But AI-assisted labeling still needs human domain expertise to validate and guide it
  • Cultural heritage professionals have that expertise
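For example, an off-the-shelf CLIP checkpoint can propose labels for unlabeled pages with no training at all. A sketch (the model id comes from the transformers.js examples; the candidate prompts and page URL are invented here), with the human expert kept in the loop:

```ts
import { pipeline } from "@huggingface/transformers";

// Zero-shot labeling: CLIP scores each candidate label against the image.
const propose = await pipeline(
  "zero-shot-image-classification",
  "Xenova/clip-vit-base-patch32"
);

const labels = ["a page with an illustration", "a page of plain text"];
const pageUrl = "https://example.org/iiif/image/full/512,/0/default.jpg";

const scores = await propose(pageUrl, labels);
// Sort explicitly and queue the top guess for human review, not auto-acceptance.
scores.sort((a: any, b: any) => b.score - a.score);
console.log(scores[0].label, scores[0].score);
```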

Data as infrastructure

  • Union catalogs → MARC → IIIF → …?
  • IIIF = the cultural heritage community saying “we build our own interop”

Shared datasets and small models are the next layer.

The GLAM sector can build AI infrastructure by building datasets for training and evaluation.

Get involved

What would help:

  • Training data from your collections
  • Feedback on edge cases and new collection types
  • Feature requests

The model works well on Western European printed material; we need training data from more diverse collections.

Questions & Discussion

  • What types of illustrations does your collection have?
  • Would you use this in a cataloguing workflow?
  • What other “boring but useful” classification tasks would help your work?
  • Is the annotation export format right for your infrastructure?

Thank You

Daniel van Strien

Links: