
Days of Mayo

Based on personal data.

View the interactive →

Every pet owner knows this problem. You pick up a dog on a Tuesday in November, and from that moment your camera roll is never the same again. You photograph everything — the first night, the first snow, the ridiculous sleeping positions. Duplicates accumulate. Burst shots pile up. Three years later you have 1,387 photos and no idea what any of it looks like as a whole.

This is data trash. Affectionate, sentimental data trash — but trash nonetheless. No structure, no labels, no way to see the shape of it.

I wanted to see the shape of it.

The result is a scatter plot where each photo is a dot, and dots that look similar are placed near each other. Outdoor walks cluster together. Sleeping-on-the-couch photos form their own group. The puppy months — when she was still small enough to fit in a coat pocket — sit in their own dense corner, clearly distinct from the adult dog who now takes up the whole sofa.

You can zoom in on any cluster, hover over photos, filter by year, and open the “Growing up” grid to see her at different stages — one photo randomly drawn from each time window, different every visit.

It is, I think, a better way to look at your photos than scrolling.


How it works

Export. Photos come from Apple Photos via osxphotos, queried by subject tag. Each image is resized to 512px; videos get a single keyframe extracted at the midpoint with ffmpeg. Shared-album duplicates are removed by UUID and by filename + date proximity.
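The dedup step can be sketched roughly like this — a minimal, hypothetical version, assuming each exported item carries a UUID, a filename, and a capture timestamp (the field names here are illustrative, not the actual export schema):

```python
from datetime import datetime, timedelta

def dedupe(items, window_seconds=2):
    """Drop items sharing a UUID, or sharing a filename captured
    within window_seconds of an already-kept item."""
    seen_uuids = set()
    kept = []
    for item in sorted(items, key=lambda i: i["date"]):
        if item["uuid"] in seen_uuids:
            continue  # exact duplicate from the shared album
        near_dupe = any(
            k["filename"] == item["filename"]
            and abs((k["date"] - item["date"]).total_seconds()) <= window_seconds
            for k in kept
        )
        if near_dupe:
            continue  # same file re-imported with a new UUID
        seen_uuids.add(item["uuid"])
        kept.append(item)
    return kept
```

UUID matching catches exact re-shares; the filename + date window catches re-imports that were assigned fresh UUIDs.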

Embed. Three separate feature vectors are computed per image. DINOv2 (ViT-S/14) produces a 384-dimensional vector capturing texture, composition, and spatial structure — this drives the main scatter. CLIP (ViT-B/32) adds a semantically richer embedding that powers a second view. An HSV colour histogram (32 bins across hue, saturation, value) captures mood and lighting for a third. Burst shots with cosine similarity above 0.98 are removed before any reduction.
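The burst filter is simple enough to show in full. A sketch, assuming the embeddings are stacked in an `(n_images, dim)` array; the greedy keep-first strategy is my assumption about how ties are resolved:

```python
import numpy as np

def drop_bursts(embeddings, threshold=0.98):
    """Greedily keep images; drop any whose cosine similarity to an
    already-kept image exceeds the threshold."""
    # Normalise rows so a dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    keep = []
    for i, vec in enumerate(normed):
        if keep and np.max(normed[keep] @ vec) > threshold:
            continue  # near-identical to a kept frame: part of a burst
        keep.append(i)
    return keep
```

At 0.98 this removes only near-identical frames; genuinely different photos of the same scene survive.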

Reduce. The 384-dimensional embeddings are projected to 2D with UMAP (n_neighbors=15, min_dist=0.1, metric='cosine'). Position in the scatter encodes genuine visual similarity, not time.

Display. Photos are packed into sprite atlas sheets and drawn to a canvas element. The “Growing up” grid selects, for each of eight time windows, the five photos closest to that period’s embedding centroid — one is picked at random on each view.
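The grid's selection logic can be sketched as follows — a hypothetical version, assuming each photo is already assigned to one of the eight time windows:

```python
import numpy as np

def growing_up_picks(embeddings, window_ids, n_windows=8, n_candidates=5, rng=None):
    """For each time window, take the n_candidates photos nearest the
    window's embedding centroid, then pick one at random."""
    rng = rng or np.random.default_rng()
    picks = []
    for w in range(n_windows):
        idx = np.flatnonzero(window_ids == w)
        vecs = embeddings[idx]
        centroid = vecs.mean(axis=0)
        dists = np.linalg.norm(vecs - centroid, axis=1)
        nearest = idx[np.argsort(dists)[:n_candidates]]
        picks.append(int(rng.choice(nearest)))
    return picks
```

Choosing among the five most central photos, rather than the single nearest, is what makes the grid different on every visit while still showing a representative photo for each stage.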

View the interactive →