DINOv3

reqo | 175 points

- Blog post: https://ai.meta.com/blog/dinov3-self-supervised-vision-model...
- Paper: https://ai.meta.com/research/publications/dinov3/
- Hugging Face: https://huggingface.co/collections/facebook/dinov3-68924841b...

beklein | 2 days ago

As someone who works on satellite imagery, this part is incredibly exciting:

> ViT models pretrained on satellite dataset (SAT-493M)

DINOv2 had pretty poor out-of-the-box performance on satellite/aerial imagery, so it's super exciting that they released a version of it specifically for this use case.

fnands | a day ago

I think SAM and DINO are the two off-the-shelf image models I've gotten the most mileage out of.

Imnimo | a day ago

You have to share your contact information, including DoB, and then be approved for access before you can obtain the models. Given that it's Meta, I assume they're actually validating it against their All Humans database.

They made their own DINOv3 license for this release (whereas DINOv2 used the Apache 2.0 license).

Neat though. Will still check it out.

As a first comment, I had to install the latest transformers==4.56.0dev (e.g. pip install git+https://github.com/huggingface/transformers) for it to work properly. 4.55.2 and earlier were failing with a missing image type in the config.
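
For anyone hitting the same failure, here is a rough sketch of a version guard based only on the numbers reported above (4.55.2 failing, a 4.56.0 dev build working). The helper names `release_tuple` and `is_new_enough` are made up for illustration, and the assumption that every 4.56.0 dev build is sufficient is mine, not something confirmed by the release notes.

```python
import re

# Assumption: support arrived in the 4.56.0 development line, as reported above.
MIN_RELEASE = (4, 56, 0)


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '4.56.0.dev0' -> (4, 56, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad short versions so that '4.56' compares like '4.56.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_new_enough(version: str, minimum: tuple = MIN_RELEASE) -> bool:
    """True when the release segment is at least `minimum`; dev suffixes are ignored."""
    return release_tuple(version)[:3] >= minimum
```

In practice you would pass it `transformers.__version__` and reinstall from the git URL above when it returns False.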

llm_nerd | a day ago

Could anyone point to an example or git repo showing a simple implementation?

I’m fascinated by this, but am admittedly clueless about how to actually go about building any kind of recognizer or other system atop it.
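
One common pattern, sketched here under assumptions rather than taken from the DINOv3 release, is to keep the backbone frozen, compute one embedding per image, and fit something very small on top. The sketch below is a nearest-centroid classifier over cosine similarity. The function names are hypothetical, and the toy 3-d vectors and the "field"/"water" labels are stand-ins for real embeddings and classes, which you would get by running your own labeled images through the model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fit_centroids(embeddings, labels):
    """Average the normalized embeddings of each class into one prototype vector."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        unit = l2_normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in sums[label]])
        for label in sums
    }


def predict(centroids, embedding):
    """Return the label whose prototype has the highest cosine similarity."""
    unit = l2_normalize(embedding)
    return max(
        centroids,
        key=lambda label: sum(c * u for c, u in zip(centroids[label], unit)),
    )


# Toy vectors standing in for image embeddings from a frozen backbone.
train = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
labels = ["field", "field", "water", "water"]

centroids = fit_centroids(train, labels)
prediction = predict(centroids, [0.8, 0.2, 0.0])
```

With a handful of labeled examples per class this is often a reasonable baseline, and swapping the centroid step for a linear classifier is the usual next step if it falls short.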

cobbzilla | a day ago

If I’m already using siglip2 for a clustering application, is this enough of an uplift that I should look at it?

deepsquirrelnet | 21 hours ago

I have no idea what this even is.

ranger_danger | 2 days ago

That's awesome. DINOv2 was the best image embedder until now.

barbolo | a day ago

This was submitted earlier:

DINOV3: Self-supervised learning for vision at unprecedented scale | https://news.ycombinator.com/item?id=44904608

PhilippGille | a day ago