I tested the small model with a few images from CLEVR. At first blush I'm afraid it didn't do very well at all: it got object counts totally wrong and struggled to identify shapes and colours.
Still, it seems to understand what's in the images in general (cones and spheres and cubes), and the fact that it runs on my MacBook at all is basically amazing.
Did they fix multiline editing yet? Any interactive input that wraps across 3+ lines seems to become off-by-one when editing (though it's fine if you only append?), and this will only become more common now that long filenames are being added. And triple quotes break editing entirely.
How does this address the security concern of filenames being detected and read when that isn't wanted?
Is Qwen2-VL supported too? It's a great vision model and works in ComfyUI. Llama 3.2's vision seems to be super censored...
I thought llama.cpp didn't support images yet; has that changed, or is Ollama using a different library for this?
Does anyone know if this will run on an iPhone 15 (6GB) or iPhone 16 (8GB)?
Can it run the quantized models?
How likely is it to run on a reasonably new Windows laptop?
This was a pretty heavy lift for us to get out, which is why it took a while. In addition to writing new image-processing routines, a vision encoder, and cross attention, we also ended up re-architecting the way models get run by the scheduler. We'll have a technical blog post soon about all the stuff that ended up changing.
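For anyone unfamiliar with the cross-attention part: the idea is that the text decoder's hidden states attend over the vision encoder's patch embeddings, so image information flows into the language model at certain layers. Below is a minimal PyTorch-style sketch of that idea; the class, names, and shapes are purely illustrative assumptions and not Ollama's actual implementation (which lives in its Go/GGML runtime).

```python
# Illustrative sketch of vision-to-text cross attention (NOT Ollama's code).
# Text hidden states act as queries; vision-encoder patch embeddings act as
# keys and values, so the decoder can "look at" the image while generating.
import torch
import torch.nn as nn

class VisionTextCrossAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_hidden: torch.Tensor, image_embeds: torch.Tensor) -> torch.Tensor:
        # text_hidden:  (batch, text_len, d_model)    -> queries
        # image_embeds: (batch, num_patches, d_model) -> keys and values
        attended, _ = self.attn(query=text_hidden, key=image_embeds, value=image_embeds)
        # Residual connection plus normalization, as in a typical transformer block.
        return self.norm(text_hidden + attended)

# Toy usage: one image encoded into 256 patch embeddings, a 16-token prompt.
layer = VisionTextCrossAttention(d_model=512, n_heads=8)
text = torch.randn(1, 16, 512)
patches = torch.randn(1, 256, 512)
out = layer(text, patches)
print(out.shape)  # torch.Size([1, 16, 512])
```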