Extending the context length to 1M tokens

cmcconomy | 114 points

This is fantastic news. I've been using Qwen2.5-Coder-32B-Instruct with Ollama locally and it's honestly such a breath of fresh air. I wonder if any of you have had a chance to try this new context length locally?

BTW, I can't effectively run this on my 2080 Ti, so I've just loaded the machine up with classic RAM. It's not going to win any races, but as they say, it's not the speed that matters, it's the quality of the effort.
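For anyone curious, here's a minimal sketch of what running this kind of setup through the ollama Python client can look like. The model tag and the num_gpu value are assumptions (check `ollama list` for the tag you actually pulled); num_gpu controls how many layers go to the GPU, so a low value keeps most of the weights in system RAM, as described above.

```python
import ollama

# Sketch: query a locally served Qwen2.5-Coder model via Ollama.
response = ollama.chat(
    model="qwen2.5-coder:32b-instruct",  # assumed tag; substitute whatever `ollama list` shows
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    options={
        "num_ctx": 32768,  # context window actually allocated; 1M needs far more memory
        "num_gpu": 0,      # assumption: keep all layers on CPU/RAM for an 11 GB 2080 Ti
    },
)
print(response["message"]["content"])
```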

aliljet | 4 days ago

> We have extended the model’s context length from 128k to 1M, which is approximately 1 million English words

Actually, English-language tokenizers map on average about 3 words to 4 tokens, so 1M tokens is roughly 750K English words, not a million as claimed.
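As a back-of-the-envelope sketch of that conversion, using the ~3 words per 4 tokens ratio quoted above (the ratio itself varies by tokenizer and text, so treat it as an estimate):

```python
def tokens_to_words(tokens: int, words_per_token: float = 3 / 4) -> int:
    """Rough token-to-word estimate at ~0.75 English words per token."""
    return round(tokens * words_per_token)

print(tokens_to_words(1_000_000))  # ~750000 words, not a million
print(tokens_to_words(128_000))    # the previous 128k window: ~96000 words
```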

lr1970 | 4 days ago

Is this model downloadable?

lostmsu | 4 days ago

Note: unexpected Three-Body Problem spoilers on this page.

swazzy | 4 days ago

Can we all agree that these models far surpass human intelligence now? I mean, they process hours' worth of audio in less time than it would take a human to even listen. I think the singularity passed and we didn't even notice (which would be expected).

anon291 | 4 days ago