My current list of best large language models for local AI:
- gpt-oss-120b for general purpose chat with tool calling (wildly underrated in my opinion, for use with my personal indexing service)
- glm-4.7 for coding using Claude Code via claude-code-router (though I'm on the fence about making minimax-m2.1 the primary)
- qwen3-coder-30b as a fast model for coding using Claude Code
All three of these models fit in the memory of a 512GB Mac Studio and can run at the same time, though at very large context sizes there's some risk of OOM'ing.
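As a rough sanity check on that claim, here's a back-of-envelope sketch. Every number in it is an assumption for illustration (rough quantized weight footprints plus per-model context headroom), not an official model size:

```python
# Back-of-envelope memory check for running all three models at once
# on a 512GB Mac Studio. All sizes below are assumptions, not official figures.
model_gb = {
    "gpt-oss-120b": 65,     # assumption: ~4-bit quantized weights
    "glm-4.7": 200,         # assumption
    "qwen3-coder-30b": 20,  # assumption
}
kv_cache_gb = 30  # assumed per-model headroom for long contexts

total = sum(model_gb.values()) + kv_cache_gb * len(model_gb)
print(f"~{total} GB of 512 GB unified memory used")
```

Under these (made-up) numbers there's comfortable margin, but pushing each model's context much further is exactly where that margin disappears.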
For pure analysis and world knowledge, qwen3-235b-a22b-2507 is pretty hard to beat, as are deepseek-r1 and deepseek-v3.2. Still, I stick with gpt-oss-120b with tool calling for most things because it's much faster and better at using tools, which makes up for its weaker out-of-the-box world knowledge.