
This week, while the spotlight is on GPT-5, OpenAI’s latest language model, the company also made another notable announcement: the release of gpt-oss, an AI model you can run directly on your own device. I managed to set it up on my laptop and iMac, though I’m hesitant to suggest you do the same.
What’s the big deal with gpt-oss?
gpt-oss, similar to GPT-5, is an AI model. However, unlike OpenAI’s latest model, gpt-oss is “open-weight,” which lets developers customize and adjust the model to suit their particular needs. This is not the same as open source, which would require OpenAI to provide both the underlying code and the training data. Instead, developers get access to the “weights,” or the settings that determine how the model interprets data relationships.
I’m not a developer, so I can’t take advantage of that flexibility. However, I can run gpt-oss locally on my Mac, something that isn’t possible with GPT-5. For an average user like myself, running the model without internet access is the major benefit. It offers a level of privacy you don’t otherwise get with an OpenAI model, especially since the company collects all the data I produce when using ChatGPT.
The model is available in two versions: gpt-oss-20b and gpt-oss-120b. The latter is significantly more powerful, requiring machines with at least 80GB of system memory. My computers don’t come close to that RAM capacity, so no 120b version for me. Fortunately, gpt-oss-20b’s memory requirement is 16GB: exactly the RAM in my M1 iMac, and two gigabytes less than my M3 Pro MacBook Pro.
Installing gpt-oss on a Mac
Installing gpt-oss on a Mac is relatively easy: You only need a program called Ollama, which allows you to run LLMs locally. Download Ollama, then open it. The app resembles other chatbots, but you can select from various LLMs to download first. Click the model selector next to the send button, select “gpt-oss:20b,” then send any message to initiate the download. In my experience, you’ll need just over 12GB of storage for it.
Alternatively, you can use your Mac’s Terminal app to download the LLM by entering the command: ollama run gpt-oss:20b. Once downloaded, you’re all set.
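Beyond the chat window, Ollama also exposes a local HTTP API (on port 11434 by default) once the app is running, which means you can script prompts against the model. Here’s a minimal Python sketch using Ollama’s documented /api/generate endpoint; the prompt text is just an example, and this assumes you’ve already downloaded gpt-oss:20b as described above.

```python
import json
import urllib.request

# Ollama's local API endpoint (default port 11434; see Ollama's API docs).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Request payload: the model tag matches the one downloaded above.
payload = {
    "model": "gpt-oss:20b",
    "prompt": "what is 2+2?",   # example prompt
    "stream": False,            # wait for the full answer instead of streaming
}

def ask(payload: dict) -> str:
    """Send one prompt to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    try:
        print(ask(payload))
    except OSError:
        # Ollama isn't running (or the model isn't available) on this machine.
        print("Could not reach the local Ollama server.")
```

Because everything goes to localhost, nothing in this request leaves your machine; that’s the same privacy benefit the chat app provides, just scriptable.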
Running gpt-oss on my Macs
Once gpt-oss-20b was on both my Macs, I set about testing it. I closed almost all active programs to free up resources for running the model. The only apps left open were Ollama and Activity Monitor, so I could watch my Macs’ performance.
I started with, “what is 2+2?” After sending the prompt on both machines, chat bubbles showed the request being processed, as if Ollama were typing. I also noticed both machines’ memory usage was maxed out.
Ollama on my MacBook pondered the request for 5.9 seconds, writing “The user asks: ‘what is 2+2’. It’s a simple arithmetic question. The answer is 4. Should answer simply. No further elaboration needed, but might respond politely. No need for additional context.” It then answered. The process took about 12 seconds. Conversely