Guide to Running a Local Language Model

Madhav Malhotra

Co-Founder, QurioSkill.

What does this contain?

A reference guide that explains how to run language models on your own laptop, written as a companion to the AI Saturdays session.

What's Covered?

  • The core idea: a model is a file, an inference engine runs it

  • GGUF, the file format that works on Windows, macOS, and Linux

  • Quantization explained, and why Q4_K_M is the common default

  • How RAM sets the limit on which models fit on a given laptop

  • A short framework for picking a model, with a recommended starting point at each RAM tier

  • The recommended model used throughout: Llama 3.2 3B Instruct (GGUF, Q4_K_M), about 2 GB, runs on any laptop with 8 GB of RAM
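
To make the quantization and RAM points above concrete, here is a rough back-of-the-envelope sketch. It is not taken from the guide. It assumes a parameter count of about 3.2 billion for Llama 3.2 3B, an average of roughly 4.8 bits per weight for Q4_K_M, and a hypothetical rule of thumb that a model should take up no more than half of total RAM. The function names and the 0.5 threshold are illustrative only.

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantized model file in GB (10^9 bytes).

    Ignores metadata and any tensors stored at a different precision,
    so treat the result as a ballpark figure only.
    """
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_ram(model_gb: float, ram_gb: float, usable_fraction: float = 0.5) -> bool:
    """Hypothetical rule of thumb: the model should use at most
    `usable_fraction` of total RAM, leaving room for the OS, other
    apps, and the inference engine's working memory."""
    return model_gb <= ram_gb * usable_fraction


# Assumed figures: ~3.2e9 parameters, ~4.8 bits per weight for Q4_K_M.
size_gb = estimate_gguf_size_gb(3.2e9, 4.8)
print(f"Estimated file size: {size_gb:.2f} GB")
print("Fits on an 8 GB laptop:", fits_in_ram(size_gb, ram_gb=8))
```

Under these assumptions the estimate comes out a little under 2 GB, in line with the "about 2 GB" figure quoted for the recommended model. Actual file sizes and memory use vary with the model, context length, and inference engine.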

Free

A short reference guide to running language models on your laptop. Covers LM Studio, Ollama, Jan, and WebLLM.