Ollama Tutorial: Run Open Source LLMs Locally on Your Machine (Step-by-Step Guide)

By DevDuniya | May 16, 2025


In the rapidly growing world of AI and large language models (LLMs), many developers and enthusiasts are looking for local alternatives to cloud-based tools like ChatGPT or Gemini. Enter Ollama: a simple way to run open-source LLMs like LLaMA, Mistral, and others on your own computer.

This blog is a complete beginner’s guide to:

✅ What is Ollama
✅ Why use it
✅ How to install Ollama on Windows, macOS, or Linux
✅ How to run AI models locally
✅ Useful Ollama commands
✅ Creating your own custom models (Modelfile)
✅ Integrating Ollama with other tools like LangChain or Python


📘 What is Ollama?

Ollama is an open-source tool that allows you to run large language models locally on your computer with a simple command-line interface (CLI).

Think of it as your personal, offline ChatGPT that runs fully on your machine.

🛠️ Key Features:

  • Run popular LLMs like LLaMA 2, Mistral, Gemma, and more
  • Stream responses in real-time
  • Fully offline once the model is downloaded
  • Lightweight and fast (optimized for CPU/GPU)
  • Easily integrates with tools like LangChain, Python, and more

💻 Why Use Ollama?

  • Privacy: Everything runs on your machine. No data sent to the cloud.
  • Speed: Once downloaded, model responses are very fast.
  • Flexibility: Choose the models you want, run them offline.
  • Open Source: You're not tied to one vendor.

🔧 How to Install Ollama (Windows, macOS, Linux)

✅ System Requirements:

  • At least 8GB RAM (16GB+ recommended)
  • A decent CPU (M1/M2/M3 or any modern Intel/AMD)
  • GPU optional (improves performance but not required)

🖥️ 1. Install on macOS

brew install ollama

Or, download directly from: https://ollama.com/download


💻 2. Install on Windows

  • Download the Windows installer from: https://ollama.com/download
  • Run the .exe file and follow the prompts.
  • Once installed, open Command Prompt or PowerShell and type:
ollama --version

If installed correctly, it will show the version.


🐧 3. Install on Linux

curl -fsSL https://ollama.com/install.sh | sh

The official install script works on most common distributions, including Debian and Ubuntu. Then verify the install:

ollama --version

🚀 How to Run a Model (Like LLaMA 2 or Mistral)

Once installed, open your terminal and simply run:

ollama run llama2

It will:

  1. Download the model (if it isn't already installed)
  2. Start the model and open an interactive prompt
  3. Let you type and chat with it directly

Example:

> ollama run llama2
>>> What is Laravel?
Laravel is a PHP web application framework...

📚 List of Available Models

You can explore models at: https://ollama.com/library

Popular models:

Model        Command                 Description
LLaMA 2      ollama run llama2       Meta's open LLM
Mistral      ollama run mistral      Lightweight and fast
Gemma        ollama run gemma        Google's open-source LLM
Code Llama   ollama run codellama    Optimized for coding
Phi          ollama run phi          Lightweight Microsoft model

🔁 Switch Between Models

Want to try a different model?

ollama run mistral

You can download as many models as you want and switch between them freely; within a single interactive session, you chat with one model at a time.


⚙️ Ollama CLI Commands

Here are some useful commands:

Command                       Description
ollama run [model]            Run a model (downloads it first if needed)
ollama pull [model]           Download a model without running it
ollama list                   List downloaded models
ollama rm [model]             Remove a model
ollama create [model-name]    Create a custom model using a Modelfile
ollama serve                  Run the Ollama server for API use
ollama help                   Show help information
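
Once the Ollama server is running (via ollama serve, covered below, or the desktop app), the same information is available over a local HTTP API. As a minimal sketch, here is the programmatic equivalent of ollama list, assuming the server is on its default port 11434:

import requests

# Ask the local server which models are already downloaded
# via the /api/tags endpoint (equivalent of `ollama list`)
resp = requests.get('http://localhost:11434/api/tags')
resp.raise_for_status()

for model in resp.json()['models']:
    # Each entry includes the model's name and its size in bytes
    print(f"{model['name']}  ({model['size'] / 1e9:.1f} GB)")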

🧪 Example: Pull a Model Without Running

ollama pull mistral

🧠 Create Your Own Custom Model (Modelfile)

You can customize a model’s behavior with a Modelfile.

Example: Modelfile

FROM mistral

SYSTEM "You are an assistant that helps write Laravel PHP code."

Then create the model:

ollama create laravel-assistant -f Modelfile

Run it:

ollama run laravel-assistant

Now it will always guide users on Laravel-related questions!
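
Custom models behave exactly like built-in ones, including over Ollama's local HTTP API (covered in the next section). A minimal sketch that queries the laravel-assistant model created above, assuming the server is running on the default port:

import requests

# Query the custom model through the local API (non-streaming)
resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'laravel-assistant',
        'prompt': 'How do I define a route in Laravel?',
        'stream': False,  # return the full answer as one JSON object
    },
)
print(resp.json()['response'])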


🔌 Use Ollama via API (Python or Curl)

Start the Ollama server (if you installed the desktop app, it may already be running in the background):

ollama serve

Then send a request via curl:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Explain Docker in simple terms",
  "stream": false
}'

By default the API streams the answer as newline-delimited JSON chunks; setting "stream": false returns the whole answer in a single JSON object.

Or with Python:

import requests

# Ask the local Ollama server a question (non-streaming)
response = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'mistral',
        'prompt': 'What is Laravel?',
        'stream': False,  # required for response.json() to parse in one go
    },
)
print(response.json()['response'])
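
The single-object response above is the simplest to handle, but you lose the real-time feel of the CLI. To stream tokens as they are generated, leave streaming on and read the response line by line; each line is a small JSON chunk. A minimal sketch:

import json
import requests

# Stream the answer token by token, like the interactive CLI
with requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'mistral', 'prompt': 'What is Laravel?'},
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Each chunk carries a fragment of the answer; "done" marks the end
        print(chunk.get('response', ''), end='', flush=True)
        if chunk.get('done'):
            print()
            break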

🔗 Integrate with LangChain or LlamaIndex

Ollama works well with LangChain, enabling advanced workflows and chaining.

LangChain (Python), using the community integration (pip install langchain-community; older LangChain versions used from langchain.llms import Ollama instead):

from langchain_community.llms import Ollama

llm = Ollama(model="mistral")
print(llm.invoke("Tell me a joke"))
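
Because the LangChain wrapper is a standard runnable, you can pipe it together with prompt templates. A small sketch of a reusable chain, assuming a recent LangChain release with the pipe (|) composition syntax:

from langchain_core.prompts import PromptTemplate
from langchain_community.llms import Ollama

# A reusable prompt template piped into the local model
prompt = PromptTemplate.from_template(
    "Explain {topic} to a beginner in two sentences."
)
llm = Ollama(model="mistral")
chain = prompt | llm

print(chain.invoke({"topic": "Docker"}))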

❌ How to Remove a Model

If you want to free up space:

ollama rm llama2

🧼 Clean All Models

There is no built-in flag to remove every model at once; delete them one at a time with ollama rm. On macOS/Linux you can script it from the output of ollama list:

ollama list | tail -n +2 | awk '{print $1}' | xargs -n 1 ollama rm

📦 Where Are Models Stored?

On macOS/Linux:

~/.ollama/models

On Windows:

C:\Users\YourUsername\.ollama\models
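
Models are several gigabytes each, so it's worth knowing how much space they occupy. A minimal sketch that totals the size of the models directory, assuming the default location (Ollama also honors an OLLAMA_MODELS environment variable if you've moved it):

import os
from pathlib import Path

# Default model directory; respect OLLAMA_MODELS if the user moved it
models_dir = Path(os.environ.get('OLLAMA_MODELS', Path.home() / '.ollama' / 'models'))

# Sum the sizes of all files under the models directory
total_bytes = sum(f.stat().st_size for f in models_dir.rglob('*') if f.is_file())
print(f'{models_dir}: {total_bytes / 1e9:.1f} GB')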

📋 Summary

Task                    Command
Run a model             ollama run mistral
List models             ollama list
Remove a model          ollama rm mistral
Create a custom model   ollama create my-model -f Modelfile
Use the API             ollama serve, then call /api/generate
Download a model only   ollama pull model-name

🚀 Final Thoughts

Ollama is a game-changer for developers, educators, and AI enthusiasts who want local control, privacy, and speed when working with powerful language models.

Whether you’re building an app, exploring AI, or just want your own offline ChatGPT — Ollama is the perfect starting point.


🔗 Useful Links

  • Download Ollama: https://ollama.com/download
  • Model library: https://ollama.com/library
  • Ollama on GitHub: https://github.com/ollama/ollama

💬 Got Questions?

Drop them in the comments, or connect on GitHub or Twitter. Happy hacking! 💻



Tags

AI, Python, Machine Learning
