Building a local AI coding tool

I built this project because I wanted an AI coding assistant without paying monthly fees to a company like JetBrains. It uses the open-source tool Ollama combined with two AI models: Google's Gemma 3 for conversational interpretation of queries, and Qwen3-Coder for code. Running both on a local box, I can point the system at the root of a code project; it ingests the code into a local index, keeps track of changes, makes recommendations, and helps write code. Because everything runs locally, you can air-gap the system and keep proprietary code out of the cloud. I called the system Nemo, short for Mnemosyne, the Greek goddess of memory. Here are the exact directions for setting it up.

Nemo (Mnemosyne) – Exact Reproducible Setup Guide

This guide provides exact, copy‑and‑paste commands to reproduce the Nemo (Mnemosyne) local AI assistant on a new machine. It assumes Ollama is already installed.

1. System Requirements

– macOS or Linux
– Python 3.10 or newer
– Git
– Ollama installed and running
– Recommended: 32 GB RAM (24 GB minimum)

2. Verify Ollama

Start Ollama:

ollama serve

Verify service:

curl http://localhost:11434/api/tags
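The same check can be scripted. A minimal Python sketch, assuming the default host and the documented /api/tags response shape (a JSON object with a "models" list); the function names are illustrative, not part of Nemo:

```python
import json
import urllib.request

def parse_tags(payload: bytes) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    data = json.loads(payload)
    return [m["name"] for m in data.get("models", [])]

def list_models(host: str = "http://localhost:11434") -> list[str]:
    # Raises urllib.error.URLError if the Ollama server is not running.
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return parse_tags(resp.read())
```

If list_models() returns without error, the server started in this step is reachable.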

3. Pull Required Models

Run the following commands:

ollama pull qwen3-coder:30b
ollama pull gemma3:27b
ollama pull nomic-embed-text

Verify:

ollama list

4. Create Project Directory

mkdir -p ~/mnemosyne
cd ~/mnemosyne

5. Pull the Nemo Repository

Download the source archive from the link below and extract its contents into ~/mnemosyne:

https://drive.google.com/file/d/1tAPPYyXcenzbeOv5YqjkX1woUKVCMre2/view?usp=drive_link

6. Create Python Virtual Environment

python3 -m venv .venv
source .venv/bin/activate

7. Install Python Dependencies

pip install --upgrade pip
pip install -r requirements.txt

8. Verify Nemo CLI

python -m nemo --help

9. Configure Nemo

Edit config.yaml with the following structure:

repo_root: /absolute/path/to/target/repo
ollama:
  host: http://localhost:11434
  chat_model: qwen3-coder:30b
  embed_model: nomic-embed-text
index:
  persist_dir: data/index
  collection: nemo_code
chunking:
  max_lines: 220
  overlap_lines: 50
  max_chars: 4000
retrieval:
  top_k: 8
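The chunking block controls how source files are split before embedding: chunks of at most max_lines lines (capped at max_chars characters), with overlap_lines of shared context between neighbouring chunks. A minimal sketch of that splitting scheme using the values above (an illustration of the parameters, not Nemo's actual code):

```python
def chunk_lines(lines, max_lines=220, overlap_lines=50, max_chars=4000):
    """Split a file's lines into overlapping chunks for embedding."""
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + max_lines, len(lines))
        text = "\n".join(lines[start:end])
        # Enforce the character cap by trimming an overly long chunk.
        chunks.append(text[:max_chars])
        if end == len(lines):
            break
        # Step forward, keeping `overlap_lines` lines of context.
        start = end - overlap_lines
    return chunks
```

With a 500-line file and these defaults, this yields chunks starting at lines 0, 170, and 340, each sharing 50 lines with its predecessor so that no context is lost at a boundary.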

10. Ingest Repository

Remove any previous index:

rm -rf data/index

Run ingest:

python -B -m nemo ingest
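Ingest walks the target repo, chunks each source file, and embeds the chunks into the persistent index. A sketch of the file walk, assuming a typical extension filter and skipping hidden directories such as .git (the extension list and helper name are illustrative, not Nemo's):

```python
from pathlib import Path

SOURCE_EXTS = {".py", ".js", ".ts", ".go", ".java", ".md"}  # illustrative filter

def iter_source_files(repo_root):
    """Yield source files under repo_root, skipping hidden directories."""
    root = Path(repo_root)
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip anything inside a dot-directory (.git, .venv, ...).
        if any(part.startswith(".") for part in rel.parts):
            continue
        if path.is_file() and path.suffix in SOURCE_EXTS:
            yield path
```

Each yielded file would then be read, chunked, and embedded with the configured embed_model.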

11. Ask Questions via CLI

Example:

python -m nemo ask "Explain the backend architecture"
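Under the hood, an ask call follows the standard retrieval pattern: embed the question with the embed_model, rank the indexed chunks by similarity, and hand the retrieval.top_k best matches to the chat model as context. A pure-Python sketch of the ranking step (function names are illustrative, not Nemo's API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, indexed, k=8):
    """Rank (chunk_text, embedding) pairs by similarity to the query."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The k=8 default mirrors the top_k value in config.yaml.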

12. Run Web Interface

Start the web server from the nemo subdirectory:

cd nemo
uvicorn webapp:app --reload

Open browser:

http://localhost:8000

13. Re‑ingesting After Code Changes

Whenever the repo changes significantly:

rm -rf data/index
python -B -m nemo ingest
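A full wipe and re-ingest is the simple, reliable option. If you later want to re-ingest only what changed, a content snapshot is one way to detect it; a minimal sketch (hypothetical helpers, not Nemo's actual mechanism):

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file's bytes, used to detect changed content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def changed_files(paths, previous):
    """Return paths whose digest differs from the `previous` snapshot dict."""
    return [p for p in paths if previous.get(str(p)) != file_digest(p)]
```

Only the returned files would need re-chunking and re-embedding.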

14. Operational Notes

– Nemo runs fully locally
– No internet access by default
– Models can be swapped via config.yaml
– Chat responses are streamed and may take time on large models
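The streaming note refers to Ollama's NDJSON format: /api/chat emits one JSON object per line, each carrying a fragment of the assistant message, with "done": true on the final object. A minimal sketch of reassembling the streamed text (error handling omitted):

```python
import json

def assemble_stream(ndjson_lines):
    """Concatenate message fragments from an Ollama /api/chat stream."""
    parts = []
    for line in ndjson_lines:
        if not line.strip():
            continue
        obj = json.loads(line)
        # Each object carries a partial message; the last has "done": true.
        parts.append(obj.get("message", {}).get("content", ""))
        if obj.get("done"):
            break
    return "".join(parts)
```

This is why responses appear word by word in the web interface rather than all at once.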