git clone https://github.com/phronmophobic/llama.clj
cd llama.clj
mkdir -p models
# Download ~0.5 GB model to the models/ directory
(cd models && curl -L -O 'https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/qwen2-0_5b-instruct-q4_0.gguf')
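quick sanity check that the download completed (the file should be a few hundred MB; exact size depends on the quantization):

ls -lh models/qwen2-0_5b-instruct-q4_0.gguf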
- following the locally-compiled instructions: https://github.com/phronmophobic/llama.clj#locally-compiled
- llama.cpp (CPU build)
# build dependency on Void Linux
sudo xbps-install -S libgomp-devel
# get code
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
# 2025-02-04 https://github.com/ggml-org/llama.cpp/releases/tag/b4634
git checkout b4634
# set up build - tweaked from the llama.clj README
mkdir build
cd build
cmake -DBUILD_SHARED_LIBS=ON ..
# build with parallel jobs (measure time)
time cmake --build . --config Release -j 4
real 2m35.639s
user 8m1.827s
sys 0m17.816s
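sanity check that the shared libraries landed in build/bin, the path the deps.edn alias below points at (run from the build directory; bin/ is an assumption based on this tag's default cmake layout):

ls -l bin/libllama.so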
edit deps.edn, adding the alias under :aliases:

;; https://github.com/phronmophobic/llama.clj#locally-compiled
:local-llama
{:jvm-opts ["-Djna.library.path=../llama.cpp/build/bin"]
 :extra-deps
 {com.phronemophobic/llama-clj {:mvn/version "0.8.6"}}}
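smoke test for the alias: this should fetch llama-clj from clojars and load the wrapper namespace without errors (a sketch; it's the same namespace invoked with -m below):

clojure -M:local-llama -e "(require 'com.phronemophobic.llama)"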
adapted the command from the README's CLI usage section, using the :local-llama alias created above:
clojure -M:local-llama -m com.phronemophobic.llama "models/qwen2-0_5b-instruct-q4_0.gguf" "what is 2+2?"
this particular model didn't seem very good at answering questions accurately
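the library can also be called programmatically instead of through -m; a minimal sketch using create-context and generate-string from the llama.clj README (options left at their defaults):

clojure -M:local-llama -e "
(require '[com.phronemophobic.llama :as llama])
(def ctx (llama/create-context \"models/qwen2-0_5b-instruct-q4_0.gguf\"))
(println (llama/generate-string ctx \"what is 2+2?\"))"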