llama.cpp

What is llama.cpp?

llama.cpp is an open-source C/C++ project for running large language models (LLMs) efficiently on your own hardware, including laptops, desktops, and even some mobile devices. It was originally written for Meta’s LLaMA family of models and now supports many other model architectures.

Build

The build below enables CUDA (-DGGML_CUDA=ON), so make sure the NVIDIA CUDA Toolkit is installed first.
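One quick way to check the CUDA prerequisite is to see whether the toolkit's compiler, nvcc, is on your PATH. The sketch below is only an illustration: missing_tools is a hypothetical helper, not part of llama.cpp, and a toolkit installed outside PATH (for example under /usr/local/cuda/bin) would be reported as missing even though it is present.

```shell
# Hypothetical helper (not part of llama.cpp): print the names of any
# given tools that cannot be found on PATH. Written for POSIX sh.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Trim the leading space before printing.
    echo "${missing# }"
}

# nvcc ships with the CUDA Toolkit; add other tools to the list as needed.
result=$(missing_tools nvcc)
if [ -n "$result" ]; then
    echo "Missing tools: $result"
else
    echo "CUDA compiler found on PATH."
fi
```

If nvcc is reported missing but the toolkit is installed, adding its bin directory to PATH is usually enough for CMake to find it.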

$ sudo apt install -y libcurl4-openssl-dev build-essential cmake ccache git
$ git clone https://github.com/ggml-org/llama.cpp.git
$ cd llama.cpp
$ cmake -B build -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release
$ cmake --build build --parallel
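After the build finishes, it can help to confirm that the executables were produced. The sketch below assumes you are in the llama.cpp repository root and that the build placed its binaries under build/bin, with llama-cli among them; check_binary is a hypothetical helper used only for this illustration.

```shell
# Hypothetical helper: report whether an executable exists at a path.
check_binary() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "not found: $1"
    fi
}

# Assumed output location of the CMake build, relative to the repo root.
check_binary build/bin/llama-cli
```

If the binary is found, running `./build/bin/llama-cli --version` should print the version and build information.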