LlamaFarm is a comprehensive, modular AI framework that gives you complete control over your AI stack. Build powerful AI applications locally with production-ready components including RAG systems, vector databases, model management, prompt engineering, and fine-tuning - all designed to work seamlessly together or independently.
Features:
- Local-First Development - Build and test entirely on your machine
- Production-Ready Components - Battle-tested modules that scale from laptop to cluster
- Strategy-Based Configuration - Smart defaults with infinite customization
- Deploy Anywhere - Same code runs locally, on-premises, or in any cloud
- Multi-Provider Support - Use cloud LLM providers or run your own models locally
- Complete RAG Pipeline - Document processing, embedding, and retrieval
🦙 LlamaFarm - Run your own AI anywhere
> Build powerful AI locally, extend anywhere.
LlamaFarm is an open-source framework for building retrieval-augmented and agentic AI applications. It ships with opinionated defaults (Ollama for local models, Chroma for vector storage) while staying 100% extendable: swap in vLLM, remote OpenAI-compatible hosts, new parsers, or custom stores without rewriting your app.
- Local-first developer experience with a single CLI (lf) that manages projects, datasets, and chat sessions.
- Production-ready architecture that mirrors server endpoints and enforces schema-based configuration.
- Composable RAG pipelines you can tailor through YAML, not bespoke code.
- Extendable everything: runtimes, embedders, databases, extractors, and CLI tooling.
Need the full walkthrough with dataset ingestion and troubleshooting tips? Jump to the Quickstart guide.
> Prefer building from source? Clone the repo and follow the steps in Development & Testing.
Run services manually (without Docker auto-start):
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm
# Install Nx globally and bootstrap the workspace
npm install -g nx
nx init --useDotNxInstallation --interactive=false
# Option 1: start both server and RAG worker with one command
nx dev
# Option 2: start services in separate terminals
# Terminal 1
nx start rag
# Terminal 2
nx start server
Open another terminal to run lf commands (installed or built from source). This is equivalent to what lf start orchestrates automatically.
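Once both services are running, the CLI talks to the local server. For example, assuming a project has already been initialized (via lf start or the Quickstart):

lf models list    # list the models configured in llamafarm.yaml
lf chat "Hello"   # send a quick prompt to the default model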
Why LlamaFarm
- Own your stack - Run small local models today and swap to hosted vLLM, Together, or custom APIs tomorrow by changing llamafarm.yaml.
- Battle-tested RAG - Configure parsers, extractors, embedding strategies, and databases without touching orchestration code.
- Config over code - Every project is defined by YAML schemas that are validated at runtime and easy to version control.
A typical dataset workflow with the lf CLI:
- Create a dataset - the CLI validates the chosen strategy and database against the project config.
- Upload files - lf datasets upload research-notes ./docs/*.pdf (supports globs and directories).
- Process the dataset - lf datasets process research-notes (streams heartbeat dots during long processing runs).
- Run a semantic query - lf rag query --database main_db "What did the 2024 FDA letters require?" (supports --filter, --include-metadata, and more).
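Put together, a minimal end-to-end session might look like this (reusing the example dataset name and paths from above; the dataset itself is assumed to already exist in the project):

lf datasets upload research-notes ./docs/*.pdf
lf datasets process research-notes
lf rag query --database main_db "What did the 2024 FDA letters require?"
lf chat "Summarize the 2024 FDA letters"   # optional: chat with the default model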
See the CLI reference for full command details and troubleshooting advice.
REST API
LlamaFarm provides a comprehensive REST API (compatible with OpenAI's format) for integrating with your applications. The API runs at http://localhost:8000.
Key Endpoints
Chat Completions (OpenAI-compatible)
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What are the FDA requirements?"}
    ],
    "stream": false,
    "rag_enabled": true,
    "database": "main_db"
  }'

# Upload file
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/datasets/{dataset}/data \
  -F "file=@document.pdf"

# Process dataset
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/datasets/{dataset}/process
Finding Your Namespace and Project
Check your llamafarm.yaml:
name: my-project # Your project name
namespace: my-org # Your namespace
Or inspect the file system: ~/.llamafarm/projects/{namespace}/{project}/
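For example, with the values above, the chat completions request from earlier would target the following URL (database name as in the earlier example):

curl -X POST http://localhost:8000/v1/projects/my-org/my-project/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What are the FDA requirements?"}],
    "stream": false,
    "rag_enabled": true,
    "database": "main_db"
  }'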
See the complete API Reference for all endpoints, request/response formats, Python/TypeScript clients, and examples.
Configuration Snapshot
llamafarm.yaml is the source of truth for each project. The schema enforces required fields and documents every extension point.
Multi-Model Configuration (Recommended)
version: v1
name: fda-assistant
namespace: default

runtime:
  default_model: fast # Which model to use by default
  models:
    fast:
      description: "Fast Ollama model"
      provider: ollama
      model: gemma3:1b
    powerful:
      description: "More capable Ollama model"
      provider: ollama
      model: qwen3:8b
    lemon:
      description: "Lemonade local model with NPU/GPU"
      provider: lemonade
      model: user.Qwen3-4B
      base_url: "http://127.0.0.1:11534/v1"
      lemonade:
        backend: llamacpp
        port: 11534
        context_size: 32768

prompts:
  - role: system
    content: >-
      You are an FDA specialist. Answer using short paragraphs and cite document titles when available.

rag:
  databases:
    - name: main_db
      type: ChromaStore
      default_embedding_strategy: default_embeddings
      default_retrieval_strategy: filtered_search
  embedding_strategies:
    - name: default_embeddings
      type: OllamaEmbedder
      config:
        model: nomic-embed-text:latest
  retrieval_strategies:
    - name: filtered_search
      type: MetadataFilteredStrategy
      config:
        top_k: 5
  data_processing_strategies:
    - name: pdf_ingest
      parsers:
        - type: PDFParser_LlamaIndex
          config:
            chunk_size: 1500
            chunk_overlap: 200
      extractors:
        - type: HeadingExtractor
        - type: ContentStatisticsExtractor

datasets:
  - name: research-notes
    data_processing_strategy: pdf_ingest
    database: main_db
Using your models:
lf models list # See all configured models
lf chat "Question" # Uses default model (fast)
lf chat --model powerful "Complex question" # Use specific model
lf chat --model lemon "Local GGUF model" # Use Lemonade model
> Note: Lemonade models require manual startup with nx start lemonade from the project root; the command automatically picks up its configuration from your llamafarm.yaml. In the future, Lemonade will run as a container and be auto-started. See the Lemonade Quickstart for setup.
Extension points:
- Swap runtimes by pointing to any OpenAI-compatible endpoint (vLLM, Mistral, Anyscale): update runtime.provider, base_url, and api_key, and regenerate schema types if you add a new provider enum (see the sketch after this list).
- Bring your own vector store by implementing a store backend, adding it to rag/schema.yaml, and updating the server service registry.
- Add parsers/extractors to support new file formats or metadata pipelines; register the implementations and extend the schema definitions.
- Extend the CLI with new Cobra commands under cli/cmd; the docs include guidance on adding dataset utilities or project tooling.
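As a sketch of the first point, a hosted OpenAI-compatible endpoint could be registered as another entry under runtime.models. The provider value, placeholder URL, and api_key handling below are assumptions; check the schema for the exact field names and enum values:

runtime:
  models:
    hosted:
      description: "Remote vLLM endpoint"       # hypothetical model entry
      provider: openai                          # assumption: consult the schema's provider enum
      model: my-served-model                    # whatever model the endpoint serves
      base_url: "https://vllm.example.com/v1"   # placeholder URL
      api_key: "sk-..."                         # supply your key per your deployment's conventions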
Run lf datasets and lf rag query commands from each example folder to reproduce the flows demonstrated in the docs.
Development & Testing
# Python server + RAG tests
cd server
uv sync
uv run --group test python -m pytest
# CLI tests
cd ../cli
go test ./...
# RAG tooling smoke tests
cd ../rag
uv sync
uv run python cli.py test
# Docs build (ensures navigation/link integrity)
cd ..
nx build docs
Linting: uv run ruff check --fix . (Python), go fmt ./... and go vet ./... (Go).
Community & Support
Discord - chat with the team, share feedback, and find collaborators.