Foundry Local Microsoft Corporation

agent ai artificial-intelligence cpu cuda gpu inference llm machine-learning ml npu slm

Use this command to install Foundry Local:

winget install --id=Microsoft.FoundryLocal -e

Foundry Local is a powerful tool designed to bring the capabilities of Azure AI Foundry directly to your local device. It enables users to run large language models (LLMs) locally, ensuring all data processing occurs on-device for enhanced privacy and security.

Key Features:

Runs generative AI models on local hardware without requiring an Azure subscription.
Supports an OpenAI-compatible API for seamless integration with existing applications.
Offers automatic hardware acceleration optimization, leveraging CUDA GPUs, NPUs, or CPU fallback depending on your device.
Maintains data privacy by keeping all processing local.

Audience & Benefit: Ideal for developers and businesses looking to integrate AI capabilities without relying on cloud infrastructure. Foundry Local provides a simple, performant solution for running AI models locally, enabling enhanced privacy and reducing latency for real-time applications. It can be installed via winget for easy setup on supported platforms.

README

  <h1>Foundry Local</h1>
 <h3><a href="https://aka.ms/foundry-local-installer">Download</a> | <a href="https://aka.ms/foundry-local-docs">Documentation</a> | <a href="https://aka.ms/foundry-local-discord">Discord</a></h3>

👋 Welcome to Foundry Local

Foundry Local brings the power of Azure AI Foundry to your local device without requiring an Azure subscription. It allows you to:

Run Generative AI models directly on your local hardware - no sign-up required.
Keep all data processing on-device for enhanced privacy and security
Integrate models with your applications through an OpenAI-compatible API
Optimize performance using ONNX Runtime and hardware acceleration

🚀 Quickstart

Install Foundry Local:
- Windows: Install Foundry Local for your architecture (x64 or arm64):
```
  winget install Microsoft.FoundryLocal
```

MacOS: Open a terminal and run the following command: bash brew install microsoft/foundrylocal/foundrylocal Alternatively, you can download the installers from the releases page and follow the on-screen installation instructions.

> [!TIP] > For any issues, refer to the Installation section below.

Run your first model: Open a terminal and run the following command to run a model:
```
foundry model run phi-3.5-mini
```

> [!NOTE] > The foundry model run command will automatically download the model if it's not already cached on your local machine, and then start an interactive chat session with the model.

Foundry Local will automatically select and download a model variant with the best performance for your hardware. For example:

if you have an Nvidia CUDA GPU, it will download the CUDA-optimized model.
if you have a Qualcomm NPU, it will download the NPU-optimized model.
if you don't have a GPU or NPU, Foundry local will download the CPU-optimized model.

🔍 Explore available models

using Microsoft.AI.Foundry.Local; using Betalgo.Ranul.OpenAI.ObjectModels.RequestModels; using Microsoft.Extensions.Logging; CancellationToken ct = new CancellationToken(); var config = new Configuration { AppName = "my-app-name", LogLevel = Microsoft.AI.Foundry.Local.LogLevel.Debug }; using var loggerFactory = LoggerFactory.Create(builder => { builder.SetMinimumLevel(Microsoft.Extensions.Logging.LogLevel.Debug); }); var logger = loggerFactory.CreateLogger(); // Initialize the singleton instance. await FoundryLocalManager.CreateAsync(config, logger); var mgr = FoundryLocalManager.Instance; // Get the model catalog var catalog = await mgr.GetCatalogAsync(); // List available models Console.WriteLine("Available models for your hardware:"); var models = await catalog.ListModelsAsync(); foreach (var availableModel in models) { foreach (var variant in availableModel.Variants) { Console.WriteLine($" - Alias: {variant.Alias} (Id: {string.Join(", ", variant.Id)})"); } } // Get a model using an alias var model = await catalog.GetModelAsync("qwen2.5-0.5b") ?? throw new Exception("Model not found"); // is model cached Console.WriteLine($"Is model cached: {await model.IsCachedAsync()}"); // print out cached models var cachedModels = await catalog.GetCachedModelsAsync(); Console.WriteLine("Cached models:"); foreach (var cachedModel in cachedModels) { Console.WriteLine($"- {cachedModel.Alias} ({cachedModel.Id})"); } // Download the model (the method skips download if already cached) await model.DownloadAsync(progress => { Console.Write($"\rDownloading model: {progress:F2}%"); if (progress >= 100f) { Console.WriteLine(); } }); // Load the model await model.LoadAsync(); // Get a chat client var chatClient = await model.GetChatClientAsync(); // Create a chat message List messages = new() { new ChatMessage { Role = "user", Content = "Why is the sky blue?" } }; var streamingResponse = chatClient.CompleteChatStreamingAsync(messages, ct); await foreach (var chunk in streamingResponse) { Console.Write(chunk.Choices[0].Message.Content); Console.Out.Flush(); } Console.WriteLine(); // Tidy up - unload the model await model.UnloadAsync();

Foundry Local Microsoft Corporation

README

👋 Welcome to Foundry Local

🚀 Quickstart

🔍 Explore available models

🧑‍💻 Integrate with your applications using the SDK

C#

Python

JavaScript

Manage

Installing

Windows

macOS

Upgrading

Uninstalling

Features & Use Cases

Reporting Issues

🎓 Learn

⚖️ License