Foundry Local is a powerful tool designed to bring the capabilities of Azure AI Foundry directly to your local device. It enables users to run large language models (LLMs) locally, ensuring all data processing occurs on-device for enhanced privacy and security.
**Key Features:**

- Runs generative AI models on local hardware without requiring an Azure subscription.
- Supports an OpenAI-compatible API for seamless integration with existing applications.
- Offers automatic hardware acceleration, leveraging CUDA GPUs, NPUs, or CPU fallback depending on your device.
- Maintains data privacy by keeping all processing local.

**Audience & Benefit:**
Ideal for developers and businesses looking to integrate AI capabilities without relying on cloud infrastructure. Foundry Local provides a simple, performant solution for running AI models locally, enabling enhanced privacy and reducing latency for real-time applications. It can be installed via winget for easy setup on supported platforms.
Foundry Local lets you embed generative AI directly into your applications — no cloud or server calls required. All inference runs on-device, which means user data never leaves the device, responses start immediately with zero network latency, and your app works offline. No per-token costs, no backend infrastructure to maintain.
Key benefits include:

- **Self-contained SDK** — Ship AI features without requiring users to install any external dependencies.
- **Chat and audio in one runtime** — Text generation and speech-to-text (Whisper) through a single SDK, with no need for separate tools like whisper.cpp and llama.cpp.
- **Easy-to-use CLI** — Explore models and experiment locally before integrating with your app.
- **Optimized models out of the box** — State-of-the-art quantization and compression deliver both performance and quality.
- **Small footprint** — Leverages ONNX Runtime, a high-performance inference runtime (written in C++) with minimal disk and memory requirements.
- **Automatic hardware acceleration** — Leverages GPUs and NPUs when available, with seamless fallback to CPU. No hardware-detection code needed.
- **Model distribution** — Popular open-source models hosted in the cloud with automatic downloading and updating.
- **Multi-platform support** — Windows, macOS (Apple silicon), Linux, and Android.
- **Bring your own models** — Add and run custom models alongside the built-in catalog.
## Supported Tasks

| Task | Model Aliases | API |
| --- | --- | --- |
| Chat / Text Generation | `phi-3.5-mini`, `qwen2.5-0.5b`, `qwen2.5-coder-0.5b`, etc. | |
> [!NOTE]
> Foundry Local is a unified local AI runtime — it replaces the need for separate tools like whisper.cpp, llama.cpp, or ollama. One SDK handles both chat and audio, with automatic hardware acceleration (NPU > GPU > CPU).
## 🚀 Quickstart
### Explore with the CLI
The Foundry Local CLI is a great way to explore models and test features interactively before integrating with your app.
The Foundry Local SDK makes it easy to integrate local AI models into your applications. Below are quickstart examples for JavaScript, C# and Python.
> [!TIP]
> The JavaScript and C# SDKs do not require the CLI to be installed. The Python SDK currently depends on the CLI, but a native in-process SDK is coming soon.
### JavaScript
Install the SDK using npm:
```shell
npm install foundry-local-sdk
```
> [!NOTE]
> On Windows, NPU models are not currently available for the JavaScript SDK. These will be enabled in a subsequent release.
Use the SDK in your application as follows:
```js
import { FoundryLocalManager } from 'foundry-local-sdk';

const manager = FoundryLocalManager.create({ appName: 'foundry_local_samples' });

// Download and load a model (auto-selects best variant for user's hardware)
const model = await manager.catalog.getModel('qwen2.5-0.5b');
await model.download((progress) => {
  process.stdout.write(`\rDownloading... ${progress.toFixed(2)}%`);
});
await model.load();

// Create a chat client and get a completion
const chatClient = model.createChatClient();
const response = await chatClient.completeChat([
  { role: 'user', content: 'What is the golden ratio?' }
]);
console.log(response.choices[0]?.message?.content);

// Unload the model when done
await model.unload();
```
### C#

On Windows, we recommend using the `Microsoft.AI.Foundry.Local.WinML` package, which enables wider hardware acceleration support.
Use the SDK in your application as follows:
```csharp
using Microsoft.AI.Foundry.Local;

var config = new Configuration { AppName = "foundry_local_samples" };
await FoundryLocalManager.CreateAsync(config);
var mgr = FoundryLocalManager.Instance;

// Download and load a model (auto-selects best variant for user's hardware)
var catalog = await mgr.GetCatalogAsync();
var model = await catalog.GetModelAsync("qwen2.5-0.5b");
await model.DownloadAsync();
await model.LoadAsync();

// Create a chat client and get a streaming completion
var chatClient = await model.GetChatClientAsync();
var messages = new List<ChatMessage>
{
    new() { Role = "user", Content = "What is the golden ratio?" }
};
await foreach (var chunk in chatClient.CompleteChatStreamingAsync(messages))
{
    Console.Write(chunk.Choices[0].Message.Content);
}

// Unload the model when done
await model.Unload();
```
### Python
> [!NOTE]
> The Python SDK currently relies on the Foundry Local CLI and uses the OpenAI-compatible REST API. A native in-process SDK (matching the JavaScript and C# SDKs) is coming soon.
Install the SDK using pip:
```shell
pip install foundry-local-sdk openai
```
Use the SDK in your application as follows:
```python
import openai
from foundry_local import FoundryLocalManager

# Initialize manager (starts local service and loads model)
manager = FoundryLocalManager("phi-3.5-mini")

# Use the OpenAI SDK pointed at your local endpoint
client = openai.OpenAI(base_url=manager.endpoint, api_key=manager.api_key)
response = client.chat.completions.create(
    model=manager.get_model_info("phi-3.5-mini").id,
    messages=[{"role": "user", "content": "What is the golden ratio?"}]
)
print(response.choices[0].message.content)
```
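Because the Python SDK drives an OpenAI-compatible REST endpoint, the request it ultimately sends is plain JSON. As a rough sketch of that wire format (the payload shape follows the standard OpenAI chat-completions request body; the model id shown here is just the alias from the example above, whereas the SDK resolves it to a concrete hardware-specific variant id):

```python
import json

def build_chat_request(model_id: str, user_text: str) -> dict:
    # Body of an OpenAI-compatible chat-completions request.
    # model_id is whatever get_model_info(...).id returns in the SDK sample.
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": user_text}],
    }

payload = build_chat_request("phi-3.5-mini", "What is the golden ratio?")
print(json.dumps(payload, indent=2))
```

Any HTTP client that can POST this body to the local endpoint's chat-completions route will work, which is why the `openai` package integrates without Foundry-specific glue.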
### More samples
Explore complete working examples in the `samples/` folder.
The SDK also supports audio transcription via Whisper models. Use `model.createAudioClient()` to transcribe audio files on-device:
```js
import { FoundryLocalManager } from 'foundry-local-sdk';

const manager = FoundryLocalManager.create({ appName: 'MyApp' });

// Download and load the Whisper model
const whisperModel = await manager.catalog.getModel('whisper-tiny');
await whisperModel.download();
await whisperModel.load();

// Transcribe an audio file
const audioClient = whisperModel.createAudioClient();
audioClient.settings.language = 'en';
const result = await audioClient.transcribe('recording.wav');
console.log('Transcription:', result.text);

// Or stream in real-time
await audioClient.transcribeStreaming('recording.wav', (chunk) => {
  process.stdout.write(chunk.text);
});

await whisperModel.unload();
```
> [!TIP]
> A single FoundryLocalManager can manage both chat and audio models simultaneously. See the chat-and-audio sample for a complete example that transcribes audio then analyzes it with a chat model.
## Manage
This section provides an overview of how to manage Foundry Local, including installation, upgrading, and removing the application.
### Installing
Foundry Local is available for Windows and macOS (Apple silicon only). You can install it using package managers or manually download the installer.
#### Windows
You can install Foundry Local using the following command in a Windows console (PowerShell, cmd, etc.):
```shell
winget install Microsoft.FoundryLocal
```
Alternatively, you can manually download and install the packages. On the releases page, select a release and expand the Artifacts list. Copy the artifact's full URI (for example: `https://github.com/microsoft/Foundry-Local/releases/download/v0.3.9267/FoundryLocal-x64-0.3.9267.43123.msix`) to use in the PowerShell steps below. Replace `x64` with `arm64` as needed.
```powershell
# Download the package and its dependency
$releaseUri = "https://github.com/microsoft/Foundry-Local/releases/download/v0.3.9267/FoundryLocal-x64-0.3.9267.43123.msix"
Invoke-WebRequest -Method Get -Uri $releaseUri -OutFile .\FoundryLocal.msix
$crtUri = "https://aka.ms/Microsoft.VCLibs.x64.14.00.Desktop.appx"
Invoke-WebRequest -Method Get -Uri $crtUri -OutFile .\VcLibs.appx

# Install the Foundry Local package
Add-AppxPackage .\FoundryLocal.msix -DependencyPath .\VcLibs.appx
```
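If you script the manual download, it can help to confirm the artifact arrived intact before running `Add-AppxPackage`, since a truncated download is a common cause of confusing install failures. A minimal sketch in Python (the file name matches the download step above; no official checksum list is assumed here, so compare the digest against whatever source you trust):

```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Only attempt the check if the package has actually been downloaded.
if os.path.exists("FoundryLocal.msix"):
    print(sha256_of("FoundryLocal.msix"))
```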
If you're having problems installing Foundry Local, please file an issue and include logs using one of these methods:

- **WinGet**: run `winget install Microsoft.FoundryLocal --logs --verbose`, then select the most recently dated log file and attach it to the issue.
- **Add-AppxPackage**: immediately after it reports an error, run `Get-MsixLogs | Out-File MsixLogs.txt` in an elevated PowerShell instance and attach the resulting file to the issue.
- **Windows Feedback Hub**: create a Problem in the "Apps > All other apps" category. Use "Add more details > Recreate my problem" and re-run the failing commands to collect more data. Once your feedback is submitted, use the "Share" option to generate a link and include it in the filed issue.
> [!NOTE]
> Log files may contain information like user names, IP addresses, file paths, etc. Be sure to remove those
> before sharing here.
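Before attaching logs, it can save time to scrub the most common identifiers automatically. A minimal sketch, assuming only IPv4 addresses and Windows user-profile paths need masking (the patterns are illustrative and not exhaustive, so review the output by hand before sharing):

```python
import re

def scrub_log(text: str) -> str:
    # Mask IPv4 addresses.
    text = re.sub(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", "[REDACTED-IP]", text)
    # Mask the user name in Windows profile paths (C:\Users\<name>\...).
    text = re.sub(r"(?i)(C:\\Users\\)[^\\\s]+", r"\g<1>[REDACTED-USER]", text)
    return text

sample = r"Install failed for C:\Users\alice\Downloads\FoundryLocal.msix at 192.168.1.10"
print(scrub_log(sample))
```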
#### macOS
Install Foundry Local using the following command in your terminal:
```shell
brew install microsoft/foundrylocal/foundrylocal
```
Alternatively, you can also manually download and install the packages by following these steps:
### Uninstalling

**Windows**: You can uninstall Foundry Local using winget in a Windows console (PowerShell, cmd, etc.):

```shell
winget uninstall Microsoft.FoundryLocal
```

Alternatively, navigate to Settings > Apps > Apps & features in Windows, find "Foundry Local" in the list, and select the ellipsis (...) followed by Uninstall.

**macOS**: If you installed Foundry Local using Homebrew, you can uninstall it with the following command:
**Foundry Local Lab**: This GitHub repository contains a lab designed to help you learn to use Foundry Local effectively, with hands-on exercises, sample code, and step-by-step instructions for setting up and using Foundry Local in various scenarios.
## ⚖️ License
Foundry Local is licensed under the Microsoft Software License Terms. For more details, read the LICENSE file.