Foundry Local brings the power of Azure AI Foundry to your local device. It allows you to run large language models (LLMs) directly on your hardware and keeps all data processing on-device for enhanced privacy and security. It also provides an OpenAI-compatible API and hardware acceleration out of the box, giving you a simple, performant experience regardless of what hardware you are using.
Run your first model: open a terminal and run the following command:

```shell
foundry model run phi-3.5-mini
```
> [!NOTE]
> The `foundry model run` command will automatically download the model if it's not already cached on your local machine, and then start an interactive chat session with the model.
Foundry Local will automatically select and download a model variant with the best performance for your hardware. For example:
- If you have an NVIDIA CUDA GPU, it will download the CUDA-optimized model.
- If you have a Qualcomm NPU, it will download the NPU-optimized model.
- If you have neither a GPU nor an NPU, Foundry Local will download the CPU-optimized model.
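The priority order above can be sketched as a small function. This is a hypothetical illustration of the selection logic, not Foundry Local's actual implementation:

```python
def pick_variant(has_cuda_gpu: bool, has_npu: bool) -> str:
    """Illustrates the model-variant selection priority described above."""
    if has_cuda_gpu:
        return "cuda"  # CUDA-optimized model
    if has_npu:
        return "npu"   # NPU-optimized model
    return "cpu"       # CPU fallback

print(pick_variant(has_cuda_gpu=True, has_npu=False))  # cuda
```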
Cursor is a new, intelligent IDE built on VS Code and empowered by seamless AI integrations. It is quick to learn and can make you extraordinarily productive.
LM Studio is an easy-to-use desktop app for experimenting with local and open-source Large Language Models (LLMs). The cross-platform desktop app allows you to download and run any ggml-compatible model from Hugging Face, and provides a simple yet powerful model configuration and inferencing UI. The app leverages your GPU when possible.
Dive is an open-source AI agent desktop application that seamlessly integrates any tool-call-capable LLM with frontend MCP servers, as part of the Open Agent Platform initiative. ✨
Features:
- 🌐 Universal LLM Support: Compatible with ChatGPT, Anthropic, Ollama and OpenAI-compatible models
- 💻 Cross-Platform: Available for Windows, MacOS, and Linux
- 🔄 Model Context Protocol: Enabling seamless AI agent integration
- 🔌 MCP Server Integration: External data access and processing capabilities
- 🌍 Multi-Language Support: Traditional Chinese, English, with more coming soon
- ⚙️ Advanced API Management: Multiple API keys and model switching support
- 💡 Custom Instructions: Personalized system prompts for tailored AI behavior
- 💬 Intuitive Chat Interface: Real-time context management and user-friendly design
- 🚀 Upcoming Features: Prompt Schedule and OpenAgentPlatform MarketPlace
AI as Workspace - A better AI (LLM) client. Full-featured yet lightweight, it supports multiple workspaces, a plugin system, cross-platform use, local-first storage with real-time cloud sync, Artifacts, and MCP.
Features:
Consistent Experience Across All Platforms
- Supported platforms: Windows, Linux, Mac OS, Android, Web (PWA)
- Multiple AI providers: OpenAI, Anthropic, Google, DeepSeek, xAI, Azure, etc.
Conversation Interface
- User input preview
- Modifications and regenerations presented as branches
- Customizable keyboard shortcuts
- Quick scrolling to the beginning/end of a message
Multiple Workspaces
- Create multiple workspaces to separate conversations by themes
- Group workspaces into folders; supports nesting
- Create multiple assistants within a workspace or global assistants
Data Storage
- Data is stored locally first, accessible offline and loads instantly
- Cloud synchronization available after login for cross-device syncing
- Multi-window collaboration: open multiple tabs in the same browser with responsive data synchronization
Design Details
- Support for text files (code, csv, etc.) as attachments; AI can see file contents and names without occupying display space
- For large text blocks, use Ctrl + V outside the input box to paste as an attachment; prevents large content from cluttering the display
- Quote content from previous messages to user inputs for targeted follow-up questions
- Select multiple lines of message text to copy the original Markdown
- Automatically wrap code pasted from VSCode in code blocks with language specification
MCP Protocol
- Support for MCP Tools, Prompts, Resources
- STDIO and SSE connection methods
- Install MCP-type plugins from the plugin marketplace or manually add MCP servers
Artifacts
- Convert any part of assistant responses into Artifacts
- User-editable with version control and code highlighting
- Control assistant read/write permissions for Artifacts
- Open multiple Artifacts simultaneously
Plugin System
- Built-in calculator, document parsing, video parsing, image generation plugins
- Install additional plugins from the marketplace
- Configure Gradio applications as plugins; compatible with some LobeChat plugins
- Plugins are more than just tool calling
Lightweight and High Performance
- Quick startup with no waiting
- Smooth conversation switching
Dynamic Prompts
- Create prompt variables using template syntax for dynamic, reusable prompts
- Extract repetitive parts into workspace variables for prompt reusability
Additional Features
Assistant marketplace, dark mode, customizable theme colors, and more
- Runs generative AI models on local hardware without requiring an Azure subscription.
- Supports an OpenAI-compatible API for seamless integration with existing applications.
- Offers automatic hardware acceleration optimization, leveraging CUDA GPUs, NPUs, or CPU fallback depending on your device.
- Maintains data privacy by keeping all processing local.
Audience & Benefit:
Ideal for developers and businesses looking to integrate AI capabilities without relying on cloud infrastructure. Foundry Local provides a simple, performant solution for running AI models locally, enabling enhanced privacy and reducing latency for real-time applications. It can be installed via winget for easy setup on supported platforms.
🔍 Explore available models
You can list all available models by running the following command:

```shell
foundry model ls
```
This will show you a list of all models that can be run locally, including their names, sizes, and other details.
🧑‍💻 Integrate with your applications using the SDK
Foundry Local has an easy-to-use SDK (Python, JavaScript) to get you started with existing applications:
Python
The Python SDK is available as a package on PyPI. You can install it using pip:

```shell
pip install foundry-local-sdk
pip install openai
```
> [!TIP]
> We recommend using a virtual environment such as conda or venv to avoid conflicts with other packages.
Foundry Local provides an OpenAI-compatible API that you can call from any application:
```python
import openai
from foundry_local import FoundryLocalManager

# By using an alias, the most suitable model will be downloaded
# to your end-user's device.
alias = "phi-3.5-mini"

# Create a FoundryLocalManager instance. This will start the Foundry
# Local service if it is not already running and load the specified model.
manager = FoundryLocalManager(alias)

# The remaining code uses the OpenAI Python SDK to interact with the local model.

# Configure the client to use the local Foundry service
client = openai.OpenAI(
    base_url=manager.endpoint,
    api_key=manager.api_key  # API key is not required for local usage
)

# Set the model to use and generate a streaming response
stream = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[{"role": "user", "content": "What is the golden ratio?"}],
    stream=True
)

# Print the streaming response
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
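The streaming loop above only ever sees incremental text deltas, some of which may be `None`. The accumulation pattern can be tried in isolation with stub chunks standing in for the service response (hypothetical data, no running service required):

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Accumulate the text deltas from an OpenAI-style chat stream."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # role-only or empty chunks carry no text
            parts.append(delta)
    return "".join(parts)


def make_chunk(text):
    """Build a stub object shaped like an OpenAI streaming chunk."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


chunks = [
    make_chunk("The golden "),
    make_chunk("ratio is "),
    make_chunk(None),       # e.g. a chunk with no content delta
    make_chunk("~1.618."),
]
print(collect_stream(chunks))  # The golden ratio is ~1.618.
```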
JavaScript
The JavaScript SDK is available as a package on npm. You can install it using npm:

```shell
npm install foundry-local-sdk
npm install openai
```
```javascript
import { OpenAI } from "openai";
import { FoundryLocalManager } from "foundry-local-sdk";

// By using an alias, the most suitable model will be downloaded
// to your end-user's device.
// TIP: You can find a list of available models by running the
// following command in your terminal: `foundry model list`.
const alias = "phi-3.5-mini";

// Create a FoundryLocalManager instance. This will start the Foundry
// Local service if it is not already running.
const foundryLocalManager = new FoundryLocalManager();

// Initialize the manager with a model. This will download the model
// if it is not already present on the user's device.
const modelInfo = await foundryLocalManager.init(alias);
console.log("Model Info:", modelInfo);

const openai = new OpenAI({
  baseURL: foundryLocalManager.endpoint,
  apiKey: foundryLocalManager.apiKey,
});

async function streamCompletion() {
  const stream = await openai.chat.completions.create({
    model: modelInfo.id,
    messages: [{ role: "user", content: "What is the golden ratio?" }],
    stream: true,
  });

  for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) {
      process.stdout.write(chunk.choices[0].delta.content);
    }
  }
}

streamCompletion();
```
Manage
This section provides an overview of how to manage Foundry Local, including installation, upgrading, and removing the application.
Installing
Foundry Local is available for Windows and macOS (Apple silicon only). You can install it using package managers or manually download the installer.
Windows
You can install Foundry Local using the following command in a Windows console (PowerShell, cmd, etc.):

```shell
winget install Microsoft.FoundryLocal
```
Alternatively, you can download and install the packages manually. On the releases page, select a release and expand the Artifacts list. Copy the full artifact URI (for example: `https://github.com/microsoft/Foundry-Local/releases/download/v0.3.9267/FoundryLocal-x64-0.3.9267.43123.msix`) to use in the PowerShell steps below. Replace `x64` with `arm64` as needed.
```powershell
# Download the package and its dependency
$releaseUri = "https://github.com/microsoft/Foundry-Local/releases/download/v0.3.9267/FoundryLocal-x64-0.3.9267.43123.msix"
Invoke-WebRequest -Method Get -Uri $releaseUri -OutFile .\FoundryLocal.msix
$crtUri = "https://aka.ms/Microsoft.VCLibs.x64.14.00.Desktop.appx"
Invoke-WebRequest -Method Get -Uri $crtUri -OutFile .\VcLibs.appx

# Install the Foundry Local package
Add-AppxPackage .\FoundryLocal.msix -DependencyPath .\VcLibs.appx
```
If you're having problems installing Foundry Local, please file an issue and include logs using one of these methods:
- For WinGet: use `winget install Microsoft.FoundryLocal --logs --verbose`, select the most recently dated log file, and attach it to the issue.
- For `Add-AppxPackage`: immediately after it indicates an error, in an elevated PowerShell instance, run `Get-MsixLogs | Out-File MsixLogs.txt` and attach the file to the issue.
- Use Windows Feedback Hub and create a Problem in the "Apps > All other apps" category. Use "Add More Details > Recreate my problem" and re-run the failing commands to collect more data. Once your feedback is submitted, use the "Share" option to generate a link and include it in the filed issue.
> [!NOTE]
> Log files may contain information like user names, IP addresses, file paths, etc. Be sure to remove those
> before sharing here.
macOS
Install Foundry Local using the following commands in your terminal:

```shell
brew tap microsoft/foundrylocal
brew install foundrylocal
```
Alternatively, you can download and install the packages manually from the releases page.
Uninstalling
Windows: You can uninstall Foundry Local using winget in a Windows console (PowerShell, cmd, etc.):

```shell
winget uninstall Microsoft.FoundryLocal
```
Alternatively, you can also uninstall Foundry Local by navigating to Settings > Apps > Apps & features in Windows, finding "Foundry Local" in the list, and selecting the ellipsis (...) followed by Uninstall.
macOS: If you installed Foundry Local using Homebrew, you can uninstall it and remove the tap with the following standard Homebrew commands:

```shell
brew uninstall foundrylocal
brew untap microsoft/foundrylocal
```
HyperChat is an open chat client that can use various LLM APIs to provide the best chat experience and implement productivity tools through the MCP protocol.
- Supports OpenAI-style LLMs: OpenAI, Claude (via OpenRouter), Qwen, DeepSeek, GLM, Ollama.
- Built-in MCP plugin market with user-friendly MCP installation configuration and one-click installation; submissions to HyperChatMCP are welcome.
- Also supports manual installation of third-party MCPs; simply fill in command, args, and env.
Large Language Model (LLM)-based AI bots are amazing. However, their behavior can be random, and different bots excel at different tasks. If you want the best experience, don't try them one by one. ChatALL (Chinese name: 齐叨) can send a prompt to several AI bots concurrently, helping you discover the best results. All you need to do is download, install, and ask.
Are you tired of using chatbots that invade your privacy and store your data indefinitely? Look no further! My DxGPTAi is here to provide you with a secure and reliable chatbot experience. 💬 With DxGPTAi, you can enjoy conversations without worrying about your data being mishandled. Our platform is designed to delete all temporarily stored information after shutdown, ensuring your privacy is protected to the fullest.
🎙️ And that's not all! We've also added a microphone transcription feature to make your experience even more convenient. With just a few clicks, you can chat with your ChatGPT using your voice instead of typing.
🔥 Plus, we've included a range of other cool features to make your chatbot experience even better. Download our free API from the official ChatGPT website and start chatting today!
An open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 4 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Plugins/Artifacts) and Thinking.
Transformer Lab is a free, open-source LLM workspace that you can run on your own computer. It is designed to go beyond what most modern open LLM applications allow. Using Transformer Lab you can easily finetune, evaluate, export and test LLMs across different inference engines and platforms.