A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
UI-TARS Desktop is a GUI Agent application that lets you control your computer using natural language. It uses the UI-TARS Vision-Language Model to interpret user commands and perform actions on the desktop interface.
Key Features:
Real-time processing of visual and textual inputs
Integration with UI-TARS for accurate command execution
Support across multiple applications and languages
Ability to perform tasks such as launching applications, managing files, and automating workflows
It is aimed at developers and technical users who want efficient automation and control over their computing environment, and it lets them run complex operations through simple natural language commands.
TARS is a Multimodal AI Agent stack that currently ships two projects, Agent TARS and UI-TARS-desktop:
<a href="#agent-tars">Agent TARS</a>
<a href="#ui-tars-desktop">UI-TARS-desktop</a>
<b>Agent TARS</b> is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product.
It ships primarily as a <a href="https://agent-tars.com/guide/basic/cli.html">CLI</a> and a <a href="https://agent-tars.com/guide/basic/web-ui.html">Web UI</a>.
It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world <a href="https://agent-tars.com/guide/basic/mcp.html">MCP</a> tools.
<b>UI-TARS Desktop</b> is a desktop application that provides a native GUI Agent based on the <a href="https://github.com/bytedance/UI-TARS">UI-TARS</a> model.
It primarily ships
<a href="https://github.com/bytedance/UI-TARS-desktop/blob/main/docs/quick-start.md#get-model-and-run-local-operator">local</a> and
<a href="https://github.com/bytedance/UI-TARS-desktop/blob/main/docs/quick-start.md#run-remote-operator">remote</a> computer operators, as well as browser operators.
[2025-06-25] We released Agent TARS Beta and Agent TARS CLI. Agent TARS Beta is a multimodal AI agent that aims to explore a work form closer to human-like task completion through rich multimodal capabilities (such as GUI Agent and Vision) and seamless integration with various real-world tools.
[2025-06-12] - 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features: Remote Computer Operator and Remote Browser Operator—both completely free. No configuration required: simply click to remotely control any computer or browser, and experience a new level of convenience and intelligence.
[2025-04-17] - 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application improves the computer-use experience, introduces new browser operation features, and supports the advanced UI-TARS-1.5 model for improved performance and precise control.
[2025-02-20] - 📦 Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
[2025-01-23] - 🚀 We updated the Cloud Deployment section of the Chinese-language GUI Model Deployment Tutorial (GUI模型部署教程) with new information on the ModelScope platform. You can now use the ModelScope platform for deployment.
Agent TARS
Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product.
It ships primarily as a CLI and a Web UI.
It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools.
Showcase
Please help me book the earliest flight from San Jose to New York on September 1st and the last return flight on September 6th on Priceline
Booking Hotel
<b>Instruction:</b> <i>I am in Los Angeles from September 1st to September 6th, with a budget of $5,000. Please help me book a Ritz-Carlton hotel closest to the airport on booking.com and compile a transportation guide for me</i>
Generate Chart with extra MCP Servers
<b>Instruction:</b> <i>Draw me a chart of Hangzhou's weather for one month</i>
🧰 MCP Integration - The kernel is built on MCP and also supports mounting MCP Servers to connect to real-world tools.
Quick Start
# Launch with `npx`.
npx @agent-tars/cli@latest
# Install globally (requires Node.js >= 22)
npm install @agent-tars/cli@latest -g
# Run with your preferred model provider
agent-tars --provider volcengine --model doubao-1-5-thinking-vision-pro-250428 --apiKey your-api-key
agent-tars --provider anthropic --model claude-3-7-sonnet-latest --apiKey your-api-key
Visit the comprehensive Quick Start guide for detailed setup instructions.
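As a pre-flight check for the commands above, the following is a minimal shell sketch and not part of the official CLI. It verifies the Node.js >= 22 requirement and reads the API key from an environment variable instead of typing it inline. The function names (`node_major`, `node_is_supported`, `launch_agent_tars`) and the `AGENT_TARS_API_KEY` variable are hypothetical; only the `--provider`, `--model`, and `--apiKey` flags come from the commands shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of @agent-tars/cli.

# Print the major version from a string such as "v22.3.0".
node_major() {
  version="${1#v}"            # drop the leading "v" if present
  echo "${version%%.*}"       # keep the digits before the first dot
}

# Return success when the given version string satisfies Node.js >= 22.
node_is_supported() {
  [ "$(node_major "$1")" -ge 22 ]
}

# Launch Agent TARS with a provider ($1) and model ($2), reading the key from
# AGENT_TARS_API_KEY (an assumed variable name, not one the CLI defines).
launch_agent_tars() {
  if ! node_is_supported "$(node --version)"; then
    echo "Agent TARS requires Node.js >= 22" >&2
    return 1
  fi
  agent-tars --provider "$1" --model "$2" --apiKey "$AGENT_TARS_API_KEY"
}
```

After exporting the key, a call such as `launch_agent_tars anthropic claude-3-7-sonnet-latest` would run the same command as the Quick Start example, provided the installed Node.js version passes the check.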
Documentation
> 🌟 Explore Agent TARS Universe 🌟
<table>
<thead>
<tr><th>Category</th><th>Resource Link</th><th>Description</th></tr>
</thead>
<tbody>
<tr>
<td>🏠 <strong>Central Hub</strong></td>
<td><a href="https://agent-tars.com"><img src="https://img.shields.io/badge/Visit-Website-4F46E5?style=for-the-badge&logo=globe&logoColor=white" /></a></td>
<td>Your gateway to the Agent TARS ecosystem</td>
</tr>
<tr>
<td>📚 <strong>Quick Start</strong></td>
<td><a href="https://agent-tars.com/guide/get-started/quick-start.html"><img src="https://img.shields.io/badge/Get-Started-06B6D4?style=for-the-badge&logo=rocket&logoColor=white" /></a></td>
<td>Zero to hero in 5 minutes</td>
</tr>
<tr>
<td>🚀 <strong>What's New</strong></td>
<td><a href="https://agent-tars.com/beta"><img src="https://img.shields.io/badge/Read-Blog-F59E0B?style=for-the-badge&logo=rss&logoColor=white" /></a></td>
<td>Discover cutting-edge features & vision</td>
</tr>
<tr>
<td>🛠️ <strong>Developer Zone</strong></td>
<td><a href="https://agent-tars.com/guide/get-started/introduction.html"><img src="https://img.shields.io/badge/View-Docs-10B981?style=for-the-badge&logo=gitbook&logoColor=white" /></a></td>
<td>Master every command and feature</td>
</tr>
<tr>
<td>🎯 <strong>Showcase</strong></td>
<td><a href="https://github.com/bytedance/UI-TARS-desktop/issues/842"><img src="https://img.shields.io/badge/View-Examples-8B5CF6?style=for-the-badge&logo=github&logoColor=white" /></a></td>
<td>View use cases built by the official team and the community</td>
</tr>
<tr>
<td>🔧 <strong>Reference</strong></td>
<td><a href="https://agent-tars.com/api/"><img src="https://img.shields.io/badge/API-Reference-EF4444?style=for-the-badge&logo=book&logoColor=white" /></a></td>
<td>Complete technical reference</td>
</tr>
</tbody>
</table>
UI-TARS Desktop
UI-TARS Desktop is a native GUI agent for your local computer, driven by UI-TARS and Seed-1.5-VL/1.6 series models.
This project is licensed under the Apache License 2.0.
Citation
If you find our paper and code useful in your research, please consider giving the project a star :star: and citing it :pencil:
@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}