
UI TARS ByteDance

Use this command to install UI TARS:
winget install --id=ByteDance.UI-TARS -e

A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

UI TARS is a GUI Agent application designed to enable control of your computer using natural language. This innovative tool leverages the UI-TARS Vision-Language Model to interpret user commands and perform actions on the desktop interface.

Key Features:

  • Real-time processing of visual and textual inputs
  • Integration with UI-TARS for accurate command execution
  • Support across multiple applications and languages
  • Ability to perform tasks such as launching applications, managing files, and automating workflows

Ideal for tech-savvy users and developers seeking efficient automation and control over their computing environment. The application provides a seamless way to execute complex operations through simple natural language commands.

Installation is straightforward via winget.
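
For scripted setups, the install command above can be wrapped in a small helper. The following is a minimal sketch, not part of UI TARS itself: the function names `build_install_command` and `install` are hypothetical, and it assumes `winget` is available on PATH. It uses only the `--id`, `-e`, and `--version` options of `winget install`, and the version `0.2.0` is taken from the Versions list on this page.

```python
import shutil
import subprocess
from typing import List, Optional

# Package ID taken from the winget install command shown above.
PACKAGE_ID = "ByteDance.UI-TARS"


def build_install_command(version: Optional[str] = None) -> List[str]:
    """Return the winget argument list, optionally pinned to one version."""
    cmd = ["winget", "install", f"--id={PACKAGE_ID}", "-e"]
    if version is not None:
        # --version asks winget for a specific release instead of the latest.
        cmd += ["--version", version]
    return cmd


def install(version: Optional[str] = None) -> int:
    """Run winget and return its exit code; raise if winget is not on PATH."""
    if shutil.which("winget") is None:
        raise RuntimeError("winget was not found on PATH")
    return subprocess.run(build_install_command(version), check=False).returncode


if __name__ == "__main__":
    # Only print the command here; call install("0.2.0") to actually run it.
    print(" ".join(build_install_command("0.2.0")))
```

Run as a script, this prints `winget install --id=ByteDance.UI-TARS -e --version 0.2.0` without installing anything. Keeping command construction separate from execution makes it possible to inspect the command before running it.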

README

> [!IMPORTANT]
> [2025-03-18] We released a technical preview version of a new desktop app - Agent TARS, a multimodal AI agent that leverages browser operations by visually interpreting web pages and seamlessly integrating with command lines and file systems.

UI-TARS Desktop

UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

📑 Paper | 🤗 Hugging Face Models | 🫨 Discord | 🤖 ModelScope

🖥️ Desktop Application | 👓 Midscene (use in browser)

Showcases

Instruction | Local Operator | Remote Operator
Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting.
Could you help me check the latest open issue of the UI-TARS-Desktop project on GitHub?

News

  • [2025-06-12] - 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features: Remote Computer Operator and Remote Browser Operator—both completely free. No configuration required: simply click to remotely control any computer or browser, and experience a new level of convenience and intelligence.
  • [2025-04-17] - 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application enhances the computer use experience, introduces new browser operation features, and supports the advanced UI-TARS-1.5 model for improved performance and precise control.
  • [2025-02-20] - 📦 Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
  • [2025-01-23] - 🚀 We updated the Cloud Deployment section in the Chinese-language GUI Model Deployment Tutorial (中文版: GUI模型部署教程) with new information on the ModelScope platform, which you can now use for deployment.

Features

  • 🤖 Natural language control powered by Vision-Language Model
  • 🖥️ Screenshot and visual recognition support
  • 🎯 Precise mouse and keyboard control
  • 💻 Cross-platform support (Windows/macOS/Browser)
  • 🔄 Real-time feedback and status display
  • 🔐 Private and secure - fully local processing
  • 🛠️ Effortless setup and intuitive remote operators

Quick Start

See Quick Start.

Deployment

See Deployment.

Contributing

See CONTRIBUTING.md.

SDK (Experimental)

See @ui-tars/sdk

License

UI-TARS Desktop is licensed under the Apache License 2.0.

Citation

If you find our paper and code useful in your research, please consider giving us a star ⭐ and a citation 📝

@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}
Versions
0.2.0
0.1.3
0.1.2
0.1.1
0.1.0
0.0.9
0.0.8
0.0.7
0.0.6