Desktop Automation in the Cloud

Control cloud desktops programmatically: mouse, keyboard, browser, and apps. Stream sessions over VNC, capture screenshots, and record the screen. Built for RPA, testing, and AI agents.

  • No credit card required
  • Firefox & Chromium included
  • VNC streaming

Beyond Headless Browsers

Sometimes you need more than a headless browser. You need a full desktop:

  • Sites that detect and block headless browsers
  • Desktop apps that have no web equivalent
  • Visual testing that needs actual rendering
  • Complex multi-window workflows

Hopx Desktop Sandboxes

  • Full Linux desktop with real display
  • Control mouse, keyboard, clipboard via API
  • Firefox & Chromium for web automation
  • VNC streaming for live monitoring

What You Can Automate

Web Scraping with JS

Scrape sites that require JavaScript rendering

Browser Testing

End-to-end tests with real browser interactions

RPA Workflows

Automate repetitive desktop tasks at scale

Screenshot Services

Generate screenshots of websites or apps

Form Filling

Automate form submissions across websites

Visual Regression

Compare screenshots to detect UI changes
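
As a rough illustration of the visual regression idea, here is a minimal sketch that compares two screenshots pixel by pixel. It assumes both images have already been decoded into raw RGB byte buffers of the same size; `changed_fraction` and its `tolerance` parameter are hypothetical names for this example, not part of any Hopx API.

```python
def changed_fraction(before: bytes, after: bytes, width: int, height: int,
                     tolerance: int = 8) -> float:
    """Return the fraction of pixels where any RGB channel differs by more than `tolerance`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    expected = width * height * 3  # 3 bytes (R, G, B) per pixel
    if len(before) != expected or len(after) != expected:
        raise ValueError("both buffers must hold width * height RGB pixels")

    changed = 0
    for i in range(0, expected, 3):
        # A pixel counts as changed if any of its channels moved past the tolerance.
        if any(abs(before[i + c] - after[i + c]) > tolerance for c in range(3)):
            changed += 1
    return changed / (width * height)


# Usage: a 2x2 white baseline versus a copy with one black pixel.
baseline = bytes([255, 255, 255] * 4)
candidate = bytearray(baseline)
candidate[0:3] = bytes([0, 0, 0])
ratio = changed_fraction(baseline, bytes(candidate), 2, 2)
print(f"{ratio:.0%} of pixels changed")
```

A small tolerance keeps anti-aliasing and compression noise from being reported as UI changes; a real pipeline would typically fail the check only when the fraction exceeds a chosen threshold.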

Full Desktop Control

Full Cloud Desktop

Complete Linux desktop with GUI, browser, and applications. Not a headless browser — a real visual environment.

Mouse & Keyboard Control

Click, type, drag, scroll — automate any desktop interaction programmatically via simple API calls.

Screenshots & Recording

Capture screenshots on demand, record screen sessions, stream VNC for live viewing.

Browser Automation

Firefox and Chromium pre-installed. Use Selenium, Playwright, or Puppeteer for web automation.

Tools Pre-installed

Desktop sandboxes come with browsers and automation tools ready to use.

Firefox, Chromium, Playwright, Selenium, Puppeteer, xdotool, ImageMagick, ffmpeg
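
To give a feel for what the lower-level tools do, the sketch below builds command lines for xdotool, ImageMagick, and ffmpeg. It is an illustration under assumptions: the X display is taken to be `:0`, the screen size `1280x720`, and the helper names are made up for this example. How you run these commands inside a sandbox depends on the Hopx SDK and is not shown here.

```python
import shlex
from typing import List


def click_command(x: int, y: int, button: int = 1) -> List[str]:
    """xdotool: move the pointer to (x, y), then click (button 1 is the left button)."""
    return ["xdotool", "mousemove", str(x), str(y), "click", str(button)]


def type_command(text: str, delay_ms: int = 50) -> List[str]:
    """xdotool: type text with a delay between keystrokes."""
    return ["xdotool", "type", "--delay", str(delay_ms), text]


def key_command(*keys: str) -> List[str]:
    """xdotool: press keys or combinations such as "Return" or "ctrl+l"."""
    return ["xdotool", "key", *keys]


def screenshot_command(output: str) -> List[str]:
    """ImageMagick: capture the whole X root window into a file."""
    return ["import", "-window", "root", output]


def record_command(output: str, seconds: int, display: str = ":0",
                   size: str = "1280x720", fps: int = 15) -> List[str]:
    """ffmpeg: record the X display for a fixed duration using the x11grab input device."""
    return ["ffmpeg", "-y", "-f", "x11grab", "-video_size", size,
            "-framerate", str(fps), "-i", display, "-t", str(seconds), output]


# Usage: print the commands as shell-quoted strings.
for cmd in (click_command(50, 50), type_command("https://example.com"),
            key_command("Return"), screenshot_command("desktop.png"),
            record_command("session.mp4", 10)):
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems when the typed text contains spaces or special characters.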

Simple API, Powerful Automation

Control the desktop with high-level commands, or drop down to Playwright/Selenium for precise browser control.

Mouse Control

Click, double-click, drag, scroll — any mouse action

Keyboard Input

Type text, press hotkeys, handle special characters

Visual Capture

Screenshots, screen recording, VNC live streaming

desktop_automation.py
from hopx_ai import Sandbox

# Create sandbox with desktop template
sandbox = Sandbox.create(template="desktop")

try:
    # Take a screenshot of the desktop
    desktop_screenshot = sandbox.desktop.screenshot()
    desktop_screenshot.save("desktop.png")

    # Open Firefox and navigate to a website
    # (coordinates are illustrative and depend on the desktop layout)
    sandbox.desktop.click(x=50, y=50)  # Click Firefox icon
    sandbox.desktop.wait(2000)  # Wait for browser to open

    # Type URL
    sandbox.desktop.type("https://example.com")
    sandbox.desktop.press("Enter")
    sandbox.desktop.wait(3000)  # Wait for the page to load

    # Take screenshot of the loaded page
    page_screenshot = sandbox.desktop.screenshot()
    page_screenshot.save("webpage.png")

    # Interact with page elements
    sandbox.desktop.click(x=400, y=300)  # Click on element
    sandbox.desktop.type("Hello World")

    # Or use Playwright for more precise control
    result = sandbox.run_code("""
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto('https://example.com')
    page.screenshot(path='/workspace/screenshot.png')

    # Interact with elements. These selectors are placeholders:
    # example.com has no search form, so substitute your own page's.
    page.fill('input[name="search"]', 'hello world')
    page.click('button[type="submit"]')

    browser.close()
""")

    # Download the screenshot taken by Playwright
    playwright_screenshot = sandbox.files.read("/workspace/screenshot.png")

    # Stream VNC for live viewing
    vnc_url = sandbox.desktop.get_vnc_url()
    print(f"Watch live: {vnc_url}")
finally:
    # Always release the sandbox, even if a step above fails
    sandbox.kill()
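
The fixed `sandbox.desktop.wait(...)` pauses in the script above are simple, but they can be too short on a slow page and wasteful on a fast one. One common alternative is to poll for a condition with a timeout. The helper below is a generic sketch, not part of the Hopx SDK: `wait_until` is a hypothetical name, and the condition you pass in (for example, "the latest screenshot stopped changing") is up to you.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.25) -> bool:
    """Call `condition` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with a stand-in condition that becomes true on the third check.
attempts = {"count": 0}


def page_ready() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until(page_ready, timeout=2.0, interval=0.01)
print("ready" if ready else "timed out")
```

The function returns a boolean rather than raising, so the caller can decide whether a timeout should abort the run or just trigger a diagnostic screenshot.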

Start Automating Desktops Today

Get $200 in free credits. Spin up desktop sandboxes and automate anything you can click.