OpenManus

A web application that lets users submit prompts, which are then used to run browser-use automations. The automation steps are displayed in a chat panel in real time.

Architecture

For detailed architecture information, including data flow diagrams and dependency chains, see ARCHITECTURE.md.

Frontend

  • Next.js with React and TypeScript
  • Shadcn UI components
  • Real-time updates with WebSockets

Backend

  • Node.js WebSocket server
  • Python automation service
  • Integration with browser-use library
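
The pieces above fit together roughly as follows: the Node.js server accepts WebSocket connections from the frontend and drives the Python automation service. The sketch below is illustrative only; it assumes the ws package and a python -m invocation of the CLI, while the real entry point, arguments, and message format live in the project code (python_agent/browser_automation_agent/cli.py and src/).

import { WebSocketServer } from "ws";
import { spawn } from "node:child_process";

const wss = new WebSocketServer({ port: Number(process.env.PORT) || 3001 });

wss.on("connection", (socket) => {
  socket.on("message", (raw) => {
    const { event, prompt } = JSON.parse(raw.toString());
    if (event !== "prompt:submit") return;

    // Hypothetical invocation of the Python automation CLI; the actual
    // entry point and flags are defined in python_agent/browser_automation_agent/cli.py.
    const agent = spawn("python", ["-m", "browser_automation_agent.cli", prompt]);

    // Forward each chunk of agent output to the client as a log event.
    agent.stdout.on("data", (chunk) => {
      socket.send(JSON.stringify({ event: "automation:log", message: chunk.toString() }));
    });

    agent.on("close", (code) => {
      socket.send(JSON.stringify({ event: code === 0 ? "automation:complete" : "automation:error", code }));
    });
  });
});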

Setup

Prerequisites

  • Node.js 18+
  • Python 3.9+
  • uv (optional, for faster Python package installation)

Installation

  1. Clone the repository
git clone https://github.com/yourusername/browser-use-demo.git
cd browser-use-demo
  2. Install Node.js dependencies
npm install
  3. Install the Python package
# Using uv (recommended)
pip install uv
cd python_agent
uv pip install -e .
cd ..

# Or using pip
cd python_agent
pip install -e .
cd ..
  4. Create a .env file based on .env.example
cp .env.example .env
  5. Update the .env file with your configuration
PORT=3001
OPENAI_API_KEY=your_openai_api_key_if_using_openai

Running the Application

  1. Start the FastAPI server
npm run server
  2. In a separate terminal, start the frontend development server
npm run dev
  3. Open your browser and navigate to http://localhost:3000

Project Structure

├── python_agent/               # Python package for browser automation
│   ├── browser_automation_agent/
│   │   ├── __init__.py
│   │   ├── agent.py           # Main agent implementation
│   │   ├── cli.py             # Command-line interface
│   │   ├── server.py          # FastAPI server
│   │   ├── models.py          # Pydantic models
│   │   └── logger.py          # Logging utilities
│   ├── setup.py               # Package setup file
│   ├── pyproject.toml         # Python project configuration
│   └── requirements.txt       # Python dependencies
├── src/
│   ├── app/                   # Next.js app directory
│   ├── components/            # React components
│   └── lib/                   # Utility functions and libraries
├── public/                    # Static assets
└── .env                       # Environment variables

Configuration Options

The application supports several configuration options that can be set from the frontend; a typed sketch of an example payload follows the lists below:

Model Provider

  • Ollama (local models)
  • OpenAI (cloud models)
  • AWS Bedrock (cloud models)

Models

  • For Ollama: llama3.2, llama3, llama2
  • For OpenAI: gpt-4o, gpt-4-turbo, gpt-4
  • For AWS Bedrock: Claude 3 (Sonnet, Haiku, Opus), Llama 3 70B

Browser Options

  • Vision capabilities (on/off)
  • Headless mode (on/off)
  • Browser type (chromium, firefox, webkit)
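
As a rough illustration, these options could be collected into a single typed payload on the frontend and sent alongside a prompt. The field names and types below are assumptions for illustration, not the project's actual schema (see python_agent/browser_automation_agent/models.py and src/lib for the real definitions).

// Illustrative shape of the configuration sent with a prompt.
// Field and provider identifiers are assumptions, not the project's schema.
type ModelProvider = "ollama" | "openai" | "bedrock";
type BrowserType = "chromium" | "firefox" | "webkit";

interface AutomationConfig {
  provider: ModelProvider; // e.g. "ollama" for local models
  model: string;           // e.g. "llama3.2" or "gpt-4o"
  useVision: boolean;      // vision capabilities on/off
  headless: boolean;       // run the browser without a visible window
  browser: BrowserType;
}

const exampleConfig: AutomationConfig = {
  provider: "ollama",
  model: "llama3.2",
  useVision: false,
  headless: true,
  browser: "chromium",
};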

Development

Running in Debug Mode

npm run server:debug

Code Quality

# Format Python code
cd python_agent
ruff format .

# Lint Python code
ruff check .

# Lint TypeScript/JavaScript code
npm run lint

Commit Convention

This project uses the Conventional Commits format (type(scope): description, e.g. feat(ui): add stop button); please follow it for all commit messages.

API Documentation

When the FastAPI server is running, interactive API documentation is available at FastAPI's default /docs (Swagger UI) and /redoc (ReDoc) endpoints.

WebSocket API

The WebSocket API allows real-time communication between the client and server:

Client to Server Events

  • prompt:submit: Submit a prompt for automation
  • automation:stop: Stop the current automation

Server to Client Events

  • automation:log: Log message from the automation
  • automation:complete: Automation completed
  • automation:error: Error in automation
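
Below is a hedged client-side sketch of this event flow. It assumes a plain WebSocket on port 3001 and JSON messages with an event field; the actual framing used by the project (for example, Socket.IO-style events) may differ.

// Illustrative client for the events listed above; the JSON envelope
// ({ event, ... }) and the port are assumptions.
const socket = new WebSocket("ws://localhost:3001");

socket.addEventListener("open", () => {
  socket.send(JSON.stringify({ event: "prompt:submit", prompt: "Find the top story on Hacker News" }));
});

socket.addEventListener("message", (e) => {
  const msg = JSON.parse(e.data);
  switch (msg.event) {
    case "automation:log":
      console.log("step:", msg.message);      // streamed automation step
      break;
    case "automation:complete":
      console.log("automation finished");
      break;
    case "automation:error":
      console.error("automation failed:", msg.message);
      break;
  }
});

// Stop the current automation on demand.
function stopAutomation() {
  socket.send(JSON.stringify({ event: "automation:stop" }));
}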

License

MIT
