Hypha Navigator — AI Browser Automation (Chrome extension)
==========================================================

Drive your whole browser from an AI agent over Hypha RPC. The agent can open,
close, switch, and navigate TABS, and inspect & control the target tab's page
(read the DOM, take screenshots, click/type by element index, inspect React).
Works on ANY site, including strict-CSP pages that block 'unsafe-eval'.


INSTALL (load unpacked)
-----------------------
1. Unzip this folder somewhere permanent (don't delete it after loading — Chrome
   loads the extension from this folder).
2. Open Chrome (or Edge) and go to:  chrome://extensions
3. Turn ON "Developer mode" (toggle, top-right).
4. Click "Load unpacked" and select THIS folder (the one containing manifest.json).
5. The "Hypha Navigator" extension appears. Pin it for easy access (optional).

Requires Chrome/Edge version 116 or newer.


RUN / USE
---------
1. Click the Hypha Navigator toolbar icon — a side panel opens.
2. (Optional) Set the Hypha server URL (default: https://hypha.aicell.io).
3. Click "Connect browser" — one connection controls the whole browser.
   - The side panel shows a SERVICE URL and a live activity log.
4. Copy the SERVICE URL and paste it into your AI agent.
   The agent drives the browser over HTTP, e.g.:

     curl "<SERVICE_URL>/list_tabs?_mode=last"
     curl -X POST "<SERVICE_URL>/open_tab" \
          -H "Content-Type: application/json" \
          -d '{"url":"https://example.com"}'
     curl "<SERVICE_URL>/get_browser_state?_mode=last"   # current target tab
     curl "<SERVICE_URL>/get_skill_md?_mode=last"        # full API + usage

   The agent only needs the URL — GET <SERVICE_URL>/get_skill_md lists every
   tool (tabs + page) and how to use them.
5. Click "Disconnect" to stop.
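
For agents written in Python, the curl calls above can be sketched with the
standard library alone. This is a minimal sketch, not the project's client:
build_request and call_tool are hypothetical helper names, and it assumes that
tools without arguments are called with GET plus ?_mode=last, that tools with
arguments take a JSON body via POST (as in the open_tab example), and that
responses are JSON.

```python
import json
import urllib.request


def build_request(service_url, tool, args=None):
    """Build (but do not send) the HTTP request for one tool call."""
    base = f"{service_url.rstrip('/')}/{tool}"
    if args is None:
        # No arguments: mirror `curl "<SERVICE_URL>/list_tabs?_mode=last"`.
        return urllib.request.Request(base + "?_mode=last", method="GET")
    # Arguments: mirror the open_tab example and send them as a JSON body.
    return urllib.request.Request(
        base,
        data=json.dumps(args).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(service_url, tool, args=None, timeout=30):
    """Send the request; the response is ASSUMED to be JSON."""
    request = build_request(service_url, tool, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the SERVICE URL from the side panel, call_tool(url, "list_tabs") would
correspond to the first curl line and
call_tool(url, "open_tab", {"url": "https://example.com"}) to the second.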

TARGET TAB (which page the agent controls)
-------------------------------------------
Page tools act on a fixed TARGET tab — the tab that was active when you clicked
Connect. Switching tabs in your browser does NOT move it (so you can keep
browsing while the agent works on its tab). To change the target:
  - click "Pin current tab" in the side panel, or
  - have the agent call activate_tab(tab_id) or open_tab(url).
The side panel shows the current target tab. The target is remembered even when
the browser puts the extension to sleep.
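
On the agent side, choosing between activate_tab and open_tab can be sketched
as below. The shape of the list_tabs result is an assumption: each tab is taken
to be a dict with "id" and "url" keys, which get_skill_md may describe
differently. find_tab_id and retarget_call are hypothetical helper names; only
activate_tab(tab_id) and open_tab(url) come from this README.

```python
def find_tab_id(tabs, url_part):
    """Return the id of the first tab whose URL contains url_part, else None."""
    for tab in tabs:
        # ASSUMPTION: tabs are dicts with "id" and "url" keys.
        if url_part in tab.get("url", ""):
            return tab.get("id")
    return None


def retarget_call(tabs, url_part, fallback_url):
    """Decide which tool call should move the target tab.

    Returns (tool_name, arguments): activate_tab when a matching tab is
    already open, otherwise open_tab for fallback_url.
    """
    tab_id = find_tab_id(tabs, url_part)
    if tab_id is not None:
        return "activate_tab", {"tab_id": tab_id}
    return "open_tab", {"url": fallback_url}
```

Reusing an open tab where possible avoids piling up duplicate tabs while the
user keeps browsing in the others.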

TOOLS (summary; get_skill_md has the full list)
-----------------------------------------------
Browser:  list_tabs, get_active_tab, open_tab, close_tab, activate_tab,
          navigate, reload_tab, go_back, go_forward
Page (act on the current target tab — set it with activate_tab / open_tab):
          get_browser_state, click_element_by_index, input_text, select_option,
          scroll, take_screenshot, query_dom, get_html, get_react_tree,
          execute_script, get_page_info, ...
Site skills (per-origin memory; the agent needs fewer steps and tokens over time):
          list_site_skills, get_site_skill, set_site_skill, remove_site_skill
          (all bound to a site origin; the origin is in tab/browser info)


SITE SKILLS — the agent learns each site over time
--------------------------------------------------
Each skill is a MARKDOWN note about one type of operation on a site (searching,
exporting, creating, logging in, …) — the agent's accumulated experience: the
working JS/API recipe, key selectors, steps, and gotchas. Skills are bound to a
site ORIGIN and identified by a NAME, with a one-line description. Every operation
result carries the current origin plus a name+description index of that origin's
skills, so the agent reuses them (or records a new one) without an extra call —
fewer steps, fewer tokens. get_skill_md explains the loop (explore once → script
& batch → save the markdown skill).

Skills persist across reinstalls: they're mirrored to Chrome's account-synced
storage and restored automatically when you reinstall (sign in to Chrome to sync).
You can also Export / Import all sites' skills as a JSON file from the side panel
(under "Advanced"). Import merges into what you have.
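
The reuse-or-record decision described in this section could look roughly like
the sketch below. All field and argument names here are hypothetical: the
result keys "origin" and "site_skills" and the set_site_skill arguments
"origin", "name", "description", and "content" are placeholders, so check
get_skill_md for the real ones. Only the tool names come from this README.

```python
def plan_skill_step(result, operation):
    """Choose the next tool call from the skill index in an operation result.

    ASSUMPTION: result looks like
      {"origin": "https://example.com",
       "site_skills": [{"name": "search", "description": "..."}]}
    """
    origin = result["origin"]
    known = {entry["name"] for entry in result.get("site_skills", [])}
    if operation in known:
        # A saved recipe exists: fetch the markdown note instead of exploring.
        return "get_site_skill", {"origin": origin, "name": operation}
    # No recipe yet: look at the page first, then save a skill afterwards.
    return "get_browser_state", {}


def skill_payload(origin, name, description, markdown):
    """Build HYPOTHETICAL arguments for set_site_skill."""
    if "\n" in description:
        raise ValueError("description should be a single line")
    return {
        "origin": origin,
        "name": name,
        "description": description,
        "content": markdown,
    }
```

Keeping the description to one line matches the name+description index that
accompanies each operation result.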


NOTES
-----
- Works on any site: the connection runs in the extension's background context,
  so the page's Content Security Policy can't block it.
- Persistent: the connection is restored automatically after Chrome puts the
  extension's service worker to sleep.
- "execute_script" runs arbitrary JS on ANY page — even strict-CSP pages that block
  'unsafe-eval' — via the Chrome debugger (Page.setBypassCSP, like
  Puppeteer/Playwright/DevTools). The first execute_script attaches the debugger to
  the target tab, so Chrome shows a yellow "Hypha Navigator started debugging this
  browser" banner; it clears when you Disconnect. The DOM tools (get_browser_state,
  click_element_by_index, query_dom, …) don't use the debugger.
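
An agent that wants to warn the user before the debugging banner appears could
check its planned calls first. This is a rough sketch based only on the note
above: execute_script attaches the debugger, the three DOM tools named there do
not, and every other tool is reported as "unknown" because this README does not
say. debugger_use and first_banner_step are hypothetical helper names.

```python
# Tools this README says attach the Chrome debugger.
DEBUGGER_TOOLS = {"execute_script"}
# Tools this README explicitly says do NOT use the debugger.
NO_DEBUGGER_TOOLS = {"get_browser_state", "click_element_by_index", "query_dom"}


def debugger_use(tool):
    """Return "yes", "no", or "unknown" for one tool name."""
    if tool in DEBUGGER_TOOLS:
        return "yes"
    if tool in NO_DEBUGGER_TOOLS:
        return "no"
    return "unknown"


def first_banner_step(plan):
    """Return the index of the first call that attaches the debugger, or None."""
    for index, tool in enumerate(plan):
        if debugger_use(tool) == "yes":
            return index
    return None
```

A plan made only of DOM tools returns None, so the banner should not appear
for it.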

Homepage / docs:  https://amun-ai.github.io/hypha-navigator/
Source:           https://github.com/amun-ai/hypha-navigator
