ocs_ci.ocs.ui.llm_tools package
Submodules
ocs_ci.ocs.ui.llm_tools.llm_helper module
- class ocs_ci.ocs.ui.llm_tools.llm_helper.ClaudeClient(model=None)
Bases:
LLMClient
Uses the Claude CLI (claude) as an LLM backend for vision-based UI analysis.
The CLI must be installed and authenticated on the machine. Each query runs claude -p in non-interactive single-shot mode with --allowedTools Read so the CLI can read image files from disk.
- DEFAULT_VARIANT = 'sonnet'
- VARIANT_MAP = {'haiku': 'claude-haiku-4-5', 'opus': 'claude-opus-4-6', 'sonnet': 'claude-sonnet-4-5'}
- is_available()
Checks if the claude CLI is installed and responsive.
- Returns:
True if claude --version exits with code 0.
- Return type:
bool
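The availability check described above can be sketched as follows. This is a minimal illustration, not the module's actual code: cli_is_available, its executable parameter, and the timeout value are hypothetical names chosen for the example.

```python
import subprocess


def cli_is_available(executable="claude", timeout=10):
    """Return True if `<executable> --version` exits with code 0."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # A missing binary, a non-executable file, or a hung CLI
        # all count as "not available".
        return False
    return result.returncode == 0
```

Treating a timeout the same as a missing binary keeps the check safe to call during test setup, where a hanging CLI should not block the run.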
- property model_name
Returns the full model name for the current variant.
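A rough sketch of how a short variant could resolve to a full model name, using the DEFAULT_VARIANT and VARIANT_MAP values listed for this class. The helper resolve_model_name and its fallback to the default for unknown variants are assumptions; the real property may behave differently.

```python
# Values copied from the documented class attributes.
DEFAULT_VARIANT = "sonnet"
VARIANT_MAP = {
    "haiku": "claude-haiku-4-5",
    "opus": "claude-opus-4-6",
    "sonnet": "claude-sonnet-4-5",
}


def resolve_model_name(variant=None):
    """Map a short variant to its full model name (hypothetical helper)."""
    # Assumption: a missing or unknown variant falls back to the default.
    key = (variant or DEFAULT_VARIANT).lower()
    return VARIANT_MAP.get(key, VARIANT_MAP[DEFAULT_VARIANT])
```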
- query_dom(prompt)
Sends a text-only prompt to the Claude CLI (no image, no tools).
- Parameters:
prompt (str) – The prompt to send (typically contains DOM HTML).
- Returns:
The LLM’s text response.
- Return type:
str
- query_screenshot(screenshot_path, prompt)
Sends a screenshot to the Claude CLI for analysis.
- Parameters:
screenshot_path (str) – Path to the screenshot PNG file.
prompt (str) – The prompt to send along with the image.
- Returns:
The LLM’s text response.
- Return type:
str
- Raises:
RuntimeError – If the CLI fails, times out, or reports an error.
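The single-shot invocation and the RuntimeError contract might look roughly like the sketch below. Only the claude -p and --allowedTools Read flags come from the class description; build_claude_command, run_cli, the prompt wording, and the timeout are hypothetical, and passing the image path inside the prompt text is an assumption about how the CLI is told which file to read.

```python
import subprocess


def build_claude_command(screenshot_path, prompt):
    """Build the argv for one single-shot query (hypothetical helper)."""
    # Assumption: the image path travels inside the prompt text, and the
    # Read tool lets the CLI open that file from disk.
    full_prompt = f"Read the image at {screenshot_path}. {prompt}"
    return ["claude", "-p", full_prompt, "--allowedTools", "Read"]


def run_cli(argv, timeout=120):
    """Run a command and return its stdout, raising RuntimeError on failure."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"CLI timed out after {timeout}s") from exc
    except OSError as exc:
        raise RuntimeError(f"CLI could not be started: {exc}") from exc
    if result.returncode != 0:
        raise RuntimeError(
            f"CLI exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout.strip()
```

Folding the timeout, start-up, and non-zero exit cases into one exception type means callers only need a single except RuntimeError clause.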
- class ocs_ci.ocs.ui.llm_tools.llm_helper.LLMClient
Bases:
ABC
Abstract base class for LLM backends used in vision-based UI analysis.
Subclasses must implement is_available, query_screenshot, and query_dom. JSON parsing and multi-screenshot merging are provided by the base class.
- Cost tracking attributes:
total_cost_usd (float): Cumulative cost across all requests.
total_requests (int): Total number of LLM requests made.
last_request_cost_usd (float): Cost of the most recent request.
- abstract is_available()
Checks whether the LLM backend is reachable and ready.
- Returns:
True if the backend can accept queries.
- Return type:
bool
- abstract query_dom(prompt)
Sends a text-only prompt to the LLM and returns the raw text response.
- Parameters:
prompt (str) – The prompt to send (typically contains DOM HTML).
- Returns:
The LLM’s text response.
- Return type:
str
- abstract query_screenshot(screenshot_path, prompt)
Sends a single image and prompt to the LLM and returns the raw text response.
- Parameters:
screenshot_path (str) – Path to the screenshot PNG file.
prompt (str) – The prompt to send along with the image.
- Returns:
The LLM’s text response.
- Return type:
str
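To illustrate the three-method contract, here is a self-contained sketch. LLMClientSketch is a stand-in that mirrors the documented abstract methods and is not the real base class; CannedClient is a hypothetical backend that returns a fixed reply, which can be handy in unit tests that should not reach a real LLM.

```python
from abc import ABC, abstractmethod


class LLMClientSketch(ABC):
    """Stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def is_available(self):
        """Return True if the backend can accept queries."""

    @abstractmethod
    def query_screenshot(self, screenshot_path, prompt):
        """Return the raw text response for one image and prompt."""

    @abstractmethod
    def query_dom(self, prompt):
        """Return the raw text response for a text-only prompt."""


class CannedClient(LLMClientSketch):
    """Hypothetical backend that always returns the same reply."""

    def __init__(self, reply):
        self.reply = reply

    def is_available(self):
        return True

    def query_screenshot(self, screenshot_path, prompt):
        return self.reply

    def query_dom(self, prompt):
        return self.reply
```

Instantiating the abstract class directly raises TypeError, so a subclass that forgets one of the three methods fails at construction time rather than at the first query.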
- query_screenshot_json(screenshot_paths, prompt)
Queries the LLM with one or more screenshots and returns merged JSON.
When multiple screenshot paths are provided, each is queried separately with the same prompt, and the resulting JSON dicts are merged. Later screenshots fill in keys that were empty or missing from earlier ones.
- Parameters:
screenshot_paths (str or list) – Path(s) to the screenshot PNG file(s).
prompt (str) – The prompt to send along with the image(s).
- Returns:
Merged JSON response from all screenshots.
- Return type:
dict
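The merge rule described for multiple screenshots can be sketched like this. The function names are hypothetical, and the definition of "empty" (None, empty string, empty list, empty dict) is an assumption; the base class may use a different test.

```python
import json


def is_empty(value):
    """Assumption: None, empty strings, and empty containers count as empty."""
    return value is None or value == "" or value == [] or value == {}


def merge_json_results(results):
    """Merge dicts in order; later dicts only fill empty or missing keys."""
    merged = {}
    for result in results:
        for key, value in result.items():
            if key not in merged or is_empty(merged[key]):
                merged[key] = value
    return merged


def query_json_sketch(query, screenshot_paths, prompt):
    """Query each screenshot with the same prompt and merge the parsed JSON.

    `query` is any callable (path, prompt) -> JSON text, for example a
    bound query_screenshot method.
    """
    if isinstance(screenshot_paths, str):
        screenshot_paths = [screenshot_paths]
    return merge_json_results(
        json.loads(query(path, prompt)) for path in screenshot_paths
    )
```

With this rule, a value found in an earlier screenshot is never overwritten by a later one, so the order of screenshot_paths decides which screenshot wins when both contain a non-empty value.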
- class ocs_ci.ocs.ui.llm_tools.llm_helper.OllamaClient(model=None, host=None)
Bases:
LLMClient
Manages communication with a local ollama instance for vision-based UI analysis.
- is_available()
Checks if ollama is running and the required model is pulled.
- Returns:
True if ollama is reachable and the model is available.
- Return type:
bool
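One way to perform this check is through the ollama REST API, whose GET /api/tags endpoint lists locally pulled models. The sketch below assumes that route; the client may use a different mechanism, and the function names and the default host value are illustrative.

```python
import json
import urllib.request


def model_is_pulled(tags_response, model):
    """Check an /api/tags style response for the requested model."""
    # ollama treats a name without a tag as "<name>:latest".
    wanted = model if ":" in model else f"{model}:latest"
    names = {entry.get("name", "") for entry in tags_response.get("models", [])}
    return wanted in names


def ollama_is_available(model, host="http://localhost:11434", timeout=5):
    """Return True if ollama answers and lists the model as pulled."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            tags = json.load(resp)
    except (OSError, ValueError):
        # Connection errors and timeouts are OSError subclasses;
        # a malformed JSON body raises ValueError.
        return False
    return model_is_pulled(tags, model)
```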
- query_dom(prompt)
Sends a text-only prompt to ollama and returns the raw text response.
- Parameters:
prompt (str) – The prompt to send (typically contains DOM HTML).
- Returns:
The LLM’s text response.
- Return type:
str
- query_screenshot(screenshot_path, prompt)
Sends a single image and prompt to ollama and returns the raw text response.
- Parameters:
screenshot_path (str) – Path to the screenshot PNG file.
prompt (str) – The prompt to send along with the image.
- Returns:
The LLM’s text response.
- Return type:
str
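For orientation, a request to ollama's POST /api/generate endpoint carries images as base64 strings and returns the text in a "response" field when streaming is disabled. The sketch below builds and reads such payloads without sending anything; whether the client uses this endpoint is an assumption, and both helper names are hypothetical.

```python
import base64


def build_generate_payload(model, prompt, image_bytes=None):
    """Build a non-streaming /api/generate request body."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # ollama expects each image as a base64-encoded string.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def extract_text(response_body):
    """Pull the generated text out of a non-streaming response dict."""
    return response_body.get("response", "")
```

Omitting image_bytes yields the text-only form that a DOM query would use, so one builder can serve both query methods.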
- ocs_ci.ocs.ui.llm_tools.llm_helper.ask_llm_about_screen(prompt='', model=None)
Takes screenshots and queries the LLM about them in one call.
- Parameters:
prompt (str) – The question to ask the LLM about the screenshot.
model (str) – The LLM model to use. If None, reads from config.
- Returns:
The LLM’s text response about the screenshot.
- Return type:
str
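The capture-then-query flow can be sketched as below for a single screenshot. Unlike the documented function, this hypothetical ask_about_screen takes the driver and client explicitly so that it stays self-contained; driver is assumed to expose Selenium's save_screenshot(path), and client is assumed to expose query_screenshot(path, prompt).

```python
import os
import tempfile


def ask_about_screen(driver, client, prompt):
    """Save one screenshot from `driver` and ask `client` about it."""
    fd, path = tempfile.mkstemp(suffix=".png")
    os.close(fd)
    try:
        driver.save_screenshot(path)
        return client.query_screenshot(path, prompt)
    finally:
        # Remove the temporary image whether or not the query succeeded.
        os.remove(path)
```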
- ocs_ci.ocs.ui.llm_tools.llm_helper.get_llm_client(model=None)
Factory function that returns the appropriate LLMClient based on the model string.
- Parameters:
model (str) – Model identifier. If it starts with "claude", a ClaudeClient is returned; otherwise an OllamaClient. Falls back to the value of config.UI_SELENIUM["llm_model"] when model is None.
- Returns:
An instance of the selected backend.
- Return type:
LLMClient
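The selection rule of get_llm_client can be sketched as a pure function. choose_backend is a hypothetical helper that returns the class name rather than an instance, and its config_default argument stands in for config.UI_SELENIUM["llm_model"].

```python
def choose_backend(model, config_default):
    """Return the name of the backend the documented rule would select."""
    # config_default stands in for config.UI_SELENIUM["llm_model"].
    if model is None:
        model = config_default
    return "ClaudeClient" if model.startswith("claude") else "OllamaClient"
```

For example, choose_backend("claude-sonnet", config_default="llava") selects ClaudeClient, while choose_backend(None, config_default="llava") selects OllamaClient.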
ocs_ci.ocs.ui.llm_tools.locator_fallback module
- class ocs_ci.ocs.ui.llm_tools.locator_fallback.LocatorFallback(driver)
Bases:
object
AI-powered locator fallback for Selenium UI tests.
When a locator fails, the DOM (and optionally a screenshot) is sent to an LLM which generates a replacement locator. Results are cached per-test and accumulated in a session-wide cache for reuse across tests.
- attempt_fallback(locator, action='interact', stack_trace=None)
Attempts to find a replacement locator using LLM analysis.
- Parameters:
locator (tuple) – Original (selector, By) tuple that failed.
action (str) – The action that was being performed (click, send_keys, etc.).
stack_trace (str) – Full Python traceback captured at the point of failure.
- Returns:
(selector, by_type) replacement locator, or None if fallback fails.
- Return type:
tuple
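The per-test caching around the LLM lookup can be illustrated with a small stand-in. FallbackSketch and its suggest callable are hypothetical; the real class also gathers the DOM, an optional screenshot, and the stack trace before querying the LLM, none of which is modeled here.

```python
class FallbackSketch:
    """Hypothetical stand-in for the per-test cache around an LLM lookup."""

    def __init__(self, suggest):
        # suggest: callable (locator, action) -> (selector, by_type) or None.
        self.suggest = suggest
        self.cache = {}

    def attempt_fallback(self, locator, action="interact"):
        """Return a replacement locator, or None if none could be found."""
        if locator in self.cache:
            return self.cache[locator]
        try:
            replacement = self.suggest(locator, action)
        except Exception:
            # Treat any backend error as "no replacement found".
            return None
        if replacement is not None:
            self.cache[locator] = replacement
        return replacement
```

Keying the cache on the original (selector, By) tuple means a locator that fails repeatedly within one test triggers at most one successful LLM query.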
- property client
- log_cost_summary()
Logs a final cost summary for the entire test.
Call this at test teardown to get a complete picture of AI fallback costs incurred during the test run.
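A sketch of how the three cost tracking attributes of LLMClient could feed such a summary. CostTracker, record, and the message format are assumptions made for illustration; only the attribute names come from the documentation.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class CostTracker:
    """Holds the three documented cost attributes."""

    total_cost_usd: float = 0.0
    total_requests: int = 0
    last_request_cost_usd: float = 0.0

    def record(self, cost_usd):
        """Account for one finished LLM request."""
        self.last_request_cost_usd = cost_usd
        self.total_cost_usd += cost_usd
        self.total_requests += 1

    def summary(self):
        return (
            f"AI fallback cost: ${self.total_cost_usd:.4f} "
            f"across {self.total_requests} request(s)"
        )


def log_cost_summary(tracker):
    """Log the summary once, for example from a teardown fixture."""
    logger.info(tracker.summary())
```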
- ocs_ci.ocs.ui.llm_tools.locator_fallback.get_session_cache_path()
Returns the path to the session-wide locator cache file.
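How the cache file is stored is not documented here. Assuming a JSON file that maps original selectors to replacement locators, reading and updating it might look like the following sketch; load_cache and save_to_cache are hypothetical helpers, and the file layout is an assumption.

```python
import json


def load_cache(path):
    """Return cached replacements, or {} if the file is absent or corrupt."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        return {}


def save_to_cache(path, original_selector, replacement):
    """Merge one replacement locator into the cache file."""
    cache = load_cache(path)
    # JSON has no tuple type, so the locator is stored as a list.
    cache[original_selector] = list(replacement)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(cache, handle, indent=2)
```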