ocs_ci.ocs.ui.llm_tools package

Submodules

ocs_ci.ocs.ui.llm_tools.llm_helper module

class ocs_ci.ocs.ui.llm_tools.llm_helper.ClaudeClient(model=None)

Bases: LLMClient

Uses the Claude CLI (claude) as an LLM backend for vision-based UI analysis.

The CLI must be installed and authenticated on the machine. Each query runs claude -p in non-interactive single-shot mode. Screenshot queries add --allowedTools Read so the CLI can read image files from disk; query_dom sends text only, with no tools.
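As a rough illustration of the single-shot invocation described above, the sketch below only assembles an argument list; it does not run the CLI. The helper name build_claude_command and the idea of naming the image path inside the prompt text are assumptions made for this example, not confirmed details of ClaudeClient.

```python
import shlex


def build_claude_command(prompt, screenshot_path=None):
    """Assemble argv for one non-interactive Claude CLI call (hypothetical helper)."""
    if screenshot_path is not None:
        # Assumption: the image is referenced by path inside the prompt text,
        # and the Read tool is what lets the CLI open that file.
        prompt = f"Read the image at {screenshot_path}. {prompt}"
        return ["claude", "-p", prompt, "--allowedTools", "Read"]
    # Text-only query: no image and no tools.
    return ["claude", "-p", prompt]


screenshot_cmd = build_claude_command("Describe the page.", "/tmp/screen.png")
dom_cmd = build_claude_command("Find the Save button in this HTML.")

# shlex.join renders the argv list as a copy-pasteable shell line.
print(shlex.join(screenshot_cmd))
print(shlex.join(dom_cmd))
```

Building the command as a list rather than a shell string avoids quoting problems when the prompt contains DOM HTML.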

DEFAULT_VARIANT = 'sonnet'

VARIANT_MAP = {'haiku': 'claude-haiku-4-5', 'opus': 'claude-opus-4-6', 'sonnet': 'claude-sonnet-4-5'}

is_available()

Checks if the claude CLI is installed and responsive.

Returns:

True if claude --version exits with code 0.

Return type:

bool
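A minimal sketch of this kind of availability check, assuming it amounts to running the executable with --version and testing the exit code. The function name cli_is_available and the timeout value are illustrative and not taken from the module.

```python
import shutil
import subprocess
import sys


def cli_is_available(executable, timeout=10):
    """Return True if `<executable> --version` exits with code 0 (hypothetical helper)."""
    # Skip the subprocess entirely when the executable is not on PATH.
    if shutil.which(executable) is None:
        return False
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # A CLI that cannot start or hangs is treated as unavailable.
        return False
    return result.returncode == 0


# Demonstrated with the running Python interpreter, which is known to exist;
# the real check would pass "claude" instead.
python_ok = cli_is_available(sys.executable)
missing_ok = cli_is_available("definitely-not-a-real-cli-xyz")
```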

property model_name

Returns the full model name for the current variant.
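The lookup behind model_name can be pictured as a dictionary access on VARIANT_MAP. In the sketch below the map and default are copied from the attributes above, while resolve_model_name and its fallback to the default variant for unknown keys are assumptions.

```python
VARIANT_MAP = {
    "haiku": "claude-haiku-4-5",
    "opus": "claude-opus-4-6",
    "sonnet": "claude-sonnet-4-5",
}
DEFAULT_VARIANT = "sonnet"


def resolve_model_name(variant=None):
    """Map a short variant name to a full model name (hypothetical helper)."""
    # Assumption: unknown or missing variants fall back to the default variant.
    return VARIANT_MAP.get(variant or DEFAULT_VARIANT, VARIANT_MAP[DEFAULT_VARIANT])


default_name = resolve_model_name()
opus_name = resolve_model_name("opus")
```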

query_dom(prompt)

Sends a text-only prompt to the Claude CLI (no image, no tools).

Parameters:

prompt (str) – The prompt to send (typically contains DOM HTML).

Returns:

The LLM’s text response.

Return type:

str

query_screenshot(screenshot_path, prompt)

Sends a screenshot to the Claude CLI for analysis.

Parameters:
  • screenshot_path (str) – Path to the screenshot PNG file.

  • prompt (str) – The prompt to send along with the image.

Returns:

The LLM’s text response.

Return type:

str

Raises:

RuntimeError – If the CLI fails, times out, or reports an error.

class ocs_ci.ocs.ui.llm_tools.llm_helper.LLMClient

Bases: ABC

Abstract base class for LLM backends used in vision-based UI analysis.

Subclasses must implement is_available, query_screenshot, and query_dom. JSON parsing and multi-screenshot merging are provided by the base class.

Cost tracking attributes:

  • total_cost_usd (float): Cumulative cost across all requests.

  • total_requests (int): Total number of LLM requests made.

  • last_request_cost_usd (float): Cost of the most recent request.
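One way the three cost attributes could relate to each other is sketched below. The CostTracker class and its record method are hypothetical stand-ins; how LLMClient actually updates these values is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    """Stand-in holding the three cost attributes named above (hypothetical)."""

    total_cost_usd: float = 0.0
    total_requests: int = 0
    last_request_cost_usd: float = 0.0

    def record(self, cost_usd):
        # Assumption: every request updates all three counters together.
        self.last_request_cost_usd = cost_usd
        self.total_cost_usd += cost_usd
        self.total_requests += 1


tracker = CostTracker()
tracker.record(0.25)
tracker.record(0.5)
```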

abstract is_available()

Checks whether the LLM backend is reachable and ready.

Returns:

True if the backend can accept queries.

Return type:

bool

abstract query_dom(prompt)

Sends a text-only prompt to the LLM and returns the raw text response.

Parameters:

prompt (str) – The prompt to send (typically contains DOM HTML).

Returns:

The LLM’s text response.

Return type:

str

abstract query_screenshot(screenshot_path, prompt)

Sends a single image and prompt to the LLM and returns the raw text response.

Parameters:
  • screenshot_path (str) – Path to the screenshot PNG file.

  • prompt (str) – The prompt to send along with the image.

Returns:

The LLM’s text response.

Return type:

str

query_screenshot_json(screenshot_paths, prompt)

Queries the LLM with one or more screenshots and returns merged JSON.

When multiple screenshot paths are provided, each is queried separately with the same prompt, and the resulting JSON dicts are merged. Later screenshots fill in keys that were empty or missing from earlier ones.

Parameters:
  • screenshot_paths (str or list) – Path(s) to the screenshot PNG file(s).

  • prompt (str) – The prompt to send along with the image(s).

Returns:

Merged JSON response from all screenshots.

Return type:

dict
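The merge rule described above (later screenshots fill in keys that were empty or missing) could look roughly like the following. The helper names and the exact definition of "empty" (None, empty string, empty list, empty dict) are assumptions.

```python
def _is_empty(value):
    """Treat None and empty strings/lists/dicts as empty (assumed definition)."""
    return value is None or value == "" or value == [] or value == {}


def merge_json_results(results):
    """Merge per-screenshot dicts; earlier non-empty values win (hypothetical helper)."""
    merged = {}
    for result in results:
        for key, value in result.items():
            # Only fill keys that are missing or still empty so far.
            if key not in merged or _is_empty(merged[key]):
                merged[key] = value
    return merged


first = {"title": "Storage", "status": ""}
second = {"title": "Overview", "status": "Ready", "alerts": 1}
merged = merge_json_results([first, second])
```

Here "title" keeps the value from the first screenshot, while "status" and "alerts" are filled in from the second.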

class ocs_ci.ocs.ui.llm_tools.llm_helper.OllamaClient(model=None, host=None)

Bases: LLMClient

Manages communication with a local ollama instance for vision-based UI analysis.

is_available()

Checks if ollama is running and the required model is pulled.

Returns:

True if ollama is reachable and the model is available.

Return type:

bool
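A possible shape for such a check is sketched here, using the Ollama REST endpoint /api/tags, which lists locally pulled models. The function names, the default host, and the model name "llava" are illustrative assumptions; OllamaClient may perform the check differently.

```python
import json
import urllib.error
import urllib.request


def model_in_tags(tags_payload, model):
    """Return True if `model` appears in an /api/tags response body."""
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    # Accept either an exact match ("llava:latest") or a bare name ("llava").
    return any(name == model or name.split(":")[0] == model for name in names)


def ollama_is_available(host="http://localhost:11434", model="llava", timeout=5):
    """Check that the server answers and the model is pulled (hypothetical helper)."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable server or malformed response: report unavailable.
        return False
    return model_in_tags(payload, model)


sample_payload = {"models": [{"name": "llava:latest"}]}
found = model_in_tags(sample_payload, "llava")
```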

query_dom(prompt)

Sends a text-only prompt to ollama and returns the raw text response.

Parameters:

prompt (str) – The prompt to send (typically contains DOM HTML).

Returns:

The LLM’s text response.

Return type:

str

query_screenshot(screenshot_path, prompt)

Sends a single image and prompt to ollama and returns the raw text response.

Parameters:
  • screenshot_path (str) – Path to the screenshot PNG file.

  • prompt (str) – The prompt to send along with the image.

Returns:

The LLM’s text response.

Return type:

str

ocs_ci.ocs.ui.llm_tools.llm_helper.ask_llm_about_screen(prompt='', model=None)

Takes screenshots and queries the LLM about them in one call.

Parameters:
  • prompt (str) – The question to ask the LLM about the screenshot.

  • model (str) – The LLM model to use. If None, reads from config.

Returns:

The LLM’s text response about the screenshot.

Return type:

str

ocs_ci.ocs.ui.llm_tools.llm_helper.get_llm_client(model=None)

Factory function that returns the appropriate LLMClient based on the model string.

Parameters:

model (str) – Model identifier. If it starts with "claude" a ClaudeClient is returned; otherwise an OllamaClient. Falls back to the value of config.UI_SELENIUM["llm_model"] when model is None.

Returns:

An instance of the selected backend.

Return type:

LLMClient
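The selection rule can be illustrated without importing ocs_ci. In this sketch select_backend returns the class name as a string, and the configured_model parameter stands in for config.UI_SELENIUM["llm_model"]; both are simplifications for illustration.

```python
def select_backend(model=None, configured_model="llava"):
    """Return the backend class name for a model string (hypothetical helper)."""
    # Fall back to the configured model when none is passed explicitly.
    chosen = model if model is not None else configured_model
    # Prefix rule from the description above: "claude..." selects ClaudeClient.
    return "ClaudeClient" if chosen.startswith("claude") else "OllamaClient"


explicit_claude = select_backend("claude-sonnet-4-5")
explicit_other = select_backend("llava")
from_config = select_backend(None, configured_model="claude")
```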

ocs_ci.ocs.ui.llm_tools.locator_fallback module

class ocs_ci.ocs.ui.llm_tools.locator_fallback.LocatorFallback(driver)

Bases: object

AI-powered locator fallback for Selenium UI tests.

When a locator fails, the DOM (and optionally a screenshot) is sent to an LLM, which generates a replacement locator. Results are cached per test and accumulated in a session-wide cache for reuse across tests.

attempt_fallback(locator, action='interact', stack_trace=None)

Attempts to find a replacement locator using LLM analysis.

Parameters:
  • locator (tuple) – Original (selector, By) tuple that failed.

  • action (str) – The action that was being performed (click, send_keys, etc.).

  • stack_trace (str) – Full Python traceback captured at the point of failure.

Returns:

(selector, by_type) replacement locator, or None if fallback fails.

Return type:

tuple
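A caller might wrap an element lookup with attempt_fallback along these lines. Everything except the attempt_fallback signature is a stand-in: LocatorNotFound, StubFallback, fake_find, and find_with_fallback are hypothetical and replace the real Selenium driver and LocatorFallback instance so the sketch runs on its own.

```python
import traceback


class LocatorNotFound(Exception):
    """Stand-in for the exception a real driver raises on a failed lookup."""


class StubFallback:
    """Stand-in exposing the same attempt_fallback signature as LocatorFallback."""

    def attempt_fallback(self, locator, action="interact", stack_trace=None):
        # A real implementation would query an LLM; this stub returns a fixed locator.
        return ("#save-v2", "css selector")


def fake_find(locator):
    """Pretend lookup: the old selector fails, any other selector succeeds."""
    if locator[0] == "#save":
        raise LocatorNotFound(locator)
    return f"element:{locator[0]}"


def find_with_fallback(find, locator, fallback, action="interact"):
    """Try the original locator, then a replacement suggested by the fallback."""
    try:
        return find(locator)
    except LocatorNotFound:
        replacement = fallback.attempt_fallback(
            locator, action=action, stack_trace=traceback.format_exc()
        )
        if replacement is None:
            # No replacement available: surface the original failure.
            raise
        return find(replacement)


result = find_with_fallback(
    fake_find, ("#save", "css selector"), StubFallback(), action="click"
)
```

Re-raising the original exception when the fallback returns None keeps the test failure pointing at the locator that actually broke.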

property client

log_cost_summary()

Logs a final cost summary for the entire test.

Call this at test teardown to get a complete picture of AI fallback costs incurred during the test run.

ocs_ci.ocs.ui.llm_tools.locator_fallback.get_session_cache_path()

Returns the path to the session-wide locator cache file.

Module contents