The Web Schema system enables NLP2CMD to understand and interact with real websites by extracting their UI structure and converting natural language commands into browser automation actions.
The system uses Playwright to extract interactive elements from websites:
nlp2cmd web-schema extract https://github.com --headless
Extracted schemas are available for several popular sites:
- GitHub (github.com)
- Google (google.com)
- Amazon (amazon.com)
# Extract schema from any website
nlp2cmd web-schema extract https://example.com --headless
# The schema will be saved to:
# command_schemas/sites/example.com.json
The SiteExplorer provides automatic discovery of forms and content across entire websites, even when they’re not on the initially loaded page.
from nlp2cmd.web_schema import SiteExplorer
explorer = SiteExplorer(max_depth=2, max_pages=10)
# Find contact forms
result = explorer.find_form("https://example.com", intent="contact")
if result.success:
    print(f"Found form at: {result.form_url}")
    # SiteExplorer automatically navigated to the contact page
# Find content pages
result = explorer.find_content(
    "https://example.com",
    content_type="article",  # or "product", "docs"
)
# Universal exploration with intent detection
result = explorer.explore(
    "https://example.com",
    "find pricing information",  # auto-detects "product" intent
)
When using the CLI with the -r flag, form discovery happens automatically:
# If form not on main page, SiteExplorer searches the site
nlp2cmd -r "wejdź na maskservice.pl i wypełnij formularz kontaktu"  # "go to maskservice.pl and fill in the contact form"
# Output:
# 🔍 Exploring site for contact form...
# ✓ Found form at: https://www.maskservice.pl/en/contact
# Form submitted via: input[type="submit"]
from nlp2cmd.web_schema import SiteExplorer
explorer = SiteExplorer(
    max_depth=2,           # how many levels deep to crawl
    max_pages=10,          # maximum pages to visit
    headless=True,         # run the browser headless
    timeout_ms=15000,      # page load timeout
    dynamic_wait_ms=3000,  # wait for JS frameworks to render
)
The browser domain supports four main intents:
General web interactions including typing, clicking, and navigating.
# Examples
"wpisz tekst w wyszukiwarce google" → browser/web_action   # "type text into the google search box"
"kliknij przycisk sign in" → browser/web_action            # "click the sign in button"
"wypełnij formularz rejestracji" → browser/web_action      # "fill in the registration form"
Page navigation and URL opening.
# Examples
"przejdź do github" → browser/navigate        # "go to github"
"otwórz stronę google" → browser/navigate     # "open the google page"
"idź na stronę główną" → browser/navigate     # "go to the home page"
Button and link clicking.
# Examples
"kliknij przycisk" → browser/click   # "click the button"
"wybierz opcję" → browser/click      # "select the option"
"aktywuj link" → browser/click       # "activate the link"
Form filling and data entry.
# Examples
"wypełnij formularz" → browser/fill_form      # "fill in the form"
"podaj dane kontaktowe" → browser/fill_form   # "enter contact details"
"zarejestruj się" → browser/fill_form         # "sign up"
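The four-intent split above can be sketched as a simple phrase-matching router. This is an illustrative sketch, not the actual NLP2CMD detector; the phrase lists mirror the examples above, and the function name is an assumption. Ambiguous queries (e.g. "kliknij przycisk sign in") are resolved in practice by the priority_intents configuration shown later.

```python
# Hypothetical phrase-based router for the browser domain.
# The specific intents are checked first; anything else falls back to
# the generic web_action intent.
BROWSER_INTENTS = {
    "fill_form": ["wypełnij formularz", "podaj dane", "zarejestruj", "fill form"],
    "navigate": ["przejdź do", "otwórz stronę", "idź na", "go to", "open website"],
    "click": ["kliknij", "wybierz", "aktywuj", "click", "press"],
}

def detect_browser_intent(query: str) -> str:
    """Return the first matching specific intent, else web_action."""
    q = query.lower()
    for intent, phrases in BROWSER_INTENTS.items():
        if any(p in q for p in phrases):
            return intent
    return "web_action"  # generic typing/searching interactions
```
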
# Search on Google
nlp2cmd --query "wyszukaj python tutorial w google"
# Navigate to GitHub
nlp2cmd --query "przejdź do github"
# Fill form on website
nlp2cmd --query "wypełnij formularz kontaktowy"
# Click button
nlp2cmd --query "kliknij przycisk submit"
# Search repositories
nlp2cmd --query "znajdź repozytorium nlp2cmd na github"
# User lookup
nlp2cmd --query "wyszukaj użytkownika tomek na github"
# Code navigation
nlp2cmd --query "przejdź do kodu projektu"
# Search queries
nlp2cmd --query "wyszukaj najlepsze kursy python w google"
# Application navigation
nlp2cmd --query "otwórz aplikacje google"
# Form interactions
nlp2cmd --query "wyczyść wyszukiwarkę google"
# Product search
nlp2cmd --query "znajdź książki o nlp na amazon"
# Shopping cart
nlp2cmd --query "dodaj do koszyka amazon"
# Form filling
nlp2cmd --query "wypełnij dane wysyłki amazon"
Add to data/patterns.json:
{
  "browser": {
    "web_action": [
      "wpisz tekst", "type text", "wypełnij pole", "fill field",
      "kliknij przycisk", "click button", "wypełnij formularz", "fill form",
      "wyszukaj w google", "google search", "znajdź repozytorium na github",
      "szukaj na amazon", "amazon search"
    ],
    "navigate": [
      "przejdź do", "go to", "otwórz stronę", "open website",
      "nawiguj do", "navigate to", "idź do", "visit"
    ],
    "click": [
      "kliknij", "click", "naciśnij", "press", "wybierz", "select"
    ],
    "fill_form": [
      "wypełnij formularz", "fill form", "wypełnij dane", "fill data"
    ]
  }
}
Update keyword_intent_detector_config.json:
{
  "domain_boosters": {
    "browser": [
      "przeglądark", "browser", "strona", "website", "kliknij", "click",
      "formularz", "form", "przycisk", "button", "link", "url",
      "google", "github", "amazon", "szukaj", "wyszukaj", "wpisz", "type"
    ]
  },
  "priority_intents": {
    "browser": ["web_action", "navigate", "click", "fill_form"]
  }
}
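The effect of domain boosters can be illustrated with a small scoring sketch: each booster keyword found in the query raises the domain's score, nudging detection toward browser over shell. The per-hit weight of 0.1 and the function name are assumptions for illustration, not the actual NLP2CMD scoring formula.

```python
# Illustrative booster scoring: count keyword hits and add a fixed
# boost per hit on top of a base score.
BOOSTERS = {"browser": ["formularz", "przycisk", "google", "github", "kliknij"]}

def boosted_score(query: str, domain: str, base: float = 0.0) -> float:
    """Raise the domain score by an assumed 0.1 per booster keyword hit."""
    q = query.lower()
    hits = sum(1 for kw in BOOSTERS.get(domain, []) if kw in q)
    return base + 0.1 * hits
```
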
# Basic extraction
nlp2cmd web-schema extract https://example.com
# With headless mode
nlp2cmd web-schema extract https://example.com --headless
# Specify output directory
nlp2cmd web-schema extract https://example.com --output-dir ./schemas
{
  "app_name": "example.com",
  "app_version": "1.0",
  "description": "Web interface for example.com",
  "actions": [
    {
      "id": "type_search",
      "name": "search",
      "description": "Type text into input field",
      "parameters": [
        {
          "name": "text",
          "type": "string",
          "description": "Text to type",
          "required": true
        }
      ],
      "examples": [
        "wpisz tekst w search",
        "type text in search"
      ],
      "metadata": {
        "selector": "#search-input",
        "element_type": "input",
        "input_type": "text"
      }
    }
  ],
  "metadata": {
    "url": "https://example.com",
    "title": "Example Domain",
    "extracted_at": "2026-01-24T21:15:22.864128",
    "total_inputs": 5,
    "total_buttons": 12,
    "total_links": 8
  }
}
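A saved schema file can be inspected with a few lines of standard-library Python. `summarize_schema` is an illustrative helper (not part of NLP2CMD); the field names follow the example schema above.

```python
# Count the extracted actions in a schema, grouped by element type.
def summarize_schema(schema: dict) -> dict:
    """Return {"input": n, "button": m, ...} for a schema dict."""
    counts: dict = {}
    for action in schema["actions"]:
        etype = action["metadata"]["element_type"]
        counts[etype] = counts.get(etype, 0) + 1
    return counts

# Usage against a saved schema file:
#   import json
#   schema = json.load(open("command_schemas/sites/github.com.json"))
#   print(summarize_schema(schema))
```
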
# List all extracted schemas
ls command_schemas/sites/
# View specific schema
cat command_schemas/sites/github.com.json
# Remove schema
rm command_schemas/sites/example.com.json
# Update schema (re-extract)
nlp2cmd web-schema extract https://example.com --headless --force
The system uses semantic similarity to match queries with web actions:
from nlp2cmd.generation.enhanced_context import get_enhanced_detector
detector = get_enhanced_detector()
match = detector.get_best_match("znajdź repozytorium nlp2cmd na github")
# Returns browser/web_action with high confidence
# Uses GitHub schema for enhanced understanding
The system supports both Polish and English:
# Polish queries
nlp2cmd --query "znajdź repozytorium na github"
nlp2cmd --query "wyszukaj produkty na amazon"
# English queries
nlp2cmd --query "find repository on github"
nlp2cmd --query "search products on amazon"
The system extracts entities from queries:
# Query: "znajdź książki o nlp na amazon"
# Entities:
# - "książki" (book) - product type
# - "nlp" - topic
# - "amazon" - site
# - Language: polish
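The annotated example above can be sketched as dictionary-based entity extraction. The lexicons and `extract_entities` are illustrative assumptions, not the actual NLP2CMD implementation.

```python
# Hypothetical dictionary-based entity extraction: look up each token
# in small site and product-type lexicons.
SITES = {"amazon", "github", "google"}
PRODUCT_TYPES = {"książki": "book", "kursy": "course"}  # Polish → English

def extract_entities(query: str) -> dict:
    """Extract the target site and product type from a query."""
    entities = {"site": None, "product_type": None}
    for token in query.lower().split():
        if token in SITES:
            entities["site"] = token
        elif token in PRODUCT_TYPES:
            entities["product_type"] = PRODUCT_TYPES[token]
    return entities
```
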
Problem: Unable to extract schema from website
Solution: Check website accessibility and Playwright installation:
# Check the Python Playwright installation
python -m playwright --version
# Install browser binaries
python -m playwright install
# Test extraction with verbose output
nlp2cmd web-schema extract https://example.com --headless --verbose
Problem: Queries detected as shell instead of browser
Solution: Verify browser patterns and domain boosters:
# Check browser patterns
grep -A 20 '"browser"' data/patterns.json
# Check domain boosters
grep -A 5 '"browser"' data/keyword_intent_detector_config.json
Problem: Web actions not recognized for specific sites
Solution: Ensure schema is properly extracted and loaded:
# Check if schema exists
ls command_schemas/sites/
# Verify schema content
cat command_schemas/sites/github.com.json | jq '.actions | length'
# Test enhanced context detection
python -c "
from nlp2cmd.generation.enhanced_context import get_enhanced_detector
detector = get_enhanced_detector()
print('Browser schemas loaded:', 'browser' in detector.schemas)
"
Problem: Wrong intent detected (click vs web_action vs navigate)
Solution: Review intent priorities and patterns:
{
  "priority_intents": {
    "browser": ["web_action", "navigate", "click", "fill_form"]
  }
}
Enable debug mode for detailed detection information:
# Run with explain flag
nlp2cmd --query "wyszukaj w google" --explain
# Check detection reasoning
python -c "
from nlp2cmd.generation.enhanced_context import get_enhanced_detector
detector = get_enhanced_detector()
explanation = detector.explain_detection('wyszukaj w google')
print(explanation)
"
Schemas are cached in memory for fast access:
# Schemas loaded once at initialization
detector = get_enhanced_detector()
# Multiple queries use cached schemas
for query in queries:
    match = detector.get_best_match(query)
Semantic embeddings are precomputed for patterns:
# Built during initialization
detector._build_semantic_index()
# Fast similarity lookup during queries
similarity = detector._calculate_semantic_similarity(query, domain, intent)
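The precompute-then-lookup idea can be sketched with a toy bag-of-words "embedding" standing in for the real sentence-embedding model; all names here are illustrative, not NLP2CMD internals. Pattern vectors are built once at startup, so each query costs only one embedding plus cheap similarity lookups.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (stand-in for a real sentence encoder)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Precompute once at startup (the "semantic index")
patterns = {"web_action": "wpisz tekst", "navigate": "przejdź do strony"}
index = {intent: embed(p) for intent, p in patterns.items()}

def best_intent(query: str) -> str:
    """Fast lookup: one query embedding, one similarity per pattern."""
    qv = embed(query)
    return max(index, key=lambda intent: cosine(qv, index[intent]))
```
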
Process multiple queries efficiently:
# Batch intent detection
queries = ["wyszukaj w google", "znajdź na github", "szukaj na amazon"]
results = [detector.get_best_match(q) for q in queries]
The system includes safety measures for browser automation.
The system is designed for extensibility:
# Custom web schema extractor (skeleton)
class CustomSchemaExtractor:
    def extract_schema(self, url: str) -> dict:
        """Return a schema dict in the format shown above."""
        raise NotImplementedError("custom extraction logic goes here")

# Enhanced context detector extension (skeleton); EnhancedContextDetector
# is assumed to be exported from the same module as get_enhanced_detector
from nlp2cmd.generation.enhanced_context import EnhancedContextDetector

class CustomContextDetector(EnhancedContextDetector):
    def _calculate_custom_score(self, query: str) -> float:
        """Custom scoring signal combined with the built-in detection."""
        raise NotImplementedError("custom scoring logic goes here")
The Web Schema system enables NLP2CMD to understand and interact with real websites, combining schema extraction, automatic site exploration, and bilingual intent detection to make web automation accessible through natural language.