Overview
Anysite CLI integrates with LLM providers to add AI-powered analysis to your data workflows. Six operations are available: classify, summarize, enrich, generate, match, and deduplicate.
Requires the llm extra: pip install "anysite-cli[llm]"
Setup
Configure your LLM provider.
Supported Providers
| Provider | Default Model | Structured Output Method |
|---|---|---|
| OpenAI | gpt-4.1-mini | Uses JSON Schema for structured output |
| Anthropic | claude-sonnet-4-5-20250514 | Uses system prompts with JSON schema |
Configuration is stored in ~/.anysite/config.yaml.
Operations
Classify
Categorize records into predefined categories.
If --categories is omitted, the LLM auto-detects 3-7 appropriate categories based on the data.
Summarize
Generate concise summaries.
Enrich
Extract new structured attributes from text data. Attributes are specified as name:type, using one of four types:
- Enum (predefined choices): "seniority:junior/mid/senior"
- Boolean (true/false): "is_technical:boolean"
- Number (numeric value): "years_experience:number"
- String (free text): "primary_skill:string"
Generate
Create new text using templates with field placeholders.
Match
Compare records across two sources and find the best matches.
Deduplicate
Find and flag semantic duplicates within a source.
Using LLM in Dataset Pipelines
Add LLM processing directly in your pipeline YAML.
Caching
LLM results are cached in a local SQLite database (~/.anysite/llm_cache.db) to avoid repeated API calls and reduce costs.
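To make the caching idea concrete, here is a minimal Python sketch of an SQLite-backed response cache keyed by a hash of the model name and prompt. Everything in it is an assumption made for illustration: the table layout, the key derivation, the `LLMCache` class, and the `use_cache` flag (a rough stand-in for the effect of --no-cache) are hypothetical and do not describe the real schema of llm_cache.db. The demo uses an in-memory database and a fake LLM function so it runs without any API key.

```python
import hashlib
import json
import sqlite3


def cache_key(model: str, prompt: str) -> str:
    """Derive a stable key from the inputs that determine the response."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class LLMCache:
    """Hypothetical cache; the table name and columns are assumptions."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )

    def get_or_call(self, model, prompt, call, use_cache=True):
        key = cache_key(model, prompt)
        if use_cache:
            row = self.conn.execute(
                "SELECT response FROM cache WHERE key = ?", (key,)
            ).fetchone()
            if row is not None:
                return row[0]  # cache hit: no API call is made
        response = call(prompt)
        # Assumption: a bypassed lookup still refreshes the stored entry.
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, response) VALUES (?, ?)",
            (key, response),
        )
        self.conn.commit()
        return response


calls = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a provider call; records each invocation."""
    calls.append(prompt)
    return f"summary of: {prompt}"


cache = LLMCache()
first = cache.get_or_call("demo-model", "hello", fake_llm)
second = cache.get_or_call("demo-model", "hello", fake_llm)  # served from cache
third = cache.get_or_call("demo-model", "hello", fake_llm, use_cache=False)
print(len(calls))  # 2: the second request never reached fake_llm
```

Including the model name in the key means that switching models does not return answers produced by a different model.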
Options Reference
| Option | Description | Applies To |
|---|---|---|
| --fields | Fields to include in LLM context (comma-separated) | classify, summarize |
| --categories | Comma-separated categories | classify |
| --add | Attribute to extract (repeatable) | enrich |
| --prompt | Template with {field} placeholders | generate |
| --temperature | LLM creativity (0.0-1.0) | generate |
| --max-length | Max words for output | summarize |
| --output-column | Name for the result column | all |
| --top-k | Number of matches per record | match |
| --key | Field to compare for duplicates | deduplicate |
| --threshold | Similarity threshold (0.0-1.0) | deduplicate |
| --no-cache | Skip the LLM cache | all |
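The --prompt option takes a template with {field} placeholders that are filled from each record. The Python sketch below illustrates that kind of substitution using the standard `str.format_map` method. It is not the tool's own templating code: the `render_prompt` helper, the sample record, and the choice to leave unknown placeholders untouched are all assumptions made for illustration.

```python
class _KeepMissing(dict):
    """Mapping that leaves unknown placeholders in the text unchanged."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_prompt(template: str, record: dict) -> str:
    """Fill {field} placeholders in the template from one record."""
    return template.format_map(_KeepMissing(record))


# Hypothetical record and template, for demonstration only.
record = {"name": "Ada Lovelace", "title": "Engineer"}
template = "Write a one-line intro for {name}, who works as {title}."
print(render_prompt(template, record))
```

Leaving an unmatched placeholder visible makes a misspelled field name easy to spot in the output; raising an error instead would be an equally reasonable design.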
Next Steps
- SQL Querying: Query and analyze your enriched data with DuckDB SQL
- Examples: See complete end-to-end workflow examples