Insert Data
From Stdin
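Pipe API results directly into a database with `anysite db insert <conn> --table <name> --stdin`.
From File
Load a previously saved file with `anysite db insert <conn> --table <name> --file <path>`.
Auto-Create Tables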
With `--auto-create`, the CLI automatically infers the schema from the JSON data and creates the table if it doesn’t exist. Column types are determined from the data values.
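As a rough illustration of what inferring a schema from JSON values and creating a table only when it is missing can look like, here is a minimal Python sketch using the standard-library `sqlite3` module. It is not the CLI's implementation: the helper names (`infer_type`, `infer_schema`, `create_table_sql`), the `SQL_TYPES` mapping, the widening rules, and the sample records are all assumptions made for the example, and timestamp detection is left out for brevity.

```python
import sqlite3

# Hypothetical mapping from inferred type names to SQLite column types.
SQL_TYPES = {"boolean": "INTEGER", "integer": "INTEGER", "float": "REAL", "string": "TEXT"}


def infer_type(value):
    """Classify one JSON value; bool is checked first because bool subclasses int."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def infer_schema(records):
    """Map each column to a type, widening to float or string when values disagree."""
    schema = {}
    for record in records:
        for column, value in record.items():
            if value is None:
                schema.setdefault(column, None)  # unknown until a non-null value appears
                continue
            new = infer_type(value)
            old = schema.get(column)
            if old is None or old == new:
                schema[column] = new
            elif {old, new} == {"integer", "float"}:
                schema[column] = "float"
            else:
                schema[column] = "string"
    # Columns that only ever held null fall back to string.
    return {column: kind or "string" for column, kind in schema.items()}


def create_table_sql(table, schema):
    """Build a CREATE TABLE statement that is safe to run repeatedly."""
    columns = ", ".join(f'"{name}" {SQL_TYPES[kind]}' for name, kind in schema.items())
    return f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})'


records = [
    {"id": 1, "name": "Ada", "score": 9, "active": True},
    {"id": 2, "name": "Grace", "score": 9.5, "active": False},
]
schema = infer_schema(records)
ddl = create_table_sql("profiles", schema)

conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute(ddl)  # second run is a no-op because of IF NOT EXISTS
```

In this sketch the `score` column sees both `9` and `9.5`, so it widens from integer to float. How the CLI itself resolves mixed values is not stated in this guide.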
Upsert (Insert or Update)
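Update existing records or insert new ones based on a unique key, using either `anysite db insert <conn> ... --upsert --key <col>` or `anysite db upsert <conn> --table <name> --conflict-columns <col>`.
Inspect Schema
View the table schema in your database with `anysite db schema <conn> --table <name>`.
Query Data
Run SQL queries against your database with `anysite db query <conn> --sql "..."`. Add `--format csv` to export the result as CSV.
Dataset Loading
Load data from a collected dataset pipeline directly into a database with `anysite dataset load-db <yaml> -c <conn>`.
Diff-Based Sync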
Compare collected data with what’s already in the database and apply incremental updates. Changes fall into three groups:
- New records: present in the dataset but not in the database
- Updated records: present in both but with different values
- Deleted records: present in the database but not in the latest dataset
ClickHouse uses ALTER TABLE mutations for diff-sync updates. Transactions are not supported, so each batch insert is applied directly.
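The three change groups above can be sketched in a few lines of Python. This is an illustrative sketch only, not the CLI's algorithm: `diff_records` is a hypothetical helper, its `key` and `fields` parameters loosely mirror the `--key` and `--fields` options listed in the Operations Reference, and the sample rows are made up.

```python
def diff_records(dataset, database, key, fields=None):
    """Split records into new, updated, and deleted, matching rows on `key`."""

    def index(rows):
        return {row[key]: row for row in rows}

    def view(row):
        # Compare only `fields` when given, otherwise every column except the key.
        names = fields if fields is not None else [name for name in row if name != key]
        return {name: row.get(name) for name in names}

    current, stored = index(dataset), index(database)
    new = [current[k] for k in current if k not in stored]
    deleted = [stored[k] for k in stored if k not in current]
    updated = [
        current[k]
        for k in current
        if k in stored and view(current[k]) != view(stored[k])
    ]
    return {"new": new, "updated": updated, "deleted": deleted}


dataset = [
    {"id": 1, "name": "Ada", "headline": "Engineer"},
    {"id": 2, "name": "Grace", "headline": "Admiral"},
    {"id": 4, "name": "Linus", "headline": "Maintainer"},
]
database = [
    {"id": 1, "name": "Ada", "headline": "Engineer"},
    {"id": 2, "name": "Grace", "headline": "Commodore"},
    {"id": 3, "name": "Alan", "headline": "Mathematician"},
]

changes = diff_records(dataset, database, key="id")
name_only = diff_records(dataset, database, key="id", fields=["name"])
```

Here record 4 is new, record 2 is updated (its headline changed), and record 3 is deleted. Restricting the comparison to `name` means record 2 no longer counts as updated, which is the kind of narrowing a field filter is useful for.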
Auto-Schema and Foreign Keys
The CLI automatically:
- Infers column types from JSON data (string, integer, float, boolean, timestamp)
- Flattens nested objects using underscore notation (e.g., `urn.value` → `urn_value`)
- Tracks provenance: links between parent and dependent source tables using foreign key references
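To make the underscore flattening concrete, here is a small Python sketch. The `flatten` helper and the sample record are hypothetical, and the sketch leaves lists and other non-object values untouched; how the CLI treats lists or colliding column names is not described here.

```python
def flatten(record, parent=""):
    """Flatten nested dicts, joining key paths with underscores (urn.value -> urn_value)."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}_{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path so far as the prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


row = flatten({"name": "Ada", "urn": {"type": "member", "value": "abc123"}})
```

The nested `urn` object becomes two flat columns, `urn_type` and `urn_value`, alongside the top-level `name` column.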
Complete Pipeline-to-Database Example
Operations Reference
| Command | Description |
|---|---|
| `anysite db insert <conn> --table <name> --stdin` | Insert data from stdin |
| `anysite db insert <conn> --table <name> --file <path>` | Insert data from file |
| `anysite db insert <conn> ... --auto-create` | Auto-create table from data |
| `anysite db insert <conn> ... --upsert --key <col>` | Upsert on unique key |
| `anysite db upsert <conn> --table <name> --conflict-columns <col>` | Upsert with conflict handling |
| `anysite db schema <conn> --table <name>` | Inspect table schema |
| `anysite db query <conn> --sql "..."` | Run SQL query |
| `anysite db query <conn> --sql "..." --format csv` | Export query to CSV |
| `anysite dataset load-db <yaml> -c <conn>` | Load dataset into database |
| `anysite dataset load-db <yaml> -c <conn> --drop-existing` | Reload with fresh tables |
| `anysite dataset load-db <yaml> -c <conn> --snapshot <date>` | Load a specific snapshot |
| `anysite dataset diff <yaml> --source <id> --key <field>` | Show diff between dataset and DB |
| `anysite dataset diff <yaml> ... --fields "name,headline"` | Diff with specific fields |
Next Steps
- LLM Analysis: Enrich your data with AI-powered classification and summarization
- SQL Querying: Query collected datasets with DuckDB SQL