Insert Data
From Stdin
Pipe API results directly into a database:
# Single profile → database
anysite api /api/linkedin/user user=satyanadella -q --format jsonl | \
anysite db insert mydb --table profiles --stdin --auto-create
# Batch results → database
anysite api /api/linkedin/user --from-file users.txt --input-key user \
--parallel 5 -q --format jsonl | \
anysite db insert mydb --table profiles --stdin --auto-create
From File
Load a previously saved file:
anysite db insert mydb --table profiles --file results.jsonl --auto-create
Auto-Create Tables
With --auto-create, the CLI automatically infers the schema from the JSON data and creates the table if it doesn’t exist. Column types are determined from the data values.
The --auto-create flag is safe to use repeatedly — it only creates the table on the first run, then inserts into the existing table on subsequent runs.
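For example, here is a minimal sketch of schema inference (the sample record, table name, and inferred types are illustrative and depend on your data):
# Insert a single JSON record and let the CLI infer the schema
echo '{"name": "Satya Nadella", "follower_count": 11000000, "is_premium": true}' | \
anysite db insert mydb --table demo_profiles --stdin --auto-create
# Inspect what was created; expect roughly: name → string/text,
# follower_count → integer, is_premium → boolean
anysite db schema mydb --table demo_profiles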
Upsert (Insert or Update)
Update existing records or insert new ones based on a unique key:
# Upsert with conflict handling
cat updates.jsonl | \
anysite db upsert mydb --table profiles --conflict-columns id --stdin
# Or use the insert command with the --upsert flag
cat updated_profiles.jsonl | \
anysite db insert mydb --table profiles --stdin --auto-create --upsert --key urn_value
An upsert performs an INSERT for new records and an UPDATE for existing ones, matched on the specified key or conflict columns.
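A minimal sketch of this behavior, reusing the flags above (the sample record and key value are hypothetical):
# First run inserts the record
echo '{"urn_value": "abc123", "name": "Satya Nadella", "headline": "CEO"}' | \
anysite db insert mydb --table profiles --stdin --auto-create --upsert --key urn_value
# Running again with the same key updates the row instead of duplicating it
echo '{"urn_value": "abc123", "name": "Satya Nadella", "headline": "Chairman and CEO"}' | \
anysite db insert mydb --table profiles --stdin --auto-create --upsert --key urn_value
# Still exactly one row for this key
anysite db query mydb --sql "SELECT COUNT(*) FROM profiles WHERE urn_value = 'abc123'"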
Inspect Schema
View the table schema in your database:
anysite db schema mydb --table profiles
Query Data
Run SQL queries against your database:
# Query with table output
anysite db query mydb --sql "SELECT name, headline FROM profiles LIMIT 10" --format table
# Export query results to CSV
anysite db query mydb --sql "SELECT * FROM profiles WHERE industry = 'Technology'" \
--format csv --output tech_profiles.csv
# Count records
anysite db query mydb --sql "SELECT COUNT(*) FROM profiles"
Dataset Loading
Load data from a collected dataset pipeline directly into a database:
# Load all sources into database (creates tables per source)
anysite dataset load-db dataset.yaml -c mydb
# Drop existing tables and reload
anysite dataset load-db dataset.yaml -c mydb --drop-existing
# Load a specific snapshot date
anysite dataset load-db dataset.yaml -c mydb --snapshot 2026-01-15
# Auto-load during collection
anysite dataset collect dataset.yaml --load-db mydb
Each source in the pipeline becomes a separate table in the database, named after the source ID.
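For example, assuming the pipeline defines sources with IDs profiles and employees (as in the complete example further down), each ends up as its own table:
# Inspect the per-source tables after loading
anysite db schema mydb --table profiles
anysite db schema mydb --table employees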
Diff-Based Sync
Compare collected data with what’s already in the database and apply incremental updates:
anysite dataset diff dataset.yaml --source profiles --key urn.value
# Diff with specific fields
anysite dataset diff dataset.yaml --source profiles --key urn.value --fields "name,headline"
This shows:
- New records — present in dataset but not in database
- Updated records — present in both but with different values
- Deleted records — present in database but not in latest dataset
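A typical incremental loop built only from the commands shown above (a sketch; source and key names follow the earlier examples):
# 1. Collect a fresh snapshot
anysite dataset collect dataset.yaml
# 2. Review what would change before touching the database
anysite dataset diff dataset.yaml --source profiles --key urn.value
# 3. Load once the diff looks right
anysite dataset load-db dataset.yaml -c mydb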
Auto-Schema and Foreign Keys
The CLI automatically:
- Infers column types from JSON data (string, integer, float, boolean, timestamp)
- Flattens nested objects using underscore notation, e.g. urn.value → urn_value (see the example below)
- Tracks provenance by linking parent and dependent source tables with foreign key references
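A quick way to see the flattening in action (the table name and sample record are hypothetical):
# A nested urn.value field becomes a flat urn_value column
echo '{"urn": {"value": "abc123"}, "name": "Satya Nadella"}' | \
anysite db insert mydb --table flatten_demo --stdin --auto-create
anysite db query mydb --sql "SELECT urn_value, name FROM flatten_demo"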
Complete Pipeline-to-Database Example
# Step 1: Collect data
anysite dataset collect dataset.yaml
# Step 2: Load into PostgreSQL
anysite dataset load-db dataset.yaml -c pg
# Step 3: Query the results
anysite db query pg --sql "
SELECT p.name, p.headline, e.company_name
FROM profiles p
JOIN employees e ON p.urn_value = e.urn_value
LIMIT 20
" --format table
# Or do it all in one command:
anysite dataset collect dataset.yaml --load-db pg
Operations Reference
| Command | Description |
|---|---|
| `anysite db insert <conn> --table <name> --stdin` | Insert data from stdin |
| `anysite db insert <conn> --table <name> --file <path>` | Insert data from file |
| `anysite db insert <conn> ... --auto-create` | Auto-create table from data |
| `anysite db insert <conn> ... --upsert --key <col>` | Upsert on unique key |
| `anysite db upsert <conn> --table <name> --conflict-columns <col>` | Upsert with conflict handling |
| `anysite db schema <conn> --table <name>` | Inspect table schema |
| `anysite db query <conn> --sql "..."` | Run SQL query |
| `anysite db query <conn> --sql "..." --format csv` | Export query to CSV |
| `anysite dataset load-db <yaml> -c <conn>` | Load dataset into database |
| `anysite dataset load-db <yaml> -c <conn> --drop-existing` | Reload with fresh tables |
| `anysite dataset load-db <yaml> -c <conn> --snapshot <date>` | Load a specific snapshot |
| `anysite dataset diff <yaml> --source <id> --key <field>` | Show diff between dataset and DB |
| `anysite dataset diff <yaml> ... --fields "name,headline"` | Diff with specific fields |
Next Steps