For Data Engineers & DBAs

Elasticsearch Schema Management Was Hell. Then Someone Typed SQL.

No more REST API calls with JSON payloads. No more version-specific syntax. Manage your Elasticsearch indexes with familiar SQL.

The Problem

Elasticsearch has no native SQL DDL or DML. Creating an index means a REST API call with a hand-written JSON mapping. Existing field mappings are immutable: changing a field's type means creating a new index and reindexing every document. Dynamic mapping can silently infer conflicting types for the same field across indices. And there is no built-in migration tooling, so there is no standard, repeatable way to roll out schema changes at scale.
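For contrast, here is a sketch of what the status quo looks like: the JSON mapping payload you would hand-write for the raw Elasticsearch create-index API. Field names and types are chosen to match the CREATE TABLE example below; the HTTP call itself is elided since it needs a live cluster.

```python
import json

# The body for PUT /orders against the raw Elasticsearch REST API.
# "keyword", "double", and "date" are the Elasticsearch field types
# corresponding to the KEYWORD, DOUBLE, and DATE columns below.
mapping = {
    "mappings": {
        "properties": {
            "id": {"type": "keyword"},
            "customer_id": {"type": "keyword"},
            "amount": {"type": "double"},
            "status": {"type": "keyword"},
            "order_date": {"type": "date"},
        }
    }
}

# In practice this is sent as:  PUT http://localhost:9200/orders
body = json.dumps(mapping, indent=2)
print(body)
```

Every schema change means editing a payload like this by hand; one wrong type and you are reindexing.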

SQL DDL for Elasticsearch

-- Create an index with typed columns
CREATE TABLE orders (
  id KEYWORD,
  customer_id KEYWORD,
  amount DOUBLE,
  status KEYWORD,
  order_date DATE,
  PRIMARY KEY (id)
);

-- Add a column safely
ALTER TABLE orders ADD COLUMN IF NOT EXISTS shipping_address TEXT;

-- Inspect your schema
DESCRIBE TABLE orders;

-- Drop when done
DROP TABLE IF EXISTS temp_orders;

SQL DML + Bulk Loading

-- Insert documents
INSERT INTO orders (id, customer_id, amount, status)
VALUES ('o1', 'c42', 199.99, 'pending');

-- Update with conditions
UPDATE orders SET status = 'shipped' WHERE status = 'pending' AND order_date < '2025-01-01';

-- Bulk load from S3 (Parquet, JSON, Delta)
COPY INTO orders FROM 's3://data-lake/orders/' FORMAT PARQUET;

-- Bulk load from local JSON
COPY INTO logs FROM '/data/app-logs.json' FORMAT JSON;
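Under the hood, a bulk load presumably maps to Elasticsearch's _bulk API, whose newline-delimited payload format is easy to get wrong by hand. A minimal sketch of what COPY INTO spares you (document values are illustrative):

```python
import json

docs = [
    {"id": "o1", "customer_id": "c42", "amount": 199.99, "status": "pending"},
    {"id": "o2", "customer_id": "c7", "amount": 54.00, "status": "shipped"},
]

# The _bulk API expects NDJSON: an action line, then the source document,
# each on its own line, with a trailing newline.
lines = []
for doc in docs:
    lines.append(json.dumps({"index": {"_index": "orders", "_id": doc["id"]}}))
    lines.append(json.dumps(doc))
payload = "\n".join(lines) + "\n"

# Sent as:  POST http://localhost:9200/_bulk
# with header:  Content-Type: application/x-ndjson
print(payload)
```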

COPY INTO — Formats & Sources

File Formats

JSON

JSON Array

Parquet

Delta

Data Sources

Local files

Amazon S3

Azure Blob

Google Cloud Storage

HDFS
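The distinction between the JSON and JSON Array formats presumably mirrors the two common file layouts: newline-delimited JSON (one document per line) versus a single top-level array holding all documents. A minimal sketch of the difference, with illustrative data:

```python
import json

# JSON format: one document per line (NDJSON)
ndjson_input = '{"id": "o1"}\n{"id": "o2"}\n'
ndjson_docs = [json.loads(line) for line in ndjson_input.splitlines() if line.strip()]

# JSON Array format: one top-level array holding all documents
array_input = '[{"id": "o1"}, {"id": "o2"}]'
array_docs = json.loads(array_input)

# Both layouts decode to the same list of documents
print(ndjson_docs == array_docs)
```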