Zero-copy columnar access to Elasticsearch over gRPC — for DuckDB, Python, Apache Superset, Grafana, and any Arrow Flight SQL client.
Features
- gRPC Protocol — High-performance columnar data access over HTTP/2
- BI Tool Compatible — Works with DBeaver, Superset, Grafana, DataGrip, and any Arrow Flight SQL client
- Docker Ready — Pre-built Docker images for ES 6, 7, 8, and 9
- Lazy Streaming — Memory-efficient; only the current batch resides in memory
- Configurable Batch Size — Via `arrow.flight.batch-size`
- Full SQL — DDL + DML + DQL, not just SELECT
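The lazy-streaming and batch-size features above can be pictured with a small, self-contained Python sketch. It is not the server's implementation: `iter_batches` is a hypothetical helper that only illustrates how a consumer holds at most `batch_size` rows at a time, and the value 1000 is assumed to mirror the default `batch-size` shown in the configuration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size rows from any iterable.

    Hypothetical helper: only one batch is materialized at a time, which is
    the memory behaviour the "Lazy Streaming" feature describes.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # 2500 synthetic rows are consumed as batches of 1000, 1000 and 500.
    sizes = [len(batch) for batch in iter_batches(range(2500), batch_size=1000)]
    print(sizes)
```

A smaller batch size lowers peak memory per batch at the cost of more round trips; a larger one does the opposite.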
Quick Start with Docker
docker run -p 32010:32010 \
-e ES_HOST=elasticsearch \
-e ES_PASSWORD=changeme \
softnetwork/softclient4es8-arrow-flight-sql:latest
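After starting the container, a client script may want to wait until the published port accepts connections. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, and a successful TCP connect only shows that port 32010 is open, not that the Flight SQL service has finished initializing.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 32010,
                  timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a reachability probe.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Port not reachable yet: wait briefly and retry.
            time.sleep(0.5)
    return False
```

A typical use would be `if not wait_for_port(): raise SystemExit("server not reachable")` before opening a client connection.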
Available Docker Images
| Elasticsearch | Docker Image |
|---|---|
| ES 6.x | softnetwork/softclient4es6-arrow-flight-sql:latest |
| ES 7.x | softnetwork/softclient4es7-arrow-flight-sql:latest |
| ES 8.x | softnetwork/softclient4es8-arrow-flight-sql:latest |
| ES 9.x | softnetwork/softclient4es9-arrow-flight-sql:latest |
Fat JAR
java -jar softclient4es8-arrow-flight-sql-0.1.5.jar
| Elasticsearch | Artifact |
|---|---|
| ES 6.x | softclient4es6-arrow-flight-sql-0.1.5.jar |
| ES 7.x | softclient4es7-arrow-flight-sql-0.1.5.jar |
| ES 8.x | softclient4es8-arrow-flight-sql-0.1.5.jar |
| ES 9.x | softclient4es9-arrow-flight-sql-0.1.5.jar |
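Both the Docker images and the fat JARs follow one naming pattern keyed on the Elasticsearch major version. As a sketch for deployment scripts, the hypothetical helpers below derive the names from that pattern; the `latest` tag and the `0.1.5` version are simply the values listed in the tables, not guaranteed to be current.

```python
SUPPORTED_ES_MAJORS = (6, 7, 8, 9)  # versions listed in the tables above


def _check_major(es_major: int) -> None:
    """Reject Elasticsearch major versions that the tables do not list."""
    if es_major not in SUPPORTED_ES_MAJORS:
        raise ValueError(f"unsupported Elasticsearch major version: {es_major}")


def docker_image(es_major: int, tag: str = "latest") -> str:
    """Build the Docker image reference for an Elasticsearch major version."""
    _check_major(es_major)
    return f"softnetwork/softclient4es{es_major}-arrow-flight-sql:{tag}"


def fat_jar(es_major: int, version: str = "0.1.5") -> str:
    """Build the fat JAR file name for an Elasticsearch major version."""
    _check_major(es_major)
    return f"softclient4es{es_major}-arrow-flight-sql-{version}.jar"


if __name__ == "__main__":
    print(docker_image(8))
    print(fat_jar(8))
```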
Python + DuckDB
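import adbc_driver_flightsql.dbapi as flight_sql
import duckdb
conn = flight_sql.connect("grpc://localhost:32010")
cursor = conn.cursor() # a cursor is needed before executing a query
cursor.execute("SELECT * FROM ecommerce")
table = cursor.fetch_arrow_table() # zero-copy Arrow table
# Assumed completion of the local DuckDB aggregation over the fetched table
duckdb.register("ecommerce", table)
duckdb.sql("SELECT category, SUM(total_price) AS revenue FROM ecommerce GROUP BY category").show()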
Configuration
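# Arrow Flight SQL server settings (keys under arrow.flight, e.g. arrow.flight.batch-size)
host = "0.0.0.0" # env: ARROW_HOST
port = 32010 # env: ARROW_PORT
batch-size = 1000 # env: ARROW_BATCH_SIZE
# Elasticsearch connection settings
host = "localhost" # env: ES_HOST
port = 9200 # env: ES_PORT
user = "elastic" # env: ES_USER
password = "changeme" # env: ES_PASSWORD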
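Each setting above names an environment variable next to its default. The sketch below mimics that resolution in Python, under the assumption that an environment variable, when set, takes precedence over the default. `resolve_settings` is a hypothetical helper for scripts and tests; it does not read the server's actual configuration file.

```python
import os
from typing import Any, Callable, Dict, Mapping, Optional, Tuple

# Environment variable -> (default value, parser), copied from the settings above.
SETTINGS: Dict[str, Tuple[Any, Callable[[str], Any]]] = {
    "ARROW_HOST": ("0.0.0.0", str),
    "ARROW_PORT": (32010, int),
    "ARROW_BATCH_SIZE": (1000, int),
    "ES_HOST": ("localhost", str),
    "ES_PORT": (9200, int),
    "ES_USER": ("elastic", str),
    "ES_PASSWORD": ("changeme", str),
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Return effective settings: the env value if present, else the default."""
    source = os.environ if env is None else env
    resolved: Dict[str, Any] = {}
    for name, (default, parse) in SETTINGS.items():
        raw = source.get(name)
        resolved[name] = default if raw is None else parse(raw)
    return resolved


if __name__ == "__main__":
    # Example: override only the batch size, keep every other default.
    print(resolve_settings({"ARROW_BATCH_SIZE": "5000"}))
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to test.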
Arrow Type Mapping
| SQL Type | ES Type | Arrow Type |
|---|---|---|
| TINYINT | byte | Int(32) |
| SMALLINT | short | Int(32) |
| INT | integer | Int(32) |
| BIGINT | long | Int(64) |
| REAL | float | Float(SINGLE) |
| DOUBLE | double | Float(DOUBLE) |
| BOOLEAN | boolean | Bool |
| DATE | datetime | Date(MILLISECOND) |
| TIMESTAMP | datetime | Timestamp(MS) |
| VARCHAR | text | Utf8 |
| KEYWORD | keyword | Utf8 |
| VARBINARY | binary | Binary |
| STRUCT | object | Struct |
| GEO_POINT | geo_point | Struct{Float64, Float64} |
| ARRAY<*> | — | List |
| ARRAY<STRUCT> | nested | List<Struct> |
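For client code that needs to predict the Arrow type of a column, the table above can be encoded as a simple lookup. This is a sketch only: `arrow_type_for` is a hypothetical helper, and the returned strings are the labels from the table, not pyarrow type objects.

```python
# SQL type -> Arrow type label, transcribed from the mapping table above.
ARROW_TYPE_BY_SQL_TYPE = {
    "TINYINT": "Int(32)",
    "SMALLINT": "Int(32)",
    "INT": "Int(32)",
    "BIGINT": "Int(64)",
    "REAL": "Float(SINGLE)",
    "DOUBLE": "Float(DOUBLE)",
    "BOOLEAN": "Bool",
    "DATE": "Date(MILLISECOND)",
    "TIMESTAMP": "Timestamp(MS)",
    "VARCHAR": "Utf8",
    "KEYWORD": "Utf8",
    "VARBINARY": "Binary",
    "STRUCT": "Struct",
    "GEO_POINT": "Struct{Float64, Float64}",
}


def arrow_type_for(sql_type: str) -> str:
    """Return the Arrow type label for a SQL type name (case-insensitive)."""
    normalized = sql_type.strip().upper()
    # Arrays of structs (ES "nested") map to List<Struct>; other arrays to List.
    if normalized == "ARRAY<STRUCT>":
        return "List<Struct>"
    if normalized.startswith("ARRAY<") and normalized.endswith(">"):
        return "List"
    if normalized not in ARROW_TYPE_BY_SQL_TYPE:
        raise ValueError(f"no Arrow mapping listed for SQL type: {sql_type}")
    return ARROW_TYPE_BY_SQL_TYPE[normalized]


if __name__ == "__main__":
    for name in ("bigint", "KEYWORD", "ARRAY<INT>", "ARRAY<STRUCT>"):
        print(name, "->", arrow_type_for(name))
```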
ADBC Driver (In-Process Alternative)
For use cases that don’t need a separate server, the ADBC driver provides in-process columnar access:
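import org.apache.arrow.adbc.core.AdbcDriver
import org.apache.arrow.adbc.drivermanager.AdbcDriverManager
import org.apache.arrow.memory.RootAllocator
val allocator = new RootAllocator() // Arrow memory allocator shared with the driver
val params = new java.util.HashMap[String, AnyRef]()
params.put(AdbcDriver.PARAM_URI.getKey,
"adbc:elastic://localhost:9200?user=elastic&password=changeme")
val db = AdbcDriverManager.getInstance().connect(params, allocator)
val conn = db.connect() // open a connection on the database handle
val stmt = conn.createStatement()
stmt.setSqlQuery("SELECT * FROM my_index LIMIT 10")
val result = stmt.executeQuery() // columnar (Arrow) query result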
ADBC vs JDBC vs Arrow Flight SQL
| Feature | JDBC | ADBC | Arrow Flight SQL |
|---|---|---|---|
| Process model | In-process | In-process | Separate server (gRPC) |
| Data format | Row-based (ResultSet) | Columnar (Arrow) | Columnar (Arrow) |
| Protocol | JDBC API | ADBC API | gRPC (HTTP/2) |
| Use case | Java apps, BI tools | Analytics, data engineering | Multi-client, networked |
| Setup | JAR on classpath | JAR on classpath | Docker/server deployment |
Live Demos
DuckDB + Python Pipeline
docker compose --profile duckdb up
Apache Superset BI Dashboards
docker compose --profile superset-flight up
Grafana BI Dashboards
docker compose --profile grafana up
License
This Arrow Flight SQL server is licensed under the Elastic License 2.0: free to use, but not open source.