Bring Elasticsearch Into Your Data Platform
Elasticsearch is a data silo in modern lakehouse architectures. SoftClient4ES bridges the gap with the first and only Arrow Flight SQL server for ES.
The Data Silo Problem
Modern data platforms are converging on Apache Arrow as the standard in-memory columnar format. DuckDB, Spark, Dremio, Snowflake, and Databricks all speak Arrow. Elasticsearch, however, has no native Arrow interface, so ES data has to go through ETL before it can participate in the lakehouse.
Arrow Flight SQL + ADBC
Arrow Flight SQL Server
gRPC-based protocol that delivers query results as Arrow columnar record batches. 10-100x faster than JDBC/ODBC for analytical workloads, because results stay columnar end to end and the client performs no row-by-row conversion.
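As a rough illustration of why columnar delivery suits analytical clients, the sketch below contrasts a row-oriented result (the shape a JDBC/ODBC-style cursor hands back) with a column-oriented one (the shape Arrow uses). Plain Python lists stand in for Arrow buffers, and the sample values and the rows_to_columns helper are hypothetical. It shows the data layout only and makes no claim about actual speedups.

```python
from statistics import mean

# Hypothetical result set in row-oriented form: one tuple per record.
rows = [
    ("checkout", 120.0),
    ("search", 80.0),
    ("checkout", 100.0),
]


def rows_to_columns(rows, names):
    """Pivot row tuples into a dict of column lists.

    A row-based driver leaves this pivot to the client before any
    column-wise computation can start.
    """
    columns = {name: [] for name in names}
    for row in rows:
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


columns = rows_to_columns(rows, ["service_name", "response_time"])

# With column-oriented data, an aggregate reads a single contiguous column.
avg_response = mean(columns["response_time"])
print(avg_response)
```

With a columnar protocol the result already arrives in the second shape, so the pivot step disappears from the client.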
ADBC Driver
Java ADBC driver for in-process, zero-copy JVM access. Connect from Python, Go, or DuckDB via the Arrow Flight SQL server using standard ADBC clients.
Materialized Views for Analytics
Elasticsearch has no native cross-index JOIN. Materialized Views bridge this gap — combine data from multiple indices, add aggregations and computed columns, and query the result via Arrow Flight SQL for maximum throughput. No external ETL needed.
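To make concrete what such a view computes, here is a pure-Python sketch of the join, hourly bucketing, and aggregation performed by the service_metrics definition that follows. The sample documents and the service_metrics function are hypothetical stand-ins for two ES indices. This models the semantics only, not how SoftClient4ES builds or refreshes the view.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical documents standing in for the two indices.
services = [
    {"id": 1, "name": "checkout", "team": "payments"},
    {"id": 2, "name": "search", "team": "discovery"},
]
access_logs = [
    {"service_id": 1, "timestamp": "2025-03-01T10:05:00", "response_time": 120.0},
    {"service_id": 1, "timestamp": "2025-03-01T10:45:00", "response_time": 80.0},
    {"service_id": 2, "timestamp": "2025-03-01T11:10:00", "response_time": 50.0},
]


def service_metrics(access_logs, services):
    """Inner-join logs to services, bucket by hour, then aggregate."""
    by_id = {s["id"]: s for s in services}
    groups = defaultdict(list)
    for log in access_logs:
        service = by_id.get(log["service_id"])
        if service is None:
            continue  # an inner join drops logs without a matching service
        # Equivalent of DATE_TRUNC('hour', timestamp)
        hour = datetime.fromisoformat(log["timestamp"]).replace(
            minute=0, second=0, microsecond=0
        )
        groups[(service["name"], service["team"], hour)].append(
            log["response_time"]
        )
    return [
        {
            "service_name": name,
            "team": team,
            "hour": hour,
            "avg_response": sum(times) / len(times),
            "request_count": len(times),
        }
        for (name, team, hour), times in sorted(groups.items())
    ]


for row in service_metrics(access_logs, services):
    print(row)
```

Each output row corresponds to one (service_name, team, hour) group, which is the grain a client sees when it queries the view.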
-- Cross-index JOIN + aggregation inside ES
CREATE MATERIALIZED VIEW service_metrics AS
SELECT
    s.name AS service_name,
    s.team,
    DATE_TRUNC('hour', l.timestamp) AS hour,
    AVG(l.response_time) AS avg_response,
    COUNT(*) AS request_count
FROM access_logs l
JOIN services s ON l.service_id = s.id
GROUP BY s.name, s.team, hour;

# Query via Arrow Flight SQL (Python)
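import adbc_driver_flightsql.dbapi as flight

# Connect to the Arrow Flight SQL endpoint over gRPC
conn = flight.connect("grpc://localhost:32010")
cursor = conn.cursor()
cursor.execute("SELECT * FROM service_metrics WHERE hour > '2025-03-01'")
# Fetch the result as an Arrow table, then convert it to a pandas DataFrame
df = cursor.fetch_arrow_table().to_pandas()
# Release the cursor and the connection
cursor.close()
conn.close()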
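For repeated use, the query above could be wrapped in a small helper that validates the date filter and always releases the connection. This is a minimal sketch: build_metrics_query and fetch_metrics are hypothetical names, the endpoint is the one used in the snippet above, and the date is inlined as a validated literal because this page does not say whether the server supports bound parameters.

```python
from contextlib import closing
from datetime import date


def build_metrics_query(since: str) -> str:
    """Build the filter query, accepting only a valid ISO date."""
    # date.fromisoformat raises ValueError for anything that is not a
    # date, which keeps arbitrary text out of the SQL string.
    parsed = date.fromisoformat(since)
    return f"SELECT * FROM service_metrics WHERE hour > '{parsed.isoformat()}'"


def fetch_metrics(uri: str, since: str):
    """Run the query over Arrow Flight SQL and return an Arrow table."""
    # Imported lazily so the query builder works without the driver.
    import adbc_driver_flightsql.dbapi as flight

    with closing(flight.connect(uri)) as conn, closing(conn.cursor()) as cursor:
        cursor.execute(build_metrics_query(since))
        return cursor.fetch_arrow_table()


print(build_metrics_query("2025-03-01"))
```

Calling fetch_metrics("grpc://localhost:32010", "2025-03-01").to_pandas() would then reproduce the DataFrame from the snippet above, assuming a server is listening on that address.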