For Data Platform Architects

Bring Elasticsearch Into Your Data Platform

Elasticsearch is a data silo in modern lakehouse architectures. SoftClient4ES bridges the gap with the first and only Arrow Flight SQL server for ES.

The Data Silo Problem

Modern data platforms converge on Apache Arrow as the standard in-memory columnar format. DuckDB, Spark, Dremio, Snowflake, and Databricks all speak Arrow. Elasticsearch, however, has no native Arrow interface, so ES data must go through ETL before it can participate in the lakehouse.

Arrow Flight SQL + ADBC

Arrow Flight SQL Server

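A gRPC-based protocol that delivers query results as Arrow columnar data. 10-100x faster than JDBC/ODBC for analytical workloads. No row-by-row serialization or deserialization: result batches stay in Arrow format from server to client.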
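
To make the row-versus-column distinction concrete, here is a toy sketch using only Python's standard library. It is not Arrow or Flight SQL, and the records and field names are made up for illustration. It only shows why an aggregate over a columnar layout touches one contiguous typed buffer instead of visiting every record.

```python
from array import array

# Hypothetical access-log records, used only for illustration.
rows = [
    {"service_id": 1, "response_time": 120.0},
    {"service_id": 2, "response_time": 80.0},
    {"service_id": 1, "response_time": 100.0},
]


def to_columns(records):
    """Pivot row-oriented records into one typed, contiguous array per field."""
    return {
        "service_id": array("q", (r["service_id"] for r in records)),
        "response_time": array("d", (r["response_time"] for r in records)),
    }


def mean_row_oriented(records):
    # Row layout: every record is visited and the field is looked up by name.
    return sum(r["response_time"] for r in records) / len(records)


def mean_columnar(columns):
    # Columnar layout: the aggregate scans a single contiguous buffer.
    values = columns["response_time"]
    return sum(values) / len(values)


columns = to_columns(rows)
row_mean = mean_row_oriented(rows)
col_mean = mean_columnar(columns)
```

Both functions return the same mean; the difference is in how much unrelated data each one has to walk through, which is the property columnar transports are built around.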

ADBC Driver

Java ADBC driver for in-process, zero-copy JVM access. Connect from Python, Go, or DuckDB via the Arrow Flight SQL server using standard ADBC clients.

Materialized Views for Analytics

Elasticsearch has no native cross-index JOIN. Materialized Views bridge this gap — combine data from multiple indices, add aggregations and computed columns, and query the result via Arrow Flight SQL for maximum throughput. No external ETL needed.

-- Cross-index JOIN + aggregation inside ES
CREATE MATERIALIZED VIEW service_metrics AS
SELECT
  s.name AS service_name,
  s.team,
  DATE_TRUNC('hour', l.timestamp) AS hour,
  AVG(l.response_time) AS avg_response,
  COUNT(*) AS request_count
FROM access_logs l
JOIN services s ON l.service_id = s.id
GROUP BY s.name, s.team, hour;
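
For readers who want to check what the view above computes, the following is a minimal in-memory sketch of the same join and aggregation in plain Python. It does not use SoftClient4ES or Elasticsearch. The sample documents are hypothetical, and treating the JOIN as an inner join is an assumption based on standard SQL semantics.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample documents standing in for two Elasticsearch indices.
services = [
    {"id": 1, "name": "checkout", "team": "payments"},
    {"id": 2, "name": "search", "team": "discovery"},
]
access_logs = [
    {"service_id": 1, "timestamp": "2025-03-01T10:05:00", "response_time": 120.0},
    {"service_id": 1, "timestamp": "2025-03-01T10:45:00", "response_time": 80.0},
    {"service_id": 2, "timestamp": "2025-03-01T10:30:00", "response_time": 40.0},
    {"service_id": 1, "timestamp": "2025-03-01T11:10:00", "response_time": 200.0},
]


def service_metrics(logs, services):
    """Inner-join logs to services, then aggregate per (name, team, hour)."""
    by_id = {s["id"]: s for s in services}
    groups = defaultdict(list)
    for log in logs:
        service = by_id.get(log["service_id"])
        if service is None:
            continue  # inner join: drop logs with no matching service
        # Equivalent of DATE_TRUNC('hour', l.timestamp)
        hour = datetime.fromisoformat(log["timestamp"]).replace(
            minute=0, second=0, microsecond=0
        )
        groups[(service["name"], service["team"], hour)].append(log["response_time"])
    return [
        {
            "service_name": name,
            "team": team,
            "hour": hour,
            "avg_response": sum(times) / len(times),
            "request_count": len(times),
        }
        for (name, team, hour), times in sorted(groups.items())
    ]


metrics = service_metrics(access_logs, services)
```

Each element of `metrics` corresponds to one row of `service_metrics`: one entry per service, team, and hour bucket, carrying the average response time and the request count for that bucket.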

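# Query via Arrow Flight SQL (Python)
import adbc_driver_flightsql.dbapi as flight

# Connect to the Arrow Flight SQL server; the context managers close the
# cursor and the connection on exit
with flight.connect("grpc://localhost:32010") as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM service_metrics WHERE hour > '2025-03-01'")
        # Fetch the result as an Arrow table, then convert it to a pandas DataFrame
        df = cursor.fetch_arrow_table().to_pandas()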

Arrow Flight SQL Guide →