DQL — Queries
The SQL Gateway provides a full Data Query Language on top of Elasticsearch, translating SQL into search, aggregation, and scroll APIs.
SELECT
SELECT [DISTINCT] expr1, expr2, ...FROM table_name [alias][WHERE condition][GROUP BY expr1, expr2, ...][HAVING condition][ORDER BY expr1 [ASC|DESC] [NULLS FIRST|NULLS LAST], ...][LIMIT n][OFFSET m];Nested Fields and Aliases
SELECT id, name AS full_name, profile.city AS city, profile.followers AS followersFROM dql_usersORDER BY id ASC;profileis aSTRUCTcolumn- Dot notation accesses nested fields
- Aliases (
AS) are returned as column names
WHERE
Supports comparison operators (=, !=, <, <=, >, >=), logical operators (AND, OR, NOT), IN, NOT IN, BETWEEN, IS NULL, IS NOT NULL, LIKE, RLIKE (regex), and conditions on nested fields.
SELECT id, name, ageFROM dql_usersWHERE (age > 20 AND profile.followers >= 100) OR (profile.city = 'Lyon' AND age < 50)ORDER BY age DESC;SELECT id, age + 10 AS age_plus_10, nameFROM dql_usersWHERE age BETWEEN 20 AND 50 AND name IN ('Alice', 'Bob', 'Chloe') AND name IS NOT NULL AND (name LIKE 'A%' OR name RLIKE '.*o.*');ORDER BY
Supports multiple sort keys, ASC/DESC, NULLS FIRST / NULLS LAST,
expressions, and nested fields.
SELECT id, name, ageFROM dql_usersORDER BY age DESC, name ASCLIMIT 2 OFFSET 1;NULLS FIRST / NULLS LAST
Each sort key may declare where NULL values appear:
SELECT id, name, bonusFROM dql_usersORDER BY bonus DESC NULLS LAST;Mapped to Elasticsearch’s sort.missing parameter:
NULLS FIRST→"missing": "_first"NULLS LAST→"missing": "_last"
When omitted, defaults follow the Elasticsearch convention: ASC →
nulls last, DESC → nulls first.
Different null orderings can be combined within a single query:
SELECT id, name, bonus, hire_dateFROM dql_usersORDER BY bonus DESC NULLS LAST, hire_date ASC NULLS FIRST;Caveat (ES6 Jest client): scroll / search_after queries in the
ES6 Jest client do not propagate NULLS FIRST / NULLS LAST reliably
across batches. For ES6 scroll/search_after, prefer client-side
null-bucketing or upgrade to ES7+.
LIMIT / OFFSET
LIMIT nrestricts returned rowsOFFSET mskips the firstmrows- Translated to Elasticsearch
from+size
UNION ALL
Combines results of multiple SELECT queries without removing duplicates. All SELECT statements must have the same number of columns with the same names.
SELECT id, name FROM dql_users WHERE age > 30UNION ALLSELECT id, name FROM dql_users WHERE age <= 30;Executed using Elasticsearch Multi-Search (_msearch). ORDER BY and LIMIT apply per SELECT, not globally.
JOIN UNNEST
The Gateway supports JOIN UNNEST on ARRAY<STRUCT> columns.
SELECT o.id, items.product, items.quantity, SUM(items.price * items.quantity) OVER (PARTITION BY o.id) AS total_priceFROM dql_orders oJOIN UNNEST(o.items) AS itemsWHERE items.quantity >= 1ORDER BY o.id ASC;JOIN UNNEST produces one output row per array element, with parent fields duplicated — exactly like a standard SQL UNNEST. It supports expressions, filtering, and aggregations via window functions. Multi-level nesting is handled recursively.
Aggregations
Supported aggregate functions: COUNT(*), COUNT(expr), SUM, AVG, MIN, MAX,
STDDEV / STDDEV_SAMP / STDDEV_POP, VARIANCE / VAR_SAMP / VAR_POP.
STDDEV defaults to sample standard deviation (STDDEV ≡ STDDEV_SAMP, Bessel-corrected)
and VARIANCE defaults to sample variance (VARIANCE ≡ VAR_SAMP). This matches PostgreSQL
and Snowflake; users coming from MySQL 5.5 or earlier should note that those releases defaulted
STDDEV to population.
SELECT department, STDDEV(salary) AS sd, VAR_POP(salary) AS vpFROM empGROUP BY department;All six map to a single Elasticsearch extended_stats aggregation per call. Sample variants
require Elasticsearch 7.7+; population variants work on Elasticsearch 6+.
Percentiles — PERCENTILE_CONT / PERCENTILE_DISC
-- p99 request latency per endpoint (SRE latency analysis)SELECT endpoint, PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY duration_ms) AS p99FROM requestsGROUP BY endpoint;Forms accepted (for both PERCENTILE_CONT and PERCENTILE_DISC):
PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY column)— ANSI ordered-set aggregate (optionally with a top-levelGROUP BY)PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY column) OVER (PARTITION BY ...)— value column fromWITHIN GROUP, partition fromOVERPERCENTILE_CONT(p) OVER (PARTITION BY ... ORDER BY column)— value column from theOVERORDER BYPERCENTILE_CONT(column, p)— column-first shorthand (many BI tools emit it)
The percentile literal p is a value in [0, 1] (e.g. 0.99 for p99); out-of-range values are rejected
at parse time. The value column comes from the ORDER BY clause or the shorthand’s first argument;
grouping comes from OVER (PARTITION BY ...) or a top-level GROUP BY. Both functions map to the
Elasticsearch percentiles aggregation (TDigest). Elasticsearch has no native discrete percentile, so
PERCENTILE_DISC is continuous-backed — it returns the same interpolated value as PERCENTILE_CONT
rather than the nearest actual data point. All forms work on Elasticsearch 6+.
GROUP BY and HAVING
SELECT profile.city AS city, COUNT(*) AS cnt, AVG(age) AS avg_ageFROM dql_usersGROUP BY profile.cityHAVING COUNT(*) >= 1ORDER BY COUNT(*) DESC;GROUP BYsupports nested fieldsHAVINGfilters groups based on aggregate conditions- Translated to Elasticsearch aggregations
Parent-Level Aggregations on Nested Arrays
Compute aggregations over nested arrays while keeping one row per parent document (the original nested array is preserved):
SELECT o.id, o.items, SUM(items.price * items.quantity) OVER (PARTITION BY o.id) AS total_priceFROM dql_orders oJOIN UNNEST(o.items) AS itemsWHERE items.quantity >= 1ORDER BY o.id ASC;Returns one row per parent with the original nested array preserved and the aggregated value added as a top-level field.
Window Functions
Window functions operate over a logical window defined by OVER (PARTITION BY ... ORDER BY ...).
Supported: SUM, COUNT, STDDEV / STDDEV_SAMP / STDDEV_POP,
VARIANCE / VAR_SAMP / VAR_POP, PERCENTILE_CONT / PERCENTILE_DISC,
FIRST_VALUE, LAST_VALUE, ARRAY_AGG, ROW_NUMBER, RANK, DENSE_RANK.
SELECT product, customer, amount, SUM(amount) OVER (PARTITION BY product) AS sum_per_product, COUNT(_id) OVER (PARTITION BY product) AS cnt_per_product, FIRST_VALUE(amount) OVER (PARTITION BY product ORDER BY ts ASC) AS first_amount, LAST_VALUE(amount) OVER (PARTITION BY product ORDER BY ts ASC) AS last_amount, ARRAY_AGG(amount) OVER (PARTITION BY product ORDER BY ts ASC LIMIT 10) AS amounts_arrayFROM dql_salesORDER BY product, ts;Ranking windows (ROW_NUMBER, RANK, DENSE_RANK)
ORDER BY is REQUIRED inside OVER for ranking functions (ANSI). PARTITION BY
is optional — when absent, the entire result set is treated as one partition.
SELECT name, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS r, DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS drFROM emp;Tie semantics:
ROW_NUMBER— sequential within partition; no ties (1, 2, 3, 4, …)RANK— ties share rank, next rank skips (1, 2, 2, 4, …)DENSE_RANK— ties share rank, next rank does NOT skip (1, 2, 2, 3, …)
Top-N per group: inline LIMIT N inside OVER to keep only the top-N rows
per partition. The engine pushes N into the Elasticsearch top_hits.size
parameter, so only the top-N rows per partition are materialised:
SELECT name, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC LIMIT 3) AS rFROM emp;Without an explicit LIMIT, top_hits.size defaults to 100 — the
Elasticsearch index.max_inner_result_window default. For larger partitions
either supply LIMIT N inline or raise the index setting.
Functions
Numeric & Trigonometric
| Function | Description |
|---|---|
ABS(x) | Absolute value |
CEIL(x) / CEILING(x) | Round up |
FLOOR(x) | Round down |
ROUND(x, n) | Round to n decimals |
SQRT(x) | Square root |
POW(x, y) / POWER(x, y) | Power |
EXP(x) | Exponential |
LOG(x) / LN(x) | Natural logarithm |
LOG10(x) | Base-10 logarithm |
SIGN(x) / SGN(x) | Sign of x |
SIN(x), COS(x), TAN(x) | Trigonometric |
ASIN(x), ACOS(x), ATAN(x) | Inverse trigonometric |
ATAN2(y, x) | Arc-tangent of y/x |
PI() | Pi constant |
RADIANS(x), DEGREES(x) | Angle conversion |
String
| Function | Description |
|---|---|
CONCAT(a, b, ...) | Concatenate strings |
SUBSTRING(str, start, len) | Extract substring |
LOWER(str) / LCASE(str) | Lowercase |
UPPER(str) / UCASE(str) | Uppercase |
TRIM(str), LTRIM(str), RTRIM(str) | Trim whitespace |
LENGTH(str) / LEN(str) | String length |
REPLACE(str, from, to) | Replace substring |
LEFT(str, n), RIGHT(str, n) | Left/right n chars |
REVERSE(str) | Reverse string |
POSITION(substr IN str) / STRPOS(str, substr) | Position of substring |
REGEXP_LIKE(str, pattern) | Regex match |
MATCH(str) AGAINST (query) | Full-text search |
Date & Time
Current:
| Function | Description |
|---|---|
CURRENT_DATE / TODAY() / CURDATE() | Current date (UTC) |
CURRENT_TIMESTAMP / NOW() / CURRENT_DATETIME | Current timestamp (UTC) |
CURRENT_TIME / CURTIME() | Current time (UTC) |
Extraction:
| Function | Description |
|---|---|
YEAR(date), MONTH(date), DAY(date) | Date components |
HOUR(ts), MINUTE(ts), SECOND(ts) | Time components |
MILLISECOND(ts), MICROSECOND(ts), NANOSECOND(ts) | Sub-second components |
EXTRACT(unit FROM date) | Extract any date/time unit |
Arithmetic:
| Function | Description |
|---|---|
DATE_ADD(date, INTERVAL n unit) | Add interval |
DATE_SUB(date, INTERVAL n unit) | Subtract interval |
DATETIME_ADD(ts, INTERVAL n unit) | Add interval to timestamp |
DATETIME_SUB(ts, INTERVAL n unit) | Subtract from timestamp |
DATE_DIFF(date1, date2, unit) | Difference in units |
DATE_TRUNC(date, unit) | Truncate to unit |
Formatting & Parsing:
| Function | Description |
|---|---|
DATE_FORMAT(ts, pattern) | Format date as string |
DATE_PARSE(str, pattern) | Parse string into date |
DATETIME_FORMAT(ts, pattern) | Format timestamp as string |
DATETIME_PARSE(str, pattern) | Parse string into timestamp |
Special:
| Function | Description |
|---|---|
LAST_DAY(date) | Last day of month |
EPOCHDAY(date) | Days since epoch (1970-01-01) |
OFFSET_SECONDS(date) | Epoch seconds |
Geospatial
-- Create pointPOINT(longitude, latitude)
-- Calculate distanceST_DISTANCE(location, POINT(2.3522, 48.8566))Supported distance units: km, m, cm, mm, mi, yd, ft, in, nmi.
Conditional
| Function | Description |
|---|---|
CASE WHEN ... THEN ... ELSE ... END | Conditional expression |
COALESCE(a, b, c) | First non-null value |
NULLIF(a, b) | NULL if a = b |
GREATEST(e1, e2, ...) | Largest non-null numeric value (NULL only when every arg is NULL) |
LEAST(e1, e2, ...) | Smallest non-null numeric value (NULL only when every arg is NULL) |
ISNULL(expr) | TRUE if NULL |
ISNOTNULL(expr) | TRUE if NOT NULL |
Type Conversion
| Function | Description |
|---|---|
CAST(value AS TYPE) | Convert type (error on failure) |
TRY_CAST(value AS TYPE) / SAFE_CAST(...) | Convert type (NULL on failure) |
CONVERT(value, TYPE) | Alias for CAST |
value::TYPE | PostgreSQL-style cast |
System
| Function | Description |
|---|---|
VERSION() | Engine version string |
Scroll & Pagination
For large result sets, the Gateway uses Elasticsearch scroll or search-after mechanisms depending on backend capabilities.
LIMITandOFFSETare applied after retrieving documents from Elasticsearch- Deep pagination may require scroll
search_afterrequires an explicitORDER BYclause
SHOW and DESCRIBE Commands
-- TablesSHOW TABLES [LIKE 'pattern'];SHOW TABLE table_name;SHOW CREATE TABLE table_name;DESCRIBE TABLE table_name;
-- PipelinesSHOW PIPELINES;SHOW PIPELINE pipeline_name;SHOW CREATE PIPELINE pipeline_name;DESCRIBE PIPELINE pipeline_name;
-- WatchersSHOW WATCHERS;SHOW WATCHER STATUS watcher_name;
-- Enrich PoliciesSHOW ENRICH POLICIES;SHOW ENRICH POLICY policy_name;
-- ClusterSHOW CLUSTER NAME;
-- LicenseSHOW LICENSE;REFRESH LICENSE;SHOW LICENSE
SHOW LICENSE;Returns the current license type, quota values, expiration date, and grace status.
| Column | Description |
|---|---|
license_type | Current license tier (Community, Pro, Enterprise). Shows “(trial)” suffix for trial licenses, “(degraded)” suffix if degraded. |
trial | true if the license is a Pro trial, false otherwise |
platform | Platform scope of the current license key (PRODUCTION, STAGING, DEVELOPMENT, INTEGRATION). Defaults to “PRODUCTION” when not platform-scoped. |
max_materialized_views | Maximum materialized views allowed, or “unlimited” |
max_clusters | Maximum federated clusters allowed, or “unlimited” |
max_result_rows | Maximum rows returned per query, or “unlimited” |
max_concurrent_queries | Maximum concurrent queries allowed, or “unlimited” |
max_joins | Maximum number of JOIN operations allowed per query, or “unlimited” |
expires_at | License expiration timestamp, or “never” for Community |
days_remaining | Days until expiration, or -1 for Community (no expiry) |
status | ”Active”, or grace period details if expired |
REFRESH LICENSE
REFRESH LICENSE;Forces an immediate license refresh from the backend (API key fetch). Returns the previous and new tier information.
| Column | Description |
|---|---|
previous_tier | License tier before refresh |
new_tier | License tier after refresh |
trial | true if the new license is a Pro trial, false otherwise |
expires_at | New expiration timestamp |
status | ”Refreshed” on success, “Failed” on error |
message | Error details (empty on success) |
Requires API key configuration. Without an API key, returns an informational failure message.
Version Compatibility
| Feature | ES6 | ES7 | ES8 | ES9 |
|---|---|---|---|---|
| Basic SELECT | Yes | Yes | Yes | Yes |
| Nested fields | Yes | Yes | Yes | Yes |
| UNION ALL | Yes | Yes | Yes | Yes |
| JOIN UNNEST | Yes | Yes | Yes | Yes |
| Aggregations | Yes | Yes | Yes | Yes |
| Parent-level nested array aggs | Yes | Yes | Yes | Yes |
| Window functions | Yes | Yes | Yes | Yes |
| Geospatial functions | Yes | Yes | Yes | Yes |
| Date/time functions | Yes | Yes | Yes | Yes |
| String / math functions | Yes | Yes | Yes | Yes |
Limitations
- Traditional SQL joins are supported only through the use of Materialized Views (only
JOIN UNNESTonARRAY<STRUCT>is available natively) - No correlated subqueries
- No arbitrary subqueries in
SELECTorWHERE - No
GROUPING SETS,CUBE,ROLLUP - No
DISTINCT ON - No explicit window frame clauses (
ROWS BETWEEN ...)