MongoDB Compatibility Matrix

  • Version: 1.0
  • Baseline: MongoDB 6.0+ Wire Protocol
  • Last Updated: 2026-03-04
  • Token Inventory: 308 tokens across 6 categories (from tests/e2e/matrix/mql_language/reference_tokens.py)

Thermocline targets compatibility with the MongoDB 6.0+ wire protocol and MQL surface. Applications connect using standard MongoDB drivers with only a connection string change.
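
As a rough illustration of the "connection string change" claim, the sketch below assumes the PyMongo driver and two hypothetical host names (neither appears in the original document). The only value that differs between the two deployments is the URI handed to the driver.

```python
# Hypothetical endpoints: replace with the hosts of your own deployments.
MONGODB_URI = "mongodb://mongo.example.internal:27017"
THERMOCLINE_URI = "mongodb://thermocline.example.internal:27017"


def open_collection(uri, database, collection):
    """Return a driver collection handle for the given connection string."""
    # Imported lazily so this sketch loads even where pymongo is not installed.
    from pymongo import MongoClient

    return MongoClient(uri)[database][collection]


# Application code is unchanged; only the URI passed in differs:
# orders = open_collection(THERMOCLINE_URI, "shop", "orders")
# orders.find_one({"status": {"$in": ["paid", "shipped"]}})
```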

This document provides a per-operator compatibility matrix across three execution tiers:

| Tier | Description |
| --- | --- |
| Hot (SE) | Storage Engine (Rust) -- native LSM-tree, in-memory evaluation |
| Cold (QE) | Query Engine (Rust) -- Parquet/DataFusion, cold-tier pushdown |
| Federated | Coordinator (Go) -- hot+cold merge, post-filter materialization |

Legend

| Symbol | Meaning |
| --- | --- |
| Y | Fully supported |
| P | Partial support (see Notes) |
| N | Not supported |
| -- | Not applicable to this tier |
| RT | Routed through another tier (e.g., geo queries routed through SE materialization) |

1. Query Operators (31 tokens)

This section covers the query predicate operators accepted by find(), $match, deleteMany(), updateMany(), and other commands that take a filter.

| Operator | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $eq | Y | Y | Y | Implicit equality and explicit {$eq: val} |
| $ne | Y | Y | Y | Fixed in E5-S1 |
| $gt | Y | Y | Y | |
| $gte | Y | Y | Y | |
| $lt | Y | Y | Y | |
| $lte | Y | Y | Y | |
| $in | Y | Y | Y | |
| $nin | Y | Y | Y | QE: post-filter path (E5-S3) |
| $and | Y | Y | Y | |
| $or | Y | Y | Y | |
| $not | Y | Y | Y | Per-field negation |
| $nor | Y | Y | Y | |
| $exists | Y | Y | Y | |
| $type | Y | P | Y | QE: limited without schema; SE post-filter fallback (E5-S2) |
| $regex | Y | P | Y | QE: post-filter for complex patterns (E5-S3) |
| $mod | Y | P | Y | QE: post-filter path |
| $expr | Y | N | RT | Aggregation expression in filter; cold uses SE materialization |
| $text | Y | Y | Y | Requires text index; QE has dedicated text index module |
| $where | Y | N | RT | JS expression evaluation; cold routed through SE |
| $all | Y | P | Y | QE: post-filter for array containment |
| $elemMatch | Y | P | Y | QE: equality pushdown; range on LIST columns uses post-filter (E5-S3) |
| $size | Y | P | Y | QE: post-filter path |
| $bitsAllSet | Y | P | Y | QE: post-filter path |
| $bitsAllClear | Y | P | Y | QE: post-filter path |
| $bitsAnySet | Y | P | Y | QE: post-filter path |
| $bitsAnyClear | Y | P | Y | QE: post-filter path |
| $geoWithin | Y | N | RT | QE: routed through SE materialization (E5-S7) |
| $geoIntersects | Y | N | RT | QE: routed through SE materialization (E5-S7) |
| $near | Y | N | RT | QE: routed through SE materialization (E5-S7) |
| $nearSphere | Y | N | RT | QE: routed through SE materialization (E5-S7) |
| $jsonSchema | Y | N | RT | Schema validation in query context |

Coverage: 31/31 tokens handled (all routed or supported across tiers)


2. Update Operators (23 tokens)

Update operators are SE-only (writes always go to the hot tier).

| Operator | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $set | Y | -- | -- | |
| $unset | Y | -- | -- | |
| $inc | Y | -- | -- | |
| $mul | Y | -- | -- | |
| $min | Y | -- | -- | |
| $max | Y | -- | -- | |
| $rename | Y | -- | -- | |
| $currentDate | Y | -- | -- | Supports {$type: "date"} and {$type: "timestamp"} |
| $setOnInsert | Y | -- | -- | Applied only during upsert insert |
| $push | Y | -- | -- | With $each, $position, $slice, $sort modifiers |
| $addToSet | Y | -- | -- | With $each modifier |
| $pull | Y | -- | -- | Supports query condition matching |
| $pullAll | Y | -- | -- | |
| $pop | Y | -- | -- | First (-1) and last (1) element |
| $bit | Y | -- | -- | and, or, xor bitwise operations |
| $ (positional) | Y | -- | -- | Resolves to matched array index |
| $[] (all positional) | P | -- | -- | Supported via resolve_positional for single $ |
| $[<identifier>] (filtered) | P | -- | -- | Array filter identifier support is partial |
| $each | Y | -- | -- | Modifier for $push and $addToSet |
| $position | Y | -- | -- | Modifier for $push with $each |
| $slice (update) | Y | -- | -- | Modifier for $push with $each |
| $sort (update) | Y | -- | -- | Modifier for $push with $each |

Coverage: 23/23 tokens handled
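
The $push modifiers listed above interact in a fixed order in MongoDB: the $each items are inserted at $position, the array is then sorted, and finally it is trimmed by $slice. The sketch below models that order in plain Python for arrays of scalars. The function name and parameters are illustrative only, and it assumes Thermocline follows the MongoDB ordering.

```python
def apply_push(array, each, position=None, sort=None, slice_=None):
    """Model {$push: {field: {$each, $position, $sort, $slice}}} on a list of scalars."""
    result = list(array)

    # 1. Insert the $each items at $position (append when it is omitted).
    if position is None:
        result.extend(each)
    else:
        result[position:position] = each

    # 2. Apply $sort: 1 for ascending, -1 for descending.
    if sort is not None:
        result.sort(reverse=(sort == -1))

    # 3. Apply $slice: keep the first N items, or the last N when N is negative.
    if slice_ is not None:
        result = result[:slice_] if slice_ >= 0 else result[slice_:]

    return result


# Keep only the three highest scores after adding two new ones.
print(apply_push([40, 60], each=[80, 20], sort=-1, slice_=3))  # [80, 60, 40]
```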


3. Projection Operators (3 tokens)

| Operator | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $ (positional) | Y | -- | RT | Positional projection from query match |
| $elemMatch | Y | -- | RT | Project first matching array element |
| $slice | Y | -- | RT | Subset of array: {field: {$slice: N}} or {field: {$slice: [skip, limit]}} |

Additional projection features supported on SE:

  • Inclusion/exclusion projection
  • _id exclusion with inclusion fields
  • $meta projection (textScore, indexKey, recordId, vectorSearchScore)
  • Computed expressions in $project (arithmetic, string, conditional, field refs)
  • Nested field projection via dot notation

Coverage: 3/3 tokens handled
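
The two $slice projection forms from the table can be modeled on a plain Python list. This is a sketch of the MongoDB semantics (a negative N or skip counts from the end of the array), not Thermocline's implementation, and the function name is hypothetical.

```python
def project_slice(array, spec):
    """Model {field: {$slice: N}} and {field: {$slice: [skip, limit]}}."""
    if isinstance(spec, int):
        # Positive N keeps the first N elements, negative N keeps the last |N|.
        return array[:spec] if spec >= 0 else array[spec:]

    skip, limit = spec
    if limit <= 0:
        raise ValueError("limit must be positive in the [skip, limit] form")
    # A negative skip counts back from the end, clamped to the array start.
    start = skip if skip >= 0 else max(len(array) + skip, 0)
    return array[start:start + limit]


items = [1, 2, 3, 4, 5]
print(project_slice(items, 2))        # [1, 2]
print(project_slice(items, -2))       # [4, 5]
print(project_slice(items, [1, 2]))   # [2, 3]
print(project_slice(items, [-2, 1]))  # [4]
```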


4. Aggregation Stages (51 tokens)

| Stage | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $match | Y | Y | Y | Delegates to filter evaluator |
| $project | Y | Y | Y | Full expression support |
| $sort | Y | Y | Y | |
| $limit | Y | Y | Y | |
| $skip | Y | Y | Y | |
| $count | Y | Y | Y | |
| $group | Y | P | Y | QE: basic aggregation via DataFusion; SE: 25 accumulators |
| $addFields | Y | P | Y | SE: full expression eval; QE: literal/field-ref only |
| $set | Y | P | Y | Alias for $addFields |
| $unset | Y | P | Y | |
| $unwind | Y | P | Y | With includeArrayIndex and preserveNullAndEmptyArrays |
| $lookup | Y | N | RT | Equality and sub-pipeline forms; requires SE for foreign docs |
| $facet | Y | N | RT | Multiple sub-pipelines |
| $bucket | Y | N | RT | Boundary-based grouping with accumulators |
| $bucketAuto | Y | N | RT | Auto-bucketing with granularity series (R5/R10/R20/R40/1-2-5/POWERSOF2) |
| $sample | Y | N | RT | Random sampling |
| $replaceRoot | Y | N | RT | Replace document with sub-expression |
| $replaceWith | Y | N | RT | Alias for $replaceRoot |
| $redact | Y | N | RT | Field-level access control ($$PRUNE/$$DESCEND/$$KEEP) |
| $sortByCount | Y | N | RT | Group + count + sort descending |
| $graphLookup | Y | N | RT | Recursive graph traversal with depth control |
| $densify | Y | N | RT | Fill gaps in sequences (numeric/date) |
| $fill | Y | N | RT | Fill missing values (value/LOCF/linear methods) |
| $documents | Y | N | RT | Literal document stream |
| $setWindowFields | Y | N | RT | Window functions with document/range bounds |
| $unionWith | Y | N | RT | Combine with another collection's documents |
| $geoNear | Y | N | RT | Distance computation and proximity sort |
| $out | Y | N | RT | Write output to collection (parsed, execution deferred) |
| $merge | Y | N | RT | Merge into collection (replace/keepExisting/merge/fail) |
| $vectorSearch | Y | Y | Y | HNSW hot path + cold brute-force; coordinator merges results |
| $changeStream | P | -- | P | Passthrough; WAL-based change streams handled at gateway/SE level |
| $changeStreamSplitLargeEvent | P | -- | -- | Passthrough |
| $collStats | P | -- | -- | Passthrough (returns collection statistics) |
| $currentOp | P | -- | -- | Passthrough (server diagnostics) |
| $indexStats | P | -- | -- | Passthrough (index usage statistics) |
| $planCacheStats | P | -- | -- | Passthrough (plan cache diagnostics) |
| $listSessions | P | -- | -- | Passthrough |
| $listLocalSessions | P | -- | -- | Passthrough |
| $listClusterCatalog | N | -- | -- | Atlas-specific; passthrough (no-op) |
| $listSampledQueries | N | -- | -- | Atlas-specific; passthrough (no-op) |
| $listSearchIndexes | N | -- | -- | Atlas Search; passthrough (no-op) |
| $querySettings | N | -- | -- | Atlas-specific; passthrough (no-op) |
| $queryStats | N | -- | -- | Atlas-specific; passthrough (no-op) |
| $rankFusion | N | -- | -- | Atlas Search; passthrough (no-op) |
| $score | N | -- | -- | Atlas Search; passthrough (no-op) |
| $scoreFusion | N | -- | -- | Atlas Search; passthrough (no-op) |
| $search | N | -- | -- | Atlas Search; passthrough (no-op) |
| $searchMeta | N | -- | -- | Atlas Search; passthrough (no-op) |
| $shardedDataDistribution | N | -- | -- | Sharded cluster diagnostic; passthrough (no-op) |
| $mlTrain | Y | -- | -- | Thermocline extension: in-database ML training |
| $mlPredict | Y | -- | -- | Thermocline extension: ML model prediction |

Coverage: 51/51 reference tokens parsed (30 fully implemented, 8 passthrough for server diagnostics, 13 Atlas/sharded-specific stubs)
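
The note on $sortByCount ("Group + count + sort descending") can be made concrete. The sketch below shows the equivalent two-stage pipeline as documented for MongoDB, plus a plain-Python model of the result shape. The helper name and the sample field are illustrative, and the model assumes hashable field values.

```python
from collections import Counter

# {$sortByCount: "$tag"} is shorthand for this two-stage pipeline in MongoDB.
EQUIVALENT_PIPELINE = [
    {"$group": {"_id": "$tag", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def sort_by_count(docs, field):
    """Group documents by a field, count each group, and sort by count descending."""
    counts = Counter(doc.get(field) for doc in docs)
    return [{"_id": value, "count": count} for value, count in counts.most_common()]


docs = [{"tag": "a"}, {"tag": "b"}, {"tag": "a"}]
print(sort_by_count(docs, "tag"))
# [{'_id': 'a', 'count': 2}, {'_id': 'b', 'count': 1}]
```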


5. Aggregation Expressions (182 tokens)

5.1 Arithmetic (42 tokens)

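| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $add | Y | Y | Y | Multi-operand; date+number addition |
| $subtract | Y | Y | Y | |
| $multiply | Y | Y | Y | Multi-operand |
| $divide | Y | Y | Y | Division by zero returns NaN |
| $mod | Y | Y | Y | |
| $abs | Y | Y | Y | |
| $ceil | Y | Y | Y | |
| $floor | Y | Y | Y | |
| $round | Y | Y | Y | Optional precision argument |
| $trunc | Y | Y | Y | Optional precision argument |
| $pow | Y | P | Y | |
| $sqrt | Y | P | Y | |
| $log | Y | P | Y | |
| $ln | Y | P | Y | |
| $log10 | Y | P | Y | |
| $exp | Y | P | Y | |
| $rand | Y | N | RT | Random number generation |
| $sampleRate | Y | N | RT | Probabilistic document sampling |
| $sin | Y | P | Y | |
| $cos | Y | P | Y | |
| $tan | Y | P | Y | |
| $asin | Y | P | Y | |
| $acos | Y | P | Y | |
| $atan | Y | P | Y | |
| $atan2 | Y | P | Y | |
| $sinh | Y | P | Y | |
| $cosh | Y | P | Y | |
| $tanh | Y | P | Y | |
| $asinh | Y | P | Y | |
| $acosh | Y | P | Y | |
| $atanh | Y | P | Y | |
| $degreesToRadians | Y | P | Y | |
| $radiansToDegrees | Y | P | Y | |
| $sigmoid | Y | N | RT | Thermocline extension |
| $bitAnd | Y | N | RT | |
| $bitOr | Y | N | RT | |
| $bitXor | Y | N | RT | |
| $bitNot | Y | N | RT | |
| $binarySize | Y | N | RT | |
| $bsonSize | Y | N | RT | |
| $subtype | Y | N | RT | Binary subtype extraction |
| $tsIncrement | Y | N | RT | Timestamp increment extraction |
| $tsSecond | Y | N | RT | Timestamp seconds extraction |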

5.2 String (22 tokens)

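| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $concat | Y | Y | Y | |
| $toUpper | Y | Y | Y | |
| $toLower | Y | Y | Y | |
| $substr | Y | Y | Y | Alias for $substrCP |
| $substrCP | Y | Y | Y | Code-point based substring |
| $substrBytes | Y | P | Y | Byte-based substring |
| $trim | Y | P | Y | |
| $ltrim | Y | P | Y | |
| $rtrim | Y | P | Y | |
| $split | Y | P | Y | |
| $strLenCP | Y | P | Y | |
| $strLenBytes | Y | P | Y | |
| $indexOfCP | Y | P | Y | |
| $indexOfBytes | Y | P | Y | |
| $strcasecmp | Y | P | Y | |
| $regexMatch | Y | P | Y | |
| $regexFind | Y | P | Y | |
| $regexFindAll | Y | P | Y | |
| $replaceOne | Y | P | Y | |
| $replaceAll | Y | P | Y | |
| $encStrContains | Y | N | RT | Encrypted string operations |
| $encStrEndsWith | Y | N | RT | Encrypted string operations |
| $encStrStartsWith | Y | N | RT | Encrypted string operations |
| $encStrNormalizedEq | Y | N | RT | Encrypted string operations |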

5.3 Date (17 tokens)

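| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $year | Y | Y | Y | |
| $month | Y | Y | Y | |
| $dayOfMonth | Y | Y | Y | |
| $dayOfWeek | Y | Y | Y | |
| $dayOfYear | Y | Y | Y | |
| $hour | Y | Y | Y | |
| $minute | Y | Y | Y | |
| $second | Y | Y | Y | |
| $millisecond | Y | Y | Y | |
| $week | Y | Y | Y | |
| $isoWeek | Y | Y | Y | QE: dedicated UDF |
| $isoWeekYear | Y | Y | Y | QE: dedicated UDF |
| $isoDayOfWeek | Y | Y | Y | |
| $dateToString | Y | P | Y | |
| $dateFromString | Y | P | Y | |
| $dateTrunc | Y | P | Y | |
| $dateAdd | Y | P | Y | |
| $dateSubtract | Y | P | Y | |
| $dateDiff | Y | P | Y | |
| $dateFromParts | Y | P | Y | |
| $dateToParts | Y | P | Y | |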
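
The ISO variants in the table differ from $year and $week near year boundaries. In MongoDB they follow ISO 8601 (weeks start on Monday, and week 1 is the week containing the first Thursday), which matches Python's `isocalendar()`. The sketch below assumes Thermocline follows the same convention; the helper name is hypothetical.

```python
from datetime import datetime, timezone


def iso_parts(dt):
    """Return the values $isoWeekYear, $isoWeek and $isoDayOfWeek would yield under ISO 8601."""
    iso_year, iso_week, iso_weekday = dt.isocalendar()
    return {"isoWeekYear": iso_year, "isoWeek": iso_week, "isoDayOfWeek": iso_weekday}


# Sunday 2021-01-03 still belongs to ISO week 53 of 2020, even though $year is 2021.
sunday = datetime(2021, 1, 3, tzinfo=timezone.utc)
print(sunday.year, iso_parts(sunday))
# 2021 {'isoWeekYear': 2020, 'isoWeek': 53, 'isoDayOfWeek': 7}
```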

5.4 Conditional (3 tokens)

| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $cond | Y | Y | Y | If-then-else |
| $ifNull | Y | Y | Y | |
| $switch | Y | P | Y | |

5.5 Comparison (7 tokens)

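| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $cmp | Y | P | Y | |
| $eq | Y | Y | Y | Expression-level equality |
| $ne | Y | Y | Y | |
| $gt | Y | Y | Y | |
| $gte | Y | Y | Y | |
| $lt | Y | Y | Y | |
| $lte | Y | Y | Y | |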

5.6 Logical (3 tokens)

| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $and | Y | Y | Y | |
| $or | Y | Y | Y | |
| $not | Y | Y | Y | |

5.7 Type (15 tokens)

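| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $type | Y | P | Y | |
| $convert | Y | Y | Y | QE: dedicated ConvertWithErrorUdf |
| $toBool | Y | P | Y | |
| $toInt | Y | P | Y | |
| $toLong | Y | P | Y | |
| $toDouble | Y | P | Y | |
| $toDecimal | Y | P | Y | |
| $toString | Y | P | Y | |
| $toObjectId | Y | Y | Y | QE: dedicated ToObjectIdUdf |
| $toDate | Y | P | Y | |
| $toUUID | Y | N | RT | |
| $toHashedIndexKey | Y | N | RT | |
| $serializeEJSON | Y | N | RT | |
| $deserializeEJSON | Y | N | RT | |
| $isNumber | Y | P | Y | |
| $isArray | Y | P | Y | |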

5.8 Array (25 tokens)

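| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $arrayElemAt | Y | Y | Y | QE: dedicated array UDF |
| $concatArrays | Y | P | Y | |
| $filter | Y | P | Y | Variable binding ($$this, as) |
| $in | Y | P | Y | |
| $size | Y | P | Y | |
| $slice | Y | P | Y | |
| $map | Y | P | Y | Variable binding |
| $reduce | Y | N | RT | |
| $zip | Y | N | RT | |
| $first | Y | P | Y | |
| $last | Y | P | Y | |
| $firstN | Y | N | RT | |
| $lastN | Y | N | RT | |
| $maxN | Y | N | RT | |
| $minN | Y | N | RT | |
| $reverseArray | Y | N | RT | |
| $range | Y | N | RT | |
| $indexOfArray | Y | N | RT | |
| $sortArray | Y | N | RT | |
| $setEquals | Y | Y | Y | QE: dedicated set UDF |
| $setIntersection | Y | Y | Y | QE: dedicated set UDF |
| $setUnion | Y | Y | Y | QE: dedicated set UDF |
| $setDifference | Y | Y | Y | QE: dedicated set UDF |
| $setIsSubset | Y | Y | Y | QE: dedicated set UDF |
| $anyElementTrue | Y | Y | Y | QE: dedicated set UDF |
| $allElementsTrue | Y | Y | Y | QE: dedicated set UDF |
| $arrayToObject | Y | P | Y | |
| $objectToArray | Y | P | Y | |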

5.9 Object (5 tokens)

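| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $mergeObjects | Y | Y | Y | QE: dedicated object UDF |
| $objectToArray | Y | P | Y | |
| $arrayToObject | Y | P | Y | |
| $getField | Y | Y | Y | QE: dedicated object UDF |
| $setField | Y | N | RT | |
| $unsetField | Y | N | RT | |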

5.10 Variable Binding and Custom Code (3 tokens)

| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $let | Y | N | RT | Variable binding for sub-expressions |
| $literal | Y | Y | Y | Pass-through literal value |
| $function | P | N | RT | Custom JS: evaluates first arg only (no server-side JS engine) |
| $accumulator | P | N | RT | Custom accumulator: falls back to $sum with input field (no server-side JS) |

5.11 Window/Positional (8 tokens)

| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $rank | Y | N | RT | Window function in $setWindowFields |
| $denseRank | Y | N | RT | Window function |
| $documentNumber | Y | N | RT | Window function |
| $shift | Y | N | RT | Window function with offset |
| $expMovingAvg | Y | N | RT | Window function; passthrough in expression context |
| $derivative | P | N | RT | Passthrough in expression context |
| $integral | P | N | RT | Passthrough in expression context |
| $linearFill | P | N | RT | Passthrough in expression context |
| $locf | P | N | RT | Passthrough in expression context |
| $minMaxScaler | Y | N | RT | Thermocline extension |

5.12 Metadata (1 token)

| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $meta | Y | P | Y | textScore, indexKey, recordId, vectorSearchScore |

5.13 Accumulator Expressions in Expression Context (8 tokens)

These work on arrays when used outside $group:

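| Expression | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $sum | Y | Y | Y | |
| $avg | Y | Y | Y | |
| $min | Y | Y | Y | |
| $max | Y | Y | Y | |
| $push | Y | N | RT | |
| $addToSet | Y | N | RT | |
| $stdDevPop | Y | N | RT | |
| $stdDevSamp | Y | N | RT | |
| $covariancePop | Y | N | RT | |
| $covarianceSamp | Y | N | RT | |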

Coverage: 182/182 reference expression tokens handled (all implemented on SE; QE coverage via DataFusion UDFs + post-filter routing)


6. Accumulators ($group context) (25 tokens)

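| Accumulator | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| $sum | Y | Y | Y | Count mode with {$sum: 1} |
| $avg | Y | Y | Y | |
| $min | Y | Y | Y | |
| $max | Y | Y | Y | |
| $first | Y | P | Y | |
| $last | Y | P | Y | |
| $push | Y | N | RT | |
| $addToSet | Y | N | RT | |
| $concatArrays | Y | N | RT | Concatenate arrays across group |
| $setUnion | Y | N | RT | Union arrays across group |
| $count | Y | Y | Y | Shorthand for {$sum: 1} |
| $stdDevPop | Y | N | RT | Population standard deviation (Welford's) |
| $stdDevSamp | Y | N | RT | Sample standard deviation (Bessel's correction) |
| $top | Y | N | RT | Top document by sort order |
| $topN | Y | N | RT | Top N documents by sort order |
| $bottom | Y | N | RT | Bottom document by sort order |
| $bottomN | Y | N | RT | Bottom N documents by sort order |
| $firstN | Y | N | RT | First N values in group |
| $lastN | Y | N | RT | Last N values in group |
| $maxN | Y | N | RT | N largest values |
| $minN | Y | N | RT | N smallest values |
| $mergeObjects | Y | N | RT | Merge documents in group |
| $median | Y | N | RT | Median of numeric values |
| $percentile | Y | N | RT | Percentile(s) of numeric values |
| $accumulator | P | N | RT | Custom JS accumulator; falls back to $sum with input (no JS engine) |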

Coverage: 25/25 reference accumulator tokens handled
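
The notes for $stdDevPop and $stdDevSamp mention Welford's algorithm and Bessel's correction. The sketch below shows how a single-pass accumulator of that kind typically works; it is a generic textbook version with hypothetical names, not the SE's actual code.

```python
import math


class RunningStdDev:
    """Single-pass standard deviation using Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std_dev_pop(self):
        # Population form divides by n.
        return math.sqrt(self.m2 / self.n) if self.n > 0 else None

    def std_dev_samp(self):
        # Bessel's correction divides by n - 1, so it needs at least two values.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else None


acc = RunningStdDev()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(value)
print(acc.std_dev_pop())  # 2.0
```

The single-pass form avoids storing the group's values and is numerically steadier than accumulating a sum of squares.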


7. Cross-Cutting Features

7.1 Collation

| Feature | Hot (SE) | Cold (QE) | Federated | Notes |
| --- | --- | --- | --- | --- |
| Collation-aware queries | Y | N | RT | QE rejects collation; coordinator routes collated queries through SE (E5-S4) |

7.2 ACID Transactions

| Feature | Status | Notes |
| --- | --- | --- |
| Multi-document transactions | Y | MVCC snapshot isolation |
| readConcern levels | Y | local, majority, snapshot, linearizable |
| writeConcern levels | Y | w:1, w:majority |
| readPreference | Y | primary, secondary, secondaryPreferred, nearest |
| atClusterTime (time travel) | Y | MVCC versioning |

7.3 Change Streams

| Feature | Status | Notes |
| --- | --- | --- |
| Collection-level change streams | Y | WAL-based, guaranteed order |
| Database-level change streams | Y | |
| Resume tokens | Y | postBatchResumeToken in $clusterTime response |
| fullDocument: "updateLookup" | Y | |

7.4 Vector Search

| Feature | Status | Notes |
| --- | --- | --- |
| $vectorSearch stage | Y | Hot: HNSW index; Cold: brute-force scan |
| Hybrid search (pre/post filter) | Y | Auto-strategy based on selectivity |
| Score normalization | Y | $meta: "vectorSearchScore" |
| SQ8 quantization | Y | 4x memory reduction with re-ranking |
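
The "4x memory reduction" figure for SQ8 follows from storing each vector component as one byte instead of a 4-byte float32. The sketch below shows a generic 8-bit scalar quantizer with a fixed value range. The function names and the min/max scheme are assumptions for illustration, not Thermocline's actual encoding. Re-ranking, as the table describes it, would re-score the best quantized candidates against the original full-precision vectors.

```python
import struct


def quantize_sq8(vector, lo, hi):
    """Map each component from [lo, hi] onto an integer code in 0..255."""
    scale = (hi - lo) / 255 or 1.0  # avoid dividing by zero when hi == lo
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vector)


def dequantize_sq8(codes, lo, hi):
    """Recover approximate component values from the 8-bit codes."""
    scale = (hi - lo) / 255
    return [lo + code * scale for code in codes]


vector = [-1.0, -0.25, 0.5, 1.0]
codes = quantize_sq8(vector, -1.0, 1.0)
float32_bytes = len(struct.pack(f"{len(vector)}f", *vector))  # 4 bytes per component
print(float32_bytes, len(codes))  # 16 4
```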

7.5 Geospatial

| Feature | Hot (SE) | Cold (QE) | Notes |
| --- | --- | --- | --- |
| 2dsphere index | Y | N | |
| $geoWithin | Y | N | GeoJSON and legacy coordinate pairs |
| $geoIntersects | Y | N | GeoJSON geometries |
| $near / $nearSphere | Y | N | Haversine distance |
| $geoNear stage | Y | N | Distance computation with sort |
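
The Haversine distance named for $near / $nearSphere is the great-circle distance between two points on a sphere. The sketch below is the standard formula; the Earth radius constant and the function name are assumptions, and the arguments follow the GeoJSON order of longitude before latitude.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# A quarter of the way around the equator is a quarter of the circumference.
print(math.isclose(haversine_m(0, 0, 90, 0), EARTH_RADIUS_M * math.pi / 2))  # True
```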

7.6 Text Search

| Feature | Hot (SE) | Cold (QE) | Notes |
| --- | --- | --- | --- |
| Text index | Y | Y | QE has dedicated text index module |
| $text query operator | Y | Y | With language, case/diacritic sensitivity |
| $meta: "textScore" | Y | Y | Relevance scoring |

8. Known Limitations and Out-of-Scope Items

8.1 Atlas Search Features (Not Applicable)

The following are Atlas Search / Atlas-specific features that are not part of the MongoDB server wire protocol. They are parsed as passthrough stages and return no-op results:

  • $search, $searchMeta -- Atlas Search full-text
  • $listSearchIndexes -- Atlas Search index management
  • $rankFusion, $score, $scoreFusion -- Atlas Search scoring
  • $listClusterCatalog, $listSampledQueries, $querySettings, $queryStats -- Atlas diagnostics
  • $shardedDataDistribution -- Sharded cluster diagnostics

8.2 Custom JavaScript Execution

| Feature | Status | Notes |
| --- | --- | --- |
| $function expression | Partial | Evaluates first argument; no server-side JS engine |
| $accumulator | Partial | Falls back to $sum with input field; no server-side JS |
| $where query operator | Y | Simple expression evaluation via js_eval module |

8.3 Cold-Tier Routing Strategy

When a query contains operators not pushable to the Parquet/DataFusion cold tier, the coordinator employs one of these strategies:

  1. Post-filter materialization: QE returns candidate rows; coordinator applies SE-level BSON filter
  2. SE routing: Query is routed entirely through the storage engine, which reads only hot data
  3. Federated merge: Hot results from SE + cold results from QE are merged; post-filter applied

Operators that trigger post-filter materialization on cold tier:

  • $nin, $regex (complex patterns), $all, $size, $mod
  • $elemMatch with range conditions on LIST columns
  • $type without schema information
  • Bitwise operators ($bitsAllSet, $bitsAllClear, $bitsAnySet, $bitsAnyClear)

Operators that trigger SE-only routing:

  • $geoWithin, $geoIntersects, $near, $nearSphere
  • $expr, $where, $jsonSchema
  • Collation-aware queries
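
The two operator lists above can be read as a simple classification rule. The sketch below is a conservative model of it: the strategy labels and function names are illustrative, and it treats every listed post-filter operator as non-pushable even though the document notes that simple $regex patterns, equality-only $elemMatch, and $type with schema information can be pushed down.

```python
POST_FILTER_OPS = {
    "$nin", "$regex", "$all", "$size", "$mod", "$elemMatch", "$type",
    "$bitsAllSet", "$bitsAllClear", "$bitsAnySet", "$bitsAnyClear",
}
SE_ONLY_OPS = {
    "$geoWithin", "$geoIntersects", "$near", "$nearSphere",
    "$expr", "$where", "$jsonSchema",
}


def operators_in(filter_doc):
    """Yield every $-prefixed key found anywhere in a filter document."""
    if isinstance(filter_doc, dict):
        for key, value in filter_doc.items():
            if key.startswith("$"):
                yield key
            yield from operators_in(value)
    elif isinstance(filter_doc, list):
        for item in filter_doc:
            yield from operators_in(item)


def cold_tier_strategy(filter_doc, collation=None):
    """Pick a label for how a filter would be handled against cold data."""
    ops = set(operators_in(filter_doc))
    if collation is not None or ops & SE_ONLY_OPS:
        return "se_routing"
    if ops & POST_FILTER_OPS:
        return "post_filter"
    return "pushdown"


print(cold_tier_strategy({"status": "paid"}))                # pushdown
print(cold_tier_strategy({"qty": {"$nin": [1, 2]}}))         # post_filter
print(cold_tier_strategy({"$expr": {"$gt": ["$a", "$b"]}}))  # se_routing
```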

8.4 Fail-Closed Semantics (E4-S1)

Unknown or unrecognized query operators are handled with fail-closed semantics:

  • SE filter evaluator returns FilterOperator::Invalid (matches nothing)
  • QE scan abandons SQL pushdown and returns all rows for post-filtering
  • This ensures no silent data loss from unsupported operators
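
To illustrate the fail-closed behavior described above, the sketch below models a tiny filter evaluator in which an unrecognized operator matches nothing, rather than being ignored and matching everything. It is a simplified Python stand-in with hypothetical names, not the SE's Rust evaluator, and it supports only a few comparison operators.

```python
import operator

KNOWN_OPS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
}


def field_matches(value, condition):
    """Evaluate one field condition; unknown operators fail closed."""
    if not isinstance(condition, dict):
        return value == condition  # implicit equality
    for op, operand in condition.items():
        compare = KNOWN_OPS.get(op)
        if compare is None:
            return False  # fail closed: an unknown operator matches nothing
        try:
            if not compare(value, operand):
                return False
        except TypeError:
            return False  # incomparable types (e.g. missing field) do not match
    return True


def find(docs, filter_doc):
    """Return the documents for which every field condition holds."""
    return [
        doc for doc in docs
        if all(field_matches(doc.get(field), cond) for field, cond in filter_doc.items())
    ]


docs = [{"qty": 10}, {"qty": 3}]
print(find(docs, {"qty": {"$gt": 5}}))   # [{'qty': 10}]
print(find(docs, {"qty": {"$gtt": 5}}))  # [] (misspelled operator matches nothing)
```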

9. Summary Statistics

| Category | Total Tokens | SE (Hot) | QE (Cold) | Federated |
| --- | --- | --- | --- | --- |
| Query Operators | 31 | 31 Y | 15 Y, 12 P, 4 N | 31 (RT for N) |
| Update Operators | 23 | 21 Y, 2 P | -- | -- |
| Projection Operators | 3 | 3 Y | -- | 3 RT |
| Aggregation Stages | 51 | 30 Y, 8 P | 10 Y/P | 51 (RT for N) |
| Aggregation Expressions | 182 | 178 Y, 4 P | ~60 Y/P | 182 (RT for N) |
| Accumulators | 25 | 24 Y, 1 P | 6 Y/P | 25 (RT for N) |
| Total | 308 (+7 ext.) | 287 Y, 15 P | | All handled |

All 308 reference MQL tokens are handled by at least one tier. Operators not natively supported on the cold tier are routed through the hot tier or handled via post-filter materialization in the federated path.