[fix](search) Add searcher cache reuse and DSL result cache for search() function#60790
Conversation
…ch() function Phase 1: Make search() reuse InvertedIndexSearcherCache instead of directly opening Lucene IndexReader per field. This avoids redundant file I/O by caching IndexSearcher handles and extracting IndexReader from the cached searcher via getReader(). Phase 2: Add SearchFunctionQueryCache - an LRU cache that stores the final roaring::Roaring bitmap result per (segment, DSL) pair. Repeated identical DSL queries against the same segment skip Lucene execution entirely. The cache is controlled by the enable_search_function_query_cache session variable (default: true) and search_function_query_cache_limit config (10%). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SearchFunctionQueryCache UT: insert/lookup, cache miss, key encoding, null bitmap handling, key independence, overwrite behavior - FunctionSearch UT: build_dsl_signature canonical ordering, empty input, missing index properties, searcher cache handles lifetime - Regression test: cache consistency (same query same result), cache disabled vs enabled equivalence, multi-field query cache, complex boolean query cache vs uncached - Fix null pointer in SearchFunctionQueryCache::insert when null_bitmap is nullptr Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…elds The DSL result cache key was built using only original_dsl and field bindings, missing default_operator and minimum_should_match. This caused false cache hits when queries shared the same DSL text but had different options, returning incorrect results. Replace manual string concatenation with Thrift binary serialization of the entire TSearchParam, so all fields (DSL, AST root, field bindings, default_operator, minimum_should_match) are automatically included. This prevents future cache key bugs when new fields are added. Also add "p0" group tag to all 21 search regression test suites.
…nd comment fixes Address code review findings for the search function DSL result cache: - Add null check for result_bitmap before dereferencing on cache hit - Add LOG(WARNING) when Thrift serialization falls back to original_dsl - Add VLOG_DEBUG when extract_segment_prefix fails to find a reader - Add unit tests for deep-copy mutation isolation (lookup and insert sides) - Add realistic delimiter-based key collision test for length-prefix encoding - Fix misleading comments: config memory description, dsl_signature field, get_value<T>() type safety caveat, encode() method, class-level docs, and regression test sleep heuristic
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29372 ms |
TPC-DS: Total hot run time: 183898 ms |
…yCache and remove Thrift fallback Based on Occam's Razor analysis of PR apache#60790: 1. Reuse InvertedIndexQueryCache instead of separate SearchFunctionQueryCache: - Add SEARCH_DSL_QUERY type to InvertedIndexQueryType enum - Use CacheKey{segment_prefix, "__search_dsl__", SEARCH_DSL_QUERY, thrift_sig} - Cache post-mask_out_null bitmap (eliminates deep-copy complexity) - Reuse enable_inverted_index_query_cache session variable 2. Remove Thrift serialization fallback: - build_dsl_signature returns "" on failure instead of falling back to original_dsl - Prevents cache collisions from degraded signatures Removed: - SearchFunctionQueryCache class (inverted_index_cache.h/cpp) - SEARCH_FUNCTION_QUERY_CACHE cache type (cache_policy.h) - search_function_query_cache_limit config (config.h/cpp) - enable_search_function_query_cache session variable (SessionVariable.java) - enable_search_function_query_cache thrift field (PaloInternalService.thrift) - ExecEnv search function query cache init/destroy (exec_env.h, exec_env_init.cpp) - build_dsl_signature public API (now file-static in function_search.cpp) Net: -281 lines, same functionality.
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29011 ms |
TPC-DS: Total hot run time: 184357 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
…tions The DSL result cache stores bitmaps with null _null_bitmap (since cached results are already post-mask_out_null). When these cached bitmaps are copied via set_index_result_for_expr (which takes by value), the copy constructor dereferences the null shared_ptr, causing a segfault. Add null checks to both the copy constructor and copy assignment operator, consistent with the null guards already present in operator&=, |=, -=.
|
run buildall |
TPC-H: Total hot run time: 29146 ms |
TPC-DS: Total hot run time: 183931 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
…und operations The DSL result cache stored only the data bitmap (post mask_out_null) and discarded the null bitmap. When cached results were used by VCompoundPred for NOT/OR/AND operations, the missing null bitmap caused bitmap operators to either silently do nothing or produce incorrect results (e.g., NOT queries including NULL rows). Fix: cache both data and null bitmaps, retrieve both on cache hit. Also make InvertedIndexResultBitmap operators null-safe so they handle nullptr _null_bitmap gracefully instead of silently skipping.
|
run buildall |
TPC-H: Total hot run time: 28803 ms |
TPC-DS: Total hot run time: 183883 ms |
The collect_all_field_nulls method was removed from FunctionSearch but two tests still referenced it, causing compilation failure. Remove the dead code (TrackingIndexIterator class and collect_all_field_nulls calls) while keeping the valid InvertedIndexResultBitmap null masking tests.
|
run buildall |
TPC-H: Total hot run time: 29034 ms |
TPC-DS: Total hot run time: 183602 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
…h() function (apache#60790) ### What problem does this PR solve? Problem Summary: This PR adds searcher cache reuse and a DSL result cache for the `search()` function to improve query performance on repeated search queries against the same segments. **Key changes:** 1. **DSL result cache**: Caches the final roaring bitmap per (segment, DSL) pair so repeated identical `search()` queries skip Lucene execution entirely. Uses length-prefix key encoding to avoid hash collisions. 2. **Deep-copy bitmap semantics**: Bitmaps are deep-copied on both cache read and write to prevent `mask_out_null()` from polluting cached entries. 3. **Type-safe cache accessor**: Replaces raw `void*` return with a template `get_value<T>()` that uses `static_assert` to ensure T derives from `LRUCacheValueBase`. 4. **Session-level cache toggle**: Adds `enable_search_function_query_cache` session variable (default: true) to allow disabling the cache per query via `SET_VAR`. 5. **Const-correctness fix**: Removes unsafe `const_cast` in `build_dsl_signature` by copying TSearchParam before Thrift serialization. 6. **Defensive improvements**: Adds null check for `result_bitmap` on cache hit, logging for serialization fallback and cache bypass paths. ### Release note Add DSL result cache for search() function to skip repeated Lucene execution on identical queries.
…o branch-4.0 Squashed backport of the following master PRs: - apache#59747 [fix](search) Make AND/OR/NOT operators case-sensitive in search DSL - apache#60654 [refactor](search) Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues - apache#60782 [fix](search) Upgrade query type for variant subcolumns with analyzer-based indexes - apache#60784 [fix](search) Fix MATCH_ALL_DOCS query failing in multi-field search mode - apache#60786 [feat](search) Support field-grouped query syntax field:(term1 OR term2) - apache#60790 [fix](search) Add searcher cache reuse and DSL result cache for search() function - apache#60793 [fix](search) Fix wildcard query on variant subcolumns returning empty results - apache#60798 [fix](search) Use FE-provided analyzer key for multi-index columns in search() - apache#60814 [fix](search) Fix implicit conjunction incorrectly modifying preceding term in lucene mode - apache#60834 [test](search) Add regression test for wildcard query on variant subcolumns with multi-index - apache#60873 [fix](search) fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion - apache#60891 [fix](search) inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode
…o branch-4.0 Squashed backport of the following master PRs: - apache#59747 [fix](search) Make AND/OR/NOT operators case-sensitive in search DSL - apache#60654 [refactor](search) Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues - apache#60782 [fix](search) Upgrade query type for variant subcolumns with analyzer-based indexes - apache#60784 [fix](search) Fix MATCH_ALL_DOCS query failing in multi-field search mode - apache#60786 [feat](search) Support field-grouped query syntax field:(term1 OR term2) - apache#60790 [fix](search) Add searcher cache reuse and DSL result cache for search() function - apache#60793 [fix](search) Fix wildcard query on variant subcolumns returning empty results - apache#60798 [fix](search) Use FE-provided analyzer key for multi-index columns in search() - apache#60814 [fix](search) Fix implicit conjunction incorrectly modifying preceding term in lucene mode - apache#60834 [test](search) Add regression test for wildcard query on variant subcolumns with multi-index - apache#60873 [fix](search) fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion - apache#60891 [fix](search) inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode
… bug fixes (#61028) ### What problem does this PR solve? Squashed backport of all search() function improvements and bug fixes from master to branch-4.0. This PR combines the following master PRs into a single backport: | Master PR | Type | Description | |-----------|------|-------------| | #59747 | fix | Make AND/OR/NOT operators case-sensitive in search DSL | | #60654 | refactor | Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues | | #60782 | fix | Upgrade query type for variant subcolumns with analyzer-based indexes | | #60784 | fix | Fix MATCH_ALL_DOCS query failing in multi-field search mode | | #60786 | feat | Support field-grouped query syntax field:(term1 OR term2) | | #60790 | fix | Add searcher cache reuse and DSL result cache for search() function | | #60793 | fix | Fix wildcard query on variant subcolumns returning empty results | | #60798 | fix | Use FE-provided analyzer key for multi-index columns in search() | | #60814 | fix | Fix implicit conjunction incorrectly modifying preceding term in lucene mode | | #60834 | test | Add regression test for wildcard query on variant subcolumns with multi-index | | #60873 | fix | fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion | | #60891 | fix | inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode | ### Release note Backport search() function improvements including DSL parser refactoring, multi-field search fixes, variant subcolumn support, query caching, and field-grouped query syntax. ### Check List (For Author) - Test - [x] Regression test - [x] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [x] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason - Behavior changed: - [ ] No. - [x] Yes. New search() function features and bug fixes backported from master. - Does this need documentation? - [x] No. - [ ] Yes. ### Check List (For Reviewer who merge this PR) - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label
What problem does this PR solve?
Problem Summary:
This PR adds searcher cache reuse and a DSL result cache for the
search()function to improve query performance on repeated search queries against the same segments.Key changes:
DSL result cache: Caches the final roaring bitmap per (segment, DSL) pair so repeated identical
search()queries skip Lucene execution entirely. Uses length-prefix key encoding to avoid hash collisions.Deep-copy bitmap semantics: Bitmaps are deep-copied on both cache read and write to prevent
mask_out_null()from polluting cached entries.Type-safe cache accessor: Replaces raw
void*return with a templateget_value<T>()that usesstatic_assertto ensure T derives fromLRUCacheValueBase.Session-level cache toggle: Adds
enable_search_function_query_cachesession variable (default: true) to allow disabling the cache per query viaSET_VAR.Const-correctness fix: Removes unsafe
const_castinbuild_dsl_signatureby copying TSearchParam before Thrift serialization.Defensive improvements: Adds null check for
result_bitmapon cache hit, logging for serialization fallback and cache bypass paths.Release note
Add DSL result cache for search() function to skip repeated Lucene execution on identical queries.
Check List (For Author)
Test
Behavior changed:
search_function_query_cache_limit, default 5% of mem_limit) for search() DSL results. Can be disabled per-query withSET enable_search_function_query_cache = false.Does this need documentation?
enable_search_function_query_cacheand BE configsearch_function_query_cache_limit.Check List (For Reviewer who merge this PR)