Test sampling rate increase cap#6412

Merged
raphaelgavache merged 5 commits into main from raphael/sampling_cap_increase
Mar 10, 2026

@raphaelgavache
Member

@raphaelgavache raphaelgavache commented Mar 3, 2026

System test for the changes described in the RFC.

Normally the agent responds with rate_by_service containing the sampling rates it wants the tracer to use. A mock overrides that response with a controlled sequence, first returning 0.1, then switching to 1.0, so the test can verify that the tracer caps the increase gradually rather than jumping straight to the new rate.
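
The controlled sequence can be pictured with a minimal sketch. `MockedRateSequence`, `low_responses`, and the `"service:,env:"` key are illustrative assumptions, not the code in utils/proxy/mocked_response.py; only the `rate_by_service` field and the 0.1 then 1.0 sequence come from the description above.

```python
import json

LOW_RATE = 0.1   # first rate served by the mock (from the PR description)
HIGH_RATE = 1.0  # rate served after the switch (from the PR description)


class MockedRateSequence:
    """Hypothetical stand-in for the mocked agent response."""

    def __init__(self, low_responses: int) -> None:
        # Number of responses that still carry LOW_RATE before switching.
        self._remaining_low = low_responses

    def next_body(self) -> bytes:
        """Return the JSON body for the next mocked agent response."""
        if self._remaining_low > 0:
            self._remaining_low -= 1
            rate = LOW_RATE
        else:
            rate = HIGH_RATE
        # Assumption: a single default entry keyed "service:,env:".
        return json.dumps({"rate_by_service": {"service:,env:": rate}}).encode()
```

With `low_responses=2`, the first two bodies carry 0.1 and every later body carries 1.0, which is the step change the tracer is expected to smooth out.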

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner.
    • Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed and the CI green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified? I have the approval from R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

@raphaelgavache raphaelgavache changed the title first commit Test sampling rate increase cap Mar 3, 2026
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

CODEOWNERS have been resolved as:

tests/test_sampling_rate_capping.py                                     @DataDog/system-tests-core
manifests/cpp_httpd.yml                                                 @DataDog/dd-trace-cpp
manifests/cpp_kong.yml                                                  @DataDog/system-tests-core
manifests/cpp_nginx.yml                                                 @DataDog/dd-trace-cpp
manifests/dotnet.yml                                                    @DataDog/apm-dotnet @DataDog/asm-dotnet
manifests/golang.yml                                                    @DataDog/dd-trace-go-guild
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java
manifests/nodejs.yml                                                    @DataDog/dd-trace-js
manifests/php.yml                                                       @DataDog/apm-php @DataDog/asm-php
manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
manifests/ruby.yml                                                      @DataDog/ruby-guild @DataDog/asm-ruby
utils/_context/_scenarios/__init__.py                                   @DataDog/system-tests-core
utils/proxy/mocked_response.py                                          @DataDog/system-tests-core

@datadog-official

datadog-official bot commented Mar 3, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 36010c2

@raphaelgavache raphaelgavache marked this pull request as ready for review March 5, 2026 17:44
@raphaelgavache raphaelgavache requested review from a team as code owners March 5, 2026 17:44

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8969a6cdb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


logger.info(f"Observed _dd.agent_psr values (ramp phase): {sorted(agent_psr_values)}")

assert any(abs(v - LOW_RATE) < 0.01 for v in agent_psr_values), (

P1: Avoid requiring low rate in post-cutoff span set

This assertion can fail for a correctly implemented tracer because _spans_before_ramp is captured immediately after the first observed low-rate span, so the post-cutoff window may legitimately contain only ramped values (e.g., if no additional low-rate requests are emitted after the cutoff). In that case CI reports a regression even though capped increase behavior is correct, making the new test flaky across tracer flush timings.
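
One way to avoid depending on a low-rate span after the cutoff is to assert only that some intermediate rate was observed. This is a sketch under assumptions, not the check that was merged: `saw_gradual_ramp` and `tolerance` are hypothetical names, while `LOW_RATE` and `agent_psr_values` mirror the identifiers in the snippet above.

```python
LOW_RATE = 0.1   # rate served before the switch
HIGH_RATE = 1.0  # rate served after the switch


def saw_gradual_ramp(agent_psr_values, tolerance=0.01):
    """Return True if any observed rate lies strictly between the two mocked rates.

    An intermediate value shows the tracer ramped up instead of jumping,
    without requiring that a LOW_RATE span lands in the post-cutoff window.
    """
    return any(
        LOW_RATE + tolerance < v < HIGH_RATE - tolerance
        for v in agent_psr_values
    )
```

A set containing only 0.1 and 1.0 fails this check, while a set containing only ramped values such as 0.2 and 0.4 passes it.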

Member

@brettlangdon brettlangdon left a comment

manifests/python.yml change lgtm

@raphaelgavache raphaelgavache merged commit 43b42ad into main Mar 10, 2026
2057 checks passed
@raphaelgavache raphaelgavache deleted the raphael/sampling_cap_increase branch March 10, 2026 13:00
gh-worker-dd-mergequeue-cf854d bot pushed a commit to DataDog/dd-trace-go that referenced this pull request Mar 12, 2026
When the trace-agent is restarted, it initially provides a rate of 100%, dramatically increasing the number of traces sampled. A rate can suddenly go from 0.1% to 100% and back to 0.1% once the trace-agent eventually computes the new sampling rate.

In particular it is observed that when the agent restarts, the payload buffering that waits for new container tags breaches its memory limit and we send spans without container tags.

This PR limits sampling rate increases to x2 every 1s, so a x10 increase completes in 3-4s:
- 1% -> 100% takes 7s
- 0.1% -> 100% takes 10s
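
The timings above follow from repeated doubling, as this small sketch shows. It is an illustration only, not the dd-trace-go code: `capped_rate` and `seconds_to_reach` are hypothetical helpers, and applying decreases immediately is an assumption (the text only mentions a limit on increases).

```python
def capped_rate(current: float, target: float, max_factor: float = 2.0) -> float:
    """Return the rate to apply for this interval, capping increases at max_factor."""
    if target <= current:
        # Assumption: decreases are not limited and apply right away.
        return target
    return min(target, current * max_factor)


def seconds_to_reach(start: float, target: float, interval_s: float = 1.0) -> float:
    """Count how long a capped ramp from start to target takes."""
    rate = start
    steps = 0
    while rate < target:
        rate = capped_rate(rate, target)
        steps += 1
    return steps * interval_s


print(seconds_to_reach(0.01, 1.0))   # 1% -> 100%
print(seconds_to_reach(0.001, 1.0))  # 0.1% -> 100%
```

Starting from 1% the ramp needs 7 doublings and from 0.1% it needs 10, matching the 7s and 10s figures quoted above.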

Matching system-test: DataDog/system-tests#6412

Below is a screenshot of the before/after of the [dd-trace-go implementation](#4488), with go_span_new using the PR tracer and go_spam_old using the latest release of dd-trace-go, both applications generating 500 traces/s. Notice how the new code does not burst in throughput.
<img width="671" height="242" alt="Screenshot 2026-03-06 at 14 46 31" src="https://github.com/user-attachments/assets/df1496dd-1213-40d3-9d12-ad886052fc47" />

### Motivation

<!--
* What inspired you to submit this pull request?
* Link any related GitHub issues or PRs here.
* If this resolves a GitHub issue, include "Fixes #XXXX" to link the issue and auto-close it on merge.
-->

### Reviewer's Checklist
<!--
* Authors can use this list as a reference to ensure that there are no problems
  during the review but the signing off is to be done by the reviewer(s).
-->

- [ ] Changed code has unit tests for its functionality at or near 100% coverage.
- [ ] [System-Tests](https://github.com/DataDog/system-tests/) covering this feature have been added and enabled with the va.b.c-dev version tag.
- [ ] There is a benchmark for any new code, or changes to existing code.
- [ ] If this interacts with the agent in a new way, a system test has been added.
- [ ] New code is free of linting errors. You can check this by running `make lint` locally.
- [ ] New code doesn't break existing tests. You can check this by running `make test` locally.
- [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
- [ ] All generated files are up to date. You can check this by running `make generate` locally.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to date by running `make fix-modules` locally.

Unsure? Have a question? Request a review!


Co-authored-by: kemal.akkoyun <kemal.akkoyun@datadoghq.com>
4 participants