Conversation
✅ Tests: 🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 36010c2
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8969a6cdb
Review comment on the new system test:

```python
logger.info(f"Observed _dd.agent_psr values (ramp phase): {sorted(agent_psr_values)}")
...
assert any(abs(v - LOW_RATE) < 0.01 for v in agent_psr_values), (
```
Avoid requiring low rate in post-cutoff span set
This assertion can fail for a correctly implemented tracer because _spans_before_ramp is captured immediately after the first observed low-rate span, so the post-cutoff window may legitimately contain only ramped values (e.g., if no additional low-rate requests are emitted after the cutoff). In that case CI reports a regression even though capped increase behavior is correct, making the new test flaky across tracer flush timings.
**brettlangdon** left a comment:
`manifests/python.yml` change LGTM
When the trace-agent is restarted, it initially reports a sampling rate of 100%, dramatically increasing the number of traces sampled. A rate can jump suddenly from 0.1% to 100% and back to 0.1% once the trace-agent eventually computes the new sampling rate. In particular, it is observed that when the agent restarts, the payload buffering that waits for new container tags breaches its memory limit and we send spans without container tags.

This PR caps sampling rate increases at x2 every 1s, which means:

- a x10 increase completes in 3-4s
- 1% -> 100% takes 7s
- 0.1% -> 100% takes 10s

Matching system-test: DataDog/system-tests#6412

Below is a before/after screenshot of the [dd-trace-go implementation](#4488), with `go_span_new` using the PR tracer and `go_spam_old` using the latest release of dd-trace-go, both applications generating 500 traces/s. Notice how the new code does not burst in throughput.

<img width="671" height="242" alt="Screenshot 2026-03-06 at 14 46 31" src="https://github.com/user-attachments/assets/df1496dd-1213-40d3-9d12-ad886052fc47" />

### Motivation

<!--
* What inspired you to submit this pull request?
* Link any related GitHub issues or PRs here.
* If this resolves a GitHub issue, include "Fixes #XXXX" to link the issue and auto-close it on merge.
-->

### Reviewer's Checklist

<!--
* Authors can use this list as a reference to ensure that there are no problems during the review but the signing off is to be done by the reviewer(s).
-->

- [ ] Changed code has unit tests for its functionality at or near 100% coverage.
- [ ] [System-Tests](https://github.com/DataDog/system-tests/) covering this feature have been added and enabled with the va.b.c-dev version tag.
- [ ] There is a benchmark for any new code, or changes to existing code.
- [ ] If this interacts with the agent in a new way, a system test has been added.
- [ ] New code is free of linting errors. You can check this by running `make lint` locally.
- [ ] New code doesn't break existing tests. You can check this by running `make test` locally.
- [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
- [ ] All generated files are up to date. You can check this by running `make generate` locally.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to date by running `make fix-modules` locally.

Unsure? Have a question? Request a review!

Co-authored-by: kemal.akkoyun <kemal.akkoyun@datadoghq.com>
System test for the following RFC changes.
Normally the agent responds with rate_by_service containing the sampling rates it wants the tracer to use. A mock overrides that response with a controlled sequence:
first returning 0.1, then switching to 1.0, so the test can verify the tracer caps the increase gradually rather than jumping straight to the new rate.
Workflow
🚀 Once your PR is reviewed and CI is green, you can merge it!
🛟 #apm-shared-testing 🛟
Reviewer checklist
- [ ] If `tests/` or `manifests/` is modified: I have the approval from the R&P team
- [ ] The `build-XXX-image` label is present