[AWS-cloudtrail] set host.id field #17827
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
packages/aws/data_stream/cloudtrail/elasticsearch/ingest_pipeline/default.yml
andrewkroh left a comment
Also, the PR body only links to a private security-team issue; there is no commit message explaining what changed or why. That explanation should be the squash commit message and the long-term record in git history. Please fill in the "Proposed commit message" section per the PR template.
The host.id value seems better aligned to ECS now.
Regarding the proposed commit message, those are going to be viewed as plain text by tools like git log so do not use Markdown. Some recent examples: #17508, #17379. So please edit that to avoid markdown, and then be sure to paste it during the squash and merge.
Package aws - 6.4.0 containing this change is available at https://epr.elastic.co/package/aws/6.4.0/
Fix CloudTrail host entity classification to only use EC2 instance IDs. The entity classification logic incorrectly treated AMI, EBS volume, and snapshot IDs as host resources. Only EC2 instance IDs (i-*) are genuine host identifiers. Removed volume, snapshot, and image from hostResourceTypes, keeping only instance. Removed vol-, snap-, and ami- from hostIdPrefixes, keeping only i-. These non-instance resource IDs now fall through to genericTargets and are stored in entity.target.id instead of host.target.entity.id. Added host.id as a new field set to a single string value (the first EC2 instance ID) rather than an array, since host.id should represent one host per document. Updated test expected files accordingly: replaced host blocks with entity.target.id for snapshot and AMI tests, and set host.id as a single string in the five EC2 instance test files.
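The classification rule described in this commit message can be sketched roughly as follows. This is an illustrative Python model only, not the pipeline's actual script. The function name `classify_targets`, the input shape (a list of `id`/`type` dicts), the "type or prefix" matching rule, and the sample instance IDs are all assumptions made for the example. Only the field names and the prefix and type lists come from the description above.

```python
from __future__ import annotations

from typing import Any

# After this change, only EC2 instances count as hosts.
# (Previously: volume, snapshot, image and vol-, snap-, ami- were also listed.)
HOST_RESOURCE_TYPES = {"instance"}
HOST_ID_PREFIXES = ("i-",)


def classify_targets(resources: list[dict[str, str]]) -> dict[str, Any]:
    """Split target resources into host IDs and generic target IDs.

    Hypothetical helper: the "type OR prefix" rule is an assumption,
    as is storing entity.target.id as a list.
    """
    host_ids: list[str] = []
    generic_ids: list[str] = []
    for res in resources:
        rid = res.get("id", "")
        rtype = res.get("type", "")
        if rtype in HOST_RESOURCE_TYPES or rid.startswith(HOST_ID_PREFIXES):
            host_ids.append(rid)
        else:
            # vol-*, snap-*, ami-* now fall through to the generic targets.
            generic_ids.append(rid)

    doc: dict[str, Any] = {}
    if host_ids:
        doc["host"] = {
            # host.id is a single string: the first EC2 instance ID.
            "id": host_ids[0],
            "target": {"entity": {"id": host_ids}},
        }
    if generic_ids:
        doc["entity"] = {"target": {"id": generic_ids}}
    return doc


if __name__ == "__main__":
    # A snapshot ID no longer produces a host block.
    print(classify_targets([{"id": "snap-0a392d80692e2526a", "type": "snapshot"}]))
    # Made-up instance IDs: host.id takes the first one.
    print(classify_targets([
        {"id": "i-0aaa", "type": "instance"},
        {"id": "i-0bbb", "type": "instance"},
    ]))
```

In this model the snapshot ID ends up under `entity.target.id` and no `host` block is produced, while the two-instance input yields a single-string `host.id` alongside the `host.target.entity.id` array.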
Proposed commit message
closes https://github.com/elastic/security-team/issues/16343.
Fix host entity classification in CloudTrail ingest pipeline
Problem

The CloudTrail ingest pipeline's entity classification logic treated all resources matching `hostIdPrefixes` (`i-`, `vol-`, `snap-`, `ami-`) and `hostResourceTypes` (`instance`, `volume`, `snapshot`, `image`) as host entities. This caused AMI IDs, EBS volume IDs, and snapshot IDs to be incorrectly stored in `host.target.entity.id` and `host.id`.

These resource types are not hosts in any meaningful sense; only EC2 instance IDs (`i-*`) represent actual compute hosts. For example, `ModifySnapshotAttribute` events were producing documents with `host.id: ["snap-0a392d80692e2526a"]`, which is semantically wrong.

Changes

- `hostResourceTypes`: Removed `volume`, `snapshot`, and `image`; only `instance` remains.
- `hostIdPrefixes`: Removed `vol-`, `snap-`, `ami-`; only `i-` remains.
- `host.id`: New field added in this PR to surface the target host identifier. Set to a single string value (first EC2 instance ID) rather than an array, since `host.id` represents one host per document.
- Non-instance resource IDs (`vol-*`, `snap-*`, `ami-*`) now fall through to `genericTargets` and are stored in `entity.target.id` instead.

Affected fields

| Resource | Before | After |
|---|---|---|
| EC2 instance (`i-*`) | `host.target.entity.id` (array) | `host.target.entity.id` (array), `host.id` (string) |
| AMI (`ami-*`) | `host.target.entity.id` | `entity.target.id` |
| EBS volume (`vol-*`) | `host.target.entity.id` | `entity.target.id` |
| Snapshot (`snap-*`) | `host.target.entity.id` | `entity.target.id` |

Test updates

- `test-modify-snapshot-attribute-json.log-expected.json`: `host` block replaced with `entity.target.id`
- `test-modify-image-attribute-json.log-expected.json`: `host` block replaced with `entity.target.id`
- Five EC2 instance test files: `host.id` as a single string value

Checklist

- `changelog.yml` file.

How to test this PR locally

    cd packages/aws
    elastic-package test pipeline -v --data-streams cloudtrail --generate