Hello everyone,
This thread is dedicated to sharing the meeting minutes of the LLVM Qualification Working Group. We will use this space to publish summaries and action items from our monthly sync-ups, in order to keep the broader community informed.
Meeting notes are initially drafted collaboratively in a shared FramaPad and then archived here after each session for long-term reference and discussion.
Notes (FramaPad): MyPads
The LLVM Qualification WG was formed following community interest in exploring how LLVM components could be qualified for use in safety-critical domains (e.g. automotive, oil & gas, medical). We welcome contributions from all perspectives: compiler developers, toolchain integrators, users from regulated industries, and others interested in software tool confidence, safety assurance, and systematic quality evidence.
If you’re interested in participating or following along, feel free to join the discussions here or connect via the LLVM Community Discord in the #fusa-qual-wg channel.
Warm regards,
Wendi
(on behalf of the LLVM Qualification WG)
uwendi
3
[LLVM Qual WG] arm-tv demo with @regehr
2025/07/31 8:30AM JST
Recording: link
Chat transcript: link
Notes by Gemini 
Summary
@regehr introduced Alive2, a software tool for refinement checking of LLVM optimizations, and the arm-tv tool, developed by his group, for translation validation of ARM 64-bit assembly code, explaining their methodologies and demonstrating their application in bug detection. While arm-tv has found 46 bugs, primarily silent miscompiles, scalability challenges were acknowledged, particularly around memory access. Questions were raised about limitations during lifting, the tool’s trustworthiness, and adding new architectures.
Details
- Introduction to Alive2: @regehr introduced Alive2, a software tool for refinement checking of LLVM optimizations. He explained that LLVM’s middle-end rewrites Intermediate Representation (IR) to improve code, often making it faster or smaller. These transformations are considered “refinements,” meaning the new code’s set of meanings is a subset of the old code’s. Alive2 uses symbolic execution of the code before and after optimization and generates questions for the Z3 theorem prover to verify that the optimized code refines the unoptimized code.
- Alive2 Compiler Explorer: @regehr encouraged attendees to try Alive2 via its compiler explorer instance at alive2.llvm.org, noting its ease of use and providing an example problem to explore. He also mentioned that papers have been written about Alive2, but hands-on use is likely more engaging.
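As a small illustration (not taken from the talk), here is a hypothetical minimal src/tgt pair of the kind one can paste into alive2.llvm.org: Alive2 checks whether the `@tgt` function refines the `@src` function, in this case the classic strength reduction of a multiply-by-two into a shift:

```llvm
; Hypothetical minimal example for the Alive2 compiler explorer.
; Alive2 pairs @src (before) with @tgt (after) and asks Z3
; whether @tgt refines @src.
define i32 @src(i32 %x) {
  %r = mul i32 %x, 2      ; original: multiply by two
  ret i32 %r
}

define i32 @tgt(i32 %x) {
  %r = shl i32 %x, 1      ; optimized: shift left by one
  ret i32 %r
}
```

Since these two functions are equivalent, the refinement check succeeds; a buggy transformation would instead produce a concrete counterexample input from Z3.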
- arm-tv overview: @regehr presented the arm-tv tool, developed by his group, which performs translation validation for ARM 64-bit assembly code. He demonstrated an LLVM function that uses `memcmp` and showed how the ARM backend optimizes it, including inline substitution of `memcmp` and replacing control flow with a conditional select. The arm-tv tool aims to prove that the assembly code is a faithful translation of the LLVM IR.
- Translation Validation methodology: @regehr explained that translation validation involves assigning a mathematical meaning to the code before and after transformation. Alive2 is used to formally represent the meaning of LLVM functions. For ARM code, arm-tv assigns meaning either by using hand-written instruction semantics derived from the manual or through a mechanically derived version from ARM’s formal description of instructions. The tool then translates the ARM code back into LLVM IR and invokes Alive2 for a refinement check.
- arm-tv in action: @regehr demonstrated arm-tv, which is called backend-tv and also supports RISC-V. The tool parses assembly into LLVM MCInst, lifts the ARM assembly code by building a small execution environment that resembles an ARM processor with registers initialized with “freeze poison” (an indeterminate bit pattern), and then processes the lifted instructions. This process results in a clumsy but optimizable function that Alive2 can then efficiently check against the original code.
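To make the “freeze poison” initialization concrete, here is a hypothetical sketch (assumed for illustration, not from the demo) of how an uninitialized input register might be modeled in the lifted IR; `freeze` of `poison` yields an arbitrary but fixed bit pattern:

```llvm
; Hypothetical sketch of register initialization in lifted IR.
; %x0 stands for ARM register X0 at function entry: an unknown
; but stable 64-bit value, modeled as freeze of poison.
define i64 @lifted_entry() {
  %x0 = freeze i64 poison   ; arbitrary, but fixed once chosen
  ret i64 %x0
}
```

Because the frozen value is unconstrained, the verifier must establish refinement for every possible choice of that value.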
- Bug detection with arm-tv: @regehr shared that arm-tv has found 46 bugs, primarily silent miscompiles, most of which are in the machine-independent parts of the LLVM backend. He noted that while arm-tv recently started supporting RISC-V, fewer bugs have been found compared to ARM, attributing this to the multi-backend impact of the existing bugs: fixes in machine-independent code benefit both targets. @regehr mentioned that most bugs were found with the help of fuzzers and an automated testing workflow.
- Origin and scalability challenges: @regehr revealed that the impetus for arm-tv came from a conversation with JF Bastien years ago about trusting LLVM’s top-of-tree for automotive applications. @YoungJunLee inquired about handling large functions more efficiently, to which @regehr acknowledged scalability as a significant weakness of the tool, particularly with memory access, indicating that improvements to Alive2’s memory encoding are needed.
- Limitations and trustworthiness: @uwendi asked about limitations or loss of information during lifting. @regehr explained that while ARM assembly semantics are cleaner, challenges arose in lifting code with powerful pointers to LLVM’s weaker object-offset model, necessitating changes to Alive2’s memory model to support “physical pointers”. He addressed concerns about trusting arm-tv, suggesting documenting the tool’s scope and limitations, with a separate group of people needed to verify its implementation for certification purposes.
- Tool Usage and Bug Reporting: @regehr stated that currently, only his team uses arm-tv. When a bug is reported by the tool, he verifies it on an actual ARM machine to confirm the misbehavior before reporting it to the LLVM developers, ensuring the tool’s output is vetted. He also mentioned the existence of false alarms due to the complexity of the LLVM memory model.
- Impact on LLVM specification and Future Work: @regehr shared an anecdote where arm-tv uncovered an ambiguity in the interaction between the LLVM LangRef and the AArch64 ABI document, which led to a resolution and fix in LLVM. Regarding future work, he expressed interest in supporting translation validation of inline assembly and concurrency-related aspects of LLVM IR, such as volatile accesses and interrupt handlers in embedded systems.
- Adding new architectures: Luc Forget inquired about the modularity of arm-tv for adding new ISA semantics. @regehr explained that while not “super modular,” refactoring had made it easier to add RISC-V support, and adding a third architecture would likely not be difficult, though Alive2’s lack of multiple address space support remains a limitation for GPU backends. He also highlighted that supporting a new architecture primarily requires a description of its instruction set. @regehr mentioned that for ARM, they can automatically generate the instruction semantics from ARM’s Architecture Specification Language (ASL), but for RISC-V, it was done by hand. He hopes to derive x86-64 semantics automatically in the future, as manual implementation is too extensive.
uwendi
4
LLVM Qualification Group’s August Sync-Up Agenda
Hi all,
The main topics for the next sync-up are as follows:
Internal process update: proposed changes to membership criteria
(Thanks to @petarj and @etomzak for their inputs)
- Discussion: Proposed changes to membership criteria, to address the challenges the current internal process poses for active collaboration and contribution.
- Action: If possible, please complete the Participant Introduction and Membership Criteria Form before the sync-up.
Clang C/C++ WG insights on conformance to ISO
(Follow-up on the previous discussion regarding specifications and traceability to tests for Clang)
- Invitees: @Endill and @AaronBallman (to the EU/Asia or Americas-friendly timeslots, depending on their availabilities)
- Related RFC: https://discourse.llvm.org/t/rfc-c-conformance-test-suite/69821
- Current Status: Overview of Clang’s test suite (clang/test/cxx) and conformance challenges.
- Discussion: How these insights impact LLVM Qualification Group’s goals, explore possible steps on creating better traceability and conformance for Clang.
Open Floor
- Any additional topics, questions, comments, or suggestions from group members.
- Review action items and assignees.
uwendi
6
LLVM Qualification Group’s September Sync-Up Agenda
Hi all, hope you’re having a great summer!
For those who filled in the Participant Introduction & Membership form and indicated interest in being active contributors (Q3): our next sync-up is planned for next week (@petarj @CarlosAndresRamirez @evodius96 @petbernt @slotosch @YoungJunLee @ZakyHermawan).
Ahead of the call, I’d like to invite you to drop a quick message on Discord about the offline reviews we talked about last time (see also the minutes):
Additional quick topics for the agenda:
- Introduction of @ZakyHermawan
- Concerns or viewpoints about meeting transcriptions / AI summaries (Gemini)
- @slotosch’s proposal for the LLVM Conference in Santa Clara
- Eclipse SDV’s interest in the LLVM open qualification initiative + invitation to their community meetup in Japan
- Poster at Innovations in Compiler Technology
- Insights from a conversation with an ELISA project member on resources & funding
Given that time is short, I may also create separate Discord threads to keep these discussions moving more efficiently.
Thanks again to everyone who answered the form. @etomzak You’re warmly welcome in our calls, even if your availability is limited.
Small note: at the moment, @evodius96 is officially the only member from US/Canada time zones. @PLeVasseur is interested and expected to join sync-ups, so just letting you know for context.
Hi Wendi, due to travel I won’t be able to attend this upcoming call. If there is a better time for everyone, please don’t hesitate to make a change. Thank you!
uwendi
8
Hi @evodius96, thanks for letting me know! Since you won’t be able to attend, I’ll cancel the upcoming call. We’ll keep the EU/Asia sync-up as the main source of updates this time, so I’d kindly ask you to have a look at the minutes afterward to stay in the loop. Looking forward to catching up with you in a future call once you’re back from your travels. Safe travels!
uwendi
9
Handling non-technical topics asynchronously
Hi all,
@petarj @CarlosAndresRamirez @petbernt @slotosch @YoungJunLee @capitan-davide
cc: @evodius96 @PLeVasseur @etomzak @ZakyHermawan
For our upcoming sync-up (tomorrow), we have more items on the agenda than we can realistically cover in one hour. Here’s a draft of the presentation (the final version will be uploaded to GitHub after the sync-up):
To make sure we use our meeting time efficiently, and to give everyone a fair chance to contribute, I’d like to suggest that we handle some of the non-technical topics asynchronously on our Discord channel.
Topics for discussion in Discord:
By shifting these items to Discord, we’ll free up the sync-up call to focus on technical discussions (e.g. directions for a grey-box approach, tool usage confidence, evaluation of development processes).
Outcomes from Discord discussions will also be summarized here in our meeting minutes on Discourse so nothing is lost. Looking forward to your thoughts and contributions on Discord! 
uwendi
10
Notes - September 2025
Participants
EU/Asia-friendly (Tuesday 2025/09/02, 5:30PM JST, 1h)
- Carlos Ramirez (host)
- Davide Cunial
- Erik Tomusk
- Florian Gilcher
- Jorge Sousa
- José Rui Simoes
- Oscar Slotosch
- Petter Berntsson
- Vlad Serebrennikov
- Wendi Urribarri (co-host - note taking & check time)
- YoungJun Lee
- Zaky Hermawan
Americas-friendly (Wednesday 2025/09/03, 6:00AM JST, 1h)
Cancelled - See https://discourse.llvm.org/t/llvm-qualification-wg-sync-ups-meeting-minutes/87148/8
Agenda
Refer to https://discourse.llvm.org/t/llvm-qualification-wg-sync-ups-meeting-minutes/87148/6
Links
Highlights
Non-technical topics
About note taking
- No shared concerns from “core members”
- One shared concern about AI writing down every word (from a non-member)
- Gemini not enabled today
New self-nomination through the Google Form - Zaky’s presentation
- EE student from Indonesia
- Coming as an individual
- Working with ISO/SAE 21434 (cybersecurity)
Oscar’s idea for the US LLVM 2025 conference (end of October)
- Proposal to have a corner about compiler qualification at the exhibition for sponsors
- Discuss with people and attract interest in it
- No conclusion about this point, to be taken for discussion to Discord
Technical topics
Wrap-up about direction and focus of the discussions since July
Reference functional safety standard
- Members from several industries (automotive, trains, robots, etc), so different functional safety standards apply
- The general framework for functional safety of E/E/PE systems is IEC 61508, so it makes sense to use it as first guidance
- As IEC 61508 is the parent of other functional safety standards, the expectations around tool confidence are very similar
Need to provide evidence of tool usage
- Three questions from the safety standards (see slides)
- If the answers are Yes / Yes / No, then there is a need to provide the evidence
- Comments about question 3:
- Most safety standards are written for users, so it depends on how much they examine the “relevant outputs”
- In the case of a compiler, relevant output → final executable
- More and more difficult to thoroughly verify the final executable (complexity)
- Many of the tools traditionally used by vendors are closed source; some open tools exist that can be used to check the relevant outputs
First target: Clang compiler
- As a tool provider of Clang, we don’t know what the usage will be
- As a tool user, you can restrict yourself (for example, using it only for debugging, not for mass-production)
- Users will need to rely on the compiler depending on the usage
- All the C++ parsing and semantic analysis is done by the Clang frontend
- Language + Standard => Version changes are fast
- Which flavor of C or C++?
- C++ spec improves significantly
- C spec is more rigorous
- Suggestion of small scope:
- Limit to the lexer? Spec-wise, it is simpler
- Opinion 1: the usefulness of restricting to the lexer is limited; from a safety point of view, one may trust the lexer, but what about the rest? Requirements need to be associated with concrete use cases
- Opinion 2: agreed; a valid use case for the lexer is needed
About effort for a conformance test suite:
- Opinion 1: Amount of effort would be huge even for 1 version
- Opinion 2: Testing is laborious but not very hard
- Opinion 3: If you want to do a good conformance test, the bottleneck is interpretation of the standard; testing against the C/C++ specification is not like doing so for Rust
- Comment: commercial test suites are expensive, 40-45K Euro to qualify only one version of a compiler
- What is generated is version dependent
- About usage of Alive2:
- Replace the Clang front-end with Alive2 front-end and generate Alive2 IR from source code?
- Clarification: alivecc doesn’t replace Clang itself; it simply adds a pass plugin for verification at the IR transformation stage
Grey-box approach
- Qualification is typically black-box activity
- Disadvantage: to be done for every combination, optimization options, etc
- Grey-box approach could be useful, but one limitation is lack of specification of intermediate I/O
- Example: specification of the IR
- Identification of regressions in IR could be useful
Possibility of LTS?
- From this RFC, this will not happen - https://discourse.llvm.org/t/rfc-llvm-lts/84049
- Labor can be massive
- Very difficult interpretation of what an LTS is
- In the Rust community an LTS is 2 years
- Rather do qualification work incrementally on top of the main version
- Which C++ flavor to support is a big question
Example of funding
- Fleet of students
- Reasonable budget
- Example: University of Romania
- “Top leadership” needed to guide the students
Selection of qualification methods
- ISO 26262 proposes four qualification methods
- Evaluation of the tool dev process is highly recommended only for ASIL A and ASIL B
- Not to be used alone; it must be combined with Validation to cover ASIL C and ASIL D
- Clarification:
- Proposal is not about using Validation or Evaluation of dev process alone
- Have a mix of both to cover all safety integrity levels with at least one highly recommended method
- Many tool vendors already use a mix of these two methods for “certification”: validation by the vendor + audit by a certifying body
Actions
Wendi:
- share summary of topics with the group and the community
- point out the possible ways to proceed
- create threads for each subject on Discord? (easier to communicate)
All: participate in the open discussions (preferably on Discord, but Discourse is also fine)
uwendi
11
Just a quick update: I’ve submitted a PR to update the documentation and add links to the August and September 2025 sync-up slide decks, which helped guide our recent discussions:
https://github.com/llvm/llvm-project/pull/156897
The slides are currently hosted in llvm-project/docs/qual-wg/slides, but following feedback, I plan to migrate them to a more appropriate location (likely llvm-www) once confirmed with the community. Please feel free to check the PR for details, and let me know if you have any feedback!
uwendi
12
LLVM Qualification Group’s October Sync-Up Agenda
Calendar: Getting Involved — LLVM 22.0.0git documentation
Non-technical Topics
- Docs updates (September) – summary of recent GitHub changes for LLVM Qualification Group Docs: #156897, #157804, #156184, #158842, #160021, #161113
- New members (September) – welcome to @sousajo-cc @jr-simoes @ZakyHermawan
- Decision-taking in the WG (requested by @slotosch) – discussion on how we define consensus, use votes, and set time limits for open topics
Technical Topics
- Upstream efforts & action plan (small deliverables) – build an initial roadmap based on the “confidence in the use of software tools” workflow
- Tutorial / Introduction (proposed by @YoungJunLee) – outline for newcomer materials
- Qualification focus areas (proposed by @petbernt) – first candidate areas and lightweight templates
Looking forward to seeing everyone at the October sync-up and continuing to shape our next steps together.
uwendi
13
Notes - October 2025
Participants
EU/Asia-friendly (Tuesday 2025/10/07, 5:30PM JST, 1h)
- Carlos Ramirez
- Davide Cunial
- Oscar Slotosch
- Petter Berntsson
- Wendi Urribarri
- YoungJun Lee
- Zaky Hermawan
Americas-friendly (Wednesday 2025/10/08, 6:00AM JST, 1h)
- Alan Phipps
- Wendi Urribarri
Agenda
Refer to https://discourse.llvm.org/t/llvm-qualification-wg-sync-ups-meeting-minutes/87148/12
Links
Highlights
Gemini notes taken in the EU/Asia meeting, modified by @uwendi
Non-technical topics
Meeting Logistics and New Members
- Slides shared on Discord before the meeting
- Addition of three new members
- Updated member list has been added into the official group webpage (PR not merged yet)
Challenges in Decision-Making
- Difficulties in making decisions within the group
- Cultural factors, challenges in reaching consensus, and time limits identified as primary issues
- What does consensus mean for us; what constitutes it?
- Find compromise, common understanding
- Clear when people have different opinions, but for people who are not giving an opinion, how can we know?
- Divided positions - What happens when 50/50?
- Consensus could mean no objections are raised, or majority rule if participation is lacking
- Engagement in early conversations and reasoning are necessary, but for clear yes/no decisions, voting with a >50% threshold should be used, excluding non-votes from the percentage => If someone doesn’t vote, we cannot count the vote
- If there is a “gray” area, then maybe discuss it in the next sync-up meeting
- What is the model from other WGs? In general, it’s done through “informal” decision making
- Don’t overcomplicate => Lightweight process
Technical topics
Discussion on Confidence and Qualification Activities
- Two work threads: classical black-box validation + component confidence
- These are explorations, not final decisions
- Suggestion: rephrase the “confidence” aspect as “gray/white box qualification approach by focusing on sub-components” rather than “improving confidence,” explaining that tool confidence is a risk analysis that cannot be improved; rather, risks can be reduced through qualification
- Counter-opinion: different providers can have different Tool Confidence Levels (TCLs) for the same tool due to mitigation actions, which could be seen as improving confidence
- Proposal: this WG could provide advice on activities to increase confidence in tool usage, beyond just qualification
- Suggested reformulation:
- Create confidence in the use of the tools, e.g. by using them carefully and checking the tool results
- This will help to reduce the tool confidence level (TCL) as defined in ISO 26262, which measures the tool risks
- The remaining risks of the tools have to be reduced by tool qualification
- The most used qualification method is validation
- It can be performed as a black-box approach, just by testing the requirements, e.g. the compliance of a compiler with a C or C++ standard
- Another validation strategy, which can be applied to open-source tools, is a white/grey box approach, i.e. validating sub-components of the tools, e.g. a lexer, parser, or backend of a compiler
- Different perspective:
- Validation is acknowledged as a common qualification method for use of a tool in higher-risk contexts (SIL 3/4, ASIL C/D…)
- But evaluation of the development process is another highly recommended method for lower-criticality contexts (ASIL B)
- Let’s document our arguments about the suitability of LLVM’s development process upstream, and perform an assessment with auditors who could volunteer for it
- Development process: LLVM developer policy + evaluation of the process: OSS best practices + human factors metrics
- A qualification could then be achieved up to ASIL B context usage
- Validation would be a second-step approach for qualification at higher criticality levels
Tool Qualification and Library Qualification
- Tool qualification by validation: different strategies like using test suites or breaking down qualification by validating sub-components such as the lexer or parser
- Qualifying sub-components is useful for overall qualification
- Library qualification is more rigorous and typically requires a white box approach with code coverage measurements
Focus on Upstream LLVM and Reusability
- The group’s focus is exclusively on the upstream LLVM project, aiming to create reusable output that downstream companies can utilize
- The group strives for reusability, building general or usable tools for various standards
Challenges in Defining a Toy Project
- Several constraints, including the lack of resources and the difficulty in choosing a language (C, C++, Rust…), compiler version, language standard, and compiler flags for a deliverable “toy project”
- Agreeing on a toy project would help answer these questions and provide a concrete direction for the group’s work
Standard Library Function Qualification
- Start with function-level qualification of standard library functions within a limited scope, such as a single header file, as it would be easier to manage than qualifying the entire compiler
- Use publicly known sources like CPP reference for requirements and tracing tests, aiming to provide upstream qualification evidence and design that downstream users could then utilize
- Library qualification is more rigorous than tool qualification but selecting requirements for library functions is a good idea
- If the group explores the library qualification topic, it might focus on pure C language due to the significant differences and complexities of C++
- The difference between C and C++ standard libraries is substantial, C++ libraries change frequently
- ISO/PAS 8926:2024 to qualify software components (incl. libraries)
- Planned to be merged with Part 8, Chapter 12 in the next edition of ISO 26262 (2027)
- Two axes: analysis of provenance and analysis of complexity
- Depending on results of the analysis, different qualification activities must be performed
Outputs and Collaboration
- Draft tutorial for newcomers, focusing on organizing documents, presentations, and projects like compiler and code coverage sanitizers
- External toolchains could be referenced
- RFC to gather community input, aiming to provide a framework for compiler qualification and enable sharable evidence for downstream users
- Other efforts can and will be explored in the future, and all members are encouraged to work on any of these subjects, based on their interests and bandwidth/availability
Actions
- @petbernt will make a proposal presentation for library qualification
- @YoungJunLee and @ZakyHermawan will help write the draft tutorial
- @slotosch will put his thoughts on tool confidence and qualification activities in a message in the group’s Discord channel
- @uwendi will continue working on analysis of the suitability of the LLVM Developer Policy as a development process for safety-critical use, and on how to reuse good-quality best practices from ELISA as part of this analysis/assessment
- @CarlosAndresRamirez will run experiments on LLVM and provide evidence to see what can be accomplished using human-centric metrics analysis in LLVM’s upstream development process
- All: provide feedback on @petbernt’s RFC (considering the potential library approach) by end of the current calendar week, then @petbernt will publish it
uwendi
14
LLVM Qualification Group’s November Sync-Up Agenda
Pre-reads / Prep
- Think ahead: Which linking policy should we adopt for Meeting Materials and why
- Bring 1–2 bullets: Your progress, top blocker, and any help needed
Non-technical topics
- Docs updates (October): summary of GitHub changes for LLVM Qualification Group Docs
- Decision on “Meeting Materials” linking policy: Per-meeting slide links (monthly PRs) versus Single folder link (rely on folder contents) in Meeting Materials
- WG page check-in: What to add/correct/improve (wording, links, missing topics, clarifications)
- US LLVM & relevant bits for this WG: Attendee highlights; what should feed into our backlog?
Technical topics / Round-robin updates
Looking forward to seeing everyone at the November sync-up.
uwendi
15
To our readers,
FYI, we’re merging the two regional sync-ups into one Tuesday 13:00 UTC call; see our Discord channel and the LLVM calendar for details.
uwendi
16
Notes - November 2025
Participants
Tuesday 2025/11/05, 13:00 UTC, 1h
Agenda
Refer to LLVM Qualification WG sync-ups meeting minutes - #14 by uwendi
Links
Highlights
Gemini notes taken, modified by @uwendi
Non-technical topics
Pull Requests and Meeting Materials
- The latest Pull Request (PR), prepared by @YoungJunLee, was merged, and there are currently no pending PRs.
- There is a concern regarding meeting materials being stored only in a personal Google Drive. We stopped storing them on GitHub due to feedback from 2 people in the community. We’ll consider contacting the infrastructure team to determine a suitable storage location, possibly archiving quarterly PDFs on GitHub while continuing to share links from the Google Drive for collaboration.
Takeaways from the US LLVM Conference
- @evodius96 attended the presentation from Peter Smith from ARM, which was interesting and showed a practical example relevant to compiler qualification.
- @uwendi attended @slotosch’s round table. Other attendees asked questions about the group’s work. Feedback from Peter Smith is that he would like to see a keynote from the group next year outlining our vision. Directed the attendees to @petbernt’s request for comments.
- Does Solid Sands only provide a test suite for toolchain quality, as suggested by a previous conference talk? It looks like they currently offer a broader selection, including library qualification. Some concerns exist because it seems they do not participate in the ISO C and C++ committees, potentially impacting their ability to keep up with language changes and interpretations.
- Discussed the challenge of judging the quality of conformity test kits without insight into the build and testing processes. Suggestion: false positives might be an indicator of quality.
- Conference presentation and group visibility: there might be motivation for the group to do more for next year’s conference, possibly a keynote, round table, or talk, to create more awareness. It was too early to present at the recent conference. April’s LLVM Europe conference might be a target, or potentially a conference in Asia if one is organized.
- Operational Maturity and Code Reviews: @uwendi attended the operational maturity round table, discussing code reviews as a way to prevent errors. There is an RFC from Infra about enforcing pull requests. Noted that 30% of commits from 5% of contributors are reviewed post-merge, and issues include a lack of reviewers and long post-commit review times.
Technical topics / Round-robin updates
Status of the Qualification RFC and Library Proposal
- Concern from @petbernt that non-context-aware readers might misunderstand the RFC.
- @petbernt agreed to post it in other Discord channels to increase visibility.
- @petbernt also has a small proposal for library qualification that is not yet finished but could be shared later in the week.
Tutorial Preparation
- @uwendi provided materials to @YoungJunLee that could be helpful to prepare the tutorial / set of initial documents for newcomers.
- Open to further assistance with the interpretation of the functional safety standards
- @uwendi can contact old colleagues from the railways industry for materials related to their standards.
- @uwendi shared personal efforts regarding the ISO 26262 standard, noting a successful submission and acceptance of around 30 simple comments and plans for two or three significant (possibly controversial) upcoming comments regarding the description of methods 1b and 1c (evaluation of the development process and validation of the software tool).
- @uwendi expressed concerns about the wording and requirements for method 1b, suggesting that evaluation of the tool development process should be highly recommended for all ASILs.
Software Tool Experiments and Automation
- @CarlosAndresRamirez shared results from his quality-focused experiments on the development process, which were conducted on LLVM and Alive2, confirming at least 12 defects in Alive2 that still need to be reported.
- @CarlosAndresRamirez noted that adoption of this strategy will face resistance unless the process is fully automated to prevent extra effort for developers. He will work on automation, possibly involving git hooks, and prepare templates and documentation for the group, emphasizing that this work relates to “evaluation of the development process” (Method 1b in ISO 26262).
- @uwendi requested a demo; agreed to schedule offline.
LLVM Developer Policy Suitability Analysis
- @uwendi presented a draft analysis of the LLVM developer policy’s suitability based on the open-source software practices checklist from the Elisa project’s Lighthouse OSS SIG. She gave an initial, rough evaluation based on gut feeling due to the immaturity of the maturity scale, but stated that the written process generally “looks good”.
- Areas rated as having “limited maturity” included one item in security/supply chain (SBOMs) and the lack of a formal bus factor metric program.
- While the written process looks good, measuring its execution and follow-through requires automation; this work is currently blocked on the undecided maturity scale.
- The checklist for LLVM’s suitability would support method 1b, evaluation of the development process; she argued that a best-practices list is more suitable than requiring assessment against a national or international standard, especially for open source.
- @uwendi proposed a new action for herself to start writing templates for the workflow steps outlined in the previous sync-up. These templates would be useful for both downstream and upstream. For tool qualification, the idea is to include worksheets for each method.
Actions / Next steps
- @petbernt will post some slides with a small proposal for library qualification, and post the RFC in other channels in Discord, to make it more visible.
- @CarlosAndresRamirez will report the confirmed bugs found in Alive2.
- @uwendi will contact the infrastructure team to discuss where meeting materials could be stored, possibly archiving quarterly PDFs in GitHub while continuing to share links to her Google Drive.
- @petbernt will ask Peter Smith from ARM to add a comment to the RFC requesting input from the community.
- @uwendi will contact an old colleague from the railways industry to ask if they have materials and can help with the new version of the railway standard for tool qualification.
- @CarlosAndresRamirez will work on automation using git hooks or similar mechanisms for the development process quality strategy and prepare templates and documentation for the human-centered approach to finding defects to share with the group.
- @CarlosAndresRamirez will give a demo of the defect finding experiments at a dedicated slot or the next sync-up.
- @uwendi will start writing templates for the gray boxes in the workflow from the last sync-up and share them with the group for review and improvements.
- @petbernt will send some presentations about Solid Sands and how their SuperTest suite and framework work.
uwendi
17
LLVM Qualification Group’s December Sync-Up Agenda
Hello all,
Our next sync-up meeting will be dedicated to a special topic: Function-Level Qualification Methodology for libc/libc++ (@petbernt’s proposal)
Slides (open for comments):
@petbernt will tell us more about why standard libraries matter for qualification, giving us a walkthrough illustrated with examples from the slide deck:
- Overview of the proposed proof-of-concept:
- Unique challenges: vast API surface, varied implementations, historical behavior, testability
- Why “function-level qualification” might offer a scalable entry point
- Relationship to previous WG discussions on requirements traceability, upstream-friendly artefacts, and modular qualification pilots
- Structure of the approach
- Requirements decomposition (per function)
- Test strategy (functional, boundary, behavioral)
- Traceability approach across libc/libc++
- Criteria for function selection in the PoC
Let’s also take time for clarifying questions and discussion about strengths & gaps. Some guiding questions:
- Does the function-level approach scale?
- Is the requirements/test breakdown consistent with typical qualification workflows?
- What do we consider “minimum viable artefacts” for a PoC?
- How should the WG ensure upstream-friendliness?
- What is the interaction with existing test suites (libc/test, libc++/test, other conformance suites)?
- How might other methods & tools (static analyzers, fuzzing, translation validation) play a complementary role?
- Any foreseeable blockers?
See you next week!
1 Like
uwendi
18
Notes - December 2025
Participants
Tuesday 2025/12/02, 13:00 UTC, 1h
Agenda
Refer to LLVM Qualification WG sync-ups meeting minutes - #17 by uwendi
Links
Gemini notes taken, modified by @uwendi
Summary
@petbernt proposed a function-based qualification approach for C and C++ standard library functions, aiming to create modular, upstream-friendly artifacts (like design and traceability evidence) that downstream users can reuse for functional safety qualification and certification. We discussed the feasibility, complexity, and compliance of applying this component-level approach to extensive libraries like libc and libc++ under standards like ISO 26262.
Highlights
- Proposal of a Function-Based Qualification Approach for C and C++ Standard Library Functions: The goal of this concept is to explore a methodology for function-level qualification of the C and C++ standard libraries that is upstream-friendly and modular, motivated by the standard library’s inherent modularity, which makes it ideal for incremental qualification. It aims to demonstrate that qualification artifacts can be created in an open-source community and reused by downstream consumers. This requires compatibility with open-source methodologies: requirements and design artifacts would potentially be built from public non-normative references like cppreference.com, since direct quotation of ISO specification text is prohibited due to copyright.
- Outcome and Objective of the Qualification Methodology Proposal: The outcome of the proposal would be a practical and transparent methodology demonstrating how safety qualification evidence can be created and collaboratively maintained upstream, incrementally increasing the scope of what the open-source community qualifies. This proposal focuses on a practical pilot direction for the Qual WG.
- Clarification on the Upstream Methodology and Downstream Qualification: The proposal is not about qualifying upstream, but about creating qualification materials (like design artifacts, or traceability artifacts between implementation, design, and tests) that can live upstream and be used downstream. The actual qualification and eventual certification would be the responsibility of downstream users, who would review the evidence and provide it to a third party for assessment.
- Compliance with Safety Standards and Modularity of Qualification: The methodology could be applied to all the different standards, noting that most standards focus on documentation and artifact tracing. Function-level qualification of the C and C++ libraries was considered a good approach, a necessity given the vast number of functions in the C++ template library. The group agreed on the need for a modular approach, viewing the proposal as a proof of concept that can be incrementally expanded.
- Scope and Role of the WG in Qualification Evidence Creation: The scope of the proposal is to create upstream evidence, not certification, by providing reusable upstream qualification artifacts, and it is not intended to qualify or certify LLVM itself. The concept envisions the WG producing artifacts in the upstream LLVM repository, which downstream users would then reuse in their safety case or safety plans before providing them to certification bodies. The goal is to reduce duplicated effort by having more evidence provided upstream. The downstream users would then reuse and extend these artifacts within their own safety life cycle, and hopefully begin contributing qualification evidence upstream. The WG’s role is to enable qualification, not to act as a certifying body, with the outcome being a model for creating referenceable qualification evidence for downstream organizations to build upon.
- Definition of a Software Component and its Relevance to Safety Standards: A software component is defined as a self-contained part of a software system with a defined interface (e.g., public API function signatures, header files), documented behavior that can be verified with specific expectations, an implementation (source code), and a verification scope with tests, coverage, and analysis tied to it. Safety standards, such as IEC 61508 and ISO 26262, view runtime libraries as software components, requiring each component to have behavioral requirements, a design representation, verification, coverage, evidence of exercise, and traceability between all these elements. Standard library functions naturally fit this definition, with one function or one header ideally classified as one component.
- Benefits of Function Level Qualification: Function-level qualification works because each C and C++ function naturally maps to a small, self-contained software component, enabling incremental qualification of one function, one header, or one behavioral subset at a time. This approach encourages parallel contributions, as each contributor can qualify a self-contained component, create a design artifact, and specific test cases. This small scope (for example, four or five requirements per function) could be manageable and easily parallelizable, aiding with traceability and review.
- High-Level Approach for Qualification Artifacts: The proposed approach involves defining upstream-friendly requirements that express observable behavior, derived from reverse engineering existing implementations and from public non-normative references (like cppreference.com, since ISO text cannot be quoted). Requirements should be broken down into atomic, easily testable requirements (similar to AUTOSAR style), where one requirement states only one thing. The behavior should also be modeled in a design artifact, such as a PlantUML diagram, serving as a detailed design artifact that represents the function’s control flow and outcomes. This process would then trace requirements and the design model to the implementation and testing.
- Establishing Traceability and Leveraging Existing Test Suites: A lightweight traceability system is needed, which could use YAML or Markdown tables, linking requirements and design models to implementations (source code) and tests. @petbernt suggested trying to leverage and extend the existing LLVM test suites for libc and libc++ to achieve full coverage. Evidence such as coverage metrics (line and branch initially, with MC/DC coverage planned for future expansion) should be recorded and published in upstream qualification artifacts.
- Recommendation to Start with Simple Pure Functions: It is recommended to start with simple pure functions like memcpy, memmove, memcmp, binary_search, fabs, or isalpha, because they are deterministic and have observable outcomes, making them easy to describe and test without platform dependencies, hidden state, or global variables. This serves as an ideal proof of concept before addressing more complex or stateful functions.
- Memcpy Example: Key Behaviors and Qualification Steps: The example used is memcpy, with key behaviors being: returning the destination pointer, copying exactly count bytes from source to destination, and having undefined behavior for overlapping regions. The proposed qualification steps for memcpy involve creating small atomic requirements based on public reference models (like cppreference.com), modeling it in PlantUML, linking requirements to existing tests, and verifying and potentially extending coverage. The description of memcpy from cppreference.com was cited, highlighting that undefined behavior occurs for overlapping objects or invalid/null pointers.
- Refining Specifications into Atomic Requirements and Assumptions: The cited description of memcpy was broken down into six atomic verifiable requirements and two assumptions for the user regarding undefined behavior. A requirement should say only one thing and be testable with one specific test case; that is the definition of atomic. The user assumptions require that source and destination regions do not overlap and that the caller ensures valid, non-null pointers. There is no specific methodology proposed yet for performing this refinement from natural language to atomic requirements, nor recommendations for ensuring that the rewritten specifications are truly a refinement of the original natural-language description (i.e., that the refinement is correct).
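As an illustration only (the file path, ID scheme, and field names below are hypothetical, not part of the proposal as presented), one such atomic requirement could be captured in a small YAML file:

```yaml
# requirements/memcpy/REQ-MEMCPY-001.yaml -- hypothetical path, ID, and schema
id: REQ-MEMCPY-001
function: memcpy
header: string.h
statement: >
  memcpy shall return the value of its destination argument.
derived_from: cppreference.com (non-normative reference)
verification: test
```

Keeping one file per requirement makes each requirement individually reviewable through the normal pull-request workflow.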
- Behavioral Model using PlantUML and Traceability: @petbernt presented a behavioral model for memcpy in PlantUML, where the requirements are traced to specific sections of the diagram. The simple example shows the function signature, the copy loop, reading and writing to addresses, and returning the destination pointer. Since PlantUML is text-based, it can be handled through regular LLVM pull request review methods.
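A minimal sketch of what such an activity diagram might look like (this is not the diagram presented; the requirement ID in the note is hypothetical):

```plantuml
@startuml
start
:memcpy(dest, src, count);
while (bytes copied < count?) is (yes)
  :read next byte from src;
  :write byte to dest;
endwhile (no)
:return dest;
note right: traces to REQ-MEMCPY-001 (return value)
stop
@enduml
```

Because the model is plain text, diffs against it show up in review just like code changes.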
- Proposed Structure for Qualification Evidence in LLVM Repository: The qualification evidence is proposed to live in specific folders: a requirements folder (containing YAML files, one per atomic requirement, defining what must be true, derived from behavioral descriptions), a design folder (containing PlantUML models referencing requirements), and a traceability file per function. The traceability file would list all linked requirements and reference design artifacts, test files, and test cases, verifiable through validation scripts. An example of the traceability file structure was shown, listing header files, design reference, requirements, assumptions, and associated test functions and files, which establishes verifiable links and enables automated validation.
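To make the structure concrete, a per-function traceability file along these lines might look as follows (all paths, IDs, and field names are hypothetical sketches of the description above, not the file that was shown):

```yaml
# traceability/memcpy.yaml -- hypothetical path and schema
function: memcpy
header: string.h
design: design/memcpy.puml
requirements:
  - REQ-MEMCPY-001   # returns the destination pointer
  - REQ-MEMCPY-002   # copies exactly count bytes
assumptions:
  - ASM-MEMCPY-001   # source and destination regions do not overlap
  - ASM-MEMCPY-002   # caller ensures valid, non-null pointers
tests:
  - file: test/src/string/memcpy_test.cpp
    cases: [ReturnsDestination, CopiesExactCount]
```

A validation script can then check mechanically that every listed requirement, design artifact, and test case actually exists, which is what makes the links verifiable.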
- Referencing ISO Standards and Open Source Requirements: Regarding references to ISO standards, there was a suggestion that while copying the text is restricted, referring to the unique clause IDs of the standards is possible and provides a more credible and stable reference than a wiki page. Nevertheless, not all upstream maintainers or users have access to the copyrighted ISO standard, which is why an open-source-friendly approach using publicly known sources (like cppreference.com) is needed. The current libc repository already references specific clauses in the ISO standards and POSIX, but it is acknowledged that non-ISO descriptions might differ slightly.
- Additional Requirements from ISO 26262 for Unit Verification: ISO 26262 software component qualification refers to software unit verification (which aligns with C functions) and requires systematic consideration of equivalence classes, boundary values, and extreme values, in addition to simple requirements and tests. These additional efforts are currently beyond the basic proof of concept.
- Automated Generation of Traceability Matrix Visualization: Based on the traceability file, a PlantUML-based traceability matrix visualization can be automatically generated, linking design, requirements, test cases, and test files, which provides a clear visualization of traceability. This visualization would also be committed as text files to the upstream repository.
- Placement of Qualification Evidence within LLVM Projects: Qualification evidence is suggested to reside close to its specific project, such as within the libc or libc++ subprojects, with separate subfolders for requirements, design, traceability, and scripts. This structure avoids top-level clutter, mirrors the LLVM modular layout, and clarifies ownership for reviewers. Existing test cases would be reused, and new tests for coverage gaps would reside in the same test directory. A top-level qualification tools folder could hold shared templates and utility scripts for a common methodology across subprojects.
- Coverage and Confidence Building: For the initial proof of concept, line and branch coverage are suggested to demonstrate the methodology’s feasibility, with the potential to expand to MC/DC coverage in the future once the test infrastructure is stable. The goal is to identify coverage gaps and create more tests until each qualified function becomes a reusable evidence unit, fully testable with 100% coverage and traceability.
- Reusability of Existing Tests and Tooling for Traceability: Regarding the usefulness of the existing tests in the framework, at least one person in the WG has analyzed the test cases, confirming that they are often not mapped to specific functions or requirements, but that adding this information (e.g., in comments) would make them usable. The same person suggested that while PlantUML might be useful for flow specification, it might be “over-engineering” for simple functions, and that for traceability, knowing the file location is sufficient and easily translated into a qualification report.
- Feasibility Concerns Regarding Effort and Library Complexity: Some expressed concern that implementing this approach looks like an “enormous effort” and questioned its applicability to large and complex libraries like libc and libc++, especially given the different implementations for various architectures. In response, it was noted that companies are already performing this work proprietarily, and that sharing basic modeling of simple functions upstream would provide the majority of the work, simplifying the task for downstream users, who would only need to add architecture-specific artifacts.
- Handling Compliance and Deviations in Complex Functions: Another concern is that complex or stateful functions in the libraries might not inherently comply with ISO 26262. Even if there are deviations from the standard, documenting the actual behavior and providing known workarounds (such as assumptions for undefined behavior in a safety manual) still results in a valid safety case for an assessor, as the documented behavior is transparent. One participant confirmed finding bugs in the C++ specification itself and agreed that a safety manual is the correct approach to document known issues.
- Proprietary Qualification Efforts and Upstream Contribution: Existing companies offer products with test cases, requirements, coverage, and analysis for over a thousand C++ template specifications; they would not upstream this, as it is their business model. The purpose of the Qualification WG should still be to attempt some form of qualification upstream, not only to reduce costs by making some qualification work available upstream, but also to unify a fragmented landscape of knowledge and improve the understanding of basic properties of these libraries that are common to all downstreams.
- Focus on Latest Standard Version and Next Steps: The methodology aligns with the current LLVM libc efforts toward C23, and clarity on focusing on the latest standard version is important. Regarding a plan for developing the testing framework/scripts, LLVM already has the lit test framework, and the work would involve mapping existing tests to requirements and extending coverage.
Actions / Next steps
- @petbernt will check with people working on library development and testing in the community (e.g. the Clang C/C++ WG) for opinions, and for historical background on any similar approaches that may already have been tried in the past for upstream libraries.
- @uwendi will reflect upon and write down ideas on how to explain the spec refinement process in the methodology.
1 Like
uwendi
19
LLVM Qualification Group’s January 2026 Sync-Up Agenda
Here’s the agenda for next sync-up meeting:
- 2026 Objectives & Radar: Key focus areas for next year; topics on the horizon; how to engage the broader LLVM community with regards to our action ideas.
- Action Items Review: Quick status check on open actions; identifying ways to unblock or move stalled items forward.
- Small, Practical Deliverables: Ideas for lightweight, useful outputs (e.g. short notes, alignment with other LLVM groups, “good-enough-for-now” artifacts that can evolve).
See you next Tuesday!