OWASP 2025 Global AppSec EU: Full Schedule

arrow_back View All Dates

10:30am CEST

A completely pluggable DevSecOps programme, for free, using community resources

Friday May 30, 2025 10:30am - 11:15am CEST

Despite our collective efforts, we haven’t managed to harmonize tools and processes. Several projects like ASVS, SAMM and others have attempted information harmony but only the now defunct Glue has attempted tool orchestration harmonization and for good reason, it is a hard problem to solve, almost impossible by volunteers alone.

This session introduces Smithy, the only open-source workflow engine for security tools. Smithy stands as a unifying force for building robust, scalable DevSecOps, and beyond, pipelines. Leveraging Smithy’s support for OCSF-native data formats, we centralized the outputs of disparate security tools into a cohesive data lake, unlocking actionable insights that improved vulnerability prioritization and resource allocation.

The talk will showcase real-world applications, including integrating OpenCRE, Cartography, AI-driven solutions and open-source resources to enhance vulnerability detection accuracy and reprioritization, for free, using ready made community resources.

Whether you're a tech lead, security engineer, or CISO, this presentation offers practical guidance for creating adaptable, data-driven security workflows without breaking the bank.

Speakers

Spyros Gasteratos

Security Engineer & Architect, OWASP

Spyros has over 15 years of experience in the security world. Since the beginning of his career he has been an avid supporter and contributor of open source software and an OWASP volunteer. Currently he is interested in the harmonization of security tools and information and is currently... Read More →

Friday May 30, 2025 10:30am - 11:15am CEST
Room 116+117 CCIB

Breakout: Builder Track

Audience Intermediate

10:30am CEST

Think Before You Prompt: Securing Large Language Models from a Code Perspective

Friday May 30, 2025 10:30am - 11:15am CEST

Room 114

As Large Language Models (LLMs) become integral to modern applications, securing them at the code level is critical to preventing prompt injection attacks, poisoned models, unauthorized modifications, and other vulnerabilities. This talk delves into common pitfalls and effective mitigations when integrating LLMs into software systems, whether working with cloud vendors or hosting your own models. By focusing on LLM security from a developer's perspective rather than runtime defenses, we emphasize a shift-left approach—embedding security early in the software development lifecycle to proactively mitigate threats and minimize risks before deployment.

We'll examine practical security challenges faced during LLM integration, including input sanitization, output validation, and model pinning. Through detailed code examples and a live demonstration of model tampering, attendees will witness firsthand how attackers can exploit inadequate security controls to compromise LLM systems. The demonstration will showcase a real-world scenario where a legitimate model is swapped with a malicious one, highlighting the critical importance of robust model integrity verification and secure deployment practices.

Participants will learn concrete implementation patterns and security controls that can prevent such attacks, with practical code samples they can apply to their own projects. The session will cover essential defensive techniques including proper API key management, secure model loading and validation, and safe handling of sensitive data in prompts. Whether you're building applications using cloud-based LLM services or deploying your own models, you'll leave with actionable code-level strategies to enhance your application's security posture and protect against emerging AI-specific threats.

Speakers

Yaron Avital

Security Researcher, Palo Alto Networks

Yaron Avital is a seasoned professional with a diverse background in the technology and cybersecurity fields. Yaron's career has spanned over 15 years in the private sector as a software engineer and team lead at global companies and startups.Driven by a passion for cybersecurity... Read More →

Tomer Segev

Security Researcher, Palo Alto Networks

Tomer Segev is a cybersecurity professional with a strong background in software development and security research. He began his career at 17 as a developer before serving as a cyber researcher in the top cyber unit of the IDF, where he gained hands-on experience in the most advanced... Read More →

Friday May 30, 2025 10:30am - 11:15am CEST
Room 114

Breakout: Defender Track

Audience Intermediate

10:30am CEST

LLMs vs. SAST: How AI Delivers Accurate Vulnerability Detection and Reduces False Positives

Friday May 30, 2025 10:30am - 11:15am CEST

Room 115

Static Application Security Testing faces a significant challenge: while current tools excel at identifying potential vulnerabilities in isolation, they struggle to understand the holistic context of how data flows between client and server components. This limitation leads to an overwhelming number of false positives, particularly in detecting Cross-Site Scripting (XSS) vulnerabilities, where the interaction between client and server components is crucial for determining genuine security risks.

In this presentation, we'll demonstrate how Large Language Models (LLMs) can revolutionize vulnerability detection by understanding complete codebases across client and server components. Through a series of practical experiments, we'll show how LLMs can:

- Track data flow paths between different application layers
- Identify genuine vulnerabilities while reducing false positives through context-aware analysis
- Provide detailed reasoning about vulnerability exploitability
- Handle real-world applications at scale

We'll share our findings from extensive research using LLMs, including successes and limitations in analyzing cross-component vulnerabilities. Attendees will learn how this approach could transform security testing and what challenges must be addressed for production implementation.

Speakers

Jonathan Santilli

Software Engineer and AI practitioner, Snyk

Jonathan Santilli defines himself as a problem solver, or at least he tries. With over 20 years of experience working for various tech companies, Jonathan has played different roles, from Team lead developer to Product manager and, of course, problem solver. Jonathan is mainly interested... Read More →

Kirill Efimov

Security R&D Team Lead, Mobb.ai

As a seasoned security researcher, I've led teams at Snyk and now helm security research at Mobb. With a wealth of publications and speaking engagements, I've delved deep into the intricacies of cybersecurity, unraveling vulnerabilities and crafting solutions. From pioneering research... Read More →

Friday May 30, 2025 10:30am - 11:15am CEST
Room 115

Breakout: Manager/Culture Track

Audience Intermediate

11:00am CEST

(MOVED TO MEMBER LOUNGE) OWASP Certified Secure Developer Open Call

Friday May 30, 2025 11:00am - 11:45am CEST

Room 111

Join Us in Shaping the Future of Secure Software Development

The OWASP Education and Training Committee is developing a certification program designed specifically for developers—and we need your expertise.

For the first time, this initiative will be showcased at OWASP Global AppSec EU 2025, and we’re inviting the community to help build the body of knowledge that will form the foundation of the certification curriculum.

If you're passionate about secure coding and developer education, this is your chance to contribute meaningfully to a global effort. Let’s build something that lasts—together.

Speakers

Shruti Kulkarni

Information Security Architect, 6point6

Shruti is an information security / enterprise security architect with experience in ISO27001, PCI-DSS, policies, standards, security tools, threat modelling, risk assessments. Shruti works on security strategies and collaborates with cross-functional groups to implement information... Read More →

Friday May 30, 2025 11:00am - 11:45am CEST
Room 111

Breakout: OWASP Project Demo Lab, Call for Contributors

Audience Advanced, Beginner, Intermediate

11:30am CEST

Navigating Agentic AI Security Risks: OWASP’s GenAI Guidance for Securing Autonomous AI Agents

Friday May 30, 2025 11:30am - 12:00pm CEST

Room 133-134

As artificial intelligence advances, autonomous AI agents are becoming integral to modern applications, automating decision-making, problem-solving, and even interacting dynamically with users. However, this evolution brings new security challenges that traditional cybersecurity frameworks struggle to address. OWASP’s GenAI Security Project has identified Agentic Security Risks as a critical category of threats that can compromise AI-driven systems, leading to unintended actions, data leaks, model manipulation, and adversarial exploits.

This session will explore Agentic Security Risks—a unique class of vulnerabilities stemming from AI agents’ autonomy, adaptability, and ability to interact with complex environments. We’ll dissect how malicious actors can exploit these systems by influencing their decision-making processes, injecting harmful instructions, or leveraging prompt-based attacks to bypass safety constraints.

Through a deep dive into OWASP’s latest findings, attendees will gain practical insights into risk identification and mitigation strategies tailored for AI-driven agents. The talk will cover:

Understanding Agentic Security Risks: How autonomous AI agents process, reason, and act—and where vulnerabilities emerge.
Threat Modeling for AI Agents: Key security considerations when deploying AI-driven agents in enterprise and consumer applications.
Exploitable Weaknesses in AI Agents: Case studies on prompt injection, adversarial manipulation, data poisoning, and model exfiltration.
OWASP’s Mitigation Framework: Best practices for securing agentic AI systems, including robust validation, policy enforcement, access control, and behavioral monitoring.
Security by Design: How to integrate GenAI security principles into the development lifecycle to preemptively mitigate risks.
By the end of the session, attendees will have a structured approach to assessing and mitigating security risks in agentic AI systems. Whether you’re a developer, security professional, or AI architect, this session will equip you with actionable strategies to secure your AI-powered applications against emerging threats.

Join us to explore the cutting edge of AI security and ensure that autonomous agents work for us—not against us.

Speakers

John Sotiropoulos

Head of AI Security / OWASP GenAI Security Project (Top 10 for LLM & Agentic Security Co-Lead), Kainos

John Sotiropoulos is the head of AI Security at Kainos where he is responsible for AI security and securing national-scale systems in government, regulators, and healthcare. John has gained extensive experience in building and securing systems in previous roles as developer, CTO... Read More →

Friday May 30, 2025 11:30am - 12:00pm CEST
Room 133-134

Breakout: OWASP Project Showcase Track, Lightning Talk

Audience Intermediate, Advanced

11:30am CEST

Restless Guests: From Subscription to Backdoor Intruder

Friday May 30, 2025 11:30am - 12:15pm CEST

Room 113

Through novel research our team uncovered a critical vulnerability in Azure's guest user model, revealing that guest users can create and own subscriptions in external tenants they've joined—even without explicit privileges. This capability, which is often overlooked by Azure administrators, allows attackers to exploit these subscriptions to expand their access, move laterally within resource tenants, and create stealthy backdoor identities in the Entra directory. Alarmingly, Microsoft has confirmed real-world attacks using this method, highlighting a significant gap in many Azure threat models. This talk will share the findings from this first of its kind research into this exploit found in the wild.

We'll dive into how subscriptions, intended to act as security boundaries, make it possible for any guest to create and control a subscription undermines this premise. We'll provide examples of attackers leveraging this pathway to exploit known attack vectors to escalate privileges and establish persistent access, a threat most Azure admins do not anticipate when inviting guest users. While Microsoft plans to introduce preventative options in the future, this gap leaves organizations exposed to risks they may not even realize exist––but should definitely know about!

Speakers

Simon Maxwell-Stewart

Security Researcher and Data Scientist, BeyondTrust

Simon Maxwell-Stewart is a seasoned data scientist with over a decade of experience in big data environments and a passion for pushing the boundaries of analytics. A Physics graduate from the University of Oxford, Simon began his career tackling complex data challenges and has since... Read More →

Friday May 30, 2025 11:30am - 12:15pm CEST
Room 113

Breakout: Breaker Track

Audience Intermediate

11:30am CEST

Surviving prioritisation when CVE stands for “Customer Very Enthusiastic"

Friday May 30, 2025 11:30am - 12:15pm CEST

Room 114

Everybody talks about problems with the width of CVE space - too many, coming too fast, how to prioritise them. This talk takes the problem into 3D - let’s talk about the depth of the space!

How a single medium risk CVE can consume crazy amounts of time of an AppSec team?

We will look into couple of examples of CVEs in a product that my team protects and trace their journey through the ecosystem. On the journey we will meet various dragons, hydras, and other dangerous creatures:

- LLM-empowered scanners hallucinating CVSS scores, packages, versions, anything;
- Good research teams making mistakes translating between different versions of CVSS
- Glory-chasing “research teams” writing their own advisories for no apparent reason
- Consensus based approach in CVE ecosystem guarantees security team cannot sleep until EVERY scanner has calmed down;
- And my favourite troll under the bridge: customers saying “I don’t care it’s not reachable in your context, I can’t deploy your product until my scanner is happy”.

The soundtrack for the quest is provided by the vendors continuously messaging you with fantastic promises to solve everything.

Can your character survive the quest and what loot do you need?

Speakers

Irene Michlin

Application Security Lead, Neo4j

Irene Michlin is an application security lead at Neo4j. Before going into application security, Irene worked as software engineer, architect, and technical lead at companies ranging from startups to corporate giants. Her professional interests include securing development life-cycles... Read More →

Global AppSec BCL Surviving Prioritisation key

Friday May 30, 2025 11:30am - 12:15pm CEST
Room 114

Breakout: Defender Track

Audience Intermediate

1:15pm CEST

Abusing misconfigurations in CI/CD to hijack apps and clouds

Friday May 30, 2025 1:15pm - 2:00pm CEST

Room 113

Writing and maintaining secure applications is hard enough, and in today's paradigm with DevOps and CI/CD developers are often tasked with integrating and automating a full code-to-cloud pipeline. This introduces new control plane to application risks. Some of these instances can lead to full compromise if exploited by a threat actor.

In this talk we will break down the core components of a modern CI/CD-workflow such as OIDC, GitHub Actions and Workload Identities. Then we will describe the security properties of these components, and present a threat model for the code-to-cloud flow. Based on this we will showcase and demonstrate common flaws that could lead to full application and cloud compromise.

To increase the capacity of organizations to detect such flaws we will release an open source tool, developed by the presenters, to discover and triage these issues. In the session the tool will be demonstrated and discussed. Attendees will get actionable knowledge and tooling that can be applied when leaving the room. The talk and tool is based on findings and experiences from cloud and application security assessment conducted by the presenters.

Speakers

Håkon Nikolai Stange Sørum

Principal Security Architect and Partner, O3 Cyber

Håkon has extensive knowledge on implementing secure software development practices for modern DevOps teams, designing and implementing cloud security architectures, and securely operating cloud infrastructure. Håkon offers industry insights into the implementation of secure design... Read More →

Karim El-Melhaoui

Principal Security Architect at O3 Cyber, Microsoft Security MVP, O3 Cyber

Karim is a seasoned and renowned thought leader within cloud security. At O3 Cyber, he conducts research and development and works with our clients, primarily in Financial Industry. Karim has a background in building and operating platform services for security on private and public... Read More →

OWASP GlobalAppSec Abusing CICD to hijack apps and clouds pdf

Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 113

Breakout: Breaker Track

Audience Intermediate

1:15pm CEST

Scaling Threat Modeling with a Developer-Centric Approach

Friday May 30, 2025 1:15pm - 2:00pm CEST

Room 116+117

How can we make threat modeling scalable, actionable, and accessible for all stakeholders?

Traditional threat modeling methodologies struggle to scale in agile environments. They often result in over-scoped, resource-heavy processes that lack actionable insights and rely on scarce security expertise, limiting adoption in large organizations.

This talk introduces Rapid Developer-Driven Threat Modeling (RaD-TM), a lightweight, tool-agnostic approach designed for developers to embed threat modeling into the SDLC without relying on security experts. RaD-TM focuses on targeted assessments of specific functionalities rather than application-wide models, enabling iterative and efficient risk mitigation.
Using Risk Templates, which are predefined collections of relevant risks and controls tailored to specific contexts, RaD-TM fosters collaboration among stakeholders to build a scalable threat modeling process. This session will offer real-world examples and step-by-step guidance on integrating RaD-TM into the development workflow.

Speakers

Andrew Hainault

Managing Director, Aon Cyber Solutions

Andrew has over 25 years’ experience working in Information Security, Information Technology and Software Engineering, for public and private sector organisations in many sectors - including financial services / fintech, energy utilities, media, entertainment and insurance. With... Read More →

Andrea Scaduto

Secure coding, threat modeling, and ethical hacking

With a strong foundation in cybersecurity, Andrea holds an MSc in Computer Engineering, multiple IT Security certifications, and more than a decade of industry experience. His expertise spans breaking, building, and securing web, mobile, and cloud applications, with extensive knowledge... Read More →

20250530 Scaling Threat Modeling with a Developer Centric Approach pdf

Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 116+117 CCIB

Breakout: Builder Track

Audience Intermediate

1:15pm CEST

Scale Security Programs with Scorecarding

Friday May 30, 2025 1:15pm - 2:00pm CEST

Room 115

Security teams increasingly take a collaborative, partnership-based approach to securing their applications and organizations. Scaling these efforts requires thoughtfully distributing awareness and ownership of security risk. Scorecarding is used at leading companies to make security posture visible, actionable, and engaging across the entire organization.

In this session, we’ll dive into how companies like Netflix, Chime, GitHub, and DigitalOcean use scorecarding to distribute security ownership, drive continuous improvement, and align risk management with business goals. You’ll walk away with practical, tool-agnostic strategies for implementing your own scorecarding program that not only enhances security posture but fosters a culture of shared responsibility and proactive risk management.

Speakers

Rami McCarthy

Principal Security Researcher, Wiz

Rami is a practitioner with expertise in cloud security and building impactful security programs for startups and high-growth companies. In past roles, he helped build the Infrastructure Security program at Figma and scale security at Cedar, a health-tech unicorn. Rami regularly blogs... Read More →

Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 115

Breakout: Manager/Culture Track

Audience Intermediate

2:15pm CEST

OWASP Security Champions Guide Project

Friday May 30, 2025 2:15pm - 2:45pm CEST

Room 133-134

OWASP Security Champions Guide Project was started to create an open-source, vendor-neutral guidebook for Application Security professionals to help them build and improve their own successful Security Champion programs.

In this talk, Aleksandra will describe the main elements of the project and will guide you through the key principles of a successful Security Champions Program.

Regarding Security Champions programs, one size will not fit all – and as such our Project allows managers, security professionals or team leaders to pick and choose the elements their organization can adopt or leverage to create their own customized program.

Our Project team interviewed security leaders, program coordinators, and security champions to establish what makes a successful program. Participants represent a range of company sizes, industries, geographies, and also different levels of security program maturity. We want to know what works, what doesn’t work, what promotes success, and what leads to failure.

The principles have been drawn from an initial series of in-depth interviews with Application Security leaders from across the globe as part of our wider goal to provide a comprehensive Security Champions playbook.

The Ten Key Principles of a Successful Security Champions Program:
1. Be passionate about security
2. Start with a clear vision for your program
3. Secure management support
4. Nominate a dedicated captain
5. Trust your champions
6. Create a community
7. Promote knowledge sharing
8. Reward responsibility
9. Invest in your champions
10. Anticipate personnel changes

More about the Project:
- Existing Project webpage: https://owasp.org/www-project-security-champions-guidebook/
- New Project webpage: https://securitychampions.owasp.org/

Speakers

Aleksandra Kornecka

Security Engineer

Aleksandra is a security engineer with a global citizen mindset, unafraid to explore diverse destinations—both mentally and geographically. With a background in software testing and cognitive science, she brings a unique blend of technical and soft skills to the table.As a member... Read More →

Global AppSec BCL OWASP Security Champions Guide project Aleksandra Kornecka and the team pdf

Friday May 30, 2025 2:15pm - 2:45pm CEST
Room 133-134

Breakout: OWASP Project Showcase Track, Lightning Talk

Audience Intermediate

2:15pm CEST

Transaction authorization pitfalls – How to improve current financial, payment, and e-commerce apps?

Friday May 30, 2025 2:15pm - 3:00pm CEST

Room 116+117

During my career, I've had the opportunity to work with many financial institutions, payment processors, fintechs, and e-commerce operators. In recent years, the threat landscape for internet payments has changed significantly, since our smartphone has become the center of our digital life, financial transactions, and digital identity. Such concentration of power in a single asset has poor influence on overall security.

In my presentation, I will explore this dynamic threat landscape, show real-life vulnerabilities and threats, and discuss possible solutions to protect customers' funds. Additionally, I will examine the role of regulatory compliance in solving issues related to online payments.

My presentation will be divided into three parts.

In the first part of my presentation, I will show real-life threats and vulnerabilities affecting current transaction authorization processes, including technical and logical ones. I will present case studies of attacks that caused my relatives and friends to lose their money.

In the second part, I will discuss possible safeguards to raise the bar for attackers without compromising usability on many levels of user interaction:
- banking apps and systems, payments, fintechs
- e-commerce apps, social media apps, telecom operators
I will also demonstrate how developers, blue teams, and threat intelligence experts can cooperate to detect financial fraud at the application level and protect customers' funds.

In the third part, I will discuss whether current and upcoming financial sector regulations, such as DORA, PSD3, and PSR, address transaction authorization problems. I will also explore whether we as the IT security community can do more than just follow compliance rules.

Speakers

Wojciech Dworakowski

OWASP Poland Chapter Co-leader, Managing Partner, SecuRing

An IT Security Consultant with over 20 years of experience in the field. A Managing Partner at SecuRing. He has led multiple security assessments and penetration tests especially for financial services, payment systems, SaaS, and startups. A lecturer at many security conferences... Read More →

Friday May 30, 2025 2:15pm - 3:00pm CEST
Room 116+117 CCIB

Breakout: Builder Track

Audience Intermediate

2:15pm CEST

GenAI Security - Insights and Current Gaps in OS LLM Vulnerability Scanners and Guardrails

Friday May 30, 2025 2:15pm - 3:00pm CEST

Room 114

As Large Language Models (LLMs) become integral to various applications, securing them against evolving threats—such as **information leakage, jailbreak attacks, and prompt injection—**remains a critical challenge. This presentation provides a comparative analysis of open-source vulnerability scanners—Garak, Giskard, PyRIT, and CyberSecEval—that leverage red-teaming methodologies to uncover these risks. We explore their capabilities, limitations, and design principles, while conducting quantitative evaluations that expose key gaps in their ability to reliably detect attacks.

However, vulnerability detection alone is not enough. Proactive security measures, such as AI guardrails, are essential to mitigating real-world threats. We will discuss how guardrail mechanisms—including **input/output filtering, policy enforcement, and real-time anomaly detection—**can complement scanner-based assessments to create a holistic security approach for LLM deployments. Additionally, we present a preliminary labeled dataset, aimed at improving scanner effectiveness and enabling more robust guardrail implementations.

Beyond these tools, we will share our experience in developing a comprehensive GenAI security framework at Fujitsu, designed to integrate both scanning and guardrail solutions within an enterprise AI security strategy. This framework emphasizes multi-layered protection, balancing LLM risk assessments, red-teaming methodologies, and runtime defenses to proactively mitigate emerging threats.

Finally, based on our findings, we will provide strategic recommendations for organizations looking to enhance their LLM security posture, including:

Selecting the right scanners for red-teaming and vulnerability assessments
Implementing guardrails to ensure real-time policy enforcement and risk mitigation
Adopting a structured framework for securing GenAI systems at scale
This session aims to bridge theory and practice, equipping security professionals with actionable insights to fortify LLM deployments in real-world environments.

Speakers

Roman Vainshtein

GenAI Trust Team Lead, Fujitsu Research of Europe

I am the team-lead of Generative AI Trust and Security Research team at Fujitsu Research of Europe, where I lead efforts to enhance the security, trustworthiness, and resilience of Generative AI systems. My work focuses on bridging the gap between AI security, red-teaming methodologies... Read More →

OWASP APPSEC EU 25 (1) pptx

Friday May 30, 2025 2:15pm - 3:00pm CEST
Room 114

Breakout: Defender Track

Audience Intermediate

3:30pm CEST

When Regulation Backfires: How a Vulnerable Plugin Led to an XSS Pandemic

Friday May 30, 2025 3:30pm - 4:15pm CEST

Room 113

What began as a simple WAF bypass challenge on a single website turned into the discovery of a vulnerability affecting thousands of organizations. Join us in the journey of how an accessibility plugin, mandated by regulation, became the perfect vehicle for a widespread XSS vulnerability. We’ll explore the real-world impact of compromised sensitive systems, from government and military to healthcare and finance, showing how a single regulatory requirement led to an ecosystem-wide security breach.

We’ll also analyze the plugin’s source code to understand how and why this XSS vulnerability occurs, along with a behavior analysis that suggests the plugin may also be tracking users without consent, indicating potential malicious intent. Additionally, we’ll share the methodology and tools used to uncover and validate these vulnerabilities at scale.

Speakers

Eilon Cohen

Security Analyst, Checkmarx Research

That kid who took apart all his toys to see how they worked.Currently breaking (and fixing) things in the Research group at Checkmarx. Educational spans from Mechanical Engineering and Robotics to Computer science, but a self-made security personnel. Ex-IBM as a security engineer... Read More →

Ori Ron

Senior AppSec Researcher, Checkmarx

Ori Ron is a Senior Application Security Researcher at Checkmarx with over 8 years of experience. He works to find and help fix security vulnerabilities and enjoys sharing security knowledge through talks and write-ups. linkedin.com/in/ori-ron-40099912b/ checkmarx.com/author/or... Read More →

Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 113

Breakout: Breaker Track

Audience Intermediate

3:30pm CEST

LLM-Powered private Threat Modeling

Friday May 30, 2025 3:30pm - 4:15pm CEST

Room 116+117

In this session, we'll explore the development of an in-house threat modeling assistant that leverages Large Language Models through AWS Bedrock and Anthropic Claude. Learn how we're building a private solution that automates and streamlines the threat modeling process while keeping sensitive security data within our control. We'll demonstrate how this proof-of-concept tool combines LangChain and Streamlit to create an interactive threat modeling experience. Join us to see how modern AI technologies can enhance security analysis while maintaining data privacy.

Speakers

Murat Zhumagali

Principal Security Engineer, Progress ShareFile

Master in CS from University of Southern California 2013 - 2016Security intern at IBM summer 2016Security engineer at IBM 2017 - 2021Senior Security engineer at Fiddler AI 2021 - 2023Lead Security engineer at Jukebox 2023 - 2024Principal Security engineer at Progress ShareFile 2024... Read More →

Threat Modeling Copilot Murat Zhumagali OWASP pdf

Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 116+117 CCIB

Breakout: Builder Track

Audience Intermediate

3:30pm CEST

Know Thy Judge: Uncovering Vulnerabilities of AI Evaluators

Friday May 30, 2025 3:30pm - 4:15pm CEST

Room 114

Current methods for evaluating the safety of Large Language Models (LLMs) risk creating a false sense of security. Organizations deploying generative AI often rely on automated “judges” to detect safety violations like jailbreak attacks, as scaling evaluations with human experts is impractical. These judges—typically built with LLMs—underpin key safety processes such as offline benchmarking and automated red-teaming, as well as online guardrails designed to minimize risks from attacks. However, this raises a crucial question of meta-evaluation: can we trust the evaluations provided by these evaluators?

In this talk, we examine how popular LLM-as-judge systems were initially evaluated—typically using narrow datasets, constrained attack scenarios, and limited human validation—and why these approaches can fall short. We highlight two critical challenges: (i) evaluations in the wild, where factors like prompt sensitivity and distribution shifts can affect performance, and (ii) adversarial attacks that target the judges themselves. Through practical examples, we demonstrate how minor changes in data or attack strategies that do not affect the underlying safety nature of the model outputs can significantly reduce a judge’s ability to assess jailbreak success.

Our aim is to underscore the need for rigorous threat modeling and clearer applicability domains for LLM-as-judge systems. Without these measures, low attack success rates may not reliably indicate robust safety, leaving deployed models vulnerable to unseen risks.

Speakers

Francisco Girbal Eiras

Machine Learning Research Scientist, DynamoAI

Francisco is an ML Research Scientist at Dynamo AI, a leading startup building enterprise solutions that enable private, secure, and compliant generative AI systems. He earned his PhD in trustworthy machine learning from the University of Oxford as part of the Autonomous Intelligent... Read More →

Eliott Zemour

Senior ML Research Engineer, Dynamo AI

linkedin.com/in/eliott-zemour/

Dan Ross

Head of AI Compliance Strategy, Dynamo AI

Dan Ross, Head of AI Compliance Strategy at Dynamo AI, focuses on aligning artificial intelligence, policy, risk and security management, and business application. Prior to Dynamo, Dan spent close to a decade at Promontory Financial Group, a premier risk and regulatory advisory firm... Read More →

Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 114

Breakout: Defender Track

Audience Intermediate