Loading…
Audience: Intermediate clear filter
arrow_back View All Dates
Friday, May 30
 

10:30am CEST

A completely pluggable DevSecOps programme, for free, using community resources
Friday May 30, 2025 10:30am - 11:15am CEST
Despite our collective efforts, we haven’t managed to harmonize tools and processes. Several projects like ASVS, SAMM and others have attempted information harmony but only the now defunct Glue has attempted tool orchestration harmonization and for good reason, it is a hard problem to solve, almost impossible by volunteers alone.

This session introduces Smithy, the only open-source workflow engine for security tools. Smithy stands as a unifying force for building robust, scalable DevSecOps, and beyond, pipelines. Leveraging Smithy’s support for OCSF-native data formats, we centralized the outputs of disparate security tools into a cohesive data lake, unlocking actionable insights that improved vulnerability prioritization and resource allocation.

The talk will showcase real-world applications, including integrating OpenCRE, Cartography, AI-driven solutions and open-source resources to enhance vulnerability detection accuracy and reprioritization, for free, using ready made community resources.

Whether you're a tech lead, security engineer, or CISO, this presentation offers practical guidance for creating adaptable, data-driven security workflows without breaking the bank.
Speakers
avatar for Spyros Gasteratos

Spyros Gasteratos

Security Engineer & Architect, OWASP
Spyros has over 15 years of experience in the security world. Since the beginning of his career he has been an avid supporter and contributor of open source software and an OWASP volunteer. Currently he is interested in the harmonization of security tools and information and is currently... Read More →
Friday May 30, 2025 10:30am - 11:15am CEST
Room 116+117 CCIB

10:30am CEST

Think Before You Prompt: Securing Large Language Models from a Code Perspective
Friday May 30, 2025 10:30am - 11:15am CEST
As Large Language Models (LLMs) become integral to modern applications, securing them at the code level is critical to preventing prompt injection attacks, poisoned models, unauthorized modifications, and other vulnerabilities. This talk delves into common pitfalls and effective mitigations when integrating LLMs into software systems, whether working with cloud vendors or hosting your own models. By focusing on LLM security from a developer's perspective rather than runtime defenses, we emphasize a shift-left approach—embedding security early in the software development lifecycle to proactively mitigate threats and minimize risks before deployment.

We'll examine practical security challenges faced during LLM integration, including input sanitization, output validation, and model pinning. Through detailed code examples and a live demonstration of model tampering, attendees will witness firsthand how attackers can exploit inadequate security controls to compromise LLM systems. The demonstration will showcase a real-world scenario where a legitimate model is swapped with a malicious one, highlighting the critical importance of robust model integrity verification and secure deployment practices.

Participants will learn concrete implementation patterns and security controls that can prevent such attacks, with practical code samples they can apply to their own projects. The session will cover essential defensive techniques including proper API key management, secure model loading and validation, and safe handling of sensitive data in prompts. Whether you're building applications using cloud-based LLM services or deploying your own models, you'll leave with actionable code-level strategies to enhance your application's security posture and protect against emerging AI-specific threats.
Speakers
avatar for Yaron Avital

Yaron Avital

Security Researcher, Palo Alto Networks
Yaron Avital is a seasoned professional with a diverse background in the technology and cybersecurity fields. Yaron's career has spanned over 15 years in the private sector as a software engineer and team lead at global companies and startups.Driven by a passion for cybersecurity... Read More →
avatar for Tomer Segev

Tomer Segev

Security Researcher, Palo Alto Networks
 Tomer Segev is a cybersecurity professional with a strong background in software development and security research. He began his career at 17 as a developer before serving as a cyber researcher in the top cyber unit of the IDF, where he gained hands-on experience in the most advanced... Read More →
Friday May 30, 2025 10:30am - 11:15am CEST
Room 114

10:30am CEST

LLMs vs. SAST: How AI Delivers Accurate Vulnerability Detection and Reduces False Positives
Friday May 30, 2025 10:30am - 11:15am CEST
The exponentially growing number of software security vulnerabilities and data breaches highlights a persistent gap between the implementation of the secure development lifecycle and particularly secure coding practices and their intended outcomes. Despite significant financial investments in application security and the advancements in secure software development methodologies, the effectiveness of these practices remains inconsistent. Our session is based on a multi-phase and multi-year research, conducted in two global enterprise software companies and explores how a combination of developers' security education, organizational security climate, and metrics can enhance secure coding performance and reduce software vulnerabilities.

In December 2004, Steve Lipner introduced to the world the trustworthy computing security development lifecycle. A framework which included three main pillars: Requirements for repeatable secure development processes, requirements for engineers secure coding education and requirements for measurements and accountability for software security. Guided by this three-pillar framework , our research emphasizes the under-addressed areas of developer education and organizational accountability and measurements.

Through a series of three studies, conducted in two global software companies and led by the University of Haifa in Israel, this session will present the results of an academic research that made an attempt to identify the root cause for the ever increasing number of software security vulnerabilities and investigates the effectiveness of secure coding training, the impact of organizational security climate interventions, and the correlation between security climate and secure coding performance in order to evaluate whether the later two, which were prominently left in the shades, could provide a solution to the problem.

The first study evaluates the efficacy of secure coding training programs, revealing that while training improves knowledge, it fails to significantly to reduce newly introduced vulnerabilities. The second study demonstrates that targeted organizational interventions, including leadership communication and process improvements, significantly enhance organizational security climate. The final study found significant correlation between positive security climate and secure coding performance improvement, evidenced by a higher ratio of mitigated vulnerabilities.

This research provides actionable insights for both academia and industry. It underscores the importance of integrating secure coding education with organizational climate improvements to achieve measurable security outcomes. The findings offer a comprehensive approach to reducing cyber security risks while advocating for a dual focus on technical skills and cultural transformation within software development environments.
Speakers
avatar for Jonathan Santilli

Jonathan Santilli

Software Engineer and AI practitioner, Snyk
Jonathan Santilli defines himself as a problem solver, or at least he tries. With over 20 years of experience working for various tech companies, Jonathan has played different roles, from Team lead developer to Product manager and, of course, problem solver. Jonathan is mainly interested... Read More →
avatar for Kirill Efimov

Kirill Efimov

Security R&D Team Lead, Mobb.ai
 As a seasoned security researcher, I've led teams at Snyk and now helm security research at Mobb. With a wealth of publications and speaking engagements, I've delved deep into the intricacies of cybersecurity, unraveling vulnerabilities and crafting solutions. From pioneering research... Read More →
Friday May 30, 2025 10:30am - 11:15am CEST
Room 115

11:00am CEST

OWASP Certified Secure Developer Open Call
Friday May 30, 2025 11:00am - 11:45am CEST
Join Us in Shaping the Future of Secure Software Development

The OWASP Education and Training Committee is developing a certification program designed specifically for developers—and we need your expertise.

For the first time, this initiative will be showcased at OWASP Global AppSec EU 2025, and we’re inviting the community to help build the body of knowledge that will form the foundation of the certification curriculum.

If you're passionate about secure coding and developer education, this is your chance to contribute meaningfully to a global effort. Let’s build something that lasts—together.
Speakers
avatar for Shruti Kulkarni

Shruti Kulkarni

Information Security Architect, 6point6
Shruti is an information security / enterprise security architect with experience in ISO27001, PCI-DSS, policies, standards, security tools, threat modelling, risk assessments. Shruti works on security strategies and collaborates with cross-functional groups to implement information... Read More →
Friday May 30, 2025 11:00am - 11:45am CEST
Room 133-134

11:30am CEST

Navigating Agentic AI Security Risks: OWASP’s GenAI Guidance for Securing Autonomous AI Agents
Friday May 30, 2025 11:30am - 12:00pm CEST
As artificial intelligence advances, autonomous AI agents are becoming integral to modern applications, automating decision-making, problem-solving, and even interacting dynamically with users. However, this evolution brings new security challenges that traditional cybersecurity frameworks struggle to address. OWASP’s GenAI Security Project has identified Agentic Security Risks as a critical category of threats that can compromise AI-driven systems, leading to unintended actions, data leaks, model manipulation, and adversarial exploits.

This session will explore Agentic Security Risks—a unique class of vulnerabilities stemming from AI agents’ autonomy, adaptability, and ability to interact with complex environments. We’ll dissect how malicious actors can exploit these systems by influencing their decision-making processes, injecting harmful instructions, or leveraging prompt-based attacks to bypass safety constraints.

Through a deep dive into OWASP’s latest findings, attendees will gain practical insights into risk identification and mitigation strategies tailored for AI-driven agents. The talk will cover:

Understanding Agentic Security Risks: How autonomous AI agents process, reason, and act—and where vulnerabilities emerge.
Threat Modeling for AI Agents: Key security considerations when deploying AI-driven agents in enterprise and consumer applications.
Exploitable Weaknesses in AI Agents: Case studies on prompt injection, adversarial manipulation, data poisoning, and model exfiltration.
OWASP’s Mitigation Framework: Best practices for securing agentic AI systems, including robust validation, policy enforcement, access control, and behavioral monitoring.
Security by Design: How to integrate GenAI security principles into the development lifecycle to preemptively mitigate risks.
By the end of the session, attendees will have a structured approach to assessing and mitigating security risks in agentic AI systems. Whether you’re a developer, security professional, or AI architect, this session will equip you with actionable strategies to secure your AI-powered applications against emerging threats.

Join us to explore the cutting edge of AI security and ensure that autonomous agents work for us—not against us.
Speakers
avatar for John Sotiropoulos

John Sotiropoulos

Head of AI Security / OWASP GenAI Security Project (Top 10 for LLM & Agentic Security Co-Lead), Kainos
John Sotiropoulos is the head of AI Security at Kainos where he is responsible for AI security and securing national-scale systems in government, regulators, and healthcare.  John has gained extensive experience in building and securing systems in previous roles as developer, CTO... Read More →
Friday May 30, 2025 11:30am - 12:00pm CEST
Room 131-132

11:30am CEST

Restless Guests: From Subscription to Backdoor Intruder
Friday May 30, 2025 11:30am - 12:15pm CEST
Through novel research our team uncovered a critical vulnerability in Azure's guest user model, revealing that guest users can create and own subscriptions in external tenants they've joined—even without explicit privileges. This capability, which is often overlooked by Azure administrators, allows attackers to exploit these subscriptions to expand their access, move laterally within resource tenants, and create stealthy backdoor identities in the Entra directory. Alarmingly, Microsoft has confirmed real-world attacks using this method, highlighting a significant gap in many Azure threat models. This talk will share the findings from this first of its kind research into this exploit found in the wild.

We'll dive into how subscriptions, intended to act as security boundaries, make it possible for any guest to create and control a subscription undermines this premise. We'll provide examples of attackers leveraging this pathway to exploit known attack vectors to escalate privileges and establish persistent access, a threat most Azure admins do not anticipate when inviting guest users. While Microsoft plans to introduce preventative options in the future, this gap leaves organizations exposed to risks they may not even realize exist––but should definitely know about!
Speakers
avatar for Simon Maxwell-Stewart

Simon Maxwell-Stewart

Security Researcher and Data Scientist, BeyondTrust
Simon Maxwell-Stewart is a seasoned data scientist with over a decade of experience in big data environments and a passion for pushing the boundaries of analytics. A Physics graduate from the University of Oxford, Simon began his career tackling complex data challenges and has since... Read More →
Friday May 30, 2025 11:30am - 12:15pm CEST
Room 113

11:30am CEST

Surviving prioritisation when CVE stands for “Customer Very Enthusiastic"
Friday May 30, 2025 11:30am - 12:15pm CEST
Everybody talks about problems with the width of CVE space - too many, coming too fast, how to prioritise them. This talk takes the problem into 3D - let’s talk about the depth of the space!

How a single medium risk CVE can consume crazy amounts of time of an AppSec team?

We will look into couple of examples of CVEs in a product that my team protects and trace their journey through the ecosystem. On the journey we will meet various dragons, hydras, and other dangerous creatures:

- LLM-empowered scanners hallucinating CVSS scores, packages, versions, anything;
- Good research teams making mistakes translating between different versions of CVSS
- Glory-chasing “research teams” writing their own advisories for no apparent reason
- Consensus based approach in CVE ecosystem guarantees security team cannot sleep until EVERY scanner has calmed down;
- And my favourite troll under the bridge: customers saying “I don’t care it’s not reachable in your context, I can’t deploy your product until my scanner is happy”.

The soundtrack for the quest is provided by the vendors continuously messaging you with fantastic promises to solve everything.

Can your character survive the quest and what loot do you need?
Speakers
avatar for Irene Michlin

Irene Michlin

Application Security Lead, Neo4j
Irene Michlin is an application security lead at Neo4j. Before going into application security, Irene worked as software engineer, architect, and technical lead at companies ranging from startups to corporate giants. Her professional interests include securing development life-cycles... Read More →
Friday May 30, 2025 11:30am - 12:15pm CEST
Room 114

1:15pm CEST

Abusing misconfigurations in CI/CD to hijack apps and clouds
Friday May 30, 2025 1:15pm - 2:00pm CEST
Writing and maintaining secure applications is hard enough, and in today's paradigm with DevOps and CI/CD developers are often tasked with integrating and automating a full code-to-cloud pipeline. This introduces new control plane to application risks. Some of these instances can lead to full compromise if exploited by a threat actor.

In this talk we will break down the core components of a modern CI/CD-workflow such as OIDC, GitHub Actions and Workload Identities. Then we will describe the security properties of these components, and present a threat model for the code-to-cloud flow. Based on this we will showcase and demonstrate common flaws that could lead to full application and cloud compromise.

To increase the capacity of organizations to detect such flaws we will release an open source tool, developed by the presenters, to discover and triage these issues. In the session the tool will be demonstrated and discussed. Attendees will get actionable knowledge and tooling that can be applied when leaving the room. The talk and tool is based on findings and experiences from cloud and application security assessment conducted by the presenters.
Speakers
avatar for Håkon Nikolai Stange Sørum

Håkon Nikolai Stange Sørum

Principal Security Architect and Partner, O3 Cyber
Håkon has extensive knowledge on implementing secure software development practices for modern DevOps teams, designing and implementing cloud security architectures, and securely operating cloud infrastructure. Håkon offers industry insights into the implementation of secure design... Read More →
avatar for Karim El-Melhaoui

Karim El-Melhaoui

Principal Security Architect at O3 Cyber, Microsoft Security MVP, O3 Cyber
Karim is a seasoned and renowned thought leader within cloud security. At O3 Cyber, he conducts research and development and works with our clients, primarily in Financial Industry. Karim has a background in building and operating platform services for security on private and public... Read More →
Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 113

1:15pm CEST

Scaling Threat Modeling with a Developer-Centric Approach
Friday May 30, 2025 1:15pm - 2:00pm CEST
How can we make threat modeling scalable, actionable, and accessible for all stakeholders?

Traditional threat modeling methodologies struggle to scale in agile environments. They often result in over-scoped, resource-heavy processes that lack actionable insights and rely on scarce security expertise, limiting adoption in large organizations.

This talk introduces Rapid Developer-Driven Threat Modeling (RaD-TM), a lightweight, tool-agnostic approach designed for developers to embed threat modeling into the SDLC without relying on security experts. RaD-TM focuses on targeted assessments of specific functionalities rather than application-wide models, enabling iterative and efficient risk mitigation.
Using Risk Templates, which are predefined collections of relevant risks and controls tailored to specific contexts, RaD-TM fosters collaboration among stakeholders to build a scalable threat modeling process. This session will offer real-world examples and step-by-step guidance on integrating RaD-TM into the development workflow.
Speakers
avatar for Andrew Hainault

Andrew Hainault

Managing Director, Aon Cyber Solutions
Andrew has over 25 years’ experience working in Information Security, Information Technology and Software Engineering, for public and private sector organisations in many sectors - including financial services / fintech, energy utilities, media, entertainment and insurance. With... Read More →
avatar for Andrea Scaduto

Andrea Scaduto

Secure coding, threat modeling, and ethical hacking
With a strong foundation in cybersecurity, Andrea holds an MSc in Computer Engineering, multiple IT Security certifications, and more than a decade of industry experience. His expertise spans breaking, building, and securing web, mobile, and cloud applications, with extensive knowledge... Read More →
Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 116+117 CCIB

1:15pm CEST

Scale Security Programs with Scorecarding
Friday May 30, 2025 1:15pm - 2:00pm CEST
Security teams increasingly take a collaborative, partnership-based approach to securing their applications and organizations. Scaling these efforts requires thoughtfully distributing awareness and ownership of security risk. Scorecarding is used at leading companies to make security posture visible, actionable, and engaging across the entire organization.

In this session, we’ll dive into how companies like Netflix, Chime, GitHub, and DigitalOcean use scorecarding to distribute security ownership, drive continuous improvement, and align risk management with business goals. You’ll walk away with practical, tool-agnostic strategies for implementing your own scorecarding program that not only enhances security posture but fosters a culture of shared responsibility and proactive risk management.
Speakers
avatar for Rami McCarthy

Rami McCarthy

Principal Security Researcher, Wiz
Rami is a practitioner with expertise in cloud security and building impactful security programs for startups and high-growth companies. In past roles, he helped build the Infrastructure Security program at Figma and scale security at Cedar, a health-tech unicorn. Rami regularly blogs... Read More →
Friday May 30, 2025 1:15pm - 2:00pm CEST
Room 115

2:15pm CEST

OWASP Security Champions Guide Project
Friday May 30, 2025 2:15pm - 2:45pm CEST
OWASP Security Champions Guide Project was started to create an open-source, vendor-neutral guidebook for Application Security professionals to help them build and improve their own successful Security Champion programs.

In this talk, Aleksandra will describe the main elements of the project and will guide you through the key principles of a successful Security Champions Program.

Regarding Security Champions programs, one size will not fit all – and as such our Project allows managers, security professionals or team leaders to pick and choose the elements their organization can adopt or leverage to create their own customized program.

Our Project team interviewed security leaders, program coordinators, and security champions to establish what makes a successful program. Participants represent a range of company sizes, industries, geographies, and also different levels of security program maturity. We want to know what works, what doesn’t work, what promotes success, and what leads to failure.

The principles have been drawn from an initial series of in-depth interviews with Application Security leaders from across the globe as part of our wider goal to provide a comprehensive Security Champions playbook.

The Ten Key Principles of a Successful Security Champions Program:
1. Be passionate about security
2. Start with a clear vision for your program
3. Secure management support
4. Nominate a dedicated captain
5. Trust your champions
6. Create a community
7. Promote knowledge sharing
8. Reward responsibility
9. Invest in your champions
10. Anticipate personnel changes

More about the Project:
- Existing Project webpage: https://owasp.org/www-project-security-champions-guidebook/
- New Project webpage: https://securitychampions.owasp.org/
Speakers
avatar for Aleksandra Kornecka

Aleksandra Kornecka

Security Engineer
Aleksandra is a security engineer with a global citizen mindset, unafraid to explore diverse destinations—both mentally and geographically. With a background in software testing and cognitive science, she brings a unique blend of technical and soft skills to the table.As a member... Read More →
Friday May 30, 2025 2:15pm - 2:45pm CEST
Room 131-132

2:15pm CEST

Transaction authorization pitfalls – How to improve current financial, payment, and e-commerce apps?
Friday May 30, 2025 2:15pm - 3:00pm CEST
During my career, I've had the opportunity to work with many financial institutions, payment processors, fintechs, and e-commerce operators. In recent years, the threat landscape for internet payments has changed significantly, since our smartphone has become the center of our digital life, financial transactions, and digital identity. Such concentration of power in a single asset has poor influence on overall security.

In my presentation, I will explore this dynamic threat landscape, show real-life vulnerabilities and threats, and discuss possible solutions to protect customers' funds. Additionally, I will examine the role of regulatory compliance in solving issues related to online payments.

My presentation will be divided into three parts.

In the first part of my presentation, I will show real-life threats and vulnerabilities affecting current transaction authorization processes, including technical and logical ones. I will present case studies of attacks that caused my relatives and friends to lose their money.

In the second part, I will discuss possible safeguards to raise the bar for attackers without compromising usability on many levels of user interaction:
- banking apps and systems, payments, fintechs
- e-commerce apps, social media apps, telecom operators
I will also demonstrate how developers, blue teams, and threat intelligence experts can cooperate to detect financial fraud at the application level and protect customers' funds.

In the third part, I will discuss whether current and upcoming financial sector regulations, such as DORA, PSD3, and PSR, address transaction authorization problems. I will also explore whether we as the IT security community can do more than just follow compliance rules.
Speakers
avatar for Wojciech Dworakowski

Wojciech Dworakowski

OWASP Poland Chapter Co-leader, Managing Partner, SecuRing
An IT Security Consultant with over 20 years of experience in the field. A Managing Partner at SecuRing. He has led multiple security assessments and penetration tests especially for financial services, payment systems, SaaS, and startups. A lecturer at many security conferences... Read More →
Friday May 30, 2025 2:15pm - 3:00pm CEST
Room 116+117 CCIB

2:15pm CEST

GenAI Security - Insights and Current Gaps in OS LLM Vulnerability scanners and Guardrails
Friday May 30, 2025 2:15pm - 3:00pm CEST
As Large Language Models (LLMs) become integral to various applications, securing them against evolving threats—such as **information leakage, jailbreak attacks, and prompt injection—**remains a critical challenge. This presentation provides a comparative analysis of open-source vulnerability scanners—Garak, Giskard, PyRIT, and CyberSecEval—that leverage red-teaming methodologies to uncover these risks. We explore their capabilities, limitations, and design principles, while conducting quantitative evaluations that expose key gaps in their ability to reliably detect attacks.

However, vulnerability detection alone is not enough. Proactive security measures, such as AI guardrails, are essential to mitigating real-world threats. We will discuss how guardrail mechanisms—including **input/output filtering, policy enforcement, and real-time anomaly detection—**can complement scanner-based assessments to create a holistic security approach for LLM deployments. Additionally, we present a preliminary labeled dataset, aimed at improving scanner effectiveness and enabling more robust guardrail implementations.

Beyond these tools, we will share our experience in developing a comprehensive GenAI security framework at Fujitsu, designed to integrate both scanning and guardrail solutions within an enterprise AI security strategy. This framework emphasizes multi-layered protection, balancing LLM risk assessments, red-teaming methodologies, and runtime defenses to proactively mitigate emerging threats.

Finally, based on our findings, we will provide strategic recommendations for organizations looking to enhance their LLM security posture, including:

Selecting the right scanners for red-teaming and vulnerability assessments
Implementing guardrails to ensure real-time policy enforcement and risk mitigation
Adopting a structured framework for securing GenAI systems at scale
This session aims to bridge theory and practice, equipping security professionals with actionable insights to fortify LLM deployments in real-world environments.
Speakers
avatar for Roman Vainshtein

Roman Vainshtein

Head of the GenAI Trust, Fujitsu Research of Europe
I am the Head of the Generative AI Trust and Security Research team at Fujitsu Research of Europe, where I lead efforts to enhance the security, trustworthiness, and resilience of Generative AI systems. My work focuses on bridging the gap between AI security, red-teaming methodologies... Read More →
Friday May 30, 2025 2:15pm - 3:00pm CEST
Room 114

3:30pm CEST

When Regulation Backfires: How a Vulnerable Plugin Led to an XSS Pandemic
Friday May 30, 2025 3:30pm - 4:15pm CEST
What began as a simple WAF bypass challenge on a single website turned into the discovery of a vulnerability affecting thousands of organizations. Join us in the journey of how an accessibility plugin, mandated by regulation, became the perfect vehicle for a widespread XSS vulnerability. We’ll explore the real-world impact of compromised sensitive systems, from government and military to healthcare and finance, showing how a single regulatory requirement led to an ecosystem-wide security breach.

We’ll also analyze the plugin’s source code to understand how and why this XSS vulnerability occurs, along with a behavior analysis that suggests the plugin may also be tracking users without consent, indicating potential malicious intent. Additionally, we’ll share the methodology and tools used to uncover and validate these vulnerabilities at scale.
Speakers
avatar for Eilon Cohen

Eilon Cohen

Security Analyst, Checkmarx Research
That kid who took apart all his toys to see how they worked.Currently breaking (and fixing) things in the Research group at Checkmarx. Educational spans from Mechanical Engineering and Robotics to Computer science, but a self-made security personnel. Ex-IBM as a security engineer... Read More →
avatar for Ori Ron

Ori Ron

Senior AppSec Researcher, Checkmarx
Ori Ron is a Senior Application Security Researcher at Checkmarx with over 8 years of experience. He works to find and help fix security vulnerabilities and enjoys sharing security knowledge through talks and write-ups. linkedin.com/in/ori-ron-40099912b/ checkmarx.com/author/or... Read More →
Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 113

3:30pm CEST

LLM-Powered private Threat Modeling
Friday May 30, 2025 3:30pm - 4:15pm CEST
In this session, we'll explore the development of an in-house threat modeling assistant that leverages Large Language Models through AWS Bedrock and Anthropic Claude. Learn how we're building a private solution that automates and streamlines the threat modeling process while keeping sensitive security data within our control. We'll demonstrate how this proof-of-concept tool combines LangChain and Streamlit to create an interactive threat modeling experience. Join us to see how modern AI technologies can enhance security analysis while maintaining data privacy.
Speakers
avatar for Murat Zhumagali

Murat Zhumagali

Principal Security Engineer, Progress ShareFile
Master in CS from University of Southern California 2013 - 2016Security intern at IBM summer 2016Security engineer at IBM 2017 - 2021Senior Security engineer at Fiddler AI 2021 - 2023Lead Security engineer at Jukebox 2023 - 2024Principal Security engineer at Progress ShareFile 2024... Read More →
Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 116+117 CCIB

3:30pm CEST

Know Thy Judge: Uncovering Vulnerabilities of AI Evaluators
Friday May 30, 2025 3:30pm - 4:15pm CEST
Current methods for evaluating the safety of Large Language Models (LLMs) risk creating a false sense of security. Organizations deploying generative AI often rely on automated “judges” to detect safety violations like jailbreak attacks, as scaling evaluations with human experts is impractical. These judges—typically built with LLMs—underpin key safety processes such as offline benchmarking and automated red-teaming, as well as online guardrails designed to minimize risks from attacks. However, this raises a crucial question of meta-evaluation: can we trust the evaluations provided by these evaluators?

In this talk, we examine how popular LLM-as-judge systems were initially evaluated—typically using narrow datasets, constrained attack scenarios, and limited human validation—and why these approaches can fall short. We highlight two critical challenges: (i) evaluations in the wild, where factors like prompt sensitivity and distribution shifts can affect performance, and (ii) adversarial attacks that target the judges themselves. Through practical examples, we demonstrate how minor changes in data or attack strategies that do not affect the underlying safety nature of the model outputs can significantly reduce a judge’s ability to assess jailbreak success.

Our aim is to underscore the need for rigorous threat modeling and clearer applicability domains for LLM-as-judge systems. Without these measures, low attack success rates may not reliably indicate robust safety, leaving deployed models vulnerable to unseen risks.
Speakers
avatar for Francisco Girbal Eiras

Francisco Girbal Eiras

Machine Learning Research Scientist, DynamoAI
Francisco is an ML Research Scientist at Dynamo AI, a leading startup building enterprise solutions that enable private, secure, and compliant generative AI systems. He earned his PhD in trustworthy machine learning from the University of Oxford as part of the Autonomous Intelligent... Read More →
avatar for Eliott Zemour

Eliott Zemour

Senior ML Research Engineer, Dynamo AI
DR

Dan Ross

Head of AI Compliance Strategy, Dynamo AI
Dan Ross, Head of AI Compliance Strategy at Dynamo AI, focuses on aligning artificial intelligence, policy, risk and security management, and business application. Prior to Dynamo, Dan spent close to a decade at Promontory Financial Group, a premier risk and regulatory advisory firm... Read More →
Friday May 30, 2025 3:30pm - 4:15pm CEST
Room 114
 
Share Modal

Share this link via

Or copy link

Filter sessions
Apply filters to sessions.
Filtered by Date -