Keynote: Discovering the Power of AI Pentesting with Pedro Conde (Ethiack)
The Talsec Mobile App Security Conference in Prague was a two-day, invite-only event on fraud, malware, and API abuse in modern mobile apps, held at Chateau St. Havel on November 3–4, 2025, and hosted by Talsec, freeRASP, and partners. It brought together leading experts and practitioners to strengthen the mobile AppSec community, connect engineers with attackers and defenders, and share practical techniques for high‑stakes sectors like banking, fintech, and e‑government.
Why AI Pentesting Now?
Pedro Conde, an AI Scientist at Ethiack specializing in autonomous ethical hacking, delivered a compelling presentation on the power of AI pentesting, outlining three key objectives: to demystify AI pentesting, to demonstrate the current capabilities of these systems, and to emphasize that AI systems are already very capable and "different from human beings".
Conde provided historical context for the rise of AI pentesting, tracing the progression from classical machine learning to deep learning, then to Large Language Models (LLMs), and finally to Agentic AI, the category AI pentesting systems fall into. Agentic AI systems often use LLMs as a base but can also interact with their environment, which takes them beyond reasoning, prediction, and generation. These fully autonomous ethical hacking systems, which Ethiack calls "hackbots," can perform a complete pentesting session, including finding vulnerabilities, without human intervention.
This autonomy offers advantages such as continuous 24/7 testing, high scalability through parallelization, and the ability to dynamically adapt to targets.
How Hackbots Work Under the Hood
Conde detailed the four main building blocks of robust hackbot systems: the 'brains' (multiple interacting LLMs for central reasoning, planning, and decision-making), the 'structure' (providing the skeleton for agents, coordinating them, managing memory, and ensuring efficiency), the 'prompts' (translating human objectives into agent behavior and ensuring goal alignment), and 'tools' (extending the agents' capabilities to interact with the environment, perform actions like running scripts, and validate outputs).
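To make the four building blocks more concrete, here is a minimal Python sketch of how they might fit together in a single agent loop. It is not Ethiack's implementation: the names `run_agent`, `scripted_brain`, and the `echo` tool are hypothetical, and the "brain" is a scripted stub standing in for one or more LLMs.

```python
from typing import Callable, Dict, List, Tuple

# A tool takes a string argument and returns a string result.
Tool = Callable[[str], str]
# A "brain" maps (prompt, memory) to (tool name, argument),
# or returns ("stop", "") when it considers the task done.
Brain = Callable[[str, List[str]], Tuple[str, str]]


def run_agent(brain: Brain, prompt: str, tools: Dict[str, Tool],
              max_steps: int = 5) -> List[str]:
    """The 'structure': coordinates the brain and tools and keeps memory."""
    memory: List[str] = []
    for _ in range(max_steps):  # hard step limit keeps the loop bounded
        tool_name, arg = brain(prompt, memory)
        if tool_name == "stop":
            break
        tool = tools.get(tool_name)
        result = tool(arg) if tool else f"unknown tool: {tool_name}"
        memory.append(f"{tool_name}({arg}) -> {result}")
    return memory


def scripted_brain(prompt: str, memory: List[str]) -> Tuple[str, str]:
    """Hypothetical stand-in for an LLM: one tool call, then stop."""
    if not memory:
        return ("echo", "hello")
    return ("stop", "")


# The 'tools': here a single harmless function instead of a real scanner.
tools: Dict[str, Tool] = {"echo": lambda s: s.upper()}

# The 'prompt': the human objective handed to the agent.
history = run_agent(scripted_brain, "Map the target", tools)
```

In this sketch the prompt and memory are the only inputs the brain sees, so swapping the stub for a real model call would not change the surrounding structure.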
A major limitation of AI systems, especially in pentesting, is 'AI hallucinations,' particularly false positives. Ethiack combats this by using deterministic tools and a specialized 'verifier' agent. The verifier takes a step back to reflect on the hackbot's reasoning, challenges and rechecks conclusions, and filters out weak or flawed inferences, which significantly decreases the false positive rate and increases precision.
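The combination of deterministic tools and a verifier can be sketched roughly as follows. This is an illustrative assumption about how such a pipeline might be wired, not Ethiack's actual design: `Finding`, `deterministic_check`, `verify`, and `skeptical_reviewer` are hypothetical names, and the reviewer is a plain function standing in for a second LLM agent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    title: str
    evidence: str  # raw output the hackbot cites as proof
    claim: str     # marker the hackbot says must appear in the evidence


def deterministic_check(finding: Finding) -> bool:
    """Rule-based check: the claimed marker must really be in the evidence."""
    return bool(finding.claim) and finding.claim in finding.evidence


def verify(findings: List[Finding],
           reviewer: Callable[[Finding], bool]) -> List[Finding]:
    """Keep only findings that pass both the deterministic check and the reviewer."""
    return [f for f in findings if deterministic_check(f) and reviewer(f)]


def skeptical_reviewer(finding: Finding) -> bool:
    """Hypothetical stand-in for the verifier agent: rejects empty evidence."""
    return len(finding.evidence.strip()) > 0


findings = [
    Finding("Command execution", "uid=33(www-data) gid=33(www-data)", "uid="),
    Finding("SQL injection", "HTTP 200 OK", "SQL syntax"),  # unsupported claim
]
confirmed = verify(findings, skeptical_reviewer)
```

Running the cheap deterministic check first means the more expensive reviewer only sees findings that already have some supporting evidence.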
Additionally, to prevent destructive behavior, a three-layered guardrail system is used: a prompt-level guardrail shaping model behavior with clear instructions, a deterministic filter for rule-based checks on environmental interactions, and a third-layer LLM agent for contextual judgment on complex cases.
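A layered guardrail of this kind could look something like the Python sketch below. The deny-list patterns, the `SYSTEM_PROMPT` text, and the `keyword_judge` function are all assumptions made for illustration. In a real system the third layer would be an LLM call rather than a keyword test.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Layer 1 (prompt level): instructions given to the model up front.
SYSTEM_PROMPT = (
    "You are an ethical hacking agent. "
    "Never run destructive or out-of-scope actions."
)

# Layer 2 (deterministic): hypothetical deny-list of obviously destructive commands.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    re.compile(r"\bshutdown\b", re.IGNORECASE),
]


@dataclass
class Verdict:
    allowed: bool
    layer: str   # which layer blocked the action, or "none"
    reason: str


def deterministic_filter(command: str) -> bool:
    """Return True when no deny-list pattern matches the command."""
    return not any(p.search(command) for p in DENY_PATTERNS)


def keyword_judge(command: str) -> bool:
    """Layer 3 stand-in: a real system would ask an LLM for contextual judgment."""
    return "--force" not in command


def check_action(command: str,
                 judge: Callable[[str], bool] = keyword_judge) -> Verdict:
    """Run the command through layers 2 and 3 before it reaches the environment."""
    if not deterministic_filter(command):
        return Verdict(False, "deterministic", "matched a deny-list rule")
    if not judge(command):
        return Verdict(False, "judge", "rejected by contextual judge")
    return Verdict(True, "none", "passed all checks")
```

For example, `check_action("rm -rf /var/www")` is blocked by the deterministic layer, while `check_action("git push --force")` passes the deny-list and is only stopped by the judge.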
Hackian: Real‑World Demo
The presentation featured a demonstration by Ethiack's hackbot, "Hackian," which recounted how it "absolutely demolished" a genetics research platform called Genequest during a DEF CON challenge. Hackian achieved a full system compromise in under four hours, finding two critical vulnerabilities, including one that neither human pentesters nor the challenge organizers were aware of.
Hackian first bypassed front-end registration restrictions by hitting the register endpoint directly, mapped the microservices ecosystem, and then exploited a debug endpoint in the DNA analysis service that was vulnerable to command execution.
The second critical bug allowed Hackian to read arbitrary files on the system (like /etc/passwd) by sending file paths to the /analyze endpoint, which passed them to the Clojure slurp function without validation.
Conde concluded that the core message is not that AI systems are or will be better than humans, but that they are different and find different types of vulnerabilities, sometimes finding "quirks that humans may disregard". Therefore, organizations must test their assets with these systems to prevent "bad guys" from exploiting them.
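On the defensive side, the arbitrary file read described above comes down to a user-supplied path reaching a file-reading function without validation. The affected service was not written in Python, so the following is only a sketch of the general mitigation. The function name `resolve_inside` and the `/srv/uploads` base directory are hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it.

    Requires Python 3.9+ for Path.is_relative_to.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check
    # below also catches inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate


# Hypothetical upload directory; only files below it may be read.
safe = resolve_inside("/srv/uploads", "sample.fasta")
```

Calling `resolve_inside("/srv/uploads", "../../etc/passwd")` or `resolve_inside("/srv/uploads", "/etc/passwd")` raises `ValueError` instead of returning a readable path.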
Currently, Ethiack's hackbot is focused on web applications, though future development may include mobile applications.
Thank you Pedro, Ethiack, and Hackian for showcasing how agentic AI can transform penetration testing and uncover vulnerabilities that traditional approaches miss. Your work pushes the boundaries of what ethical hacking can achieve and highlights why defenders must start thinking in terms of AI-native offensive capabilities as well!
Handle App Security with a Single Solution! Check Out Talsec's Premium Offer & Plan Comparison!
Plans Comparison
https://www.talsec.app/plans-comparison
Premium Products:
RASP+ - An advanced security SDK that actively shields your app from reverse engineering, tampering, rooting/jailbreaking, and runtime attacks like hooking or debugging.
AppiCrypt (Android & iOS) & AppiCrypt for Web - A backend defense system that verifies the integrity of the calling app and device to block bots, scripts, and unauthorized clients from accessing your API.
Malware Detection - Scans the user's device for known malicious packages, suspicious "clones," and risky permissions to prevent fraud and data theft.
Dynamic TLS Pinning - Prevents Man-in-the-Middle (MitM) attacks by validating server certificates that can be updated remotely without needing to publish a new app version.
Secret Vault - A secure storage solution that encrypts and obfuscates sensitive data (like API keys or tokens) to prevent them from being extracted during reverse engineering.

