Perplexity AI

Summary

NEW DELHI: Perplexity AI has announced a major security advancement in the field of AI safety, with CEO Aravind Srinivas revealing that the company has…

NEW DELHI: Perplexity AI has announced a major security advancement in the field of AI safety, with CEO Aravind Srinivas revealing that the company has fine-tuned a version of the Qwen3-30B model capable of scanning raw HTML and detecting prompt injection attacks before a user initiates any request.

In a post shared on X, Srinivas said the enhanced model is designed to strengthen the Comet Assistant by preventing malicious manipulations in real-time at the client level.

Prompt injection attacks are an emerging cybersecurity risk where attackers embed harmful text into web content to manipulate AI systems into leaking information, executing unauthorised tasks, or bypassing safety filters.

“We’ve fine-tuned a version of Qwen3-30B that can scan raw HTML and detect prompt injection attacks even before a user initiates any request to the Comet Assistant on the client,” Srinivas wrote.

The company has also released BrowseSafe-Bench, a public benchmark of simulated attacks that allows developers and researchers to evaluate AI vulnerability under real-world adversarial conditions.

Perplexity aims to accelerate collaboration across the AI safety community through open benchmarking.

Performance Benchmark Results

A performance chart attached to the announcement shows the fine-tuned BrowseSafe model leading the F1 score rankings at 90.4%, outperforming several major AI systems including:

  • GPT-5 (Medium Reasoning) – 85.5%

  • GPT-5 – 84.9%

  • Sonnet 4.5 – 80.7–86.3%

  • Haiku 4.5 – 81.0%

  • PromptGuard-2, which scored significantly lower at 35–36%

Srinivas emphasized that the release is part of an ongoing mission to make AI assistants safer without compromising usability, writing that the goal is to continue “making Comet safe and secure to use while benefiting from all the agency and utility it offers to all users.”

The announcement comes as major AI developers race to safeguard models against jailbreaks, adversarial attacks, and manipulation threats—issues that have gained urgency as AI systems become deeply integrated into consumer and enterprise applications.

Perplexity’s approach of open-sourcing both the model and testing framework appears positioned to influence broader industry standards on transparent safety evaluation.