fix: improve XSS context analyzer for edge cases #7089

Open
MekonMAC wants to merge 2 commits into projectdiscovery:dev from MekonMAC:fix/xss-context-edge-cases

Conversation

@MekonMAC MekonMAC commented Mar 3, 2026

Summary

This PR addresses the four context-classification edge cases reported in #7086, improving the accuracy of the XSS context analyzer introduced in #7076.

Note: This PR depends on #7076 being merged first, as it fixes issues in the XSS context analyzer code introduced there.

Changes

  1. javascript: URIs — Attributes like href="javascript:..." are now correctly classified as ContextScript instead of ContextAttribute. Applies to href, src, action, formaction, xlink:href, data, and poster attributes.

  2. Non-executable script types: script blocks whose type is non-executable, such as <script type="application/json">, application/ld+json, text/template, and importmap, are now classified as ContextNone instead of an executable script context.

  3. Case-insensitive marker detection — The initial marker presence check is now case-insensitive, catching transformed reflections where the server may uppercase or lowercase the input.

  4. srcdoc attributes — The srcdoc attribute is now treated as ContextHTMLText (HTML injection context) since it allows full HTML content in iframes.
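
As a rough illustration of changes 1 and 4, the attribute-level decision could look like the sketch below. This is not the code from context.go: `classifyAttribute`, `isJavaScriptURI`, and `uriAttrs` are hypothetical names, the `Context` values are placeholders, and the whitespace handling reflects an assumption about how browsers normalize a URL scheme (leading control characters and embedded tabs or newlines are ignored).

```go
package main

import (
	"fmt"
	"strings"
)

// Context is a stand-in for the analyzer's context enum; values are illustrative.
type Context int

const (
	ContextAttribute Context = iota
	ContextScript
	ContextHTMLText
)

// uriAttrs holds the URI-bearing attributes listed in change 1.
var uriAttrs = map[string]bool{
	"href": true, "src": true, "action": true, "formaction": true,
	"xlink:href": true, "data": true, "poster": true,
}

// isJavaScriptURI reports whether value begins with the javascript: scheme,
// ignoring case, leading spaces/control characters, and tabs or newlines
// embedded in the value (assumed browser normalization).
func isJavaScriptURI(value string) bool {
	v := strings.TrimLeftFunc(value, func(r rune) bool { return r <= 0x20 })
	v = strings.Map(func(r rune) rune {
		if r == '\t' || r == '\n' || r == '\r' {
			return -1 // drop the character
		}
		return r
	}, v)
	const scheme = "javascript:"
	return len(v) >= len(scheme) && strings.EqualFold(v[:len(scheme)], scheme)
}

// classifyAttribute maps an attribute name/value pair to a context
// (hypothetical helper, not the PR's actual function).
func classifyAttribute(name, value string) Context {
	n := strings.ToLower(name)
	switch {
	case n == "srcdoc":
		return ContextHTMLText // srcdoc carries a full HTML document
	case uriAttrs[n] && isJavaScriptURI(value):
		return ContextScript
	default:
		return ContextAttribute
	}
}

func main() {
	fmt.Println(classifyAttribute("href", " JavaScript:alert(1)") == ContextScript)
	fmt.Println(classifyAttribute("srcdoc", "<script>x</script>") == ContextHTMLText)
	fmt.Println(classifyAttribute("title", "javascript:alert(1)") == ContextAttribute)
}
```

Restricting the scheme check to a fixed attribute list keeps ordinary attributes such as title in ContextAttribute even when their value happens to start with javascript:.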

Test Plan

  • Added 15+ new test cases covering all edge cases
  • All existing tests continue to pass
  • Tests verify correct context classification for:
    • javascript: URIs with various casing and whitespace
    • JSON and template script blocks
    • Case-insensitive marker detection
    • srcdoc attribute handling

Before/After

| Case | Before | After |
|------|--------|-------|
| `<a href="javascript:alert(canary)">` | ContextAttribute | ContextScript |
| `<script type="application/json">canary</script>` | ContextScript | ContextNone |
| Body: CANARY, marker: canary | Not detected | Detected |
| `<iframe srcdoc="<script>canary">` | ContextAttribute | ContextHTMLText |
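
The script-type case in the comparison above comes down to deciding whether a type attribute names something a browser would execute. A minimal sketch of that decision follows, assuming the HTML Standard's rule that an absent or empty type, a JavaScript MIME type essence, or "module" is executable and everything else is a data block. `isExecutableScriptType` and `jsMIMETypes` are hypothetical names, not the PR's identifiers.

```go
package main

import (
	"fmt"
	"strings"
)

// jsMIMETypes lists the JavaScript MIME type essences that make a classic
// script executable (assumed from the HTML/MIME Sniffing standards).
var jsMIMETypes = map[string]bool{
	"application/ecmascript": true, "application/javascript": true,
	"application/x-ecmascript": true, "application/x-javascript": true,
	"text/ecmascript": true, "text/javascript": true,
	"text/javascript1.0": true, "text/javascript1.1": true,
	"text/javascript1.2": true, "text/javascript1.3": true,
	"text/javascript1.4": true, "text/javascript1.5": true,
	"text/jscript": true, "text/livescript": true,
	"text/x-ecmascript": true, "text/x-javascript": true,
}

// isExecutableScriptType reports whether a <script> element with the given
// type attribute value would run as JavaScript (hypothetical helper).
func isExecutableScriptType(typeAttr string) bool {
	t := strings.ToLower(strings.TrimSpace(typeAttr))
	return t == "" || t == "module" || jsMIMETypes[t]
}

func main() {
	for _, t := range []string{"", "module", "text/javascript", "application/json", "importmap"} {
		fmt.Printf("%q executable: %v\n", t, isExecutableScriptType(t))
	}
}
```

An allowlist of executable types is used here rather than a denylist of data types, so unknown values such as text/template default to non-executable, which is the ContextNone outcome shown in the table.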

Fixes #7086

🤖 Generated with OpenClaw

Summary by CodeRabbit

Release Notes

  • New Features

    • Added XSS context analyzer that detects HTML reflections and generates context-appropriate test payloads.
    • Enhanced HTTP response handling to analyze response body, headers, and status codes for improved fuzzing accuracy.
  • Improvements

    • Improved thread-safety for concurrent random utility generation in fuzzing operations.

dejan-v1007 and others added 2 commits February 28, 2026 19:54
Implements an XSS context analyzer that integrates with nuclei's
fuzzing pipeline. The analyzer:

- Injects a canary string with XSS-critical characters (<>"'/) to
  detect reflection and character survival in HTTP responses
- Uses golang.org/x/net/html tokenizer to classify reflection context
  into 8 types: HTMLText, Attribute, AttributeUnquoted, Script,
  ScriptString, Style, HTMLComment, and None
- Selects context-appropriate XSS payloads filtered by which special
  characters survive server-side encoding
- Replays payloads and verifies unencoded reflection to confirm XSS
- Reports CSP header presence as a note on exploitability

New files:
  pkg/fuzz/analyzers/xss/types.go    - context types and constants
  pkg/fuzz/analyzers/xss/context.go  - HTML tokenizer context detection
  pkg/fuzz/analyzers/xss/analyzer.go - main analyzer with replay logic
  pkg/fuzz/analyzers/xss/context_test.go - comprehensive tests

Modified files:
  pkg/fuzz/analyzers/analyzers.go    - add response fields to Options
  pkg/protocols/http/http.go         - register xss analyzer via import
  pkg/protocols/http/request.go      - pass response data to analyzer
  pkg/protocols/http/request_fuzz.go - init nil Parameters map

Closes projectdiscovery#5838

This commit addresses four context-classification issues:

1. javascript: URIs in href/src/action attributes are now correctly
   classified as ContextScript instead of ContextAttribute

2. <script type="application/json"> and other non-executable script
   types (ld+json, templates, importmap) are now classified as
   ContextNone instead of executable script context

3. Marker reflection detection is now case-insensitive, catching
   transformed reflections where server may uppercase/lowercase

4. srcdoc attributes are now treated as HTML injection context
   (ContextHTMLText) since they allow full HTML content

Includes comprehensive test coverage for all new behaviors.

Fixes projectdiscovery#7086
@auto-assign auto-assign bot requested a review from dogancanbakir March 3, 2026 07:20
neo-by-projectdiscovery-dev bot commented Mar 3, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Improves XSS context analyzer to correctly detect javascript: URIs as executable script contexts
  • Adds detection for non-executable script types (JSON, templates, importmap) to reduce false positives
  • Implements case-insensitive marker detection to catch transformed reflections
  • Correctly classifies srcdoc attribute as HTML injection context

Hardening Notes

  • The Semgrep finding about math/rand usage in analyzers.go:4 is not exploitable in this context. The random number generator is used solely for generating unique test markers (canaries) and payload identifiers during fuzzing operations, not for security-sensitive purposes like session tokens, CSRF tokens, or cryptographic operations. This is an acceptable use case for math/rand.


coderabbitai bot commented Mar 3, 2026

Walkthrough

This pull request introduces a complete XSS context analyzer for the fuzzing engine, including thread-safe random utilities, HTTP response tracking in Options, HTML reflection detection, context classification, and payload selection logic. The analyzer registers under "xss_context" and performs context-aware XSS verification.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Core Analyzer Infrastructure (pkg/fuzz/analyzers/analyzers.go) | Added mutex-protected random utilities (RandStringBytesMask, GetRandomInteger), expanded Options struct with ResponseBody, ResponseHeaders, and ResponseStatusCode fields, introduced xss_context to AnalyzerTemplate.Name values, updated payload transformations to use new utilities. |
| XSS Analyzer Implementation (pkg/fuzz/analyzers/xss/analyzer.go) | Implemented new Analyzer type with Name(), ApplyInitialTransformation(), and Analyze() methods. Performs initial canary replacement, extracts canary from responses, detects reflection contexts via HTML tokenizer, selects context-appropriate payloads, and replays requests to verify reflection and detect XSS vulnerabilities. |
| XSS Reflection Detection and Types (pkg/fuzz/analyzers/xss/context.go, pkg/fuzz/analyzers/xss/types.go) | Added DetectReflections() with HTML tokenization for multi-context marker detection (tags, attributes, scripts, styles, comments). Introduced Context enum and ReflectionInfo struct for reflection classification. Helper functions include detectScriptStringContext(), detectAttrQuoting(), BestReflection(), and detection of executable vs non-executable scripts, JavaScript URIs, and srcdoc attributes. CharacterSet struct tracks survived XSS-critical characters. |
| XSS Analyzer Tests (pkg/fuzz/analyzers/xss/context_test.go) | Comprehensive unit and benchmark tests covering DetectReflections across multiple HTML contexts (text, attributes, scripts, styles, comments, event handlers, tag names, RCDATA), edge cases (JavaScript URIs, JSON scripts, srcdoc, case-insensitivity), helper logic validation, and performance benchmarking. |
| Protocol Integration (pkg/protocols/http/http.go, pkg/protocols/http/request_fuzz.go) | Added XSS analyzer import alongside existing time analyzer. Minimal formatting change to request_fuzz.go with no functional impact. |
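
For readers unfamiliar with why the random utilities need a mutex: a `*rand.Rand` built with `rand.New` is not safe for concurrent use, so helpers that share one across fuzzing goroutines have to serialize access. The sketch below shows that general pattern only. `randomMarker` is a hypothetical helper and does not reproduce the signatures of RandStringBytesMask or GetRandomInteger.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

var (
	rngMu sync.Mutex
	// rng is shared by all callers; every access must hold rngMu.
	rng = rand.New(rand.NewSource(time.Now().UnixNano()))
)

const letters = "abcdefghijklmnopqrstuvwxyz"

// randomMarker returns n random lowercase letters. It is safe to call from
// multiple goroutines because the shared generator is guarded by rngMu.
func randomMarker(n int) string {
	rngMu.Lock()
	defer rngMu.Unlock()
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rng.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println(randomMarker(8))
		}()
	}
	wg.Wait()
}
```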

Sequence Diagram

sequenceDiagram
    participant Fuzzer as Fuzzer Engine
    participant XSSAnalyzer as XSS Analyzer
    participant HTMLParser as HTML Parser
    participant PayloadSelector as Payload Selector
    participant Replayer as Request Replayer
    participant Server as Target Server

    Fuzzer->>XSSAnalyzer: ApplyInitialTransformation([XSS_CANARY])
    XSSAnalyzer->>XSSAnalyzer: Generate unique canary + contextual chars
    XSSAnalyzer-->>Fuzzer: Return transformed payload

    Fuzzer->>Fuzzer: Send fuzzed request with canary
    Fuzzer->>Server: HTTP Request
    Server-->>Fuzzer: Response (body + headers + status)

    Fuzzer->>XSSAnalyzer: Analyze(options with response)
    XSSAnalyzer->>HTMLParser: DetectReflections(body, canary)
    HTMLParser->>HTMLParser: Tokenize HTML, track tag stack
    HTMLParser->>HTMLParser: Classify reflection contexts
    HTMLParser-->>XSSAnalyzer: []ReflectionInfo

    XSSAnalyzer->>XSSAnalyzer: BestReflection(reflections)
    XSSAnalyzer->>PayloadSelector: selectPayloads(context, characterSet)
    PayloadSelector-->>XSSAnalyzer: Payload candidates

    loop For each payload
        XSSAnalyzer->>Replayer: replayAndVerify(payload)
        Replayer->>Server: Inject payload, send request
        Server-->>Replayer: Response
        Replayer->>Replayer: Check payload reflection
        Replayer-->>XSSAnalyzer: (bool, result string, error)
    end

    XSSAnalyzer-->>Fuzzer: (vulnerable, details, error)
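
The payload selection step in the diagram can be pictured as a filter over candidate payloads: a candidate is only worth replaying if every XSS-critical character it needs survived in the canary reflection. The sketch below assumes that reading. The `CharacterSet` field names follow the review diff quoted later in this conversation, while `filterPayloads`, `allows`, and the sample payloads are hypothetical and not taken from analyzer.go.

```go
package main

import "fmt"

// CharacterSet records which XSS-critical characters survived reflection.
type CharacterSet struct {
	LessThan, GreaterThan, DoubleQuote, SingleQuote, ForwardSlash bool
}

// allows reports whether every tracked character used by payload survived.
// Characters outside the tracked set (letters, parentheses, ...) are ignored.
func (c CharacterSet) allows(payload string) bool {
	survived := map[rune]bool{
		'<': c.LessThan, '>': c.GreaterThan,
		'"': c.DoubleQuote, '\'': c.SingleQuote, '/': c.ForwardSlash,
	}
	for _, r := range payload {
		if ok, tracked := survived[r]; tracked && !ok {
			return false
		}
	}
	return true
}

// filterPayloads keeps the candidates that only rely on surviving characters.
func filterPayloads(candidates []string, cs CharacterSet) []string {
	var out []string
	for _, p := range candidates {
		if cs.allows(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Illustrative candidates only.
	candidates := []string{
		`<script>alert(1)</script>`,
		`" onmouseover="alert(1)`,
		`'-alert(1)-'`,
	}
	cs := CharacterSet{DoubleQuote: true} // only the double quote survived
	fmt.Println(filterPayloads(candidates, cs))
}
```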

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 ✨ Hops with joy
New XSS analyzer hops into the fuzz,
Context-aware reflection detection—what a buzz!
HTML tokens dance, payloads align,
JavaScript URIs and srcdoc shine,
Safer web fuzzing, line by line! 🔒

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|------------|--------|-------------|------------|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: improve XSS context analyzer for edge cases' directly describes the main focus of the PR, which adds comprehensive fixes for four edge-case classification issues in the XSS context analyzer. |
| Linked Issues check | ✅ Passed | The PR successfully addresses all four coding objectives from issue #7086: correctly classifies javascript: URIs as ContextScript, treats non-executable script types as ContextNone, implements case-insensitive marker detection, and treats srcdoc as ContextHTMLText. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: the core fixes to context.go, new test coverage in context_test.go, integration of the XSS analyzer into the fuzzing pipeline via analyzers.go, analyzer imports in http.go, and supporting type definitions in types.go. No unrelated modifications detected. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/fuzz/analyzers/xss/analyzer.go (1)

218-227: Character survival detection depends on strict character ordering.

The detection assumes characters appear in the exact order canary + "<>"'/" from canaryChars. For example, GreaterThan checks canary+"<>" which requires both < and > to survive in that exact sequence. If a server filters only <, the > won't be detected independently.

This is acceptable given the canary injection includes all characters together, but the logic could be clearer:

♻️ Consider explicit character checks (optional)
 func detectCharacterSurvival(body string, canary string) CharacterSet {
+	// Characters are appended as: canary + `<>"'/`
+	// We check for progressive survival since they're injected together
 	return CharacterSet{
 		LessThan:     strings.Contains(body, canary+"<"),
-		GreaterThan:  strings.Contains(body, canary+"<>") || strings.Contains(body, canary+">"),
-		DoubleQuote:  strings.Contains(body, canary+`<>"`),
-		SingleQuote:  strings.Contains(body, canary+`<>"'`),
-		ForwardSlash: strings.Contains(body, canary+canaryChars), // full canary+chars survived
+		GreaterThan:  strings.Contains(body, canary+"<>"),
+		DoubleQuote:  strings.Contains(body, canary+`<>"`),
+		SingleQuote:  strings.Contains(body, canary+`<>"'`),
+		ForwardSlash: strings.Contains(body, canary+canaryChars),
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 218 - 227,
detectCharacterSurvival currently infers survival by looking for specific
multi-character sequences (e.g., GreaterThan uses canary+"<>"), which misses
independently-surviving characters; change the checks in detectCharacterSurvival
to test each XSS-critical character separately (e.g., check body for canary+"<",
canary+">", canary+`"`, canary+"'", and canary+"/") so each CharacterSet field
(LessThan, GreaterThan, DoubleQuote, SingleQuote, ForwardSlash) is set by an
independent strings.Contains call rather than relying on combined-order
patterns.
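
One caveat with fully independent `strings.Contains` checks is that a character only directly follows the canary when everything before it was stripped. If the server encodes `<` as `&lt;` instead, `canary+">"` never appears. A possible middle ground, sketched below under stated assumptions, walks the text after the first canary occurrence and, for each expected character in order, treats it as survived (literal), encoded (a known entity or percent form is skipped), or stripped. `detectSurvivalInOrder`, `encodings`, and `hasPrefixFold` are hypothetical, the encoding list is not exhaustive, and page content that happens to follow a fully stripped canary can still produce false positives, as it can with the existing checks.

```go
package main

import (
	"fmt"
	"strings"
)

// canaryChars is the suffix appended to the canary, in injection order.
const canaryChars = `<>"'/`

// CharacterSet records which XSS-critical characters survived reflection.
type CharacterSet struct {
	LessThan, GreaterThan, DoubleQuote, SingleQuote, ForwardSlash bool
}

// encodings lists some common encoded forms per character (not exhaustive).
var encodings = map[byte][]string{
	'<':  {"&lt;", "&#60;", "&#x3c;", "%3c"},
	'>':  {"&gt;", "&#62;", "&#x3e;", "%3e"},
	'"':  {"&quot;", "&#34;", "&#x22;", "%22"},
	'\'': {"&#39;", "&#x27;", "&apos;", "%27"},
	'/':  {"&#47;", "&#x2f;", "%2f"},
}

// hasPrefixFold is a case-insensitive strings.HasPrefix.
func hasPrefixFold(s, prefix string) bool {
	return len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix)
}

// detectSurvivalInOrder inspects the text after the first canary occurrence.
func detectSurvivalInOrder(body, canary string) CharacterSet {
	idx := strings.Index(body, canary)
	if idx < 0 {
		return CharacterSet{}
	}
	tail := body[idx+len(canary):]
	var survived [len(canaryChars)]bool
	for i := 0; i < len(canaryChars); i++ {
		c := canaryChars[i]
		if len(tail) > 0 && tail[0] == c {
			survived[i] = true // literal character came through
			tail = tail[1:]
			continue
		}
		for _, enc := range encodings[c] {
			if hasPrefixFold(tail, enc) {
				tail = tail[len(enc):] // encoded: skip it, not survived
				break
			}
		}
		// Otherwise the character is assumed stripped; tail is unchanged.
	}
	return CharacterSet{
		LessThan: survived[0], GreaterThan: survived[1],
		DoubleQuote: survived[2], SingleQuote: survived[3],
		ForwardSlash: survived[4],
	}
}

func main() {
	fmt.Printf("%+v\n", detectSurvivalInOrder(`<p>abc123&lt;&gt;"'/</p>`, "abc123"))
}
```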

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2ab7d3 and 8a6ef82.

📒 Files selected for processing (8)
  • pkg/fuzz/analyzers/analyzers.go
  • pkg/fuzz/analyzers/xss/analyzer.go
  • pkg/fuzz/analyzers/xss/context.go
  • pkg/fuzz/analyzers/xss/context_test.go
  • pkg/fuzz/analyzers/xss/types.go
  • pkg/protocols/http/http.go
  • pkg/protocols/http/request.go
  • pkg/protocols/http/request_fuzz.go

Comment on lines +79 to +82
	// Check if canary is reflected at all
	if !strings.Contains(body, canary) {
		return false, "", nil
	}

⚠️ Potential issue | 🟡 Minor

Inconsistency between case-sensitive canary check here and case-insensitive in DetectReflections.

Line 80 uses strings.Contains(body, canary) (case-sensitive), but DetectReflections at Line 88 uses case-insensitive matching. If the server transforms the canary to uppercase, this check fails early and returns false, even though DetectReflections would have found the reflection.

Consider aligning the case sensitivity:

🔧 Proposed fix
 	// Check if canary is reflected at all
-	if !strings.Contains(body, canary) {
+	if !strings.Contains(strings.ToLower(body), strings.ToLower(canary)) {
 		return false, "", nil
 	}
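
The proposed fix lowercases the whole response body on every call, which allocates a copy of it. If that cost matters for large responses, a case-insensitive scan can avoid the copy. The sketch below is one such alternative, not part of the PR: `containsFold` is a hypothetical helper built on `strings.EqualFold`, and it trades the allocation for a simple linear scan.

```go
package main

import (
	"fmt"
	"strings"
)

// containsFold reports whether substr occurs in s under Unicode case
// folding, without allocating a lowercased copy of s. It compares windows
// of len(substr) bytes, which is sufficient for ASCII markers such as a
// generated canary.
func containsFold(s, substr string) bool {
	n := len(substr)
	if n == 0 {
		return true
	}
	for i := 0; i+n <= len(s); i++ {
		if strings.EqualFold(s[i:i+n], substr) {
			return true
		}
	}
	return false
}

func main() {
	body := "<p>ABC123XYZ</p>"
	fmt.Println(containsFold(body, "abc123xyz"))
	fmt.Println(containsFold(body, "missing"))
}
```

Whichever form is used, the early guard and DetectReflections should apply the same matching rule so that a reflection seen by one is never rejected by the other.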
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 79 - 82, The early canary
presence check uses a case-sensitive strings.Contains(body, canary) which
mismatches the case-insensitive logic in DetectReflections; update the check in
the function (same block that currently uses strings.Contains) to perform a
case-insensitive containment test (e.g., compare normalized lowercase versions
of body and canary or otherwise use a case-insensitive contains) so the initial
guard and DetectReflections use the same matching behavior.

Development

Successfully merging this pull request may close these issues.

XSS Context Analyzer misclassifies javascript: URIs and JSON script blocks

2 participants