---
title: Rewrite HTML attributes after parsing
description: "Implement IHtmlResponseRewriter to mutate already-parsed HTML — lowercase anchors, normalize hrefs, stamp rel=noopener — sharing the document parse with every other rewriter."
canonical_url: https://usepennington.net/how-to/response-pipeline/html-rewriter/
sidecar_url: https://usepennington.net/how-to/response-pipeline/html-rewriter.md
content_hash: sha256:ff788e182105705a70570fcd50f5f2660c8a3952dc48f5bc48313b6456c6bf93
tokens: 2222
uid: how-to.response-pipeline.html-rewriter
reading_time_minutes: 4
---

Guides
# Rewrite HTML attributes after parsing

Implement IHtmlResponseRewriter to mutate already-parsed HTML — lowercase anchors, normalize hrefs, stamp rel=noopener — sharing the document parse with every other rewriter.

 
To rewrite anchors, inject attributes, normalize URLs, or strip sentinels in already-rendered HTML, implement `IHtmlResponseRewriter`. Every rewriter shares one AngleSharp parse against the same `IDocument`. For non-HTML response types (JSON, plain text) or work that needs the final byte stream, use [Transform the response body on every page](https://usepennington.net/how-to/response-pipeline/response-processor.md) instead.

 
The recipe references `examples/ExtensibilityLabExample/AnchorLowercaseRewriter.cs`, which exercises both phases of the contract against a bare `AddPennington` host.

 
## Before you begin

 
 - An existing Pennington site rendering HTML pages (see [Create your first Pennington site](https://usepennington.net/tutorials/getting-started/first-site.md) if not).
 - A clear sense of which phase fits the edit: a non-HTML token (something not valid HTML structure, like `<xref:uid>` or a sentinel comment) belongs in `PreParseAsync`; anything queryable by selectors belongs in `ApplyAsync`.
 
 
## Write the rewriter

 
Implement [Pennington.Infrastructure.IHtmlResponseRewriter](https://usepennington.net/reference/api/i-html-response-rewriter.md) as a sealed class. Three rules carry the page:

 
 - `ShouldApply` runs per-response; return `false` to skip both phases when the content-type, path, or headers mean there is nothing to do. The example narrows to `text/html` responses so non-HTML endpoints (search index JSON, llms.txt) bypass the rewriter entirely.
 - `PreParseAsync` receives the raw HTML string and returns the string to parse. Use it only when the target construct is not valid HTML structure — raw `<xref:uid>` tags are the canonical shipped example. Return the input unchanged when there is nothing to do.
 - `ApplyAsync` receives the already-parsed `IDocument` shared by every rewriter — query with `QuerySelectorAll`, mutate attributes and text, and return. Do not re-serialize or reparse.
 
 
```csharp:symbol
namespace ExtensibilityLabExample;
  
using AngleSharp.Dom;
using AngleSharp.Html.Dom;
using Microsoft.AspNetCore.Http;
using Pennington.Infrastructure;
  
/// <summary>
/// Implements <see cref="IHtmlResponseRewriter"/> and demonstrates both
/// halves of the contract:
/// <list type="bullet">
/// <item><description><see cref="PreParseAsync"/> runs a cheap string
///   replace over the raw HTML before AngleSharp parses it. We use it to
///   strip the <c>&lt;!--LOWERCASE-SENTINEL--&gt;</c> comment — the kind
///   of pre-parse cleanup a real rewriter does for non-HTML tokens like
///   <c>&lt;xref:uid&gt;</c>.</description></item>
/// <item><description><see cref="ApplyAsync"/> walks the parsed document
///   and lowercases the text content of every <c>&lt;a&gt;</c> tag
///   marked <c>data-lowercase</c>.</description></item>
/// </list>
/// <para>
/// <see cref="Order"/> is 500 — after the shipped xref (10), locale (20),
/// and base-URL (30) rewriters so our pass sees already-resolved hrefs.
/// </para>
/// <para>
/// Backs how-to 2.3.50 <c>/how-to/extensibility/html-rewriter</c>.
/// </para>
/// </summary>
public sealed class AnchorLowercaseRewriter : IHtmlResponseRewriter
{
    public int Order => 500;
  
    public bool ShouldApply(HttpContext context)
    {
        var contentType = context.Response.ContentType;
        return contentType is not null
               && contentType.StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
    }
  
    /// <summary>
    /// Pre-parse pass. Strip the sentinel comment so it is gone before
    /// AngleSharp runs. A string replace is the right tool when the
    /// target construct is not valid HTML structure (raw <c>&lt;xref&gt;</c>
    /// tags are the canonical example shipped with Pennington).
    /// </summary>
    public Task<string> PreParseAsync(string html, HttpContext context)
    {
        if (!html.Contains("<!--LOWERCASE-SENTINEL-->", StringComparison.Ordinal))
        {
            return Task.FromResult(html);
        }
  
        return Task.FromResult(html.Replace("<!--LOWERCASE-SENTINEL-->", string.Empty, StringComparison.Ordinal));
    }
  
    /// <summary>
    /// DOM pass. Walk the parsed document, find every <c>&lt;a&gt;</c>
    /// with <c>data-lowercase</c>, lowercase its text content.
    /// </summary>
    public Task ApplyAsync(IDocument document, HttpContext context)
    {
        foreach (var element in document.QuerySelectorAll("a[data-lowercase]"))
        {
            if (element is not IHtmlAnchorElement anchor)
            {
                continue;
            }
  
            if (string.IsNullOrEmpty(anchor.TextContent))
            {
                continue;
            }
  
            anchor.TextContent = anchor.TextContent.ToLowerInvariant();
        }
  
        return Task.CompletedTask;
    }
}
```

 
## Pick an Order value

 
The shipped rewriters occupy `Order` values from 10 (xref resolution) through 60 (the last built-in transform); xref resolution, locale prefixing, and base-URL prefixing run in that relative order because each produces the link form the next one consumes. Pick above 60 to run after every shipped transform, below 10 to run before xref resolution, or between the built-ins only when that placement is deliberate. For the exact `Order` of each shipped rewriter, see [Pennington.Infrastructure.IHtmlResponseRewriter](https://usepennington.net/reference/api/i-html-response-rewriter.md). The example uses 500 so anchors are lowercased after every shipped transform has run.

 
## Register the rewriter

 
Every registered `IHtmlResponseRewriter` is picked up and ordered by its `Order` value, so a single registration next to the host wiring is sufficient. Use the lifetime that matches your dependencies — `AddSingleton` for stateless rewriters, `AddTransient` (or `AddFileWatched`) when the rewriter captures file-watched state.

 
```csharp
builder.Services.AddSingleton<IHtmlResponseRewriter, AnchorLowercaseRewriter>();
```

 
## Configure the shipped word-break rewriter

 
One shipped rewriter you configure rather than implement is the word-break rewriter. `AddWordBreak` turns it on; it inserts `<wbr>` break opportunities into long identifiers so dotted namespaces and PascalCase names wrap inside narrow columns instead of overflowing.

 
```csharp
builder.Services.AddWordBreak(options =>
{
    options.CssSelector = "h1, h2, h3, h4, h5, h6, span, .text-break";
    options.MinimumCharacters = 20;
});
```

 
A heading like `Pennington.Infrastructure.WordBreakOptions` then renders with breaks after each dot and before each interior case boundary:

 
Before:

 
```html
<h3>Pennington.Infrastructure.WordBreakOptions</h3>
```

 
After:

 
```html
<h3>Pennington.<wbr>Infrastructure.<wbr>WordBreakOptions</h3>
```

 
For every option and its default, see [Pennington.Infrastructure.WordBreakOptions](https://usepennington.net/reference/api/word-break-options.md).

 
## Result

 
Anchors marked `data-lowercase` have their text content lowercased, and the sentinel comment is gone from view-source.

 
Before:

 
```html
<!--LOWERCASE-SENTINEL-->
<a data-lowercase href="/docs/">Read the DOCS</a>
<a data-lowercase href="/blog/">Latest POSTS</a>
```

 
After:

 
```html
<a data-lowercase href="/docs/">read the docs</a>
<a data-lowercase href="/blog/">latest posts</a>
```

 
Anchors without `data-lowercase` and non-HTML responses pass through unchanged.

 
## Verify

 
 - Run `dotnet run --project examples/ExtensibilityLabExample` and visit `/lowercase-demo/`. Every `<a data-lowercase>` anchor text is lowercase in the rendered HTML and `<!--LOWERCASE-SENTINEL-->` is absent from view-source.
 - Static build: `dotnet run --project examples/ExtensibilityLabExample -- build output` — grep `output/lowercase-demo/index.html` to confirm the rewriter also runs during publish.
 
 
## Related

 
 - Reference: [Response processing interfaces](https://usepennington.net/reference/api/i-response-processor.md)
 - Reference: [WordBreakOptions](https://usepennington.net/reference/api/word-break-options.md) — the shipped word-break rewriter's configuration
 - Background: [The response-processing pipeline](https://usepennington.net/explanation/core/response-processing.md)
 - Related how-to: [Write a response processor](https://usepennington.net/how-to/response-pipeline/response-processor.md)
 
 
[Previous
                
                Attach derived metadata to every page](https://usepennington.net/how-to/markdown-pipeline/metadata-enrichers.md)[Next
                    
                Transform the response body on every page](https://usepennington.net/how-to/response-pipeline/response-processor.md)