HeadingSectionExtractor Pennington.Search

Splits post-pipeline page HTML into one HeadingSection per heading (plus a lead section) so the search index can carry heading-level records that deep-link to anchors. Walks the rendered content element in document order: h2 through h6 elements with an id start a new section, h1 is treated as the page title (not indexed into a section body), and <pre> subtrees are dropped when code blocks are excluded.
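The walk described above can be illustrated with a small, self-contained sketch. It is not the Pennington implementation: it parses well-formed markup with System.Xml.Linq as a stand-in for the IElement DOM, and the SplitByHeading function and its (Anchor, Title, Text) tuples are hypothetical names, since this page does not list the members of HeadingSection. It also assumes that the lead section is always emitted and that a heading without an id simply contributes its text to the current section.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical stand-in for Extract: returns tuples instead of HeadingSection.
static List<(string? Anchor, string Title, string Text)> SplitByHeading(
    XElement content, bool excludeCodeBlocks)
{
    // The first entry is the lead section: text before the first anchored heading.
    var sections = new List<(string? Anchor, string Title, StringBuilder Body)>
    {
        (null, "", new StringBuilder())
    };

    void Walk(XNode node)
    {
        if (node is XText text)
        {
            // Text always lands in the most recently opened section.
            sections[^1].Body.Append(text.Value).Append(' ');
            return;
        }
        if (node is not XElement el) return;

        string tag = el.Name.LocalName.ToLowerInvariant();
        if (tag == "h1") return;                       // page title, kept out of section bodies
        if (tag == "pre" && excludeCodeBlocks) return; // drop the whole <pre> subtree

        string? id = el.Attribute("id")?.Value;
        bool isSectionHeading =
            tag.Length == 2 && tag[0] == 'h' && tag[1] >= '2' && tag[1] <= '6';
        if (isSectionHeading && !string.IsNullOrEmpty(id))
        {
            // An anchored h2..h6 opens a new section; its text becomes the title.
            sections.Add((id, el.Value.Trim(), new StringBuilder()));
            return;
        }

        // Assumption: any other element (including un-anchored headings) is walked as content.
        foreach (XNode child in el.Nodes()) Walk(child);
    }

    Walk(content);

    // Collapse runs of whitespace so each body is a single clean line of text.
    return sections
        .Select(s => (s.Anchor, s.Title,
                      Text: Regex.Replace(s.Body.ToString(), @"\s+", " ").Trim()))
        .ToList();
}

var page = XElement.Parse(@"
<article>
  <h1>Guide</h1>
  <p>Intro text.</p>
  <h2 id='install'>Install</h2>
  <p>Run the installer.</p>
  <pre>dotnet add package Example</pre>
  <h3>Notes</h3>
  <p>No anchor here.</p>
</article>");

foreach (var section in SplitByHeading(page, excludeCodeBlocks: true))
{
    string label = section.Anchor ?? "(lead)";
    Console.WriteLine($"[{label}] {section.Title}: {section.Text}");
}
```

In this sketch the sample page yields two records: a lead section holding the intro text, and an install section. The h3 has no id, so its text stays inside the install section, and the <pre> content appears only when excludeCodeBlocks is false.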

Methods

Extract

public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks)

Extracts the lead section plus one section per anchored heading from content.

Parameters

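content IElement: the rendered content element to walk in document order.
excludeCodeBlocks bool: when true, <pre> subtrees are dropped from the section text.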

Returns

IReadOnlyList<HeadingSection>

Pennington.Search.HeadingSectionExtractor

namespace Pennington.Search;

/// Splits post-pipeline page HTML into one HeadingSection per heading (plus a lead section)
/// so the search index can carry heading-level records that deep-link to anchors.
/// Walks the rendered content element in document order: h2 through h6 elements with an id
/// start a new section, h1 is treated as the page title (not indexed into a section body),
/// and <pre> subtrees are dropped when code blocks are excluded.
public class HeadingSectionExtractor
{
    /// Extracts the lead section plus one section per anchored heading from content.
    public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks);
}