@MartinMiles
Last active June 4, 2025 22:07
Runs on a host machine, pulls a rendered web page from the legacy site, slices its components into React/Next.js .tsx files under the same paths and subfolders used for the original .cshtml views, reads the component hierarchy, and sets placeholders using their original names.
<#
.SYNOPSIS
Slices a fully rendered Sitecore HTML page into individual Next.js component `.tsx` files,
based on the <!-- start-component='…' --> / <!-- end-component='…' --> markers and
by calling Get-Layout.ps1 (which now returns JSON) for the layout hierarchy.
.PARAMETER Url
The URL of the rendered page. Defaults to http://rssbplatform.dev.local/aaa
.PARAMETER ItemPath
The Sitecore item path whose layout will be fetched via Get-Layout.ps1.
Defaults to "/sitecore/content/Zont/Habitat/Home/AAA".
.EXAMPLE
PS> .\Slice-HtmlToComponents.ps1 `
-Url "http://rssbplatform.dev.local/aaa" `
-ItemPath "/sitecore/content/Zont/Habitat/Home/AAA"
This fetches the HTML from the given URL, invokes Get-Layout.ps1 (JSON) on the specified item path,
then slices the HTML into `.tsx` files under ./Views/... plus a Layout.tsx in the script folder.
.NOTES
Requires PowerShell 5.1. Uses only built-in .NET/PowerShell and calls Get-Layout.ps1.
Stops on any mismatch or parsing error.
#>
param(
[Parameter(Mandatory = $false)]
[string]$Url = "http://rssbplatform.dev.local/aaa",
[Parameter(Mandatory = $false)]
[string]$ItemPath = "/sitecore/content/Zont/Habitat/Home/AAA"
)
#------------------------------------------------------------
# 1. Determine script folder (so we can write files relative to it)
#------------------------------------------------------------
$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Definition
if (-not (Test-Path $scriptDir)) {
Write-Error "Cannot determine script directory. Exiting."
exit 1
}
#------------------------------------------------------------
# 2. Invoke Get-Layout.ps1 to retrieve Layout JSON for $ItemPath
#------------------------------------------------------------
$layoutScript = Join-Path $scriptDir "Get-Layout.ps1"
if (-not (Test-Path $layoutScript)) {
Write-Error "Cannot find Get-Layout.ps1 in '$scriptDir'. Please ensure it exists."
exit 1
}
Write-Host "Invoking Get-Layout.ps1 for item path '$ItemPath' ..."
try {
$layoutJsonRaw = & $layoutScript -itemPath $ItemPath 2>&1
} catch {
Write-Error "Failed to execute Get-Layout.ps1: $_"
exit 1
}
if (-not $layoutJsonRaw) {
Write-Error "Get-Layout.ps1 returned no output. Exiting."
exit 1
}
$layoutJsonString = $layoutJsonRaw | Out-String
try {
$layoutObj = $layoutJsonString | ConvertFrom-Json
} catch {
Write-Error "Failed to parse Layout JSON returned by Get-Layout.ps1. $_"
exit 1
}
#------------------------------------------------------------
# 3. Build UID -> PlaceholderKey map from Layout JSON
#------------------------------------------------------------
$layoutMap = @{}
function Traverse-Layout {
param($placeholdersArray)
foreach ($ph in $placeholdersArray) {
$phString = $ph.placeholder
$segments = $phString -split "/"
$shortKey = $segments[-1]
foreach ($rend in $ph.renderings) {
$uidStr = ($rend.uid.Trim('{','}')).ToLower()
$layoutMap[$uidStr] = $shortKey
if ($rend.placeholders -and $rend.placeholders.Count -gt 0) {
Traverse-Layout $rend.placeholders
}
}
}
}
if (-not $layoutObj.placeholders) {
Write-Error "Layout JSON does not contain a 'placeholders' property. Exiting."
exit 1
}
Traverse-Layout $layoutObj.placeholders
#------------------------------------------------------------
# 4. Fetch the rendered HTML from $Url
#------------------------------------------------------------
Write-Host "Downloading HTML from $Url ..."
try {
$response = Invoke-WebRequest -Uri $Url -UseBasicParsing -ErrorAction Stop
} catch {
Write-Error "Failed to download URL '$Url'. $_"
exit 1
}
$html = $response.Content
if (-not $html) {
Write-Error "Downloaded HTML is empty. Exiting."
exit 1
}
#------------------------------------------------------------
# 5. Locate all start-component / end-component markers in the HTML
#------------------------------------------------------------
$patternStart = "<!--\s*start-component='({.*?})'\s*-->"
$patternEnd = "<!--\s*end-component='({.*?})'\s*-->"
$regexStart = [regex]::new(
$patternStart,
[System.Text.RegularExpressions.RegexOptions]::Singleline -bor
[System.Text.RegularExpressions.RegexOptions]::IgnoreCase
)
$regexEnd = [regex]::new(
$patternEnd,
[System.Text.RegularExpressions.RegexOptions]::Singleline -bor
[System.Text.RegularExpressions.RegexOptions]::IgnoreCase
)
$matchesStart = $regexStart.Matches($html)
$matchesEnd = $regexEnd.Matches($html)
$markers = @()
foreach ($m in $matchesStart) {
$jsonText = $m.Groups[1].Value
try {
$meta = ConvertFrom-Json -InputObject $jsonText -ErrorAction Stop
} catch {
Write-Error "Malformed JSON in start-component: $jsonText"
exit 1
}
$uid = ($meta.uid.Trim('{','}')).ToLower()
$markers += [PSCustomObject]@{
Type = "Start"
UID = $uid
Name = $meta.name
Path = $meta.path
Index = $m.Index
Length = $m.Length
}
}
foreach ($m in $matchesEnd) {
$jsonText = $m.Groups[1].Value
try {
$meta = ConvertFrom-Json -InputObject $jsonText -ErrorAction Stop
} catch {
Write-Error "Malformed JSON in end-component: $jsonText"
exit 1
}
$uid = ($meta.uid.Trim('{','}')).ToLower()
$markers += [PSCustomObject]@{
Type = "End"
UID = $uid
Name = $meta.name
Path = $meta.path
Index = $m.Index
Length = $m.Length
}
}
$markers = $markers | Sort-Object Index
Write-Host "`nFound the following markers in the HTML:`n"
foreach ($mark in $markers) {
Write-Host (" {0,-5} | UID={1} | Index={2}" -f $mark.Type, $mark.UID, $mark.Index)
}
Write-Host ""
#------------------------------------------------------------
# 6. Build a nested "component tree" by using a stack
#------------------------------------------------------------
$stack = @()
$components = @()
foreach ($marker in $markers) {
switch ($marker.Type) {
"Start" {
$node = [PSCustomObject]@{
UID = $marker.UID
Name = $marker.Name
Path = $marker.Path
StartIndex = $marker.Index + $marker.Length
EndIndex = $null
RawHtml = ""
Children = @()
PlaceholderKey = ""
}
if ($stack.Count -gt 0) {
$parent = $stack[-1]
$parent.Children += $node
} else {
$components += $node
}
$stack += $node
}
"End" {
if ($stack.Count -eq 0) {
Write-Error "Unmatched end-component for UID '$($marker.UID)'."
exit 1
}
$node = $stack[-1]
if ($node.UID -ne $marker.UID) {
Write-Error "Nesting error: end-marker UID '$($marker.UID)' does not match start UID '$($node.UID)'."
exit 1
}
$node.EndIndex = $marker.Index
$length = $node.EndIndex - $node.StartIndex
try {
$node.RawHtml = $html.Substring($node.StartIndex, $length)
} catch {
Write-Error "Failed to extract RawHtml for UID '$($node.UID)': $_"
exit 1
}
if ($stack.Count -gt 1) {
$stack = $stack[0 .. ($stack.Count - 2)]
} else {
$stack = @()
}
}
}
}
if ($stack.Count -gt 0) {
$leftover = $stack[-1]
Write-Error "Unmatched start-component for UID '$($leftover.UID)'. Missing end marker."
exit 1
}
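# Guard against a page with no markers: $components[0] would be $null and the
# layout-map lookup in step 7 would throw on a null key.
if ($components.Count -eq 0) {
Write-Error "No start-component markers were found in the HTML. Nothing to slice."
exit 1
}
if ($components.Count -ne 1) {
Write-Warning "Expected exactly one top-level component (the holding view), but found $($components.Count). Proceeding with the first one."
}
$root = $components[0]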
#------------------------------------------------------------
# 7. Verify all UIDs (except the root) exist in the layout map
#------------------------------------------------------------
function Get-AllNodes {
param($node)
$list = @($node)
foreach ($c in $node.Children) {
$list += Get-AllNodes $c
}
return $list
}
$allNodes = Get-AllNodes $root
foreach ($node in $allNodes) {
if ($node.UID -eq "00000000-0000-0000-0000-000000000000") {
continue
}
if (-not $layoutMap.ContainsKey($node.UID)) {
Write-Error "Layout JSON mismatch: UID '$($node.UID)' (component '$($node.Name)') not found in layout map."
exit 1
}
$node.PlaceholderKey = $layoutMap[$node.UID]
}
#------------------------------------------------------------
# 8. Helper: Convert raw HTML -> minimal valid JSX (class -> className), and strip ALL comments
#------------------------------------------------------------
function ConvertHtmlToJsx {
param([string]$htmlContent)
if (-not $htmlContent) { return "" }
# Remove any HTML comments
$noComments = [regex]::Replace($htmlContent, "<!--.*?-->", "", [System.Text.RegularExpressions.RegexOptions]::Singleline)
# Convert class="..." to className="..."
$jsx = $noComments -replace 'class="([^"]*)"', 'className="$1"'
return $jsx
}
#------------------------------------------------------------
# 9. Recursively write each component's TSX file
#------------------------------------------------------------
function Write-ComponentTsx {
param(
[Parameter(Mandatory = $true)]
[psobject]$node,
[Parameter(Mandatory = $true)]
[string]$scriptDir
)
# Build output path from /Views/.../*.cshtml -> Views\...\*.tsx
$relativeUnix = $node.Path.TrimStart("/")
$relativeWin = $relativeUnix -replace "/", "\"
$tsPath = [IO.Path]::ChangeExtension($relativeWin, ".tsx")
$fullPath = Join-Path $scriptDir $tsPath
$dir = Split-Path $fullPath -Parent
if (-not (Test-Path $dir)) {
New-Item -ItemType Directory -Path $dir -Force | Out-Null
}
$componentName = $node.Name -replace "\s", ""
$writer = [System.IO.StreamWriter]::new($fullPath, $false, [System.Text.Encoding]::UTF8)
# 9.1 Write imports + interface
$writer.WriteLine("import {")
$writer.WriteLine(" ComponentParams,")
$writer.WriteLine(" ComponentRendering,")
$writer.WriteLine(" Placeholder,")
$writer.WriteLine("} from '@sitecore-jss/sitecore-jss-nextjs';")
$writer.WriteLine("import React from 'react';")
$writer.WriteLine()
$writer.WriteLine("interface ComponentProps {")
$writer.WriteLine(" rendering: ComponentRendering & { params: ComponentParams };")
$writer.WriteLine(" params: ComponentParams;")
$writer.WriteLine("}")
$writer.WriteLine()
$writer.WriteLine("const $componentName = (props: ComponentProps): JSX.Element => {")
$writer.WriteLine(" return (")
$writer.WriteLine(" <>")
# 9.2 If there are children, group by placeholder and replace entire group block with one <Placeholder ... />
if ($node.Children.Count -gt 0) {
$groups = $node.Children | Group-Object PlaceholderKey
$original = $node.RawHtml
$modified = $original
# Build ranges per group
$groupRanges = @()
foreach ($grp in $groups) {
$key = $grp.Name
$minStart = [int]::MaxValue
$maxEnd = 0
foreach ($child in $grp.Group) {
$childHtml = $child.RawHtml
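# Ordinal comparison: match the exact characters, not culture-specific equivalents
$idx = $modified.IndexOf($childHtml, [System.StringComparison]::Ordinal)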
if ($idx -lt 0) {
Write-Error "Could not locate raw HTML for child UID '$($child.UID)' in parent '$($node.UID)'."
exit 1
}
if ($idx -lt $minStart) { $minStart = $idx }
$endPos = $idx + $childHtml.Length
if ($endPos -gt $maxEnd) { $maxEnd = $endPos }
}
$groupRanges += [PSCustomObject]@{ Key = $key; Start = $minStart; End = $maxEnd }
}
# Sort descending by Start so earlier replacements do not shift later
$sortedRanges = $groupRanges | Sort-Object Start -Descending
foreach ($range in $sortedRanges) {
$before = $modified.Substring(0, $range.Start)
$after = $modified.Substring($range.End)
$phTag = "<Placeholder name=`"$($range.Key)`" rendering={props.rendering} />"
$modified = $before + $phTag + $after
}
# Convert to JSX and write
$jsxContent = ConvertHtmlToJsx $modified
$lines = $jsxContent -split "`r?`n"
foreach ($line in $lines) {
$writer.WriteLine(" $line")
}
}
else {
# No children: just output RawHtml (strip comments and convert class->className)
$jsxContent = ConvertHtmlToJsx $node.RawHtml
$lines = $jsxContent -split "`r?`n"
foreach ($line in $lines) {
$writer.WriteLine(" $line")
}
}
$writer.WriteLine(" </>")
$writer.WriteLine(" );")
$writer.WriteLine("};")
$writer.WriteLine()
$writer.WriteLine("export default $componentName;")
$writer.Close()
Write-Host " - Generated component: $tsPath"
foreach ($child in $node.Children) {
Write-ComponentTsx -node $child -scriptDir $scriptDir
}
}
if (-not $root.Children -or $root.Children.Count -eq 0) {
Write-Warning "No child components found under the top-level holding view. Nothing to generate."
} else {
foreach ($child in $root.Children) {
Write-ComponentTsx -node $child -scriptDir $scriptDir
}
}
#------------------------------------------------------------
# 10. Generate Layout.tsx in the script folder (exactly as in example)
#------------------------------------------------------------
$layoutPath = Join-Path $scriptDir "Layout.tsx"
# Single-quoted here-string: the TSX below contains backticks and ${publicUrl},
# which a double-quoted here-string would treat as escapes and variable expansion.
@'
import React from 'react';
import Head from 'next/head';
import { Placeholder, LayoutServiceData, Field, HTMLLink } from '@sitecore-jss/sitecore-jss-nextjs';
import config from 'temp/config';
import Scripts from 'src/Scripts';
// Prefix public assets with a public URL to enable compatibility with Sitecore Experience Editor.
// If you're not supporting the Experience Editor, you can remove this.
const publicUrl = config.publicUrl;
interface LayoutProps {
  layoutData: LayoutServiceData;
  headLinks: HTMLLink[];
}
interface RouteFields {
  [key: string]: unknown;
  Title?: Field;
}
const Layout = ({ layoutData, headLinks }: LayoutProps): JSX.Element => {
  const { route } = layoutData.sitecore;
  const fields = route?.fields as RouteFields;
  const isPageEditing = layoutData.sitecore.context.pageEditing;
  const mainClassPageEditing = isPageEditing ? 'editing-mode' : 'prod-mode';
  return (
    <>
      <Scripts />
      <Head>
        <title>{fields?.Title?.value?.toString() || 'Page'}</title>
        <link rel="icon" href={`${publicUrl}/favicon.ico`} />
        {headLinks.map((headLink) => (
          <link rel={headLink.rel} key={headLink.href} href={headLink.href} />
        ))}
      </Head>
      {/* root placeholder for the app, which we add components to using route data */}
      <div className={mainClassPageEditing}>
        <header>
          <div id="header">{route && <Placeholder name="headless-header" rendering={route} />}</div>
        </header>
        <main>
          <div id="content">{route && <Placeholder name="headless-main" rendering={route} />}</div>
        </main>
        <footer>
          <div id="footer">{route && <Placeholder name="headless-footer" rendering={route} />}</div>
        </footer>
      </div>
    </>
  );
};
export default Layout;
'@ | Out-File -FilePath $layoutPath -Encoding utf8
Write-Host " - Generated Layout.tsx in $scriptDir"
Write-Host "`nAll components and Layout.tsx have been generated successfully." -ForegroundColor Green
Create a PowerShell 5.1 script named Slice-HtmlToComponents.ps1 that does the following, using only ASCII characters (no Unicode or special symbols) and simple “ - ” prefixes for any listing or output:
1. Accept two optional parameters:
- [string]$Url (default “http://rssbplatform.dev.local/aaa”)
- [string]$ItemPath (default “/sitecore/content/Zont/Habitat/Home/AAA”)
2. Determine its own folder as $scriptDir, exit with an error if that cannot be determined.
3. In $scriptDir, locate and invoke Get-Layout.ps1 with “-itemPath $ItemPath”, capturing its full output (JSON) into a variable. If Get-Layout.ps1 is missing or returns no output, stop with an error. Then convert the captured text into a JSON object ($layoutObj). If JSON parsing fails, stop with an error.
4. Traverse $layoutObj.placeholders recursively to build a hashtable $layoutMap that maps every rendering UID (trimmed of “{}” and lowercased) to its placeholder key. The JSON format is like:
{
  "UID":"74261362-79df-44d2-800a-841e8c6d46f9",
  "placeholders":[
    {
      "placeholder":"headless-main",
      "renderings":[
        {
          "uid":"{66422117-69C1-4B62-9323-482F16C4F244}",
          "placeholders":[
            {
              "placeholder":"headless-main/col-wide-1",
              "renderings":[
                { "uid":"{3CDF4DD8-28A4-4185-8C09-A73CAA4633F7}", "placeholders":[] },
                { "uid":"{565F9960-4E47-4DA4-9406-0A29DD09911F}", "placeholders":[] }
              ]
            }
          ]
        }
      ]
    }
  ]
}
- For each JSON node’s “placeholder” field, split on “/” and use only the last segment (e.g. “col-wide-1”).
- For each child rendering, trim “{}” from uid, lowercase it, and store in $layoutMap[uid] = placeholderKey.
- Recurse for nested “placeholders” arrays.
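As a worked example of step 4, the sketch below runs the traversal over a trimmed copy of the sample JSON above and prints the resulting map. The function name Add-LayoutEntries and the $sampleJson variable are illustrative and not part of the script; the script's own function is Traverse-Layout.

```powershell
# Trimmed copy of the sample layout JSON from step 4 (illustrative data only).
$sampleJson = @'
{
  "placeholders": [
    {
      "placeholder": "headless-main",
      "renderings": [
        {
          "uid": "{66422117-69C1-4B62-9323-482F16C4F244}",
          "placeholders": [
            {
              "placeholder": "headless-main/col-wide-1",
              "renderings": [
                { "uid": "{3CDF4DD8-28A4-4185-8C09-A73CAA4633F7}", "placeholders": [] },
                { "uid": "{565F9960-4E47-4DA4-9406-0A29DD09911F}", "placeholders": [] }
              ]
            }
          ]
        }
      ]
    }
  ]
}
'@

$layoutMap = @{}

# Hypothetical helper name; mirrors what Traverse-Layout does in the script.
function Add-LayoutEntries {
    param($Placeholders)
    foreach ($ph in $Placeholders) {
        # Keep only the last path segment: "headless-main/col-wide-1" -> "col-wide-1".
        $shortKey = ($ph.placeholder -split '/')[-1]
        foreach ($rend in $ph.renderings) {
            $uid = $rend.uid.Trim('{', '}').ToLower()
            $layoutMap[$uid] = $shortKey
            # An empty "placeholders" array is falsy, so leaf renderings stop here.
            if ($rend.placeholders) {
                Add-LayoutEntries -Placeholders $rend.placeholders
            }
        }
    }
}

$layoutObj = $sampleJson | ConvertFrom-Json
Add-LayoutEntries -Placeholders $layoutObj.placeholders

# List the map using the " - " prefix the spec asks for.
$layoutMap.GetEnumerator() | Sort-Object Name | ForEach-Object {
    ' - {0} -> {1}' -f $_.Name, $_.Value
}
```

With this input the map should hold three entries: the outer rendering keyed to headless-main and the two inner renderings keyed to col-wide-1.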
5. Download the HTML from $Url using Invoke-WebRequest -UseBasicParsing. If download fails or content is empty, stop with an error. Save HTML into a string $html.
6. Using case-insensitive, singleline regex, find all occurrences of:
- “<!-- start-component='{ ... }' -->”
- “<!-- end-component='{ ... }' -->”
Capture the JSON inside the single quotes. For each match:
- ConvertFrom-Json to get an object with name, uid, path. If JSON parse fails, stop with an error.
- Trim “{}” from uid and lowercase to get $uid.
- Create a PSCustomObject with properties Type = “Start” or “End”, UID = $uid, Name = name, Path = path, Index = match.Index, Length = match.Length.
- Collect all these markers (both Start and End) into an array $markers.
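To make the marker format in step 6 concrete, here is a minimal sketch that parses one start/end pair. The HTML fragment and its name, uid, and path values are made up for illustration. For brevity it uses a single combined pattern with a (start|end) group, which is a variation on the two separate patterns the spec describes.

```powershell
# Hypothetical fragment; real pages carry whatever metadata the legacy site emits.
$html = @'
<!-- start-component='{"name":"Hero","uid":"{AAAAAAAA-0000-0000-0000-000000000001}","path":"/Views/Demo/Hero.cshtml"}' -->
<div class="hero">Hello</div>
<!-- end-component='{"name":"Hero","uid":"{AAAAAAAA-0000-0000-0000-000000000001}","path":"/Views/Demo/Hero.cshtml"}' -->
'@

$options = [System.Text.RegularExpressions.RegexOptions]::Singleline -bor
           [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

# Group 1 = start|end, group 2 = the JSON payload between the single quotes.
$regex = [regex]::new("<!--\s*(start|end)-component='(\{.*?\})'\s*-->", $options)

$markers = foreach ($m in $regex.Matches($html)) {
    $meta = $m.Groups[2].Value | ConvertFrom-Json
    $type = 'End'
    if ($m.Groups[1].Value -ieq 'start') { $type = 'Start' }
    [PSCustomObject]@{
        Type   = $type
        UID    = $meta.uid.Trim('{', '}').ToLower()
        Name   = $meta.name
        Path   = $meta.path
        Index  = $m.Index
        Length = $m.Length
    }
}

# A single regex already yields matches in document order; sorting keeps parity with step 7.
$markers = @($markers | Sort-Object Index)
foreach ($mark in $markers) {
    ' {0,-5} | UID={1} | Index={2}' -f $mark.Type, $mark.UID, $mark.Index
}
```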
7. Sort $markers by Index ascending. Write to host a list of found markers in the format:
“Found the following markers in the HTML:”
for each marker: “ Start | UID=... | Index=...” or “ End | UID=... | Index=...”.
8. Build a nested component tree:
- Initialize $stack = @() and $components = @().
- Loop over each $marker in $markers:
  - If marker.Type is “Start”:
    - Create a PSObject $node with properties:
      UID, Name, Path, StartIndex = marker.Index + marker.Length, EndIndex = $null, RawHtml = “”, Children = @(), PlaceholderKey = “”.
    - If $stack is not empty, add $node to $stack[-1].Children. Else add $node to $components.
    - Push $node onto $stack.
  - If marker.Type is “End”:
    - If $stack.Count is 0, stop with “Unmatched end-component for UID …” error.
    - Let $node = $stack[-1]. If $node.UID != marker.UID, stop with “Nesting error” and exit.
    - Set $node.EndIndex = marker.Index; length = EndIndex - StartIndex; then $node.RawHtml = $html.Substring(StartIndex, length). If substring fails, stop with error.
    - Pop $node from $stack: if $stack.Count > 1, set $stack = $stack[0..($stack.Count-2)], else set $stack=@().
- After loop: if $stack.Count > 0, stop with “Unmatched start-component for UID …” error.
- If $components.Count ≠ 1, write warning but proceed using $components[0] as $root.
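The stack walk in step 8 can be sketched on a tiny input. This version is an assumption-laden simplification: it uses made-up short markers (<!--S:id--> and <!--E:id-->) instead of the JSON-bearing ones, and it swaps the array-based stack for the .NET generic Stack and List types, which avoids re-creating arrays on every push and pop.

```powershell
# Simplified, hypothetical markers so the example stays short.
$html = '<!--S:outer--><div><!--S:inner--><p>Hi</p><!--E:inner--></div><!--E:outer-->'

$markers = foreach ($m in [regex]::Matches($html, '<!--(S|E):(\w+)-->')) {
    [PSCustomObject]@{
        Type   = $m.Groups[1].Value
        UID    = $m.Groups[2].Value
        Index  = $m.Index
        Length = $m.Length
    }
}

$stack = [System.Collections.Generic.Stack[object]]::new()
$roots = [System.Collections.Generic.List[object]]::new()

foreach ($marker in $markers) {
    if ($marker.Type -eq 'S') {
        $node = [PSCustomObject]@{
            UID        = $marker.UID
            StartIndex = $marker.Index + $marker.Length   # content begins after the marker
            RawHtml    = ''
            Children   = [System.Collections.Generic.List[object]]::new()
        }
        # The node on top of the stack is the current parent, if there is one.
        if ($stack.Count -gt 0) { $stack.Peek().Children.Add($node) } else { $roots.Add($node) }
        $stack.Push($node)
    }
    else {
        if ($stack.Count -eq 0) { throw "Unmatched end marker for '$($marker.UID)'." }
        $node = $stack.Pop()
        if ($node.UID -ne $marker.UID) {
            throw "Nesting error: '$($marker.UID)' closes '$($node.UID)'."
        }
        # Content ends where the end marker begins.
        $node.RawHtml = $html.Substring($node.StartIndex, $marker.Index - $node.StartIndex)
    }
}
if ($stack.Count -gt 0) { throw "Unmatched start marker for '$($stack.Peek().UID)'." }

' - root: {0}' -f $roots[0].UID
' - child html: {0}' -f $roots[0].Children[0].RawHtml
```

Note that a parent's RawHtml still contains the markers of its children. The full script relies on the later comment-stripping step to remove them.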
9. Recursively verify every node (except $root with UID “00000000-0000-0000-0000-000000000000”) exists in $layoutMap. If not, stop with “Layout JSON mismatch: UID … not found in layout map.” Then assign $node.PlaceholderKey = $layoutMap[$node.UID].
10. Create a helper function ConvertHtmlToJsx($htmlContent) that:
- Removes ALL HTML comments with regex “<!--.*?-->” (singleline).
- Replaces `class="([^"]*)"` with `className="$1"`.
- Returns the resulting string.
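A short worked example of the step 10 helper, using an invented one-line HTML sample, shows what the two rules do and do not cover:

```powershell
function ConvertHtmlToJsx {
    param([string]$HtmlContent)
    if (-not $HtmlContent) { return '' }
    # Singleline lets "." span line breaks, so multi-line comments are removed too.
    $noComments = [regex]::Replace($HtmlContent, '<!--.*?-->', '',
        [System.Text.RegularExpressions.RegexOptions]::Singleline)
    # $1 is the regex capture group, not a PowerShell variable (single-quoted string).
    return ($noComments -replace 'class="([^"]*)"', 'className="$1"')
}

# Illustrative input only.
$sample = '<div class="card"><!-- note --><span class="title">Hi</span></div>'
$jsx = ConvertHtmlToJsx $sample
$jsx
# -> <div className="card"><span className="title">Hi</span></div>
```

This is deliberately minimal. Other HTML-to-JSX differences, such as for versus htmlFor, string style attributes, or unclosed void tags like <br>, are not handled, so generated files may still need manual cleanup.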
11. Recursive function Write-ComponentTsx($node, $scriptDir) that:
- Compute $relativeUnix = $node.Path.TrimStart("/"), $relativeWin = $relativeUnix -replace "/", "\".
- $tsPath = ChangeExtension($relativeWin, “.tsx”); $fullPath = Join-Path $scriptDir $tsPath; create its directory if missing.
- $componentName = $node.Name -replace "\s","".
- Open UTF8 StreamWriter to $fullPath.
- Write lines:
    import {
      ComponentParams,
      ComponentRendering,
      Placeholder,
    } from '@sitecore-jss/sitecore-jss-nextjs';
    import React from 'react';
    interface ComponentProps {
      rendering: ComponentRendering & { params: ComponentParams };
      params: ComponentParams;
    }
    const $componentName = (props: ComponentProps): JSX.Element => {
      return (
        <>
- If $node.Children.Count > 0:
  - Group $node.Children by PlaceholderKey into $groups.
  - Let $modified = $node.RawHtml.
  - For each $group in $groups:
    - Compute $minStart = int.MaxValue, $maxEnd = 0.
    - For each $child in $group.Group:
      - Find $idx = $modified.IndexOf($child.RawHtml). If -lt 0, stop with “Could not locate raw HTML for child UID …” error.
      - $minStart = min($minStart, $idx); $endPos = $idx + $child.RawHtml.Length; $maxEnd = max($maxEnd, $endPos).
    - Store a PSCustomObject { Key = $group.Name; Start = $minStart; End = $maxEnd } in array $groupRanges.
  - Sort $groupRanges by Start descending into $sortedRanges.
  - For each $range in $sortedRanges:
    - $before = $modified.Substring(0, $range.Start); $after = $modified.Substring($range.End)
    - $phTag = `<Placeholder name="$($range.Key)" rendering={props.rendering} />`
    - $modified = $before + $phTag + $after
  - $jsxContent = ConvertHtmlToJsx($modified); split on CRLF into $lines; for each $line, writer.WriteLine(" $line").
- Else (no children):
  - $jsxContent = ConvertHtmlToJsx($node.RawHtml); split into $lines; writer.WriteLine(" $line") for each.
- Finally write:
        </>
      );
    };
    export default $componentName;
- Close writer. Write to host “ - Generated component: $tsPath”.
- Recurse: foreach $child in $node.Children, call Write-ComponentTsx($child, $scriptDir).
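The placeholder substitution in step 11 is easier to follow on a small input. The sketch below uses invented parent HTML and three hypothetical children in two placeholder groups. It passes StringComparison.Ordinal to IndexOf so the search compares exact characters rather than using culture rules.

```powershell
# Illustrative data: two children share "col-wide-1", one sits in "col-narrow-1".
$rawHtml = '<div class="row"><p>A</p><p>B</p><aside>C</aside></div>'
$children = @(
    [PSCustomObject]@{ RawHtml = '<p>A</p>';         PlaceholderKey = 'col-wide-1' }
    [PSCustomObject]@{ RawHtml = '<p>B</p>';         PlaceholderKey = 'col-wide-1' }
    [PSCustomObject]@{ RawHtml = '<aside>C</aside>'; PlaceholderKey = 'col-narrow-1' }
)

# One range per placeholder group, spanning from its first child to its last.
$ranges = foreach ($grp in ($children | Group-Object PlaceholderKey)) {
    $minStart = [int]::MaxValue
    $maxEnd = 0
    foreach ($child in $grp.Group) {
        $idx = $rawHtml.IndexOf($child.RawHtml, [System.StringComparison]::Ordinal)
        if ($idx -lt 0) { throw "Child HTML not found for placeholder '$($grp.Name)'." }
        $minStart = [Math]::Min($minStart, $idx)
        $maxEnd = [Math]::Max($maxEnd, $idx + $child.RawHtml.Length)
    }
    [PSCustomObject]@{ Key = $grp.Name; Start = $minStart; End = $maxEnd }
}

# Replace from the end of the string backwards so earlier offsets stay valid.
$modified = $rawHtml
foreach ($range in ($ranges | Sort-Object Start -Descending)) {
    # {{ and }} are escaped braces in a -f format string.
    $tag = '<Placeholder name="{0}" rendering={{props.rendering}} />' -f $range.Key
    $modified = $modified.Substring(0, $range.Start) + $tag + $modified.Substring($range.End)
}
$modified
```

One caveat: IndexOf returns the first occurrence, so two siblings with identical markup would resolve to the same position. Deriving offsets from each child's StartIndex relative to the parent's StartIndex would likely be more robust, though that is a suggestion rather than what the script does.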
12. Under main:
- If $root.Children.Count -eq 0, write warning “No child components …”. Else foreach $child in $root.Children, call Write-ComponentTsx($child, $scriptDir).
13. Generate Layout.tsx in $scriptDir exactly as: