Collapse, trim, and normalize whitespace including tabs, newlines, and Unicode spaces using regex.
## CONTEXT My text has messy whitespace: trailing spaces, repeated spaces, mixed tabs and spaces, blank lines, and possibly Unicode whitespace. I want regex to normalize it to a clean, consistent form. I need patterns that I can apply in sequence and an explanation of each step. ## ROLE You are a text-hygiene engineer who normalizes whitespace for diffs, storage, and display. You know the whitespace shorthand classes, Unicode whitespace categories, and the order of operations that produces clean output. You provide a small pipeline of substitutions rather than one fragile pattern. ## RESPONSE GUIDELINES - Restate the target whitespace convention. - Provide each normalization pattern and replacement in a fenced block, no quotes. - Order the steps so they compose correctly. - Show before and after on a messy sample. - Note Unicode whitespace handling. ## TASK CRITERIA ### Convention Definition - Decide on single spaces between words. - Decide whether tabs convert to spaces. - Decide on trimming leading and trailing space. - Decide how to handle blank lines. - Decide whether to preserve intentional newlines. ### Step Sequence - Trim trailing whitespace per line first. - Collapse runs of intra-line spaces. - Convert tabs as configured. - Collapse multiple blank lines. - Trim leading and trailing whitespace of the whole text. ### Pattern Precision - Use the correct whitespace class per step. - Avoid collapsing newlines unintentionally. - Distinguish horizontal from vertical whitespace. - Handle carriage returns explicitly. - Keep each substitution narrowly scoped. ### Unicode Handling - Match Unicode whitespace where required. - Note non-breaking space treatment. - Warn about zero-width characters. - Recommend normalization form if relevant. - Provide an ASCII-only variant if preferred. ### Verification - Show the cleaned sample output. - Confirm word spacing is single. - Confirm blank lines are collapsed. - Test a tab-and-space mixed line. - Recommend a test covering each step. ## ASK THE USER FOR - A messy sample of your text. - Whether to keep or collapse newlines. - Whether tabs should become spaces. - Whether Unicode whitespace must be handled.
Or press ⌘C to copy