RegEx Cleanup Examples



Removing all but first line

Q: I need an edit option that automatically deletes all text from selected entries EXCEPT the first line. So if an entry has 4 lines, keep just the first one.

A: Use the RegEx option, with "Line By Line" turned OFF, replacing this:

\x0D\x0A.*

with nothing.

The pattern means "the first line-break and all text following it", so only the first line remains.
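
To try the same idea outside the program, here is a minimal sketch using Python's re module. The helper name keep_first_line is made up for illustration. The re.DOTALL flag is a Python-specific assumption: Python's "." does not match a linefeed by default, while the recipe above depends on the match running to the end of the entry.

```python
import re

def keep_first_line(entry: str) -> str:
    """Delete the first CR/LF pair and everything after it (hypothetical helper)."""
    # re.DOTALL is a Python-specific assumption: it lets "." cross line breaks,
    # so ".*" runs to the end of the entry.
    return re.sub(r"\x0D\x0A.*", "", entry, flags=re.DOTALL)

entry = "First line\r\nsecond line\r\nthird line\r\nfourth line"
print(keep_first_line(entry))  # First line
```

An entry with no line-break at all is returned unchanged, because the pattern finds nothing to replace.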


Removing double line-breaks.

Q: The line-break removal is great, but I also want to reduce double breaks, which normally signal a WANTED break between two paragraphs (one break to end the last sentence, and one to separate the paragraphs).

A: Use the regular line-break option in the text clean-up, and add a RegEx stripper that replaces any double break with a single break.

Use the RegEx option, with "Line By Line" turned OFF, replacing this:

\x0D\x0A\x0D\x0A

with

\x0D\x0A
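
The same find/replace pair can be sketched in Python for experimentation. This is an illustration only, not the program's own code, and collapse_double_breaks is a hypothetical name.

```python
import re

def collapse_double_breaks(text: str) -> str:
    """Replace each CR/LF + CR/LF pair with a single CR/LF (hypothetical helper)."""
    # The replacement is a normal (non-raw) string, so Python itself turns
    # "\x0D\x0A" into the two control characters.
    return re.sub(r"\x0D\x0A\x0D\x0A", "\x0D\x0A", text)

text = "End of paragraph one.\r\n\r\nStart of paragraph two."
print(repr(collapse_double_breaks(text)))
# 'End of paragraph one.\r\nStart of paragraph two.'
```

Single breaks are left alone, since the pattern needs two consecutive CR/LF pairs to match.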


Removing triple (and more) line-breaks.

Q: How about removing "more than two in a row"?

A: Use the following:

Find: (\x0D\x0A){3,}   [x] RegEx? 

Repl: \x0D\x0A\x0D\x0A [x] RegEx? 

                       [ ] Line By Line? 

Explanation:

\x0D is a carriage return and \x0A is a linefeed, both written in hexadecimal notation that the RegEx parser understands.

Placing them in parentheses () groups them together, so that the iterator {} treats the pair as a whole. {3,} means "at least 3 of these", where "these" is the preceding group. The comma after the 3 separates min and max. Here min is 3, and max is unlimited. I could have written {3,999} instead, for example. The max is optional, but the comma isn't: without it, {3} would match "exactly 3".
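
The difference between {3,} and {3} can be checked with a small Python sketch. The variable names are invented for this example, and Python's re module stands in for the program's own RegEx parser, which may differ in other details.

```python
import re

BREAK = "\r\n"

at_least_three = re.compile(r"(\x0D\x0A){3,}")  # min 3, no max
exactly_three = re.compile(r"(\x0D\x0A){3}")    # no comma: exactly 3

for count in (2, 3, 5):
    run = BREAK * count
    # fullmatch succeeds only if the whole run fits the pattern
    print(count, bool(at_least_three.fullmatch(run)), bool(exactly_three.fullmatch(run)))
# 2 False False
# 3 True True
# 5 True False
```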

We're replacing each match with 2 carriage-return/linefeed pairs, so a longer run shrinks to a double break.

We're turning off the line-by-line option, so that the linebreaks are visible to the RegEx (otherwise it would not see them).
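
As a rough sketch of why the option matters, the Python below applies the pattern to the whole text and then to each line separately. The second function is only an assumed model of what line-by-line processing does (each line handed to the RegEx without its line-break); the program's actual behaviour may differ, and both helper names are hypothetical.

```python
import re

PATTERN = re.compile(r"(\x0D\x0A){3,}")
TWO_BREAKS = "\x0D\x0A\x0D\x0A"

def squeeze_breaks(text: str) -> str:
    """Whole-text mode: runs of 3+ breaks become exactly 2 (hypothetical helper)."""
    return PATTERN.sub(TWO_BREAKS, text)

def squeeze_breaks_line_by_line(text: str) -> str:
    """Assumed model of line-by-line mode: each line is processed without its break."""
    lines = text.split("\r\n")
    return "\r\n".join(PATTERN.sub(TWO_BREAKS, line) for line in lines)

text = "one\r\n\r\n\r\n\r\ntwo\r\n\r\nthree"
print(repr(squeeze_breaks(text)))                 # 'one\r\n\r\ntwo\r\n\r\nthree'
print(squeeze_breaks_line_by_line(text) == text)  # True
```

In the per-line model the pattern never sees a line-break, so nothing is replaced and the text comes back unchanged.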