Regex Snippet: Useful Regex Substitutions for Reformatting Copied Text Snippets

Replace single spaces after end-of-sentence periods with double spaces:

(rx (seq (group (or (seq (not (any whitespace ".")) (not upper-case))
                    (seq (any whitespace ".") (not word))
                    (seq (not (any whitespace ".")) word))
                (? ")")
                (any ".?!")
                (* (any "'’’‚‚❜❟"
                       "\"””„„❞❠"
                       "_*+/=-"
                       ")]}")))
         " "
         (group (not (any " " lower-case)))))
\(\(?:[^.[:space:]][^[:upper:]]\|[.[:space:]][^[:word:]]\|[^.[:space:]][[:word:]]\))?[!.?][]"')-+/=_}’‚”„❜❞-❠-]*\) \([^ [:lower:]]\)

and replace with

\1  \2

Replace double-hyphens with en-dashes:

\([^-]\)--\([^-]\) → \1–\2

Replace triple-hyphens with em-dashes:

\([^-]\)---\([^-]\) → \1—\2

Remove superscript annotations:

\^{\[/?.*?/?\]} →

Remove trailing double-backslash and spaces:

*\\\\$ →

Fix empty body text hyperlinks:

\]\[\]\] → ]]

Remove trailing whitespace:

*$ →

Remove null links that are commonly inserted by Pandoc in place of HTML self-links:

\[\[https?://\(?:[\w-]\.\)*?[\w-]+/?.*?/null\]\(?:\[\]\)?\] →

Replace YouTube tracking links with plain links:

https://www\.youtube\.com/redirect\?.*?&q=\(.*?\)&v=[a-zA-Z0-9\-\._~%\$!&'()\*\+,;=:@]\{8,24\} → \1

Remove YouTube icon URIs in hyperlink bodies:

\[\[https://www\.gstatic\.com/youtube/img/.*?\.\(?:png\|jpeg\|jpg\|svg\)]] →

Remove wikimedia embedded image of rendered LaTeX:

\[\[https?://wikimedia.org/api/rest_v1/media/math/render/.*?\]\] →

Local Graph

org-roam bd12a34f-f545-4508-ad59-01fb77a2420f Snippets a4053afc-b93b-49d0-bd79-30590c059a4b Regex Snippet: Useful Regex Substitut... bd12a34f-f545-4508-ad59-01fb77a2420f->a4053afc-b93b-49d0-bd79-30590c059a4b