EDDYMENS

Published a year ago

Matching Markdown And HTML Headings Using Regex (JS)

The JavaScript code below matches the different types of headings in a Markdown string. This includes HTML headings, since Markdown allows raw HTML to be embedded in a document.

The different types of headings the script matches are:

  • Markdown hash-based headings: #, ##, ###, ####, etc.
  • Alternative (Setext-style) Markdown headings, underlined with ==== or ------.
  • HTML heading tags: <h1>, <h2>, etc.
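As a rough illustration, each of the three syntaxes can be handled by its own small pattern before they are joined with `|` into a single expression. The variable names below are hypothetical; the patterns mirror the alternatives used in this article's script, and the sample strings are made up for the demonstration.

```javascript
// Illustrative sketch: one regex per heading syntax (names are hypothetical).
const hashHeading = /(?<headerTag>#+)\s+(?<headingText>[^"\n]*)/;
const htmlHeading = /<(?<HTMLHeaderTag>h[1-6]).*?>(?<HTMLHeadingText>.*?)<\/h[1-6]>/;
const altHeading = /(?<altMDHeadingText>.+)\n(?<altMDHeadingTag>-+|=+)/;

// Hash-based heading: the run of # characters is the tag.
console.log("## Getting started".match(hashHeading).groups.headerTag); // "##"

// HTML heading: the text between the tags is captured, spaces included.
console.log("<h2> HTML Header 2 </h2>".match(htmlHeading).groups.HTMLHeadingText); // " HTML Header 2 "

// Underlined heading: the text line comes first, the underline is the tag.
console.log("Title\n=====".match(altHeading).groups.altMDHeadingTag); // "====="
```

Note that none of these patterns is anchored to the start of a line, so a `#` followed by whitespace in the middle of a sentence would also match.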

Script

```html
<script>
// Three alternatives joined with |: hash headings, HTML heading tags,
// and underlined (==== / ----) headings. Named groups label each part.
const regex = /(?<headerTag>#+)\s+(?<headingText>[^"\n]*)|(<(?<HTMLHeaderTag>h[1-6]).*?>(?<HTMLHeadingText>.*?)<\/h[1-6]>)|(((?<altMDHeadingText>.+)\n)(?<altMDHeadingTag>-+|=+))/gm;

const str = `# Overview
AWS S3 is a cloud storage service that caters to the storage needs of modern software applications. S3 buckets can be used to host static sites.

## Getting started
Once you have your AWS account all set up you can log in and then use the search bar up top to search for the S3 service.

### Third-level header
Third-level header content goes here.

#### Forth-level header
Fourth-level content goes here.

Alternative Heading Level 1
===========================
Alternative heading 1 text.

Alternative Heading Level 2
---------------------------
Alternative heading 2 text.

<h1> HTML Header 1 </h1>
Level 1 heading

<h2> HTML Header 2 </h2>
Level 2 heading.

<h6> HTML Header 6 </h6>
Level 6 heading.`;
let m;
let headings = [];
while ((m = regex.exec(str)) !== null) {
  // Avoid an infinite loop on zero-length matches.
  if (m.index === regex.lastIndex) regex.lastIndex++;
  // Only one alternative matches at a time, so pick whichever group is set.
  headings.push({
    headingTag : m.groups.HTMLHeaderTag ?? m.groups.altMDHeadingTag ?? m.groups.headerTag,
    headingText : m.groups.HTMLHeadingText ?? m.groups.altMDHeadingText ?? m.groups.headingText
  });
}
console.log(headings);
</script>
```

Output

```json
[
  {
    "headingTag":"#",
    "headingText":"Overview"
  },
  {
    "headingTag":"##",
    "headingText":"Getting started"
  },
  {
    "headingTag":"###",
    "headingText":"Third-level header"
  },
  {
    "headingTag":"####",
    "headingText":"Forth-level header "
  },
  {
    "headingTag":"===========================",
    "headingText":"Alternative Heading Level 1"
  },
  {
    "headingTag":"---------------------------",
    "headingText":"Alternative Heading Level 2"
  },
  {
    "headingTag":"h1",
    "headingText":" HTML Header 1 "
  },
  {
    "headingTag":"h2",
    "headingText":" HTML Header 2 "
  },
  {
    "headingTag":"h6",
    "headingText":" HTML Header 6 "
  }
]
```
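The `headingTag` values in the output are raw strings (`##`, a run of `=`, `h2`), so building something like a table of contents usually needs one more step. The snippet below is a minimal sketch of that step, assuming the output shape shown above; `headingLevel` is a hypothetical helper and is not part of the original script.

```javascript
// Hypothetical helper: converts a raw headingTag into a numeric level.
function headingLevel(headingTag) {
  // "#" -> 1, "###" -> 3. Runs longer than six are returned as-is,
  // even though Markdown only defines six heading levels.
  if (/^#+$/.test(headingTag)) return headingTag.length;
  // An underline of "=" marks a level 1 heading, "-" a level 2 heading.
  if (/^=+$/.test(headingTag)) return 1;
  if (/^-+$/.test(headingTag)) return 2;
  // "h1" ... "h6" from HTML heading tags.
  const html = /^h([1-6])$/i.exec(headingTag);
  return html ? Number(html[1]) : null; // null for anything unrecognised
}

// A few entries copied from the output above.
const sample = [
  { headingTag: "##", headingText: "Getting started" },
  { headingTag: "===========================", headingText: "Alternative Heading Level 1" },
  { headingTag: "h6", headingText: " HTML Header 6 " },
];

// Attach a numeric level and trim stray whitespace from the text.
const normalized = sample.map(({ headingTag, headingText }) => ({
  level: headingLevel(headingTag),
  text: headingText.trim(),
}));

console.log(normalized);
// [
//   { level: 2, text: "Getting started" },
//   { level: 1, text: "Alternative Heading Level 1" },
//   { level: 6, text: "HTML Header 6" }
// ]
```

Trimming the text here also removes the leading and trailing spaces that the HTML alternative captures between the tags.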
