I'm trying to extract the content between certain HTML tags. Most recently I have been working from the question "How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?". I've tried two or three of the suggestions there, plus another suggestion from a different page, and I cannot get any of them to work.
The regex `<\s*p(\s+.*?>|>).*?<\s*/\s*p\s*>` works inside of an online sed editor, but it doesn't work in my GNU shell.
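One possible explanation, offered as a guess rather than a diagnosis: online testers usually run a PCRE-style engine, whereas GNU sed only understands POSIX basic and extended regular expressions, which have no lazy quantifier such as `.*?`. A minimal sketch, assuming GNU grep built with PCRE support (`-P`) and a made-up sample file in which each paragraph sits on a single line:

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(mktemp)
printf '%s\n' '<h1>Title</h1>' '<p>first</p>' '<div>skip</div>' '<p class="x">second</p>' > "$sample"

# -o prints only the matched text; -P switches grep to Perl-compatible
# regexes, so the lazy quantifier .*? is understood.
matches=$(grep -oP '<\s*p(\s+.*?>|>).*?<\s*/\s*p\s*>' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

This prints the two `<p>…</p>` elements including their tags. It will not match a paragraph that spans several lines, because grep works line by line.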
The pattern `sed -n '/PAT1/,/PAT2/{/PAT2/!p}' FILE`, written as `sed -n '/<p>/,/<\/p>/p' FILE`, seems to fail silently, as it just returns everything in the file.
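A sed range selects whole lines, and when the closing address is a regex it is only tested from the line after the one that opened the range. If `<p>` and `</p>` share a line, the range therefore runs on to the next line containing `</p>`. The sketch below uses a made-up sample file to show that behaviour next to a substitution that keeps only the captured text; it assumes one paragraph per line and no nested tags:

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(mktemp)
printf '%s\n' '<h1>Title</h1>' '<p>first</p>' '<div>skip</div>' '<p>second</p>' '<footer>end</footer>' > "$sample"

# Range form: the range opens on '<p>first</p>' and stays open until a later
# line matches </p>, so the <div> line is printed as well.
range_out=$(sed -n '/<p>/,/<\/p>/p' "$sample")

# Substitution form: print only the text captured between the tags.
# [^<]* assumes the paragraph contains no nested tags.
inner_out=$(sed -n 's/.*<p>\([^<]*\)<\/p>.*/\1/p' "$sample")

printf 'range:\n%s\n\ninner:\n%s\n' "$range_out" "$inner_out"

rm -f "$sample"
```

If the whole document is on one line, the range form prints that entire line, which would look like "everything in the file".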
The pattern `awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file`, written in my shell as `awk '/<p>/{flag=1; next}/<\/p>/{flag=0} flag' file`, returns the file without the matches, but it also contains the rest of the (non-matching) file.
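That symptom is consistent with `<p>` and `</p>` sitting on the same line: `next` skips the rest of the rules for that line, so the closing tag is never tested, `flag` stays set, and the following lines are printed instead. A sketch that sidesteps the flag entirely by using awk's `match()` on each line, assuming plain `<p>` tags with no attributes and no nested tags, on a made-up sample file:

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(mktemp)
printf '%s\n' '<h1>Title</h1>' '<p>first</p>' '<div>skip</div>' '<p>a</p><p>b</p>' > "$sample"

inner_out=$(awk '{
  line = $0
  # Repeatedly find the next <p>...</p> on the line.
  while (match(line, /<p>[^<]*<\/p>/)) {
    # Skip the 3 characters of "<p>" and drop the 4 of "</p>".
    print substr(line, RSTART + 3, RLENGTH - 7)
    # Continue searching after the element just printed.
    line = substr(line, RSTART + RLENGTH)
  }
}' "$sample")
printf '%s\n' "$inner_out"

rm -f "$sample"
```

Because the loop consumes the line piece by piece, several paragraphs on one line are each printed on their own output line.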
[...] `<p>` and `</p>` on a separate line and then tackle the problem e.g. `sed -E 's/<\/?p>/\n&\n/g;H;$!d;x;s/(\n)\n/\1/g;s/\n(\n<\/p>)/\1/g' file|sed -n '/<p>/,/<\/p>/{//!p}'` – potong Jan 31 '23 at 09:40
[...]' operator. |3. match all containing text between two groups with '.*' |4. match closing HTML tag '[...] – Andrew Jan 31 '23 at 15:36
[...].*(?=/[...]/) here's an attempt for another datatype using grep: /(?<=/MHhGRkUw/).*(?=/MHhGRkVG/)/ – Andrew Jan 31 '23 at 15:40
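The lookaround idea in the last comment can be sketched for the `<p>` case as well. This is only an illustration, assuming GNU grep built with PCRE support (`-P`), plain `<p>` tags without attributes, and a made-up sample file; the delimiters are not part of the match, so only the inner text is printed:

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(mktemp)
printf '%s\n' '<h1>Title</h1>' '<p>first</p>' '<div>skip</div>' '<p>a</p><p>b</p>' > "$sample"

# (?<=<p>) requires "<p>" immediately before the match,
# (?=</p>) requires "</p>" immediately after it,
# and .*? keeps the match as short as possible.
inner_out=$(grep -oP '(?<=<p>).*?(?=</p>)' "$sample")
printf '%s\n' "$inner_out"

rm -f "$sample"
```

Like the other line-based approaches, this does not handle a paragraph that is split across lines.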