I've searched, and there are a lot of questions about names, but I couldn't find a solution for the case I'm looking for.
I use jQuery to store the text in a variable when there is a cast in the description.
So I'm trying to get the cast of movies. The problem is that the text also contains categories, which are comma-separated as well and may be wrongly detected as the cast.
Therefore, if I reject text that has more than 3 capitalized letters, I can pick out the correct cast from the returned text.
How the cast is usually laid out in the text:
John Smith, Mary Jane, Neo, Trinity, Morpheus, Mr. Anderson
Sometimes the last name is missing, which is what complicates things: a cast entry can then be confused with a category. Checking against a list of denied words might work better than rejecting capitalized letters, since capitalization is also very common in the categories.
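To make the denied-list idea concrete, here is a minimal sketch of what I have in mind. The function name `looksLikeCast` and the contents of `DENIED_WORDS` are hypothetical (the words are only the ones from my sample categories), and the comparison is case-insensitive so that "HORROR" and "Horror" are treated the same:

```javascript
// Hypothetical deny list: only the category names from the sample post.
const DENIED_WORDS = ["drama", "horror", "sci fi", "tv show", "action"];

// Returns true when none of the comma-separated items is a denied word.
// Assumption: one denied item is enough to treat the whole list as categories.
function looksLikeCast(text, deniedWords = DENIED_WORDS) {
  const denied = new Set(deniedWords.map((word) => word.toLowerCase()));
  return text
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item.length > 0)
    .every((item) => !denied.has(item));
}

console.log(looksLikeCast("John Smith, Mary Jane, Neo, Trinity, Morpheus, Mr. Anderson"));
console.log(looksLikeCast("Drama, HORROR, Sci Fi, TV Show, Action"));
```

The obvious weakness is that the list has to be maintained by hand, and a category that is not on it would still pass as a cast.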
This is my current code:

    // Only look for a cast when a "Cast" label is present.
    var castExists = $('span.post-bold:contains("Cast")');
    var cast = "";
    if (castExists.length) {
        cast = $("div.post-message").text();
        var reg = /^(?!\s)([a-z ,A-Z.'-]+)/gm;
        var getCast = reg.exec(cast);
        if (getCast !== null) {
            cast = getCast[0].trim();
        } else {
            // No match: clear the text instead of keeping the whole message.
            cast = "";
        }
    }
The text I am parsing looks like this:
Title: Movie Title
Production: Something
Year: 2021
Categories: Drama, HORROR, Sci Fi, TV Show, Action
Cast: John Smith, Mary Jane, Neo, Trinity, Morpheus, Mr. Anderson
The labels (Title:, Year:, Cast:, etc.) are inside span.post-bold tags, and everything is inside the div.post-message element.
For example:
<div class="post-message">
<span class="post-bold">Title</span>
: Movie Title
<span class="post-bold">Year</span>
: 2021
<span class="post-bold">Categories</span>
: Drama, HORROR, Sci Fi, TV Show, Action
<span class="post-bold">Cast</span>
: John Smith, Mary Jane, Neo, Trinity, Morpheus, Mr. Anderson
</div>
Since the order depends on how each user wrote the post, the fields may appear in a different order.
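Because the order is not fixed, one option might be to look the value up by its label instead of guessing from the shape of the text. The following is only a sketch under the assumption that every field keeps its "Label: value" form on one line; `valuesForLabel` is a hypothetical helper name, and the input is the string returned by `.text()`:

```javascript
// Returns the comma-separated values that follow a label such as "Cast",
// or an empty array when the label is not found at the start of a line.
function valuesForLabel(text, label) {
  // Escape regex metacharacters so the label is matched literally.
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The label must start the string or a line; whitespace (including a line
  // break) may sit between the label and the colon.
  const pattern = new RegExp(
    "(?:^|\\n)\\s*" + escaped + "\\s*:[ \\t]*([^\\n]+)"
  );
  const match = pattern.exec(text);
  if (match === null) {
    return [];
  }
  return match[1]
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

const sampleText = [
  "Title: Movie Title",
  "Categories: Drama, HORROR, Sci Fi, TV Show, Action",
  "Cast: John Smith, Mary Jane, Neo, Trinity, Morpheus, Mr. Anderson",
].join("\n");

console.log(valuesForLabel(sampleText, "Cast"));
```

With this approach the categories can never be mistaken for the cast, because only the line that begins with the requested label is read. It does not help if a post has no "Cast" label at all.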
Here is the last regex I tried to write, which didn't work:
([A-Z][a-z]{1,}( |, )([A-Z][a-z]{1,})?)+
Update:
I created this link on regex101.com with the examples, since the comments suggest my question wasn't clear. I think this gives people a better chance of helping me. The lines with names should match; the lines with categories must not.
PS: In the link, I used the regular expression Mohammad suggested in the comments.