Problem: I have a snippet of code (typically a few lines) in a language like Java, C++, etc. (not limited to any particular language).
I need to extract words from it that seem like unique identifiers - variable names, function/method names, class names, etc.
Obviously that means skipping all the whitespace, newlines, brackets, punctuation, and, importantly, keywords.
I realize it's somewhat similar to this question: Sanitize/Rewrite HTML on the Client Side
I guess some modification of that code using regexes could get me something approximately good enough. But I wonder if there's a better (cleaner, shorter) way?
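To make the regex idea concrete, here is a rough sketch of what I have in mind, written in Python only for brevity. The function name extract_identifiers and the KEYWORDS set are my own placeholders: the keyword list is a small hand-picked mix of Java and C++ keywords, not a complete one, and the comment/string stripping assumes C-style syntax.

```python
import re

# Assumption: a small, hand-picked subset of Java/C++ keywords.
# A real list would be longer and language-specific.
KEYWORDS = {
    "if", "else", "for", "while", "do", "switch", "case", "default",
    "break", "continue", "return", "class", "struct", "new", "delete",
    "public", "private", "protected", "static", "final", "const",
    "void", "int", "long", "short", "char", "float", "double",
    "boolean", "bool", "true", "false", "null", "this", "try",
    "catch", "throw", "import", "namespace", "using",
}

# Comments and string/char literals are removed first, so that words
# inside them are not reported as identifiers (C-style syntax assumed).
NOISE = re.compile(
    r'//[^\n]*'                  # line comment
    r'|/\*.*?\*/'                # block comment
    r'|"(?:\\.|[^"\\\n])*"'      # string literal
    r"|'(?:\\.|[^'\\\n])*'",     # character literal
    re.DOTALL,
)

# A word that starts with a letter or underscore; the leading \b keeps
# suffixes of numeric literals such as 0x1F or 10L from matching.
IDENT = re.compile(r'\b[A-Za-z_][A-Za-z0-9_]*')


def extract_identifiers(snippet):
    """Return the non-keyword words of snippet, in first-seen order."""
    cleaned = NOISE.sub(" ", snippet)
    found = []
    for word in IDENT.findall(cleaned):
        if word not in KEYWORDS and word not in found:
            found.append(word)
    return found
```

This obviously cannot tell a variable from a type or a method name, and it will misfire on languages with different comment or string syntax, which is why I am asking whether something cleaner exists.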