First, note that programming language syntax is not regular and thus cannot be parsed with a regular expression. It is context-free, so you need at least a context-free grammar to parse it. You might get by with a regular expression for simple cases (i.e., a subset of the true syntax), but it is impossible to write one that works in all cases.
That said, this works for the case you gave:
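val split = source split """(?s)/\*\*|\*/"""
val parts =
  split.grouped(2).flatMap {
    // a full pair: a piece of code followed by the body of a comment
    case Array(code, comment) => Seq(code, "/**" + comment + "*/")
    // a trailing piece of code with no comment after it
    case code => code.toSeq
  }
  .map(_.trim)
  .filter(_.nonEmpty)
  .toList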
The variable parts then contains the 4 strings you specified.
This expression will fail on, for example, an input where /** is contained inside a javadoc comment (/** /** */) or where one is found between the quotation marks of a string literal (val s = " /** ").
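To make both the working case and the failure modes concrete, here is a minimal, self-contained sketch. The object name SplitDemo, the helper name splitComments, and the sample inputs are all made up for illustration; the first input is not the one from the question, just something of the same shape.

```scala
object SplitDemo {
  // Hypothetical helper: wraps the split-and-regroup approach from this answer.
  def splitComments(source: String): List[String] =
    source.split("""(?s)/\*\*|\*/""")
      .grouped(2)
      .flatMap {
        // code followed by the body of a comment: restore the delimiters
        case Array(code, comment) => Seq(code, "/**" + comment + "*/")
        // trailing code with no comment after it
        case code => code.toSeq
      }
      .map(_.trim)
      .filter(_.nonEmpty)
      .toList

  def main(args: Array[String]): Unit = {
    // Well-behaved input (assumed shape): comments and code alternate.
    val ok = "/** Doc for A */\nclass A\n/** Doc for B */\nclass B"
    println(splitComments(ok))
    // List(/** Doc for A */, class A, /** Doc for B */, class B)

    // A comment opener inside a string literal tears the literal in two.
    println(splitComments("val s = \" /** \""))
    // List(val s = ", /** "*/)

    // A nested opener is silently dropped.
    println(splitComments("/** /** */"))
    // List(/** */)
  }
}
```

In the last two calls the output no longer matches the input: the string literal is split and half of it is reported as a comment, and the inner /** disappears entirely. Handling these correctly requires tracking whether you are currently inside a string or a comment, which is the job of a real lexer or parser.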