Here are three options: the first provides a regex that does what you want, and the other two parse the URL with an alternative to regex that handles URL component encoding/decoding correctly.
Parsing using Regex
NOTE: The regex method is unsafe in most use cases because it does not parse the URL into components and then decode each component separately. You normally cannot decode the whole URL into one string and then parse it safely, because some decoded characters might confuse the regex later. This is similar to parsing XHTML using regex (as described here). See the alternatives to regex below.
Here is a cleaned-up regex that handles more URLs safely. At the end of this post is a unit test you can use with each method.
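To make the pitfall concrete, here is a minimal sketch of what goes wrong when the whole URL is decoded before matching. The `naiveFindCode` helper, its simplified regex, and the sample URL are invented for illustration and are not part of the question. An encoded `&` inside the value turns into a real separator, so the value is silently truncated:

```kotlin
import java.net.URLDecoder

// Hypothetical, simplified pattern used only for this illustration
val NAIVE_CODE_REGEX = """[?&]code=([^#&]+)""".toRegex()

// Hypothetical helper: decodes the WHOLE URL first, then matches (the unsafe order)
fun naiveFindCode(url: String): String? {
    val decoded = URLDecoder.decode(url, "UTF-8") // %26 becomes a literal '&'
    return NAIVE_CODE_REGEX.find(decoded)?.groups?.get(1)?.value
}

fun main() {
    // The intended value of "code" is "a&b", encoded as a%26b
    println(naiveFindCode("xeno://soundcloud?code=a%26b")) // prints "a", not "a&b"
}
```

Splitting the query into parameters first and decoding each name and value afterwards avoids this, which is what the two alternatives below do.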
private val SECRET_CODE_REGEX = """xeno://soundcloud[/]?.*[\?&]code=([^#&]+).*""".toRegex()
fun findSecretCode(withinUrl: String): String? =
    SECRET_CODE_REGEX.matchEntire(withinUrl)?.groups?.get(1)?.value
This regex handles these cases:
- with and without trailing / in path
- with and without fragment
- parameter as first, middle or last in list of parameters
- parameter as only parameter
Note that the idiomatic way to make a regex in Kotlin is someString.toRegex(). It and other extension methods can be found in the Kotlin API Reference.
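As a small illustration of the idiom (the `RANGE_REGEX` name, pattern, and inputs are made up here), a raw triple-quoted string avoids double-escaping backslashes, and `matchEntire` returns `null` when the whole input does not match:

```kotlin
// Hypothetical pattern for illustration: two numbers separated by a dash
val RANGE_REGEX = """(\d+)-(\d+)""".toRegex()

fun main() {
    // groupValues holds the full match followed by each capture group
    println(RANGE_REGEX.matchEntire("12-34")?.groupValues) // prints [12-34, 12, 34]

    // matchEntire requires the whole input to match, so this yields null
    println(RANGE_REGEX.matchEntire("12-")) // prints null
}
```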
Parsing using UriBuilder or similar class
Here is an example using the UriBuilder from the Klutter library for Kotlin. This version handles encoding/decoding, including the more modern JavaScript Unicode encodings that the Java standard URI class (which has many issues) does not handle. It is safe and easy, and you don't need to worry about any special cases.
Implementation:
fun findSecretCode(withinUrl: String): String? {
    fun isValidUri(uri: UriBuilder): Boolean = uri.scheme == "xeno"
            && uri.host == "soundcloud"
            && (uri.encodedPath == "/" || uri.encodedPath.isNullOrBlank())

    val parsed = buildUri(withinUrl)
    return if (isValidUri(parsed)) parsed.decodedQueryDeduped?.get("code") else null
}
The Klutter uy.klutter:klutter-core-jdk6:$klutter_version artifact is small and includes some other extensions, including the modernized URL encoding/decoding. (For $klutter_version, use the most current release.)
Parsing with JDK URI Class
This version is a little longer and shows that you need to parse the raw query string yourself, decode after parsing, and then find the query parameter:
import java.net.URI
import java.net.URLDecoder

fun findSecretCode(withinUrl: String): String? {
    fun isValidUri(uri: URI): Boolean = uri.scheme == "xeno"
            && uri.host == "soundcloud"
            && (uri.rawPath == "/" || uri.rawPath.isNullOrBlank())

    val parsed = URI(withinUrl)
    return if (isValidUri(parsed)) {
        // rawQuery is null when the URL has no query string, so use safe calls
        parsed.rawQuery?.split('&')?.map {
            // limit = 2 keeps any '=' inside the value intact
            val parts = it.split('=', limit = 2)
            val name = parts.firstOrNull() ?: ""
            val value = parts.drop(1).firstOrNull() ?: ""
            URLDecoder.decode(name, Charsets.UTF_8.name()) to URLDecoder.decode(value, Charsets.UTF_8.name())
        }?.firstOrNull { it.first == "code" }?.second
    } else null
}
This could be written as an extension on the URI class itself:
fun URI.findSecretCode(): String? { ... }
In the body, remove the parsed variable and use this instead, since you already have the URI (well, you ARE the URI). Then call it using:
val secretCode = URI(myTestUrl).findSecretCode()
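For reference, one way the extension body might look when written out in full. This is a sketch that follows the same logic as the JDK URI version, not the only possible form; the sample URLs in `main` are for illustration:

```kotlin
import java.net.URI
import java.net.URLDecoder

// Sketch of the extension form: `this` is the already-parsed URI
fun URI.findSecretCode(): String? {
    val isValid = scheme == "xeno" &&
        host == "soundcloud" &&
        (rawPath == "/" || rawPath.isNullOrBlank())
    if (!isValid) return null

    // rawQuery is null when the URI has no query part
    return rawQuery?.split('&')
        ?.map { param ->
            // Split into name and value first, then decode each part separately
            val parts = param.split('=', limit = 2)
            val name = URLDecoder.decode(parts[0], "UTF-8")
            val value = URLDecoder.decode(parts.getOrElse(1) { "" }, "UTF-8")
            name to value
        }
        ?.firstOrNull { it.first == "code" }
        ?.second
}

fun main() {
    println(URI("xeno://soundcloud/?cat=hairless&code=secret_code_data#frag").findSecretCode())
    // prints secret_code_data
    println(URI("http://www.soundcloud.com/?code=secret_code_data").findSecretCode())
    // prints null
}
```

Because each parameter is split before it is decoded, an encoded `&` or `=` inside the value (for example `code=a%26b`) comes back intact as `a&b`.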
Unit Tests
Given any of the functions above, run this test to prove it works:
// Assumes kotlin-test (with a JUnit binding) is on the test classpath
import kotlin.test.Test
import kotlin.test.assertEquals
import kotlin.test.assertNotEquals

class TestSo34594605 {
    @Test fun testUriBuilderFindsCode() {
        // positive test cases
        val testUrls = listOf("xeno://soundcloud/?code=secret_code_data#",
                "xeno://soundcloud?code=secret_code_data#",
                "xeno://soundcloud/?code=secret_code_data",
                "xeno://soundcloud?code=secret_code_data",
                "xeno://soundcloud?code=secret_code_data&other=fish",
                "xeno://soundcloud?cat=hairless&code=secret_code_data&other=fish",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data&other=fish",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data&other=fish#fragment"
        )
        testUrls.forEach { test ->
            assertEquals("secret_code_data", findSecretCode(test), "source URL: $test")
        }

        // negative test cases, so we don't match things by accident
        val badUrls = listOf("xeno://soundcloud/code/secret_code_data#",
                "xeno://soundcloud?hiddencode=secret_code_data#",
                "http://www.soundcloud.com/?code=secret_code_data")
        badUrls.forEach { test ->
            assertNotEquals("secret_code_data", findSecretCode(test), "source URL: $test")
        }
    }
}