-2

I tried to learn regex to do this simple task, I tested may patterns using regex101.com editor but with no success.

I want to extract this link (http://mp3lg4.tdf-cdn.com/9243/lag_164753.mp3) from this javascript text, please note that the links doesn't always end with mp3, it could end with anything.

JavaScript Text:

    function reqListener () {
        var div = document.createElement("div");
        div.innerHTML = new XMLSerializer().serializeToString(this.responseXML.documentElement);
        document.body.insertBefore(div, document.body.childNodes[0]);
    }

    var oReq = new XMLHttpRequest();
    oReq.addEventListener("load", reqListener);
    oReq.open("GET", "//static.radio.fr/inc/v2/images/icons/icon-sprites.svg?_=93cbcb9ebf4e2d5276480a0d9c06c653056f0d85");
    oReq.send();

    var environment = {
        develop: false,
        production: true,
        debug: false
    };
    if (window.environment && window.environment.production) {
        window.console.debug = function() {};
        window.console.log = function() {};
    }
    var require = {
        baseUrl: "/inc/v2/js",
        config: {
            'logger': {
                enabled: false,
                filter: (window.environment && window.environment.develop) && (window.location.search.indexOf('test_production=') === -1) ? 'debug' : 'info'
            },
            'components/station/stationService': {
                station: {"continent":"Europe","country":"France","logo300x300":"http://static.radio.fr/images/broadcasts/15/43/8275/1/c300.png","city":"Paris","stationType":"radio_station","description":"Virgin Radio est une station de radio musicale privée Française. Elle a été créée en 2008, suite au changement de nom de la radio Europe 2, et fait partie du groupe Lagardère SCA. La radio cible une audience de jeunes adultes grâce aux hits Electro-Rock et Pop qu’elle propose. L’audience de la chaîne dépasse les 2,7 millions d’auditeurs quotidiens.\r\nCette radio FM est disponible dorénavant par internet grâce à ses flux de diffusion MP3 de 64 et 128 kbps.\r\nAprès son passage à vide du début des années 2010, Virgin Radio revient en force avec son son “Pop - Rock - Electro”.","language":["Français"],"logo100x100":"http://static.radio.fr/images/broadcasts/15/43/8275/1/c100.png","streamUrls":[{"streamUrl":"http://mp3lg4.tdf-cdn.com/9243/lag_164753.mp3","loadbalanced":false,"metaDataAvailable":false,"playingMode":"STEREO","type":"STREAM","sampleRate":44100,"streamContentFormat":"MP3","bitRate":128,"idBroadcast":8275,"sortOrder":0,"streamFormat":"ICECAST","id":47609,"streamStatus":"VALID","contentType":"audio/mpeg"},{"streamUrl":"http://mp3lg3.scdn.arkena.com/10490/virginradio.mp3","loadbalanced":false,"metaDataAvailable":false,"playingMode":"STEREO","type":"STREAM","sampleRate":44100,"streamContentFormat":"MP3","bitRate":64,"idBroadcast":8275,"sortOrder":1,"streamFormat":"ICECAST","id":57003,"streamStatus":"VALID","contentType":"audio/mpeg"}],"playable":"PLAYABLE","genres":["Pop","Rock"],"logo175x175":"http://static.radio.fr/images/broadcasts/15/43/8275/1/c175.png","adParams":{"st_city":["Paris"],"languages":["Français"],"genres":["Pop","Rock"],"topics":[],"st_cont":["Europe"],"station":["virginradio"],"family":["Virgin"],"st_region":[],"type":["radio_station"],"st_country":["France"]},"alias":"Virgin;;Virgin Radio;;103.5;;103,5;Pop Rock Electro","rank":8,"id":8275,"types":["Radio FM"],"website":"http://www.virginradio.fr/","topics":[],"shortDescription":"Virgin Radio propose d'écouter le meilleur des sons “Pop - Rock - Electro”","logo44x44":"http://static.radio.fr/images/broadcasts/15/43/8275/1/c44.png","numberEpisodes":0,"podcastUrls":[],"hideReferer":false,"name":"Virgin Radio Officiel","subdomain":"virginradio","lastModified":"2018-05-10T03:18:17.000Z","family":["Virgin"],"region":"","frequencies":[{"area":"Abbeville","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2299,"frequency":99.6},{"area":"Agen","broadcastId":8275,"frequencyType":"FM","cityId":4416,"id":2317,"frequency":89.8},{"area":"Ajaccio","broadcastId":8275,"frequencyType":"FM","cityId":165,"id":2370,"frequency":99.8},{"area":"Alençon","broadcastId":8275,"frequencyType":"FM","cityId":2956,"id":2424,"frequency":100.9},{"area":"Allos","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2465,"frequency":105.4},{"area":"Amiens","broadcastId":8275,"frequencyType":"FM","cityId":185,"id":2502,"frequency":93.6},{"area":"Angers","broadcastId":8275,"frequencyType":"FM","cityId":193,"id":2542,"frequency":94.8},{"area":"Angoulême","broadcastId":8275,"frequencyType":"FM","cityId":1975,"id":2564,"frequency":100.3},{"area":"Annecy","broadcastId":8275,"frequencyType":"FM","cityId":198,"id":2587,"frequency":100.5},{"area":"Annemasse","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2594,"frequency":90.1},{"area":"Arcachon","broadcastId":8275,"frequencyType":"FM","cityId":5789,"id":2644,"frequency":94.1},{"area":"Argentan","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2668,"frequency":96.1},{"area":"Arras","broadcastId":8275,"frequencyType":"FM","cityId":2721,"id":2708,"frequency":91.9},{"area":"Aubenas","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2759,"frequency":106.9},{"area":"Aubusson","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2783,"frequency":101.8},{"area":"Auch","broadcastId":8275,"frequencyType":"FM","cityId":230,"id":2799,"frequency":100.2},{"area":"Aurillac","broadcastId":8275,"frequencyType":"FM","cityId":237,"id":2845,"frequency":89},{"area":"Autun","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2861,"frequency":87.6},{"area":"Auxerre","broadcastId":8275,"frequencyType":"FM","cityId":240,"id":2881,"frequency":98.9},{"area":"Avallon","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2904,"frequency":90.8},{"area":"Avignon","broadcastId":8275,"frequencyType":"FM","cityId":241,"id":2927,"frequency":89},{"area":"Avranches","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":2938,"frequency":89},{"area":"Bar-le-Duc","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3007,"frequency":102},{"area":"Barcelonnette","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3027,"frequency":94},{"area":"Bastia","broadcastId":8275,"frequencyType":"FM","cityId":1962,"id":3066,"frequency":107.2},{"area":"Bayeux","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3078,"frequency":101.7},{"area":"Bayonne","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3095,"frequency":97.7},{"area":"Beauvais","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3129,"frequency":103.5},{"area":"Belfort","broadcastId":8275,"frequencyType":"FM","cityId":303,"id":3163,"frequency":98.4},{"area":"Bellegarde-sur-Valserine","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3186,"frequency":103.1},{"area":"Belley","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3199,"frequency":96.1},{"area":"Bergerac","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3217,"frequency":93.2},{"area":"Besançon","broadcastId":8275,"frequencyType":"FM","cityId":324,"id":3261,"frequency":100.4},{"area":"Béthune","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3276,"frequency":90.1},{"area":"Blois","broadcastId":8275,"frequencyType":"FM","cityId":344,"id":3325,"frequency":97.2},{"area":"Bonnières-sur-Seine","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3358,"frequency":88.8},{"area":"Bordeaux","broadcastId":8275,"frequencyType":"FM","cityId":360,"id":3387,"frequency":94.3},{"area":"Boulogne-sur-Mer","broadcastId":8275,"frequencyType":"FM","cityId":365,"id":3419,"frequency":91.5},{"area":"Bourg-en-Bresse","broadcastId":8275,"frequencyType":"FM","cityId":1991,"id":3442,"frequency":96.3},{"area":"Bourges","broadcastId":8275,"frequencyType":"FM","cityId":366,"id":3475,"frequency":99.6},{"area":"Brest","broadcastId":8275,"frequencyType":"FM","cityId":379,"id":3526,"frequency":96.5},{"area":"Briançon","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3554,"frequency":96},{"area":"Brioude","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3577,"frequency":89.8},{"area":"Brive-la-Gaillarde","broadcastId":8275,"frequencyType":"FM","cityId":1996,"id":3600,"frequency":88.1},{"area":"Caen","broadcastId":8275,"frequencyType":"FM","cityId":413,"id":3628,"frequency":96.8},{"area":"Cahors","broadcastId":8275,"frequencyType":"FM","cityId":10713,"id":3647,"frequency":96.8},{"area":"Calvi","broadcastId":8275,"frequencyType":"FM","cityId":10711,"id":3686,"frequency":106.7},{"area":"Cannes","broadcastId":8275,"frequencyType":"FM","cityId":423,"id":3717,"frequency":88.1},{"area":"Carcassonne","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3753,"frequency":96},{"area":"Carhaix-Plouguer","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3756,"frequency":106.8},{"area":"Carpentras","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3766,"frequency":103.3},{"area":"Castelnaudary","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3789,"frequency":102.3},{"area":"Castres","broadcastId":8275,"frequencyType":"FM","cityId":440,"id":3804,"frequency":102.4},{"area":"Cauterets","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3810,"frequency":94.1},{"area":"Chalon-sur-Saône","broadcastId":8275,"frequencyType":"FM","cityId":8551,"id":3872,"frequency":97.8},{"area":"Châlons-en-Champagne","broadcastId":8275,"frequencyType":"FM","cityId":11051,"id":3893,"frequency":95.5},{"area":"Chamonix","broadcastId":8275,"frequencyType":"FM","cityId":1986,"id":3939,"frequency":98.3},{"area":"Charleville-Mézières","broadcastId":8275,"frequencyType":"FM","cityId":459,"id":3962,"frequency":99.9},{"area":"Charolles","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3969,"frequency":95.1},{"area":"Chartres","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":3987,"frequency":103.3},{"area":"Château-du-Loir","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":4002,"frequency":103.7},{"area":"Château-Thierry","broadcastId":8275,"frequencyType":"FM","cityId":0,"id":4021,"frequency":102.4},{"area":"Châteaubriant","broadcastId":8275,"frequencyType":"FM","cityId":6200,"id":4030,"frequency":88.6},{"area":"…

I want to apply this regex pattern in this code:

Public Function regExInput(myPatern As String, myInput As String) As String

Dim strPattern As String: strPattern = myPatern
Dim strReplace As String: strReplace = ""
Dim regEx As New RegExp
regExInput = myInput

If strPattern <> "" Then
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = strPattern
    End With

    If regEx.Test(regExInput) Then
        MsgBox (regEx.Replace(regExInput, strReplace))
    Else
        MsgBox ("Not matched")
    End If
End If
End Function
halfer
  • 19,824
  • 17
  • 99
  • 186
Daiwik Dan
  • 195
  • 10
  • What are some patterns you have tried? What was your closest pattern? – Huynh May 10 '18 at 13:35
  • I used "streamUrl.+128" and it gave me streamUrls":[{"streamUrl":"http://mp3lg4.tdf-cdn.com/9243/lag_164753.mp3","loadbalanced":false,"metaDataAvailable":false,"playingMode":"STEREO","type":"STREAM","sampleRate":44100,"streamContentFormat":"MP3","bitRate":128 – Daiwik Dan May 10 '18 at 13:43
  • yes it's JSON inside a javascript, I will edit and post the whole thing. – Daiwik Dan May 10 '18 at 13:44
  • 2
    Is regex a requirement? If not you should be able to parse the JSON and get the value of streamUrl easily – Huynh May 10 '18 at 13:45
  • no a regex is not a requirement, but i am scraping a website using VBA xmlHTTP, and I scrapped everything i need except this link, any closer answer will help, then maybe i can use excel functions to filter the result even further. – Daiwik Dan May 10 '18 at 13:48
  • I'm confused by *here is part of the javascript in which the link exist, the link looks like this "streamUrls":[{"streamUrl":"http://mp3lg4.tdf-cdn.com/9243/lag_164753.mp3"* is this a string or what? It looks like JSON. Just use `JSON.parse`? what is the "text"? What's VBA got to do with anything? – Liam May 10 '18 at 13:51
  • The more I read your question the more confusing it is. Please be clear on what your trying to achieve – Liam May 10 '18 at 13:53
  • that JSON exist inside a javascript function, i don't know if JSON will understand the javascript text file. also i don't know where should I use JSON.parse? i am using VBA, i downloaded the javascript function and i want to extract that link from it. – Daiwik Dan May 10 '18 at 13:54
  • 1
    So why did you tag `javascript`? – revo May 10 '18 at 13:55
  • go to http://virginradio.radio.fr/ and view the page source, do a search for "streamurl" and you will see the link i want to extract. that's my goal. – Daiwik Dan May 10 '18 at 13:55
  • revo idk, maybe because someone knows how to scrap javascript text!! – Daiwik Dan May 10 '18 at 13:57
  • OK JSON is not Javascript (text). [They are different things that happen to look similar..](https://stackoverflow.com/questions/8294088/javascript-object-vs-json) – Liam May 10 '18 at 14:05
  • Wait your trying to parse actual javascript code using VBA? Is that what your trying to do? Regex is going to fail miserably at this. – Liam May 10 '18 at 14:07
  • Yes that's what i am trying to do :/, any suggestions please? – Daiwik Dan May 10 '18 at 14:08
  • Why? If your trying to learn regex this isn't the right place to start. – Liam May 10 '18 at 14:09
  • no I tried to learn regex to solve a problem, I thought regex is a good solution, my problem is to extract that link. – Daiwik Dan May 10 '18 at 14:11
  • This will work but I would want to rely on it https://regexr.com/3p8i5. You'd need to select the second section group. I don't know VBA so that's as far as I can go – Liam May 10 '18 at 14:12
  • 1
    that pattern doesn't extract the link, i mean stream link, anyway thanks, i will play with it and hope i can find the right pattern. – Daiwik Dan May 10 '18 at 14:15

1 Answers1

2

Here is a regex that should do what you want:

/{\s*["']\s*streamUrl\s*["']\s*:\s*["']\s*(http[^"']+)/

Try it out here: https://regex101.com/r/sAXaOE/1

The match is in Group 1 (look under match info on the right).

And in VBA it might be something like this:

Dim myRegExp, myMatches, myMatch
Set myRegExp = New RegExp
myRegExp.IgnoreCase = True
myRegExp.Global = True
myRegExp.Pattern = "{\s*[""']\s*streamUrl\s*[""']\s*:\s*[""']\s*(http[^""']+)"
Set myMatches = myRegExp.Execute(SubjectString)
For Each myMatch In myMatches
    For I = 1 To myMatch.SubMatches.Count
        'backreference text: myMatch.SubMatches(I-1)
    Next
Next
Jaifroid
  • 437
  • 3
  • 6