I'm trying to parse an HTML page that contains additional JavaScript or jQuery, extract the class names, and then replace them with random characters. I can easily extract the class names, but replacing them causes trouble. I have this code so far:

import re

class_ids = [tag.split() for tag in re.findall(r'class=["\']([a-zA-Z0-9_\s-]+)["\']', html_page)]
class_ids = {item for sublist in class_ids for item in sublist}

For each class I'll generate a corresponding random class name (e.g. footer : sjrh13li). Simply replacing the string footer throughout the file would also replace it in body text, and a class name like title would also turn the tag <title></title> into <cjir4331></cjir4331>. I've tried replacing the whole attribute, like class="title" => class="cjir4331", but this doesn't handle cases like class="title huge", because I need to detect the classes title and huge separately and replace each one. The HTML is combined with JavaScript code, so document.getElementsByClassName('someClass') must be converted to document.getElementsByClassName('noleretko4356').
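The mapping step described above could be sketched roughly as follows. The helper name `build_class_map` and its `length` and `seed` parameters are hypothetical and not part of the original code. The sketch assumes each generated name should start with a letter, since a CSS class name that begins with a digit has to be escaped in selectors.

```python
import random
import string


def build_class_map(class_names, length=8, seed=None):
    """Map each class name to a unique random replacement (hypothetical helper)."""
    rng = random.Random(seed)  # pass a seed only when reproducible output is wanted
    alphabet = string.ascii_lowercase + string.digits
    mapping = {}
    used = set()
    for name in sorted(class_names):
        while True:
            # Start with a letter so the result is usable as a CSS class name.
            candidate = rng.choice(string.ascii_lowercase) + "".join(
                rng.choice(alphabet) for _ in range(length - 1)
            )
            # Avoid duplicates and collisions with existing class names.
            if candidate not in used and candidate not in class_names:
                break
        used.add(candidate)
        mapping[name] = candidate
    return mapping
```

Calling `build_class_map(class_ids)` on the set extracted earlier would give a dictionary such as `{'footer': ..., 'title': ...}` that a later replacement pass can look names up in.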

Is there any way around this?

jh314
sstevan
  • That's a pretty complex request for a pretty useless result. Are you only doing this to obfuscate CSS? – Domino Apr 29 '15 at 13:23
  • @JacqueGoupil Yeah...I know it's useless. Can't explain it to my boss :) Well yes but I can't use Javascript to obfuscate CSS and it has to be random. – sstevan Apr 29 '15 at 13:28
  • Sounds like you should be using some sort of HTML parser - parsing HTML with regexp's never works out well. – max Apr 29 '15 at 17:38
  • Well...I've found this [link](http://htmlmuncher.com/). Works for me. – sstevan Apr 30 '15 at 18:11

1 Answer

I'm not sure I understand what you are trying to do. Are you trying to obfuscate your code before launching it in production? Do you need to preserve the CSS associated with those elements?

Does something like this help?

https://jsfiddle.net/p36pre5o/6/

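// generateUUID() is not a jQuery function; it is assumed to be defined elsewhere (e.g. in the linked fiddle).
$('#btn').click(function(){
    var guid = generateUUID();
    // Swap the class "test" for the generated value on every matching element.
    $('.test').removeClass('test').addClass(guid);
});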

This code finds all elements with the class test and replaces that class with a random GUID when the #btn element is clicked.
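If the renaming has to happen ahead of time in Python, as in the question, rather than in the browser, one possible approach is to rewrite only the values inside class="..." attributes and getElementsByClassName(...) calls, token by token. The following is a minimal sketch under that assumption. The function name `rename_classes` and the two patterns are hypothetical, and it deliberately ignores CSS selectors, jQuery selectors such as $('.test'), and attributes such as data-class that happen to end in "class".

```python
import re

# Quoted value of a class attribute, e.g. class="title huge".
CLASS_ATTR = re.compile(r'''(\bclass\s*=\s*)(["'])(.*?)\2''')
# Quoted argument of getElementsByClassName('...').
JS_CALL = re.compile(r'''(getElementsByClassName\(\s*)(["'])(.*?)\2''')


def rename_classes(source, mapping):
    """Replace class names only where they appear as class tokens."""
    def swap(match):
        prefix, quote, value = match.groups()
        # Handle each space-separated class on its own; unknown names stay as-is.
        renamed = " ".join(mapping.get(tok, tok) for tok in value.split())
        return f"{prefix}{quote}{renamed}{quote}"

    source = CLASS_ATTR.sub(swap, source)
    return JS_CALL.sub(swap, source)
```

With a mapping like `{"title": "cjir4331"}`, this leaves `<title>` tags and body text alone, because only the quoted values in those two contexts are touched. A real HTML parser would be more robust than regular expressions for the attribute part.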

Simon ML