
I'm trying to replace the links in HTML content with custom links that I generate. I tried both the method below and a regex replacement; the regex replacement was even slower.

This method takes about 500 ms to execute in my Spring MVC application. The linkRepository is a MongoRepository, and I'm using spring-data-mongodb.

Does anyone know how to speed this up?

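    // Parse the HTML and select every anchor that has an href attribute.
    Document doc = Jsoup.parse(response.getContent());
    Elements tags = doc.select("a[href]");
    for (Element tag : tags) {
        // Persist a Link for the original URL so that it gets an ID.
        String url = tag.attr("href");
        Link link = new Link();
        link.setUrl(url);
        link.setDelivery(d);
        linkRepository.save(link);
        // Point the anchor at the tracking endpoint instead.
        url = linkTo(methodOn(TrackController.class).actionTrack(link.getId(), d.getId(), null)).toUri().toString();
        tag.attr("href", url);
    }
    // Serialize the modified document.
    String content = doc.outerHtml();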

Update:

I also tried saving all the links at once instead of saving them inside the loop, using a generated ID rather than the MongoDB ID. That didn't help.
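
The code for that batched variant isn't shown, so the following is only a minimal sketch of what it might look like. `BatchLink`, `BatchLinkStore` and `BatchLinkSketch` are hypothetical stand-ins, not the real `Link` document or the Spring Data repository. With a real repository the single call would presumably be its batch save (`save(Iterable)` or `saveAll(Iterable)`, depending on the Spring Data version).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for the Link document; only the fields used here.
class BatchLink {
    final String id;
    final String url;

    BatchLink(String id, String url) {
        this.id = id;
        this.url = url;
    }
}

// Hypothetical stand-in for the repository's batch-save method.
interface BatchLinkStore {
    void saveAll(List<BatchLink> links);
}

class BatchLinkSketch {
    // Builds one link per original URL with a client-generated ID,
    // then persists all of them in a single call instead of one call per link.
    static List<BatchLink> createAndSave(List<String> urls, BatchLinkStore store) {
        List<BatchLink> links = new ArrayList<>();
        for (String url : urls) {
            links.add(new BatchLink(UUID.randomUUID().toString(), url));
        }
        store.saveAll(links);
        return links;
    }

    public static void main(String[] args) {
        List<List<BatchLink>> calls = new ArrayList<>();
        BatchLinkStore inMemory = calls::add; // records each batch it receives
        List<BatchLink> saved =
            createAndSave(Arrays.asList("http://a.example", "http://b.example"), inMemory);
        System.out.println(saved.size() + " links saved in " + calls.size() + " call(s)");
    }
}
```

Because the IDs are generated on the client, the tracking URLs can be built before anything is written, so the database round trip happens once per document rather than once per link.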

Update 2:

I added some more profiling, and MongoDB seems to be the bottleneck: each save takes 31-44 ms, which adds up to about 0.4 s for the 12 links.

DEBUG foo.bar.serving.filter.TrackingUrlFilter: Parsing HTML Content took 0:00:00.003
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Select links took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.044
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.032
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.033
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.031
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.032
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.033
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.033
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.002
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.033
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.002
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.039
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.032
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.002
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.033
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.036
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Setting link took 0:00:00.000
DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating html took 0:00:00.001
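
To make the breakdown explicit, the per-step totals can be computed from log lines like the ones above. This is a minimal sketch that assumes every timing line ends in `<step> took H:MM:SS.mmm`; the class and method names (`StepTotals`, `totalMillisPerStep`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StepTotals {
    // Matches ": <step> took H:MM:SS.mmm" at the end of a log line.
    private static final Pattern TOOK =
        Pattern.compile(": ([^:]+) took (\\d+):(\\d{2}):(\\d{2})\\.(\\d{3})$");

    // Sums the reported durations per step name, in milliseconds.
    // Lines that do not match the expected format are skipped.
    static Map<String, Long> totalMillisPerStep(List<String> lines) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = TOOK.matcher(line.trim());
            if (!m.find()) {
                continue;
            }
            long hours = Long.parseLong(m.group(2));
            long minutes = Long.parseLong(m.group(3));
            long seconds = Long.parseLong(m.group(4));
            long millis = ((hours * 60 + minutes) * 60 + seconds) * 1000
                + Long.parseLong(m.group(5));
            totals.merge(m.group(1), millis, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.044",
            "DEBUG foo.bar.serving.filter.TrackingUrlFilter: Generating link took 0:00:00.001",
            "DEBUG foo.bar.serving.filter.TrackingUrlFilter: Saving link took 0:00:00.032");
        // prints {Saving link=76, Generating link=1}
        System.out.println(totalMillisPerStep(sample));
    }
}
```

Summed by hand over the full log above, the twelve "Saving link" entries come to 411 ms, while parsing, selecting, and generating the HTML together take only a few milliseconds.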

Does anybody have an idea how I can speed this up?

Update 3:

After some more fiddling, I tried some MongoDB options. Setting the writeConcern property of my MongoTemplate to UNACKNOWLEDGED brought the time per insert down to 1-3 ms. Is ignoring whatever MongoDB reports back really supposed to be the solution?

Jochen Ullrich
  • [THIS](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags?page=1&tab=votes#tab-top) might be relevant – Dawnkeeper Jan 16 '15 at 09:21
  • All I'm doing is replacing URLs. Both solutions worked, though regex was even slower. It's just about speeding it up. :) – Jochen Ullrich Jan 16 '15 at 09:23
  • What is the size of the HTML content? Have you tried running just the HTML parsing part, without the saving, to see whether the saving takes up most of the time? – Florent Bayle Jan 16 '15 at 09:32
  • 134 lines, about 10 links. I will try that and post once I have. – Jochen Ullrich Jan 16 '15 at 09:33
  • I suggest using a profiler and/or rewriting the code so that it is easier to pinpoint the bottleneck. The code you presented won't help anybody to help you. For example, in pseudocode: class Foo { private Element element; private Link link; private String newUrl; } List tags = retrieveTags(); List foo = prepareLinksForTags(tags); saveLinks(foo); retrieveNewUrls(foo); replaceUrls(foo); – zimi Jan 16 '15 at 09:38
