I asked a related question in Crawling rules in heritrix, how to load embedded content? and came up with a solution there. Later I found this post as well. I am submitting my solution here as well:
Note: I know the question is old so it was most likely made for an older heritrix version. I am using 3.4
<bean id="scope" class="org.archive.modules.deciderules.DecideRuleSequence">
<property name="rules">
<list>
<bean class="org.archive.modules.deciderules.AcceptDecideRule" />
<bean class="org.archive.modules.deciderules.NotMatchesListRegexDecideRule">
<property name="decision" value="REJECT"/>
<property name="regexList">
<list>
<value>.*site\.domain/path/.*</value>
</list>
</property>
</bean>
<bean class="org.archive.modules.deciderules.HopsPathMatchesRegexDecideRule">
<property name="decision" value="ACCEPT"/>
<property name="regex" value="(E|X)" />
</bean>
<!-- Below are some of the "standard" rules set up on a fresh job, it behaves the same with and without them when it comes to not loading embedded stuff -->
<bean class="org.archive.modules.deciderules.TooManyHopsDecideRule">
<!-- <property name="maxHops" value="20" /> -->
</bean>
<!-- ...and REJECT those with suspicious repeating path-segments... -->
<bean class="org.archive.modules.deciderules.PathologicalPathDecideRule">
<!-- <property name="maxRepetitions" value="2" /> -->
</bean>
<!-- ...and REJECT those with more than threshold number of path-segments... -->
<bean class="org.archive.modules.deciderules.TooManyPathSegmentsDecideRule">
<!-- <property name="maxPathDepth" value="20" /> -->
</bean>
<!-- ...but always ACCEPT those marked as prerequisitee for another URI... -->
<bean class="org.archive.modules.deciderules.PrerequisiteAcceptDecideRule">
</bean>
<!-- ...but always REJECT those with unsupported URI schemes -->
<bean class="org.archive.modules.deciderules.SchemeNotInSetDecideRule">
</bean>
</list>
</property>
</bean>
Adjust <value>.*site\.domain/path/.*</value>
to match you site, and path if any.
You can also adjust <property name="regex" value="(E|X)" />
where E|X can be just E if you just want the known included things in the page, like images, css etc. X is a bit experimental for trying things found in javascript files as well.