7

We are using the PMD Copy/Paste Detector (CPD) to analyze our C and C++ code. However, a few parts of the code are very similar for good reason, and we would like to suppress the warnings for those parts.

The PMD CPD documentation mentions only annotations, which will not work for these languages.

How can I still ignore warnings for specific parts?

Is there a comment to do so perhaps?

[UPDATE] I'm using the following Groovy script to run CPD:

@GrabResolver(name = 'jcenter', root = 'https://jcenter.bintray.com/')
@Grab('net.sourceforge.pmd:pmd-core:5.4.+')
@Grab('net.sourceforge.pmd:pmd-cpp:5.4.+')
import net.sourceforge.pmd.cpd.CPD
import net.sourceforge.pmd.cpd.CPDConfiguration
import java.util.regex.Pattern

def tokens = 60
def scanDirs = ['./path/to/scan', './scan/this/too']
def ignores = [
    './ignore/this/path',
    './this/must/be/ignored/too'
    ].collect({ it.replace('/', File.separator) })
def rootDir = new File('.')
def outputDir = new File('./reports/analysis/')

def filename_date_format = 'yyyyMMdd'
def encoding = System.getProperty('file.encoding')
def language_converter = new CPDConfiguration.LanguageConverter()
def config = new CPDConfiguration()
config.language = language_converter.convert('c')
config.minimumTileSize = tokens
config.renderer = config.getRendererFromString 'xml', 'UTF-8'
config.skipBlocksPattern = '//DUPSTOP|//DUPSTART'
config.skipLexicalErrors = true
def cpd = new CPD(config)

scanDirs.each { path ->
    def dir = new File(path);
    dir.eachFileRecurse(groovy.io.FileType.FILES) {
        // Ignore file?
        def doIgnore = false
        ignores.each { ignore ->
            if(it.path.startsWith(ignore)) {
                doIgnore = true
            }
        }
        if(doIgnore) {
            return
        }

        // Other checks
        def lowerCaseName = it.name.toLowerCase()
        if(lowerCaseName.endsWith('.c') || lowerCaseName.endsWith('.cpp') || lowerCaseName.endsWith('.h')) {
            cpd.add it
        }
    }
}

cpd.go();

def duplicationFound = cpd.matches.hasNext()

def now = new Date().format(filename_date_format)
def outputFile = new File(outputDir.canonicalFile, "cpd_report_${now}.xml")
println "Saving report to ${outputFile.absolutePath}"

def absoluteRootDir = rootDir.canonicalPath
if(absoluteRootDir[-1] != File.separator) {
    absoluteRootDir += File.separator
}

outputFile.parentFile.mkdirs()
def xmlOutput = config.renderer.render(cpd.matches);
if(duplicationFound) {
  def filePattern = "(<file\\s+line=\"\\d+\"\\s+path=\")${Pattern.quote(absoluteRootDir)}([^\"]+\"\\s*/>)"
  xmlOutput = xmlOutput.replaceAll(filePattern, '$1$2')
} else {
  println 'No duplication found.'
}

outputFile.write xmlOutput
Arno Moonen

4 Answers

3

You can define your custom markers for excluding certain blocks from analysis through the --skip-blocks-pattern option.

--skip-blocks-pattern Pattern to find the blocks to skip. Start and End pattern separated by |. Default is #if 0|#endif.

For example the following will ignore blocks between /* SUPPRESS CPD START */ and /* SUPPRESS CPD END */ comments (the comment must occupy a separate line):

$ ./run.sh cpd --minimum-tokens 100 --files /path/to/c/source --language cpp --skip-blocks-pattern '/* SUPPRESS CPD START */|/* SUPPRESS CPD END */'

Note, however, that because the custom pattern replaces the default one, the tool will then perform copy-paste detection inside code delimited by #if 0/#endif.
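As a rough illustration of how the markers from the command above might sit in a source file, here is a minimal sketch. The functions sum_u8 and sum_u16 are hypothetical and exist only to show two near-identical bodies. The sketch assumes the marker text matches the configured pattern exactly and that each marker is alone on its line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: two intentionally similar helpers.
 * Each marker comment below sits on its own line and matches the
 * start/end strings passed via --skip-blocks-pattern. */

/* SUPPRESS CPD START */
uint32_t sum_u8(const uint8_t *values, size_t count)
{
    /* A NULL pointer is only acceptable for an empty range. */
    assert(values != NULL || count == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

uint32_t sum_u16(const uint16_t *values, size_t count)
{
    /* Same loop as sum_u8, only the element type differs. */
    assert(values != NULL || count == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
/* SUPPRESS CPD END */
```

Everything between the two markers should then be left out of the analysis, while code outside them is still checked.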

Leon
  • I'll try this later today. I believe another tool already tests to see if "#if 0" is used, because we have already defined that this should not be used. Note that I'm using a custom Groovy script to run CPD, so I will need to figure out how to pass this when I'm calling `CPD.go(config)` from my script. – Arno Moonen Jul 04 '16 at 08:46
  • I've updated the original post to include the script I use to run CPD, including the `skipBlocksPattern` option. Unfortunately, this does not seem to work for me (yet?). Will investigate this more soon, I hope. – Arno Moonen Jul 05 '16 at 07:22
2

After searching through the code of PMD on GitHub, I think I can safely say that this is NOT supported at this point in time (current version being PMD 5.5.0).

A search for CPD-START in their repository does not show any results within the pmd-cpp directory (see the search results on GitHub).

Arno Moonen
  • Unfortunate. But PMD *was* designed for Java after all. Maybe the Clang static analyzer (or another tool) may provide you with better customization? – StoryTeller - Unslander Monica Jul 04 '16 at 08:17
  • I'm not aware of any code duplication / "copy paste detector" in Clang, so as far as I know this is not an alternative. – Arno Moonen Jul 04 '16 at 09:48
  • PMD 5.7.0 started supporting comment based suppressions, which are currently supported on several languages including C/C++ https://pmd.github.io/pmd-6.13.0/pmd_userdocs_cpd.html#suppression – Johnco Apr 17 '19 at 03:35
  • Thanks @Johnco for the update. I have changed the accepted answer to yours. – Arno Moonen Apr 18 '19 at 06:50
1

I know this question is about three years old, but for completeness: CPD started supporting this in PMD 5.6.0 (April 2017) for Java, and since 6.3.0 (April 2018) support has been extended to many other languages, including C/C++. Nowadays, almost all CPD-supported languages allow comment-based suppressions.

The complete (current) docs for comment-based suppression are available at https://pmd.github.io/pmd-6.13.0/pmd_userdocs_cpd.html#suppression

It's worth noting that if a file has a // CPD-OFF comment but no matching // CPD-ON, everything will be ignored until the end of the file.
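For a C or C++ file, the comment-based suppression could look roughly like the sketch below. The function names are hypothetical and only illustrate two deliberately similar routines. It assumes a PMD version with comment-based suppression for C/C++ (6.3.0 or later, per the above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: the two checksum helpers are similar on purpose,
 * so they are wrapped in CPD-OFF / CPD-ON comments. */

// CPD-OFF
uint8_t xor_checksum_rx(const uint8_t *frame, size_t length)
{
    assert(frame != NULL || length == 0);
    uint8_t checksum = 0x00; /* receive side starts from 0x00 */
    for (size_t i = 0; i < length; ++i) {
        checksum ^= frame[i];
    }
    return checksum;
}

uint8_t xor_checksum_tx(const uint8_t *frame, size_t length)
{
    assert(frame != NULL || length == 0);
    uint8_t checksum = 0xFF; /* transmit side starts from 0xFF */
    for (size_t i = 0; i < length; ++i) {
        checksum ^= frame[i];
    }
    return checksum;
}
// CPD-ON

/* Code after CPD-ON is analyzed as usual. */
size_t frame_payload_length(size_t frame_length)
{
    /* Assumes a 2-byte trailer; shorter frames carry no payload. */
    return frame_length > 2 ? frame_length - 2 : 0;
}
```

Leaving out the closing // CPD-ON here would also exclude frame_payload_length, since suppression would then run to the end of the file.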

Johnco
-1

I don't have any help for CPD. In general, I know about such tools; I don't understand the bit about "warnings".

Our CloneDR tool finds exact and near-miss duplicate code. IMHO, it finds better clones than CPD, because it uses the language syntax/structure as a guide. [This fact is backed up by a research report done by a third party that you can find at the site]. And it does not issue "warnings".

If there is code that it thinks is involved in a clone, the tool will generate an output report page for the clones involved. But that isn't a warning. There is no way to suppress the reporting behavior. Obviously, if you have seen such a clone and decide it is not interesting, you can mark one of the clone entries with a comment stating that it is an uninteresting clone; that comment will show up in the clone report. (Such) comments have no impact whatsoever on what clones are detected by CloneDR, so adding them does not change the computed answer.

Ira Baxter
  • Well, I call it a warning, but basically CPD does that too. It simply lists the clones it found and there is a way to tell it to "exclude" it from that list (thus suppressing the "warning"). Does that tool also work for plain C? – Arno Moonen Jul 05 '16 at 13:29
  • Yes, it works for a wide variety of languages as well as for plain C. Sometimes (for C) you have to provide it a bit of configuration data to handle poorly structured preprocessor directives. See https://www.semanticdesigns.com/Products/Formatters/CPreprocessorConstraints.html for some examples. Modulo that it works pretty well. – Ira Baxter Jul 06 '16 at 03:27