I want to do taint tracking with functions that taint their arguments with user input. For example:
fgets(buf, sizeof(buf), stdin); // buf is tainted
[...]
n = strlen(buf); // tainted argument to strlen
[...]
memcpy(somewhere, buf, n); // tainted call to memcpy
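For anyone who wants to reproduce this without libssh2, the pattern can be reduced to a small self-contained C file. This is only a sketch: the helper name copy_line, the 256-byte buffer, and the truncation logic are made up for illustration and are not taken from libssh2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from libssh2): reads one line from `in`
 * and copies it into `dst`, truncating to fit. Returns the number of
 * bytes copied, or 0 on EOF, read error, or an empty destination. */
size_t copy_line(FILE *in, char *dst, size_t dst_size)
{
    char buf[256];
    size_t n;

    assert(in != NULL && dst != NULL);

    if (dst_size == 0 || fgets(buf, sizeof(buf), in) == NULL)
        return 0;                 /* nothing read, nothing tainted */

    /* buf is tainted by fgets from here on */
    n = strlen(buf);              /* tainted argument to strlen */
    if (n >= dst_size)
        n = dst_size - 1;         /* leave room for the terminator */

    memcpy(dst, buf, n);          /* tainted call to memcpy */
    dst[n] = '\0';
    return n;
}
```

A query that handles the fgets->strlen case should report the strlen(buf) call in this file, with the buf argument of fgets as the source.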
Semmle should be able to spot this with a query like the following (using only fgets->strlen as an example). I am borrowing code from SecurityOptions:
import cpp
import semmle.code.cpp.dataflow.TaintTracking

class IsTaintedArg extends string {
  IsTaintedArg() { this = "IsTaintedArg" }

  predicate userInputArgument(FunctionCall functionCall, int arg) {
    exists(string fname |
      functionCall.getTarget().hasGlobalName(fname) and
      exists(functionCall.getArgument(arg)) and
      // argument 0 of fgets is tainted
      (fname = "fgets" and arg = 0)
    )
  }

  predicate isUserInput(Expr expr, string cause) {
    exists(FunctionCall fc, int i |
      this.userInputArgument(fc, i) and
      expr = fc.getArgument(i) and
      cause = fc.getTarget().getName()
    )
  }
}

class TaintedFormatConfig extends TaintTracking::Configuration {
  TaintedFormatConfig() { this = "TaintedFormatConfig" }

  override predicate isSource(DataFlow::Node source) {
    exists(IsTaintedArg opts |
      opts.isUserInput(source.asExpr(), _)
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    // give me all flows that land in strlen's first argument
    exists(FunctionCall fc |
      sink.asExpr() = fc.getArgument(0) and
      fc.getTarget().hasName("strlen")
    )
  }
}

from TaintedFormatConfig cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
select sink, source
Yet it does not seem to work. When I query only cfg.isSource() or cfg.isSink(), both the source and the sink are recognized. But hasFlow() still returns nothing, although a path should definitely exist.
I am using libssh2 to test my findings; the example flow exists here.
The query I am experimenting with is here.
Does anyone have any idea what I might be doing wrong in the query above?