In short, I need to build TensorFlow as a static library, and I managed to do that with the script tensorflow/contrib/makefile/build_all_linux.sh. But after linking the resulting libtensorflow-core.a into my test program, performance turned out to be surprisingly poor. The program also prints a number of log lines like the ones below. It seems that no CPU SSE support was built into the static library. Any help would be appreciated.

2018-05-04 13:45:44.304314: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "EncodeProto" device_type: "CPU"') for unknown op: EncodeProto
2018-05-04 13:45:44.304648: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "DecodeProtoV2" device_type: "CPU"') for unknown op: DecodeProtoV2
2018-05-04 13:45:44.304720: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_INT64 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304732: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_INT32 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304741: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_INT16 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304750: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_UINT16 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304759: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_INT8 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304768: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "PopulationCount" device_type: "CPU" constraint { name: "T" allowed_values { list { type: DT_UINT8 } } }') for unknown op: PopulationCount 
2018-05-04 13:45:44.304910: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "MutableDenseHashTable" device_type: "CPU" constraint { name: "key_dtype" allowed_values { list { type: DT_INT64 } } } constraint { name: "value_dtype" allowed_values { list { type: DT_VARIANT } } }') for unknown op: MutableDenseHashTable
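Taken literally, each of these lines reports a CPU kernel registered for an op name that is unknown to the binary; none of them mentions SSE or any other instruction set. To see which ops are affected without reading every line, the output can be tallied. The following is only a rough sketch: the helper name `count_unknown_ops` is made up for illustration, and it assumes every relevant line ends with `for unknown op: <name>` as in the excerpt above.

```python
import re
from collections import Counter

# Assumption: the op name follows the literal text "for unknown op: ",
# as in the log excerpt above. Other TensorFlow messages are ignored.
UNKNOWN_OP = re.compile(r"for unknown op: (\w+)")


def count_unknown_ops(log_text):
    """Return a Counter mapping op name -> number of log lines that report it."""
    counts = Counter()
    for line in log_text.splitlines():
        match = UNKNOWN_OP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Applied to the nine lines above, this counts PopulationCount six times (once per integer type) and EncodeProto, DecodeProtoV2 and MutableDenseHashTable once each.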
Ivan Jobs
    What compiler did you use? `gcc -O3 -march=native` should be good. Possible duplicate of [How to compile Tensorflow with SSE4.2 and AVX instructions?](https://stackoverflow.com/q/41293077) – Peter Cordes May 04 '18 at 13:17
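Since `-march=native` targets whatever the build machine's CPU supports, it can help to check first which SIMD extensions that CPU actually advertises. Below is a minimal sketch, assuming a Linux x86 host where `/proc/cpuinfo` contains a `flags` line; the helper name `host_simd_flags` and the default list of flags are illustrative choices, not something taken from the TensorFlow build scripts.

```python
def host_simd_flags(cpuinfo_text, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Return the flags from `wanted` that appear on the first 'flags' line.

    `cpuinfo_text` is expected to be the contents of /proc/cpuinfo
    (assumption: Linux on x86, where that file has a 'flags' entry).
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            # Keep the order of `wanted` so the result is stable.
            return [flag for flag in wanted if flag in present]
    return []  # no 'flags' line found
```

On such a host, `host_simd_flags(open("/proc/cpuinfo").read())` lists what the CPU supports. It says nothing about which flags the makefile build actually passed to the compiler, so that still has to be checked in the build output.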

0 Answers