I am using pyclang to parse a header:

import argparse
import clang.cindex

relevant_tokens = [
    clang.cindex.CursorKind.FUNCTION_DECL,
    clang.cindex.CursorKind.CXX_METHOD,
    clang.cindex.CursorKind.CONSTRUCTOR,
    clang.cindex.CursorKind.DESTRUCTOR,
    clang.cindex.CursorKind.STRUCT_DECL,
    clang.cindex.CursorKind.CLASS_DECL,
    clang.cindex.CursorKind.FIELD_DECL,
    clang.cindex.CursorKind.FUNCTION_TEMPLATE,
    clang.cindex.CursorKind.CLASS_TEMPLATE,

]

def ExtractTokensOfInterest(path):

    index = clang.cindex.Index.create()
    translation_unit = index.parse(path, args=['-std=c++20'])
    for token in translation_unit.get_tokens(extent=translation_unit.cursor.extent):
        if token.kind == clang.cindex.TokenKind.IDENTIFIER and token.cursor.kind in relevant_tokens:
            print(token.cursor.displayname)
            print(token.cursor.result_type.spelling)

I am running it on a fairly complicated header that pulls in Vulkan structs, and it works flawlessly for those. As soon as an STL vector is involved, however, it fails. This is a sample output:

SetGlobalPointer(HardwareInterface *)
void
BeginSingleTimeCommands(HardwareInterface &)
vk::CommandBuffer
EndSingleTimeCommands(HardwareInterface &, vk::CommandBuffer &)
void
HardwareInterface()
void
HardwareInterface(GLFWwindow *)
void
Dummy()
int
Dummy()
int
Dummy()
int
Dummy()
int
GetInstance()
vk::Instance &
GetPhysicalDevice()
vk::PhysicalDevice &
GetDevice()
vk::Device &
GetSurface()
vk::SurfaceKHR &
GetCommandPool()
vk::CommandPool &
GetGraphicQueue()
vk::Queue &
GetComputeQueue()
vk::Queue &
GetQueueFamily()
int32_t
GetCmdBuffer()
vk::CommandBuffer &
StartCmdUpdate()
void
EndCmdUpdate()
void
DrawIndexed(const int &, vk::Buffer, vk::Buffer, uint, uint, size_t, size_t, size_t)
void
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateSemaphores(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int
CreateFences(uint32_t)
int

Notice the return type reported for Dummy()? This is what the method declaration looks like:

std::vector<std::string> Dummy();

So PyClang managed to resolve a lot of very complicated template-based typedefs for Vulkan without breaking a sweat, but it cannot handle an STL vector of any kind, not even one of std::string?
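A side note on the repeated lines in the output: the loop prints a declaration once for every identifier token whose cursor is that declaration, so a single method can show up several times. A minimal sketch of visiting cursors instead of tokens follows; `iter_declarations` and `describe` are hypothetical helper names, and the sketch only assumes that cursors expose `get_children()`, `kind`, `displayname` and `result_type.spelling`, as `clang.cindex.Cursor` does.

```python
def iter_declarations(cursor, wanted_kinds):
    """Yield each descendant cursor whose kind is in wanted_kinds, exactly once."""
    for child in cursor.get_children():
        if child.kind in wanted_kinds:
            yield child
        # Recurse so that methods and fields nested inside classes are found too.
        yield from iter_declarations(child, wanted_kinds)


def describe(cursor):
    """Return (signature, return type spelling), the two values the script prints."""
    return cursor.displayname, cursor.result_type.spelling
```

With the script above this could be used as `for decl in iter_declarations(translation_unit.cursor, relevant_tokens): print(*describe(decl))`. Note that this walk also reaches declarations from included headers, so some filtering (for example on the cursor's source file) would probably be needed. It removes the duplicates only; the `int` return type has a different cause.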

Example header:

#pragma once

#define GLFW_INCLUDE_VULKAN

/** @cond */
#include <GLFW/glfw3.h>

#include "vulkan/vulkan.hpp"
/** @endcond */

#include "VkExtensionsStubs.hpp"
#include "Utils.hpp"
#include "VulkanDebugging.hpp"

using uint = unsigned int;

struct Dummy
{
  int field1;
  std::string field2;
};

/**
 * @brief Wrapper to facilitate communication with the graphics hardware through Vulkan.
 *
 * The instance, physical device, logical device... Are all encapsulated here, it mostly
 * reduces the amount of parameters that need to be passed around, but it also provides
 * some low level functionality to dispatch GPU work.
 *
 */
class HardwareInterface
{
  private:
    vk::UniqueInstance instance;
    vk::PhysicalDevice physical_device;
    vk::UniqueDevice device;
    vk::UniqueDebugUtilsMessengerEXT debug_messenger;
    vk::UniqueSurfaceKHR surface;
    vk::UniqueCommandPool cmd_pool;
    vk::UniqueCommandBuffer cmd_buffer;

    int32_t queue_family = -1;
    vk::Queue graphic_queue;
    vk::Queue compute_queue;

  public:
    // TODO (medium): Try to make this into a shared_ptr (currently causes segfault)
    static HardwareInterface* h_interface;
    static void SetGlobalPointer(HardwareInterface* ptr) { h_interface = ptr; }

    /**
     * @brief Begin a small command that will be executed once and immediately.
     *
     * @param h_interface Wrapper around the hardware info (e.g. device, logical device,
     * instance...)
     * @return vk::CommandBuffer Buffer to record one time commands.
     */
    static vk::CommandBuffer BeginSingleTimeCommands(HardwareInterface& h_interface);
    /**
     * @brief End and call the command. Call after `BeginSingleTimeCommands()`.
     *
     * @param h_interface Wrapper around the hardware info (e.g. device, logical device,
     * instance...)
     * @param command_buffer Command created by BeginSingleTimeCommands() where the
     * commands were recorded.
     */
    static void EndSingleTimeCommands(
        HardwareInterface& h_interface, vk::CommandBuffer& command_buffer);

    HardwareInterface(){};
    /**
     * @brief Construct a new Hardware Interface wrapper.
     *
     * @param window GLFWwindow containing the surface to which we will present.
     *
     * Example:
     * (Create a hardware interface)
     * \code {.cpp}
     *      glfwInit();
     *      glfwWindowHint(GLFW_CLIENT_API, GLFW_NO_API);
     *      GLFWwindow* window = glfwCreateWindow(1, 1, "dummy window", nullptr, nullptr);
     *      HardwareInterface hi(window);
     * \endcode
     */
    HardwareInterface(GLFWwindow* window);
    /**
     * @brief Get the Vulkan Instance.
     *
     * @return vk::Instance& Vulkan Instance.
     */
    vk::Instance& GetInstance() { return *instance; }
    /**
     * @brief Get the Vulkan PhysicalDevice.
     *
     * @return vk::PhysicalDevice& Vulkan PhysicalDevice.
     */
    vk::PhysicalDevice& GetPhysicalDevice() { return physical_device; }
    /**
     * @brief Get the Vulkan Device.
     *
     * @return vk::Device& Vulkan Device.
     */
    vk::Device& GetDevice() { return *device; }
    /**
     * @brief Get the Vulkan Surface.
     *
     * @return vk::SurfaceKHR& Vulkan Surface.
     */
    vk::SurfaceKHR& GetSurface() { return *surface; }
    /**
     * @brief Get the Vulkan CommandPool.
     *
     * @return vk::CommandPool& Vulkan CommandPool.
     */
    vk::CommandPool& GetCommandPool() { return *cmd_pool; }
    /**
     * @brief Get the Vulkan Graphic Queue.
     *
     * @return vk::Queue Graphic Queue.
     */
    vk::Queue& GetGraphicQueue() { return graphic_queue; }
    /**
     * @brief Get the Vulkan Compute Queue.
     *
     * @return vk::Queue Compute Queue.
     */
    vk::Queue& GetComputeQueue() { return compute_queue; }
    /**
     * @brief Get the Vulkan Queue Family.
     *
     * @return int32_t Vulkan Queue Family.
     */
    int32_t GetQueueFamily() { return queue_family; }
    /**
     * @brief Get the Vulkan CommandBuffer.
     *
     * @return vk::CommandBuffer& Vulkan CommandBuffer.
     */
    vk::CommandBuffer& GetCmdBuffer() { return *cmd_buffer; }
    /**
     * @brief Start recording commands for a major compute or render operation.
     *
     */
    void StartCmdUpdate();
    /**
     * @brief End recording commands for a major compute or render operation.
     *
     */
    void EndCmdUpdate();
    /**
     * @brief Draw using an index buffer expressing vertex connectivity.
     *
     * @param vertex_buffers Vertex attribute buffers.
     * @param index_buffer Index buffer.
     * @param instance_buffer Buffer containing the instance data.
     * @param index_num Number of indices in the index buffer.
     * @param instance_count Total number of instances to render if any. If larger than
     * 0 instanced rendering is called instead of the regular call.
     * @param vertex_offset Offset within the vertex buffer at which indices will start.
     * @param index_offset Index used as the base index within the index buffer.
     * @param element_count Number of elements to draw.
     */
    void DrawIndexed(
      const std::vector<vk::Buffer>& vertex_buffers,
      vk::Buffer index_buffer,
      vk::Buffer instance_buffer,
      uint index_num,
      uint instance_count,
      size_t vertex_offset,
      size_t index_offset,
      size_t element_count);
    /**
     * @brief Create a list of Semaphores for GPU to GPU syncing.
     *
     * @param semaphore_num Number of semaphores to create.
     * @return std::vector<vk::UniqueSemaphore> List of semaphores for syncing.
     */
    std::vector<vk::UniqueSemaphore> CreateSemaphores(uint32_t semaphore_num);
    /**
     * @brief Create a list of Fences for CPU to GPU syncing.
     *
     * @param fence_num Number of fences to create.
     * @return std::vector<vk::UniqueFence> List of fences for syncing.
     */
    std::vector<vk::UniqueFence> CreateFences(uint32_t fence_num);
};

Makogan
  • `vector` shows an ambivalence about `using namespace std`. Since this is a header file, the `using` declaration is not generally recommended, and `std::string` is expected. But `vector` won't work without the `using` declaration. So is it there or not? (Had you provided an [mre], the clarification would not have been necessary.) – rici Jan 30 '22 at 05:13
  • This has nothing to do with the namespacing; the behaviour is the same with and without it. Also, this is an MRE: the Python file can be copied as is and run. – Makogan Jan 30 '22 at 05:15
  • Only if I guess the contents of the file you're parsing. – rici Jan 30 '22 at 05:16
  • Any file with an STL vector declaration will exhibit this behaviour. But I have shared the header now; I didn't do so originally because it bloats the post and it's minimally relevant to the problem. – Makogan Jan 30 '22 at 05:19
  • I don't see `#include <vector>` in that file. – rici Jan 30 '22 at 05:21
  • It's implicitly included by the Vulkan headers. I also tried to include it explicitly now based on your comment; exact same behaviour, unfortunately. Remember this hpp file doesn't need to compile, only to be syntactically valid, since we are parsing, not compiling. – Makogan Jan 30 '22 at 05:28
  • OK, I installed clang-11. Since installing Vulkan seems way out of scope, I didn't try that; I just created a minimal test file containing `#include <vector>`, `#include <string>` and `std::vector<std::string> test();`. It worked fine, printing `test std::vector<std::string>`. However, if I remove either (or both) of the includes, I get five instances of `test() int`. So I can't help thinking that the problem here is a missing include, if not `<vector>`, then `<string>`. What happens if you just compile the file with `clang-11 -c`? – rici Jan 30 '22 at 16:45
  • Hmm, what seems to be throwing it off is one of the includes which relies on a compile time argument (i.e. passing the -I flag to where the file lives). – Makogan Jan 31 '22 at 05:13
  • Oh boy this tool is about to become much more complicated, now I need to pass the full build information just to do some light static parsing... – Makogan Jan 31 '22 at 05:18
  • With C++ there is no such thing as "light static parsing"; templates must be instantiated in order to know which names are types, and without that an accurate parse is not possible. However, if what you pasted is not too far from reality, adding explicit includes in the headers will go some way towards a solution. IMO, headers should explicitly include any header necessary to define a type used in the header. Anyway, good luck. – rici Jan 31 '22 at 06:10
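
The thread above concludes that the parser needs the same include paths as the real build. A minimal sketch of forwarding them to libclang follows, assuming the directories are known up front; `build_clang_args` and `parse_header` are hypothetical helper names, and the directory in the usage note is a placeholder rather than a path from the question.

```python
def build_clang_args(include_dirs, std="c++20"):
    """Assemble compiler arguments so libclang can resolve the header's includes."""
    args = [f"-std={std}"]
    # One -I flag per directory, exactly as on a compiler command line.
    args += [f"-I{directory}" for directory in include_dirs]
    return args


def parse_header(path, include_dirs):
    """Parse `path` with the given include directories and return the translation unit."""
    # Imported lazily so build_clang_args can be used without libclang installed.
    import clang.cindex

    index = clang.cindex.Index.create()
    return index.parse(path, args=build_clang_args(include_dirs))
```

For instance, `build_clang_args(["/path/to/project/include"])` yields `['-std=c++20', '-I/path/to/project/include']`, which can be passed as `args` in place of the hard-coded list in the question. A real project may also need `-D` defines or system include paths.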

1 Answer


I had a similar issue. "int" seems to be the fallback type whenever something goes wrong during parsing, e.g. when some includes cannot be resolved. The answer to this post worked for me: Parsing with libclang; unable to parse certain tokens (Python in Windows)

Edit: Missing includes can be indicated by checking the diagnostics:

# tu is the translation unit returned by index.parse(...)
for diag in tu.diagnostics:
    if diag.severity == diag.Fatal:
        print(diag.location)
        print(diag.spelling)
        print(diag.option)
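
As a hedged extension of the check above, the same idea can be turned into a guard that stops before any types are printed. The sketch below assumes the numeric severity levels of `clang.cindex.Diagnostic` (Warning is 2, Error is 3, Fatal is 4); `blocking_diagnostics` and `assert_parsed_cleanly` are hypothetical helper names.

```python
ERROR = 3  # same value as clang.cindex.Diagnostic.Error; Fatal is 4


def blocking_diagnostics(diagnostics, min_severity=ERROR):
    """Return the diagnostics severe enough to make reported types unreliable."""
    return [d for d in diagnostics if d.severity >= min_severity]


def assert_parsed_cleanly(translation_unit):
    """Raise if libclang reported errors, e.g. an include it could not find."""
    problems = blocking_diagnostics(translation_unit.diagnostics)
    if problems:
        summary = "\n".join(f"{d.location}: {d.spelling}" for d in problems)
        raise RuntimeError("libclang could not fully parse the header:\n" + summary)
```

Calling `assert_parsed_cleanly(tu)` right after `index.parse(...)` would surface a missing include immediately instead of letting it show up later as an `int` return type.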
DE.