Identifying a forward declaration with libclang

January 12, 2017

I’ve recently written a small utility (called layout), which uses libclang to parse C and C++ source code and determine the size, layout, and padding for all fields in all types. It generates C or C++ code which can be compiled to report this information for a specific compiler. During my first attempt, I ran into a problem detecting that a given type was a forward declaration.

A little bit about libclang

The C API for clang allows you to walk the abstract syntax tree (AST) for any C or C++ code using a visitor pattern. At each node of the AST, the visitor receives a CXCursor object representing that node. We can ask this cursor a number of questions, like what kind of node it is (e.g. a struct or a field) or what type it represents (e.g. int or double).

The layout utility walks the AST, finds each class or struct, then looks for each field in that class or struct and gathers data about each field. What happens when it encounters a forward declaration like this?

struct ForwardDeclared;

struct FullType
{
  ForwardDeclared* field;
};

The wrong solution

The visitor for types checks the kind of the cursor, and handles struct and class kinds.

auto cursorKind = clang_getCursorKind(cursor);
if (cursorKind == CXCursor_StructDecl || cursorKind == CXCursor_ClassDecl)
{
  ...
}

However, both FullType and ForwardDeclared are CXCursor_StructDecl cursors, so both are handled in the same way. The resulting generated C++ code tries to call sizeof(ForwardDeclared), which will cause a compiler error. Indeed, the very reason to use a forward declaration is to avoid the need for the compiler to know the size of a given type!

My first attempt to work around this issue was to ignore any types with no fields. With this change, the code generated by the utility compiles, and its output for this case looks correct.

$ ./layout blog.cpp | g++ -xc++ -; ./a.out
FullType (8b):
Field |              Type | Offset | Size | Padding
field | ForwardDeclared * |      0 |    8 |       0

But suppose that I actually have a type with no fields?

struct ForwardDeclared;

struct FullType
{
  ForwardDeclared* field;
};

struct EmptyType
{
};

The output for this code is the same as the output for the code above. EmptyType is simply skipped, which is incorrect: it is a complete type with a size, so it should be reported.

A better solution

The clang C++ API has a method named isThisDeclarationADefinition which does exactly what we need. It will return true for FullType and EmptyType, but false for ForwardDeclared. Unfortunately, this method is not exposed in the C API, so the layout utility cannot use it. I had started to work on exposing it via the C API in libclang when I stumbled across the clang_getCursorDefinition function.

I wondered what this function does when we pass it the cursor for a forward declaration. From its documentation:

If given a cursor for which there is no corresponding definition, e.g., because there is no definition of that entity within this translation unit, returns a NULL cursor.

We can use the clang_getNullCursor function to obtain the NULL cursor value described in the documentation. A function to identify a forward declaration might then look like this:

static bool is_forward_declaration(CXCursor cursor)
{
  return clang_equalCursors(clang_getCursorDefinition(cursor),
                            clang_getNullCursor());
}

With this new function in place, we can now get better output from layout:

$ ./layout blog.cpp | g++ -xc++ -; ./a.out
FullType (8b):
Field |              Type | Offset | Size | Padding
field | ForwardDeclared * |      0 |    8 |       0
EmptyType (1b):
No fields

The output now includes the EmptyType, and indicates its size.
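For a sense of what the generated program could look like, here is a hand-written guess. It is not the actual output of layout: the function name report_layout and the print format are assumptions, and the padding column is left out. It relies only on sizeof and offsetof, and never asks for the size of ForwardDeclared.

```cpp
#include <cstddef>
#include <cstdio>

struct ForwardDeclared;  // Incomplete type: sizeof(ForwardDeclared) is an error.

struct FullType
{
  ForwardDeclared* field;  // A pointer to an incomplete type is fine.
};

struct EmptyType
{
};

// Hypothetical reporting function: prints one block per complete type.
void report_layout()
{
  std::printf("FullType (%zub):\n", sizeof(FullType));
  std::printf("field | offset %zu | size %zu\n",
              offsetof(FullType, field), sizeof(FullType::field));

  // An empty struct still has a nonzero size in C++.
  std::printf("EmptyType (%zub):\n", sizeof(EmptyType));
  std::printf("No fields\n");
}
```

The exact numbers printed depend on the compiler and target, which is the reason for generating code rather than computing the layout directly.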

Update (December 7, 2018)

Astute reader Fredrik Svantesson pointed out that is_forward_declaration won’t work properly if the declaration and definition are in the same translation unit. In that case, clang_getCursorDefinition will not return the null cursor, but instead will return the definition cursor!

So, we need an additional check for this case:

static bool is_forward_declaration(CXCursor cursor)
{
  auto definition = clang_getCursorDefinition(cursor);

  // If the definition is null, then there is no definition in this translation
  // unit, so this cursor must be a forward declaration.
  if (clang_equalCursors(definition, clang_getNullCursor()))
    return true;

  // If there is a definition, then the forward declaration and the definition
  // are in the same translation unit. This cursor is the forward declaration if
  // it is _not_ the definition.
  return !clang_equalCursors(cursor, definition);
}

Content © Josh Peterson
