They're collections of type information (structs, enums, function types, etc.)
The information is contained in two key-value stores: named types and named objects.
I recommend you reference [1] and [2] while reading the FAQ.
Some variable definitions have type information, but don't produce a type name useful for future definitions. Examples:
enum {A=1,B=2} foo; // foo has type with no type name (it does have a variable name)
struct {int A; int B;} bar; // bar has type with no type name
In C, enum and struct definitions can create a new type name as a byproduct of a definition by using a "tag name":
enum MyEnum {A=1,B=2} foo; // foo has type named MyEnum
struct MyStruct {int A; int B;} bar; // bar has type named MyStruct
In the second set of examples, the types are named, and that name could be used to declare other variables, like enum MyEnum bar2; and struct MyStruct bar2;.
Functions' types are not named. The function name is considered the name of a function object, and the function's type is anonymous.
In summary:
typedef int foo; // type:int, name:foo
// structs, without and with a "tag name"
struct {int A; int B;} foo; // type:struct{int A; int B;}, name:<anonymous>
struct MyStruct {int A; int B;} foo; // type:struct{int A; int B;}, name:MyStruct
// enumerations, without and with a "tag name"
enum {A=1,B=2} foo; // type:enum{A=1,B=2}, name:<anonymous>
enum MyEnum {A=1,B=2} foo; // type:enum{A=1,B=2}, name:MyEnum
// functions
int main(int ac, char **av); // type int ()(int, char **), name:<anonymous>
typedef int (MyFunc)(int ac, char **av); // type int ()(int, char **), name:MyFunc
A named type is a type with a name that can identify it. For example, Color is the name of the type enum {RED=0, ORANGE=1, YELLOW=2, ...}.
A named object is the name of an external/imported symbol for which the type library has type information. For example, MessageBoxA is the name of a function whose type is int ()(HWND, LPCSTR, LPCSTR, UINT).
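As a rough mental model, the two stores can be pictured as two dictionaries. This is a sketch in plain Python, not the Binary Ninja API: the type strings are placeholders for real type objects, and describe is a hypothetical helper introduced only for illustration.

```python
# Toy model of a type library's two key-value stores.
# Plain dicts and strings stand in for real Binary Ninja objects.

# named types: type name -> type
named_types = {
    "Color": "enum {RED=0, ORANGE=1, YELLOW=2, ...}",
}

# named objects: external symbol name -> type of that symbol
named_objects = {
    "MessageBoxA": "int ()(HWND, LPCSTR, LPCSTR, UINT)",
}

def describe(name):
    """Report which store(s) know about a name (hypothetical helper)."""
    found = []
    if name in named_types:
        found.append("named type")
    if name in named_objects:
        found.append("named object")
    return found

print(describe("Color"))        # ['named type']
print(describe("MessageBoxA"))  # ['named object']
```

The same string could in principle be a key in both stores, since one maps type names and the other maps symbol names.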
>>> bv.add_type_library(TypeLibrary.load_from_file('test.bntl'))
>>> bv.import_library_object('_MySuperComputation')
<type: int32_t (int32_t, int32_t, char*)>
While this succeeds, the type of _MySuperComputation will not change. It is only changed if the named type object information is available when the imports are being processed at binary load time.
TODO
Because they're anonymous, and the type view only shows named types.
A function with prototype int sum(int a, int b); means there's a function object named sum with type int ()(int, int), but no name for that type in the named types key/value store:
>>> bv.functions[0].function_type
<type: uint64_t (int32_t arg1, int32_t arg2)>
>>> bv.functions[0].function_type.registered_name == None
True
Because adding a type library to a binary view only makes its type information available for use. The types view will show a type from a type library only after it is used (on demand).
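The on-demand behavior can be pictured with a small toy model. This is not how Binary Ninja is implemented: ToyView and use_type are hypothetical names, and a library is reduced to a dictionary from type name to a placeholder string.

```python
# Toy model: library types become visible in the view only once they are used.

class ToyView:
    def __init__(self):
        self.type_libraries = []  # each library is a dict: type name -> type
        self.types = {}           # what a "types view" would show

    def add_type_library(self, library):
        # Adding a library makes its types available, but imports nothing yet.
        self.type_libraries.append(library)

    def use_type(self, name):
        # First use copies the type from a library into the view's own types.
        if name not in self.types:
            for library in self.type_libraries:
                if name in library:
                    self.types[name] = library[name]
                    break
        return self.types.get(name)

view = ToyView()
view.add_type_library(
    {"Rectangle": "struct {int width; int height; struct Point *center;}"}
)
print(len(view.type_libraries), len(view.types))  # 1 0
view.use_type("Rectangle")
print(sorted(view.types))                          # ['Rectangle']
```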
Try this experiment:
Note the current values of bv.type_libraries and bv.types.
Add a type library with bv.add_type_library(TypeLibrary.load_from_file('test.bntl')).
Observe that bv.type_libraries is extended, but bv.types is unchanged!
Observe that bv.get_type_by_name('Rectangle') returns nothing.
Use the type, for example by setting some data to type struct Rectangle.
Observe that bv.types is extended, and the types view shows struct Rectangle in the auto types.
It's a way to refer to a type by name, without having its declaration immediately available.
For example, examine this struct from [1]:
struct Rectangle2 {
    int width;
    int height;
    struct Point center;
};
We don't know at this moment what a struct Point is. Maybe we've already added it. Maybe we'll add it later. Maybe it's in another type library. But we want to add a Rectangle now. So we leave the center field as a reference to the named type struct Point.
Load the resulting test.bntl in your binary and try to set some data to type struct Rectangle2 and you'll be met with this message:
TypeLibrary: failed to import type 'Point'; referenced but not present in library 'libtest.so.1'
Makes sense! Now go to types view and define struct Point { int x; int y; } and try again, success!
100001000 struct rectangle_unresolved data_100001000 =
100001000 {
100001000 int32_t width = 0x5f0100
100001004 int32_t height = 0x5f030005
100001008 struct Point center =
100001008 {
100001008 int32_t x = 0x655f686d
10000100c int32_t y = 0x75636578
100001010 }
100001008 }
You should repeat the experiment using struct Rectangle and see that you're allowed to create variables with type containing pointers to unresolved type references.
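The difference between the two structs can be sketched with a toy type model. Everything here is hypothetical (the Int, NamedRef, Pointer, and Struct classes and the blocking_references function are not Binary Ninja types); the sketch only mirrors the behavior described above, where an embedded unresolved reference blocks creating a variable but a pointer to one does not.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Int:
    width: int = 4

@dataclass
class NamedRef:
    name: str          # refers to a type by name only

@dataclass
class Pointer:
    target: object

@dataclass
class Struct:
    members: List[tuple] = field(default_factory=list)  # (field name, type)

def blocking_references(t, known):
    """Names that must be defined before a variable of type t can be created."""
    if isinstance(t, NamedRef):
        return [] if t.name in known else [t.name]
    if isinstance(t, Struct):
        missing = []
        for _, member_type in t.members:
            missing += blocking_references(member_type, known)
        return missing
    # In this toy model a pointer never blocks, whatever it points to,
    # and integers reference nothing.
    return []

rectangle = Struct([("width", Int()), ("height", Int()),
                    ("center", Pointer(NamedRef("Point")))])
rectangle2 = Struct([("width", Int()), ("height", Int()),
                     ("center", NamedRef("Point"))])

print(blocking_references(rectangle, known=set()))       # []
print(blocking_references(rectangle2, known=set()))      # ['Point']
print(blocking_references(rectangle2, known={"Point"}))  # []
```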
When a binary is loaded and its external symbols are processed, the symbol names are searched against the named objects from type libraries. If there is a match, the symbol is given the type from the type library. Upon success, you'll see a message like:
type library test.bntl found hit for _DoSuperComputation
At this moment, there is no built-in functionality to apply named objects to an existing Binary Ninja database.
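The load-time lookup can be sketched as a search of each imported symbol against the named objects of the available libraries. This is an illustration only: apply_named_objects is a hypothetical function, libraries are reduced to dictionaries, and taking the first library that knows the symbol is an assumption, not documented behavior.

```python
def apply_named_objects(import_names, libraries):
    """Map each imported symbol to (library name, type) when a library knows it.

    libraries: dict of library name -> {symbol name: type}.
    Assumption: the first library containing the symbol wins.
    """
    hits = {}
    for symbol in import_names:
        for lib_name, named_objects in libraries.items():
            if symbol in named_objects:
                hits[symbol] = (lib_name, named_objects[symbol])
                break
    return hits

libraries = {
    "test.bntl": {"_MySuperComputation": "int32_t (int32_t, int32_t, char*)"},
}
hits = apply_named_objects(["_MySuperComputation", "_unknown"], libraries)
for symbol, (lib_name, _) in hits.items():
    print(f"type library {lib_name} found hit for {symbol}")
```

Symbols with no match, like _unknown here, are simply left without a library-provided type.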
I've seen "types of types", "sorts of types", "kinds of types", and "classes of types" used to differentiate the varieties of possible types, and there are probably more. Binja uses "class". For example:
>>> type_obj.type_class
<TypeClass.FunctionTypeClass: 8>
In api/python/enums.py we can see Binja currently thinks of types as falling into 13 classes: Void, Bool, Integer, Float, Structure, Enumeration, Pointer, Array, Function, VarArgs, Value, NamedTypeReference, WideCharType.
Compare to LLDB, which also uses the term "class", and currently has 19 of them: Array, BlockPointer, Builtin, Class, ComplexFloat, ComplexInteger, Enumeration, Function, MemberPointer, ObjCObject, ObjCInterface, ObjCObjectPointer, Pointer, Reference, Struct, Typedef, Union, Vector, Other.
Compare to GDB, which uses the term "type code" and has 25 of them.
While technically not part of the type, having names of function parameters is very useful and can thus be optionally stored in a type.
Function types (types with .type_class == FunctionTypeClass) have a .parameters attribute, a list of FunctionParameter objects. When those objects have .name=='' you get the bare bones function types like int ()(int, char **). When those objects have their .name populated you get the more meaningful int ()(int argc, char **argv).
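The effect of populated versus empty parameter names can be sketched without Binary Ninja. Param and format_function_type below are hypothetical stand-ins (not the real FunctionParameter class), and types are plain strings.

```python
from dataclasses import dataclass

@dataclass
class Param:
    type: str
    name: str = ""   # empty name means the parameter is unnamed

def format_function_type(return_type, params):
    """Render a function type, including parameter names when present."""
    parts = []
    for p in params:
        if p.name:
            # Pointer types such as "char **" already end in "*",
            # so the name is attached without an extra space.
            sep = "" if p.type.endswith("*") else " "
            parts.append(f"{p.type}{sep}{p.name}")
        else:
            parts.append(p.type)
    return f"{return_type} ()({', '.join(parts)})"

print(format_function_type("int", [Param("int"), Param("char **")]))
# int ()(int, char **)
print(format_function_type("int", [Param("int", "argc"), Param("char **", "argv")]))
# int ()(int argc, char **argv)
```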
By a hierarchy of objects from api/python/types.py referencing one another. The "glue" object is binaryninja.types.Type and depending on the complexity of the type it represents (stored in its .type_class attribute), could have an attribute with more information. For instance, if the binaryninja.types.Type has .type_class == FunctionTypeClass then its .parameters attribute is a list of binaryninja.types.FunctionParameter. See [2].
As an example, here is the hierarchical representation of type struct Rectangle from [1]:
typelib.named_types["Rectangle"] =
----------------------------------
Type class=Structure
  Structure
    StructureMember "width"
      Type class=Integer width=4
    StructureMember "height"
      Type class=Integer width=4
    StructureMember "center"
      Type class=Pointer
        Type class=NamedTypeReference
          NamedTypeReference <named type: struct Point>
Here is the representation of type int ()(int, int) named MyFunctionType in [1]:
Type class=Function
  Type class=Integer width=4 // .return_value
  FunctionParameter ""
    Type class=Integer width=4 // .parameters[0]
  FunctionParameter ""
    Type class=Integer width=4 // .parameters[1]
  FunctionParameter ""
    Type class=Integer width=4 // .parameters[2]
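An indented dump like the ones above can be produced by a short recursive walk. The sketch below does not touch real binaryninja.types objects: each node is a hypothetical (label, children) tuple, and dump is an illustrative helper, not the code in [2].

```python
def dump(node, depth=0):
    """Return indented lines for a (label, children) tree."""
    label, children = node
    lines = ["  " * depth + label]
    for child in children:
        lines += dump(child, depth + 1)
    return lines

int32 = ("Type class=Integer width=4", [])
rectangle = ("Type class=Structure", [
    ("Structure", [
        ('StructureMember "width"', [int32]),
        ('StructureMember "height"', [int32]),
        ('StructureMember "center"', [
            ("Type class=Pointer", [
                ("Type class=NamedTypeReference", [
                    ("NamedTypeReference <named type: struct Point>", []),
                ]),
            ]),
        ]),
    ]),
])

print("\n".join(dump(rectangle)))
```

Each level of nesting adds two spaces, so the depth of a line shows which object holds which.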
Binja reads the ELF's .dynstr section for the requested name. It kind of behaves like the dynamic linker in this regard.
This requested name should be a soname, like "libfoo.so.1" but could be a linkname like "libfoo.so". See [3].
Binja's logic for determining a match is straightforward:
typelibname.removesuffix('.bntl') == requestedname or requestedname in alternativenames
Therefore, without any alternative names, libc.so.bntl will not be loaded by Binja if an ELF requests libc.so.6.
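The matching expression above can be turned into a small runnable function to experiment with names. typelib_matches is a hypothetical helper that restates the rule as written here; it is not Binary Ninja's actual code.

```python
def typelib_matches(typelib_filename, requested_name, alternative_names=()):
    """Return True if a type library would match the name an ELF requests."""
    base = typelib_filename
    if base.endswith(".bntl"):
        base = base[: -len(".bntl")]  # same effect as removesuffix('.bntl')
    return base == requested_name or requested_name in alternative_names

print(typelib_matches("libc.so.bntl", "libc.so.6"))                 # False
print(typelib_matches("libc.so.bntl", "libc.so.6", ["libc.so.6"]))  # True
print(typelib_matches("libc.so.6.bntl", "libc.so.6"))               # True
```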
We recommend and follow this convention: name a type library after the file it was generated from, with ".bntl" appended. When the source library's filename carries additional minor and release numbers, like libfoo.so.1.2.3, Binja would not load the resulting type library libfoo.so.1.2.3.bntl for an ELF requesting soname libfoo.so.1. Therefore the alternative names list should include the most specific version numbers, incrementally stripped down to the soname, and finally a linkname for good measure.
Example:
libfoo.so.1.2.3 is used to generate libfoo.so.1.2.3.bntl
Alternative names list should have:
libfoo.so.1.2.3 <-- includes version, minor, release (most specific)
libfoo.so.1.2 <-- includes version, minor (less specific)
libfoo.so.1 <-- includes version (soname)
libfoo.so <-- linkname
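The list above can be derived mechanically from the library's filename. The function below is a sketch with a hypothetical name (alternative_names); it assumes the filename contains ".so." followed by dot-separated version components, and returns the name unchanged otherwise.

```python
def alternative_names(library_filename):
    """libfoo.so.1.2.3 -> most specific name first, linkname last."""
    marker = ".so"
    idx = library_filename.find(marker + ".")
    if idx == -1:
        # No version components after ".so": nothing to strip.
        return [library_filename]
    linkname = library_filename[: idx + len(marker)]
    versions = library_filename[idx + len(marker) + 1 :].split(".")
    names = []
    # Drop one trailing version component at a time.
    for count in range(len(versions), 0, -1):
        names.append(linkname + "." + ".".join(versions[:count]))
    names.append(linkname)
    return names

for name in alternative_names("libfoo.so.1.2.3"):
    print(name)
# libfoo.so.1.2.3
# libfoo.so.1.2
# libfoo.so.1
# libfoo.so
```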
[1] https://github.com/Vector35/binaryninja-api/blob/dev/python/examples/typelib_create.py
[2] https://github.com/Vector35/binaryninja-api/blob/dev/python/examples/typelib_dump.py
[3] The ldconfig tool is responsible for creating symlinks from sonames to realnames, like /usr/lib/libfoo.so.1 -> /usr/lib/libfoo.so.1.0. At library installation time, a symlink from linkname to soname may have been created, like /usr/lib/libfoo.so -> /usr/lib/libfoo.so.1. See https://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html.