In C++, a class name which is declared with class, struct, or union is not only a tag, as in C, but also a type name. Thus there should be stabs with both `t' and `T' symbol descriptors (see section Giving a Type a Name).
To save space, there is a special abbreviation for this case. If the `T' symbol descriptor is followed by `t', then the stab defines both a type name and a tag.
For example, the C++ code
struct foo {int x;};
can be represented as either
.stabs "foo:T19=s4x:1,0,32;;",128,0,0,0       # 128 is N_LSYM
.stabs "foo:t19",128,0,0,0
or
.stabs "foo:Tt19=s4x:1,0,32;;",128,0,0,0
In C++, a symbol (such as a type name) can be defined within another type.
In stabs, this is sometimes represented by giving the symbol a name which contains `::'. Such a pair of colons does not end the name of the symbol, the way a single colon would (see section The String Field). I'm not sure how consistently used or well thought out this mechanism is. So that a pair of colons in this position always has this meaning, `:' cannot be used as a symbol descriptor.
For example, if the string for a stab is `foo::bar::baz:t5=*6', then foo::bar::baz is the name of the symbol, `t' is the symbol descriptor, and `5=*6' is the type information.
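The splitting rule described above can be sketched in C++ as follows. This is only an illustration, not code from any debugger: the StabName type and the split_stab_string function are hypothetical names, and the sketch assumes that a symbol descriptor letter follows the terminating colon.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type, not part of any real debugger API.
struct StabName {
    std::string name;   // symbol name, may contain "::"
    char descriptor;    // character after the single colon, '\0' if absent
    std::string rest;   // type information after the descriptor
};

// Split a stab string at the first single colon.
// A pair of colons is treated as part of the name.
StabName split_stab_string(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        if (s[i] == ':') {
            if (i + 1 < s.size() && s[i + 1] == ':') {
                i += 2;      // "::" belongs to the name, skip both colons
                continue;
            }
            break;           // a single colon ends the name
        }
        ++i;
    }
    StabName out{s.substr(0, i), '\0', ""};
    if (i + 1 < s.size()) {
        // Assumption: a symbol descriptor letter is present.
        out.descriptor = s[i + 1];
        out.rest = s.substr(i + 2);
    }
    return out;
}
```

Applied to `foo::bar::baz:t5=*6', this sketch yields the name foo::bar::baz, the descriptor `t', and the type information `5=*6'.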
<< the examples that follow are based on a01.C >>
C++ adds two more builtin types to the set defined for C. These are the unknown type and the vtable record type. The unknown type, type 16, is defined in terms of itself like the void type.
The vtable record type, type 17, is defined as a structure type and then as a structure tag. The structure has four fields: delta, index, pfn, and delta2. pfn is the function pointer.
<< In boilerplate $vtbl_ptr_type, what are the fields delta, index, and delta2 used for? >>
This basic type is present in all C++ programs even if there are no virtual methods defined.
.stabs "struct_name:sym_desc(type)type_def(17)=type_desc(struct)struct_bytes(8)
        elem_name(delta):type_ref(short int),bit_offset(0),field_bits(16);
        elem_name(index):type_ref(short int),bit_offset(16),field_bits(16);
        elem_name(pfn):type_def(18)=type_desc(ptr to)type_ref(void),
                                    bit_offset(32),field_bits(32);
        elem_name(delta2):type_ref(short int),bit_offset(32),field_bits(16);;"
        N_LSYM, NIL, NIL
.stabs "$vtbl_ptr_type:t17=s8 delta:6,0,16;index:6,16,16;pfn:18=*15,32,32;delta2:6,32,16;;" ,128,0,0,0
.stabs "name:sym_dec(struct tag)type_ref($vtbl_ptr_type)",N_LSYM,NIL,NIL,NIL
.stabs "$vtbl_ptr_type:T17",128,0,0,0
The stabs describing C++ language features are an extension of the stabs describing C. Stabs representing C++ class types elaborate extensively on the stab format used to describe structure types in C. Stabs representing class type variables look just like stabs representing C language variables.
Consider the following very simple class definition.
class baseA {
public:
        int Adat;
        int Ameth(int in, char other);
};
The class baseA is represented by two stabs. The first stab describes the class as a structure type. The second stab describes a structure tag of the class type. Both stabs are of stab type N_LSYM. Since the stab is not located between an N_FUN and an N_LBRAC stab, this indicates that the class is defined at file scope. If it were, then the N_LSYM would signify a local variable.
A stab describing a C++ class type is similar in format to a stab describing a C struct, with each class member shown as a field in the structure. The part of the struct format describing fields is expanded to include extra information relevant to C++ class members. In addition, if the class has multiple base classes or virtual functions the struct format outside of the field parts is also augmented.
In this simple example the field part of the C++ class stab representing member data looks just like the field part of a C struct stab. The section on protections describes how its format is sometimes extended for member data.
The field part of a C++ class stab representing a member function differs substantially from the field part of a C struct stab. It still begins with `name:' but then goes on to define a new type number for the member function, describe its return type, its argument types, its protection level, any qualifiers applied to the method definition, and whether the method is virtual or not. If the method is virtual then the method description goes on to give the vtable index of the method, and the type number of the first base class defining the method.
When the field name is a method name it is followed by two colons rather than one. This is followed by a new type definition for the method. This is a number followed by an equal sign and the type of the method. Normally this will be a type declared using the `#' type descriptor; see section The `#' Type Descriptor; static member functions are declared using the `f' type descriptor instead; see section Function Types.
The format of an overloaded operator method name differs from that of other methods. It is `op$::operator-name.' where operator-name is the operator name such as `+' or `+='. The name ends with a period, and any characters except the period can occur in the operator-name string.
The next part of the method description represents the arguments to the method, preceded by a colon and ending with a semi-colon. The types of the arguments are expressed in the same way argument types are expressed in C++ name mangling. In this example an int and a char map to `ic'.
This is followed by a number, a letter, and an asterisk or period, followed by another semicolon. The number indicates the protections that apply to the member function. Here the 2 means public. The letter encodes any qualifier applied to the method definition. In this case, `A' means that it is a normal function definition. The dot shows that the method is not virtual. The sections that follow elaborate further on these fields and describe the additional information present for virtual methods.
.stabs "class_name:sym_desc(type)type_def(20)=type_desc(struct)struct_bytes(4) field_name(Adat):type(int),bit_offset(0),field_bits(32); method_name(Ameth)::type_def(21)=type_desc(method)return_type(int); :arg_types(int char); protection(public)qualifier(normal)virtual(no);;" N_LSYM,NIL,NIL,NIL
.stabs "baseA:t20=s4Adat:1,0,32;Ameth::21=##1;:ic;2A.;;",128,0,0,0
.stabs "class_name:sym_desc(struct tag)",N_LSYM,NIL,NIL,NIL
.stabs "baseA:T20",128,0,0,0
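As an illustration of the three characters that follow the argument list (`2A.' in the stab above), here is a minimal C++ sketch of a decoder. The MethodFlags type and decode_method_flags function are hypothetical names introduced only for this example, and rejecting unknown protection digits is an assumption of the sketch rather than documented debugger behavior.

```cpp
#include <string>

// Hypothetical record for the characters that follow the argument list.
struct MethodFlags {
    int protection;   // 0 private, 1 protected, 2 public
    char qualifier;   // 'A' for a normal method in this example
    bool is_virtual;  // '*' means virtual, '.' means not virtual
};

// Decode a trailer such as "2A." or "2A*".
// Returns false if the text does not have the expected shape.
bool decode_method_flags(const std::string& t, MethodFlags& out) {
    if (t.size() < 3) return false;
    if (t[0] < '0' || t[0] > '2') return false;     // assumption: reject unknown digits
    if (t[2] != '.' && t[2] != '*') return false;
    out.protection = t[0] - '0';
    out.qualifier = t[1];
    out.is_virtual = (t[2] == '*');
    return true;
}
```

For `2A.' the sketch reports protection 2 (public), qualifier `A', and a non-virtual method.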
As shown above, describing even a simple C++ class definition is accomplished by massively extending the stab format used in C to describe structure types. However, once the class is defined, C stabs with no modifications can be used to describe class instances. The following source:
main () { baseA AbaseA; }
yields the following stab describing the class instance. It looks no different from a standard C stab describing a local variable.
.stabs "name:type_ref(baseA)", N_LSYM, NIL, NIL, frame_ptr_offset
.stabs "AbaseA:20",128,0,0,-20
The class definition shown above declares Ameth. The C++ source below defines Ameth:
int baseA::Ameth(int in, char other) { return in; };
This method definition yields three stabs following the code of the method. One stab describes the method itself and the following two describe its parameters. Although there is only one formal argument all methods have an implicit argument which is the this pointer. The this pointer is a pointer to the object on which the method was called. Note that the method name is mangled to encode the class name and argument types. Name mangling is described in the ARM (The Annotated C++ Reference Manual, by Ellis and Stroustrup, ISBN 0-201-51459-1); `gpcompare.texi' in Cygnus GCC distributions describes the differences between GNU mangling and ARM mangling.
.stabs "name:symbol_descriptor(global function)return_type(int)", N_FUN, NIL, NIL, code_addr_of_method_start
.stabs "Ameth__5baseAic:F1",36,0,0,_Ameth__5baseAic
Here is the stab for the this pointer implicit argument. The name of the this pointer is always this. Type 19, the type of the this pointer, is defined as a pointer to type 20, baseA, but a stab defining baseA has not yet been emitted. Since the compiler knows it will be emitted shortly, here it just outputs a cross reference to the undefined symbol, by prefixing the symbol name with `xs'.
.stabs "name:sym_desc(register param)type_def(19)= type_desc(ptr to)type_ref(baseA)= type_desc(cross-reference to)baseA:",N_RSYM,NIL,NIL,register_number
.stabs "this:P19=*20=xsbaseA:",64,0,0,8
The stab for the explicit integer argument looks just like a parameter to a C function. The last field of the stab is the offset from the argument pointer, which in most systems is the same as the frame pointer.
.stabs "name:sym_desc(value parameter)type_ref(int)", N_PSYM,NIL,NIL,offset_from_arg_ptr
.stabs "in:p1",160,0,0,72
<< The examples that follow are based on A1.C >>
The `#' type descriptor is used to describe a class method. This is a function which takes an extra argument as its first argument, for the this pointer.
If the `#' is immediately followed by another `#', the second one will be followed by the return type and a semicolon. The class and argument types are not specified, and must be determined by demangling the name of the method if it is available.
Otherwise, the single `#' is followed by the class type, a comma, the return type, a comma, and zero or more parameter types separated by commas. The list of arguments is terminated by a semicolon. In the debugging output generated by gcc, a final argument type of void indicates a method which does not take a variable number of arguments. If the final argument type of void does not appear, the method was declared with an ellipsis.
Note that although such a type will normally be used to describe fields in structures, unions, or classes, for at least some versions of the compiler it can also be used in other contexts.
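The two forms of the `#' type descriptor can be told apart with a small parser such as the C++ sketch below. The MethodType type and parse_method_type function are hypothetical names. The sketch assumes that every type reference is a plain type number, so that commas and semicolons only appear as separators; nested type definitions and other forms of type reference are not handled.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this illustration.
struct MethodType {
    bool short_form = false;          // true for the "##" form
    std::string class_type;           // empty in the short form
    std::string return_type;
    std::vector<std::string> params;  // empty in the short form
};

// Parse text beginning at '#', up to the first semicolon.
// Assumption: type references are plain numbers.
bool parse_method_type(const std::string& s, MethodType& out) {
    if (s.size() < 2 || s[0] != '#') return false;
    std::size_t end = s.find(';');
    if (end == std::string::npos) return false;
    if (s[1] == '#') {
        // "##" form: only the return type is given.
        out.short_form = true;
        out.return_type = s.substr(2, end - 2);
        return true;
    }
    // "#" form: class type, return type, then parameter types.
    std::istringstream fields(s.substr(1, end - 1));
    std::vector<std::string> parts;
    for (std::string p; std::getline(fields, p, ',');) parts.push_back(p);
    if (parts.size() < 2) return false;
    out.class_type = parts[0];
    out.return_type = parts[1];
    out.params.assign(parts.begin() + 2, parts.end());
    return true;
}
```

For `##1;' the sketch records only the return type 1. For a made-up string such as `#20,1,1,2,15;' it records class type 20, return type 1, and the parameter types 1, 2, and 15.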
The `@' type descriptor is used together with the `*' type descriptor for a pointer-to-non-static-member-data type. It is followed by type information for the class (or union), a comma, and type information for the member data.
The following C++ source:
typedef int A::*int_in_a;
generates the following stab:
.stabs "int_in_a:t20=*21=@19,1",128,0,0,0
Note that there is a conflict between this and type attributes (see section The String Field); both use type descriptor `@'. Fortunately, the `@' type descriptor used in this C++ sense always will be followed by a digit, `(', or `-', and type attributes never start with those things.
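The disambiguation rule stated above amounts to a one-character lookahead. A minimal C++ sketch follows; at_sign_is_member_type is a hypothetical name chosen for this example.

```cpp
#include <cstddef>
#include <string>

// Returns true if the '@' at index pos introduces a C++ member type,
// that is, if it is followed by a digit, '(' or '-'.
// Returns false for a type attribute or if there is no '@' at pos.
bool at_sign_is_member_type(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '@') return false;
    if (pos + 1 >= s.size()) return false;
    char c = s[pos + 1];
    return (c >= '0' && c <= '9') || c == '(' || c == '-';
}
```

In the type information `*21=@19,1' from the stab above, the `@' is at index 4 and is followed by a digit, so the sketch classifies it as a member type.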
In the simple class definition shown above all member data and functions were publicly accessible. The example that follows contrasts public, protected and privately accessible fields and shows how these protections are encoded in C++ stabs.
If the character following the `field-name:' part of the string is `/', then the next character is the visibility. `0' means private, `1' means protected, and `2' means public. Debuggers should ignore visibility characters they do not recognize, and assume a reasonable default (such as public) (GDB 4.11 does not, but this should be fixed in the next GDB release). If no visibility is specified the field is public. The visibility `9' means that the field has been optimized out and is public (there is no way to specify an optimized out field with a private or protected visibility). Visibility `9' is not supported by GDB 4.11; this should be fixed in the next GDB release.
The following C++ source:
class vis {
private:
        int   priv;
protected:
        char  prot;
public:
        float pub;
};
generates the following stab:
# 128 is N_LSYM
.stabs "vis:T19=s12priv:/01,0,32;prot:/12,32,8;pub:12,64,32;;",128,0,0,0
`vis:T19=s12' indicates that type number 19 is a 12 byte structure named vis. The priv field has private visibility (`/0'), type int (`1'), and offset and size `,0,32;'. The prot field has protected visibility (`/1'), type char (`2') and offset and size `,32,8;'. The pub field has type float (`12'), and offset and size `,64,32;'.
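The visibility rules for member data can be sketched in C++ as below. The Visibility enumeration and the field_visibility function are hypothetical names. Following the recommendation above, the sketch treats unrecognized visibility characters as public.

```cpp
#include <cstddef>
#include <string>

// Hypothetical enumeration for this illustration.
enum class Visibility { Private, Protected, Public, OptimizedOutPublic };

// after_colon is the text that follows "field-name:".
// On return, type_start is the index at which the type information begins.
Visibility field_visibility(const std::string& after_colon, std::size_t& type_start) {
    type_start = 0;
    if (after_colon.size() < 2 || after_colon[0] != '/') {
        return Visibility::Public;           // no visibility given
    }
    type_start = 2;                          // skip '/' and the visibility character
    switch (after_colon[1]) {
        case '0': return Visibility::Private;
        case '1': return Visibility::Protected;
        case '9': return Visibility::OptimizedOutPublic;
        default:  return Visibility::Public; // '2' and unrecognized characters
    }
}
```

For the three fields of vis, the inputs `/01,0,32;', `/12,32,8;', and `12,64,32;' are classified as private, protected, and public respectively.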
Protections for member functions are signified by one digit embedded in the field part of the stab describing the method. The digit is 0 if private, 1 if protected and 2 if public. Consider the C++ class definition below:
class all_methods {
private:
        int   priv_meth(int in){return in;};
protected:
        char  protMeth(char in){return in;};
public:
        float pubMeth(float in){return in;};
};
It generates the following stab. The digit in question is to the left of an `A' in each case. Notice also that in this case two symbol descriptors apply to the class name struct tag and struct type.
.stabs "class_name:sym_desc(struct tag&type)type_def(21)= sym_desc(struct)struct_bytes(1) meth_name::type_def(22)=sym_desc(method)returning(int); :args(int);protection(private)modifier(normal)virtual(no); meth_name::type_def(23)=sym_desc(method)returning(char); :args(char);protection(protected)modifier(normal)virtual(no); meth_name::type_def(24)=sym_desc(method)returning(float); :args(float);protection(public)modifier(normal)virtual(no);;", N_LSYM,NIL,NIL,NIL
.stabs "all_methods:Tt21=s1priv_meth::22=##1;:i;0A.;protMeth::23=##2;:c;1A.; pubMeth::24=##12;:f;2A.;;",128,0,0,0
Method modifiers (const, volatile, const volatile)
<< based on a6.C >>
In the class example described above all the methods have the normal modifier. This method modifier information is located just after the protection information for the method. This field has four possible character values. Normal methods use `A', const methods use `B', volatile methods use `C', and const volatile methods use `D'. Consider the class definition below:
class A {
public:
        int ConstMeth (int arg) const { return arg; };
        char VolatileMeth (char arg) volatile { return arg; };
        float ConstVolMeth (float arg) const volatile {return arg; };
};
This class is described by the following stab:
.stabs "class(A):sym_desc(struct)type_def(20)=type_desc(struct)struct_bytes(1) meth_name(ConstMeth)::type_def(21)sym_desc(method) returning(int);:arg(int);protection(public)modifier(const)virtual(no); meth_name(VolatileMeth)::type_def(22)=sym_desc(method) returning(char);:arg(char);protection(public)modifier(volatile)virt(no) meth_name(ConstVolMeth)::type_def(23)=sym_desc(method) returning(float);:arg(float);protection(public)modifier(const volatile) virtual(no);;", ...
.stabs "A:T20=s1ConstMeth::21=##1;:i;2B.;VolatileMeth::22=##2;:c;2C.; ConstVolMeth::23=##12;:f;2D.;;",128,0,0,0
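The mapping from modifier letter to qualifiers is small enough to show as a table in code. The following C++ sketch uses the hypothetical names Qualifiers and decode_qualifier.

```cpp
// Hypothetical type for this illustration.
struct Qualifiers {
    bool is_const;
    bool is_volatile;
};

// Map the modifier letter to const/volatile flags.
// Returns false for a letter other than 'A', 'B', 'C' or 'D'.
bool decode_qualifier(char letter, Qualifiers& out) {
    switch (letter) {
        case 'A': out = {false, false}; return true;  // normal
        case 'B': out = {true,  false}; return true;  // const
        case 'C': out = {false, true};  return true;  // volatile
        case 'D': out = {true,  true};  return true;  // const volatile
        default:  return false;
    }
}
```

In the stab above, the letters `B', `C', and `D' after the protection digit `2' correspond to ConstMeth, VolatileMeth, and ConstVolMeth.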
<< The following examples are based on a4.C >>
The presence of virtual methods in a class definition adds additional data to the class description. The extra data is appended to the description of the virtual method and to the end of the class description. Consider the class definition below:
class A {
public:
        int Adat;
        virtual int A_virt (int arg) { return arg; };
};
This results in the stab below describing class A. It defines a new type (20) which is an 8 byte structure. The first field of the class struct is `Adat', an integer, starting at structure offset 0 and occupying 32 bits.
The second field in the class struct is not explicitly defined by the C++ class definition but is implied by the fact that the class contains a virtual method. This field is the vtable pointer. The name of the vtable pointer field starts with `$vf' and continues with a type reference to the class it is part of. In this example the type reference for class A is 20 so the name of its vtable pointer field is `$vf20', followed by the usual colon.
Next there is a type definition for the vtable pointer type (21). This is in turn defined as a pointer to another new type (22).
Type 22 is the vtable itself, which is defined as an array, indexed by a range of integers between 0 and 1, and whose elements are of type 17. Type 17 was the vtable record type defined by the boilerplate C++ type definitions, as shown earlier.
The bit offset of the vtable pointer field is 32. The number of bits in the field is not specified when the field is a vtable pointer.
Next is the method definition for the virtual member function A_virt. Its description starts out using the same format as the non-virtual member functions described above, except instead of a dot after the `A' there is an asterisk, indicating that the function is virtual. Since it is virtual, some additional information is appended to the end of the method description.
The first number represents the vtable index of the method. This is a 32 bit unsigned number with the high bit set, followed by a semi-colon.
The second number is a type reference to the first base class in the inheritance hierarchy defining the virtual member function. In this case the class stab describes a base class so the virtual function is not overriding any other definition of the method. Therefore the reference is to the type number of the class that the stab is describing (20).
This is followed by three semi-colons. One marks the end of the current sub-section, one marks the end of the method field, and the third marks the end of the struct definition.
For classes containing virtual functions the very last section of the string part of the stab holds a type reference to the first base class. This is preceded by `~%' and followed by a final semi-colon.
.stabs "class_name(A):type_def(20)=sym_desc(struct)struct_bytes(8) field_name(Adat):type_ref(int),bit_offset(0),field_bits(32); field_name(A virt func ptr):type_def(21)=type_desc(ptr to)type_def(22)= sym_desc(array)index_type_ref(range of int from 0 to 1); elem_type_ref(vtbl elem type), bit_offset(32); meth_name(A_virt)::typedef(23)=sym_desc(method)returning(int); :arg_type(int),protection(public)normal(yes)virtual(yes) vtable_index(1);class_first_defining(A);;;~%first_base(A);", N_LSYM,NIL,NIL,NIL
.stabs "A:t20=s8Adat:1,0,32;$vf20:21=*22=ar1;0;1;17,32; A_virt::23=##1;:i;2A*-2147483647;20;;;~%20;",128,0,0,0
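The vtable index in the stab above is printed as -2147483647, while the annotated form gives the index as 1. One way to reconcile the two, sketched below in C++, is to reinterpret the printed value as a 32 bit unsigned number and clear the high bit. The function name vtable_index is hypothetical, and the reading that the low 31 bits carry the index is an inference from this example rather than a documented rule.

```cpp
#include <cstdint>
#include <cstdlib>

// Convert the decimal text of a vtable index field to the index itself.
// Assumption: the low 31 bits of the 32 bit value hold the index.
std::uint32_t vtable_index(const char* text) {
    long long printed = std::strtoll(text, nullptr, 10);
    // Conversion to an unsigned type wraps modulo 2^32,
    // so -2147483647 becomes 0x80000001.
    std::uint32_t raw = static_cast<std::uint32_t>(printed);
    return raw & 0x7fffffffu;   // clear the high bit
}
```

With this reading, -2147483647 decodes to index 1 and -2147483646 decodes to index 2.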
Stabs describing C++ derived classes include additional sections that describe the inheritance hierarchy of the class. A derived class stab also encodes the number of base classes. For each base class it tells if the base class is virtual or not, and if the inheritance is private or public. It also gives the offset into the object of the portion of the object corresponding to each base class.
This additional information is embedded in the class stab following the number of bytes in the struct. First the number of base classes appears bracketed by an exclamation point and a comma.
Then for each base type there repeats a series: a virtual character, a visibility character, a number, a comma, another number, and a semi-colon.
The virtual character is `1' if the base class is virtual and `0' if not. The visibility character is `2' if the derivation is public, `1' if it is protected, and `0' if it is private. Debuggers should ignore virtual or visibility characters they do not recognize, and assume a reasonable default (such as public and non-virtual) (GDB 4.11 does not, but this should be fixed in the next GDB release).
The number following the virtual and visibility characters is the offset from the start of the object to the part of the object pertaining to the base class.
After the comma, the second number is a type_descriptor for the base type. Finally a semi-colon ends the series, which repeats for each base class.
The source below defines three base classes A, B, and C and the derived class D.
class A {
public:
        int Adat;
        virtual int A_virt (int arg) { return arg; };
};

class B {
public:
        int B_dat;
        virtual int B_virt (int arg) {return arg; };
};

class C {
public:
        int Cdat;
        virtual int C_virt (int arg) {return arg; };
};

class D : A, virtual B, public C {
public:
        int Ddat;
        virtual int A_virt (int arg ) { return arg+1; };
        virtual int B_virt (int arg)  { return arg+2; };
        virtual int C_virt (int arg)  { return arg+3; };
        virtual int D_virt (int arg)  { return arg; };
};
Class stabs similar to the ones described earlier are generated for each base class.
.stabs "A:T20=s8Adat:1,0,32;$vf20:21=*22=ar1;0;1;17,32; A_virt::23=##1;:i;2A*-2147483647;20;;;~%20;",128,0,0,0
.stabs "B:Tt25=s8Bdat:1,0,32;$vf25:21,32;B_virt::26=##1; :i;2A*-2147483647;25;;;~%25;",128,0,0,0
.stabs "C:Tt28=s8Cdat:1,0,32;$vf28:21,32;C_virt::29=##1; :i;2A*-2147483647;28;;;~%28;",128,0,0,0
In the stab describing derived class D below, the information about the derivation of this class is encoded as follows.
.stabs "derived_class_name:symbol_descriptors(struct tag&type)= type_descriptor(struct)struct_bytes(32)!num_bases(3), base_virtual(no)inheritance_public(no)base_offset(0), base_class_type_ref(A); base_virtual(yes)inheritance_public(no)base_offset(NIL), base_class_type_ref(B); base_virtual(no)inheritance_public(yes)base_offset(64), base_class_type_ref(C); ...
.stabs "D:Tt31=s32!3,000,20;100,25;0264,28;$vb25:24,128;Ddat: 1,160,32;A_virt::32=##1;:i;2A*-2147483647;20;;B_virt: :32:i;2A*-2147483647;25;;C_virt::32:i;2A*-2147483647; 28;;D_virt::32:i;2A*-2147483646;31;;;~%20;",128,0,0,0
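The derivation information in the stab for D (`!3,000,20;100,25;0264,28;') can be decoded with a sketch like the C++ below. The BaseClass type and parse_base_classes function are hypothetical names. The sketch assumes that base types are given as plain type numbers, and it applies the suggested defaults (non-virtual, public) to characters it does not recognize.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record for one base class entry.
struct BaseClass {
    bool is_virtual;
    int visibility;     // 0 private, 1 protected, 2 public
    long offset;        // offset as printed in the stab
    long type_number;   // assumption: a plain type number
};

// Parse text of the form "!count,VPoffset,type;..." starting at '!'.
bool parse_base_classes(const std::string& s, std::vector<BaseClass>& out) {
    if (s.empty() || s[0] != '!') return false;
    const char* p = s.c_str() + 1;
    char* end = nullptr;
    long count = std::strtol(p, &end, 10);
    if (end == p || *end != ',') return false;
    p = end + 1;
    for (long i = 0; i < count; ++i) {
        if (p[0] == '\0' || p[1] == '\0') return false;
        BaseClass b{};
        b.is_virtual = (p[0] == '1');                  // anything else: non-virtual
        b.visibility = (p[1] >= '0' && p[1] <= '2') ? p[1] - '0' : 2;
        p += 2;
        b.offset = std::strtol(p, &end, 10);
        if (end == p || *end != ',') return false;
        p = end + 1;
        b.type_number = std::strtol(p, &end, 10);
        if (end == p || *end != ';') return false;
        p = end + 1;
        out.push_back(b);
    }
    return true;
}
```

For class D this gives three entries: A (type 20) is non-virtual and private at offset 0, B (type 25) is virtual and private with offset 0, and C (type 28) is non-virtual and public at offset 64.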
A derived class object consists of a concatenation in memory of the data areas defined by each base class, starting with the leftmost and ending with the rightmost in the list of base classes. The exception to this rule is for virtual inheritance. In the example above, class D inherits virtually from base class B. This means that an instance of a D object will not contain its own B part but merely a pointer to a B part, known as a virtual base pointer.
In a derived class stab, the base offset part of the derivation information, described above, shows how the base class parts are ordered. The base offset for a virtual base class is always given as 0. Notice that the base offset for B is given as 0 even though B is not the first base class. The first base class A starts at offset 0.
The field information part of the stab for class D describes the field which is the pointer to the virtual base class B. The vbase pointer name is `$vb' followed by a type reference to the virtual base class. Since the type id for B in this example is 25, the vbase pointer name is `$vb25'.
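Both compiler-generated field names, `$vf' for a vtable pointer and `$vb' for a virtual base pointer, can be recognized by their prefix. The C++ sketch below uses the hypothetical names HiddenField and classify_field_name and simply returns the text after the prefix as the type reference.

```cpp
#include <string>

// Hypothetical enumeration for this illustration.
enum class HiddenField { None, VtablePointer, VirtualBasePointer };

// Classify a field name. On a match, type_ref receives the text
// after the three-character prefix; otherwise it is cleared.
HiddenField classify_field_name(const std::string& name, std::string& type_ref) {
    if (name.compare(0, 3, "$vf") == 0) {
        type_ref = name.substr(3);
        return HiddenField::VtablePointer;
    }
    if (name.compare(0, 3, "$vb") == 0) {
        type_ref = name.substr(3);
        return HiddenField::VirtualBasePointer;
    }
    type_ref.clear();
    return HiddenField::None;
}
```

With the names used in this document, `$vf20' is classified as a vtable pointer for type 20 and `$vb25' as a virtual base pointer for type 25.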
.stabs "D:Tt31=s32!3,000,20;100,25;0264,28;$vb25:24,128;Ddat:1, 160,32;A_virt::32=##1;:i;2A*-2147483647;20;;B_virt::32:i; 2A*-2147483647;25;;C_virt::32:i;2A*-2147483647;28;;D_virt: :32:i;2A*-2147483646;31;;;~%20;",128,0,0,0
Following the name and a colon is a type reference describing the type of the virtual base class pointer, in this case 24. Type 24 was defined earlier as the type of the B class this pointer. The this pointer for a class is a pointer to the class type.
.stabs "this:P24=*25=xsB:",64,0,0,8
Finally the field offset part of the vbase pointer field description shows that the vbase pointer is the first field in the D object, before any data fields defined by the class. The layout of a D class object is as follows: Adat at 0, the vtable pointer for A at 32, Cdat at 64, the vtable pointer for C at 96, the virtual base pointer for B at 128, and Ddat at 160.
The data area for a class is a concatenation of the space used by the data members of the class. If the class has virtual methods, a vtable pointer follows the class data. The field offset part of each field description in the class stab shows this ordering.
<< How is this reflected in stabs? See Cygnus bug #677 for some info. >>