Different types of stabs describe the various ways that variables can be allocated: on the stack, globally, in registers, in common blocks, statically, or as arguments to a function.
If a variable's scope is local to a function and its lifetime is only as long as that function executes (C calls such variables automatic), it can be allocated in a register (see section Register Variables) or on the stack.
Each variable allocated on the stack has a stab with the symbol
descriptor omitted. Since type information should begin with a digit,
`-', or `(', only those characters are precluded from being used
for symbol descriptors. However, the Acorn RISC machine (ARM) is said
to get this wrong: it puts out a mere type definition here, without the
preceding `type-number='. This is a bad idea; there is no
guarantee that type descriptors are distinct from symbol descriptors.
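The rule above can be sketched in C. This is only an illustration: the function name `stab_symbol_descriptor` is made up here, and the sketch assumes the variable name itself contains no colon.

```c
#include <stddef.h>
#include <string.h>

/* Return the symbol descriptor of a stab string such as "g_foo:G2",
   or '\0' when the descriptor is omitted (a stack variable) or when
   the string has no colon.  Hypothetical helper; assumes the name
   part contains no ':' of its own. */
char stab_symbol_descriptor(const char *stab_string)
{
    const char *colon = strchr(stab_string, ':');
    if (colon == NULL)
        return '\0';

    char c = colon[1];
    /* Type information begins with a digit, '-', or '(', so none of
       these characters can be a symbol descriptor. */
    if ((c >= '0' && c <= '9') || c == '-' || c == '(')
        return '\0';
    return c;
}
```

With this sketch, "x:1" has no descriptor, while "g_foo:G2" yields `G' and "argc:p1" yields `p'.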
Stabs for stack variables use the N_LSYM stab type, or C_LSYM for XCOFF.
The value of the stab is the offset of the variable within the local variables. On most machines this is an offset from the frame pointer and is negative. The location of the stab specifies which block it is defined in; see section Block Structure.
For example, the following C code:
int main () { int x; }
produces the following stabs:
.stabs "main:F1",36,0,0,_main   # 36 is N_FUN
.stabs "x:1",128,0,0,-12        # 128 is N_LSYM
.stabn 192,0,0,LBB2             # 192 is N_LBRAC
.stabn 224,0,0,LBE2             # 224 is N_RBRAC
See section Procedures for more information on the N_FUN stab, and
section Block Structure for more information on the N_LBRAC and
N_RBRAC stabs.
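As a sketch of how a debugger might use the value of the `x' stab in the example, the following C function adds the stab value to the frame pointer. It assumes a machine where the value really is an offset from the frame pointer; the function name is hypothetical.

```c
#include <stdint.h>

/* Compute the address of a stack variable from the frame pointer and
   the (usually negative) value of its N_LSYM stab.  Assumes the stab
   value is an offset from the frame pointer, which holds on most
   machines but not necessarily all. */
uintptr_t stack_variable_address(uintptr_t frame_pointer, int32_t stab_value)
{
    /* Unsigned arithmetic wraps, so adding a negative offset works. */
    return frame_pointer + (uintptr_t)(intptr_t)stab_value;
}
```

For the stab value -12 and a frame pointer of 0x1000, this gives 0xFF4.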
A variable whose scope is not specific to just one source file is
represented by the `G' symbol descriptor. These stabs use the
N_GSYM stab type (C_GSYM for XCOFF). The type information for
the stab (see section The String Field) gives the type of the variable.
For example, the following source code:
char g_foo = 'c';
yields the following assembly code:
.stabs "g_foo:G2",32,0,0,0     # 32 is N_GSYM
     .global _g_foo
     .data
_g_foo:
     .byte 99
The address of the variable represented by the N_GSYM is not
contained in the N_GSYM stab. The debugger gets this information
from the external symbol for the global variable. In the example above,
the .global _g_foo and _g_foo: lines tell the assembler to
produce an external symbol.
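A minimal sketch of that lookup follows. The `ext_symbol` record and the function `resolve_gsym` are invented for illustration, and the sketch assumes the target prefixes C names with an underscore, as in the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified external symbol record. */
struct ext_symbol {
    const char *name;       /* linker-level name, e.g. "_g_foo" */
    unsigned long address;
};

/* Find the address for an N_GSYM stab whose name is stab_name
   (e.g. "g_foo").  Assumes external names carry a leading '_'.
   Returns 1 and stores the address on success, 0 otherwise. */
int resolve_gsym(const struct ext_symbol *syms, size_t count,
                 const char *stab_name, unsigned long *address_out)
{
    for (size_t i = 0; i < count; i++) {
        const char *name = syms[i].name;
        if (name[0] == '_' && strcmp(name + 1, stab_name) == 0) {
            *address_out = syms[i].address;
            return 1;
        }
    }
    return 0;
}
```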
Some compilers, like GCC, output N_GSYM stabs only once, where
the variable is defined. Other compilers, like SunOS4 /bin/cc, output an
N_GSYM stab for each compilation unit which references the
variable.
Register variables have their own stab type, N_RSYM (C_RSYM
for XCOFF), and their own symbol descriptor, `r'.
The stab's value is the number of the register where the variable data
will be stored.
AIX defines a separate symbol descriptor `d' for floating point registers. This seems unnecessary; why not just give floating point registers different register numbers? I have not verified whether the compiler actually uses `d'.
If the register is explicitly allocated to a global variable, but not initialized, as in:
register int g_bar asm ("%g5");
then the stab may be emitted at the end of the object file, with the other bss symbols.
A common block is a statically allocated section of memory which can be referred to by several source files. It may contain several variables. I believe Fortran is the only language with this feature.
An N_BCOMM stab begins a common block and an N_ECOMM stab
ends it. The only field that is significant in these two stabs is the
string, which names a normal (non-debugging) symbol that gives the
address of the common block. According to IBM documentation, only the
N_BCOMM has the name of the common block (even though their
compiler actually puts it both places).
The stabs for the members of the common block are between the
N_BCOMM and the N_ECOMM; the value of each stab is the
offset within the common block of that variable. IBM uses the
C_ECOML stab type, and there is a corresponding N_ECOML
stab type, but Sun's Fortran compiler uses N_GSYM instead. The
variables within a common block use the `V' symbol descriptor (I
believe this is true of all Fortran variables). Other stabs (at least
type declarations using C_DECL) can also be between the
N_BCOMM and the N_ECOMM.
Initialized static variables are represented by the `S' and `V' symbol descriptors. `S' means file scope static, and `V' means procedure scope static. One exception: in XCOFF, IBM's xlc compiler always uses `V', and whether it is file scope or not is distinguished by whether the stab is located within a function.
In a.out files, N_STSYM means the data section, N_FUN
means the text section, and N_LCSYM means the bss section. For
those systems with a read-only data section separate from the text
section (Solaris), N_ROSYM means the read-only data section.
For example, the source lines:
static const int var_const = 5;
static int var_init = 2;
static int var_noinit;
yield the following stabs:
.stabs "var_const:S1",36,0,0,_var_const      # 36 is N_FUN
...
.stabs "var_init:S1",38,0,0,_var_init        # 38 is N_STSYM
...
.stabs "var_noinit:S1",40,0,0,_var_noinit    # 40 is N_LCSYM
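The a.out mapping from stab type to section can be sketched as a small C function. The numeric values are the ones quoted in the example; N_ROSYM is left out because its number is not given here, and the function name is hypothetical.

```c
#include <stddef.h>

/* Stab type numbers as quoted in the example above. */
enum {
    N_FUN   = 36,
    N_STSYM = 38,
    N_LCSYM = 40
};

/* Return the a.out section that holds a static variable described by
   a stab of the given type, or NULL if the type does not name one. */
const char *aout_static_section(int stab_type)
{
    switch (stab_type) {
    case N_STSYM: return "data";
    case N_FUN:   return "text";
    case N_LCSYM: return "bss";
    default:      return NULL;
    }
}
```

So var_const (type 36) lands in text, var_init (type 38) in data, and var_noinit (type 40) in bss.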
In XCOFF files, the stab type need not indicate the section;
C_STSYM can be used for all statics. Also, each static variable
is enclosed in a static block. A C_BSTAT (emitted with a
`.bs' assembler directive) symbol begins the static block; its
value is the symbol number of the csect symbol whose value is the
address of the static block, its section is the section of the variables
in that static block, and its name is `.bs'. A C_ESTAT
(emitted with a `.es' assembler directive) symbol ends the static
block; its name is `.es' and its value and section are ignored.
In ECOFF files, the storage class is used to specify the section, so the stab type need not indicate the section.
In ELF files, for the SunPRO compiler version 2.0.1, symbol descriptor `S' means that the address is absolute (the linker relocates it) and symbol descriptor `V' means that the address is relative to the start of the relevant section for that compilation unit. SunPRO has plans to have the linker stop relocating stabs; I suspect that their debugger gets the address from the corresponding ELF (not stab) symbol. I'm not sure how to find which symbol of that name is the right one. The clean way to do all this would be to have the value of a symbol descriptor `S' symbol be an offset relative to the start of the file, just like everything else, but that introduces obvious compatibility problems. For more information on linker stab relocation, see section Having the Linker Relocate Stabs in ELF.
Fortran (at least, the Sun and SGI dialects of FORTRAN-77) has a feature
which allows allocating arrays with malloc, but which avoids
blurring the line between arrays and pointers the way that C does. In
stabs such a variable uses the `b' symbol descriptor.
For example, the Fortran declarations
real foo, foo10(10), foo10_5(10,5)
pointer (foop, foo)
pointer (foo10p, foo10)
pointer (foo105p, foo10_5)
produce the stabs
foo:b6
foo10:bar3;1;10;6
foo10_5:bar3;1;5;ar3;1;10;6
In this example, real is type 6 and type 3 is an integral type
which is the type of the subscripts of the array (probably
integer).
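To make the nesting in these type strings concrete, here is a rough C sketch that walks the text following the `b' descriptor. It assumes each array level has exactly the form `ar' index-type `;' lower `;' upper `;' as in the example, and handles nothing else. The function name is invented.

```c
#include <stdio.h>

/* Walk nested array types such as "ar3;1;5;ar3;1;10;6".
   Returns the number of array dimensions and stores the total element
   count and the element type number; returns -1 if the final element
   type number cannot be read.  Only the form shown in the example
   above is recognized. */
int stab_array_shape(const char *type_info,
                     long *elements_out, long *element_type_out)
{
    int dims = 0;
    long elements = 1;
    const char *p = type_info;

    for (;;) {
        long index_type, lower, upper;
        int used = 0;
        /* %n is reached only if the trailing ';' matched too. */
        if (sscanf(p, "ar%ld;%ld;%ld;%n",
                   &index_type, &lower, &upper, &used) != 3 || used == 0)
            break;
        elements *= upper - lower + 1;
        dims++;
        p += used;
    }

    if (sscanf(p, "%ld", element_type_out) != 1)
        return -1;
    *elements_out = elements;
    return dims;
}
```

Applied to the foo10_5 type string, the sketch reports two dimensions, 50 elements, and element type 6.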
The `b' symbol descriptor is like `V' in that it denotes a
statically allocated symbol whose scope is local to a function; see
section Static Variables. The value of the symbol, instead of being the address
of the variable itself, is the address of a pointer to that variable.
So in the above example, the value of the foo stab is the address
of a pointer to a real, the value of the foo10 stab is the
address of a pointer to a 10-element array of reals, and the value of
the foo10_5 stab is the address of a pointer to a 5-element array
of 10-element arrays of reals.
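The extra level of indirection can be shown with a short C sketch. It assumes, purely for illustration, that a Fortran real corresponds to a C float; the function name is hypothetical.

```c
/* stab_value is the value of a `b' stab: the address of a pointer to
   the variable, not the address of the variable itself.  Reading the
   variable therefore takes two dereferences.  Assumes a Fortran real
   is represented as a C float. */
float read_based_real(const void *stab_value)
{
    float *const *pointer_cell = (float *const *)stab_value;
    return **pointer_cell;
}
```

For a `V' symbol the same read would need only one dereference, since the stab value is the address of the variable itself.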
Formal parameters to a function are represented by a stab (or sometimes two; see below) for each parameter. The stabs are in the order in which the debugger should print the parameters (i.e., the order in which the parameters are declared in the source file). The exact form of the stab depends on how the parameter is being passed.
Parameters passed on the stack use the symbol descriptor `p' and
the N_PSYM symbol type (or C_PSYM for XCOFF). The value
of the symbol is an offset used to locate the parameter on the stack;
its exact meaning is machine-dependent, but on most machines it is an
offset from the frame pointer.
As a simple example, the code:
main (argc, argv)
     int argc;
     char **argv;
produces the stabs:
.stabs "main:F1",36,0,0,_main                 # 36 is N_FUN
.stabs "argc:p1",160,0,0,68                   # 160 is N_PSYM
.stabs "argv:p20=*21=*2",160,0,0,72
The type definition of argv is interesting because it contains
several type definitions. Type 21 is pointer to type 2 (char) and
argv (type 20) is pointer to type 21.
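The chain of definitions in the argv type can be followed mechanically. The C sketch below handles only type numbers joined by `=*' and rejects anything else; the function name is invented.

```c
#include <stdlib.h>

/* Parse type information such as "20=*21=*2": each "N=*" defines type
   N as a pointer to whatever follows.  Stores the number of pointer
   levels in *depth_out and returns the final (base) type number, or
   -1 if the text is not of this simple form. */
long stab_pointer_chain(const char *type_info, int *depth_out)
{
    int depth = 0;
    const char *p = type_info;

    for (;;) {
        char *end;
        long number = strtol(p, &end, 10);
        if (end == p)
            return -1;              /* no type number here */
        if (end[0] == '=' && end[1] == '*') {
            depth++;                /* this type points to the next one */
            p = end + 2;
            continue;
        }
        if (end[0] != '\0')
            return -1;              /* some other kind of type info */
        *depth_out = depth;
        return number;
    }
}
```

For "20=*21=*2" the sketch reports base type 2 behind two pointer levels, matching char **argv.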
The following symbol descriptors are also said to go with N_PSYM.
The value of the symbol is said to be an offset from the argument
pointer (I'm not sure whether this is true or not).
pP
     (<<??>>)
pF
     Fortran function parameter
X
     (function result variable)
If the parameter is passed in a register, then traditionally there are two symbols for each argument:
.stabs "arg:p1" . . . ; N_PSYM
.stabs "arg:r1" . . . ; N_RSYM
Debuggers use the second one to find the value, and the first one to know that it is an argument.
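A sketch of how a debugger might combine the two symbols follows. The `param_stab` record and `locate_argument` are invented for illustration and assume the stabs have already been parsed into name, descriptor, and value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, already-parsed stab for one symbol. */
struct param_stab {
    const char *name;
    char descriptor;    /* 'p' or 'r' */
    long value;         /* stack offset for 'p', register number for 'r' */
};

/* Returns 'r' if the argument lives in a register, 'p' if it lives on
   the stack, or '\0' if name has no `p' stab and so is not an
   argument.  On success *value_out is the register number or offset. */
char locate_argument(const struct param_stab *stabs, size_t count,
                     const char *name, long *value_out)
{
    int is_argument = 0, in_register = 0;
    long stack_offset = 0, reg = 0;

    for (size_t i = 0; i < count; i++) {
        if (strcmp(stabs[i].name, name) != 0)
            continue;
        if (stabs[i].descriptor == 'p') {
            is_argument = 1;            /* first symbol: it is an argument */
            stack_offset = stabs[i].value;
        } else if (stabs[i].descriptor == 'r') {
            in_register = 1;            /* second symbol: where it lives */
            reg = stabs[i].value;
        }
    }

    if (!is_argument)
        return '\0';
    *value_out = in_register ? reg : stack_offset;
    return in_register ? 'r' : 'p';
}
```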
Because that approach is kind of ugly, some compilers use symbol
descriptor `P' or `R' to indicate an argument which is in a
register. Symbol type C_RPSYM is used in XCOFF and N_RSYM
is used otherwise. The symbol's value is the register number. `P'
and `R' mean the same thing; the difference is that `P' is a
GNU invention and `R' is an IBM (XCOFF) invention. As of version
4.9, GDB should handle either one.
There is at least one case where GCC uses a `p' and `r' pair rather than `P'; this is where the argument is passed in the argument list and then loaded into a register.
According to the AIX documentation, symbol descriptor `D' is for a parameter passed in a floating point register. This seems unnecessary--why not just use `R' with a register number which indicates that it's a floating point register? I haven't verified whether the system actually does what the documentation indicates.
On the sparc and hppa, for a `P' symbol whose type is a structure
or union, the register contains the address of the structure. On the
sparc, this is also true of a `p' and `r' pair (using Sun
cc) or a `p' symbol. However, if a (small) structure is
really in a register, `r' is used. And, to top it all off, on the
hppa it might be a structure which was passed on the stack and loaded
into a register and for which there is a `p' and `r' pair! I
believe that symbol descriptor `i' is supposed to deal with this
case (it is said to mean "value parameter by reference, indirect
access"; I don't know the source for this information), but I don't know
details or what compilers or debuggers use it, if any (not GDB or GCC).
It is not clear to me whether this case needs to be dealt with
differently than parameters passed by reference (see section Passing Parameters by Reference).
There is a case similar to an argument in a register, which is an argument that is actually stored as a local variable. Sometimes this happens when the argument was passed in a register and then the compiler stores it as a local variable. If possible, the compiler should claim that it's in a register, but this isn't always done.
If a parameter is passed as one type and converted to a smaller type by
the prologue (for example, the parameter is declared as a float,
but the calling conventions specify that it is passed as a
double), then GCC2 (sometimes) uses a pair of symbols. The first
symbol uses symbol descriptor `p' and the type which is passed.
The second symbol has the type and location which the parameter actually
has after the prologue. For example, suppose the following C code
appears with no prototypes involved:
void
subr (f)
     float f;
{
if f is passed as a double at stack offset 8, and the prologue
converts it to a float in register number 0, then the stabs look like:
.stabs "f:p13",160,0,3,8   # 160 is N_PSYM, here 13 is double
.stabs "f:r12",64,0,3,0    # 64 is N_RSYM, here 12 is float
In both stabs 3 is the line number where f is declared
(see section Line Numbers).
GCC, at least on the 960, has another solution to the same problem. It
uses a single `p' symbol descriptor for an argument which is stored
as a local variable but uses N_LSYM instead of N_PSYM. In
this case, the value of the symbol is an offset relative to the local
variables for that function, not relative to the arguments; on some
machines those are the same thing, but not on all.
On the VAX or on other machines in which the calling convention includes the number of words of arguments actually passed, the debugger (GDB at least) uses the parameter symbols to keep track of whether it needs to print nameless arguments in addition to the formal parameters which it has printed because each one has a stab. For example, in
extern int fprintf (FILE *stream, char *format, ...);
...
fprintf (stdout, "%d\n", x);
there are stabs for stream and format. On most machines,
the debugger can only print those two arguments (because it has no way
of knowing that additional arguments were passed), but on the VAX or
other machines with a calling convention which indicates the number of
words of arguments, the debugger can print all three arguments. To do
so, the parameter symbol (symbol descriptor `p') (not necessarily
`r' or symbol descriptor omitted symbols) needs to contain the
actual type as passed (for example, double not float if it
is passed as a double and converted to a float).
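The bookkeeping this implies can be sketched as follows. The function and its parameters are hypothetical; sizes are counted in words, and each formal parameter's size must be that of the type as passed, which is why the `p' symbol has to carry that type.

```c
#include <stddef.h>

/* words_passed: the argument word count supplied by the calling
   convention.  formal_words: the size in words of each formal
   parameter as passed (taken from the `p' symbols).  Returns how many
   words of nameless arguments remain after the formal parameters. */
size_t nameless_argument_words(size_t words_passed,
                               const size_t *formal_words,
                               size_t formal_count)
{
    size_t named = 0;
    for (size_t i = 0; i < formal_count; i++)
        named += formal_words[i];
    return words_passed > named ? words_passed - named : 0;
}
```

If each argument in the fprintf call is assumed to occupy one word, three words are passed, two are covered by stream and format, and one nameless word remains. Using the declared size of a float formal instead of the passed double would throw this count off.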
If the parameter is passed by reference (e.g., Pascal VAR
parameters), then the symbol descriptor is `v' if it is in the
argument list, or `a' if it is in a register. Other than the fact
that these contain the address of the parameter rather than the
parameter itself, they are identical to `p' and `R',
respectively. I believe `a' is an AIX invention; `v' is
supported by all stabs-using systems as far as I know.
Conformant arrays are a feature of Modula-2, and perhaps other languages, in which the size of an array parameter is not known to the called function until run-time. Such parameters have two stabs: an `x' for the array itself, and a `C', which represents the size of the array. The value of the `x' stab is the offset in the argument list where the address of the array is stored (is this right? it is a guess); the value of the `C' stab is the offset in the argument list where the size of the array (in elements? in bytes?) is stored.