Variables

Different types of stabs describe the various ways that variables can be allocated: on the stack, globally, in registers, in common blocks, statically, or as arguments to a function.

Automatic Variables Allocated on the Stack

If a variable's scope is local to a function and its lifetime is only as long as that function executes (C calls such variables automatic), it can be allocated in a register (see section Register Variables) or on the stack.

Each variable allocated on the stack has a stab with the symbol descriptor omitted. Since type information should begin with a digit, `-', or `(', only those characters are precluded from being used for symbol descriptors. However, the Acorn RISC machine (ARM) is said to get this wrong: it puts out a mere type definition here, without the preceding `type-number='. This is a bad idea; there is no guarantee that type descriptors are distinct from symbol descriptors. Stabs for stack variables use the N_LSYM stab type, or C_LSYM for XCOFF.

The value of the stab is the offset of the variable within the local variables. On most machines this is an offset from the frame pointer and is negative. The location of the stab specifies which block it is defined in; see section Block Structure.

For example, the following C code:

int
main ()
{
  int x;
}

produces the following stabs:

.stabs "main:F1",36,0,0,_main   # 36 is N_FUN
.stabs "x:1",128,0,0,-12        # 128 is N_LSYM
.stabn 192,0,0,LBB2             # 192 is N_LBRAC
.stabn 224,0,0,LBE2             # 224 is N_RBRAC

See section Procedures for more information on the N_FUN stab, and section Block Structure for more information on the N_LBRAC and N_RBRAC stabs.
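
The rule that an omitted symbol descriptor marks a stack variable can be sketched in C roughly as follows. This is only an illustration: the function name stab_symbol_descriptor is hypothetical and not taken from any debugger, and the sketch assumes the string field has the simple form `name:rest' with no colon inside the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the symbol descriptor of a stab string
   such as "x:1" or "g_foo:G2".  Returns '\0' when the descriptor is
   omitted (type information starts right after the colon), and -1
   when there is no colon or nothing follows it.  Assumes the name
   itself contains no ':'.  */
int
stab_symbol_descriptor (const char *string)
{
  const char *colon;
  char c;

  assert (string != NULL);
  colon = strchr (string, ':');
  if (colon == NULL || colon[1] == '\0')
    return -1;

  c = colon[1];
  /* Type information begins with a digit, '-', or '('.  */
  if ((c >= '0' && c <= '9') || c == '-' || c == '(')
    return '\0';
  return c;
}
```

For the stabs shown above, "x:1" gives '\0' (descriptor omitted, so a stack variable) and "main:F1" gives 'F'.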

Global Variables

A variable whose scope is not specific to just one source file is represented by the `G' symbol descriptor. These stabs use the N_GSYM stab type (C_GSYM for XCOFF). The type information for the stab (see section The String Field) gives the type of the variable.

For example, the following source code:

char g_foo = 'c';

yields the following assembly code:

.stabs "g_foo:G2",32,0,0,0     # 32 is N_GSYM
     .global _g_foo
     .data
_g_foo:
     .byte 99

The address of the variable represented by the N_GSYM is not contained in the N_GSYM stab. The debugger gets this information from the external symbol for the global variable. In the example above, the .global _g_foo and _g_foo: lines tell the assembler to produce an external symbol.

Some compilers, like GCC, output N_GSYM stabs only once, where the variable is defined. Other compilers, like SunOS4 /bin/cc, output an N_GSYM stab for each compilation unit which references the variable.
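
Because the N_GSYM stab carries no address, a consumer has to pair it with the external symbol of the same name. The following C sketch shows one way that lookup might go. The struct ext_symbol table and the function gsym_address are hypothetical, and the leading underscore on external names is assumed from the example above; it does not hold for every object format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the object file's external
   (non-debugging) symbol table.  */
struct ext_symbol
{
  const char *name;
  unsigned long address;
};

/* Find the address for the variable named in an N_GSYM stab.
   VAR_NAME is the part of the stab string before the colon.  This
   sketch assumes external names carry a leading underscore.
   Returns 1 and stores the address on success, 0 if not found.  */
int
gsym_address (const char *var_name, const struct ext_symbol *table,
              size_t count, unsigned long *address)
{
  size_t i;

  assert (var_name != NULL && address != NULL);
  for (i = 0; i < count; i++)
    {
      const char *ext = table[i].name;

      if (ext[0] == '_' && strcmp (ext + 1, var_name) == 0)
        {
          *address = table[i].address;
          return 1;
        }
    }
  return 0;
}
```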

Register Variables

Register variables have their own stab type, N_RSYM (C_RSYM for XCOFF), and their own symbol descriptor, `r'. The stab's value is the number of the register where the variable data will be stored.

AIX defines a separate symbol descriptor `d' for floating point registers. This seems unnecessary; why not just give floating point registers different register numbers? I have not verified whether the compiler actually uses `d'.

If the register is explicitly allocated to a global variable, but not initialized, as in:

register int g_bar asm ("%g5");

then the stab may be emitted at the end of the object file, with the other bss symbols.

Common Blocks

A common block is a statically allocated section of memory which can be referred to by several source files. It may contain several variables. I believe Fortran is the only language with this feature.

An N_BCOMM stab begins a common block and an N_ECOMM stab ends it. The only field that is significant in these two stabs is the string, which names a normal (non-debugging) symbol that gives the address of the common block. According to IBM documentation, only the N_BCOMM has the name of the common block (even though their compiler actually puts it in both places).

The stabs for the members of the common block are between the N_BCOMM and the N_ECOMM; the value of each stab is the offset within the common block of that variable. IBM uses the C_ECOML stab type, and there is a corresponding N_ECOML stab type, but Sun's Fortran compiler uses N_GSYM instead. The variables within a common block use the `V' symbol descriptor (I believe this is true of all Fortran variables). Other stabs (at least type declarations using C_DECL) can also be between the N_BCOMM and the N_ECOMM.

Static Variables

Initialized static variables are represented by the `S' and `V' symbol descriptors. `S' means file scope static, and `V' means procedure scope static. One exception: in XCOFF, IBM's xlc compiler always uses `V', and whether it is file scope or not is distinguished by whether the stab is located within a function.

In a.out files, N_STSYM means the data section, N_FUN means the text section, and N_LCSYM means the bss section. For those systems with a read-only data section separate from the text section (Solaris), N_ROSYM means the read-only data section.

For example, the source lines:

static const int var_const = 5;
static int var_init = 2;
static int var_noinit;

yield the following stabs:

.stabs "var_const:S1",36,0,0,_var_const      # 36 is N_FUN
...
.stabs "var_init:S1",38,0,0,_var_init        # 38 is N_STSYM
...
.stabs "var_noinit:S1",40,0,0,_var_noinit    # 40 is N_LCSYM
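
The a.out mapping from stab type to section can be summarized in a small C sketch. The names aout_static_section and enum static_section are hypothetical. The values 36, 38, and 40 come from the example above; the value 44 for N_ROSYM is an assumption based on GNU's stab.def and should be checked against your own headers.

```c
/* Stab type values as given in the example above.  */
#define N_FUN   36
#define N_STSYM 38
#define N_LCSYM 40
/* Assumed value (0x2c in GNU's stab.def); verify on your system.  */
#define N_ROSYM 44

enum static_section
{
  SEC_UNKNOWN,
  SEC_TEXT,
  SEC_DATA,
  SEC_BSS,
  SEC_RODATA
};

/* Map the stab type of an `S' or `V' stab in an a.out file to the
   section in which the static variable lives.  */
enum static_section
aout_static_section (int stab_type)
{
  switch (stab_type)
    {
    case N_FUN:   return SEC_TEXT;    /* e.g. var_const */
    case N_STSYM: return SEC_DATA;    /* e.g. var_init */
    case N_LCSYM: return SEC_BSS;     /* e.g. var_noinit */
    case N_ROSYM: return SEC_RODATA;  /* separate read-only data */
    default:      return SEC_UNKNOWN;
    }
}
```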

In XCOFF files, the stab type need not indicate the section; C_STSYM can be used for all statics. Also, each static variable is enclosed in a static block. A C_BSTAT (emitted with a `.bs' assembler directive) symbol begins the static block; its value is the symbol number of the csect symbol whose value is the address of the static block, its section is the section of the variables in that static block, and its name is `.bs'. A C_ESTAT (emitted with a `.es' assembler directive) symbol ends the static block; its name is `.es' and its value and section are ignored.

In ECOFF files, the storage class is used to specify the section, so the stab type need not indicate the section.

In ELF files, for the SunPRO compiler version 2.0.1, symbol descriptor `S' means that the address is absolute (the linker relocates it) and symbol descriptor `V' means that the address is relative to the start of the relevant section for that compilation unit. SunPRO has plans to have the linker stop relocating stabs; I suspect that their debugger gets the address from the corresponding ELF (not stab) symbol. I'm not sure how to find which symbol of that name is the right one. The clean way to do all this would be to have the value of a symbol descriptor `S' symbol be an offset relative to the start of the file, just like everything else, but that introduces obvious compatibility problems. For more information on linker stab relocation, see section Having the Linker Relocate Stabs in ELF.

Fortran Based Variables

Fortran (at least, the Sun and SGI dialects of FORTRAN-77) has a feature which allows allocating arrays with malloc, but which avoids blurring the line between arrays and pointers the way that C does. In stabs such a variable uses the `b' symbol descriptor.

For example, the Fortran declarations

real foo, foo10(10), foo10_5(10,5)
pointer (foop, foo)
pointer (foo10p, foo10)
pointer (foo105p, foo10_5)

produce the stabs

foo:b6
foo10:bar3;1;10;6
foo10_5:bar3;1;5;ar3;1;10;6

In this example, real is type 6 and type 3 is an integral type which is the type of the subscripts of the array (probably integer).

The `b' symbol descriptor is like `V' in that it denotes a statically allocated symbol whose scope is local to a function; see section Static Variables. The value of the symbol, instead of being the address of the variable itself, is the address of a pointer to that variable. So in the above example, the value of the foo stab is the address of a pointer to a real, the value of the foo10 stab is the address of a pointer to a 10-element array of reals, and the value of the foo10_5 stab is the address of a pointer to a 5-element array of 10-element arrays of reals.
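
The extra level of indirection for a `b' symbol can be illustrated with a short C sketch. The function based_variable_address is hypothetical, and it uses ordinary host pointers to stand in for target memory; a real debugger would read the pointer from the process being debugged instead.

```c
#include <assert.h>
#include <stddef.h>

/* STAB_VALUE is the value of a `b' stab: the address of a pointer to
   the variable, not the address of the variable itself.  One load
   through it yields the address of the variable.  */
void *
based_variable_address (void *stab_value)
{
  assert (stab_value != NULL);
  return *(void **) stab_value;
}
```

If foop holds the address of a real, then passing the address of foop to this function returns the address of that real.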

Parameters

Formal parameters to a function are represented by a stab (or sometimes two; see below) for each parameter. The stabs are in the order in which the debugger should print the parameters (i.e., the order in which the parameters are declared in the source file). The exact form of the stab depends on how the parameter is being passed.

Parameters passed on the stack use the symbol descriptor `p' and the N_PSYM symbol type (or C_PSYM for XCOFF). The value of the symbol is an offset used to locate the parameter on the stack; its exact meaning is machine-dependent, but on most machines it is an offset from the frame pointer.

As a simple example, the code:

main (argc, argv)
     int argc;
     char **argv;

produces the stabs:

.stabs "main:F1",36,0,0,_main                 # 36 is N_FUN
.stabs "argc:p1",160,0,0,68                   # 160 is N_PSYM
.stabs "argv:p20=*21=*2",160,0,0,72

The type definition of argv is interesting because it contains several type definitions. Type 21 is pointer to type 2 (char) and argv (type 20) is pointer to type 21.
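
To make the nesting concrete, here is a minimal C sketch that walks type information of exactly this shape and counts the pointer levels. The function pointer_chain_depth is hypothetical and understands only plain integer type numbers and the `*' type descriptor; real type information has many more forms that it rejects.

```c
#include <assert.h>
#include <stdlib.h>

/* Walk type information of the form "N=*M=*...K" and count how many
   pointer levels separate the outermost type from the base type K.
   Stores the base type number in *BASE.  Returns the pointer depth,
   or -1 if the text does not fit this restricted form.  */
int
pointer_chain_depth (const char *type_info, long *base)
{
  const char *p = type_info;
  int depth = 0;

  assert (type_info != NULL && base != NULL);
  for (;;)
    {
      char *end;
      long number = strtol (p, &end, 10);

      if (end == p)
        return -1;              /* no type number here */
      p = end;
      if (*p == '\0')
        {
          *base = number;       /* reference to an existing type */
          return depth;
        }
      if (p[0] != '=' || p[1] != '*')
        return -1;              /* not a "pointer to" definition */
      p += 2;
      depth++;
    }
}
```

For the argv stab above, the text after `p' is "20=*21=*2", which gives a depth of 2 with base type 2 (char).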

The following symbol descriptors are also said to go with N_PSYM. The value of the symbol is said to be an offset from the argument pointer (I'm not sure whether this is true or not).

pP (<<??>>)
pF Fortran function parameter
X  (function result variable)

Passing Parameters in Registers

If the parameter is passed in a register, then traditionally there are two symbols for each argument:

.stabs "arg:p1" . . .       ; N_PSYM
.stabs "arg:r1" . . .       ; N_RSYM

Debuggers use the second one to find the value, and the first one to know that it is an argument.

Because that approach is kind of ugly, some compilers use symbol descriptor `P' or `R' to indicate an argument which is in a register. Symbol type C_RPSYM is used in XCOFF and N_RSYM is used otherwise. The symbol's value is the register number. `P' and `R' mean the same thing; the difference is that `P' is a GNU invention and `R' is an IBM (XCOFF) invention. As of version 4.9, GDB should handle either one.

There is at least one case where GCC uses a `p' and `r' pair rather than `P'; this is where the argument is passed in the argument list and then loaded into a register.
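
A consumer therefore has to cope with both conventions. The C sketch below shows one possible way to decide where a parameter lives. The struct stab layout and the function param_location are hypothetical; the sketch assumes the array holds the stabs of a single function, and it does not handle the XCOFF `R' descriptor with C_RPSYM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_RSYM 64               /* values as given in this document */
#define N_PSYM 160

/* Hypothetical in-memory form of one stab.  */
struct stab
{
  const char *string;
  int type;
  long value;
};

enum param_where { PARAM_NONE, PARAM_STACK, PARAM_REGISTER };

/* Number of characters before the ':' (whole length if none).  */
static size_t
name_length (const char *s)
{
  const char *colon = strchr (s, ':');
  return colon != NULL ? (size_t) (colon - s) : strlen (s);
}

/* Character right after the ':', or '\0' if there is no ':'.  */
static char
descriptor (const char *s)
{
  const char *colon = strchr (s, ':');
  return colon != NULL ? colon[1] : '\0';
}

/* Classify stabs[i].  On PARAM_STACK, *VALUE is the stack offset; on
   PARAM_REGISTER it is the register number.  PARAM_NONE means the
   stab does not introduce a parameter (for example, the `r' half of
   a pair), and *VALUE is left untouched.  */
enum param_where
param_location (const struct stab *stabs, size_t count, size_t i,
                long *value)
{
  const struct stab *s;
  size_t len, j;

  assert (stabs != NULL && value != NULL && i < count);
  s = &stabs[i];

  /* Single-stab form: `P' with N_RSYM.  */
  if (s->type == N_RSYM && descriptor (s->string) == 'P')
    {
      *value = s->value;
      return PARAM_REGISTER;
    }
  if (s->type != N_PSYM || descriptor (s->string) != 'p')
    return PARAM_NONE;

  /* Traditional form: look for a later `r' stab with the same name.  */
  len = name_length (s->string);
  for (j = i + 1; j < count; j++)
    {
      const struct stab *r = &stabs[j];

      if (r->type == N_RSYM && descriptor (r->string) == 'r'
          && name_length (r->string) == len
          && strncmp (r->string, s->string, len) == 0)
        {
          *value = r->value;
          return PARAM_REGISTER;
        }
    }
  *value = s->value;
  return PARAM_STACK;
}
```

The `p' stab still marks the symbol as an argument; the matching `r' stab, when present, only overrides where its value is found.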

According to the AIX documentation, symbol descriptor `D' is for a parameter passed in a floating point register. This seems unnecessary--why not just use `R' with a register number which indicates that it's a floating point register? I haven't verified whether the system actually does what the documentation indicates.

On the sparc and hppa, for a `P' symbol whose type is a structure or union, the register contains the address of the structure. On the sparc, this is also true of a `p' and `r' pair (using Sun cc) or a `p' symbol. However, if a (small) structure is really in a register, `r' is used. And, to top it all off, on the hppa it might be a structure which was passed on the stack and loaded into a register and for which there is a `p' and `r' pair! I believe that symbol descriptor `i' is supposed to deal with this case (it is said to mean "value parameter by reference, indirect access"; I don't know the source for this information), but I don't know details or what compilers or debuggers use it, if any (not GDB or GCC). It is not clear to me whether this case needs to be dealt with differently than parameters passed by reference (see section Passing Parameters by Reference).

Storing Parameters as Local Variables

There is a case similar to an argument in a register, which is an argument that is actually stored as a local variable. Sometimes this happens when the argument was passed in a register and then the compiler stores it as a local variable. If possible, the compiler should claim that it's in a register, but this isn't always done.

If a parameter is passed as one type and converted to a smaller type by the prologue (for example, the parameter is declared as a float, but the calling conventions specify that it is passed as a double), then GCC2 (sometimes) uses a pair of symbols. The first symbol uses symbol descriptor `p' and the type which is passed. The second symbol has the type and location which the parameter actually has after the prologue. For example, suppose the following C code appears with no prototypes involved:

void
subr (f)
     float f;
{

if f is passed as a double at stack offset 8, and the prologue converts it to a float in register number 0, then the stabs look like:

.stabs "f:p13",160,0,3,8   # 160 is N_PSYM, here 13 is double
.stabs "f:r12",64,0,3,0    # 64 is N_RSYM, here 12 is float

In both stabs 3 is the line number where f is declared (see section Line Numbers).

GCC, at least on the 960, has another solution to the same problem. It uses a single `p' symbol descriptor for an argument which is stored as a local variable but uses N_LSYM instead of N_PSYM. In this case, the value of the symbol is an offset relative to the local variables for that function, not relative to the arguments; on some machines those are the same thing, but not on all.

On the VAX or on other machines in which the calling convention includes the number of words of arguments actually passed, the debugger (GDB at least) uses the parameter symbols to keep track of whether it needs to print nameless arguments in addition to the formal parameters which it has printed because each one has a stab. For example, in

extern int fprintf (FILE *stream, char *format, ...);
...
fprintf (stdout, "%d\n", x);

there are stabs for stream and format. On most machines, the debugger can only print those two arguments (because it has no way of knowing that additional arguments were passed), but on the VAX or other machines with a calling convention which indicates the number of words of arguments, the debugger can print all three arguments. To do so, the parameter symbol (the one with symbol descriptor `p', not necessarily an `r' symbol or a symbol with the descriptor omitted) needs to contain the actual type as passed (for example, double not float if it is passed as a double and converted to a float).

Passing Parameters by Reference

If the parameter is passed by reference (e.g., Pascal VAR parameters), then the symbol descriptor is `v' if it is in the argument list, or `a' if it is in a register. Other than the fact that these contain the address of the parameter rather than the parameter itself, they are identical to `p' and `R', respectively. I believe `a' is an AIX invention; `v' is supported by all stabs-using systems as far as I know.

Passing Conformant Array Parameters

Conformant arrays are a feature of Modula-2, and perhaps other languages, in which the size of an array parameter is not known to the called function until run-time. Such parameters have two stabs: an `x' for the array itself, and a `C', which represents the size of the array. The value of the `x' stab is the offset in the argument list where the address of the array is stored (is this right? it is a guess); the value of the `C' stab is the offset in the argument list where the size of the array (in elements? in bytes?) is stored.