09.09.2019
  1. Assembly Language Symbol Table

Non-Confidential, ARM DUI0379H
ARM® Compiler v5.06 for µVision® armasm User Guide, Version 5

7.1 Symbol naming rules

You must follow some rules when naming symbols in assembly language source code.

  • Symbol names must be unique within their scope.
  • You can use uppercase letters, lowercase letters, numeric characters, or the underscore character in symbol names. Symbol names are case-sensitive, and all characters in the symbol name are significant.
  • Do not use numeric characters for the first character of symbol names, except in numeric local labels.
  • Symbols must not use the same name as built-in variable names or predefined symbol names.
  • If you use the same name as an instruction mnemonic or directive, use double bars to delimit the symbol name. For example: ||ASSERT||
    The bars are not part of the symbol.
  • You must not use the symbols $a, $t, $t.x, or $d as program labels. These are mapping symbols that mark the beginning of ARM, Thumb, ThumbEE, and data within the object file.
  • Symbols beginning with the characters $v are mapping symbols that relate to VFP and might be output when building for a target with VFP. ARM recommends you avoid using symbols beginning with $v in your source code.
If you have to use a wider range of characters in symbols, for example, when working with compilers, use single bars to delimit the symbol name. For example: |.text|
The bars are not part of the symbol. You cannot use bars, semicolons, or newlines within the bars.
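The naming rules above can be approximated by a small checker. This is only a sketch in Python, not armasm's actual parser: the function name check_symbol and the tiny RESERVED_WORDS sample are hypothetical, the mapping-symbol set includes $a (the ARM-state mapping symbol from the ARM ELF ABI) alongside the ones named above, and the numeric local label exception and scope uniqueness are not modelled.

```python
import re

# Plain symbols: letters, digits, underscore; first character is not a digit.
PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Single-bar symbols: any characters except bars, semicolons, or newlines.
BARRED = re.compile(r"\|([^|;\n]+)\|")

# Mapping symbols that must not be used as program labels.
MAPPING_SYMBOLS = {"$a", "$t", "$t.x", "$d"}

# Hypothetical, deliberately tiny sample of mnemonics and directives.
RESERVED_WORDS = {"MOV", "ADD", "ASSERT", "AREA", "END"}


def check_symbol(text):
    """Return (name, None) if text is acceptable, else (None, reason)."""
    # Double bars let a symbol share its name with a mnemonic or directive.
    if text.startswith("||") and text.endswith("||") and len(text) > 4:
        name = text[2:-2]
        if PLAIN.fullmatch(name):
            return name, None
        return None, "double bars must enclose a plain name"

    barred = BARRED.fullmatch(text)
    if barred:
        name = barred.group(1)  # the bars are not part of the symbol
    elif PLAIN.fullmatch(text):
        name = text
        # Conservative assumption: any casing of a reserved word is a clash.
        if name.upper() in RESERVED_WORDS:
            return None, "clashes with a mnemonic or directive; use double bars"
    else:
        return None, "invalid characters"

    if name in MAPPING_SYMBOLS:
        return None, "mapping symbol"
    if name.startswith("$v"):
        # The text only recommends avoiding these; this sketch rejects them.
        return None, "begins with $v (VFP mapping symbols)"
    return name, None


for candidate in ["loop_1", "1abc", "ASSERT", "||ASSERT||", "|.text|", "|$t|"]:
    print(candidate, check_symbol(candidate))
```

Returning the name without its bars mirrors the rule that the bars are delimiters only, so |.text| and ||ASSERT|| yield .text and ASSERT.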
Copyright © 2007, 2008, 2011, 2012, 2014-2016 ARM. All rights reserved.


Copyright © 2005-2019 Arm Limited (or its affiliates). All rights reserved.

As part of our university project we have to write a mini-assembler. It's a two-pass assembler. I was wondering why assembly language allows symbols to be used, in opcodes for example, before they're actually declared (later in the code). I assume there must be a reason for this, because in most programming languages that I know you first declare a variable and then use it. Moreover, if this were the case in assembly language, two-pass assemblers wouldn't need to exist, I guess.

Yos

2 Answers

You often jump to some forward location, or pass some constant (e.g. the named address of some literal string) that is defined further on to an assembler instruction. In both cases, use before definition is needed.

As an example, take some non-trivial C code foo.c, and ask your GCC compiler to emit the assembler code for it using gcc -O -fverbose-asm -S foo.c, then look into the generated foo.s. It is better to do that on some existing C source file of several hundred lines at least (e.g. from some existing free software project).

BTW, it is mostly a matter of convention. One could imagine some assembler syntax requiring some .FORWARD symb directive to explicitly declare some symb to be used forward. But historically assembler programs were not written that way (and most assemblers don't even have any syntax to declare but not define a symbol). And requiring a .FORWARD directive for each use-before-define symbol is a burden: you'd need a lot of such directives in practice. So it is better to avoid them.

Notice that some recent (and higher level) programming languages do not require to forward declare symbols, in particular the Go language permits you to call a function by name without it having been forward-declared.

(There are also other reasons why an assembler works in two passes: assemblers produce object files with relocation information.)

BTW, every machine code program has some kind of loops, so the control flow graph is cyclic. If it wasn't, your program would exit very quickly (in a fraction of a second). Loops (or their equivalent, e.g. recursion) are fundamental to computers. Most conditionals (i.e. an if instruction in C) translate to forward conditional jumps in assembler. Read also about the halting problem.

Notice that symbols (or labels, they are the same) are untyped in assembler code.

Basile Starynkevitch

Assemblers will add a symbol to their symbol table as soon as the symbol is encountered when it is either referenced or defined. During the first pass, the assembler makes assumptions about the symbol type (size, located in data or code section, ...), but doesn't need to know the actual address.

When the symbol is defined, its value is stored in the symbol table entry for use during the second pass.
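A minimal two-pass sketch in Python may make this concrete. Everything in it is a toy assumption rather than a real assembler: the function name assemble, the 'name:' label syntax, the ';' comments, and the rule that every instruction occupies exactly one slot. As a simplification, it adds symbols to the table only when they are defined, not when they are first referenced.

```python
def assemble(source):
    """Two-pass assembly of a toy language.

    'name:' defines a label; any other line is 'OP' or 'OP label'
    and occupies exactly one slot (a toy assumption).
    """
    # Strip ';' comments and drop blank lines.
    lines = [ln.split(";")[0].strip() for ln in source.splitlines()]
    lines = [ln for ln in lines if ln]

    # Pass 1: record the address of every label; nothing is encoded yet.
    symbols = {}
    address = 0
    for ln in lines:
        if ln.endswith(":"):
            name = ln[:-1]
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = address
        else:
            address += 1

    # Pass 2: encode instructions; every label is now in the table,
    # so forward and backward references resolve the same way.
    program = []
    for ln in lines:
        if ln.endswith(":"):
            continue
        op, *args = ln.split()
        if args:
            if args[0] not in symbols:
                raise ValueError(f"undefined symbol: {args[0]}")
            program.append((op, symbols[args[0]]))
        else:
            program.append((op, None))
    return symbols, program


SOURCE = """
start:
    JMP end      ; forward reference: 'end' is defined later
loop:
    NOP
    JMP loop     ; backward reference
end:
    HALT
"""

symbols, program = assemble(SOURCE)
print(symbols)   # {'start': 0, 'loop': 1, 'end': 3}
print(program)   # [('JMP', 3), ('NOP', None), ('JMP', 1), ('HALT', None)]
```

The forward reference in JMP end is harmless here because pass 1 never needs the value of end, only the size of each instruction.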

Some multi-pass assemblers make additional passes to reduce code size where a forward reference affects instruction size, for example by picking a shorter branch encoding once the distance to the target is known.
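One common way to do this is to start with the shortest encoding for every jump and repeat the layout pass, growing any jump that does not reach its target, until nothing changes. The Python sketch below assumes a hypothetical machine where a short jump takes 2 bytes with a signed 8-bit displacement and a long jump takes 5 bytes; the function name relax and the item format are invented for illustration.

```python
SHORT, LONG = 2, 5  # hypothetical jump encoding sizes in bytes


def relax(items):
    """Choose jump sizes by repeated layout passes.

    items is a list of ("label", name), ("jmp", target), or ("data", nbytes).
    Returns (symbols, sizes, total_bytes, passes).
    """
    # Optimistic start: assume every jump fits the short form.
    sizes = {i: SHORT for i, item in enumerate(items) if item[0] == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out the program using the current size guesses.
        address = 0
        symbols = {}
        starts = {}
        for i, (kind, arg) in enumerate(items):
            if kind == "label":
                symbols[arg] = address
            elif kind == "jmp":
                starts[i] = address
                address += sizes[i]
            else:
                address += arg

        # Grow any short jump whose displacement does not fit in 8 bits.
        changed = False
        for i, start in starts.items():
            if sizes[i] == SHORT:
                displacement = symbols[items[i][1]] - (start + SHORT)
                if not -128 <= displacement <= 127:
                    sizes[i] = LONG
                    changed = True
        if not changed:
            return symbols, sizes, address, passes


items = [
    ("jmp", "far"),    # target is about 200 bytes ahead: needs the long form
    ("jmp", "near"),   # target is right after this jump: short form is enough
    ("label", "near"),
    ("data", 200),
    ("label", "far"),
    ("data", 1),
]
symbols, sizes, total, passes = relax(items)
print(symbols, sizes, total, passes)
```

Because jumps only ever grow, the loop is guaranteed to stop; the price is that a jump grown early might have fitted a short form in some other layout, so the result is not always the smallest possible.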

rcgldr
