Assembler User Guide
Non-Confidential | ARM DUI0379H
7.1 Symbol naming rules
You must follow some rules when naming symbols in assembly language source code.
- Symbol names must be unique within their scope.
- You can use uppercase letters, lowercase letters, numeric characters, or the underscore character in symbol names. Symbol names are case-sensitive, and all characters in the symbol name are significant.
- Do not use numeric characters for the first character of symbol names, except in numeric local labels.
- Symbols must not use the same name as built-in variable names or predefined symbol names.
- If you use the same name as an instruction mnemonic or directive, use double bars to delimit the symbol name. For example: ||ASSERT||. The bars are not part of the symbol.
- You must not use the symbols $t, $t.x, or $d as program labels. These are mapping symbols that mark the beginning of Thumb, ThumbEE, and data within the object file.
- Symbols beginning with the characters $v are mapping symbols that relate to VFP and might be output when building for a target with VFP. ARM recommends you avoid using symbols beginning with $v in your source code.
Copyright © 2007, 2008, 2011, 2012, 2014-2016 ARM. All rights reserved. |
As part of our university project we have to write a mini-assembler. It's a two-pass assembler. I was wondering why assembly language allows symbols to be used (in instructions, for example) before they are actually declared later in the code. I assume there must be a reason for this, because in most programming languages that I know you first declare a variable and then use it. Moreover, if this were the case in assembly language, two-pass assemblers wouldn't need to exist, I guess.
2 Answers
You often jump to some forward location, or pass some constant (e.g. the named address of some literal string) that is defined further on to an assembler instruction. In both cases, use before definition is needed.
As an example, take some non-trivial C code foo.c, ask your GCC compiler to emit the assembler code for it using gcc -O -fverbose-asm -S foo.c, then look into the generated foo.s. It would be better to do that on some existing C source file of several hundred lines at least (e.g. from some existing free software project).
BTW, it is mostly a matter of convention. One could imagine some assembler syntax requiring a .FORWARD symb directive to explicitly declare that some symb is going to be used before its definition. But historically assembler programs were not written that way (and most assemblers don't even have any syntax to declare but not define a symbol). And requiring a .FORWARD directive for each symbol used before its definition is a burden: you would need a lot of such directives in practice. So better avoid them.
Notice that some recent (and higher-level) programming languages do not require forward declarations of symbols; in particular, the Go language permits you to call a function by name without it having been forward-declared.
(There are also other reasons why an assembler is a two-pass thing: assemblers produce object files with relocation information.)
BTW, every machine code program has some kind of loop, so the control flow graph is cyclic. If it weren't, your program would exit very quickly (in a fraction of a second). Loops (or their equivalent, e.g. recursion) are fundamental to computers. Most conditionals (e.g. an if statement in C) translate to forward conditional jumps in assembler. Read also about the halting problem.
Notice that symbols (or labels, they are the same) are untyped in assembler code.
Basile Starynkevitch
Assemblers will add a symbol to their symbol table as soon as the symbol is encountered, whether it is being referenced or defined. During the first pass, the assembler makes assumptions about the symbol type (size, located in data or code section, ...), but doesn't need to know the actual address.
When the symbol is defined, its value is stored in the symbol table entry for use during the second pass.
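To make the two passes concrete, here is a minimal sketch in Python. The toy syntax is an assumption made up for this example (one instruction per address, a line ending in a colon defines a label, a semicolon starts a comment), and the function name assemble is hypothetical. It also simplifies the description above by entering a symbol into the table only when it is defined.

```python
def assemble(lines):
    """Two-pass assembly of a toy language (hypothetical syntax).

    Each instruction occupies one address; a line ending in ':' defines a label.
    Returns a list of (address, mnemonic, resolved_operand) tuples.
    """
    symbols = {}        # symbol table: name -> address
    instructions = []   # (address, mnemonic, operand text), kept for pass 2

    # Pass 1: assign addresses and record every label definition.
    address = 0
    for raw in lines:
        line = raw.split(";", 1)[0].strip()   # drop comment and whitespace
        if not line:
            continue
        if line.endswith(":"):
            name = line[:-1]
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = address
            continue
        mnemonic, _, operand = line.partition(" ")
        instructions.append((address, mnemonic, operand.strip()))
        address += 1

    # Pass 2: every label is known now, so forward references resolve.
    output = []
    for addr, mnemonic, operand in instructions:
        if operand == "":
            value = None
        elif operand.isdigit():
            value = int(operand)
        elif operand in symbols:
            value = symbols[operand]
        else:
            raise ValueError(f"undefined symbol: {operand}")
        output.append((addr, mnemonic, value))
    return output


program = [
    "start:",
    "  LOAD 7",
    "  JZ done      ; forward reference: 'done' is defined later",
    "  DEC",
    "  JMP start    ; backward reference",
    "done:",
    "  HALT",
]

if __name__ == "__main__":
    for entry in assemble(program):
        print(entry)
```

When pass 1 reaches JZ done it cannot know the address of done yet, which is why resolution is deferred to pass 2. A single-pass design would instead have to remember each unresolved use and patch it once the label appears.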
There are also multi-pass assemblers, which make additional passes to reduce code size when a forward reference affects the size of an instruction (for example, when a branch has a short and a long encoding).
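One way such a multi-pass scheme can work is sketched below. It is an assumed strategy for illustration, not a description of any particular assembler: every branch starts in a short form, and the layout is repeated until no short branch has a target out of reach. The sizes (1 unit short, 2 units long), the reach SHORT_RANGE and the function name relax are all hypothetical.

```python
SHORT_RANGE = 4   # hypothetical: a short branch reaches at most 4 units away


def relax(items):
    """Pick branch sizes by repeated layout passes.

    items is a list of ("label", name), ("op", size) or ("branch", target).
    Returns (sizes, passes): the final size of each item and the number of
    layout passes that were needed.
    """
    sizes = []
    for kind, arg in items:
        if kind == "label":
            sizes.append(0)       # labels take no space
        elif kind == "op":
            sizes.append(arg)     # fixed-size instruction
        else:
            sizes.append(1)       # every branch starts in its short form

    passes = 0
    while True:
        passes += 1
        # Lay out the program with the current size guesses.
        addresses, labels, addr = [], {}, 0
        for (kind, arg), size in zip(items, sizes):
            addresses.append(addr)
            if kind == "label":
                labels[arg] = addr
            addr += size
        # Widen any short branch whose target is out of reach.
        changed = False
        for i, (kind, arg) in enumerate(items):
            if kind == "branch" and sizes[i] == 1:
                if abs(labels[arg] - addresses[i]) > SHORT_RANGE:
                    sizes[i] = 2
                    changed = True
        if not changed:
            return sizes, passes


example = [
    ("branch", "end"),
    ("branch", "far"),
    ("op", 2),
    ("label", "end"),
    ("op", 2),
    ("label", "far"),
]

if __name__ == "__main__":
    print(relax(example))
```

In this example the first pass widens only the branch to far; that pushes end one unit further away, so a second pass has to widen the branch to end as well, and a third pass confirms that nothing changes any more. Branches only ever grow in this sketch, so the loop is guaranteed to terminate.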
rcgldr