Add the ARM64 instruction opcode constants that arm64-gen.c needs:
- ARM64_FMOV_*: Floating-point move variants
- ARM64_STR_Q_PRE/LDR_Q_POST: Quadword store with pre-index and load with post-index
- ARM64_LDPSW: Load pair of words with sign-extend
- ARM64_LDR_S_SIMD: SIMD load (distinct from scalar LDR_S)
- ARM64_MOV_V_D: Move vector to double
- ARM64_FCMP: Floating-point compare
- ARM64_SDIV: Signed divide
- ARM64_MUL: Multiply
These constants will be used in the next commit to refactor arm64-gen.c.
Replace hardcoded magic numbers with symbolic constants for ARM64
instruction opcodes, matching the style used in x86_64 backend.
Changes:
- arm64-tok.h: Add 93 new opcode constants and helper macros
  - Instruction opcodes: ARM64_ADD_IMM, ARM64_LDR_X, ARM64_B, etc.
  - Helper macros: ARM64_RD(), ARM64_RN(), ARM64_IMM12(), etc.
  - Field encodings: ARM64_SF(), ARM64_S(), ARM64_SH(), etc.
- arm64-asm.c: Refactor all instruction generation functions
  - gen_movz/gen_movn/gen_movk: Use ARM64_MOVZ/MOVN/MOVK
  - gen_add_imm/gen_sub_imm: Use ARM64_ADD_IMM/SUB_IMM
  - gen_dp_reg: Use symbolic opcodes
  - gen_ldst_imm/gen_ldst_pair: Use ARM64_LDR_*/STR_*
  - gen_b/gen_bl/gen_br/gen_blr/gen_ret: Use ARM64_B/BL/BR/BLR/RET
  - gen_cbz/gen_cbnz: Use ARM64_CBZ/CBNZ
  - gen_shift: Use ARM64_LSL_REG/LSR_REG/ASR_REG/ROR_REG
  - gen_barrier: Use ARM64_ISB/DSB/DMB
  - gen_mrs/gen_msr: Use symbolic constants
  - Inline asm save/restore: Use ARM64_STP_X/LDP_X
- arm64-gen.c: Begin systematic refactoring (first batch)
  - arm64_sub_sp: Use ARM64_SUB_IMM with helper macros
Benefits:
- Readability: Self-documenting code (ARM64_LDR_X vs 0xF9400000)
- Maintainability: Easier to spot encoding errors
- Consistency: Matches x86_64 backend style
- Safety: Helper macros prevent bit-shift mistakes
All tests pass with no functional changes.
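As an illustration of the style described above, here is a minimal sketch of how an opcode constant and field macros could combine into one instruction word. The macro bodies are assumptions written for this example (the real definitions live in arm64-tok.h and may differ), and encode_sub_sp is a hypothetical stand-in for arm64_sub_sp.

```c
#include <stdint.h>

/* Hypothetical definitions for illustration; the real ones are in arm64-tok.h. */
#define ARM64_SUB_IMM   0x51000000u                        /* SUB (immediate), 32-bit form */
#define ARM64_SF(is64)  ((uint32_t)((is64) & 1) << 31)     /* operand size bit */
#define ARM64_IMM12(v)  (((uint32_t)(v) & 0xfffu) << 10)   /* unsigned 12-bit immediate */
#define ARM64_RN(r)     (((uint32_t)(r) & 0x1fu) << 5)     /* first source register */
#define ARM64_RD(r)     ((uint32_t)(r) & 0x1fu)            /* destination register */

/* Encode "sub sp, sp, #imm" for 0 <= imm <= 4095.
   Register number 31 means SP in this instruction form. */
static uint32_t encode_sub_sp(uint32_t imm)
{
    return ARM64_SUB_IMM | ARM64_SF(1) | ARM64_IMM12(imm)
         | ARM64_RN(31) | ARM64_RD(31);
}
```

With these assumed definitions, encode_sub_sp(16) yields 0xD10043FF, the encoding of "sub sp, sp, #16".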
- arm64_check_offset: use (uint64_t)0x1ff for consistency with scaled_mask
- arm64_sub_sp: use 0xffful suffix for uint64_t diff parameter
These changes ensure consistent type handling and avoid implicit
integer promotions when working with 64-bit values.
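One case where the operand width matters is an inverted mask. The sketch below is not taken from arm64-gen.c; the function names are hypothetical, and it assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Narrow mask: ~0xfffu is computed as a 32-bit unsigned int (0xFFFFF000)
   and then zero-extended, so bits 32-63 of v are cleared as well. */
static uint64_t clear_low12_narrow(uint64_t v)
{
    return v & ~0xfffu;
}

/* Wide mask: the constant is widened to 64 bits before the inversion,
   so only the low 12 bits of v are cleared. */
static uint64_t clear_low12_wide(uint64_t v)
{
    return v & ~(uint64_t)0xfff;
}
```

Writing the constant with an explicit 64-bit type keeps the two forms from being confused when a mask is later inverted or shifted.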
- Remove unnecessary braces from single-statement if blocks
- Remove trailing whitespace throughout file
- Remove duplicate comment
Style now matches existing ARM64 backend and TCC conventions:
- Allman style for function definitions
- No braces for single-statement control structures
- Consistent 4-space indentation
Implement full GCC-style extended inline assembly for ARM64 backend:
- Add constraint parsing (constraint_priority, skip_constraint_modifiers)
- Implement register allocation (asm_compute_constraints)
- Add code generation for prolog/epilog and load/store (asm_gen_code)
- Support output/input/read-write operands with r, w, f, x, m, g constraints
- Support immediate constraints (i, I, J, K, L, n)
- Handle clobber lists (registers, memory, cc)
- Support constraint references, early clobber, named operands
- Fix '#' character handling in tccpp.c for ARM64 asm mode
Tests: Add comprehensive test suite with 18 test cases covering all features.
All existing TCC tests continue to pass.
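A small example of the kind of statement this support targets, using named operands and the "r" constraint. The function name asm_add is hypothetical, and the plain C fallback is only there so the sketch also builds on non-ARM64 hosts.

```c
#include <stdint.h>

/* Add two 64-bit integers with a single ARM64 ADD instruction.
   [sum] is an output operand ("=r"), [lhs] and [rhs] are inputs ("r"). */
static int64_t asm_add(int64_t a, int64_t b)
{
#if defined(__aarch64__)
    int64_t sum;
    __asm__ ("add %[sum], %[lhs], %[rhs]"
             : [sum] "=r" (sum)
             : [lhs] "r" (a), [rhs] "r" (b));
    return sum;
#else
    /* Fallback for hosts that cannot assemble ARM64 instructions. */
    return a + b;
#endif
}
```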
parse_addr_operand() silently accepted invalid register names like
[xyz] without error. Now explicitly validates the register and calls
tcc_error() if arm64_parse_regvar() returns -1 or >= 32.
Before: invalid registers caused silent wrong code or confusing errors
After: clear error message 'invalid register in address operand'
LSL/LSR/ASR immediate shifts are UBFM/SBFM aliases with specific
immr/imms field encodings:
- LSL #shift: UBFM, immr = (width - shift) % width, imms = width - 1 - shift
- LSR #shift: UBFM, immr = shift, imms = width - 1
- ASR #shift: SBFM, immr = shift, imms = width - 1
Fixes:
- immr field now always masked with 0x3F (6 bits), not width-1
- imms field is width-1 for LSR and ASR, and width-1-shift for LSL
- ROR #shift uses EXTR format (Rn = Rm = src, imms = shift, Rd = dest), not UBFM format
Based on ARM ARM documentation for UBFM/SBFM/EXTR instructions.
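The alias rules can be sketched as a small encoder. This follows the UBFM/SBFM alias definitions in the Arm Architecture Reference Manual and is not the code from arm64-asm.c; the enum and function names are hypothetical.

```c
#include <stdint.h>

enum shift_op { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR };

/* Encode "op rd, rn, #shift" as its UBFM/SBFM alias.
   is64 selects X (1) or W (0) registers; shift must be below the width. */
static uint32_t encode_shift_imm(enum shift_op op, int is64,
                                 unsigned rd, unsigned rn, unsigned shift)
{
    unsigned width = is64 ? 64 : 32;
    /* 32-bit base opcodes: UBFM = 0x53000000, SBFM = 0x13000000 */
    uint32_t insn = (op == SHIFT_ASR) ? 0x13000000u : 0x53000000u;
    unsigned immr, imms;

    /* The 64-bit form sets both sf (bit 31) and N (bit 22). */
    if (is64)
        insn |= 0x80000000u | 0x00400000u;
    if (op == SHIFT_LSL) {
        immr = (width - shift) % width;
        imms = width - 1 - shift;
    } else {
        immr = shift;
        imms = width - 1;
    }
    return insn | (immr << 16) | (imms << 10) | ((rn & 31) << 5) | (rd & 31);
}
```

For example, "lsl x0, x1, #4" becomes UBFM x0, x1, #60, #59 and encodes as 0xD37CEC20, while "lsr x0, x1, #4" encodes as 0xD344FC20.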
__bound_ptr_add was implemented manually while adjacent __bound_ptr_indir*
functions used the REDIR_PTR_INDIR macro. This consolidates the pattern
for consistency.
The static assertions in tccrun.c only validate CONTEXT when building
native Windows ARM64 (_WIN64 && __aarch64__). Cross-compilation builds
use the fallback definition without validation, so layout errors would
be silent.
Add matching C_ASSERT() checks after the ARM64_NT_CONTEXT definition
to catch struct layout mismatches during cross-compilation.
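A sketch of what such compile-time checks could look like. The struct below is a hypothetical stand-in that uses fixed-width types so it builds without Windows headers, the offsets are computed from the ARM64_NT_CONTEXT field order, and SKETCH_C_ASSERT is an assumed substitute for C_ASSERT built on C11 _Static_assert.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed substitute for the Windows C_ASSERT macro. */
#define SKETCH_C_ASSERT(e) _Static_assert(e, #e)

/* 128-bit NEON register slot (stand-in for ARM64_NT_NEON128). */
typedef struct { uint64_t Low; int64_t High; } sketch_neon128;

/* Stand-in for the fallback CONTEXT definition. */
typedef struct {
    uint32_t ContextFlags;
    uint32_t Cpsr;
    uint64_t X[31];          /* X0-X28, Fp, Lr */
    uint64_t Sp;
    uint64_t Pc;
    sketch_neon128 V[32];
    uint32_t Fpcr;
    uint32_t Fpsr;
    uint32_t Bcr[8];         /* breakpoint control, 32-bit */
    uint64_t Bvr[8];         /* breakpoint value, 64-bit */
    uint32_t Wcr[2];         /* watchpoint control, 32-bit */
    uint64_t Wvr[2];         /* watchpoint value, 64-bit */
} sketch_arm64_context;

/* A wrong field width or a missing field shifts these offsets
   and fails the build instead of corrupting registers at run time. */
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, X) == 0x008);
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, Sp) == 0x100);
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, V) == 0x110);
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, Fpcr) == 0x310);
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, Bvr) == 0x338);
SKETCH_C_ASSERT(offsetof(sketch_arm64_context, Wvr) == 0x380);
SKETCH_C_ASSERT(sizeof(sketch_arm64_context) == 0x390);
```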
OPT_VREG, OPT_IM12, OPT_SHIFT, and OPT_REGSET were defined in the enum
and as OP_* bit masks but never used by any parsing function or
instruction handler in arm64-asm.c.
These appear to be artifacts copied from other assembler implementations
(arm-asm.c uses OP_VREG32/OP_VREG64/OP_REGSET32, riscv64-asm.c uses
OP_IM12S) but were never integrated into the ARM64 operand parsing logic.
Removing these unused definitions:
- Eliminates confusion for developers
- Reduces code clutter
- Makes the actual operand types (OPT_REG, OPT_IM, OPT_ADDR, OPT_COND)
clearer
asm_branch() had two identical 15-case switch blocks (30 lines total)
that duplicated condition code mapping. This also duplicated the logic
in the existing parse_condition() helper.
Added get_branch_condition() helper that:
1. Maps branch tokens (TOK_ASM_beq) to condition tokens (TOK_ASM_eq)
2. Calls the existing parse_condition() helper
3. Returns the condition code (0-13) or -1 for non-conditional branches
This replaces the two duplicated switch blocks (30 lines) with a single
29-line helper function and keeps all condition mapping logic in one place.
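The idea of mapping a branch mnemonic onto an existing condition lookup can be sketched as follows. This version works on strings rather than TCC tokens, and both function names are hypothetical; it is not the helper from arm64-asm.c.

```c
#include <stddef.h>
#include <string.h>

/* Condition mnemonics indexed by their encoding (eq = 0 ... le = 13). */
static const char *const cond_names[] = {
    "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
    "hi", "ls", "ge", "lt", "gt", "le"
};

/* Stand-in for the shared condition parser: returns the code or -1. */
static int lookup_condition(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(cond_names) / sizeof(cond_names[0]); i++)
        if (strcmp(cond_names[i], name) == 0)
            return (int)i;
    return -1;
}

/* Map a branch mnemonic such as "beq" to its condition code by dropping
   the leading 'b' and reusing lookup_condition(). Plain "b", "bl", and
   other non-conditional mnemonics yield -1. */
static int branch_condition(const char *mnemonic)
{
    if (mnemonic[0] != 'b' || mnemonic[1] == '\0')
        return -1;
    return lookup_condition(mnemonic + 1);
}
```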
Multiple instruction handlers were extracting op->reg without checking
that the operand was actually a register. When parse_operand() failed
to recognize a token, it set op->reg = -1, which when masked with 0x1F
became 31 (xzr/sp), silently encoding wrong instructions.
Now each handler validates operand types before extraction:
- asm_shift: validates op1 and op2 are registers
- asm_data_proc: validates op1, op2, and op3 are registers
- asm_ldst: validates op1 is register, op2 is address
- asm_ldst_pair: validates op1 and op2 are registers, op3 is address
This implements fail-fast behavior to catch typos and invalid operands
immediately rather than producing silently incorrect code.
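The failure mode is easy to reproduce in isolation. In this sketch both function names are hypothetical; the first mirrors the unchecked extraction and the second shows the kind of validation the handlers now perform before encoding.

```c
#include <stdint.h>

/* Unchecked extraction: a failed parse (-1) masks to 31,
   which the hardware reads as xzr or sp. */
static uint32_t reg_field_unchecked(int reg)
{
    return (uint32_t)reg & 0x1f;
}

/* Checked extraction: returns 0 and stores the field on success,
   or -1 when reg is not a valid register number (0-31). */
static int reg_field_checked(int reg, uint32_t *field)
{
    if (reg < 0 || reg >= 32)
        return -1;
    *field = (uint32_t)reg;
    return 0;
}
```

A caller would report an error when reg_field_checked() returns -1 instead of emitting an instruction.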
Previously, parse_operand() would silently accept any unrecognized token
and pass it to asm_expr() as an immediate, causing typos like:
    add x0, x1, xyz   ; 'xyz' is not a valid register
to be silently assembled as a symbol reference instead of raising an error.
Now, if a token is not a register, condition code, or valid immediate
prefix (#, :, @, $), an error is emitted for identifier tokens.
This implements fail-fast behavior for invalid operands, making it easier
to catch typos and mistakes in assembly code.
The fallback CONTEXT definition at lines 2073-2124 was unreachable dead code.
The guard '#if defined(__aarch64__) && !defined(_ARM64_CONTEXT_DECLARED)'
could never be true because:
1. Lines 50-51: __aarch64__ automatically defines _ARM64_
2. Line 1426: #if defined(_ARM64_) || defined(__aarch64__) is always taken
3. Line 1473: _ARM64_CONTEXT_DECLARED is always defined inside that block
4. Line 2073: The fallback guard is therefore always false
This 52-line duplicate was a maintenance hazard that could silently diverge
from the official ARM64_NT_CONTEXT definition. Remove it entirely.
The fallback CONTEXT struct incorrectly defined Bvr (Breakpoint Value
Registers) and Wvr (Watchpoint Value Registers) as DWORD (32-bit) instead
of DWORD64 (64-bit).
On ARM64:
- BCR/WCR (Control Registers) are 32-bit ✓
- BVR/WVR (Value Registers) are 64-bit ✓
This mismatch caused struct size and layout errors, potentially corrupting
debug register state when used with Windows debugging APIs.
These macros were defined twice (lines ~273 and ~317) with identical
values and #ifndef guards. The duplicates appear to be a copy-paste
oversight from adding ARM64 support.
Remove the redundant second set of defines. The first set (lines 273-284)
already provides the fallback definitions needed when Windows headers
are unavailable.
The fallback CONTEXT struct for ARM64 had multiple structural issues:
- ContextFlags was DWORD64 (8 bytes) instead of ULONG (4 bytes)
- Missing Cpsr field entirely
- Missing DECLSPEC_ALIGN(16) attribute
- X registers declared as a plain array X[29] instead of a union of a named-register struct (X0-X28, Fp, Lr) and X[31]
These mismatches caused incorrect struct size and field offsets, leading to
register corruption when used with Windows APIs like GetThreadContext or
RtlRestoreContext.
The fallback struct now matches the official ARM64_NT_CONTEXT layout exactly,
ensuring binary compatibility with Windows ARM64 system calls.
The asm_data_proc function was OR-ing register widths together, which
allowed invalid ARM64 instructions like 'add x0, w1, w2' (mixed widths).
ARM64 requires all registers in data processing instructions to have
the same width (all X or all W).
Fix by validating that all three operand registers have matching widths
and emitting an error if they don't match.
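The difference between the two checks can be shown with plain flags, where 1 means an X register and 0 a W register. Both function names are hypothetical and the sketch does not reproduce asm_data_proc itself.

```c
/* Old behaviour: OR-ing the width flags selects the 64-bit form as soon
   as any operand is an X register, so mixed widths go unnoticed. */
static int width_by_or(int rd_is64, int rn_is64, int rm_is64)
{
    return rd_is64 | rn_is64 | rm_is64;
}

/* New behaviour: all three operands must agree on the width.
   Returns 1 when they match, 0 when the instruction must be rejected. */
static int widths_match(int rd_is64, int rn_is64, int rm_is64)
{
    return rd_is64 == rn_is64 && rn_is64 == rm_is64;
}
```

For 'add x0, w1, w2' the flags are (1, 0, 0): width_by_or() reports a 64-bit instruction, while widths_match() returns 0 so the assembler can emit an error.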
pe_get_process_msvcrt_handle() used LoadLibraryA which increments the
module reference count, but never called FreeLibrary to release it.
Use GetModuleHandleA instead, which returns a handle to the already-
loaded msvcrt.dll module without incrementing the reference count.
This is the correct API for accessing system DLLs that are already
mapped into the process address space.
The fallback CONTEXT struct for ARM64 (used when __aarch64__ is defined
but _ARM64_CONTEXT_DECLARED is not set) incorrectly defined V[32] as
DWORD64 (64-bit) instead of ARM64_NT_NEON128 (128-bit).
This caused register corruption when RtlRestoreContext restores NEON/VFP
registers, as the V array occupied 256 bytes instead of the correct
512 bytes, shifting every field that follows it.
Fixes potential corruption on toolchains that define __aarch64__ but not
_ARM64_ (e.g., clang on macOS or certain cross-compilation scenarios).