A Machine Description System Status and Development Directions
Benoît Dupont de Dinechin
[email protected]
STMicroelectronics April 12, 2005
Why a Machine Description System
Software Development Tools Program Representations
• Binary absolute and relocatable form
• Assembler source and disassembler output
• Assembler internal representation (instructions & operands)
• Compiler internal representation (operations & parameters)

Software Development Tools ISA Representations
• Binary encoding (operand, instructions, bundles)
• Storage elements (memory, registers, constant generators)
• Instruction behavior (instruction set simulation)
• Instruction properties (compiler code optimizations)
• Instruction semantics (compiler code selection)
Purpose of a Machine Description System
A Machine Description System (MDS) is a structured data repository and a collection of programs used to configure the target-specific parts of the software development tools.

Existing Machine Description Systems
Green Hills Software: Canonical Machine Description to configure the compiler, assembler, linker, disassembler, debugger.
CMG Bristol Architects: CHESS tables to drive the documentation and the instruction set simulator of the ST200 processors.
Coware / LisaTek: Lisa language to automatically produce simulators (instruction and cycle accurate), assembler, linker, debugger, compiler instruction scheduler.
U. of Virginia: Computer Systems Description Languages (CSDL). Different languages and tools are available: SLED for encoding / decoding; RTL and λ-RTL for instruction semantics; Alpha for calling conventions; Block language for stack frame composition.
Open64 TargInfo: Calls to a special C++ library to generate the Open64 target description tables and enumerations. This covers assembly syntax, instruction encoding, bundling, scheduling, register allocation, ABI targeting.
AST MachGen: C code & data to configure the LxISS, the GNU AS, and to generate constant evaluators for the LxBE.
STS ARC & CSD: Machine description systems used by the LAO ST100 and the LAO ST200, respectively. The ARC covered assembly syntax, instruction encoding, bundling, scheduling, register allocation, instruction properties and semantics.
Lessons from these Machine Description Systems
• The CSDL system from U. of Virginia is the most advanced, but is now a disparate collection of languages and tools.
• The single language-based approaches (GHS, Lisa) mix the different description levels and constrain the description into some hierarchy. A single language cannot cover everything.
• The authoring of machine descriptions in C++ (Open64 TargInfo, AST MachGen) tends to be tool-centric, so the descriptions omit whatever the tools do not use.
• Not able to describe aliases between storage locations (Open64).
• No instruction selection / peephole re-selection generators.
• No understanding of instruction bundling / bundle templates.
• Not able to distinguish between core-specific and target-specific information.
Design of the AST/STS MDS
AST/STS MDS Design Objectives
In the case of the AST/STS Machine Description System, we focus on:
• Assembling and disassembling, linking with the GNU tools.
• Automatic generation of the Open64 TargInfo source code.
• Automatic generation of the LAO2 target description tables.
• Replacement of some MachGen capabilities for the AST ISS.
• High-speed instruction and bundle encoding for the AST X-JIT.
• Eventual ability to describe stack frames and ABI conventions.
• Description of all the ST200 cores, then of ARM cores (V5E).
AST/STS MDS Design Decisions
• Focus on the completeness, non-redundancy, and verifiability of the MDS data repository:
  – use a pseudo-relational data schema (allows embedded lists)
  – encode the data repository as XML and validate it against a DTD
  – use mini-languages in XML element contents and parse them
• Organize MDS processing in three stages:
  – front-end processing that translates and enriches foreign machine descriptions into the MDS data repository;
  – a central part that expands information, merges identical core-level information into target-level information, and validates contents;
  – back-end processing that generates the files needed by each software development tool.
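The repository conventions listed above can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the MDS itself uses plain Perl. The element and attribute names (`Table`, `Instruction`, `ID`, `Cores`, `Behavior`) and the row identifier are hypothetical and are not taken from the MDS DTD; the sketch only shows rows keyed by an identifier, an attribute holding an embedded list, and element content left as mini-language text for a later parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical table; names are illustrative, not the real MDS schema.
XML_TEXT = """
<Table name="Instruction">
  <Instruction ID="Instruction:st235:ANDC" Cores="st231 st235">
    <Behavior>(SEQ (WRITE.3.%1 (READ..result1)))</Behavior>
  </Instruction>
  <Instruction ID="Instruction:st235:OTHER" Cores="st235">
    <Behavior>(SEQ)</Behavior>
  </Instruction>
</Table>
"""

def load_table(xml_text):
    """Return {row ID: row dict}; each child element is one relational row."""
    rows = {}
    for elem in ET.fromstring(xml_text):
        rows[elem.get("ID")] = {
            # Embedded list: a whitespace-separated attribute value.
            "cores": elem.get("Cores").split(),
            # Mini-language text, kept verbatim for a dedicated parser.
            "behavior": elem.findtext("Behavior").strip(),
        }
    return rows

table = load_table(XML_TEXT)
print(table["Instruction:st235:ANDC"]["cores"])
```

Keeping lists inside attribute values is what makes the schema only pseudo-relational: a strictly relational layout would need a separate table for each list.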
AST/STS MDS Design Decisions (Continued)
• Ensure compiler-level information is available:
  – Multiple calling conventions and register roles for each
  – Register files and register classes for register allocation
  – Instruction scheduling information that is core-specific
  – Use the internal Behavior language to extract compiler properties
• Ensure VLIW instruction bundling can be described:
  – Instruction bundling of VLIW processors does not fit in a variable-length instruction framework!
  – Instruction scheduling must produce valid bundles!
  – View VLIW bundle encoding as a two-step process: first encode the variable-length instructions; then recode the encoded instructions and add control bits to make the bundle
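The two-step view of bundle encoding can be sketched as follows, in Python. Everything specific here is an assumption made for illustration: the 32-bit word size, the position of the control bit (`STOP_BIT`), the maximum bundle size (`MAX_SLOTS`), and the field layout in `encode_instruction` are hypothetical and are not taken from the MDS tables.

```python
STOP_BIT = 1 << 31   # assumed position of the bundle-end control bit
MAX_SLOTS = 4        # assumed maximum number of instructions per bundle

def encode_instruction(opcode, dest, src1, src2):
    """Step 1: encode one instruction as a word (made-up field layout)."""
    return (((opcode & 0x7F) << 18) | ((dest & 0x3F) << 12)
            | ((src1 & 0x3F) << 6) | (src2 & 0x3F))

def make_bundle(words):
    """Step 2: recode the encoded instructions and add the control bits."""
    if not 1 <= len(words) <= MAX_SLOTS:
        raise ValueError("invalid bundle size")
    bundle = [w & ~STOP_BIT for w in words]  # clear the control bit everywhere
    bundle[-1] |= STOP_BIT                   # set it on the last word only
    return bundle

words = [encode_instruction(1, 1, 2, 3), encode_instruction(2, 4, 5, 6)]
bundle = make_bundle(words)
```

Separating the steps keeps the per-instruction encoder unaware of bundling, so the same encoder can serve any bundle shape that the scheduler produces.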
Why Use XML for MDS Data Repository
• Text information repository under revision control
• Easily convertible to specification documents
• Stable lexical structure (escapes, macros, inclusion)
• User-defined information structure (DTD)
• Standard validation and processing tools (OpenSP)

Common XML Mistakes to Avoid
• Lack of overall structure of the XML database
• Encoding every bit of document structure in XML
• Use of unproven XML technologies (XML Schema validation)

We only use DTD validation, pseudo-relational tables, and plain Perl.
Example of XML DTD Fragment
Corresponding XML Fragment
Implementation of the AST/STS MDS
June 2004: Initial prototype of the MDS by Benoit.
July – August 2004: Stephen Clarke generates the GNU AS tables.
August 2004: Benoit forks the MDS into MDS2 to configure and build the MDS tables for several cores simultaneously.
September 2004: Stephen Clarke generates most of the Open64 TargInfo files, except scheduling and ABI information, and contributes a decode tree generator.
October – November 2004: Benoit generalizes the MDS core process and implements a Behavior language compiled from Takumi.
February – March 2005: Stephen Clarke finalizes MDS generation for production Open64 and GNU assembler / disassembler.
March – April 2005: Benoit sets up the MDS on CodeX and generates tables for LAO.
AST/STS MDS Organization and Flows
• Available under http://mds.codex.cro.st.com/
    svn co https://svn.mds.codex.cro.st.com/svnroot/mds/trunk/MDS
    mkdir build && cd build
    ../MDS/configure --target=st200 --with-cores="st235" --enable-tools="gnu o64"
    make clean all check diff
• Front-end scripting creates the Machine Description Database (MDD) XML tables from a variety of non-XML sources (CHESS, MachGen, special scripts).
• The MDD XML tables are expanded using target-independent scripts into Machine Description Expanded (MDE) XML tables; then the target-level tables are merged across the different cores.
• Back-end scripting processes the MDS (MDD+MDE) XML tables and produces the files needed by software development tools: GNU AS tables, Open64 TargInfo, LAO machine description, etc.
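The merge of core-level tables into target-level tables can be sketched as follows, in Python. This is an assumption about the principle only, not the MDS implementation: the table layout, the `merge_cores` helper, and the row names (including the made-up `HYPO` row) are hypothetical.

```python
def merge_cores(core_tables):
    """Merge per-core rows into target-level rows.

    core_tables maps a core name to {row name: row content}. Rows with the
    same name and identical content in several cores become one target-level
    row that lists those cores. Row values must be hashable.
    """
    merged = {}
    for core, rows in core_tables.items():
        for name, content in rows.items():
            key = (name, tuple(sorted(content.items())))
            merged.setdefault(key, []).append(core)
    return [
        {"name": name, "cores": cores, **dict(content)}
        for (name, content), cores in merged.items()
    ]

# Hypothetical core-level tables for three cores.
core_tables = {
    "st220": {"ANDC": {"mnemonic": "andc"}},
    "st231": {"ANDC": {"mnemonic": "andc"}},
    "st235": {"ANDC": {"mnemonic": "andc"}, "HYPO": {"mnemonic": "hypo"}},
}
target = merge_cores(core_tables)
```

In this sketch a row shared by every core appears once with the full core list, while a row present in only one core stays separate, so back-ends can tell target-level information from core-specific information.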
[Figure: MDS processing flow. Inputs: CHESS ST220, ST230, ST231, ST235; Machine Description Language (MDL). Front-end scripts: $MDS/FE/CHESS/BIN/*.pl, $MDS/FE/AST/BIN/*.pl. Per-core $MDD_tables: ST220, ST230, ST231, ST235. Central scripts: $MDS/MDD/MDE/BIN/*.pl, $MDS/MDD/MDF/BIN/*.pl. Per-core $MDS_tables: ST220, ST230, ST231, ST235; target-level $MDS_tables ST200. Back-end scripts: $MDS/BE/DOC/BIN/*.pl, $MDS/BE/GNU/BIN/*.pl, $MDS/BE/ECL/BIN/*.pl, $MDS/BE/O64/BIN/*.pl. Outputs: Documentation HTML Files, GNU AS/LD Source Files, LAO ST200 Source Files, Open64 Target Info Files.]
MDS Documentation
• The bulk of the documentation is in the $MDS/DOC/MDD.dtd DTD file, which explains the purpose of each XML element.
• This commented DTD file is turned into LaTeX files, which are then included into the $MDS/DOC/Design.tex file. Run make doc in the build directory to build the documentation.
• Grammars for the MDS mini-languages are included: $MDS/DOC/Takumi.y for the CHESS Takumi language; $MDS/DOC/Behavior.y and $MDS/DOC/Semantics.y for the MDS behavior and semantics languages.

MDS Auxiliary Tools
A parser generator in Perl for Bison grammars is included in $MDS/BIN/yaxcc.pl. See the documentation embedded in that file.
MDS Design Outline
Storage: The machine memory, register files, control registers, constant generators.
BitField: Identifies a contiguous range of bits in a word.
Operand: Variant part of an Instruction encoding, associated with an encoding method: Immediate, Modifier, RegClass, RegMask.
Pattern: Bit pattern as a list of BitField(s) and expected values.
Format: Specification for encoding Instruction(s) with a particular operand list; it also provides the corresponding assembler syntax.
Convention: Calling convention, including the register roles.
Processor: High-level scheduling features of a processor.
Register: Architectural register (MDE).
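The BitField and Pattern notions can be made concrete with a short Python sketch. The class layout, the method names, and the `OPCODE` field position are hypothetical and chosen only to illustrate the definitions above; they do not reflect the MDS XML schema or any real ST200 encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BitField:
    """A contiguous range of bits in a word: `width` bits from bit `lsb`."""
    lsb: int
    width: int

    def extract(self, word):
        return (word >> self.lsb) & ((1 << self.width) - 1)

    def insert(self, word, value):
        mask = ((1 << self.width) - 1) << self.lsb
        return (word & ~mask) | ((value << self.lsb) & mask)

@dataclass(frozen=True)
class Pattern:
    """A bit pattern: a tuple of (BitField, expected value) pairs."""
    fields: tuple

    def matches(self, word):
        return all(bf.extract(word) == value for bf, value in self.fields)

OPCODE = BitField(lsb=24, width=6)        # made-up field position
word = OPCODE.insert(0, 0x15)
is_match = Pattern(((OPCODE, 0x15),)).matches(word)
```

A decoder can then test a word against each Pattern in turn, and an encoder can build a word by inserting each Operand value into its BitField.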
Instruction XML Specification
An Instruction is a member of the ISA not yet specialized with regard to the operand list. Operand lists are defined by the Instruction formats.
ST235 ANDC Instruction
(SEQ (WRITE..result1 (AND (NOT (SX.32 (READ.2.%2))) (SX.32 (READ.2.%3)))) (WRITE.3.%1 (READ..result1)))
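Since the Behavior text is a parenthesized expression, a small reader is enough to turn it into a tree that tools can inspect. The Python sketch below is not the MDS parser (the MDS grammars are Bison files processed by a Perl generator); `parse_sexpr` and `atoms` are hypothetical helpers, and the sketch assumes only that atoms are separated by whitespace and parentheses.

```python
import re

def parse_sexpr(text):
    """Parse a parenthesized expression into nested lists of atom strings."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token
        node = []
        while tokens[pos] != ")":
            node.append(parse())
        pos += 1  # skip the closing parenthesis
        return node

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree

def atoms(tree):
    """Yield every atom of the tree in left-to-right order."""
    if isinstance(tree, str):
        yield tree
    else:
        for child in tree:
            yield from atoms(child)

ANDC = ("(SEQ (WRITE..result1 (AND (NOT (SX.32 (READ.2.%2))) "
        "(SX.32 (READ.2.%3)))) (WRITE.3.%1 (READ..result1)))")
tree = parse_sexpr(ANDC)
operands = sorted({m for a in atoms(tree) for m in re.findall(r"%\d+", a)})
print(operands)  # → ['%1', '%2', '%3']
```

Walking the tree this way is one route to deriving instruction properties, such as which operands an instruction references, directly from its Behavior.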
Instance XML Specification
An instruction Instance results from the database join of Instruction and Format. In the corresponding Behavior, the access to operands is specialized.
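The database join that yields Instances can be sketched in Python as follows. The table layout is an assumption: which table references the other, the field names, and the `join_instances` helper are hypothetical. Only the naming style of the result (`ANDC_DEST_SRC1_SRC2`) mirrors identifiers that appear elsewhere in these slides.

```python
# Hypothetical Instruction and Format tables.
INSTRUCTIONS = [
    {"name": "ANDC", "mnemonic": "andc", "formats": ["DEST_SRC1_SRC2"]},
]
FORMATS = [
    {"name": "DEST_SRC1_SRC2", "operands": ["DEST", "SRC1", "SRC2"]},
]

def join_instances(instructions, formats):
    """Join Instruction rows with the Format rows they reference."""
    by_name = {fmt["name"]: fmt for fmt in formats}
    instances = []
    for insn in instructions:
        for fmt_name in insn["formats"]:
            fmt = by_name[fmt_name]
            instances.append({
                "name": f'{insn["name"]}_{fmt["name"]}',
                "mnemonic": insn["mnemonic"],
                "operands": list(fmt["operands"]),
            })
    return instances

instances = join_instances(INSTRUCTIONS, FORMATS)
```

An Instruction that lists several Formats produces one Instance per Format, which is the point at which its operand list becomes fixed.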
ST235 ANDC Instance Register–Register–Register
ST235 ANDC Instance Register–Register–Register (Continued)
(SEQ (WRITE..result1 (AND (NOT (F2S.32 (LOAD.2.GR (METHOD.%2) (CONST.1)))) (F2S.32 (LOAD.2.GR (METHOD.%3) (CONST.1))))) (STORE.3.GR (METHOD.%1) (CONST.1) (I2F (CONST.32) (READ..result1))))
Operator XML Specification
An Operator is the compiler abstraction of an instruction Instance. Unlike an Instance, an Operator has Parameter(s) instead of Operand(s). There is exactly one Parameter per Read or Write action of the Operator (not counting side effects on State Storage), independently of Operand encoding issues.
ST200 ANDC Operator Register–Register–Register
"Operator:st200:ANDC_DEST_SRC1_SRC2"
"st220 st230 st231 st235"
"Instance:st220:ANDC_DEST_SRC1_SRC2 Instance:st230:ANDC_DEST_SRC1_SRC
"andc"
"Operand:st200:DEST Operand:st200:SRC1 Operand:st200:SRC2"
"%0 %1 = %2, %3"
"Write" "RegClass:st200:General" "%1" "3 3 3 3"
"Read" "RegClass:st200:General" "%2" "2 2 2 2"
"Read" "RegClass:st200:General" "%3" "2 2 2 2"
MDS Behavior Language
• The Behavior language is based on statements and captures common sub-expressions in temporary variables.
• It has control-flow constructs but neither looping nor recursion.
• Computations use infinite-precision or boolean arithmetic and explicitly cast from/to bit-fields when accessing storage.

MDS Semantics Language
• The Semantics language is a parallel list of guarded computations, where all the sub-expressions have been forward-substituted.
• All computations work on bit-field types annotated with the minimum bit precision required for correctness.
• Not yet in place (to be used for compiler properties and instruction selection)
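The combination of infinite-precision arithmetic with explicit casts at storage accesses can be illustrated in Python, whose integers are unbounded. The sketch follows the shape of the ST235 ANDC Behavior shown earlier, under the assumption that `SX.32` denotes sign extension from a 32-bit field; the helper names `sx`, `to_field`, and `andc` are hypothetical.

```python
def sx(width, bits):
    """Cast a `width`-bit field to a signed unbounded integer."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits

def to_field(width, value):
    """Cast an unbounded integer back to a `width`-bit field (truncation)."""
    return value & ((1 << width) - 1)

def andc(src1_bits, src2_bits):
    """(AND (NOT src1) src2), computed on unbounded integers."""
    result1 = ~sx(32, src1_bits) & sx(32, src2_bits)
    return to_field(32, result1)

print(hex(andc(0x0000FFFF, 0xFFFFFFFF)))  # → 0xffff0000
```

Because the intermediate value `result1` is unbounded, no overflow rule has to be stated for the operators; the bit width matters only at the two cast points.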
Future Work
MDS Short-Term Developments
• Merge in the Semantics language and tools by Christophe Guillon
• Produce the LAO ST200 target tables for the ST235 64-bit pairs
• Target the MDS to the ARM architecture: the first target is V5E (no Thumb)
• Factor more information into Format, including the Behavior of Operand access
• Develop a PseudoC language to describe Instruction behavior and compile it into MDS Behavior
• Design a Machine Description Language (MDL) to avoid authoring XML tables by scripts
MDS Mid-Term Developments
• Produce effective binary code compressors based on the bundle and instruction encoding trees
• Produce instruction behavior functions and decoding automaton for use by simulators
• Exploit operator semantics in compiler range analysis and SSA optimizations
• Exploit operator semantics in compiler instruction selection and peephole optimizations
• Drive register allocation from the MDS contents, including management of aliased register files
Conclusions
• The MDS has already accumulated over one man-year of effort
• The MDS is used in production for the ST200 STS software tools
• The MDS is key to the AST/Computing SoC.NET project
• The MDS will be used by AST/Computing Advanced Architecture Research (G. Desoli)
• No other architecture description system has core merging capabilities, which are required for embedded processor families
• No other architecture description system can produce such accurate compiler information
• The MDS could handle other architectures and improve its end-user interface, but this requires more resources