Instruction Summary for "P" ISA Extension Proposal
Document Number
xxxxxxxxx
Date Issued
2017-11-23
Copyright © 2017 Andes Technology Corporation. All rights reserved.
Copyright Notice Copyright © 2017 Andes Technology Corporation. All rights reserved. AndesCore™, AndeShape™, AndeSight™, AndESLive™, AndeSoft™, AndeStar™, AICE™, AICE-MCU™, AICE-MINI™, Andes Custom Extension™, and COPILOT™ are trademarks owned by Andes Technology Corporation. All other trademarks used herein are the property of their respective owners. This document contains confidential information of Andes Technology Corporation. Use of this copyright notice is precautionary and does not imply publication or disclosure. Neither the whole nor part of the information contained herein may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Andes Technology Corporation. The product described herein is subject to continuous development and improvement; information herein is given by Andes in good faith but without warranties. This document is intended only to assist the reader in the use of the product. Andes Technology Corporation shall not be liable for any loss or damage arising from the use of any information in this document, or any incorrect use of the product.
Contact Information Should you have any problems with the information contained herein, please contact Andes Technology Corporation by email
[email protected] or online website https://es.andestech.com/eservice/ for support giving:
the document title
the document number
the page number(s) to which your comments apply
a concise explanation of the problem
General suggestions for improvements are welcome.
Revision History

| Rev. | Revision Date | Revised Chapter-Section | Revised Content |
| --- | --- | --- | --- |
| 0.1 | 2017/11/20 | All | Initial release |
The information contained herein is the exclusive property of Andes Technology Co. and shall not be distributed, reproduced, or disclosed in whole or in part without prior written permission of Andes Technology Corporation. P_DSP_ISA_EXT_Summary.docx
Table of Contents
Copyright Notice
Contact Information
Revision History
Table of Contents
List of Tables
1. Introduction
1.1. SIMD Instructions
1.2. Non-SIMD Instructions
1.3. Zero-overhead Loop Mechanism
2. DSP ISA Extension Instruction Summary
2.1. Shorthand Definitions
2.2. SIMD Data Processing Instructions
2.2.1. 16-bit Addition & Subtraction Instructions
2.2.2. 8-bit Addition & Subtraction Instructions
2.2.3. 16-bit Shift Instructions
2.2.4. 16-bit Compare Instructions
2.2.5. 8-bit Compare Instructions
2.2.6. 16-bit Misc Instructions
2.2.7. 8-bit Misc Instructions
2.2.8. 8-bit Unpacking Instructions
2.3. Non-SIMD Data Processing Instructions
2.3.1. 32-bit Addition/Subtraction Instructions
2.3.2. 32-bit Shift Instructions
2.3.3. 16-bit Packing Instructions
2.3.4. Most Significant Word “32x32” Multiply & Add Instructions
2.3.5. Most Significant Word “32x16” Multiply & Add Instructions
2.3.6. Signed 16-bit Multiply with 32-bit Add/Subtract Instructions
2.3.7. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
2.3.8. Miscellaneous Instructions
2.3.9. Q31 saturation Instructions
2.3.10. Q15 saturation instructions
2.3.11. Overflow status manipulation instructions
2.4. 64-bit Instructions
2.4.1. 64-bit Addition & Subtraction Instructions
2.4.2. 32-bit Multiply with 64-bit Add/Subtract Instructions
2.4.3. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
2.5. Zero-Overhead Loop (ZOL) Mechanism Instructions
3. User-mode CSR Registers
3.1. Loop Begin Register
3.2. Loop End Register
3.3. Loop Count Register
List of Tables
Table 1. SIMD 16-bit Add/Subtract Instructions
Table 2. SIMD 8-bit Add/Subtract Instructions
Table 3. SIMD 16-bit Shift Instructions
Table 4. SIMD 16-bit Compare Instructions
Table 5. SIMD 8-bit Compare Instructions
Table 6. SIMD 16-bit Miscellaneous Instructions
Table 7. SIMD 8-bit Miscellaneous Instructions
Table 8. 8-bit Unpacking Instructions
Table 9. 32-bit Add/Sub Instructions
Table 10. 32-bit Shift Instructions
Table 11. 16-bit Packing Instructions
Table 12. Signed MSW 32x32 Multiply and Add Instructions
Table 13. Signed MSW 32x16 Multiply and Add Instructions
Table 14. Signed 16-bit Multiply 32-bit Add/Subtract Instructions
Table 15. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 16. Miscellaneous Instructions
Table 17. Q31 Saturation ALU Instructions
Table 18. Q15 Saturation ALU Instructions
Table 19. OV (Overflow) Flag Set/Clear Instructions
Table 20. 64-bit Add/Subtract Instructions
Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions
Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 23. ZOL Mechanism Instructions
Typographical Convention Index

| Document Element | Font | Font Style | Size | Color |
| --- | --- | --- | --- | --- |
| Normal text | Georgia | Normal | 12 | Black |
| Command line, source code or file paths | Lucida Console | Normal | 11 | Indigo |
| VARIABLES OR PARAMETERS IN COMMAND LINE, SOURCE CODE OR FILE PATHS | LUCIDA CONSOLE | BOLD + ALL-CAPS | 11 | INDIGO |
| Note or warning | Georgia | Normal | 12 | Red |
| Hyperlink | Georgia | Underlined | 12 | Blue |
1. Introduction
Digital Signal Processing (DSP) has emerged as an important technology for modern electronic systems. A wide range of modern applications employ DSP algorithms to solve problems in their particular domains, including sensor fusion, servo motor control, audio decoding/encoding, speech synthesis and coding, MPEG4 decoding, medical imaging, computer vision, embedded control, robotics, and human interfaces. The AndeStar™ DSP instruction set extension increases the DSP algorithm processing capabilities of the AndesCore™ CPU IP products. With this extension, the AndesCore CPUs can run these DSP applications with lower power and higher performance. The extension adds 8-bit and 16-bit SIMD instructions to increase the throughput of 8-bit and 16-bit DSP computations, so more work can be done in a fixed time slot or a task can be completed faster. It also adds enhanced 16-bit, 32-bit, and 64-bit non-SIMD instructions to speed up frequent operations in DSP algorithms. To reduce the looping overhead of a repeated performance-critical DSP computation, this extension also includes a hardware zero-overhead loop mechanism.
1.1. SIMD Instructions
Using the AndeStar V5 baseline 32-bit registers, four 8-bit operations or two 16-bit operations can be performed in parallel to maximize the throughput of these 8-bit and 16-bit computations. Many DSP applications can benefit from this performance feature, so this DSP instruction set extension adds many 8-bit and 16-bit SIMD instructions. The 8-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned unpacking operations, and a signed absolute value operation.
The 16-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, different types of shift operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned multiplication operations, signed/unsigned clipping operations, and a signed absolute value operation.
1.2. Non-SIMD Instructions
The non-SIMD instructions in this DSP extension include 16-bit packing operations; Q15 and Q31 saturating addition, subtraction, and multiplication operations; 32-bit signed/unsigned halving addition and subtraction operations; 32-bit saturating left shift and rounding right shift operations; most significant word “32x32 multiply & add” operations; most significant word “32x16 multiply & add” operations; a variety of 32-bit accumulation or subtraction with 16-bit multiplication operations; and bit reverse, bit-wise selection, byte insertion, and 32-bit word extraction from 64-bit data operations. To speed up 64-bit operations in DSP applications, this extension also includes a variety of 64-bit addition and subtraction operations, signed/unsigned 64-bit accumulation or subtraction with 32-bit multiplication operations, and signed 64-bit accumulation or subtraction with 16-bit multiplication operations.
1.3. Zero-overhead Loop Mechanism
A Zero-Overhead Loop mechanism is provided to reduce the instruction fetch and execution overhead of loop-control instructions. Three 32-bit user-mode CSR registers are provided to support this mechanism.
LB: stores the starting address of a loop. It can be written with the “MTLBI” instruction.
LE: stores the ending address of a loop. The value of LE should be greater than or equal to LB; if this rule is violated, the behavior is UNPREDICTABLE. It can be written with the “MTLEI” instruction.
LC: contains the number of times the zero-overhead looping operation will be performed. When LC is greater than 1, any execution of an instruction at an address that matches the value of LE causes the Program Counter to change to the content of LB and decrements the value of LC by 1. When LC is less than or equal to 1, the zero-overhead loop mechanism is turned off.
The mechanism is used for any loop that needs to be executed at least once. For example:
    do { ......... } until (count > 4);
The zero-overhead looping operation can be summarized as follows:
    if ((LC > 1) && (PC == LE)) {
        LC = LC - 1;
        PC = LB;
    }
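The looping rule can be exercised with a small software model. The following is a minimal sketch in C, not a description of the hardware: the `zol_state` struct, the function names, and the fixed 4-byte instruction size are assumptions made only for illustration.

```c
#include <stdint.h>

/* Hypothetical software model of the three ZOL CSRs (not an Andes API). */
typedef struct {
    uint32_t lb;  /* loop begin address */
    uint32_t le;  /* loop end address   */
    uint32_t lc;  /* loop count         */
} zol_state;

/* Address of the next instruction after the one at `pc` has executed.
   Assumes fixed 4-byte instructions and no branches inside the loop. */
uint32_t zol_next_pc(zol_state *s, uint32_t pc)
{
    if (s->lc > 1 && pc == s->le) {
        s->lc -= 1;
        return s->lb;      /* loop back without a branch instruction */
    }
    return pc + 4;         /* normal sequential fetch */
}

/* Count how many times the body [lb, le] runs for a given initial LC. */
unsigned zol_count_passes(uint32_t lb, uint32_t le, uint32_t lc)
{
    zol_state s = { lb, le, lc };
    unsigned passes = 0;
    uint32_t pc = lb;

    while (pc >= lb && pc <= le) {
        if (pc == lb)
            passes++;
        pc = zol_next_pc(&s, pc);
    }
    return passes;
}
```

In this model an initial LC of 4 runs the body four times, while an initial LC of 0 or 1 runs it exactly once, which matches the "executed at least once" shape of the do/until loop.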
2. DSP ISA Extension Instruction Summary
2.1. Shorthand Definitions
r.H == r.H1: r[31:16]; r.L == r.H0: r[15:0]
r.B3: r[31:24]; r.B2: r[23:16]; r.B1: r[15:8]; r.B0: r[7:0]
r[xU]: the upper 32 bits of a 64-bit number; xU represents the GPR number that contains this upper 32-bit value.
r[xL]: the lower 32 bits of a 64-bit number; xL represents the GPR number that contains this lower 32-bit value.
r[xU].r[xL]: a 64-bit number that is formed from a pair of GPRs.
s>>: signed arithmetic right shift.
u>>: unsigned logical right shift.
SAT.Qn(): Saturate to the range [-2^n, 2^n-1]; if saturation happens, set PSW.OV.
SAT.Um(): Saturate to the range [0, 2^m-1]; if saturation happens, set PSW.OV.
RUND(): Indicates “rounding”, i.e., add 1 to the most significant discarded bit for right shift or MSW-type multiplication instructions.
Sign or Zero Extending functions:
SEm(data): Sign-extend data to m bits.
ZEm(data): Zero-extend data to m bits.
ABS(x): Calculate the absolute value of “x”.
CONCAT(x,y): Concatenate “x” and “y” to form a value.
u<: Unsigned less-than comparison.
u<=: Unsigned less-than-or-equal comparison.
u>: Unsigned greater-than comparison.
s*: Signed multiplication.
u*: Unsigned multiplication.
rt is Rd in RISC-V ISA terminology.
ra is Rs1 and rb is Rs2 in RISC-V ISA terminology.
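The two saturation shorthands can be read as plain clamping functions. The sketch below is one possible C rendering under that reading; the `sat_result` type and the function names are hypothetical, and the `ov` field stands in for the effect on PSW.OV rather than modeling the register itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: clamped value plus "saturation happened". */
typedef struct {
    int64_t value;
    bool    ov;     /* stands in for setting PSW.OV */
} sat_result;

/* SAT.Qn: clamp v to [-2^n, 2^n - 1]. Assumes n < 63. */
sat_result sat_q(int64_t v, unsigned n)
{
    const int64_t max = ((int64_t)1 << n) - 1;
    const int64_t min = -max - 1;

    if (v > max) return (sat_result){ max, true };
    if (v < min) return (sat_result){ min, true };
    return (sat_result){ v, false };
}

/* SAT.Um: clamp v to [0, 2^m - 1]. Assumes m < 63. */
sat_result sat_u(int64_t v, unsigned m)
{
    const int64_t max = ((int64_t)1 << m) - 1;

    if (v > max) return (sat_result){ max, true };
    if (v < 0)   return (sat_result){ 0, true };
    return (sat_result){ v, false };
}
```

For example, `sat_q(40000, 15)` yields 32767 with `ov` set, and `sat_u(-1, 8)` yields 0 with `ov` set.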
2.2. SIMD Data Processing Instructions
2.2.1. 16-bit Addition & Subtraction Instructions
The SIMD 16-bit add/subtract instructions support 4 types of operations: Addition (two 16-bit additions), Subtraction (two 16-bit subtractions), Crossed Add & Sub, and Crossed Sub & Add. The overflow handling of these instructions can have 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB bit), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 20 SIMD 16-bit add/subtract instructions.

Table 1. SIMD 16-bit Add/Subtract Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| ADD16 rt, ra, rb | 16-bit Addition | rt.Hx = ra.Hx + rb.Hx; (x=1..0) |
| RADD16 rt, ra, rb | 16-bit Signed Halving Addition | rt.Hx = (ra.Hx + rb.Hx) s>> 1; (x=1..0) |
| URADD16 rt, ra, rb | 16-bit Unsigned Halving Addition | rt.Hx = (ra.Hx + rb.Hx) u>> 1; (x=1..0) |
| KADD16 rt, ra, rb | 16-bit Signed Saturating Addition | rt.Hx = SAT.Q15(ra.Hx + rb.Hx); (x=1..0) |
| UKADD16 rt, ra, rb | 16-bit Unsigned Saturating Addition | rt.Hx = SAT.U16(ra.Hx + rb.Hx); (x=1..0) |
| SUB16 rt, ra, rb | 16-bit Subtraction | rt.Hx = ra.Hx - rb.Hx; (x=1..0) |
| RSUB16 rt, ra, rb | 16-bit Signed Halving Subtraction | rt.Hx = (ra.Hx - rb.Hx) s>> 1; (x=1..0) |
| URSUB16 rt, ra, rb | 16-bit Unsigned Halving Subtraction | rt.Hx = (ra.Hx - rb.Hx) u>> 1; (x=1..0) |
| KSUB16 rt, ra, rb | 16-bit Signed Saturating Subtraction | rt.Hx = SAT.Q15(ra.Hx - rb.Hx); (x=1..0) |
| UKSUB16 rt, ra, rb | 16-bit Unsigned Saturating Subtraction | rt.Hx = SAT.U16(ra.Hx - rb.Hx); (x=1..0) |
| CRAS16 rt, ra, rb | 16-bit Cross Add & Sub | rt.H = ra.H + rb.L; rt.L = ra.L - rb.H; |
| RCRAS16 rt, ra, rb | 16-bit Signed Halving Cross Add & Sub | rt.H = (ra.H + rb.L) s>> 1; rt.L = (ra.L - rb.H) s>> 1; |
| URCRAS16 rt, ra, rb | 16-bit Unsigned Halving Cross Add & Sub | rt.H = (ra.H + rb.L) u>> 1; rt.L = (ra.L - rb.H) u>> 1; |
| KCRAS16 rt, ra, rb | 16-bit Signed Saturating Cross Add & Sub | rt.H = SAT.Q15(ra.H + rb.L); rt.L = SAT.Q15(ra.L - rb.H); |
| UKCRAS16 rt, ra, rb | 16-bit Unsigned Saturating Cross Add & Sub | rt.H = SAT.U16(ra.H + rb.L); rt.L = SAT.U16(ra.L - rb.H); |
| CRSA16 rt, ra, rb | 16-bit Cross Sub & Add | rt.H = ra.H - rb.L; rt.L = ra.L + rb.H; |
| RCRSA16 rt, ra, rb | 16-bit Signed Halving Cross Sub & Add | rt.H = (ra.H - rb.L) s>> 1; rt.L = (ra.L + rb.H) s>> 1; |
| URCRSA16 rt, ra, rb | 16-bit Unsigned Halving Cross Sub & Add | rt.H = (ra.H - rb.L) u>> 1; rt.L = (ra.L + rb.H) u>> 1; |
| KCRSA16 rt, ra, rb | 16-bit Signed Saturating Cross Sub & Add | rt.H = SAT.Q15(ra.H - rb.L); rt.L = SAT.Q15(ra.L + rb.H); |
| UKCRSA16 rt, ra, rb | 16-bit Unsigned Saturating Cross Sub & Add | rt.H = SAT.U16(ra.H - rb.L); rt.L = SAT.U16(ra.L + rb.H); |
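To make the difference between the overflow-handling variations concrete, the sketch below models four of the instructions on a 32-bit register value in portable C. It is an illustrative reading of the Operation column, not reference code: the helper names are hypothetical, the halving forms assume the sum is formed at more than 16 bits before the shift, and the PSW.OV update of the saturating form is omitted.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of `bits` to a 32-bit integer. */
int32_t sx16(uint32_t bits)
{
    int32_t v = (int32_t)(bits & 0xffffu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Pack two lane results into one register, keeping 16 bits of each. */
uint32_t pack16(int32_t h, int32_t l)
{
    return ((uint32_t)(uint16_t)h << 16) | (uint32_t)(uint16_t)l;
}

/* SAT.Q15 without the PSW.OV side effect. */
int32_t sat_q15(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* Arithmetic shift right by one (floor), avoiding >> on negative values. */
int32_t asr1(int32_t v)
{
    return v >= 0 ? v / 2 : (v - 1) / 2;
}

/* ADD16: wrap-around addition in each half. */
uint32_t add16(uint32_t ra, uint32_t rb)
{
    return pack16(sx16(ra >> 16) + sx16(rb >> 16), sx16(ra) + sx16(rb));
}

/* RADD16: signed halving addition, the carry is kept and the LSB dropped. */
uint32_t radd16(uint32_t ra, uint32_t rb)
{
    return pack16(asr1(sx16(ra >> 16) + sx16(rb >> 16)),
                  asr1(sx16(ra) + sx16(rb)));
}

/* KADD16: signed saturating addition. */
uint32_t kadd16(uint32_t ra, uint32_t rb)
{
    return pack16(sat_q15(sx16(ra >> 16) + sx16(rb >> 16)),
                  sat_q15(sx16(ra) + sx16(rb)));
}

/* CRAS16: rt.H = ra.H + rb.L; rt.L = ra.L - rb.H. */
uint32_t cras16(uint32_t ra, uint32_t rb)
{
    return pack16(sx16(ra >> 16) + sx16(rb), sx16(ra) - sx16(rb >> 16));
}
```

With ra = 0x7fff0001 and rb = 0x00010001, the upper half overflows: `add16` wraps it to 0x8000, `kadd16` clips it to 0x7fff, and `radd16` returns the halved sum 0x4000.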
2.2.2. 8-bit Addition & Subtraction Instructions
The SIMD 8-bit add/subtract instructions support 2 types of operations: Addition (four 8-bit additions), Subtraction (four 8-bit subtractions). The overflow handling of these instructions can have 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB bit), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 10 SIMD 8-bit add/subtract instructions.

Table 2. SIMD 8-bit Add/Subtract Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| ADD8 rt, ra, rb | 8-bit Addition | rt.Bx = ra.Bx + rb.Bx; (x=3..0) |
| RADD8 rt, ra, rb | 8-bit Signed Halving Addition | rt.Bx = (ra.Bx + rb.Bx) s>> 1; (x=3..0) |
| URADD8 rt, ra, rb | 8-bit Unsigned Halving Addition | rt.Bx = (ra.Bx + rb.Bx) u>> 1; (x=3..0) |
| KADD8 rt, ra, rb | 8-bit Signed Saturating Addition | rt.Bx = SAT.Q7(ra.Bx + rb.Bx); (x=3..0) |
| UKADD8 rt, ra, rb | 8-bit Unsigned Saturating Addition | rt.Bx = SAT.U8(ra.Bx + rb.Bx); (x=3..0) |
| SUB8 rt, ra, rb | 8-bit Subtraction | rt.Bx = ra.Bx - rb.Bx; (x=3..0) |
| RSUB8 rt, ra, rb | 8-bit Signed Halving Subtraction | rt.Bx = (ra.Bx - rb.Bx) s>> 1; (x=3..0) |
| URSUB8 rt, ra, rb | 8-bit Unsigned Halving Subtraction | rt.Bx = (ra.Bx - rb.Bx) u>> 1; (x=3..0) |
| KSUB8 rt, ra, rb | 8-bit Signed Saturating Subtraction | rt.Bx = SAT.Q7(ra.Bx - rb.Bx); (x=3..0) |
| UKSUB8 rt, ra, rb | 8-bit Unsigned Saturating Subtraction | rt.Bx = SAT.U8(ra.Bx - rb.Bx); (x=3..0) |

2.2.3. 16-bit Shift Instructions
There are 13 instructions here.

Table 3. SIMD 16-bit Shift Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SRA16 rt, ra, rb | 16-bit Shift Right Arithmetic | rt.Hx = ra.Hx s>> rb[3:0]; (x=1..0) |
| SRAI16 rt, ra, im4u | 16-bit Shift Right Arithmetic Immediate | rt.Hx = ra.Hx s>> im4u; (x=1..0) |
| SRA16.u rt, ra, rb | 16-bit Rounding Shift Right Arithmetic | rt.Hx = RUND(ra.Hx s>> rb[3:0]); (x=1..0) |
| SRAI16.u rt, ra, im4u | 16-bit Rounding Shift Right Arithmetic Immediate | rt.Hx = RUND(ra.Hx s>> im4u); (x=1..0) |
| SRL16 rt, ra, rb | 16-bit Shift Right Logical | rt.Hx = ra.Hx u>> rb[3:0]; (x=1..0) |
| SRLI16 rt, ra, im4u | 16-bit Shift Right Logical Immediate | rt.Hx = ra.Hx u>> im4u; (x=1..0) |
| SRL16.u rt, ra, rb | 16-bit Rounding Shift Right Logical | rt.Hx = RUND(ra.Hx u>> rb[3:0]); (x=1..0) |
| SRLI16.u rt, ra, im4u | 16-bit Rounding Shift Right Logical Immediate | rt.Hx = RUND(ra.Hx u>> im4u); (x=1..0) |
| SLL16 rt, ra, rb | 16-bit Shift Left Logical | rt.Hx = ra.Hx << rb[3:0]; (x=1..0) |
| SLLI16 rt, ra, im4u | 16-bit Shift Left Logical Immediate | rt.Hx = ra.Hx << im4u; (x=1..0) |
| KSLL16 rt, ra, rb | 16-bit Saturating Shift Left Logical | Pseudo instruction of “KSLRA16”. |
| KSLLI16 rt, ra, im4u | 16-bit Saturating Shift Left Logical Immediate | rt.Hx = SAT.Q15(ra.Hx << im4u); (x=1..0) |
| KSLRA16 rt, ra, rb | 16-bit Shift Left Logical with Saturation & Shift Right Arithmetic | if (rb[4:0] < 0) rt.Hx = ra.Hx s>> -rb[4:0]; if (rb[4:0] > 0) rt.Hx = SAT.Q15(ra.Hx << rb[4:0]); (x=1..0) |
| KSLRA16.u rt, ra, rb | 16-bit Shift Left Logical with Saturation & Rounding Shift Right Arithmetic | if (rb[4:0] < 0) rt.Hx = RUND(ra.Hx s>> -rb[4:0]); if (rb[4:0] > 0) rt.Hx = SAT.Q15(ra.Hx << rb[4:0]); (x=1..0) |
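The rounding shifts and the bidirectional KSLRA16 shift are easier to follow one 16-bit lane at a time. The sketch below is an illustrative C model of a single lane, not reference code. The function names are hypothetical, rb[4:0] is read as a signed 5-bit value as the comparisons in the table imply, and returning the input unchanged for a shift amount of 0 is an assumption, since the table only lists the negative and positive cases.

```c
#include <stdint.h>

/* Arithmetic shift right by sh (floor division by 2^sh), written with
   division so it does not depend on how >> treats negative values.
   Assumes sh <= 16. */
int32_t asr(int32_t v, unsigned sh)
{
    int32_t d = (int32_t)1 << sh;
    int32_t q = v / d;
    if (v % d != 0 && v < 0)
        q -= 1;            /* turn truncation toward zero into floor */
    return q;
}

/* RUND(v s>> sh): add 1 at the most significant discarded bit. */
int32_t asr_round(int32_t v, unsigned sh)
{
    if (sh == 0)
        return v;          /* nothing is discarded */
    return asr(asr(v, sh - 1) + 1, 1);
}

/* SAT.Q15 without the PSW.OV side effect. */
int32_t sat_q15(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* One lane of KSLRA16: h is a sign-extended 16-bit value. */
int32_t kslra16_lane(int32_t h, uint32_t rb)
{
    int32_t amt = (int32_t)(rb & 0x1fu);
    if (amt >= 16)
        amt -= 32;                         /* sign-extend the 5-bit field */

    if (amt < 0)
        return asr(h, (unsigned)(-amt));   /* shift right arithmetic */
    if (amt > 0)
        return sat_q15(h * ((int32_t)1 << amt));  /* saturating shift left */
    return h;                              /* amt == 0: assumed unchanged */
}
```

For instance, `asr_round(7, 2)` gives 2 (7/4 = 1.75 rounds up), while `kslra16_lane(16384, 1)` saturates to 32767 and `kslra16_lane(-8, 0x1f)` shifts right by one to give -4.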
2.2.4. 16-bit Compare Instructions
There are 5 instructions here.

Table 4. SIMD 16-bit Compare Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| CMPEQ16 rt, ra, rb | 16-bit Compare Equal | rt.Hx = (ra.Hx == rb.Hx)? 0xffff : 0; (x=1..0) |
| SCMPLT16 rt, ra, rb | 16-bit Signed Compare Less Than | rt.Hx = (ra.Hx < rb.Hx)? 0xffff : 0; (x=1..0) |
| SCMPLE16 rt, ra, rb | 16-bit Signed Compare Less Than & Equal | rt.Hx = (ra.Hx <= rb.Hx)? 0xffff : 0; (x=1..0) |
| UCMPLT16 rt, ra, rb | 16-bit Unsigned Compare Less Than | rt.Hx = (ra.Hx u< rb.Hx)? 0xffff : 0; (x=1..0) |
| UCMPLE16 rt, ra, rb | 16-bit Unsigned Compare Less Than & Equal | rt.Hx = (ra.Hx u<= rb.Hx)? 0xffff : 0; (x=1..0) |

2.2.5. 8-bit Compare Instructions
There are 5 instructions here.

Table 5. SIMD 8-bit Compare Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| CMPEQ8 rt, ra, rb | 8-bit Compare Equal | rt.Bx = (ra.Bx == rb.Bx)? 0xff : 0; (x=3..0) |
| SCMPLT8 rt, ra, rb | 8-bit Signed Compare Less Than | rt.Bx = (ra.Bx < rb.Bx)? 0xff : 0; (x=3..0) |
| SCMPLE8 rt, ra, rb | 8-bit Signed Compare Less Than & Equal | rt.Bx = (ra.Bx <= rb.Bx)? 0xff : 0; (x=3..0) |
| UCMPLT8 rt, ra, rb | 8-bit Unsigned Compare Less Than | rt.Bx = (ra.Bx u< rb.Bx)? 0xff : 0; (x=3..0) |
| UCMPLE8 rt, ra, rb | 8-bit Unsigned Compare Less Than & Equal | rt.Bx = (ra.Bx u<= rb.Bx)? 0xff : 0; (x=3..0) |

2.2.6. 16-bit Misc Instructions
There are 13 instructions here.

Table 6. SIMD 16-bit Miscellaneous Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SMIN16 rt, ra, rb | 16-bit Signed Minimum | rt.Hx = (ra.Hx < rb.Hx)? ra.Hx : rb.Hx; (x=1..0) |
| UMIN16 rt, ra, rb | 16-bit Unsigned Minimum | rt.Hx = (ra.Hx u< rb.Hx)? ra.Hx : rb.Hx; (x=1..0) |
| SMAX16 rt, ra, rb | 16-bit Signed Maximum | rt.Hx = (ra.Hx > rb.Hx)? ra.Hx : rb.Hx; (x=1..0) |
| UMAX16 rt, ra, rb | 16-bit Unsigned Maximum | rt.Hx = (ra.Hx u> rb.Hx)? ra.Hx : rb.Hx; (x=1..0) |
| SCLIP16 rt, ra, imm4u | 16-bit Signed Clip Value | n = imm4u; rt.Hx = SAT.Qn(ra.Hx); (x=1..0) |
| UCLIP16 rt, ra, imm4u | 16-bit Unsigned Clip Value | m = imm4u; rt.Hx = SAT.Um(ra.Hx); (x=1..0) |
| KHM16 rt, ra, rb | 16-bit Signed Multiply | rt.Hx = SAT.Q15((ra.Hx s* rb.Hx) >> 15); (x=1..0) |
| KHMX16 rt, ra, rb | 16-bit Crossed Signed Multiply | rt.Hx = SAT.Q15((ra.Hx s* rb.Hy) >> 15); (x,y)=(1,0), (0,1) |
| SMUL16 rt, ra, rb | 16-bit Signed Multiply to 32-bit | r[tU] = ra.H1 s* rb.H1; r[tL] = ra.H0 s* rb.H0; |
| SMULX16 rt, ra, rb | 16-bit Signed Crossed Multiply to 32-bit | r[tU] = ra.H1 s* rb.H0; r[tL] = ra.H0 s* rb.H1; |
| UMUL16 rt, ra, rb | 16-bit Unsigned Multiply to 32-bit | r[tU] = ra.H1 u* rb.H1; r[tL] = ra.H0 u* rb.H0; |
| UMULX16 rt, ra, rb | 16-bit Unsigned Crossed Multiply to 32-bit | r[tU] = ra.H1 u* rb.H0; r[tL] = ra.H0 u* rb.H1; |
| KABS16 rt, ra | 16-bit Absolute Value | rt.Hx = SAT.Q15(ABS(ra.Hx)); (x=1..0) |
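KHM16 and KABS16 both saturate because of a single corner case, the most negative Q15 value. The sketch below models one 16-bit lane of each in C to show where that case arises. It is an illustration only: the function names are hypothetical, the `>> 15` in the table is assumed to be an arithmetic shift of the signed product, and the PSW.OV update is omitted.

```c
#include <stdint.h>

/* SAT.Q15 without the PSW.OV side effect. */
int32_t sat_q15(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* One lane of KHM16: Q15 x Q15 -> Q15.
   a and b are sign-extended 16-bit values, so |a * b| <= 2^30. */
int32_t khm16_lane(int32_t a, int32_t b)
{
    int32_t p = a * b;
    int32_t q = p / 32768;            /* truncates toward zero */
    if (p % 32768 != 0 && p < 0)
        q -= 1;                       /* match an arithmetic >> 15 (floor) */
    return sat_q15(q);
}

/* One lane of KABS16: |a| with -32768 clipped to 32767. */
int32_t kabs16_lane(int32_t a)
{
    return sat_q15(a < 0 ? -a : a);
}
```

Multiplying 0.5 by 0.5 in Q15 (16384 by 16384) gives 8192, i.e. 0.25, whereas -1.0 times -1.0 (-32768 by -32768) would be +1.0, which is not representable and is clipped to 32767. The same clipping applies to `kabs16_lane(-32768)`.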
2.2.7. 8-bit Misc Instructions
There are 5 instructions here.

Table 7. SIMD 8-bit Miscellaneous Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SMIN8 rt, ra, rb | 8-bit Signed Minimum | rt.Bx = (ra.Bx < rb.Bx)? ra.Bx : rb.Bx; (x=3..0) |
| UMIN8 rt, ra, rb | 8-bit Unsigned Minimum | rt.Bx = (ra.Bx u< rb.Bx)? ra.Bx : rb.Bx; (x=3..0) |
| SMAX8 rt, ra, rb | 8-bit Signed Maximum | rt.Bx = (ra.Bx > rb.Bx)? ra.Bx : rb.Bx; (x=3..0) |
| UMAX8 rt, ra, rb | 8-bit Unsigned Maximum | rt.Bx = (ra.Bx u> rb.Bx)? ra.Bx : rb.Bx; (x=3..0) |
| KABS8 rt, ra | 8-bit Absolute Value | rt.Bx = SAT.Q7(ABS(ra.Bx)); (x=3..0) |

2.2.8. 8-bit Unpacking Instructions
There are 8 instructions here.
Table 8. 8-bit Unpacking Instructions
Mnemonic
Instruction
Operation
SUNPKD810 rt, ra
Signed Unpacking Bytes 1 & 0
rt.Hx = SE16(ra.By); (x,y) = (1,1), (0,0)
SUNPKD820 rt, ra
Signed Unpacking Bytes 2 & 0
rt.Hx = SE16(ra.By); (x,y) = (1,2), (0,0)
SUNPKD830 rt, ra
Signed Unpacking Bytes 3 & 0
rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,0)
SUNPKD831 rt, ra
Signed Unpacking Bytes 3 & 1
rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,1)
ZUNPKD810 rt, ra
Unsigned Unpacking Bytes 1 & 0
rt.Hx = ZE16(ra.By); (x,y) = (1,1), (0,0)
ZUNPKD820 rt, ra
Unsigned Unpacking Bytes 2 & 0
rt.Hx = ZE16(ra.By); (x,y) = (1,2), (0,0)
ZUNPKD830 rt, ra
Unsigned Unpacking Bytes 3 & 0
rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,0)
ZUNPKD831 rt, ra
Unsigned Unpacking Bytes 3 & 1
rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,1)
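As an illustration of the unpacking operations, the following C sketch models the whole family with two hypothetical helper functions that take the source byte indices as parameters. The names `sunpkd8` and `zunpkd8` and their parameterized form are assumptions made for this example; the proposal defines one mnemonic per byte pair.

```c
#include <stdint.h>

/* Extract byte idx (0..3) of a 32-bit word. */
static uint32_t byte_of(uint32_t w, unsigned idx) {
    return (w >> (8u * idx)) & 0xFFu;
}

/* SE16: sign-extend an 8-bit value to a 16-bit pattern. */
static uint32_t se16(uint32_t b) {
    return (b & 0x80u) ? (b | 0xFF00u) : b;
}

/* Sketch of SUNPKD8xy: rt.H1 = SE16(ra.B[hi]); rt.H0 = SE16(ra.B[lo]). */
uint32_t sunpkd8(uint32_t ra, unsigned hi, unsigned lo) {
    return (se16(byte_of(ra, hi)) << 16) | se16(byte_of(ra, lo));
}

/* Sketch of ZUNPKD8xy: rt.H1 = ZE16(ra.B[hi]); rt.H0 = ZE16(ra.B[lo]). */
uint32_t zunpkd8(uint32_t ra, unsigned hi, unsigned lo) {
    return (byte_of(ra, hi) << 16) | byte_of(ra, lo);
}
```

Under this model, SUNPKD831 corresponds to `sunpkd8(ra, 3, 1)` and ZUNPKD810 to `zunpkd8(ra, 1, 0)`.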
2.3.
Non-SIMD Data Processing Instructions
2.3.1.
32-bit Addition/Subtraction Instructions
There are 4 instructions here.
Table 9. 32-bit Add/Sub Instructions
Mnemonic
Instruction
Operation
RADDW rt, ra, rb
32-bit Signed Halving Addition
rt = (ra + rb) s>> 1
URADDW rt, ra, rb
32-bit Unsigned Halving Addition
rt = (ra + rb) u>> 1
RSUBW rt, ra, rb
32-bit Signed Halving Subtraction
rt = (ra - rb) s>> 1
URSUBW rt, ra, rb
32-bit Unsigned Halving Subtraction
rt = (ra - rb) u>> 1
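The halving operations keep the carry-out of the addition or subtraction before shifting, which a wider intermediate makes explicit. The C sketch below models RADDW, URADDW, and RSUBW under the assumption that the sum or difference is formed at full precision before the shift. URSUBW is left out because the table does not state how a negative unsigned difference is treated. All function names are hypothetical.

```c
#include <stdint.h>

/* floor(v / 2), i.e. an arithmetic shift right by one, written without
   relying on implementation-defined shifts of negative values. */
static int64_t floor_half(int64_t v) {
    return (v - (v & 1)) / 2;
}

/* Sketch of RADDW: rt = (ra + rb) s>> 1. */
int32_t raddw(int32_t ra, int32_t rb) {
    return (int32_t)floor_half((int64_t)ra + (int64_t)rb);
}

/* Sketch of URADDW: rt = (ra + rb) u>> 1. */
uint32_t uraddw(uint32_t ra, uint32_t rb) {
    return (uint32_t)(((uint64_t)ra + (uint64_t)rb) >> 1);
}

/* Sketch of RSUBW: rt = (ra - rb) s>> 1. */
int32_t rsubw(int32_t ra, int32_t rb) {
    return (int32_t)floor_half((int64_t)ra - (int64_t)rb);
}
```

Because the intermediate is 64 bits wide, the result always fits in 32 bits and no saturation is needed.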
2.3.2.
32-bit Shift Instructions
There are 5 instructions here.
Table 10. 32-bit Shift Instructions
Mnemonic
Instruction
Operation
SRA.u rt, ra, rb
Rounding Shift Right Arithmetic
rt = RUND(ra s>> rb[4:0])
SRAI.u rt, ra, imm5u
Rounding Shift Right Arithmetic Immediate
rt = RUND(ra s>> imm5u)
KSLL rt, ra, rb
Saturating Shift Left Logical
rt = SAT.Q31(ra << rb[4:0]); Pseudo instruction of “KSLRAW”.
KSLLI rt, ra, imm5u
Saturating Shift Left Logical Immediate
rt = SAT.Q31(ra << imm5u)
KSLRAW.u rt, ra, rb
Shift Left Logical with Saturation & Rounding Shift Right Arithmetic
if (rb[7:0] < 0) rt = RUND(ra s>> SAT.U5(-rb[7:0])); if (rb[7:0] > 0) rt = SAT.Q31(ra << SAT.U5(rb[7:0]));
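A minimal C sketch of the saturating left shift (KSLL) follows. It assumes SAT.Q31 clamps to the signed 32-bit range and that only rb[4:0] selects the shift amount, as the table states. The rounding variants are not modeled because RUND is not defined in this section. The function name is hypothetical.

```c
#include <stdint.h>

/* Sketch of KSLL: rt = SAT.Q31(ra << rb[4:0]). */
int32_t ksll(int32_t ra, uint32_t rb) {
    unsigned sh = rb & 31u;
    /* Multiply instead of shifting so negative inputs stay well defined;
       |ra| <= 2^31 and sh <= 31, so the product fits in 64 bits. */
    int64_t wide = (int64_t)ra * ((int64_t)1 << sh);
    if (wide > INT32_MAX) return INT32_MAX;
    if (wide < INT32_MIN) return INT32_MIN;
    return (int32_t)wide;
}
```

KSLLI would follow the same pattern with imm5u in place of rb[4:0].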
2.3.3.
16-bit Packing Instructions
There are 4 instructions here.
Table 11. 16-bit Packing Instructions
Mnemonic
Instruction
Operation
PKBB16 rt, ra, rb
Pack two 16-bit data from Bottoms
rt = CONCAT(ra.H0, rb.H0);
PKBT16 rt, ra, rb
Pack two 16-bit data Bottom & Top
rt = CONCAT(ra.H0, rb.H1);
PKTB16 rt, ra, rb
Pack two 16-bit data Top & Bottom
rt = CONCAT(ra.H1, rb.H0);
PKTT16 rt, ra, rb
Pack two 16-bit data from Tops
rt = CONCAT(ra.H1, rb.H1);
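The four packing operations can be sketched in C as follows. The sketch assumes that CONCAT(a, b) places its first argument in the upper halfword (H1) and its second in the lower halfword (H0); the text above does not state the ordering explicitly. Function names are hypothetical.

```c
#include <stdint.h>

/* CONCAT(hi, lo): assumed to place hi in bits 31..16 and lo in bits 15..0. */
static uint32_t concat16(uint32_t hi, uint32_t lo) {
    return ((hi & 0xFFFFu) << 16) | (lo & 0xFFFFu);
}

/* Sketches of PKBB16, PKBT16, PKTB16, PKTT16. */
uint32_t pkbb16(uint32_t ra, uint32_t rb) { return concat16(ra, rb); }
uint32_t pkbt16(uint32_t ra, uint32_t rb) { return concat16(ra, rb >> 16); }
uint32_t pktb16(uint32_t ra, uint32_t rb) { return concat16(ra >> 16, rb); }
uint32_t pktt16(uint32_t ra, uint32_t rb) { return concat16(ra >> 16, rb >> 16); }
```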
2.3.4.
Most Significant Word “32x32” Multiply & Add Instructions
There are 8 instructions here.
Table 12. Signed MSW 32x32 Multiply and Add Instructions
Mnemonic
Instruction
Operation
SMMUL rt, ra, rb
MSW “32 x 32” Signed Multiplication (MSW 32 = 32x32)
rt = (ra*rb)[63:32];
SMMUL.u rt, ra, rb
MSW “32 x 32” Signed Multiplication with Rounding (MSW 32 = 32x32)
rt = RUND(ra*rb)[63:32];
KMMAC rt, ra, rb
MSW “32 x 32” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x32)
rt = SAT.Q31(rt + (ra*rb)[63:32]);
KMMAC.u rt, ra, rb
MSW “32 x 32” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x32)
rt = SAT.Q31(rt + RUND(ra*rb)[63:32]);
KMMSB rt, ra, rb
MSW “32 x 32” Signed Multiplication and Saturating Subtraction (MSW 32 = 32 - 32x32)
rt = SAT.Q31(rt - (ra*rb)[63:32]);
KMMSB.u rt, ra, rb
MSW “32 x 32” Signed Multiplication and Saturating Subtraction with Rounding (MSW 32 = 32 - 32x32)
rt = SAT.Q31(rt - RUND(ra*rb)[63:32]);
KWMMUL rt, ra, rb
MSW “32 x 32” Signed Multiplication & Double (MSW 32 = 32x32 << 1)
rt = SAT.Q31((ra*rb << 1)[63:32]);
KWMMUL.u rt, ra, rb
MSW “32 x 32” Signed Multiplication & Double with Rounding (MSW 32 = 32x32 << 1)
rt = SAT.Q31(RUND(ra*rb << 1)[63:32]);
2.3.5.
Most Significant Word “32x16” Multiply & Add Instructions
There are 8 instructions here.
Table 13. Signed MSW 32x16 Multiply and Add Instructions
Mnemonic
Instruction
Operation
SMMWB rt, ra, rb
MSW “32 x Bottom 16” Signed Multiplication (MSW 32 = 32x16)
rt = (ra*rb.L)[47:16];
SMMWB.u rt, ra, rb
MSW “32 x Bottom 16” Signed Multiplication with Rounding (MSW 32 = 32x16)
rt = RUND(ra*rb.L)[47:16];
SMMWT rt, ra, rb
MSW “32 x Top 16” Signed Multiplication (MSW 32 = 32x16)
rt = (ra*rb.H)[47:16];
SMMWT.u rt, ra, rb
MSW “32 x Top 16” Signed Multiplication with Rounding (MSW 32 = 32x16)
rt = RUND(ra*rb.H)[47:16];
KMMAWB rt, ra, rb
MSW “32 x Bottom 16” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x16)
rt = SAT.Q31(rt + (ra*rb.L)[47:16]);
KMMAWB.u rt, ra, rb
MSW “32 x Bottom 16” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x16)
rt = SAT.Q31(rt + RUND(ra*rb.L)[47:16]);
KMMAWT rt, ra, rb
MSW “32 x Top 16” Signed Multiplication and Saturating Addition (MSW 32 = 32 + 32x16)
rt = SAT.Q31(rt + (ra*rb.H)[47:16]);
KMMAWT.u rt, ra, rb
MSW “32 x Top 16” Signed Multiplication and Saturating Addition with Rounding (MSW 32 = 32 + 32x16)
rt = SAT.Q31(rt + RUND(ra*rb.H)[47:16]);
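The most-significant-word multiplies select a 32-bit window from a wider signed product: bits [63:32] for the 32x32 forms and bits [47:16] for the 32x16 forms. The following C sketch shows the non-rounding variants only. It assumes rb.L and rb.H denote the lower and upper signed halfwords of rb, and the function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of v. */
static int32_t sx16(uint32_t v) {
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 65536 : (int32_t)v;
}

/* Reinterpret a 32-bit pattern as a signed two's-complement value. */
static int32_t to_i32(uint32_t u) {
    return (u & 0x80000000u) ? (int32_t)(u - 0x80000000u) - INT32_MAX - 1
                             : (int32_t)u;
}

/* Sketch of SMMUL: rt = (ra*rb)[63:32]. */
int32_t smmul(int32_t ra, int32_t rb) {
    uint64_t p = (uint64_t)((int64_t)ra * (int64_t)rb);
    return to_i32((uint32_t)(p >> 32));
}

/* Sketch of SMMWB: rt = (ra*rb.L)[47:16]. */
int32_t smmwb(int32_t ra, uint32_t rb) {
    uint64_t p = (uint64_t)((int64_t)ra * (int64_t)sx16(rb));
    return to_i32((uint32_t)(p >> 16));
}

/* Sketch of SMMWT: rt = (ra*rb.H)[47:16]. */
int32_t smmwt(int32_t ra, uint32_t rb) {
    uint64_t p = (uint64_t)((int64_t)ra * (int64_t)sx16(rb >> 16));
    return to_i32((uint32_t)(p >> 16));
}

/* Sketch of KMMAC: rt = SAT.Q31(rt + (ra*rb)[63:32]). */
int32_t kmmac(int32_t rt, int32_t ra, int32_t rb) {
    int64_t sum = (int64_t)rt + (int64_t)smmul(ra, rb);
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

The accumulate forms for the 32x16 case (KMMAWB, KMMAWT) would combine `smmwb` or `smmwt` with the same saturating addition used in `kmmac`.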
2.3.6.
Signed 16-bit Multiply with 32-bit Add/Subtract Instructions
There are 18 instructions here.
Table 14. Signed 16-bit Multiply 32-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMBB rt, ra, rb
Signed Multiply Bottom 16 & Bottom 16 (32 = 16x16)
rt = ra.L*rb.L;
SMBT rt, ra, rb
Signed Multiply Bottom 16 & Top 16 (32 = 16x16)
rt = ra.L*rb.H;
SMTT rt, ra, rb
Signed Multiply Top 16 & Top 16 (32 = 16x16)
rt = ra.H*rb.H;
KMDA rt, ra, rb
Two “16x16” and Signed Addition (32 = 16x16 + 16x16)
rt = SAT.Q31(ra.H*rb.H + ra.L*rb.L);
KMXDA rt, ra, rb
Two Crossed “16x16” and Signed Addition (32 = 16x16 + 16x16)
rt = SAT.Q31(ra.H*rb.L + ra.L*rb.H);
SMDS rt, ra, rb
Two “16x16” and Signed Subtraction (32 = 16x16 - 16x16)
rt = (ra.H*rb.H) - (ra.L*rb.L);
SMDRS rt, ra, rb
Two “16x16” and Signed Reversed Subtraction (32 = 16x16 - 16x16)
rt = (ra.L*rb.L) - (ra.H*rb.H);
SMXDS rt, ra, rb
Two Crossed “16x16” and Signed Subtraction (32 = 16x16 - 16x16)
rt = (ra.H*rb.L) - (ra.L*rb.H);
KMABB rt, ra, rb
“Bottom 16 x Bottom 16” with 32-bit Signed Addition (32 = 32 + 16x16)
rt = SAT.Q31(rt + ra.L*rb.L);
KMABT rt, ra, rb
“Bottom 16 x Top 16” with 32-bit Signed Addition (32 = 32 + 16x16)
rt = SAT.Q31(rt + ra.L*rb.H);
KMATT rt, ra, rb
“Top 16 x Top 16” with 32-bit Signed Addition (32 = 32 + 16x16)
rt = SAT.Q31(rt + ra.H*rb.H);
KMADA rt, ra, rb
Two “16x16” with 32-bit Signed Double Addition (32 = 32 + 16x16 + 16x16)
rt = SAT.Q31(rt + ra.H*rb.H + ra.L*rb.L);
KMAXDA rt, ra, rb
Two Crossed “16x16” with 32-bit Signed Double Addition (32 = 32 + 16x16 + 16x16)
rt = SAT.Q31(rt + ra.H*rb.L + ra.L*rb.H);
KMADS rt, ra, rb
Two “16x16” with 32-bit Signed Addition and Subtraction (32 = 32 + 16x16 - 16x16)
rt = SAT.Q31(rt + ra.H*rb.H - ra.L*rb.L);
KMADRS rt, ra, rb
Two “16x16” with 32-bit Signed Addition and Reversed Subtraction (32 = 32 + 16x16 - 16x16)
rt = SAT.Q31(rt + ra.L*rb.L - ra.H*rb.H);
KMAXDS rt, ra, rb
Two Crossed “16x16” with 32-bit Signed Addition and Subtraction (32 = 32 + 16x16 - 16x16)
rt = SAT.Q31(rt + ra.H*rb.L - ra.L*rb.H);
KMSDA rt, ra, rb
Two “16x16” with 32-bit Signed Double Subtraction (32 = 32 - 16x16 - 16x16)
rt = SAT.Q31(rt - ra.H*rb.H - ra.L*rb.L);
KMSXDA rt, ra, rb
Two Crossed “16x16” with 32-bit Signed Double Subtraction (32 = 32 - 16x16 - 16x16)
rt = SAT.Q31(rt - ra.H*rb.L - ra.L*rb.H);
2.3.7.
Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Table 15. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMAL rt, ra, rb
“16 x 16” with 64-bit Signed Addition (64 = 64 + 16x16)
a64 = r[aU].r[aL]; t64 = a64 + rb.H*rb.L; r[tU].r[tL] = t64;
2.3.8.
Miscellaneous Instructions
There are 7 instructions here.
Table 16. Miscellaneous Instructions
Mnemonic
Instruction
Operation
SCLIP32 rt, ra, imm5u
Signed Clip Value
n = imm5u; rt = SAT.Qn(ra);
UCLIP32 rt, ra, imm5u
Unsigned Clip Value
m = imm5u; rt = SAT.Um(ra);
BITREV rt, ra, rb
Bit Reverse
w = rb[4:0]; rt = ra[0:31] u>> (31 - w);
BITREVI rt, ra, imm5u
Bit Reverse Immediate
w = imm5u; rt = ra[0:31] u>> (31 - w);
WEXT rt, ra, rb
Extract 32-bit from a 64-bit value
a64 = r[aU].r[aL]; lsb = rb[4:0]; rt = a64[(31+lsb):lsb];
WEXTI rt, ra, imm5u
Extract 32-bit from a 64-bit value Immediate
a64 = r[aU].r[aL]; lsb = imm5u; rt = a64[(31+lsb):lsb];
BPICK rt, ra, rb, rc
Bit-wise Pick
rt[i] = rc[i]? ra[i] : rb[i]; (i=31..0)
INSB rt, ra, imm2u
Insert Byte
insLSB = imm2u*8; rt[(insLSB+7):insLSB] = ra[7:0];
KABS Rd, Rs1
Absolute with Saturation
if (Rs1 == 0x80000000) { Rd = 0x7fffffff; OV = 1; } else { Rd = |Rs1|; }
UCLIP32 Rd, Rs1, imm5u
Clip Value
if (Rs1 > 2^imm5u - 1) { Rd = 2^imm5u - 1; OV = 1; } else if (Rs1 < 0) { Rd = 0; OV = 1; } else { Rd = Rs1; }
SCLIP32 Rd, Rs1, imm5u
Clip Value Signed
if (Rs1 > 2^imm5u - 1) { Rd = 2^imm5u - 1; OV = 1; } else if (Rs1 < -2^imm5u) { Rd = -2^imm5u; OV = 1; } else { Rd = Rs1; }
CLZ Rd, Rs1
Count leading zero
Rd = COUNT_ZERO_FROM_MSB(Rs1)
CLO Rd, Rs1
Count leading one
Rd = COUNT_ONE_FROM_MSB(Rs1)
MAX Rd, Rs1, Rs2
Return the larger signed value
Rd = signed-max(Rs1, Rs2)
MIN Rd, Rs1, Rs2
Return the smaller signed value
Rd = signed-min(Rs1, Rs2)
AVE Rd, Rs1, Rs2
Average two signed integers with rounding
Rd = (Rs1 + Rs2 + 1) (arith) >> 1
PBSAD Rd, Rs1, Rs2
Parallel Byte Sum of Absolute Difference
a = ABS(Rs1(7,0) - Rs2(7,0)); b = ABS(Rs1(15,8) - Rs2(15,8)); c = ABS(Rs1(23,16) - Rs2(23,16)); d = ABS(Rs1(31,24) - Rs2(31,24)); Rd = a + b + c + d;
PBSADA Rd, Rs1, Rs2
Parallel Byte Sum of Absolute Difference Accumulate
a = ABS(Rs1(7,0) - Rs2(7,0)); b = ABS(Rs1(15,8) - Rs2(15,8)); c = ABS(Rs1(23,16) - Rs2(23,16)); d = ABS(Rs1(31,24) - Rs2(31,24)); Rd = Rd + a + b + c + d;
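A few of the miscellaneous operations are sketched below in C. Several points are assumptions rather than statements from the table: the bytes in PBSAD are treated as unsigned, `ra[0:31]` in BITREV is read as the bit-reversed value of ra, and AVE forms its sum at full precision before the arithmetic shift. Function names are hypothetical.

```c
#include <stdint.h>

/* Sketch of PBSAD: sum of absolute differences of the four byte lanes
   (bytes assumed unsigned). */
uint32_t pbsad(uint32_t rs1, uint32_t rs2) {
    uint32_t sum = 0;
    for (unsigned i = 0; i < 4; i++) {
        int32_t a = (int32_t)((rs1 >> (8u * i)) & 0xFFu);
        int32_t b = (int32_t)((rs2 >> (8u * i)) & 0xFFu);
        sum += (uint32_t)(a > b ? a - b : b - a);
    }
    return sum;
}

/* Sketch of PBSADA: accumulate the byte SAD into the destination. */
uint32_t pbsada(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    return rd + pbsad(rs1, rs2);
}

/* Sketch of BITREV: reverse all 32 bits, then shift right by (31 - w),
   which leaves the low (w + 1) bits of ra in reversed order. */
uint32_t bitrev(uint32_t ra, uint32_t rb) {
    unsigned w = rb & 31u;
    uint32_t rev = 0;
    for (unsigned i = 0; i < 32; i++) {
        rev = (rev << 1) | ((ra >> i) & 1u);
    }
    return rev >> (31u - w);
}

/* Sketch of AVE: Rd = (Rs1 + Rs2 + 1) arithmetic>> 1, using a 64-bit sum. */
int32_t ave(int32_t rs1, int32_t rs2) {
    int64_t s = (int64_t)rs1 + (int64_t)rs2 + 1;
    return (int32_t)((s - (s & 1)) / 2);  /* floor(s / 2) */
}
```

BITREV with a small w is the form typically used for FFT index reordering, where only the low w + 1 address bits are reversed.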
2.3.9.
Q31 saturation Instructions
The following table lists instructions related to Q31 arithmetic.
Table 17. Q31 saturation ALU Instructions
Mnemonic
Instruction
Operation
KADDW Rt, Ra, Rb
Add with Q31 saturation.
Rt = SAT.Q31(Ra + Rb)
KSUBW Rt, Ra, Rb
Subtract with Q31 saturation.
Rt = SAT.Q31(Ra - Rb)
KSLRAW Rt, Ra, Rb
Logical left shift or arithmetic right shift with Q31 saturation.
(Rb[7:0] >= 0) ? Rt = SAT.Q31(Ra << Rb[7:0]) : Rt = (Ra s>> -Rb[7:0])
KDMBB Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in bottom parts of two registers.
Rt = SAT.Q31(Ra.H0 * Rb.H0)
KDMTB Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in top and bottom parts of two registers.
Rt = SAT.Q31(Ra.H1 * Rb.H0)
KDMBT Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in bottom and top parts of two registers.
Rt = SAT.Q31(Ra.H0 * Rb.H1)
KDMTT Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in top parts of two registers.
Rt = SAT.Q31(Ra.H1 * Rb.H1)
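The Q31 saturating add and subtract can be sketched in C by computing the exact result in 64 bits and then clamping. This assumes SAT.Q31 clamps to the signed 32-bit range [-2^31, 2^31 - 1]; the function names are hypothetical.

```c
#include <stdint.h>

/* SAT.Q31, assumed to clamp to [-2^31, 2^31 - 1]. */
static int32_t sat_q31(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Sketch of KADDW: Rt = SAT.Q31(Ra + Rb). */
int32_t kaddw(int32_t ra, int32_t rb) {
    return sat_q31((int64_t)ra + (int64_t)rb);
}

/* Sketch of KSUBW: Rt = SAT.Q31(Ra - Rb). */
int32_t ksubw(int32_t ra, int32_t rb) {
    return sat_q31((int64_t)ra - (int64_t)rb);
}
```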
2.3.10.
Q15 saturation instructions
The following table lists instructions related to Q15 arithmetic.
Table 18. Q15 saturation ALU Instructions
Mnemonic
Instruction
Operation
KADDH Rt, Ra, Rb
Add with Q15 saturation.
Rt = SAT.Q15(Ra + Rb)
KSUBH Rt, Ra, Rb
Subtract with Q15 saturation.
Rt = SAT.Q15(Ra - Rb)
KHMBB Rt, Ra, Rb
Multiply Q15 numbers in bottom parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H0 * Rb.H0) s>> 15)
KHMTB Rt, Ra, Rb
Multiply Q15 numbers in top and bottom parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H1 * Rb.H0) s>> 15)
KHMBT Rt, Ra, Rb
Multiply Q15 numbers in bottom and top parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H0 * Rb.H1) s>> 15)
KHMTT Rt, Ra, Rb
Multiply Q15 numbers in top parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H1 * Rb.H1) s>> 15)
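The Q15 multiplies differ only in which halfwords they read, so a single core routine can illustrate all four. The C sketch below assumes the saturated Q15 result is written to Rt sign-extended to 32 bits, which the table does not state. Only the case 0x8000 * 0x8000 actually saturates. Function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of v. */
static int32_t sx16(uint32_t v) {
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 65536 : (int32_t)v;
}

/* Core of KHMxy: SAT.Q15((a * b) s>> 15) for signed 16-bit a and b. */
static int32_t khm(int32_t a, int32_t b) {
    int32_t p = a * b;  /* |p| <= 2^30, so this fits in 32 bits */
    /* floor(p / 2^15), written without shifting a negative value */
    int32_t q = (p >= 0) ? (p >> 15) : -((-p + 32767) >> 15);
    if (q > 32767) return 32767;
    if (q < -32768) return -32768;
    return q;
}

/* Sketches of the four halfword selections. */
int32_t khmbb(uint32_t ra, uint32_t rb) { return khm(sx16(ra), sx16(rb)); }
int32_t khmtb(uint32_t ra, uint32_t rb) { return khm(sx16(ra >> 16), sx16(rb)); }
int32_t khmbt(uint32_t ra, uint32_t rb) { return khm(sx16(ra), sx16(rb >> 16)); }
int32_t khmtt(uint32_t ra, uint32_t rb) { return khm(sx16(ra >> 16), sx16(rb >> 16)); }
```

For instance, 0.5 * 0.5 in Q15 (0x4000 * 0x4000) yields 0x2000, which is 0.25.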
2.3.11.
Overflow status manipulation instructions
The following table lists the user instructions related to Overflow (OV) flag manipulation.
Table 19. OV (Overflow) flag Set/Clear Instructions
Mnemonic
Instruction
Operation
RDOV Rt
Read mxstatus.OV to Rt.
Rt = ZE32(mxstatus.OV)
CLROV
Clear mxstatus.OV flag.
mxstatus.OV = 0
2.4.
64-bit Instructions
2.4.1.
64-bit Addition & Subtraction Instructions
Table 20. 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
ADD64 rt, ra, rb
64-bit Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 + b64; r[tU].r[tL] = t64;
RADD64 rt, ra, rb
64-bit Signed Halving Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) s>>1; r[tU].r[tL] = t64;
URADD64 rt, ra, rb
64-bit Unsigned Halving Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) u>>1; r[tU].r[tL] = t64;
KADD64 rt, ra, rb
64-bit Signed Saturating Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 + b64); r[tU].r[tL] = t64;
UKADD64 rt, ra, rb
64-bit Unsigned Saturating Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 + b64); r[tU].r[tL] = t64;
SUB64 rt, ra, rb
64-bit Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 - b64; r[tU].r[tL] = t64;
RSUB64 rt, ra, rb
64-bit Signed Halving Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) s>>1; r[tU].r[tL] = t64;
URSUB64 rt, ra, rb
64-bit Unsigned Halving Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) u>>1; r[tU].r[tL] = t64;
KSUB64 rt, ra, rb
64-bit Signed Saturating Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 - b64); r[tU].r[tL] = t64;
UKSUB64 rt, ra, rb
64-bit Unsigned Saturating Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 - b64); r[tU].r[tL] = t64;
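Since no wider integer type is generally available for 64-bit saturation, a C model has to detect overflow before it happens. The sketch below illustrates this for a few of the 64-bit operations. It assumes that `r[xU].r[xL]` denotes a register pair with the upper word in the high 32 bits, that SAT.Q63 clamps to the signed 64-bit range, and that SAT.U64 clamps to [0, 2^64 - 1]. Function names are hypothetical.

```c
#include <stdint.h>

/* r[xU].r[xL]: combine an upper and a lower 32-bit register (assumed order). */
uint64_t reg_pair(uint32_t upper, uint32_t lower) {
    return ((uint64_t)upper << 32) | lower;
}

/* Sketch of KADD64: t64 = SAT.Q63(a64 + b64), checked before adding. */
int64_t kadd64(int64_t a, int64_t b) {
    if (b > 0 && a > INT64_MAX - b) return INT64_MAX;
    if (b < 0 && a < INT64_MIN - b) return INT64_MIN;
    return a + b;
}

/* Sketch of UKADD64: t64 = SAT.U64(a64 + b64). */
uint64_t ukadd64(uint64_t a, uint64_t b) {
    uint64_t s = a + b;               /* wraps on overflow */
    return (s < a) ? UINT64_MAX : s;  /* a wrap means the true sum exceeded 2^64 - 1 */
}

/* Sketch of UKSUB64: t64 = SAT.U64(a64 - b64). */
uint64_t uksub64(uint64_t a, uint64_t b) {
    return (a < b) ? 0 : a - b;
}

/* Sketch of URADD64: t64 = (a64 + b64) u>> 1, keeping the 65th sum bit. */
uint64_t uradd64(uint64_t a, uint64_t b) {
    return (a >> 1) + (b >> 1) + (a & b & 1u);
}
```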
2.4.2.
32-bit Multiply with 64-bit Add/Subtract Instructions
Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMAR64 rt, ra, rb
32x32 with 64-bit Signed Addition
c64 = r[tU].r[tL]; t64 = c64 + ra*rb; // signed r[tU].r[tL] = t64;
SMSR64 rt, ra, rb
32x32 with 64-bit Signed Subtraction
c64 = r[tU].r[tL]; t64 = c64 - ra*rb; // signed r[tU].r[tL] = t64;
UMAR64 rt, ra, rb
32x32 with 64-bit Unsigned Addition
c64 = r[tU].r[tL]; t64 = c64 + ra*rb; // unsigned r[tU].r[tL] = t64;
UMSR64 rt, ra, rb
32x32 with 64-bit Unsigned Subtraction
c64 = r[tU].r[tL]; t64 = c64 - ra*rb; // unsigned r[tU].r[tL] = t64;
KMAR64 rt, ra, rb
32x32 with Saturating 64-bit Signed Addition
c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 + ra*rb); r[tU].r[tL] = t64;
KMSR64 rt, ra, rb
32x32 with Saturating 64-bit Signed Subtraction
c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 - ra*rb); r[tU].r[tL] = t64;
UKMAR64 rt, ra, rb
32x32 with Saturating 64-bit Unsigned Addition
c64 = r[tU].r[tL]; t64 = SAT.U64(c64 + ra*rb); r[tU].r[tL] = t64;
UKMSR64 rt, ra, rb
32x32 with Saturating 64-bit Unsigned Subtraction
c64 = r[tU].r[tL]; t64 = SAT.U64(c64 - ra*rb); r[tU].r[tL] = t64;
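The multiply-accumulate forms above can be sketched in C with the 64-bit accumulator passed and returned as a single value rather than as a register pair. The sketch assumes SAT.Q63 and SAT.U64 clamp to the signed and unsigned 64-bit ranges, and that the non-saturating forms wrap modulo 2^64. Function names are hypothetical.

```c
#include <stdint.h>

/* Saturating signed 64-bit add, checked before the addition. */
static int64_t sat_add_q63(int64_t a, int64_t b) {
    if (b > 0 && a > INT64_MAX - b) return INT64_MAX;
    if (b < 0 && a < INT64_MIN - b) return INT64_MIN;
    return a + b;
}

/* Sketch of KMAR64: t64 = SAT.Q63(c64 + ra*rb). */
int64_t kmar64(int64_t c64, int32_t ra, int32_t rb) {
    return sat_add_q63(c64, (int64_t)ra * rb);
}

/* Sketch of KMSR64: t64 = SAT.Q63(c64 - ra*rb).
   |ra*rb| <= 2^62, so negating the product cannot overflow. */
int64_t kmsr64(int64_t c64, int32_t ra, int32_t rb) {
    return sat_add_q63(c64, -((int64_t)ra * rb));
}

/* Sketch of UMAR64: t64 = c64 + ra*rb (unsigned, assumed to wrap). */
uint64_t umar64(uint64_t c64, uint32_t ra, uint32_t rb) {
    return c64 + (uint64_t)ra * rb;
}

/* Sketch of UKMAR64: t64 = SAT.U64(c64 + ra*rb). */
uint64_t ukmar64(uint64_t c64, uint32_t ra, uint32_t rb) {
    uint64_t s = c64 + (uint64_t)ra * rb;
    return (s < c64) ? UINT64_MAX : s;
}
```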
2.4.3.
Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMALBB rt, ra, rb
“Bottom 16 x Bottom 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L; r[tU].r[tL] = t64;
SMALBT rt, ra, rb
“Bottom 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.H; r[tU].r[tL] = t64;
SMALTT rt, ra, rb
“Top 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H; r[tU].r[tL] = t64;
SMALDA rt, ra, rb
Two “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H + ra.L*rb.L; r[tU].r[tL] = t64;
SMALXDA rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L + ra.L*rb.H; r[tU].r[tL] = t64;
SMALDS rt, ra, rb
Two “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64;
SMALDRS rt, ra, rb
Two “16x16” with 64-bit Signed Addition and Reversed Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L - ra.H*rb.H; r[tU].r[tL] = t64;
SMALXDS rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64;
SMSLDA rt, ra, rb
Two “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64;
SMSLXDA rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64;
2.5.
Zero-Overhead Loop (ZOL) Mechanism Instructions
The following table lists the instructions in the Zero-Overhead Loop Mechanism.
Table 23. ZOL Mechanism Instructions
Mnemonic
Instruction
Operation
MTLBI imm16s
Move to Loop Begin register Immediate.
LB = PC + SE32(imm16s << 1)
MTLEI imm16s
Move to Loop End register Immediate.
LE = PC + SE32(imm16s << 1)
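The address computation shared by MTLBI and MTLEI can be sketched in C as below. The sketch assumes that PC is the address of the MTLBI or MTLEI instruction itself and that the addition wraps modulo 2^32; neither point is stated in the table. The function name is hypothetical.

```c
#include <stdint.h>

/* Sketch of the MTLBI/MTLEI target: PC + SE32(imm16s << 1). */
uint32_t zol_target(uint32_t pc, int16_t imm16s) {
    int32_t offset = (int32_t)imm16s * 2;  /* imm16s << 1, sign-extended */
    return pc + (uint32_t)offset;          /* modulo 2^32 (assumed) */
}
```

The immediate is a halfword offset, so the reachable range is roughly 64 KiB on either side of the instruction.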
3.
User-mode CSR Registers
3.1.
Loop Begin Register
Mnemonic Name: LB
Presence Condition: TBD
CSR Encoding: TBD
This register stores the beginning address of a loop and is used by the Zero-Overhead Loop mechanism. If its value is larger than the value of the Loop End (LE) register while the Zero-Overhead Loop mechanism is turned on, the behavior is UNPREDICTABLE. It is a 32-bit register with the following format:
Bits 31..0: LB(31,0)
Bit 0 is ignored by hardware; setting it to 1 does not generate an “unaligned” exception. The register is read/write (RW), and its reset value is DC (Don’t Care).
3.2.
Loop End Register
Mnemonic Name: LE
Presence Condition: TBD
CSR Encoding: TBD
This register stores the ending address of a loop and is used by the Zero-Overhead Loop mechanism. If its value is smaller than the value of the Loop Begin (LB) register while the Zero-Overhead Loop mechanism is turned on, the behavior is UNPREDICTABLE. It is a 32-bit register with the following format:
Bits 31..0: LE(31,0)
Bit 0 is ignored by hardware when it compares the value of the program counter with the value of this register. The register is read/write (RW), and its reset value is DC (Don’t Care).
3.3.
Loop Count Register
Mnemonic Name: LC
Presence Condition: TBD
CSR Encoding: TBD
This register holds the loop count, that is, the number of times the Zero-Overhead Loop operation will be performed. The Zero-Overhead Loop mechanism is turned on when the value of LC is greater than 1. It is a 32-bit register with the following format:
Bits 31..0: LC(31,0)
The register is read/write (RW), and its reset value is 0.
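The register rules in this chapter can be collected into a few C predicates, for example as checks in a simulator or test bench. This is a sketch only: it encodes just the conditions stated for LB, LE, and LC, compares LB and LE as raw register values, and does not model what the hardware does when the program counter reaches LE. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* The ZOL mechanism is turned on when LC is greater than 1. */
bool zol_enabled(uint32_t lc) {
    return lc > 1u;
}

/* Behavior is UNPREDICTABLE if LB is larger than LE while the mechanism
   is on; this predicate returns true for every other configuration. */
bool zol_setup_predictable(uint32_t lb, uint32_t le, uint32_t lc) {
    return !zol_enabled(lc) || lb <= le;
}

/* Bit 0 of LB and LE is ignored by hardware. */
uint32_t zol_effective_addr(uint32_t reg) {
    return reg & ~UINT32_C(1);
}
```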