Errata for "The SPARC Architecture Manual, Version 9"

The following is a list of corrections for known errors in "The SPARC Architecture Manual, Version 9" book. Page number references are taken from the 1st (1994) printing, R1.4.2 (revision 1.4.2, identified by the text "SAV09R1429309" inside the front cover), unless otherwise indicated.



[1] Page 212 (second page of the Read State Register description), 4th paragraph from the top is printed as:

"RDFPRS waits for all pending FPops to complete before reading the FPRS register."

It *should* read:

"RDFPRS waits for all pending FPops **and loads of floating-point registers** to complete before reading the FPRS register."

[2] Page 234 (Tagged Add):
The "op3" column is incorrect in the Opcode table; the low-order bit should be "0" for all Tagged-Add instructions. The table should read:

Opcode op3 Operation
TADDcc 10 0000 ...  
TADDccTV 10 0010 ...  

[3] Page 80 (subsection 6.3.6.4, RESTORED description):
In the last line of 6.3.6.4, change:
CLEANWIN < NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))

[4] Page 216 (RESTORED):
Third paragraph, last sentence, change
CLEANWIN != NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))

[5] Page 76, Section 6.3.4.[12] (branches):
A *taken* conditional branch (not just a conditional branch) should have been referred to in the last sentences of two subsections.

Change the last sentence in 6.3.4.1, "Conditional Branches", to:

Note that the annul behavior of a taken conditional branch is different from that of an unconditional branch.

And change the last sentence in 6.3.4.2, "Unconditional Branches" to:

Note that the annul behavior of an unconditional branch is different from that of a taken conditional branch.

[6] Page 290, Section G, Table 43:
In the table entries for "cas", "casl", "casx", and "casxl", the built-in constant names beginning with "ASI" should all be preceded by "#" (as they were correctly specified on p.286).

[7] Page 242, Write State Register page:
In the Exceptions section:
"WRASR with rs1=16..31"
should read:

"WRASR with rd=16..31".

[8] Page 57, subsection 5.2.10 (Register-Window State Registers):
A clarification has been added to Section 5, to allow an implementation with 16 or fewer register windows the option to implement the CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers with fewer than 5 bits each, if desired. The following text was added:

IMPL. DEP. #126: Privileged registers CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain values in the range 0..NWINDOWS-1. The effect of writing a value greater than NWINDOWS-1 to any of these registers is undefined. Although the width of each of these five registers is nominally 5 bits, the width is implementation-dependent and shall be between ceil(log2(NWINDOWS)) and 5 bits, inclusive. If fewer than 5 bits are implemented, the unimplemented upper bits shall read as 0 and writes to them shall have no effect. All five registers shall be the same width.
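
(Illustrative sketch, not part of the errata: the width rule in IMPL. DEP. #126 can be checked with a few lines of Python. The function names are hypothetical. The sketch models only the stated rules: the width lies between ceil(log2(NWINDOWS)) and 5 bits, and unimplemented upper bits read as 0 and ignore writes. It does not model writes of values greater than NWINDOWS-1, whose effect is undefined.)

```python
def min_window_reg_width(nwindows):
    """Smallest permitted width in bits: ceil(log2(NWINDOWS))."""
    # For nwindows >= 1, (nwindows - 1).bit_length() equals ceil(log2(nwindows)).
    return (nwindows - 1).bit_length()


def is_valid_width(nwindows, width):
    """True if `width` lies between ceil(log2(NWINDOWS)) and 5 bits, inclusive."""
    return min_window_reg_width(nwindows) <= width <= 5


def write_window_reg(value, width):
    """Value retained after a write when only `width` low bits are implemented.

    Unimplemented upper bits read as 0 and writes to them have no effect,
    so only the low `width` bits survive.
    """
    return value & ((1 << width) - 1)
```

For example, an implementation with NWINDOWS = 8 may use registers as narrow as 3 bits, and a write of 0b10101 to such a register would read back as 0b101.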

[9] Page 268, Table 32:
As a privileged instruction, "RDPR" should be listed with a trailing superscript "P".

[10] Pages 58-59, subsection 5.2.10 (Register-Window State Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers that the effect of writing a value to them greater than NWINDOWS-1 is undefined.

[11] Page 81:
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".

[12] Page 81:
In section 6.3.9, a sentence was added stating that FSR.cexc and FSR.ftt are cleared by FMOVcc and FMOVr whether or not the move occurs.

[13] Page 171, Appendix A: Sentence added specifying that LDFSR does not affect the upper 32 bits of FSR.

[14] Page 220(r142)/A.49(r142), third paragraph:
The words "the" and "and" were transposed in the implementation dependency description. It now reads: "The location of the SIR_enable control flag and the means of accessing the SIR_enable control flag..."

[15] Page 229, paragraph beginning "Store integer...": "...used for the load..." changed to "...used for the store...".

[16] Page 231, Appendix A: Corrected SWAP deprecation note to recommend use of "CASA" or "CASXA" (not "CASX") in place of SWAP.

[17] Page 258, D.3.3., rule (1): The text was clarified, to read "(1) The execution of Y is conditional on X, and S(Y) is true."

[18] Page 312, Appendix I:
Missing word "not" added to Compatibility Note: "The coprocessor opcodes were eliminated because they have not been used in SPARC-V7 and SPARC-V8, ..."

[19a] Page 195, Appendix A:
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

"movre" and "movrz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRZ.

"movrne" and "movrnz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRNZ.

[19b] Page 228, Appendix A:

Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

[20] Page 241, Appendix A:
Added footnote to Suggested Assembly Language Syntax table, noting that the suggested syntax for WRASR with rd=16..31 may vary, citing reference to implementation dependency #48.

(Suggested Assembly Language Syntax is just that -- *suggested* -- so isn't part of the architecture specification anyway, but this change makes it clearer that if bits are interpreted differently in the instruction, one should expect its assembly-language syntax to change, as well)

[21] Page 40, Table 7:
Changed leftmost column text as follows:
"Single" to "Single f.p. or 32-bit integer"
"Double" to "Double f.p. or 64-bit integer"
"Quad" to "Quad f.p."

Corrections 22-57 were incorporated into R1.4.5, Dec 1999, which was to be used for the 2nd printing of the book. R1.4.5 (revision 1.4.5) can be identified by the text "SAV09R1429309" inside the front cover of the book. These corrections also appear in all subsequent revisions.

[22] p.13, subsection 2.57, definition of "reserved":

Wording:
"...intended to run on future version of"
was corrected to read:
"...intended to run on future versions of".

The sentence beginning "Reserved register fields" was amended to read: "Reserved register fields should always be written by software with values of those fields previously read from that register, or with zeroes; they should read as zero in hardware."

[23] p.21(r142), Editor's Notes:
Added Les Kohn's name to the Acknowledgements.

[24] p.28(r142), Tables 3,4,5:
Made the use of hyphens and dashes consistent and easier to read.

[25] p.30(r142), paragraph just above subsection 5.1:
Changed end of sentence to read:

"...should be written with the values of those bits previously read from that register, or with zeroes."

[26] p.40(r142), Table 7:
Added lines for 32-bit and 64-bit signed integers in f.p. registers, for clarity.

[27] p.51, Figure 17:
Added bits 11..10 to the figure, so it looks like:

PID1  PID0  CLE  TLE  MM    RED  PEF  AM  PRIV  IE  AG
11    10    9    8    7..6  5    4    3   2     1   0
\________/ (changed here)
(see also Errata #28, #29, and #53)

[28] p.52(r142), inserted new subsection 5.2.1.1 before old one:
"IMPL. DEP. #127: The presence and semantics of PSTATE.PID1 and PSTATE.PID0 are implementation-dependent. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes. See also TSTATE bits 19..18."
(see also Errata #27, #29, and #53)

[29] p.55(r142), Figure 22, (TSTATE register):
Extended the "saved PSTATE" field up through bit 19 of TSTATE; changed the diagram to look like:

...   ASI from TL=x   ---      PSTATE from TL=x   ...
      31..24          23..20   19..8
                               \_______/ (changed here)
(see also Errata #27, #28, and #53)
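
(Illustrative sketch, not part of the errata: with the saved PSTATE field occupying TSTATE bits 19..8, PSTATE bit n lands in TSTATE bit n+8, so PSTATE bits 11 and 10 map to TSTATE bits 19 and 18. The function and parameter names below are hypothetical. Only the PSTATE field of TSTATE is modeled.)

```python
PSTATE_FIELD_SHIFT = 8   # saved PSTATE occupies TSTATE bits 19..8
PID1_BIT = 1 << 11       # PSTATE.PID1
PID0_BIT = 1 << 10       # PSTATE.PID0


def saved_pstate_field(pstate, pid1_implemented=True, pid0_implemented=True):
    """Return the TSTATE bits 19..8 that hold the PSTATE of the previous trap level.

    If a PID bit is not implemented in PSTATE, the matching TSTATE bit
    (19 for PID1, 18 for PID0) reads as zero.
    """
    value = pstate & 0xFFF              # PSTATE bits 11..0
    if not pid1_implemented:
        value &= ~PID1_BIT
    if not pid0_implemented:
        value &= ~PID0_BIT
    return value << PSTATE_FIELD_SHIFT
```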

[30] p.56(r142):
Added a new paragraph to the end of subsection 5.2.6:

"TSTATE bits 19 and 18 are implementation-dependent. ImplDep#127: If PSTATE bit 11 (10) is implemented, TSTATE bit 19 (18) shall be implemented and contain the state of PSTATE bit 11 (10) from the previous trap level. If PSTATE bit 11 (10) is not implemented, TSTATE bit 19 (18) shall read as zero. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes."

[31] p.57(r142), subsection 5.2.10 (Register-Window State Registers): Added implementation dependency #126:

IMPL. DEP. #126: Privileged registers CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain values in the range 0..NWINDOWS-1. The effect of writing a value greater than NWINDOWS-1 to any of these registers is undefined. Although the width of each of these five registers is nominally 5 bits, the width is implementation-dependent and shall be between ceil(log2(NWINDOWS)) and 5 bits, inclusive. If fewer than 5 bits are implemented, the unimplemented upper bits shall read as 0, and writes to them shall have no effect. All five registers should have the same width.
(see also Errata #54)

[32] pp.58-9(r142), subsection 5.2.10 (Register-Window State Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers that the effect of writing a value to them greater than NWINDOWS-1 is undefined.

[33] p.76, Section 6:
Last sentence in 6.3.4.1, "Conditional Branches" changed to:

Note that the annul behavior of a taken conditional branch is different from that of an unconditional branch.

And the last sentence in 6.3.4.2, "Unconditional Branches" changed to:

Note that the annul behavior of an unconditional branch is different from that of a taken conditional branch.

[34] p.80(r142), 6.3.6.4(r142), RESTORED:
(duplicate of Erratum #3)

[35] p.81(r141/r142):
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".

[36] p.81(r141/r142):
In section 6.3.9, a sentence was added describing the clearing of FSR.cexc and FSR.ftt by the conditional moves FMOVcc and FMOVr:

FMOVcc and FMOVr instructions clear these FSR fields regardless of the value of the conditional predicate.

[37] p.121(r141/r142):
An index entry for "non-faulting loads" was fixed in section 8.3.

[38] p.151(r142), A.9(r142), Compare and Swap page:
Added mention of CASL and CASXL to the Programming Note:

Compare and Swap Little (CASL) and Compare and Swap Extended Little (CASXL) synthetic instructions are available for "little endian" memory accesses.

[39] p.171, Appendix A, "Load Floating-Point":
Sentence added:
The upper 32 bits of FSR are unaffected by LDFSR.

[40] p.181(r141/r142):
Section number "A.31" was fixed so it now increments to A.32. All following section numbers and odd page headers in Appendix A have changed.

[41] p.191(r141/r142):
Misspelling corrected in page heading: "Condition" --> "Condition"

[42] p.195(r141/r142), "Move Integer Register on Register Condition (MOVR)":
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

"movre" and "movrz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRZ.

"movrne" and "movrnz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRNZ.

[43] p.212(r14[123]) A.43(r14[12])/A.44(r144):
(duplicate of Erratum #1)

[44] p.216(r142), A.46(r142), RESTORED page: (duplicate of Erratum #4)

[45] (duplicate of Erratum #14)

[46] p.228(r141/r142):
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

[47] (duplicate of erratum #15)

[48] p.231(r142)/233(r144), Appendix A:
Corrected SWAP deprecation note to recommend use of "CASA" or "CASXA" (not "CASX") in place of SWAP.

[49] p.234, A.58(r14[12])/A.59(r144), Tagged Add:
op3 opcodes are wrong. Both should have "0" for low-order bit (as is correctly specified in Appendix E).

[50] p.241(r142), A.62(r142), Write State Register page:
(duplicate of Erratum #20)

[51] p.242(r142), A.62(r142), Write State Register page:
(duplicate of Erratum #7)

[52] p.253(r142), Appendix C:
Fixed 6 incorrect index entries.

[53] p.253(r142), Appendix C:
Added a new Implementation Dependency:

# Cat Def/Ref Description
127 f 52, 56 The presence and semantics of PSTATE.PID1 and PSTATE.PID0 are implementation-dependent. The presence of TSTATE bits 19 and 18 is implementation-dependent. If PSTATE bit 11 (10) is implemented, TSTATE bit 19 (18) shall be implemented and contain the state of PSTATE bit 11 (10) from the previous trap level. If PSTATE bit 11 (10) is not implemented, TSTATE bit 19 (18) shall read as zero. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes.
(see also Errata #27, #28, and #29)

[54] p.255(r142), Appendix C:
Added implementation dependency #126.
(see correction #31 above for the text of implementation dependency #126)

[55] p.258(r142), D.3.3., rule (1):
(duplicate of Erratum #17)

[56] p.268(r142), Table 32:
(duplicate of Erratum #9)

[57] p.290(r142), Section G, Table 43:
(duplicate of Erratum #6)

[58] In Figure 3 in Chapter 6 (p.62), the 4th format description from the bottom of the page (op,rd,op3,rs1,i=0,--,rs2) contains an error; "i=0" should read "i=1".

[59] In section 6.3.1, "Memory Access Instructions", on p.67,
"and CAS accesses words or doublewords." should be amended to read: "CASA accesses words, and CASXA accesses doublewords."

[60] In section 7.7, p. 111, the async_data_error exception description should be updated to read as follows:

async_data_error [tt = 0x040] (Precise, Deferred, or Disrupting) -- An implementation-dependent exception (impl. dep. #31) that indicates that one or more unrecoverable or uncorrectable but recoverable errors have been detected in the processor. This may include errors detected in the architectural registers (general-purpose registers, floating-point registers, ASRs, or ASI registers) and other core processor hardware. A single async_data_error exception may indicate multiple errors and may occur asynchronously to instruction execution. An async_data_error exception may cause a precise, deferred, or disrupting trap. When async_data_error causes a disrupting trap, the TPC and TNPC stacked by the trap do not necessarily indicate the instruction or data access that caused the error.

[61] The following text should be added to the second paragraph of section A.27 (p.176), to clarify the behavior of a little-endian doubleword load (LDD):

With respect to little endian memory, an LDD instruction behaves as if it is composed of two 32-bit loads, each of which is byte swapped independently before being written into each destination register.
(see also Errata #62, #71, and #72)
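
(Illustrative sketch, not part of the errata: the "two independent 32-bit loads" behavior can be modeled in Python. The function name is hypothetical. The sketch assumes, as for a normal LDD, that the word at the effective address goes to the even-numbered register and the word at address+4 goes to the odd-numbered register.)

```python
import struct


def ldd_little_endian(mem, addr):
    """Model an LDD from little-endian memory.

    Returns (even_reg, odd_reg). Each 32-bit word is byte-swapped on its own;
    the two words are NOT exchanged with each other, which is what a single
    64-bit little-endian load would effectively do.
    """
    even_reg, odd_reg = struct.unpack_from("<II", mem, addr)
    return even_reg, odd_reg
```

With memory bytes 00 01 02 03 04 05 06 07 at the effective address, the even register receives 0x03020100 and the odd register receives 0x07060504.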

[62] The following text should be added to the second paragraph of section A.28 (p.178), to clarify the behavior of a little-endian doubleword load from alternate space (LDDA):

With respect to little endian memory, an LDDA instruction behaves as if it is composed of two 32-bit loads, each of which is byte swapped independently before being written into each destination register.
(see also Errata #61, #71, and #72)

[63] In the Index, p.354, the "signal monitor instruction" index entry should instead read "software initiated reset (SIR) instruction".

[64] There is an error in the definition of CLEANWIN (p.59) and the SAVE instruction that, in some cases, allows the locals of the "invalid" window not to be cleaned (zeroed) when the window is allocated by a SAVE instruction.

A software workaround (used in the Solaris operating system and perhaps others), to keep user registers clean of kernel data, involves the use of an extra %wstate value. When the kernel returns to user code, it sets %wstate to the new value. The new trap table entry for spills with that %wstate value spills the window as usual but also backs up a window and performs the missing "clean" operation. The spill handler then sets %wstate back to the default value for a user process.

[65] In Chapter 7, "Traps", it is implied (but not explicitly stated) that the value of PSTATE.TLE is preserved during traps that cause entry into RED_state and during XIR, WDR, and SIR resets. However, PSTATE.TLE may be left in an undefined state by one of those events. The correction, which applies to sections 7.6.2.1 (p.106), 7.6.2.3 (p.108), 7.6.2.4 (p.109), and 7.6.2.5 (p.110), is to change the little-endian mode settings from:

PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
to:
PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
PSTATE.TLE <-- undefined

[66] In Chapter 5, section 5.1.7.9 (p.48), the last sentence of the third paragraph is inaccurate. The entire third paragraph should be replaced with:

Floating-point operations which cause an overflow or underflow condition may also cause an "inexact" condition. For overflow and underflow conditions, FSR.cexc bits are set and trapping occurs as follows:

o If an IEEE 754 overflow condition occurs:

-- if TEM.OFM=0 and TEM.NXM=0, the cexc.ofc and cexc.nxc bits are both set to 1, the other three bits of cexc are set to 0, and an IEEE_754_exception trap does *not* occur.

-- if TEM.OFM=0 and TEM.NXM=1, the cexc.nxc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

-- if TEM.OFM=1, the cexc.ofc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

o If an IEEE 754 underflow condition occurs:

-- if TEM.UFM=0 and TEM.NXM=0, the cexc.ufc and cexc.nxc bits are both set to 1, the other three bits of cexc are set to 0, and an IEEE_754_exception trap does *not* occur.

-- if TEM.UFM=0 and TEM.NXM=1, the cexc.nxc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

-- if TEM.UFM=1, the cexc.ufc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

The above behavior is summarized in the following table (x = don't-care):

Conditions                              Results
--------------------------------------  --------------------------------------------
Exception(s)       Trap Enable          fp_exception_   Current Exception
Detected in        Mask Bits            ieee_754 Trap   Bits (in FSR.cexc)
f.p. operation     (in FSR.TEM)         Occurs?
-----------------  -------------------  --------------  -------------------
of    uf    nx     OFM   UFM   NXM                      ofc   ufc   nxc      Notes
---   ---   ---    ---   ---   ---      -------         ---   ---   ---      -----
-     -     -      x     x     x        no              0     0     0
-     -     *      x     x     0        no              0     0     1
-     *     *      x     0     0        no              0     1     1        (1)
*     -     *      0     x     0        no              1     0     1        (2)

-     -     *      x     x     1        yes             0     0     1
-     *     *      x     0     1        yes             0     0     1
-     *     -      x     1     x        yes             0     1     0
-     *     *      x     1     x        yes             0     1     0
*     -     *      1     x     x        yes             1     0     0        (2)
*     -     *      0     x     1        yes             0     0     1        (2)

(1) When the underflow trap is disabled (UFM=0), underflow is always accompanied by inexact.

(2) Overflow is always accompanied by inexact.

(see also Errata #67, #68, and #69)
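
(Illustrative sketch, not part of the errata: the rules above can be written as a small decision function. The function name and the tuple layout of the result are hypothetical. Input combinations that do not appear in the table are rejected rather than guessed at.)

```python
def cexc_result(of, uf, nx, ofm, ufm, nxm):
    """Return (trap_occurs, ofc, ufc, nxc) for one floating-point operation.

    of/uf/nx: overflow, underflow, inexact detected (booleans).
    ofm/ufm/nxm: the FSR.TEM mask bits (0 or 1).
    """
    if of and uf:
        raise ValueError("overflow together with underflow is not in the table")
    if of:
        if not nx:
            raise ValueError("overflow is always accompanied by inexact")
        if ofm:
            return (True, 1, 0, 0)
        if nxm:
            return (True, 0, 0, 1)
        return (False, 1, 0, 1)
    if uf:
        if ufm:
            return (True, 0, 1, 0)
        if not nx:
            raise ValueError("with UFM=0, underflow is always accompanied by inexact")
        if nxm:
            return (True, 0, 0, 1)
        return (False, 0, 1, 1)
    if nx:
        return (bool(nxm), 0, 0, 1)
    return (False, 0, 0, 0)
```

For instance, cexc_result(True, False, True, 0, 0, 1) models an overflow with OFM=0 and NXM=1: a trap occurs and only nxc is set.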

[67] In Appendix B, section B.3 (p.245), the first paragraph:

"Underflow occurs if the exact unrounded result has magnitude
between zero and the smallest normalized number in the
destination format."

should be replaced by the following two paragraphs:

"On an implementation that detects tininess before rounding, trapped underflow occurs when the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format.

On an implementation that detects tininess after rounding, trapped underflow occurs when the result, if it was rounded to a hypothetical format having the same precision as the destination but of unbounded range, would have magnitude between zero and the smallest normalized number in the actual destination format."

(see also Errata #66, #68, and #69)

[68] In Appendix B, section B.4 (p.245), the first two paragraphs:

The first paragraph:

"Underflow occurs if the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format, *and* the correctly rounded result in the destination format is inexact."

should be replaced by the following paragraph:

"On an implementation that detects tininess before rounding, untrapped underflow occurs when the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format, *and* the correctly-rounded result in the destination format is inexact."

And the beginning of the second paragraph:
"Table 28 summarizes what happens when an exact ..."
should be modified to read:
"Table 28 summarizes what happens on an implementation that detects tininess before rounding, when an exact ..."
(see also Errata #66, #67, and #69)

[69] In Appendix B, Table 28, "Untrapped Floating-Point Underflow" (p.245): Table 28 (and its footnote) should be replaced by the following revised table and text:

Table 28: Untrapped Floating-Point Underflow (Tininess Detected Before Rounding)

                               Underflow trap mask:   UFM=1   UFM=0   UFM=0
                               Inexact trap mask:     NXM=x   NXM=1   NXM=0
-----------------------------------------------------------------------------
u = r     r is minimum normal                         none    none    none
          r is subnormal                              UF      none    none
          r is zero                                   none    none    none

u != r    r is minimum normal                         UF      NX      uf nx
          r is subnormal                              UF      NX      uf nx
          r is zero                                   UF      NX      uf nx

UF = IEEE_754_exception trap with cexc.ufc=1
NX = IEEE_754_exception trap with cexc.nxc=1

uf = cexc.ufc=1, aexc.ufa=1, no IEEE_754_exception trap
nx = cexc.nxc=1, aexc.nxa=1, no IEEE_754_exception trap

In an implementation that detects tininess after rounding, Table 28 applies to a narrower range of values of the exact unrounded result u. The precise bounds depend on the rounding direction specified in FSR.RD, as follows:

o Let m denote the smallest normalized number and e the absolute difference between 1 and the next larger representable number in the destination format. Then the bounds on u for which Table 28 applies are:

FSR.RD   Rounding Toward   Range of Values of u
------   ---------------   ------------------------
0        nearest           |u| < m(1 - e/4)
1        0                 |u| < m
2        +infinity         -m < u <= m(1 - e/2)
3        -infinity         -m(1 - e/2) <= u < m

o When u lies outside these ranges, underflow does not occur,
although an inexact exception still occurs when u != r, the rounded value.
(see also Errata #66, #67, and #68)
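
(Illustrative sketch, not part of the errata: the bounds on u can be evaluated exactly with rational arithmetic. The function name is hypothetical, and the +infinity bound is read here as m(1 - e/2), by symmetry with the -infinity row. In the usage note, m = 2^-126 and e = 2^-23 are the single-precision values and are used only as an example.)

```python
from fractions import Fraction


def table28_applies(u, rd, m, e):
    """True if Table 28 applies to exact result u when tininess is detected after rounding.

    rd: FSR.RD rounding direction (0 nearest, 1 toward 0, 2 +infinity, 3 -infinity).
    m:  smallest normalized number in the destination format.
    e:  absolute difference between 1 and the next larger representable number.
    """
    u, m, e = Fraction(u), Fraction(m), Fraction(e)
    if rd == 0:
        return abs(u) < m * (1 - e / 4)
    if rd == 1:
        return abs(u) < m
    if rd == 2:
        return -m < u <= m * (1 - e / 2)
    if rd == 3:
        return -m * (1 - e / 2) <= u < m
    raise ValueError("FSR.RD must be 0..3")
```

With m = Fraction(1, 2**126) and e = Fraction(1, 2**23), the value u = m(1 - e/8) falls inside the range for rounding toward 0 and toward -infinity, but outside it for rounding to nearest and toward +infinity.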

[70] In Appendix A, section A.40, "No Operation" (p.204):
For clarity, in the instruction format diagram the term "op" should be replaced by five zeroes.

[71] In Appendix A, section A.53, "Store Integer" (p.227):
The following paragraph should be added near the end of the Description subsection, prior to the Programming Note, to clarify the behavior of a little-endian doubleword store (STD):

"With respect to little-endian memory, a STD instruction behaves as if it is composed of two 32-bit stores, each of which is byte-swapped independently before being written into its respective destination memory word."

(see also Errata #61, #62, and #72)
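
(Illustrative sketch, not part of the errata: the store case mirrors the load case. The function name is hypothetical. The sketch assumes, as for a normal STD, that the even-numbered register is stored at the effective address and the odd-numbered register at address+4.)

```python
import struct


def std_little_endian(even_reg, odd_reg):
    """Model the 8 bytes an STD writes to little-endian memory.

    Each 32-bit register value is byte-swapped independently; the word
    from the even register still comes first in memory.
    """
    return struct.pack("<II", even_reg & 0xFFFFFFFF, odd_reg & 0xFFFFFFFF)
```

Storing 0x03020100 (even register) and 0x07060504 (odd register) would produce the memory bytes 00 01 02 03 04 05 06 07.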

[72] In Appendix A, section A.54, "Store Integer Into Alternate Space" (p.229):
The following paragraph should be added near the end of the Description subsection, prior to the Programming Note, to clarify the behavior of a little-endian doubleword store to alternate space (STDA):

"With respect to little-endian memory, a STDA instruction behaves as if it is composed of two 32-bit stores, each of which is byte-swapped independently before being written into its respective destination memory word."

(see also Errata #61, #62, and #71)

[73] In Chapter 7, pp.101-102: reference is made in two places to a range of trap priorities, with 0 as the highest priority and 31 as the lowest.
Architecturally, there are no absolute trap priorities (only relative trap priorities) and there is no specific limit to trap priority numbers. Trap priorities are only used by a processor to choose which exception will cause a trap at any given time; a trap priority is an ordinal number which need not be stored anywhere. Therefore, the following changes should be noted:

Caption above Table 15, p.101:
     Change:
       "0 = Highest; 31 = Lowest"
     to:
       "0 = Highest"

Text of first paragraph of section 7.5.3 on p.102:
     Change:
       "Priority 0 is highest, priority 31 is lowest; that is, if......."
     to:
       "A trap priority is an ordinal number, with 0 indicating the highest priority and greater priority numbers indicating decreasing priority; that is, if......"
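
(Illustrative sketch, not part of the errata: because a trap priority is only an ordinal used to choose among pending exceptions, selection reduces to taking the smallest number. The function name, the exception names, and the priority values below are hypothetical examples, including one priority above 31 to show that no upper limit is assumed.)

```python
def select_trap(pending):
    """Pick the exception that causes the trap: the one with the lowest priority number.

    pending: iterable of (priority, name) pairs; 0 is the highest priority
    and there is no fixed upper bound on priority numbers.
    """
    return min(pending, key=lambda entry: entry[0])


# Hypothetical pending exceptions with example priority numbers.
pending = [(11, "example_fp_exception"), (7, "example_illegal_instruction"), (40, "example_impl_trap")]
chosen = select_trap(pending)
```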

[74] In Chapter 7, page 88, Figure 37 "Processor State Diagram", the following corrections should be made in the figure:

-- all references to "Trap" should be changed to "nrt"

-- add to the caption the words:
    ("nrt" = "non-reset trap")

-- "or SIR" should be added to the label on the center topmost arc in the diagram, so that it reads "nrt or SIR @ TL = MAXTL"

-- The references to "RED = 1" and "RED = 0" should be changed to "PSTATE.red <-- 1" and "PSTATE.red <-- 0", respectively, for clarity.

-- Under the arc from "execute_state" to "RED_state", the label currently reads:

    "Trap or SIR @ TL < MAXTL, RED=1".

   The words "Trap or" should be removed, so that it reads:

    "SIR @ TL < MAXTL, RED <-- 1"
     
A related change should be made on the first page of Chapter 7 (p.87) to the definition of "trap" in the paragraph beginning "Thus, an exception is...". The words:
  "...in response to the presence of an exception, interrupt, reset, or Tcc instruction"
should be changed to:
  "...in response to the presence of an exception, interrupt, reset, or Tcc instruction"
The same change should be made to the definition of "trap" in section 2.66 on p.13.

[75] In Chapter 6, p. 76, Table 13, the four rows with "B" in the leftmost cell should be more clearly labelled with Branch Always (BA) and Branch Never (BN) abbreviations, as follows:

  BA  
  BN  
  BA  
  BN  
Correspondingly, at the top of p.75:
  always or never taken, represented in the table by "B"
should be replaced by:
  always or never taken, represented in the table by "BA" and "BN", respectively

[76] In Chapter 6, p.63, Figure 34, the Format(4) diagram for Tcc (the one including "sw_trap_#", the third one from the bottom) is incorrect. That diagram should be deleted and replaced by a copy of the two Format-4 diagrams from the Tcc instruction page (p. 237).

[77] The contents of Chapter 6, section 6.3.11, p.82, should be replaced by the following:
If a conforming SPARC V9 implementation attempts to execute an instruction that is not specifically defined in this specification, it behaves as follows:

o If the instruction encodes an implementation-specific extension to the instruction set, that extension is executed.
     
o If the instruction does not encode an extension to the instruction set, but would decode as a valid instruction if nonzero bits in reserved instruction field(s) were ignored (read as 0):
    -- the recommended behavior is to generate an illegal_instruction exception (or, in the FPop opcode space, an fp_exception_other exception with FSR.ftt = 3 (unimplemented_FPop))
    -- alternatively, the implementation can ignore the nonzero reserved field bits and execute the instruction as if those bits had been zero.
     
o If the instruction does not encode an extension to the instruction set and would still not decode as a valid instruction if nonzero bits in reserved instruction field(s) were ignored, then the instruction is invalid and causes an exception. Specifically, attempting to execute an invalid instruction in the FPop opcode space causes an fp_exception_other trap (with FSR.ftt = unimplemented_FPop); attempting to execute any other invalid instruction causes an illegal_instruction trap.
     
See Appendix E, "Opcode Maps", for an enumeration of reserved opcodes.

 

Implementation Note:

  As described above, implementations are strongly encouraged, but not strictly required, to trap on nonzero values in reserved instruction fields.
     

Programming Note:

  For software portability, software (such as assemblers, static compilers, and dynamic compilers) that generates SPARC instructions must always generate zeroes in instruction fields marked "reserved" ("--").
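
(Illustrative sketch, not part of the errata: the three cases of the revised section 6.3.11 can be expressed as a decision function. The function name, parameter names, and returned strings are hypothetical. The `traps_on_reserved` flag models the implementation's choice between the recommended behavior and the permitted alternative.)

```python
def undefined_instruction_action(is_extension, valid_if_reserved_ignored,
                                 in_fpop_space, traps_on_reserved=True):
    """Return the action for an instruction not specifically defined in the specification."""
    # Case 1: an implementation-specific extension is simply executed.
    if is_extension:
        return "execute extension"
    # Case 2 (alternative): ignore nonzero reserved bits and execute normally.
    if valid_if_reserved_ignored and not traps_on_reserved:
        return "execute as if reserved bits were zero"
    # Case 2 (recommended) and case 3: raise the exception for the opcode space.
    if in_fpop_space:
        return "fp_exception_other (FSR.ftt = unimplemented_FPop)"
    return "illegal_instruction"
```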

[78] In Appendix A, p.131, numbered bullet point (2), third sentence:

Currently reads:
  If a conforming SPARC-V9 implementation encounters nonzero values in these fields, its behavior is undefined.
Should be corrected to read:
  If a conforming SPARC V9 implementation encounters nonzero values in these fields, its behavior is as defined in section 6.3.11 on page 82.

[79] In Appendix E, p.267:

The second paragraph currently reads:
  ...an attempt to execute a reserved opcode shall cause a trap, unless it is an implementation-specific extension to the instruction set.
Should be corrected to read:
  ...an attempt to execute a reserved opcode behaves as defined in section 6.3.11 on page 82.

[80] In Chapter 2, section 2.57 (definition of "reserved"), three corrections:

a) Where it currently reads:
    ...reserved instruction fields is undefined.
  Should be corrected to read:
    ...reserved instruction fields is as defined in section 6.3.11 on page 82.
     
b) Where it currently reads:
    Reserved register fields should ...
  Should be corrected to read:
    ...assume that these fields will read...
     
c) Where it currently reads:
    ...assume that these field will read...
  Should be corrected to read:
    ...assume that these fields will read...

[81] In Appendix E, pp.267-8, two corrections:

a) p.267, Table 31, "BPr" column, currently reads:
    BPr
    See Table 37
  To reinforce that bit 28=0 for BPr, this should be corrected to read:
    BPr (bit 28=0)
      See Table 37
    -- (bit 28=1)
  Plus, a footnote must be added:
    Although SPARC V9 implementations should cause an illegal_instruction exception when bit 28=1, many early implementations ignored the value of this bit and executed the opcode as a BPr instruction even if bit 28=1.
  This footnote should be referenced in both Appendix E (p.268) and on the BPr instruction page (p.136).
       
b) p.268, Table 32, table cell for Tcc (op3=0x3A), currently reads:
    Tcc (bit 29=0)
      See Table 36
    -- (bit 29=1)

[82] The behavior of an attempt to reference a restricted ASI by a PREFETCHA while in nonprivileged mode is not clear; the second paragraph of 6.3.1.3 (p.71) suggests that a privileged_action exception should occur and the first sentence in A.41 (p.2.3) suggests that an implementation should treat it as a NOP. Although such a reference is clearly inappropriate in nonprivileged software, a case can be made for either response by an implementation and both may have been implemented.
       
Therefore, it is implementation-dependent whether this condition causes a privileged_action exception or executes as a NOP.
       
In section A.41, p.204, new implementation dependency #103(6) should be added:
  IMPL.DEP. #103(6): Whether an attempt to reference a restricted ASI (< 0x80) by a PREFETCHA instruction while in nonprivileged mode (PSTATE.PRIV=0) causes a privileged_action exception or executes as a NOP is implementation-dependent.
In 6.3.1.3, second paragraph, append to the end of the second sentence:
  "(see impl.dep.#103(6))"
       
At the end of section A.41, page 207:
  The following entry should be added to the end of the "Exceptions" list:
    privileged_action (PREFETCHA with PSTATE.PRIV=0 and
      ASI<0x80 (impl.dep.#103(6)))

In Appendix C, p.252:

  A reference to new implementation dependency #103(6) should be added to the entry for implementation dependency #103.
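
(Illustrative sketch, not part of the errata: IMPL. DEP. #103(6) amounts to a single implementation choice. The function name, the `impl_traps` flag, and the returned strings are hypothetical.)

```python
def prefetcha_outcome(asi, pstate_priv, impl_traps):
    """Outcome of a PREFETCHA that names the given ASI.

    asi:         8-bit ASI value; values below 0x80 are restricted.
    pstate_priv: PSTATE.PRIV (0 = nonprivileged, 1 = privileged).
    impl_traps:  implementation choice under impl. dep. #103(6).
    """
    if asi < 0x80 and pstate_priv == 0:
        return "privileged_action" if impl_traps else "nop"
    return "prefetch"
```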

[83] Chapter 5, section 5.2.1.1, "PSTATE_current_little_endian (CLE)", on p.52 reads:

When PSTATE.CLE = 1, all data reads and writes using an implicit ASI are performed in little-endian byte order with an ASI of ASI_PRIMARY_LITTLE. When PSTATE.CLE = 0, all data reads and writes using an implicit ASI are performed in big-endian byte order with an ASI of ASI_PRIMARY. Instruction accesses are always big-endian.

This description assumes the processor is executing with TL = 0; to make it accurate for all conditions, it should be modified to read:

When PSTATE.CLE = 1, all data accesses using an implicit ASI are performed in little-endian byte order. When PSTATE.CLE = 0, all data accesses using an implicit ASI are performed in big-endian byte order. Instruction accesses are always performed using big-endian byte order. Specific ASIs used are shown in Table __ on page 71.

[84] The first paragraph of Chapter 6, section 6.3.1.3, "Address Space Identifiers (ASIs)", p.71, should be replaced by:

Alternate-space load, store, and load-store instructions specify an explicit ASI to use for their data access; when i = 0, the explicit ASI is provided in the instruction's imm_asi field and when i = 1, it is provided in the ASI register. Non-alternate-space load, store, and load-store instructions use an implicit ASI value which depends on the current trap level (TL) and the value of PSTATE.CLE. Instruction fetches use an implicit ASI which depends only on the current trap level. The cases are enumerated in Table __.

Table __:   ASIs used for Data Access and Instruction Fetches

Access Type            TL     PSTATE.CLE   ASI Used
--------------------   ----   ----------   ------------------------
Instruction Fetch      = 0    any          ASI_PRIMARY
                       > 0    any          ASI_NUCLEUS*

Non-alternate-space    = 0    0            ASI_PRIMARY
Load, Store, or        = 0    1            ASI_PRIMARY_LITTLE
Load-Store             > 0    0            ASI_NUCLEUS*
                       > 0    1            ASI_NUCLEUS_LITTLE**

Alternate-space        any    any          ASI explicitly specified
Load, Store,                               in the instruction
or Load-Store                              (subject to privilege
                                           level restrictions)

* on some early SPARC V9 implementations, ASI_PRIMARY may have been used for this case
** on some early SPARC V9 implementations, ASI_PRIMARY_LITTLE may have been used for this case

Also see section 8.3, Addressing and Alternate Address Spaces, on page 119.
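
As a reading aid, the implicit-ASI cases in the table for erratum [84] can be written as a lookup. This is a minimal sketch, not manual text: the function name and the "fetch"/"data" labels are hypothetical, ASIs are returned by name only, and the footnoted behavior of early implementations is not modeled.

```python
def implicit_asi(access: str, tl: int, cle: int = 0) -> str:
    """Return the name of the implicit ASI for an access.

    access: "fetch" for an instruction fetch, "data" for a
            non-alternate-space load, store, or load-store
            (both labels are hypothetical).
    tl:     current trap level (TL).
    cle:    value of PSTATE.CLE; ignored for instruction fetches.
    """
    if access not in ("fetch", "data"):
        raise ValueError("alternate-space accesses use an explicit ASI")
    base = "ASI_PRIMARY" if tl == 0 else "ASI_NUCLEUS"
    if access == "data" and cle == 1:
        return base + "_LITTLE"
    return base
```

For example, implicit_asi("data", tl=1, cle=1) gives "ASI_NUCLEUS_LITTLE", while implicit_asi("fetch", tl=1, cle=1) gives "ASI_NUCLEUS" because instruction fetches do not depend on PSTATE.CLE.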

[85] In the assembly-language sample at the end of section 7.2.1.3, p.91, an instruction is missing that would shift the "TT" value left by 5 bits to line up with the correct field in TBA.

  Specifically, the current text:
    rdpr %tt, %g1
    rdpr %tba, %g2
    add %g1, %g2, %g2
       
  Should be replaced by:
    rdpr %tt, %g1
    rdpr %tba, %g2
    sllx %g1, 5, %g1
    add %g1, %g2, %g2
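
The arithmetic effect of the added sllx can be checked with a short sketch. It mirrors only the instruction sequences shown in erratum [85]; the function names and the sample register values are hypothetical, and any other fields of the trap vector address are not modeled.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def sample_without_shift(tba: int, tt: int) -> int:
    """Original sample: add %g1, %g2, %g2 with no shift of the TT value."""
    return (tt + tba) & MASK64


def sample_with_shift(tba: int, tt: int) -> int:
    """Corrected sample: sllx %g1, 5, %g1 before the add."""
    return ((tt << 5) + tba) & MASK64


# Hypothetical values chosen only to make the difference visible.
tba, tt = 0x10000, 0x24
print(hex(sample_without_shift(tba, tt)))  # 0x10024
print(hex(sample_with_shift(tba, tt)))     # 0x10480
```

With the shift, consecutive TT values land 32 bytes apart; without it, they land 1 byte apart.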

 

[86] On page 45, section 5.1.7.6, the first paragraph, replace the end of the last sentence:

"... the ftt field encodes the type of the floating-point exception until an STFSR or an FPop executes."

with corrected text:

"... the ftt field encodes the type of the floating-point exception until an STFSR, STXFSR, or FPop executes."

[87] On page 78, at the end of section 6.3.5.1, the line of example code:

movg %xcc, %g0,1, %i3
should read:
movg %xcc, 1, %i3

[88] On page 11, replace section 2.41 (a single paragraph) with this revised definition:

  non-faulting load:
  A load operation that behaves identically to a normal load operation, except when supplied an invalid effective address by software. In that case, a regular load triggers an exception while a non-faulting load appears (possibly with the assistance of system software) to ignore the exception and loads its destination register with a value of zero.

[89] On page 40, section 5.1.4.2, the first Programming Note contains an error of omission, since poorly-aligned double- (or quad-) precision f.p. data _can_ be loaded directly into the upper half of the f.p. register file using LDDF(A)/LDQF(A) instructions.

The following should replace the erroneous Programming Note:

  Programming Note:
    The upper 16 double-precision (upper 8 quad-precision) floating-point registers cannot be directly loaded by 32-bit load instructions. Therefore, double- or quad-precision data that is only word-aligned in memory cannot be directly loaded into the upper registers using LDF(A) instructions. The following guidelines are recommended:
       
  (1) Whenever possible, align floating-point data in memory on proper address boundaries. If access to a datum is required to be atomic, the datum _must_ be properly aligned.
       
  (2) When a double- or quad-precision datum is not properly aligned in memory, is still aligned on a 4-byte boundary, and access to the datum in memory is not required to be atomic, software should attempt to allocate a register for it in the lower half of the floating-point register file so that the datum can be loaded using multiple LDF(A) instructions.
       
  (3) If the only available registers for such a datum are located in the upper half of the floating-point register file and access to the datum in memory is not required to be atomic, the word-aligned datum can be loaded into them by one of two methods:
    (a) load the datum into an upper register by using multiple LDF(A) instructions to first load it into a double[quad]-precision register in the lower half of the floating-point register file, then copy that register to the desired destination register in the upper half, or
    (b) use a LDDF(A)[LDQF(A)] instruction to perform the load directly into the upper floating-point register, understanding that use of these instructions on poorly-aligned data can cause a trap (LDDF[LDQF]_mem_not_aligned) on some implementations which may significantly slow down program execution.
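
The guidelines above can be summarized as a decision procedure. The sketch below is one possible reading, not manual text: the function name, parameter names, and returned labels are all hypothetical, and it only covers the cases the Programming Note discusses.

```python
def load_strategy(properly_aligned: bool, word_aligned: bool,
                  atomic_required: bool, lower_register_free: bool) -> str:
    """Pick a load approach for a double- or quad-precision datum.

    All names and returned labels are hypothetical; the branches follow
    guidelines (1) to (3) of the Programming Note.
    """
    # Guideline (1): properly aligned data needs no special handling.
    if properly_aligned:
        return "single LDDF(A)/LDQF(A)"
    # Guideline (1): atomic access requires proper alignment.
    if atomic_required:
        raise ValueError("atomic access requires a properly aligned datum")
    # The note only addresses data aligned on a 4-byte boundary.
    if not word_aligned:
        raise ValueError("not covered by the guidelines")
    # Guideline (2): prefer a destination in the lower half of the file.
    if lower_register_free:
        return "multiple LDF(A) into a lower register"
    # Guideline (3): destination must be in the upper half.
    return ("3(a) multiple LDF(A) via a lower register then copy, "
            "or 3(b) LDDF(A)/LDQF(A) directly (may trap)")
```

The last branch returns both options because the note leaves the choice between 3(a) and 3(b) to software, with 3(b) carrying the risk of an LDDF[LDQF]_mem_not_aligned trap on some implementations.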

 

[90] On page 76, section 6.3.4.3, the second paragraph says:

"The JMPL instruction ... then causes a PC-relative delayed transfer of control..."

It should instead read:

"The JMPL instruction ... then causes a register-indirect delayed transfer of control..."

[91] On page 76, Table 13 at the top of the page:

The 8th row in the table, the one for Branch (B) that is Taken with Annul bit=1, has an erroneous entry in the "Delayed" column. A Taken unconditional branch with Annul bit=1 is non-delayed; therefore, the entry should say "No" in the Delayed column.

[92] On page 184, Table 26, last row (Lookaside barrier row):

The text under the "Description" column in the Lookaside row should be replaced by the following text:
(Deprecated) A store appearing prior to the MEMBAR must complete before any load following the MEMBAR referencing the same address can be initiated. MEMBAR #Lookaside is deprecated and is supported only for legacy code; it should not be used in new software. A slightly more restrictive MEMBAR operation (such as MEMBAR #StoreLoad) should be used, instead.

Implementation Note: Since #Lookaside is deprecated, implementations are not expected to perform address matching. Instead, they should provide #Lookaside functionality using a more restrictive MEMBAR operation, such as #StoreLoad. (In fact, no SPARC V9 processor has ever implemented address matching; all have implemented #Lookaside using a more restrictive operation such as #StoreLoad or #Sync.)

[93] On page 285, last line on the page (#Lookaside line):

The following note should be added to #Lookaside:
Use of #Lookaside is deprecated and only supported for legacy software. New software should use a slightly more restrictive MEMBAR operation (such as #StoreLoad) instead.

[94] On page 94, section 7.3.3 Disrupting Traps, the last sentence of the second paragraph says:


"...when the condition's interrupt level is lower than that specified in PIL..."

It should instead read:

"...when the condition's interrupt level is less than or equal to that specified in PIL..."