SV-EC,

Thanks very much to everyone for giving 890 the attention
it's been craving for a long time. This is all good discussion.

To help simplify processing of the issues, I split the Mantis item into two.
890 will continue to center on clocking blocks, and the new 1604 has been
created to deal with programs.

More responses forthcoming...

Doug

> -----Original Message-----
> From: owner-sv-ec@server.eda.org
> [mailto:owner-sv-ec@server.eda.org] On Behalf Of Jonathan Bromley
> Sent: Thursday, September 21, 2006 8:12 AM
> To: sv-ec@server.eda.org
> Subject: [sv-ec] Review of Mantis 890 (clocking blocks)
>
> Since I was cast in the role of chief troublemaker on clocking
> blocks at the last sv-ec meeting, I thought I'd try to live
> up to that...
>
>
> Background
> ~~~~~~~~~~
> At the last SV-EC meeting (Monday Sept 11) there was an incomplete
> discussion of Mantis 890. Mehdi very sensibly suggested that Doug
> Warmke's proposal SV-890-3.pdf should be reviewed point by point.
> This note attempts to do that.
>
> Since I can't easily edit the PDF document, I've copied relevant
> fragments of its text here with what I hope is self-evident markup;
> my observations and proposed amendments are indented and have the
> marginal mark [JB]. Apologies in advance for any inconvenience.
>
> Many of my comments are "friendly amendments" - rewording, proposed
> clarifications and so on. I've tried to capture the sense of last
> week's meeting as well as various other emails that went before it.
>
> There is, I think, only one potentially controversial point, relating
> to clause 15.12 where Doug proposed an addition to the text that I
> find hard to accept.
>
> In a nutshell, the difficulty is that clocking blocks work well only
> in one specific use case: as a bridge (I think Arturo Salz called it
> a "trampoline") between the scheduling regimes in a program and in
> a design (modules and interfaces).
> The rather complicated interaction
> between Active, Reactive and NBA regions of the scheduler, together
> with the sampling behaviour of clockings, makes this work reliably
> and without races. In short, a clocking block has two "ends" -
> a "signal end" that hooks into design code, and a "testbench end"
> that should be manipulated only by program code. Any other
> use model gives rise to many opportunities for races or unexpected
> behaviour.
>
> The offending proposal in SV-890-3 is a workaround to make clockings
> behave sensibly when the "testbench end" is manipulated by module
> code instead of program code. It's been suggested that this matches
> the sample() behaviour of covergroups, but I think that's a spurious
> comparison; sampling a covergroup affects only the coverage data,
> but updating a clocking block's sampled inputs could have extensive
> knock-on throughout the rest of the testbench and I would need a
> lot of convincing that this workaround assures freedom from races.
> Furthermore, I suspect the proposal is completely broken in the
> case of #0 input sampling; I have tried to discuss that issue in
> more detail in the appropriate place below.
>
> Thanks for your consideration.
>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Comments from Jonathan Bromley <jonathan.bromley@doulos.com>
> on document SV-890-3.pdf associated with Mantis item 890
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 15.2 Clocking Block Declaration
> [snip]
>
>   [JB] This change seems fine.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 15.10 Cycle delay
> ...
> What constitutes a cycle is determined by the default clocking in effect
> (see 15.11). If no default clocking has been specified for the current
> module, interface, or program then the compiler shall issue an error.
>
> Example:
>   ## 5;        // wait 5 cycles (clocking events) using the default clocking
>   ## (j + 1);  // wait j+1 cycles (clocking events) using the default clocking
>
> <insert>
> If a ## cycle delay operator is executed at a simulation time that does
> not correspond to a default clocking event (perhaps due to the use of a #
> delay control or an asynchronous @ event control), the processing of the
> cycle delay is postponed until the time of the next default clocking
> event. Thus a ##1 cycle delay shall always be guaranteed to wait at least
> one full clock cycle.
> </insert>
>
>   [JB] This formulation is mostly clear, but has some strange effects.
>   (Once again I'm not the only one who's unhappy here; existing
>   implementations don't fully match the described behaviour.)
>   It leads to behaviour that is completely at odds with the usual
>   behaviour of Verilog @ timing controls - if I say "@(posedge clk)"
>   at a time that's halfway between two clock events, I expect
>   to wait for half a cycle rather than 1.5 cycles. And, in particular,
>   it makes life very difficult if you want to do something on the
>   very first clock event. Surely if I write
>
>     initial begin
>       ##1 sig <= expr;
>
>   my intent was that 'sig' should be driven at the FIRST clock, not
>   the second? I realise that it may now be too late to rescind this
>   decision. To rescue the situation, can we use ##0 to mean "wait
>   until the current-or-next clocking event"? If so, all is well
>   (despite the discontinuity with regular @).
>
>   There's a further ambiguity here. If I use the clocking block's name
>   as an event, using the @cb event control, do I get *exactly* the same
>   behaviour as ##1? I guess so, but, especially in view of the problems
>   I outline above, I think this should be made explicit.
>
>   Finally, using the phrase "default clocking event" in this context
>   is clearly wrong. If I say
>     ##1 cb.out <= ...
>   then the ##1 is a cycle of cb, which is not necessarily the same
>   as a cycle of the default clocking.
>
>   So, my conclusions: If we wish to keep the current proposals of
>   SV-890-3 here,
>   (1) it is essential that we explicitly define the behaviour of
>       ##0, so that there's a way of reaching the next-or-current
>       clocking event;
>   (2) there should be a note clarifying the stark difference in
>       behaviour between ## and the regular @ event control, and
>       clearly stating the equivalence (if any) between @cb and ##1.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> In 15.12, MODIFY the text as follows:
>
>   [JB] I have a number of issues with this, which I'll take one piece
>   at a time...
>
> 15.12 Input sampling
> All clocking block inputs (input or inout) are sampled at the
> corresponding clocking event. If the input skew is not an explicit #0,
> then the value sampled corresponds to the signal value at the Postponed
> region of the time step skew time-units prior to the clocking event (see
> Figure 15-1 in 15.3).
> <strikeout>
> If the input skew is an explicit #0, then the value sampled
> corresponds to the signal value in the Observed region.
> </strikeout>
> If the input skew is an explicit #0, several additional considerations
> shall govern input sampling. First, the value sampled corresponds to the
> signal value in the Observed region.
>
>   [JB] OK so far.
>
> <insert>
> Next, when the clocking event occurs, the sampled value shall be
> updated and available for reading the instant the clocking event
> takes place and before any other statements are executed.
>
>   [JB] This new stipulation appears to be necessary to legitimize
>   the approach taken in some vendors' verification methodologies
>   that don't use program blocks for the test bench.
>   It apparently aims to sidestep the write/read race condition that
>   pertains if you have a clocking whose clocking event is on a design
>   variable and whose inputs are examined in design code. Can we be
>   confident that this new stipulation is (a) appropriate, (b) general?
>   It is almost equivalent to creating a new scheduler region
>   (Pre-active?!). If we accept this new behaviour, it is absurd to
>   accept the caveat that follows:
>
> <insert>
> Finally, if the clocking event occurs due to activity on a
> program object, there is a race condition between the update
> of the clocking block input's value and the execution of
> program code that reads that value.
> </insert>
>
>   [JB] The internal contradictions here are in my opinion insupportable.
>   In effect it says:
>
>     Clocking event on a design object, clocking inputs read
>     in design code:
>       NO RACE because of special treatment of clocking inputs.
>
>     Clocking event on a program object, clocking inputs read
>     in program code:
>       RACE because update of the clocking input happens in
>       the same scheduler region as reading of that input.
>
>   There is a fundamental problem here. Clocking inputs are updated
>   as a result of occurrence of their clocking event; this is sure
>   to race with reading of the clocking input, *unless* the clocking
>   event is on a design variable but the clocking inputs are read in
>   program code. This is, as I understand it, precisely the scenario
>   for which clockings were originally designed and in which they
>   can be expected to work reliably without races. I don't really
>   understand the need to shoe-horn them into other scenarios where
>   straightforward module code would do just as well.
>
>   I also completely fail to understand how this approach can yield
>   meaningful behaviour when #0 input sampling is specified, because
>   it implies that the clocking block's sampled input values should
>   be updated BEFORE the Observed region where the sampling is
>   specified to take place!
>   I discuss this in more detail below.
>
>   I would prefer to see this new stipulation (clocking inputs update
>   before anything else happens) completely removed, and in its place
>   a warning added that clocking inputs can be read in a race-free way
>   only if all the following conditions are met:
>   * the clocking event occurs in the design regions of the scheduler
>   * the clocking input observes a design net or variable
>   * the clocking input is read only from code running in the program
>     regions of the scheduler
>
>   I also wish to see a note to the effect that input #0 sampling
>   has unusual behaviour. It samples its input signal *after* the
>   design regions have iterated, and therefore (in most cases)
>   *after* the clocking event has occurred. It seems to me that
>   this works sensibly only if the sampled "input #0" is read in
>   program code rather than in design code. Reading it from design
>   code will introduce an additional cycle's delay before the result
>   is visible.
>
>   input #0 and output #0 appear to have been intended to provide
>   the useful effect of giving, to signals read or driven through a
>   clocking, exactly the same timing behaviour as you would see
>   from a program that reads and drives those signals without an
>   intervening clocking block. Insisting that input samples be updated
>   instantaneously on the clocking event will break that model, since
>   the sampled value will be updated before the Observed region; this
>   update will presumably obtain the value that was sampled in the
>   Observed region of the *previous* clocking event's timestep.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 15.14 Synchronous drives
>
> Clocking block outputs (output or inout) are used to drive values onto
> their corresponding signals, but at a specified time. That is, the
> corresponding signal changes value at the indicated clocking event as
> modified by the output skew.
> <insert>
> For zero skew clocking block outputs with no cycle delay, synchronous
> drives shall schedule new values in the NBA region of the current time
> unit. This has the effect of causing the big loop in Figure 9-1 to iterate
> from the reactive/re-inactive regions back into the NBA region of the
> current time unit. For clocking block outputs with non-zero skew or
> non-zero cycle delay, the corresponding signal shall be scheduled to change
> value in the NBA region of a future time unit.
> </insert>
>
> Examples:
> [snip]
> Regardless of when the drive statement executes (due to event_count
> delays), the driven value is assigned to the corresponding signal
> only at the time specified by the output skew.
>
>   [JB] In the last sentence, the parenthetical remark is entirely
>   bewildering and should be removed. In fact, given the various
>   other changes and clarifications proposed, I suspect the whole
>   sentence could be removed without loss.
>
> <insert>
> It is possible for a drive statement to execute asynchronously at a time
> that does not correspond to its associated clocking event. Such drive
> statements shall be processed as if they had executed at the time of the
> next clocking event. Any values read on the right hand side of the drive
> statement are read immediately, but the processing of the statement is
> delayed until the time of the next clocking event. This has implications
> on synchronous drive resolution (See 15.14.2) and ## cycle delay
> scheduling.
> Note: The synchronous drive syntax does not allow intra-assignment delays
> like a regular procedural assignment does.
>
>   [JB] This is good. However, with apologies for the pedantry,
>   can we please reword the final "Note" sentence as follows?
>
>     Note: Unlike blocking and nonblocking procedural
>     assignment, the synchronous drive syntax does not
>     allow intra-assignment delays.
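
The asynchronous-drive rule quoted above (right-hand side read immediately, processing deferred to the next clocking event, signal changed after the output skew) can be sketched as a small timing model. This is only an illustration under stated assumptions: times are plain integers, clocking events are given as a sorted list, scheduler regions are not modelled, a drive that executes exactly on a clocking event is assumed to be processed at that event, and the names `maturation_edge` and `schedule_drive` are hypothetical.

```python
from bisect import bisect_left


def maturation_edge(exec_time, edges):
    """Return the clocking event at which a drive executed at exec_time is processed.

    edges is a sorted list of clocking event times (an assumption of this model).
    A drive that executes exactly on an event is processed at that event;
    otherwise it is deferred to the next one.
    """
    i = bisect_left(edges, exec_time)
    if i == len(edges):
        raise ValueError("no clocking event at or after exec_time")
    return edges[i]


def schedule_drive(exec_time, rhs_value, edges, output_skew=0):
    """Return (time the signal changes, value driven).

    rhs_value stands for the right-hand side as read at exec_time; it is
    carried through unchanged, since it is read immediately and not
    re-evaluated at the clocking event.
    """
    edge = maturation_edge(exec_time, edges)
    return (edge + output_skew, rhs_value)


if __name__ == "__main__":
    edges = [10, 20, 30]
    print(schedule_drive(13, 5, edges, output_skew=2))
    print(schedule_drive(20, 7, edges))
```

The first call models a drive executed between clock events; the second models one executed on the event itself.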
>
> 15.14.1 Drives and nonblocking assignments
> <strikeout>
> Synchronous signal drives are processed as nonblocking assignments.
> </strikeout>
> <insert>
> Note: While the non-blocking assignment operator is used in the
> synchronous drive syntax, these assignments are different than
> non-blocking variable assignments. The intention of using this operator is to
> remind readers of certain similarities shared by synchronous drives and
> non-blocking assignments. One main similarity is that variables and wires
> connected to clocking block outputs and inouts are driven in the NBA
> region.
> </insert>
> Another key NBA-like feature of inout clocking block variables signals and
> synchronous drives is that a drive does not change the clocking block
> input. This is because reading the input always yields the last sampled
> value, and not the driven value.
>
>   [JB] Excellent.
>
> <insert>
> One difference between synchronous drives and classic NBA assignments is
> that transport delay is not performed by synchronous drives (except in the
> presence of the intra-assignment cycle delay operator). Another key
> difference is drive value resolution, discussed in the next section.
> </insert>
>
>   [JB] It seems to me that synchronous drive *does* perform transport
>   delay, albeit in a rather unusual way: first there is a transport
>   delay from the execution of the drive to its maturation, and then
>   there is a second transport delay associated with the clocking
>   output's skew. I suspect it would be better to remove entirely
>   the sentence about transport delay.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 15.14.2 Drive value resolution
> ...
> The driven value of nibble is 4'b0xx1, regardless of whether nibble is a
> reg or a wire.
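
The 4'b0xx1 result above is what bit-by-bit resolution of two equal-strength drivers on one wire would give. A minimal Python sketch of that resolution follows. The two driven values 4'b0101 and 4'b0011 are an assumption made here because they reproduce the stated result (the example itself is elided in this message), and the helper names `resolve_bit` and `resolve_drives` are hypothetical.

```python
from functools import reduce


def resolve_bit(a, b):
    """Resolve two 4-state bits ('0', '1', 'x', 'z') driven at equal strength."""
    if a == "z":
        return b  # an undriven bit yields to the other driver
    if b == "z":
        return a
    if a == b and a in "01":
        return a  # both drivers agree on a known value
    return "x"    # conflicting or unknown drivers


def resolve_drives(values):
    """Resolve several equal-width drive values, given as bit strings, MSB first."""
    width = len(values[0])
    if any(len(v) != width for v in values):
        raise ValueError("all drive values must have the same width")
    return "".join(reduce(resolve_bit, bits) for bits in zip(*values))


if __name__ == "__main__":
    # Assumed drive values; they resolve to the 4'b0xx1 quoted in the text.
    print(resolve_drives(["0101", "0011"]))
```

With a single drive the function returns that drive's value unchanged, which matches the case where no resolution is needed.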
> <insert>
> If a given clocking output is driven by more than one assignment in the
> same time unit, but the assignments are scheduled to mature at different
> future times due to the use of cycle delay, then no drive value resolution
> shall be performed. The drives shall be applied with classic Verilog NBA
> transport delay semantics in this case.
> If a given clocking output is driven asynchronously at different time
> units within the same clock cycle, then drive value resolution is performed
> as if all such assignments were made at the same time unit in which the next
> clocking event occurs.
> </insert>
>
>   [JB] I don't think this is as helpful as it could be. It describes
>   the behaviour from the point of view of the clocking drive,
>   whereas it is clearer and more general to describe it from the
>   point of view of the cycle in which the assignment(s) mature.
>   I'd like to suggest the following re-wording, which is somewhat
>   heavy going but seems to me to be more precise:
>
> <proposed LRM text>
> Assignment to a clocking output using the syntax
>   clocking_name.output_name <= [##N] expr;
> is known as a clocking drive. A clocking drive shall take
> effect (mature) at a current or future clocking event, as follows:
> * If the intra-assignment cycle delay ##N is absent or N
>   is zero, the drive shall mature at the next occurrence of
>   the clocking event, or immediately if the clocking
>   event has occurred in the current timestep.
> * If N is greater than zero, the drive shall mature at
>   the corresponding future clocking event.
> In this way, all clocking drives shall mature in the timestep of
> a clocking event of their clocking block, even if they executed
> asynchronously to that clocking event.
> At each clocking event of a clocking block, each clocking output
> of that clocking block shall be treated as follows:
> (a) _Scheduling of assignment to the clocking output_
>     If one or more clocking drives to that output mature on the
>     current clocking event, a single nonblocking assignment to that
>     output shall be scheduled for the current or future timestep
>     specified by its output skew.
>     If no clocking drive to that output matures on the current
>     clocking event, no such assignment to that output shall be scheduled.
> (b) _Value assigned to the clocking output_
>     If exactly one clocking drive to that output matures, the value
>     assigned as described in (a) above shall be the value evaluated
>     by that clocking drive when it executed. However, if two or more
>     clocking drives to that output mature, the value assigned shall
>     be determined by resolving all those drives' values, as if each of
>     those values had been driven on to the same net of wire type by a
>     continuous assign statement of (strong0, strong1) drive strength.
> </proposed LRM text>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>   [JB] I'm happy with the remaining proposed modifications to clause 15.
>   However, the implicit NBAs described in 15.14.2 above have an impact
>   on clause 16 (program) which should be mentioned also in clause 15.
>
>   Clocking outputs are updated by NBA (or, in the case of clocking
>   outputs that are nets, by continuous assign from an implicit variable
>   that's updated by NBA). Consequently, as I understand it,
>
>     ** it can in no circumstances be legal for any
>        program variable to be a clocking output **
>
>   because that would be equivalent to writing a program variable
>   by NBA, and we know that to be a bad idea.
>
>   And, for the avoidance of any argument, let's note that a program's
>   output port that is a variable is a program variable, not a design
>   variable.
>   You *can* get design variables visible through ports of
>   a program, by passing them through ref ports, so this is not a
>   limitation. But we don't want the program to be able to read,
>   directly, one of its own variables that has been updated in the
>   NBA region by a clocking block - even if that variable happens
>   also to be one of its output ports that's driving a design signal.
>
> --
> Jonathan Bromley, Consultant
>
> DOULOS - Developing Design Know-how
> VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services
>
> Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK
> Tel: +44 (0)1425 471223    Email: jonathan.bromley@doulos.com
> Fax: +44 (0)1425 471573    Web: http://www.doulos.com
>
> The contents of this message may contain personal views which
> are not the views of Doulos Ltd., unless specifically stated.
>
Received on Thu Sep 21 23:44:08 2006
This archive was generated by hypermail 2.1.8 : Thu Sep 21 2006 - 23:44:25 PDT