Re: [sv-bc] Expected behavior of macro concatenation of macro

From: Greg Jaxon <Greg.Jaxon@synopsys.com>
Date: Tue Jun 22 2010 - 23:15:22 PDT
Wilson Snyder wrote:
I would like the behavior to match the C
preprocessor, but did not mean to imply that what I want was
already LRM specified.
  
Everywhere you look, Verilog and SV resemble - but seldom accurately mimic - C.
In C, ## is described as "token gluing".  In SV, `` is described as a non-white token delimiter.
The LRM says outer levels are
always expanded first, so using CONCAT or not shouldn't matter.
  
Although outer-to-inner order of actual/formal substitutions is indicated, the role of ``
is only described in terms of separating tokens within the macro text itself, not as
a token in the expansion of the macro.
The LRM is not specific about whether the (first) actual
argument (to `CONCAT) below should be expanded before it is
substituted or after.
    
I misspoke here - the LRM does say that `CNT expands only after substitution.
I am not sure how well that statement describes current practice, though.
I know that I rejected its literal reading (as an editorial glitch) because
I knew actuals with macro invocations should have their expanded text
substituted next to the vacuum left by a ``.

Here's an example to dissect further.  The VALUEb one seems
to fail on (I suspect) every simulator.

`define ZERO_0 "0"
`define CONCAT(a, b) a``b
`define NOTNOT(a) a
`define VALUEa(a) `CONCAT(`ZERO_,a)
`define VALUEb(a) `CONCAT(`ZERO_,`NOTNOT(a))

module t;
   initial begin
`define CNT 0
      // Works on at least one
      $write("EXP: '0' GOT '"); $write(`VALUEa(`CNT)); $write("'\n");
      // Fails on the one that DOES work
      $write("EXP: '0' GOT '"); $write(`VALUEb(`CNT)); $write("'\n");
   end
endmodule

Following the `VALUEb expansion, I would like the LRM to require:

    `VALUEb(`CNT)  // LRM: outer first
    `CONCAT(`ZERO_,`NOTNOT(`CNT))  // LRM: outer first
    `ZERO_```NOTNOT(`CNT)  // See below
    `ZERO_``CNT  // Same rule as above
  
The three possibilities I see are
Of these, the middle one seems most in tune with the LRM:
`` cannot delimit tokens indefinitely; it should vanish along with the formal argument(s) it conjoined.
Recall that the stated purpose for `` is "allowing identifiers to be constructed from arguments."

I see that my simulator counterparts agree with this reading:
    `VALUEb(`CNT)  // LRM: outer first
    `CONCAT(`ZERO_,`NOTNOT(`CNT))  // LRM: outer first
    `ZERO_```NOTNOT(`CNT)  // `` collapses without whitespace
    `ZERO_`NOTNOT(`CNT)  // NOTNOT expands and adds whitespace
    `ZERO_ `CNT  // Extra space now in the token
    `ZERO_ 0  // Extra space now in the token

[Does anyone know a workaround to get the behavior I would
like?]
  
Yes - parrot the LRM examples and write:
`define REPEAT(n, d) `REPEAT_``n(d)
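
A minimal sketch of how that workaround might be fleshed out, assuming one helper macro per supported count. The `REPEAT_1 through `REPEAT_3 helpers and the module name are hypothetical, not taken from the LRM or from the original post. Note that n has to be written as a literal here; passing `CNT would run straight back into the expansion-order question discussed above.

```systemverilog
// Hypothetical per-count helpers; each one simply repeats its argument.
`define REPEAT_1(d) d
`define REPEAT_2(d) d d
`define REPEAT_3(d) d d d

// `` appears directly in this macro's own text, so the intent is that
// `REPEAT_ and the actual for n form a single macro name.
`define REPEAT(n, d) `REPEAT_``n(d)

module repeat_demo;
   initial begin
      // Intended to expand to three copies of the $display statement.
      `REPEAT(3, $display("tick");)
   end
endmodule
```

The design point is that `` is used "in the raw" inside the macro that needs it, rather than being wrapped in a general-purpose `CONCAT helper.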

The question is what does `` imply as to expanding any macro
on its LHS or RHS?  I think the LRM is silent on this, but
propose that `` occurs after complete expansion of the macros on
either side of it.  (Note the alternative, that `` occurs
"first", results in the extra space as shown above, which I
don't see being useful.)
  
But how does your proposal permit your example to "work"?
The macros on either side of `` are `ZERO_ and `CNT, one of
which is undefined; its complete expansion is a failure.  Or it might be defined!
What you probably meant is exactly what one system I know intimately does:
expand the actual argument macro invocation at the point when it is used
in a formal substitution.
I would also propose that the LRM say that `` collapses
whitespace on both sides to match C99.  It now says "without
introducing white space" which is not sufficient if a space
was mis-introduced into a define value, possibly from an
earlier substitution.
  
I ran that idea through our regression suite a year or two ago and found that
a few users understand the words "`` delimits" to mean
that `` works something like quotation marks.
These folks had defined macros using ``formal``_suffix.
When you invent a module instance name this way, the abhorrent vacuum between the module name
and the macrotic instance name snaps shut and welds them all into one big happy identifier.
I may have been too greedy - some white spaces are more vacuous than others and I hadn't
tried to track which separated what from what else.  My glue was sticky!  But those tests (from
a large semiconductor manufacturer who shall remain unnamed) stuck more firmly than
my non-standard reading.
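
To make the breakage concrete, here is a sketch of the kind of macro involved. The `DECL_INST macro and the module names are made up for illustration and are not the actual user code. Under the LRM wording, `` only marks token boundaries, so the space between the module name and the instance name survives; under a greedy, C-like reading that also eats adjacent white space, it does not.

```systemverilog
// Hypothetical macro in the ``formal``_suffix style.
`define DECL_INST(mod, name) mod ``name``_inst ();

module leaf;
endmodule

module top;
   // Delimiter reading:      leaf u0_inst ();   (a legal instantiation)
   // Whitespace-eating glue: leafu0_inst ();    (one identifier, no instance)
   `DECL_INST(leaf, u0)
endmodule
```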

I began this effort believing, like you, that `` was the ## token-gluing operator.  It's not.

As you've demonstrated for us, you cannot `define a glue operator to package
and rename this feature - you must use it in the raw.
Is that the most desirable outcome?  I can't say.
The LRM said what it said after several long meetings and review periods.
It may yet change, but only after running a wicked gauntlet of user expectations.

Greg Jaxon



Received on Tue Jun 22 23:15:53 2010
