[sv-bc] FW: [sv-ec] Name resolution issues

From: Mark Hartoog <Mark.Hartoog_at_.....> Date: Fri Aug 03 2007 - 10:10:45 PDT

-----Original Message-----
From: owner-sv-ec@eda.org [mailto:owner-sv-ec@eda.org] On Behalf Of Mark
Hartoog
Sent: Friday, August 03, 2007 10:02 AM
To: sv-ec@eda.org
Subject: [sv-ec] Name resolution issues

Problem statement:
First I am going to describe the name resolution problems I am
attempting to address. The name resolution issues arise because of
several features:

1) Wild card package imports.
2) Segmented $unit.
3) Late binding caused by type parameters.
4) Late binding of identifiers in bind statements.
5) Forward references to fields in classes.

In Verilog-2005 all non-dotted identifiers, except for task and function
calls, can be resolved as they are parsed. For simplicity I am going to
leave task and function identifiers out of the discussion for the
moment. 

The basic question is whether we should continue to require all of these
non-dotted identifiers to be resolved at parse time, or whether some of
them may be resolved later. There are several later points where they
could be resolved: end of scope, end of module, end of compilation unit,
and elaboration.

In the case of bind statements, we have little choice. Non-dotted
identifiers in bind statements have to be resolved at elaboration time.
Section 22.10 says that the effect of a bind is to create an instance
at the end of the target scope. Identifiers in the bind statement can
clearly bind to any symbol in that scope. Although the LRM does not say,
they presumably can also bind to symbols that have been imported into
that scope. What is less clear is whether they can bind to symbols in
$unit defined after the target module (same for nested modules) or
whether the bind statement can cause additional symbols to be wild card
imported into the scope. 

In addition to bind, the language features or potential features that
cause problems are:

1) Randomize with blocks:

module m #(type TP = CB) ();
     int a;
     TP x;
     initial x.randomize with { a < 1; };
endmodule

The identifier 'a' is supposed to be resolved first in the scope of the
variable 'x', but the type of 'x' is a type parameter, so the scope
cannot be examined until elaboration. The only way to fix this so it can
be resolved at parse time is to make a backwards incompatible change to
the syntax to force users to indicate which identifiers must be resolved
in the scope of the variable.

2) Forward references to fields in classes:

int a;
class C;
    function int f();
        return a;
    endfunction
    int a;
endclass

Like Verilog, both C++ and Java require declaration before use for
variables in procedural contexts, but they allow class fields and
methods to be used before they are declared. I think users expect this
behavior in classes. This can be implemented by delaying the resolution
of identifiers in classes until end of class.
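
A minimal Python sketch of that deferral, assuming a toy model in which
a class body is just a list of declarations and uses in source order.
The function name and the tuple encoding are hypothetical, and base
classes are ignored:

```python
def resolve_class_body(outer_scope, class_items):
    """Resolve identifier uses only after the whole class body is seen.

    class_items is a list of ("decl", name) or ("use", name) tuples in
    source order. Returns a dict mapping each used name to "class" or
    "outer"; raises NameError for a name found in neither scope.
    """
    fields = set()
    pending = []                      # uses whose resolution is postponed
    for kind, name in class_items:
        if kind == "decl":
            fields.add(name)
        else:
            pending.append(name)

    resolved = {}
    for name in pending:              # end of class: all fields are known
        if name in fields:
            resolved[name] = "class"
        elif name in outer_scope:
            resolved[name] = "outer"
        else:
            raise NameError(name)
    return resolved


# Mirrors the example above: f() uses 'a' before the field 'a' is declared.
forward_ref = resolve_class_body({"a"}, [("use", "a"), ("decl", "a")])
# Without the field, the same use falls back to the enclosing scope.
no_field = resolve_class_body({"a"}, [("use", "a")])
```

Under this model the use of 'a' in f() binds to the class field even
though the field appears later in the class body.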

3) Forward typedefs as base class:

typedef class CB;
int a;
class C extends CB;
    function int f();
        return a;
    endfunction
endclass
class CB;
    int a;
endclass

This feature is in P1800-2005, so removing it would be a backwards
compatibility issue. On the other hand it really doesn't add anything
useful to the language, so it probably could be removed without causing
any serious problems. 

To implement this you have to delay resolution of identifiers in classes
to end of compilation unit, since in the worst case the base class could
be a forward typedef in $unit.

4) Type parameters as base class:

module m #(type TP = CB) ();
     int a;
     class C extends TP;
         function int f();
             return a;
         endfunction
     endclass
endmodule

It is unclear if this is allowed in P1800-2005. I know of several
implementations that have taken a liberal interpretation of
class_identifier to include type parameters that are assigned class
types in other contexts where the LRM indicates a class_identifier is
required. I have also gotten customer inquiries as to why they cannot
use a type parameter for a base class. I do not know of any implementations
that actually allow this. In Java generic classes this is illegal,
although C++ templates do permit a template parameter as a base class.

If this is allowed in the language, then identifiers in class tasks and
functions cannot be resolved until elaboration. (Note: I am not
proposing that we add this now. I included this here because customers
have asked why they cannot do this, and this would be completely ruled
out by requiring early name resolution.)

Proposals to address these Problems

Making the language so that all simple identifiers can be resolved at
parse time will simplify tools. On the other hand, if some identifiers
are not going to be resolved till elaboration, then I don't see why we
should be making incompatible changes in the language or making the
language less powerful to reduce the number of cases in which
identifiers are resolved after parse time.

The important question is whether we can resolve the identifiers later,
say at end of compilation unit or elaboration, and still obtain results
that are consistent with the wild card package import rules and $unit
scoping rules. I believe this is possible. Here is an example I will use
to illustrate the algorithm:

package P;
     int a;
endpackage

class SomeClass;
endclass

module m #(type TP = SomeClass);
   import P::*;
   TP x;
   initial x.randomize with { a < b; };
   int a;  // should be an error if 'a' was wild card imported by randomize.
endmodule 

int b; // 'b' in randomize should never bind to this, because it is after module

class Ok;
    int a;
    int b;
endclass

class Bad;
   int c, d;
endclass

module top;
   m #(Ok) m1();
   m #(Bad) m2();
endmodule

In instance top.m1, everything is fine. The identifiers 'a' and 'b'
resolve to fields of the class Ok. In instance top.m2, neither 'a' nor
'b' can be resolved in the class Bad. Here 'a' should cause the wild
card package import of P::a. This would then make the declaration of 'a'
later in the module an error. The identifier 'b' should have no
resolution and be an error. It should not resolve to $unit::b because
that declaration is after the module.

In this case these identifiers can only be processed at elaboration
time, because only then is the scope of the variable 'x' known. The
technique described below can resolve these correctly and produce the
correct error messages. It is one way of doing this, though not
necessarily the only one.

To do this we need to identify a candidate resolution for each
identifier in the randomize with block at parse time. We record this
resolution as a candidate, but do not actually make the resolution. In
the example above, the candidate for 'a' would be the wild card package
import of P::a. We identify P::a as the candidate, but we do not
actually make the package import. When we parse the declaration of 'a'
later, this declaration is legal because P::a has not been imported. For
'b' there is no candidate resolution at parse time. We record a null
candidate resolution.

At elaboration time we try to resolve 'a' and 'b' in the scope of the
variable 'x'. In top.m1, they both resolve, and everything is fine. In
top.m2, neither 'a' nor 'b' resolves in the scope of 'x'. The only other
resolution that can be considered for 'a' or 'b' at elaboration time is
the candidate that was identified at parse time. In top.m2, there is a
candidate for 'a'. To resolve 'a' to this candidate we would have to
wild card import P::a, but this wild card import is no longer legal.
There is now a symbol 'a' in the scope of module 'm', so instead of
resolving 'a', we give an error message that the declaration of 'a' is
illegal because 'a' has been wild card imported. For 'b' there was no
parse time candidate, so we simply give an unresolved identifier error
message. 
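
The two phases can be sketched in Python. This is only an illustration
under simplifying assumptions: one module, one wild card imported
package, and a class represented as a set of field names. ModuleScope,
note_candidate, declare and elaborate are hypothetical names, not part
of any tool or of the LRM:

```python
from dataclasses import dataclass, field


@dataclass
class ModuleScope:
    """Toy model of the names known inside one module."""
    wildcard_names: dict                            # name -> package offering it
    declared: set = field(default_factory=set)      # local declarations so far
    candidates: dict = field(default_factory=dict)  # name -> parse-time candidate

    def note_candidate(self, name, unit_names_so_far):
        """Parse time: record what `name` would bind to, without binding."""
        if name in self.declared:
            self.candidates[name] = ("local", name)
        elif name in self.wildcard_names:
            # Remember the import as a candidate, but do not perform it.
            self.candidates[name] = ("import", self.wildcard_names[name])
        elif name in unit_names_so_far:
            self.candidates[name] = ("unit", name)
        else:
            self.candidates[name] = None            # null candidate

    def declare(self, name):
        """Parse time: record a local declaration (no legality check here)."""
        self.declared.add(name)


def elaborate(scope, class_fields):
    """Elaboration: try the class of 'x' first, then the recorded candidate."""
    results = {}
    for name, cand in scope.candidates.items():
        if name in class_fields:
            results[name] = "class field"
        elif cand is None:
            results[name] = "error: unresolved identifier"
        elif cand[0] == "import" and name in scope.declared:
            # The import is needed now, but a later local declaration exists.
            results[name] = "error: declaration conflicts with wildcard import"
        else:
            results[name] = cand[0]
    return results


m = ModuleScope(wildcard_names={"a": "P"})
m.note_candidate("a", unit_names_so_far=set())
m.note_candidate("b", unit_names_so_far=set())  # $unit::b comes after module m
m.declare("a")                                  # the later 'int a;'
m1 = elaborate(m, {"a", "b"})                   # instance with class Ok
m2 = elaborate(m, {"c", "d"})                   # instance with class Bad
```

In this model m1 resolves both names to class fields, while m2 reports
the conflicting declaration for 'a' and an unresolved identifier for
'b', matching the outcome described above.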

A similar thing can be done in classes, even in classes where the base
class is a type parameter.

This does not address the issues with bind. Bind statements are not in
the target context at parse time, so you cannot have parse time
candidates for them. The basic issue is illustrated by this example:

package P;
   int a;
endpackage

module m;
   import P::*;
endmodule

module b (input int x, y);
endmodule

int b;

module top;
    m m1();
    bind m: m1 b b1(a, b);
endmodule

Here 'a' and 'b' need to be resolved at elaboration time in module 'm'.
If this instantiation were present at the end of the module 'm' at parse
time, 'a' would resolve to P::a and cause 'a' to be wild card imported
into 'm'. I think it should still do this at elaboration time. The
identifier 'b' is a little trickier. We cannot use parse time candidates
here. The choices seem to be:

1) Let 'b' resolve to $unit::b.
2) Do not allow any bind identifiers to resolve in $unit. If the user
wants 'b' to resolve in $unit, they can write $unit::b in the bind
statement. (Note: this is a little strange. It is the $unit of the
target, not the $unit that contains the bind statement.)
3) Require tools to have a segmented view of $unit so that identifiers
can be correctly resolved.

I think either 1 or 2 is good enough, but I can live with 3 too.
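
To compare the three choices, here is a small Python sketch. It assumes
one reading of "segmented": a $unit declaration is visible only if it
appears before the end of the target module. The function name, the
policy strings and the position numbers are all hypothetical, and an
explicit $unit::b under choice 2 is not modeled:

```python
def resolve_bind_name(name, target_scope_names, unit_decls, target_end, policy):
    """Resolve a simple identifier from a bind statement at elaboration.

    unit_decls is a list of (name, position) pairs for $unit declarations,
    and target_end is the position where the target module ends.
    Returns "target scope", "$unit", or None if the name is unresolved.
    """
    if name in target_scope_names:
        return "target scope"
    if policy == "whole_unit":        # choice 1: all of $unit is visible
        visible = {n for n, _ in unit_decls}
    elif policy == "no_unit":         # choice 2: never look in $unit
        visible = set()
    elif policy == "segmented":       # choice 3: only earlier declarations
        visible = {n for n, pos in unit_decls if pos < target_end}
    else:
        raise ValueError(policy)
    return "$unit" if name in visible else None


# Mirrors the example above: 'int b;' in $unit follows module m, and 'a'
# is available in m through the wild card import of P.
unit_decls = [("b", 30)]
outcomes = {
    policy: resolve_bind_name("b", {"a"}, unit_decls, 10, policy)
    for policy in ("whole_unit", "no_unit", "segmented")
}
```

With these inputs only choice 1 resolves 'b' to $unit::b; under choices
2 and 3 the identifier is left unresolved.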

Summary

It is possible to resolve all simple identifiers at parse time as others
have proposed. To do this we need to:

1) Make backwards incompatible changes in the language.
2) Forever rule out useful features like forward references to class
members, a feature that most other languages with classes allow. 

I see no reason to do this. The Verilog language has always required
name resolution at elaboration time for task, function and hierarchical
names. The bind construct clearly requires name resolution at
elaboration for simple identifiers. I see no compelling reason for
making backwards incompatible changes to avoid a few more cases of
delayed resolution of simple identifiers.

--
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.