RE: [sv-bc] 1339: (RESEND)`define behavior on trimming leading and trailing spaces in macros

From: Alsop, Thomas R <thomas.r.alsop_at_.....>
Date: Mon Nov 19 2007 - 14:30:32 PST
Appreciate this feedback, Greg.  By responding, however, you open yourself
up to answering my questions in return, so I hope you don't mind if I
leech some learnings off you in the process of improving my smithing
nomenclature :) Let's see if I can respond to these issues and improve
this proposal.  See below. -Tom

 

>-----Original Message-----
>From: Greg Jaxon [mailto:Greg.Jaxon@synopsys.com]
>Sent: Monday, November 19, 2007 11:23 AM
>To: Alsop, Thomas R
>Cc: sv-bc@eda.org
>Subject: Re: [sv-bc] 1339: (RESEND)`define behavior on trimming leading and
>trailing spaces in macros
>
>Alsop, Thomas R wrote:
>

>> Here is the additional wording.  I took some of it from the ANSI C
>> preprocessing document that Steven pointed us to. I am not a wizard yet
>> on LRM word-smith'ing so any advice before we vote on this would be
>> welcome.
>>
>> Thanks, -Tom
>

>You need more from the ANSI C standard - specifically the whole
>step-by-step operational definition approach.  Here are some questions I
>have for your definition:
>
>   A) Is the backslash escape for newline applied before or after other
>      uses of backslash as escape (for example in quoted strings, or
>      escaped identifiers)?  If I want a quoted newline in the replacement,
>      what should I write? (see below for some alternatives)
>

[Alsop, Thomas R] I agree that the replacement should save newline
escapes. So the way I would answer this is that they be preserved during
macro replacement. For multi-line macros, the last escape character is
the continuation character.  All other escape characters are treated as
part of the replacement text.
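
The answer above can be sketched roughly as follows. This is only an illustration in Python, not LRM text: the helper name split_continuation is hypothetical, and the sketch assumes each physical line has already had its newline removed. It shows the "last escape character is the continuation character" reading applied to the first line of each of Greg's two ascii_NL examples further down.

```python
from typing import Tuple


def split_continuation(physical_line: str) -> Tuple[str, bool]:
    """Split one physical line of a `define body (its newline already removed).

    Returns (text, continues). When the line ends in a backslash, that last
    backslash is taken as the continuation character and dropped; any
    backslashes before it are left in the text untouched.
    """
    if physical_line.endswith("\\"):
        return physical_line[:-1], True
    return physical_line, False


# First physical line of each of the two ascii_NL examples:
# a double quote followed by two, then three, backslashes.
two_backslashes = '"' + "\\" * 2
three_backslashes = '"' + "\\" * 3
for line in (two_backslashes, three_backslashes):
    text, continues = split_continuation(line)
    print(repr(text), continues)
```

Under this reading both lines continue onto the next one; what the surviving backslashes then mean inside the quoted string is a separate question that the sketch deliberately does not answer.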

 

 

>   B) Is backslash-newline whitespace?  I always assumed it was, but you
>      treat it separately, why?
>

[Alsop, Thomas R] Simply because it should be preserved, while the other
whi

That's a good question with respect to this definition.  My question back
to you would be whether replacement code should contain newlines.  If I
had to visually inspect the code after the replacement then I would argue
that newlines must be preserved to make it readable. I don't know of any
tools that I am currently using which require that I look at the
replaced code, so perhaps I should be treating them the same.

 

>   C) Can the backslash-newline ligature be the terminating whitespace
>      of an escaped identifier?  If so, will the identifier end with a
>      backslash or not?
>

[Alsop, Thomas R] I see your point.  It has to be a terminating
character for escaped identifiers, along with the other white-space
characters. Backslash-newline must be considered "white-space" in this
context. And no, the identifier cannot have the backslash as part of the
identifier.
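
A minimal sketch of that answer to (C), again in Python and purely illustrative: the function name scan_escaped_identifier and the exact WHITESPACE set are assumptions, not anything the LRM defines. An escaped identifier is scanned from its leading backslash until either ordinary white-space or a backslash immediately followed by a newline, and that terminating backslash is not included in the name.

```python
# Assumed white-space set for the sketch: space, tab, newline, CR, VT, FF.
WHITESPACE = " \t\n\r\v\f"


def scan_escaped_identifier(text, start=0):
    """Scan an escaped identifier whose leading backslash is at text[start].

    Returns (name, end): name excludes the leading backslash, and end is the
    index of the character that terminated the identifier.
    """
    if text[start] != "\\":
        raise ValueError("escaped identifier must start with a backslash")
    i = start + 1
    while i < len(text):
        if text[i] in WHITESPACE:
            break
        # Backslash-newline counts as white-space here, so it terminates the
        # identifier and its backslash is not part of the name.
        if text[i] == "\\" and text[i + 1 : i + 2] == "\n":
            break
        i += 1
    return text[start + 1 : i], i


name, end = scan_escaped_identifier("\\bus+index\\\n rest")
print(name, end)
```

A backslash that is not followed by a newline is treated as an ordinary character of the identifier in this sketch, which is also an assumption.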

 

>   D) The first sentence defines "macro text" as being arbitrary stuff
>      on the same "line"; veterans who know the Unix convention of escaped
>      newlines can factor this in as just more arbitrary bytes.  But your
>      additional sentences describe the "macro replacement string", which
>      I feel is a misuse of the well-defined term "string".  I think both
>      the term "text" and "string" are misleading, and the LRM should
>      instead define the "macro_replacement formula", since it clearly
>      contains free variables.  But ultimately the trimming effort belies
>      the original definition of this text as "arbitrary".
>

[Alsop, Thomas R] Yes, I was struggling to find the right nomenclature
for all the "arbitrary" stuff that we know of as the macro.  For me
there are three things found in a macro.  The string tokens which are
all just cut and paste stuff, the argument tokens which are "search and
replace" stuff, and the other stuff (sorry if I am sounding like Britney
Spears here:), which has special replacement operations.  Like \" gets
replaced with ". All of this combined stuff I just called the macro
string.  My bad. I like "macro text".  Macro_replacement formula doesn't
encapsulate or describe that stuff for me.  Is this another term I am
just not used to?

 

>> The macro text can be any arbitrary text specified on the same line as
>> the text macro name. If more than one line is necessary to specify the
>> text, the newline shall be preceded by a backslash ( \ ). The first
>> newline not preceded by a backslash shall end the macro text. The
>> newline preceded by a backslash shall be replaced in the expanded macro
>> with a newline (but without the preceding backslash character).
>

>Which raises question (A) what tokenization happens after this
>replacement, and what backslash substitutions happen before it.  Which
>text below expands to the newline character?
>

[Alsop, Thomas R] I hear that word "tokenization" a lot, but I am not
familiar with it.  What does that mean exactly?

 

>`define ascii_NL "\\
>"
>
>or
>
>`define ascii_NL "\\\
>"
>?
>

[Alsop, Thomas R] First, the above cases do not currently compile; I am
sure you knew this already. I believe I understand where you are taking
this question.  Interestingly enough, I had to add a white-space before
the escape continuation to get it to compile. I wasn't aware that this was
a requirement. According to my interpretation of the LRM, it should
compile, as we do not require white-space before the continuation
character.

 

In both of these examples the last escape continuation character
followed by the newline expands into the newline without the
continuation character.  The first example should treat the preceding
escape character as nothing since there is nothing after it upon
replacement.  The second example replaces the two preceding escape
characters with one escape character in the replacement.

 

>In Unix conventions, a "line" is defined as arbitrary text delimited by
>unescaped newlines.
>I'd prefer to see that definition once very early in the lexical convention
>section, and then simply not deal with it, except in notes or examples to
>illustrate the concept.
>

[Alsop, Thomas R] That would certainly eliminate some confusion.  I
found that as I was reading through your reply it took me a while to
understand your "newline" usage.  My interpretation was a newline as
literally defined in the LRM as the "\n" characters when in reality it
also refers to the unseen newline as the end of a line, which I believe
you are referring to as the "unescaped" newline.

 

>Similarly "whitespace" comprises space, the horizontal and vertical tabs,
>and newline, maybe carriage return - and possibly others.  Isn't there a
>standard covering this?
>

[Alsop, Thomas R] Not that I am aware of in the LRM.  Shalom should
answer this.

 

>As to whether the committee should back down from "arbitrary" text to
>"trimmed" text, I would personally recommend trimming /leading/ whitespace,
>but NOT /trailing/ whitespace.
>The first is done in the interest of free vertical alignment, to make
>   `define A 1
>equal
>   `define \A 1
>, and to prevent any macro from expanding to mere whitespace.
>

[Alsop, Thomas R] You lost me on this.  What is the second example you
have?  How would you use the "\A" macro?

 

>The second is done to finesse this whole complication about escaped
>identifiers.
>
>> Any white-space characters preceding or following the macro replacement
>> string are not considered part of the replacement. Additionally, for
>> multi-line macros any trailing white-space between the last token on a
>> line and the newline before a backslash is not considered part of the
>> replacement.
>
>That "Additionally" clause is probably just a note on your original
>definition, not an actual addition to the rules.  I oppose the trimming of
>trailing whitespace.  However, I don't vote, so don't fret about it if you
>can get consensus otherwise.
>

[Alsop, Thomas R] No, not a note, just not well written.  What do you
think about this instead?

 

The macro text can be any arbitrary text specified on the same line as
the text macro name. If more than one line is necessary to specify the
text, the newline shall be preceded by a backslash ( \ ). The first
newline shall end the macro text. The newline preceded by a backslash
shall be replaced in the expanded macro with a newline (but without the
preceding backslash character).  Any white-space characters preceding or
following the macro replacement text are not considered part of the
replacement. For multi-line macros any leading white-space and any
trailing white-space preceding an escape continuation character will be
removed.
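
To make the revised wording concrete, here is a hedged Python sketch of the trimming it describes. The function name macro_replacement_text and the WS set are hypothetical. The sketch assumes the caller passes only the macro body, already cut at the first newline not preceded by a backslash, and it reads "leading white-space" as applying to the replacement as a whole rather than to each continuation line. That reading is an assumption, since the wording leaves it open.

```python
# Assumed white-space set for the sketch.
WS = " \t\n"


def macro_replacement_text(body):
    """Apply the proposed trimming to a raw `define body.

    body is everything after the macro name, up to but not including the
    first newline that is not preceded by a backslash.
    """
    # Each backslash-newline is a continuation; split the body at those points.
    pieces = body.split("\\\n")
    # Drop white-space that sits just before each continuation character.
    pieces = [p.rstrip(WS) for p in pieces[:-1]] + pieces[-1:]
    # The continuation becomes a plain newline in the expanded text.
    text = "\n".join(pieces)
    # White-space before or after the whole replacement is not part of it.
    return text.strip(WS)


print(repr(macro_replacement_text("   a +   \\\n   b   ")))
print(repr(macro_replacement_text("  1  ")))
```

Indentation on the continuation line survives in this version; stripping it as well would be a one-line change if the committee reads "leading" per line.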

 

>Greg


Received on Mon Nov 19 14:31:20 2007

This archive was generated by hypermail 2.1.8 : Mon Nov 19 2007 - 14:31:43 PST