February 11, 2009
Derek Parnell wrote:
> On Tue, 10 Feb 2009 14:01:29 -0800, Walter Bright wrote:
> 
>> Derek Parnell wrote:
> 
>>>  ... it creates a new problem; code
>>> duplication.
>> I don't think that duplicating a small run of code is a problem.
> 
> 
> My apology. The problem is more than run-time performance issues. The more
> pressing problem, IMHO, is the one of maintenance costs. Duplicated code is
> a significant cost burden. Not only may each duplication require updating,
> but there is extra effort in analyzing every duplicate to see if it *does*
> need updating. And every act of updating increases the opportunity for
> introducing bugs.

Let's take an example, like the enum for O_XXXX in std.c.linux.linux. Some of those values are common between platforms, and some are unique to particular platforms. So you might be tempted to write:

enum
{
    O_RDONLY = 0,
    O_WRONLY = 1,
    O_RDWR = 2,
    O_CREAT = 0100,
version(OSX) O_SYMLINK   = 0x200000,
}

instead of:

version (linux)
{
  enum
  {
    O_RDONLY = 0,
    O_WRONLY = 1,
    O_RDWR = 2,
    O_CREAT = 0100,
  }
}
else version (OSX)
{
  enum
  {
    O_RDONLY = 0,
    O_WRONLY = 1,
    O_RDWR = 2,
    O_CREAT = 0100,
    O_SYMLINK   = 0x200000,
  }
}
else
{
    static assert(0); // need platform support
}

The first version is definitely shorter. But is it easier to maintain? I argue that it is *harder* and *buggier* to maintain. Let's say I am using linux and I need to add O_APPEND to it. I just stuff it in like:

enum
{
    O_RDONLY = 0,
    O_WRONLY = 1,
    O_RDWR = 2,
    O_CREAT = 0100,
    O_APPEND = 02000,
version(OSX) O_SYMLINK   = 0x200000,
}

and it works great for linux. Now I port my code to OSX, and it mysteriously dies. After much frustration, I discover that O_APPEND for OSX is 8, not 02000. (Yes, this kind of thing has happened to me. I only found it by suspecting the problem, and then going through every fscking definition comparing it to the one in the system .h file. Yeech.)

Now let's try the other way. I add O_APPEND to the linux branch, and linux works great. Now I port to OSX, and the compiler dies with "O_APPEND is undefined". I know immediately exactly what is wrong, look up the .h file, and insert the right O_APPEND into the OSX version of the declaration.

Furthermore, when I build a FreeBSD version, the compiler bings at me for every declaration that needs some porting attention, instead of silently using the wrong values.

This would be even better if the OSX and linux declarations were split into separate "personality" modules. That way you can develop happily on OSX without fear of accidentally breaking linux support. You can defer dealing with the linux version until you are actually on the linux machine and in an efficient position to take care of it.
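
A minimal sketch of that idea, squeezed into one file so it can be compiled on its own. The struct names LinuxFcntl and OSXFcntl and the alias Fcntl are hypothetical stand-ins for separate per-platform modules; the values are only the ones quoted above, and the __traits check assumes a D2 compiler.

```d
// Sketch only: in a real tree each struct would be its own module
// (for example fcntl_linux.d and fcntl_osx.d, both hypothetical names).
// Hex replaces the octal literals used above so the block also builds
// with current D2 compilers: 02000 octal == 0x400.

struct LinuxFcntl
{
    enum
    {
        O_RDONLY = 0,
        O_WRONLY = 1,
        O_RDWR   = 2,
        O_APPEND = 0x400,     // 02000 on linux
    }
}

struct OSXFcntl
{
    enum
    {
        O_RDONLY  = 0,
        O_WRONLY  = 1,
        O_RDWR    = 2,
        O_APPEND  = 8,        // not the linux value
        O_SYMLINK = 0x200000, // OSX only
    }
}

// Pick exactly one personality; an unknown platform fails loudly.
version (linux)    alias LinuxFcntl Fcntl;
else version (OSX) alias OSXFcntl Fcntl;
else static assert(0, "need platform support");

// A name missing from one personality is a compile-time error there,
// not a silently wrong value.
static assert(!__traits(compiles, LinuxFcntl.O_SYMLINK));
static assert(LinuxFcntl.O_APPEND != OSXFcntl.O_APPEND);
```

Client code would refer to Fcntl.O_APPEND and get the value for the platform being built, or a compile error if that platform's declarations have not been written yet.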


> Coders need languages that help them do their job, and one way to help is
> to reduce the need for duplicated code.

I know there's a risk in getting in the way of programmers who want to do things a particular way. To do it, I have to be pretty convinced that there is a better way. The C preprocessor is like crack: everyone knows it's bad, but they snort it anyway because it feels so good <g>. C++ was supposed to get a bunch of features that obviate the preprocessor, but have a look at the premier C++ library - Boost - which uses the preprocessor heavily. Boost also uses every last iota of what the preprocessor can do, because if your cpp implementation is not 100%, most of Boost will not compile (I know this from experience; DMC++ is 100% now).


>>> Duplicating (nearly all of a) source file is NOT, repeat NOT, a
>>> satisfactory solution.
>> I'm not insensitive to this as I do it myself in maintaining Phobos. It is a problem, but not a huge one. I find that the meld utility (on linux) makes this chore a snap.
> 
> Because of D's limited support for text macros, I am using third party
> tools to get me out of this problem too. 

meld is particularly nice. Andrei showed it to me:

http://www.linux.com/feature/61372
February 11, 2009
Jason House wrote:
> Would you be willing to introduce an alternative to /+ +/ which would be treated differently by the D1 and D2 compilers? Here are some examples with no attempt at creativity:
> beginD1 endD1
> D1 D1 (works like string delimiters)
> /D2 D2/

That is an interesting idea. I never thought of that.
February 11, 2009
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
...
> Let's take an example, like the enum for O_XXXX in std.c.linux.linux. Some of those values are common between platforms, and some are unique to particular platforms. So you might be tempted to write:
[snip]
> The first version is definitely shorter. But is it easier to maintain? I argue that it is *harder* and *buggier* to maintain.

This is definitely true of header modules.  There's absolutely no way I could manage the Posix headers in Tango/druntime with the non-duplicating approach.

> This would be even better if the OSX and linux declarations were split into separate "personality" modules. That way you can develop happily on OSX without fear of accidentally breaking linux support. You can defer dealing with the linux version until you are actually on the linux machine and in an efficient position to take care of it.

With version blocks, working with one version shouldn't break another version anyway, unless I'm misunderstanding your point.


Sean
February 11, 2009
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Derek Parnell wrote:
> > Because of D's limited support for text macros, I am using third party tools to get me out of this problem too.
> meld is particularly nice. Andrei showed it to me: http://www.linux.com/feature/61372

I suggest UltraCompare, if you're a Windows user.


Sean
February 11, 2009
"Walter Bright" <newshound1@digitalmars.com> wrote in message news:gmt1s1$i7f$1@digitalmars.com...
> Nick Sabalausky wrote:
>> This strikes me as throwing away the baby with the bathwater. If your code starts degenerating towards a versioning rat's nest, then the solution is to take a moment and refactor it into a larger granularity, not to throw away features that are useful in moderation.
>
> True, but that never, ever happens. It's always "I can get my few additions to this rat's nest in and just get it working, and worry about proper abstraction later."
>

It seems to me that D's versioning, conditional compilation, and overall feature set are already different enough that people aren't necessarily going to be falling into the "C preprocessor style" rut. Plus, as someone else mentioned, there *are* still concerns about code duplication. So it really becomes a balancing act between clear abstraction and DRY. D's version() just makes this balancing act harder because it keeps pushing in the one direction.

I'll grant that if there's a bad habit that most coders are doing (such as messy versioning), then it's certainly worthwhile to create a design that prevents it. But the more I think about it, the more convinced I become that most of version()'s restrictions are just red herrings. I really think you're attacking the wrong thing (not that I have any idea where the ideal place to attack would be). After all, Denis and I have both demonstrated that D's version() is just as susceptible to mess as C's #if/#ifdef versioning. If people are going to make a version() mess, they're going to do it. Things like !, ||, && and expression-level versions are just drops in the pond, they would allow certain things to be cleaned up, but they're not going to break the dam any more than it already is, and they would even make a few things better.

It would be great to have a way to eliminate messy versioning, but things such as prohibiting !, ||, && and expression-versions are doing very little to accomplish that. Modeling the version() syntax after the conditional syntax, and giving it less-than-"BEGIN"->"{"-level of power is already accomplishing far more in that regard. The benefits of preventing typos in version identifiers would also dwarf any benefits that might be gained from prohibiting !, ||, && and expression-versions.
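
To make the typo point concrete, here is a small sketch, assuming a D2 compiler. The names WithLoging and withLogging are made up for illustration. A misspelled version identifier is simply treated as unset, whereas a misspelled ordinary constant is rejected by the compiler.

```d
// "WithLoging" is a deliberate misspelling. version() does not complain;
// it just takes the else branch because the identifier is not set.
version (WithLoging) enum bool viaVersion = true;
else                 enum bool viaVersion = false;

// The same switch as an ordinary (hypothetical) compile-time constant.
enum bool withLogging = true;

static if (withLogging) enum bool viaStaticIf = true;
else                    enum bool viaStaticIf = false;

// Writing "static if (withLoging)" would not compile at all, because
// the misspelled name is an undefined identifier:
static assert(!__traits(compiles, withLoging));

static assert(!viaVersion);  // the typo silently disabled the feature
static assert(viaStaticIf);
```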


February 11, 2009
"Sean Kelly" <sean@invisibleduck.org> wrote in message news:gmt9pl$11mt$1@digitalmars.com...
> == Quote from Walter Bright (newshound1@digitalmars.com)'s article
>> Derek Parnell wrote:
>> > Because of D's limited support for text macros, I am using third party tools to get me out of this problem too.
>> meld is particularly nice. Andrei showed it to me: http://www.linux.com/feature/61372
>
> I suggest UltraCompare, if you're a Windows user.
>

I've been a big fan of Beyond Compare for a long time now. I'll have to take a look at UltraCompare.


February 11, 2009
Nick Sabalausky wrote:
> "Walter Bright" <newshound1@digitalmars.com> wrote in message news:gmt6l0$rff$1@digitalmars.com...
>> Denis Koroskin wrote:
>>> Does it look any better? No way!
>> Of course doing it that way doesn't look any better, because it still just replicates the C preprocessor style of doing it.
>>
> 
> Which just goes to show that the restrictions you've placed on D's version() (in order to eliminate rat's nest versioning) DON'T eliminate rat's nest versioning.

But they do make it more painful to write the rat's nest, which can be motivating to find a more appropriate solution.


>> A far better solution...
> 
> And we can come up with better solutions for C as well. Granted, the optimal D solution is going to be much better than the optimal C solution, but it won't be due to version()'s lack of !, ||, &&, etc...

When cookies and veggies are laid out on the buffet, I tend to reach for the cookies <g>.
February 11, 2009
Nick Sabalausky wrote:
> The point is, the current semantics for D's version() are *plenty* susceptible to most of the same versioning mess as C's #if/#ifdef, and in some cases (such as ||), even worse. With either style, the solution is exactly the same as any other chunk of messy code: Clean it up! Not only is gimping the version-control mechanism the wrong solution, it doesn't even solve the problem anyway.

I'll argue that I've never seen anyone create such a mess in D, while I see it regularly in C. So something about D is discouraging developing those things.

I think the tipping point is that it's too easy in C to slip into writing such a mess without actually trying to, while in D you have to work harder to do it. Hard enough that one might as well do it better in the first place.
February 11, 2009
"Walter Bright" wrote
> Denis Koroskin wrote:
>> Does it look any better? No way!
>
> Of course doing it that way doesn't look any better, because it still just replicates the C preprocessor style of doing it.
>
> A far better solution is to create a series of modules:
>
> gcnetbsd.d
> gchurd.d
> gcsunos5.d
> ...
>
> and inside each one put the specifics for that particular system. The huge advantage of this is that if I want to create a BrightBSD operating system, I just have to write a:
>
> gcbrightbsd.d

All you have done is split the mess into separate files.  This does not solve the problem.

> rather than trying to carefully fold it into that conditional compilation mess without inadvertently breaking other platform support. (And I cannot even tell if I broke the SunOS5 platform support or not, because I don't have a SunOS5 platform to test it on.).

But you have, because inadvertently, you changed some code in the actual implementation to use the new identifiers you made in your special new file. Now you have to go back and rethink the sunos include because you broke it. Mess still exists.  (of course, I have no idea, but I gave you as much of an example/proof as you did ;)

>> The story is not about different functionality on different platforms but
>> rather about a common code which is 98% the same on all the platforms and
>> is different in *small* details.
>> For example, I'd like to make my library D1 and D2 compatible. Do you
>> suggest me to maintain 2 different libraries? This is ridiculous, and
>> that's why there is no Tango2 release yet - there is *no* point in
>> supporting such a large library as Tango (or DWT) for two language
>> versions without a sane versioning mechanism.
>
> There are two very different things going on here. One is accounting for differences in the *language*, the other is about generating different builds based on language independent different desired features and platform characteristics.

With that I would agree, and it turns out your example suffers from the same issues, i.e. #if __STDC__

-Steve


February 11, 2009
On Tue, 10 Feb 2009 17:16:22 -0800, Walter Bright wrote:

> Let's take an example, like the enum for O_XXXX in std.c.linux.linux. Some of those values are common between platforms, and some are unique to particular platforms...

Yes, the duplicated code is a better approach here because the apparent common code is the key in this example. The lines that are the same are only coincidentally common, and not common due to any innate factors of the operating system or data being declared.

> This would be even better if the OSX and linux declarations were split into separate "personality" modules.

Agreed.


The example I offer here is more the sort of thing that it seems you are also finding when maintaining Phobos ...

Currently my code looks like this ...
//--------------- Using version() w/o duplicating code --------

//-------------------------------------------------------
int RunCommand(string pExeName, string pCommand,
               runCallBack pCallBack=null)
//-------------------------------------------------------
{

    version(Windows)
    {
        if (std.path.getExt(pExeName).length == 0)
            pExeName ~= ".exe";
    }

    if (util.pathex.IsRelativePath(pExeName) == True)
    {
        string lExePath;
        lExePath = util.pathex.FindFileInPathList("PATH", pExeName);
        if (util.str.ends(lExePath, std.path.sep) == False)
            lExePath ~= std.path.sep;

        version(Windows) pExeName = util.pathex.CanonicalPath(lExePath ~
                                    pExeName, False, False);
        version(Posix)   pExeName = util.pathex.CanonicalPath(lExePath ~
                                    pExeName, False, True);
    }

    if (util.file2.FileExists(pExeName) == false)
    {
        throw new FileExException(std.string.format(
                 "Cannot find application '%s' to run", pExeName));
    }
    return RunCommand( util.str.enquote(pExeName) ~ " " ~ pCommand,
                           pCallBack);
}
//-------------------------------------------------------
int RunCommand(string pCommand, runCallBack pCallBack = null)
//-------------------------------------------------------
{
    int lRC;

    if (pCallBack !is null)
    {
      lRC = pCallBack(1, pCommand, 0);
      if (lRC != 0)
          return 0;
    }

    lRC = system(std.string.toStringz(pCommand));
    version(Posix) lRC = ((lRC & 0xFF00) >> 8);

    if (pCallBack !is null)
        pCallBack(2, pCommand, lRC);

    return lRC;
}


But if I split the opsys versioning out I get ...


//--------------- Using version() with duplicating code --------
alias int function(int, string, int) runCallBack;
version(Windows)
{
	//-------------------------------------------------------
	int RunCommand(string pExeName, string pCommand,
                       runCallBack pCallBack=null)
	//-------------------------------------------------------
	{

	    if (std.path.getExt(pExeName).length == 0)
	        pExeName ~= ".exe";

	    if (util.pathex.IsRelativePath(pExeName) == True)
	    {
	        string lExePath;
	        lExePath = util.pathex.FindFileInPathList("PATH", pExeName);
	        if (util.str.ends(lExePath, std.path.sep) == False)
	            lExePath ~= std.path.sep;

	        pExeName = util.pathex.CanonicalPath(lExePath ~ pExeName, False,
                                                     False);
	    }

	    if (util.file2.FileExists(pExeName) == false)
	    {
	        throw new FileExException(std.string.format(
                     "Cannot find application '%s' to run", pExeName));
	    }
	    return RunCommand( util.str.enquote(pExeName) ~ " " ~ pCommand,
                                pCallBack);
	}
	//-------------------------------------------------------
	int RunCommand(string pCommand, runCallBack pCallBack = null)
	//-------------------------------------------------------
	{
	    int lRC;

	    if (pCallBack !is null)
	    {
	      lRC = pCallBack(1, pCommand, 0);
	      if (lRC != 0)
	          return 0;
	    }

	    lRC = system(std.string.toStringz(pCommand));

	    if (pCallBack !is null)
	        pCallBack(2, pCommand, lRC);
	    return lRC;

	}

}

version(Posix)
{
	//-------------------------------------------------------
	int RunCommand(string pExeName, string pCommand,
                       runCallBack pCallBack=null)
	//-------------------------------------------------------
	{

	    if (util.pathex.IsRelativePath(pExeName) == True)
	    {
	        string lExePath;
	        lExePath = util.pathex.FindFileInPathList("PATH", pExeName);
	        if (util.str.ends(lExePath, std.path.sep) == False)
	            lExePath ~= std.path.sep;

	        pExeName = util.pathex.CanonicalPath(lExePath ~ pExeName, False,
                                                     True);
	    }

	    if (util.file2.FileExists(pExeName) == false)
	    {
	        throw new FileExException(std.string.format(
                       "Cannot find application '%s' to run", pExeName));
	    }
	    return RunCommand( util.str.enquote(pExeName) ~ " " ~ pCommand,
                               pCallBack);
	}
	//-------------------------------------------------------
	int RunCommand(string pCommand, runCallBack pCallBack = null)
	//-------------------------------------------------------
	{
	    int lRC;

	    if (pCallBack !is null)
	    {
	      lRC = pCallBack(1, pCommand, 0);
	      if (lRC != 0)
	          return 0;
	    }

	    lRC = system(std.string.toStringz(pCommand));
	    lRC = ((lRC & 0xFF00) >> 8);

	    if (pCallBack !is null)
	        pCallBack(2, pCommand, lRC);
	    return lRC;

	}

}



Now we have almost twice the code and whenever an enhancement to the RunCommand function is needed, I must examine and correctly update twice as many lines now, because *most*, but not quite all, the logic is identical between both operating systems. This is the sort of scenario that will cause bugs to be introduced.

The fine line that divides when to duplicate from when not to duplicate is hard to see clearly. I tend to favour the less duplication approach, but only when it leads to lower maintenance costs.
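
One possible middle path, sketched under assumptions rather than taken from my real code: keep the shared logic in a single function and push each platform difference into a tiny helper that is defined once per platform. The helper name exitCodeOf is hypothetical, and the imports assume a current D2 Phobos/druntime. The same pattern could cover the ".exe" suffix and the last argument to CanonicalPath.

```d
import core.stdc.stdlib : system;
import std.string : toStringz;

alias int function(int, string, int) runCallBack;

// Platform-specific detail, isolated in one small helper per platform.
version (Windows)
{
    // The value returned by system() is used as-is.
    int exitCodeOf(int rawStatus) { return rawStatus; }
}
else version (Posix)
{
    // The exit code sits in the second byte of the returned status.
    int exitCodeOf(int rawStatus) { return (rawStatus & 0xFF00) >> 8; }
}
else
{
    static assert(0, "need platform support");
}

// Shared logic, written once; no version() inside the body.
int RunCommand(string pCommand, runCallBack pCallBack = null)
{
    if (pCallBack !is null)
    {
        // A non-zero result from the "before" callback cancels the run.
        if (pCallBack(1, pCommand, 0) != 0)
            return 0;
    }

    int lRC = exitCodeOf(system(toStringz(pCommand)));

    if (pCallBack !is null)
        pCallBack(2, pCommand, lRC);

    return lRC;
}
```

An enhancement to RunCommand then touches one body, and a new platform has to supply exitCodeOf before anything compiles.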

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell