December 23, 2011
On Thursday, December 22, 2011 12:59:15 Stewart Gordon wrote:
> On 22/12/2011 03:41, Jonathan M Davis wrote:
> <snip>
> 
> > http://jmdavis.github.com/d-programming-language.org/std_datetime.html
> 
> Have I got all this right?
> 
> - a flag goes from % up to another %, a {, a whitespace or punctuation
>   character
> - flags with [...] portions are exceptions to this rule, extending to the
>   closing ]
> - %{ or %} is a literal { or }
> - all literal characters that aren't classed as whitespace or punctuation
>   must be enclosed in {...}
> - {%} is just a literal %
> - in %C2* and %Cn* flags, C must be either _ (space) or 0

Yes. I believe that that's correct.

> To look at the detail:
> 
> - In %nY, does "up to n" mean that it truncates years longer than that to that many characters?  Does it strip any leading 0s that result from this truncation?  (Under %3Y, does 2011 become 11 or 011?)

It becomes 11. There's no filler character. For instance, the example section gives

assert(SysTime(Date(8, 7, 4)).toCustomString!"%2Y"() == "8");

But I guess that I need to find a clearer way to state the flag's definition.
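As a rough sketch of the "up to n" behaviour described above (keep at most the last n digits, with no filler character), something like the following would give the same results. The helper name yearUpTo is purely hypothetical and is not part of std.datetime; it also assumes a non-negative year and a small n.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper, NOT part of std.datetime.
// Keeps at most the last `n` digits of a non-negative year. Because the
// result is produced from a number, no leading zeros (filler) survive.
string yearUpTo(uint year, uint n)
{
    immutable modulus = 10UL ^^ n; // 10 to the power n; assumes n is small
    return to!string(year % modulus);
}

void main()
{
    writeln(yearUpTo(2011, 3)); // prints "11", not "011"
    writeln(yearUpTo(8, 2));    // prints "8"
}
```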

> - But the approach to formatting the year is nicely systematic.  (I've thought of possibly adding to my scheme a means of formatting dates to arbitrary length with sign, in order to support the ISO format for years that may be outside the 1BC-9999 range.)

I tried to be extremely systematic about all of the flags. As a result, I believe that the system is very consistent with itself.

> - If you're going to have the ISO week number in the system, it seems to me you should also have the week-numbering year.  (I've thought about possibly adding these to my scheme.)

%isoweek and %C2isoweek

> - In %F, What does "as many digits as necessary" mean?  In particular, why in the example code does it give 12/100000 and not 3/25000 despite the latter being shorter?

Obviously, that needs to be clearer. The denominator is always a multiple of 10. It's what the mpeg-7 standard uses, which is why it's there.
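One possible reading, and it is only an assumption on my part here, is that the denominator is restricted to powers of ten, so the fraction is shortened only by dropping trailing zeros from both sides. The sketch below uses a hypothetical helper (fracPow10, not a std.datetime function) that takes the fractional second in hnsecs (100 ns units) and shows how 12/100000 would come out rather than 3/25000. What to print for a zero fraction is an arbitrary choice in this sketch.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper, NOT part of std.datetime.
// `hnsecs` is the fractional part of a second in 100 ns units
// (0 .. 9_999_999). The denominator is only ever divided by 10,
// so it always stays a power of ten.
string fracPow10(uint hnsecs)
{
    ulong num = hnsecs;
    ulong den = 10_000_000; // hnsecs per second
    while (num != 0 && num % 10 == 0 && den > 1)
    {
        num /= 10;
        den /= 10;
    }
    return format("%s/%s", num, den);
}

void main()
{
    writeln(fracPow10(1200)); // 0.00012 s, prints "12/100000"
}
```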

> - Can only literal stuff be included within %BC[...] and %AD[...]?

That was the idea. I don't think that it needs to be any fancier than that. You could just use %cond or %func otherwise. That should probably be clearer though.

> - %cond and %func - I'm made to wonder to what extent this would be used and to what extent it would just be simpler to use if / ?: / ~.  Anyway, I assume that if there's more than one, they will reference the template arguments in order.  Can %cond contain other %*[...] flags?  Indeed, can %cond's be nested to arbitrary depth?

In theory, %cond is supposed to be arbitrarily nestable, though I question that it would generally be a good idea to do so. And yes, as the text underneath that section of flags mentions, functions are associated with flags the same way that arguments to format or writefln would be.

> - Does %localeDate give the short or the long date format?  (This and %localeTime are handled by separate functions in my library - I didn't think there was any real use case for including such a thing within a longer formatted date/time string.)

I have no idea. It's strftime which has that functionality (%x and %X). The idea is that std.datetime would parse that to determine the correct format, but I don't know how well it will work, since I haven't written it yet. They're flags that are only there because I was trying to make it possible to use toCustomString to do everything that strftime can do. It does occur to me though that I don't have a way to do %c (which would be both the date and time in the preferred format). I guess that that would be %localeDateTime.

But these _are_ flags that might have to get the axe if I can't do what I think that I'm going to be able to do to figure out what strftime is doing and use the same format.

> On the whole, it seems a powerful system.  The format strings can get quite obfuscated, but at least the flags that are likely to be commonly used aren't too bad.

The idea at least is that the common use cases are fairly easy but that it's also powerful enough to handle almost any format without much difficulty. Some of the standard formats _do_ get a bit convoluted however.

> It seems strange to require explicit notations both for flags and for alphanumeric literals.  But I suppose it makes the literals easier to read than having to follow where all those % signs are.

It makes the parsing easier (particularly with multi-character flags), and makes it somewhat easier to read IMHO, since it does clearly separate out the portions that _aren't_ flags. I'm not sure that I would have thought of that on my own however. It seemed like a good idea from your scheme.

> One thing I noticed doesn't appear in your scheme is ordinal suffix.  OK, so arbitrary alignment fields and collapsible portions aren't to be seen either, but those are quite rare in format string schemes anyway.

I was debating that. A flag for it could certainly be added. I don't have a way to deal with it in a locale-specific manner though, unfortunately. It's also a bit nasty, since it has to be tied to a specific number. I think that it only really makes sense with the days, however. So, at that point, you either create alternate versions of some of the day flags that have a th suffix or something similar, or you create a separate day flag which only does the ordinal suffix (similar to %yplus). I may add such a flag.

Regardless, some of the specific flags are definitely up for debate (e.g. several of them are only there because strftime has something similar), and if there are flags that are likely to be generally useful which I'm missing, they may be worth adding.

- Jonathan M Davis
December 23, 2011
On Thursday, December 22, 2011 11:20:11 Stewart Gordon wrote:
> On 22/12/2011 03:41, Jonathan M Davis wrote:
> <snip>
> 
> > Stewart Gordon has a library that takes a different approach ( http://pr.stewartsplace.org.uk/d/sutil/datetime_format.html ). It does away with % flags and uses maximal munch, with each of the flags being named such that they don't overlap in a way that would make certain combinations of flags impossible.
> 
> If you mean such things as writing a datum twice consecutively in two different formats, it can be done using an empty literal.  For example, "Mmm''m" would today generate "Dec12".  Not that I can see any use for such a format, just showing that it can be done.

I mean that you have to be way more careful about how you name the flags. For instance, if you have

MMM

and

Month

you have issues with stuff like MMMonth. It can definitely work, but the more flags that you have, the more problematic it becomes. It's also easier to separate out consecutive flags when reading them if you have %.
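To make the MMMonth problem concrete, here is a minimal maximal-munch tokenizer sketch. It is not taken from either library. MMM and Month are the flags from the example above, while MM is a hypothetical third flag added only so that there is an "intended" split (MM followed by Month) which the greedy matcher cannot produce. ASCII input is assumed.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

// Greedy ("maximal munch") tokenizer sketch: at each position take the
// longest flag that matches; if none matches, take one character as a
// literal. Assumes ASCII input.
string[] munch(string fmt, string[] flags)
{
    string[] tokens;
    while (fmt.length)
    {
        string best;
        foreach (f; flags)
            if (fmt.startsWith(f) && f.length > best.length)
                best = f;
        if (best.length == 0)
            best = fmt[0 .. 1]; // no flag matched: single literal character
        tokens ~= best;
        fmt = fmt[best.length .. $];
    }
    return tokens;
}

void main()
{
    // "MM" is a hypothetical flag used only for this illustration.
    writeln(munch("MMMonth", ["MM", "MMM", "Month"]));
    // prints ["MMM", "o", "n", "t", "h"], not ["MM", "Month"]
}
```

With % in front of each flag, the boundary between the two flags would be explicit and the question of which one wins would not arise.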

> > It's an interesting approach, but it isn't as
> > flexible as it could be because of its use of maximal munch instead of %
> > flags.
> How do you mean?

It's harder to have modifiers for flags without the %. For instance, what I'm doing with filler characters would be much more difficult with your scheme. With % delineating the flags, it becomes easier to handle that sort of thing.

> Looks complicated compared to mine at first sight.  Maybe I just need to spend a bit of time looking at it in more detail.

I don't think that it's all that complicated ultimately, but it definitely _looks_ complicated to begin with. I _tried_ to do the documentation in a way that made it less daunting, but I'm not sure that I succeeded. Actually, it's a bit like std.datetime in general in that sense. It really isn't all that complicated to use, but there's a lot of it, so it _looks_ complicated and therefore is more daunting than it should be.

I probably shouldn't use the standard formats as the first examples. I was trying to show examples which corresponded with known formats so that you could see how they're done, but those formats have to be very precise in how they're laid out, so they have a complicated set of flags. Doing simpler stuff like

"%M/%D/%Y" as the primary examples would probably be better.

- Jonathan M Davis
December 23, 2011
On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
> Okay. Assuming that I'm going to try and make TimeZone opaque within SysTime,
> does that require a pointer rather than a reference? And I assume then that
> the time zone stuff would need to be in a separate module than SysTime. That
> being the case, how would SysTime be able to use the time zone without
> importing that module? Does the C++ solution of forward declaring it like
>
> class TimeZone;
>
> work in D?

It'll still put a reference to TimeZone in the ModuleInfo.

I suggest:

    void* tz;

The functions that don't need it, just ignore it. The functions that do need TimeZone, do:

    class TimeZone { void foo() { ... } }

    if (!tz)
        tz = initTimeZone();
    auto t = cast(TimeZone)tz;
    t.foo();  // call members of TimeZone

Put the functions that do need TimeZone in a separate module from the ones that don't.
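Filled out into something that compiles on its own, the idea might look roughly like this. It is only a single-file sketch: Stamp, raw, zoneName and the body of TimeZone are made-up names for illustration, and in a real layout the parts that touch TimeZone would sit in a separate module, which one file cannot show.

```d
import std.stdio : writeln;

// Stand-in for the real class; in practice it would live in its own module.
class TimeZone
{
    string name() { return "UTC"; }
}

// Hypothetical initializer, matching the call in the snippet above.
TimeZone initTimeZone() { return new TimeZone; }

// Hypothetical time type holding the time zone as an opaque pointer.
struct Stamp
{
    long stdTime;
    void* tz; // opaque: the type TimeZone does not appear in the layout

    // Does not need the time zone, so it never touches tz.
    long raw() { return stdTime; }

    // Needs the time zone: create it lazily, then cast back.
    string zoneName()
    {
        if (!tz)
            tz = cast(void*) initTimeZone();
        auto t = cast(TimeZone) tz;
        return t.name();
    }
}

void main()
{
    Stamp s;
    writeln(s.raw());      // prints 0
    writeln(s.zoneName()); // prints UTC
}
```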
December 23, 2011
On 12/22/2011 7:24 PM, Jonathan M Davis wrote:
> So, it harms usability IMHO to be using something like UTCTime instead of
> SysTime, and just to save yourself the cost of the reference for the time
> zone?

No, it's not the cost of the reference. It's the cost of pulling in all the code to deal with that reference.
December 23, 2011
On Thursday, December 22, 2011 21:32:17 Walter Bright wrote:
> On 12/22/2011 7:24 PM, Jonathan M Davis wrote:
> > So, it harms usability IMHO to be using something like UTCTime instead of
> > SysTime, and just to save yourself the cost of the reference for the
> > time
> > zone?
> 
> No, it's not the cost of the reference. It's the cost of pulling in all the code to deal with that reference.

Well, I still dispute that that's a big deal, but regardless, if the issue can be solved with PIMPL (as ugly as it may be to do so), then at least the difference should be hidden from the programmer instead of affecting the API.

- Jonathan M Davis
December 23, 2011
On Thursday, December 22, 2011 21:30:46 Walter Bright wrote:
> On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
> > Okay. Assuming that I'm going to try and make TimeZone opaque within SysTime, does that require a pointer rather than a reference? And I assume then that the time zone stuff would need to be in a separate module than SysTime. That being the case, how would SysTime be able to use the time zone without importing that module? Does the C++ solution of forward declaring it like
> > 
> > class TimeZone;
> > 
> > work in D?
> 
> It'll still put a reference to TimeZone in the ModuleInfo.

Will that still happen if the TimeZone is used in templated functions? SysTime has several functions that use TimeZone explicitly - e.g. the timezone property. It needs to be able to take and return a TimeZone. However, it _could_ be templatized with an empty template parameter list. Would that avoid pulling in the information on TimeZone if those functions aren't instantiated? Or would it still pull it in?

- Jonathan M Davis
December 23, 2011
On 12/22/2011 11:12 PM, Jonathan M Davis wrote:
> On Thursday, December 22, 2011 21:30:46 Walter Bright wrote:
>> On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
>>> Okay. Assuming that I'm going to try and make TimeZone opaque within
>>> SysTime, does that require a pointer rather than a reference? And I
>>> assume then that the time zone stuff would need to be in a separate
>>> module than SysTime. That being the case, how would SysTime be able to
>>> use the time zone without importing that module? Does the C++ solution
>>> of forward declaring it like
>>>
>>> class TimeZone;
>>>
>>> work in D?
>>
>> It'll still put a reference to TimeZone in the ModuleInfo.
>
> Will that still happen if the TimeZone is used in templated functions? SysTime
> has several functions that use TimeZone explicitly - e.g. the timezone
> property. It needs to be able to take and return a TimeZone. However, it
> _could_ be templatized with an empty template parameter list. Would that avoid
> pulling in the information on TimeZone if those functions aren't instantiated?
> Or would it still pull it in?
>
> - Jonathan M Davis

Templates, after instantiation, are exactly like their non-templated equivalents. Before instantiation, they are not even semantically analyzed.
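A small sketch of the second point: the function below has an empty template parameter list and refers to a type that is deliberately declared nowhere (NoSuchType is a made-up name). The file should still compile as long as nothing instantiates the template, since the body is only parsed.

```d
import std.stdio : writeln;

// Function template with an empty parameter list. NoSuchType is
// deliberately undeclared; the body is parsed but not semantically
// analyzed until someone instantiates the template.
void needsTimeZone()()
{
    NoSuchType tz;
    tz.foo();
}

// Instantiating it is what fails, which __traits(compiles) can observe
// without breaking the build.
static assert(!__traits(compiles, needsTimeZone()));

void main()
{
    writeln("built without instantiating needsTimeZone");
    // needsTimeZone(); // uncommenting this line makes compilation fail
}
```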
December 23, 2011
On Friday, December 23, 2011 01:41:43 Walter Bright wrote:
> Templates, after instantiation, are exactly like their non-templated equivalents. Before instantiation, they are not even semantically analyzed.

That's more or less what I figured, but then again, I never would have expected having a class would be such a big deal in the first place. With the judicious use of templates, it may be possible to get the PIMPL bit to work.

The problem that I see at the moment is that while a templated function may not be instantiated, if it takes a particular type as a parameter (e.g. TimeZone), the type still needs to be imported. If it were just internal to the function, then an import statement could be put inside of the function, but I'm not sure how you could have a localized import like that for a parameter. Essentially, what is needed is for the module with the type to be imported when the programmer tries to instantiate the templated function but not be imported otherwise.

But doing that gets rather convoluted. I _think_ that it could be done if the function used a template constraint which used an eponymous template from another module which imported the module with TimeZone in it and checked the type. But that might still pull in the class, because the module with the eponymous template imported it. I really don't understand what exactly results in the class' info being pulled into the executable. It's also a bit ugly to template a parameter which can only be one type, but that's not the end of the world if it works.

So, I'm not sure that it's actually possible to get PIMPL to work here, since several functions take a TimeZone argument, and even templatizing them, I'm not sure how you avoid having to always import the module with TimeZone in it to have those functions compile properly when they're instantiated.
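For reference, the easy half of this (a type used only inside the function body) can be sketched with a function-local import. utcName is a made-up name, and the sketch assumes that std.datetime exposes UTC with a name property. It does nothing for the hard case of a TimeZone parameter in the signature.

```d
import std.stdio : writeln;

// Templated with an empty parameter list, so the body (including its
// import) is only analyzed if the function is actually instantiated.
string utcName()()
{
    import std.datetime : UTC; // visible only inside this function
    return UTC().name;
}

void main()
{
    writeln(utcName());
}
```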

- Jonathan M Davis
December 23, 2011
Am 22.12.2011 22:57, schrieb Jacob Carlborg:
> On 2011-12-22 19:32, Walter Bright wrote:
>> On 12/22/2011 9:22 AM, Jacob Carlborg wrote:
>>> On 2011-12-22 16:56, Michel Fortin wrote:
>>>> The benefit of referencing classes within module info: you can
>>>> instantiate them using Object.factory, if they have a default
>>>> constructor. We pay a heavy price compared to what we get with this
>>>> very
>>>> limited runtime reflection.
>>>
>>> It's a really nice feature to have when implementing serialization.
>>
>> Sure, but we need to be aware of class overhead, and not use classes
>> unless necessary. I.e. a class shouldn't be used to merely create a
>> namespace. Classes also should not be used if it is not intended to be a
>> polymorphic type.
>
> Exactly. But I'm referring to deserializing classes, I don't care what
> they're used for.
>

IMHO, the user should know the type of the object he wants to deserialize, so it can be done only with compile-time reflection.
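A small sketch of the two routes being contrasted, with made-up names (Point, makeByName, dumpFields): the runtime one looks a class up by name through Object.factory, which relies on the class being listed in the ModuleInfo, while the compile-time one is handed the type and walks its fields with tupleof.

```d
import std.conv : to;
import std.stdio : writeln;

class Point
{
    int x = 1;
    int y = 2;
}

// Runtime route: create an instance from a fully qualified class name.
// Returns null if no such class is registered.
Object makeByName(string qualifiedName)
{
    return Object.factory(qualifiedName);
}

// Compile-time route: the caller supplies the type, and the fields are
// enumerated at compile time.
string dumpFields(T)(T obj) if (is(T == class))
{
    string result;
    foreach (i, field; obj.tupleof)
        result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ to!string(field) ~ " ";
    return result;
}

void main()
{
    auto byName = cast(Point) makeByName(typeid(Point).name);
    writeln(byName !is null);
    writeln(dumpFields(new Point));
}
```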
December 23, 2011
On 23/12/2011 03:55, Jonathan M Davis wrote:
<snip>
>> - If you're going to have the ISO week number in the system, it seems to me
>> you should also have the week-numbering year.  (I've thought about possibly
>> adding these to my scheme.)
>
> %isoweek and %C2isoweek

What are you saying?  That the documentation is wrong, and %isoweek emits a year, not a week?

>> - In %F, What does "as many digits as necessary" mean?  In particular, why
>> in the example code does it give 12/100000 and not 3/25000 despite the
>> latter being shorter?
>
> Obviously, that needs to be clearer. The denominator is always a multiple of
> 10. It's what the mpeg-7 standard uses, which is why it's there.
<snip>

What's that to do with it?  25000 _is_ a multiple of 10.  And 3/25000 contains 6 digits, compared to 12/100000's 8.  So the spec reads to the effect that %F should generate 3/25000 in that example.

Stewart.