January 28

On Sunday, 28 January 2024 at 16:16:34 UTC, Olivier Pisano wrote:

> If .length were an int, D could not handle arrays of more than 2G bytes. The whole language would be useless on 64-bit systems.

It would be better for array.length to be a signed long (a signed size_t) instead of unsigned.

Can you guess the output of this example, which computes the average of an array's elements?

==================================
import std.algorithm;
import std.stdio;

void main() {
    long[] a = [-5000, 0];
    long c = sum(a) / a.length;
    writeln(c);
}

See the result here:

https://forum.dlang.org/post/cagloplexjfzubncxuza@forum.dlang.org

January 29

I have an idea to estimate how long strlen takes on an exabyte string.

February 07

On Sunday, 28 January 2024 at 17:25:49 UTC, mw wrote:

> See the result here:
>
> https://forum.dlang.org/post/cagloplexjfzubncxuza@forum.dlang.org

I knew this outlandish output had to do with mixing signed and unsigned types and the wraparound that results. But I like the way Anthropic Claude2 explains it:


The outlandish output you're observing in the D code comes from an implicit signed-to-unsigned conversion. Let's break down what's happening:

  1. Sum Calculation:

    • The sum(elem) function correctly calculates the sum of all elements in the elem array, which is 15 (1 + 2 + 3 + 4 + 5).
    • So far, so good!
  2. Average Calculation:

    • Next, you calculate the average of the a array using the expression sum(a) / a.length.
    • The sum(a) part correctly computes the sum of the elements in a, which is -5000 (since -5000 + 0 = -5000).
    • The issue arises with a.length.
  3. Array Length and Unsigned Types:

    • In D, the .length property of an array returns an unsigned integer (ulong), which is an unsigned 64-bit integer.
    • The length of the a array is 2 (since it has two elements: -5000 and 0).
  4. Integer Division:

    • In sum(a) / a.length, the left operand is a long and the right operand is a ulong.
    • Before dividing, D converts the long operand to ulong, so -5000 becomes 18446744073709546616 (2^64 - 5000).
  5. Wraparound:

    • The division is then done on unsigned values: 18446744073709546616 / 2 = 9223372036854773308.
    • With auto, c is inferred as ulong; declared as long, it holds the same value, because 9223372036854773308 still fits in a long.
  6. Explanation:

    • The wraparound happens to -5000 before the division, not to the quotient afterwards, which is why the result is 2^63 - 2500 rather than -2500.
    • The result is not what you intended for the average calculation.

To fix this, you can explicitly cast the sum to double before dividing, so that the division is carried out in floating point:

auto c = cast(double)(sum(a)) / a.length;

This will correctly compute the average and avoid the wraparound. The output should now match your expectations! 🚀

The corrected code:

module array_length_forum;

import std.algorithm;
import std.stdio;

void main() {
    auto elem = [1, 2, 3, 4, 5];
    writeln(sum(elem)); // 15: the sum of all the elements in the range

    long[] a = [-5000, 0];
    //auto c = sum(a) / a.length;  // .length is 'ulong', so c is inferred as 'ulong'
    auto c = cast(double)(sum(a)) / a.length;
    writeln(typeid(c)); // double
    writeln(c); // -2500, the correct output
}
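
The order of operations is easy to check directly. The sketch below is illustrative only and not part of the original post (the helper names `unsignedAverage` and `signedAverage` are made up). It shows that -5000 is converted to ulong before the division, and that casting the length to long, assuming it fits, keeps the whole calculation signed.

```d
import std.algorithm : sum;
import std.stdio : writeln;

// Hypothetical helper: what `sum(a) / a.length` computes.
// The long sum is implicitly converted to ulong before dividing.
ulong unsignedAverage(long[] a)
{
    return sum(a) / a.length;
}

// Hypothetical helper: casting the length to long keeps the division signed.
// Assumes a.length is non-zero and fits in a long.
long signedAverage(long[] a)
{
    return sum(a) / cast(long) a.length;
}

void main()
{
    long[] a = [-5000, 0];

    writeln(cast(ulong)(-5000L));  // 18446744073709546616, i.e. 2^64 - 5000
    writeln(unsignedAverage(a));   // 9223372036854773308
    writeln(signedAverage(a));     // -2500
}
```

Unlike the cast to double, the signed variant stays in integer arithmetic, so it truncates toward zero when the sum is not evenly divisible.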
February 07

On Wednesday, 7 February 2024 at 19:20:12 UTC, Gary Chike wrote:

I just had to transcribe this to C just for grins :D

#include <stdio.h>

int sumArray(int arr[], size_t size) {
    int total = 0;
    for (size_t i = 0; i < size; ++i) {
        total += arr[i];
    }
    return total;
}

int main(void) {
    long a[] = {-5000, 0};
    size_t aLength = sizeof(a) / sizeof(a[0]);

    double c = (double)sumArray((int*)a, aLength) / aLength;
    printf("Average: %.2lf\n", c); // -2500 <- correct output

    return 0;
}
February 07

On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike wrote:

> On Wednesday, 7 February 2024 at 19:20:12 UTC, Gary Chike wrote:

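The output wasn't quite right, because sumArray read the long array through an int pointer, which goes wrong wherever long is wider than int. So I tweaked it to use long throughout: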

#include <stdio.h>

long sumArray(long arr[], size_t size) {
    long total = 0;
    for (size_t i = 0; i < size; ++i) {
        total += arr[i];
    }
    return total;
}

int main(void) {
    long a[] = {-5000, 0};
    size_t aLength = sizeof(a) / sizeof(a[0]);

    double c = (double)sumArray(a, aLength) / aLength;

    printf("Average: %.2lf\n", c); // -2500.00

    return 0;
}

February 07

On Wednesday, 7 February 2024 at 20:08:24 UTC, Gary Chike wrote:

> On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike wrote:
>
> > double c = (double)sumArray(a, aLength) / aLength;

If I don't cast explicitly:

double c = sumArray(a, aLength) / aLength;

then I get a result similar to the D code's:

Average: 9223372036854773760.00
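
The last digits differ from D's 9223372036854773308 only because the quotient was stored in a double. A minimal D sketch of that step, assuming a 64-bit long as in the C program above (the helper name `unsignedQuotientAsDouble` is hypothetical):

```d
import std.stdio : writefln, writeln;

// Hypothetical helper mirroring the uncast C expression: the signed sum is
// converted to an unsigned 64-bit value, divided, and only then stored as a double.
double unsignedQuotientAsDouble(long total, ulong length)
{
    ulong quotient = (cast(ulong) total) / length;
    return cast(double) quotient;
}

void main()
{
    ulong exact = (cast(ulong)(-5000L)) / 2;
    writeln(exact); // 9223372036854773308

    // A double has a 53-bit significand, so values just below 2^63 are
    // spaced 1024 apart; the quotient is rounded to the nearest of them.
    double rounded = unsignedQuotientAsDouble(-5000, 2);
    writefln("%.2f", rounded); // 9223372036854773760.00
}
```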

February 08

On Wednesday, 7 February 2024 at 20:13:40 UTC, Gary Chike wrote:

> On Wednesday, 7 February 2024 at 20:08:24 UTC, Gary Chike wrote:
>
> > On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike wrote:
> >
> > > double c = (double)sumArray(a, aLength) / aLength;
>
> If I don't cast explicitly:
>
> double c = sumArray(a, aLength) / aLength;
>
> then I get a result similar to the D code's:
>
> Average: 9223372036854773760.00

I don't think it's productive to compare the behavior to C. C is now 50 years old. One would hope that D has learned a few things in that time.

How many times does the following loop print? I ran into this twice doing the AoC exercises. It would be nice if it Just Worked.

import std.stdio;

int main()
{
  char[] something = ['a', 'b', 'c'];

  for (auto i = -1; i < something.length; ++i)
        writeln("less than");

  return 0;
}
February 08

On Thursday, 8 February 2024 at 05:56:57 UTC, Kevin Bailey wrote:

> I don't think it's productive to compare the behavior to C. C is now 50 years old. One would hope that D has learned a few things in that time.
>
> How many times does the following loop print? I ran into this twice doing the AoC exercises. It would be nice if it Just Worked.
>
> import std.stdio;
>
> int main()
> {
>   char[] something = ['a', 'b', 'c'];
>
>   for (auto i = -1; i < something.length; ++i)
>         writeln("less than");
>
>   return 0;
> }

This is horrible. Even if you declare int i explicitly, it still doesn't work the way you would think (OK, the way I thought):

import std.stdio;

int main()
{
  char[] something = ['a', 'b', 'c'];

  for (int i = -1; i < something.length; ++i)
        writeln("less than");

  writeln("done");
  return 0;
}

It just outputs:

done
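
The body never runs because `i < something.length` compares an int with a ulong, so `i` is converted to ulong first and -1 becomes ulong.max. Here is a small sketch of that, together with one possible workaround (comparing as signed 64-bit values, which assumes the length fits in a long). The function names are illustrative only.

```d
import std.stdio : writeln;

// Hypothetical helper: counts iterations of the loop as originally written.
int countMixed(const char[] s)
{
    int count = 0;
    for (int i = -1; i < s.length; ++i) // i is converted to ulong here
        ++count;
    return count;
}

// Hypothetical helper: the same loop, compared as signed 64-bit values.
int countSigned(const char[] s)
{
    int count = 0;
    for (long i = -1; i < cast(long) s.length; ++i)
        ++count;
    return count;
}

void main()
{
    char[] something = ['a', 'b', 'c'];

    writeln(cast(ulong)(-1) == ulong.max); // true
    writeln(countMixed(something));        // 0
    writeln(countSigned(something));       // 4
}
```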
February 08
Kevin Bailey via Digitalmars-d-learn wrote:
> How many times does the following loop print? I ran into this twice doing the AoC exercises. It would be nice if it Just Worked.
> ```
> import std.stdio;
> 
> int main()
> {
>    char[] something = ['a', 'b', 'c'];
> 
>    for (auto i = -1; i < something.length; ++i)
>          writeln("less than");
> 
>    return 0;
> }
> ```
> 

Pretty nasty.

This seems to work but just looks bad to me.  I would never write
code like this.  It would also break if the array 'something' had
more than int.max elements.

```
import std.stdio;

int main()
{
        char[] something = ['a', 'b', 'c'];

        // len = 3, type ulong
        writeln("len: ", something.length);
        writeln("typeid(something.length): ", typeid(something.length));

        // To make the loop execute, something.length (a ulong) must be
        // cast to int. Otherwise i is implicitly converted from int to
        // ulong for the comparison, and -1 wraps around to ulong.max.
        // The loop executes 4 times, when i is -1, 0, 1, and 2.
        for (auto i = -1; i < cast(int)something.length; ++i) {
                writeln("i: ", i);
        }
        return 0;
}
```
output:

len: 3
typeid(something.length): ulong
i: -1
i: 0
i: 1
i: 2
February 08
On Thursday, 8 February 2024 at 08:23:12 UTC, thinkunix wrote:
>
> I would never write code like this.

By all means, please share with us how you would have written that just as elegantly but "correct".

> It would also break if the array 'something' had more than int.max elements.

Then don't cast it to an int. First of all, why didn't you cast it to a long? Second, why doesn't the language do this correctly so I don't have to cast it at all? If I explicitly use checkedint, it does, but I don't want to write Checked!(int, ProperCompare) all over the place. (Yes, I know I can alias it.)
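
For reference, the checkedint variant mentioned above could look roughly like the sketch below. It assumes Phobos's `std.checkedint` with its `ProperCompare` hook; the alias name `SafeInt` and the helper `countIterations` are made up for illustration.

```d
import std.checkedint : Checked, ProperCompare;
import std.stdio : writeln;

// Hypothetical alias so the hook does not have to be spelled out everywhere.
alias SafeInt = Checked!(int, ProperCompare);

// Hypothetical helper: runs the loop with a checked index and counts iterations.
int countIterations(const char[] s)
{
    int count = 0;
    // ProperCompare compares the int payload with the ulong length by value,
    // so -1 < 3 holds and the loop body runs.
    for (auto i = SafeInt(-1); i < s.length; ++i)
        ++count;
    return count;
}

void main()
{
    char[] something = ['a', 'b', 'c'];
    writeln(countIterations(something)); // 4
}
```

The loop header stays as written in the original question apart from the index type, which is the trade-off being discussed: correct comparisons, at the cost of naming the wrapper type.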

I'm asking: why is the default C compatibility (of all things) rather than "safety that I can override if I need to make it faster"?