• Re: ???Why do arrays start at 0?"

    From gah4@21:1/5 to Juha Nieminen on Mon Aug 29 22:59:00 2022
    On Monday, August 29, 2022 at 4:16:00 AM UTC-7, Juha Nieminen wrote:


    I don't know if it's the *original* reason, but I would assume that at least in C one of the main reasons is the principle of maximum efficiency.

    In many processor architectures the concept of "array" exists, at least
    when it comes to values of the register sizes (ie. usually 1-byte,
    2-byte, 4-byte and 8-byte elements, the last one at least on 64-bit architectures). Prominently the concept of an indexable array exists
    in the x86 architecture. (I don't remember now if it also exists in
    the ARM architecture, but I would guess so.)

    BCPL, a predecessor of C, was written on a word addressable
    machine. Pointer arithmetic was integer arithmetic. It was with
    C that variables could be larger than the addressable unit,
    and that pointers (addresses) incremented and decremented
    in appropriate sized units.

    In any case, early C (and B and BCPL) programmers were likely
    also assembly programmers, and would have been used
    to 0 origin indexing.

    Generally when a processor architecture supports the concept of an "array", it does so by having instructions that take (at least) two registers as
    the input or the output parameter: A base address, and an offset. The
    memory location of the element is calculated by adding those two. (The
    number of bytes that an offset of 1 jumps depends on the instruction,
    and thus multi-byte elements are supported.)

    For many machines, you have to multiply by the element size,
    but yes some do it for you. C pointer arithmetic is defined in terms
    of the size of the pointed-to object, and array indexing is defined
    in terms of pointer arithmetic.

    Thus zero-indexing is extraordinarily natural in processor architectures:
    The "index" is actually an offset. It's a value you add to the base
    address in order to get to the location you want. Thus, the first element
    is at index/offset 0.

    Since that's the case, the most optimal way to handle low-level arrays is
    to have 0-based indexing in the programming language as well. That way
    you don't need to be subtracting 1 from the index every time an array is accessed (or you don't need an extraneous unused element at the beginning
    of the array, consuming memory for no reason).

    In the case of Fortran, starting with static arrays, the compiler can subtract before generating the address constant, such that the subtract is done
    at compile time. No run-time cost.

    For allocatable arrays, it can adjust the offset once, and use that.

    For automatic variables on the stack, the stack pointer offset is
    computed once.

    The way C pointers are defined, that would not be quite as
    easy to do. Or, C pointers are defined the way they are to make it easier.

    Fortran was originally Formula Translation, and mathematicians did,
    and still do, use 1 based indexing for matrices and vectors.

    Also, since C supports pointer arithmetic, many operations become simpler. Such as getting the index of an element when what you have is a pointer to
    it (and the pointer to the start of the array).

    Note that it gets more interesting for Fortran. When you pass an array
    as an argument to a subroutine, it loses the original origin.
    (Well, most of the time. Remembering when it does and doesn't is
    probably harder than figuring out 0 vs.1 origin.)

    One that C programmers like to do, and I suspect was faster on
    many early machines, is index through arrays with a pointer to an array, indexing it as it goes. That means no subscript calculation in the loop.
    Many early C processors didn't have multiply, or had slow multiply.

    However, one of the favorite optimizations for Fortran compilers
    is strength reduction, converting multiply inside a loop into addition
    as the loop increments. The result is pretty much the same.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)