• Order preserving Sort algorithm ?

    From Bob Armstrong@21:1/5 to All on Sat Sep 2 12:51:14 2023
    It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .

    Anybody got a good ` sort algorithm ?
    I've been working on getting all my accounting " business ready " in CoSy . It's only been in the last few years approaching the facilities I had in my legacy K ( Kx.com ) CoSy .
    See my SV-FIG presentation : https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
    I am surprised to find I never implemented a variant of the sort algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats .
    The fundamental algorithm , while impressively elegant has a more pervasive problem that it is not order preserving on identical elements . That means if sorting a table using the permutations associated with a sort , the equal elements in sorting of a
    later column can disrupt the order of previously sorted columns .
    APLs & K only return the sorting permutation , aka ` grade , rather than the sorted list itself . I've yet to figure out how to get the permutation ( to apply to all columns in a table ) w/o doing the sort itself .
    In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Bob Armstrong on Sat Sep 2 22:23:02 2023
    On Saturday, September 2, 2023 at 12:51:16 PM UTC-7, Bob Armstrong wrote:
    In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .

    I have a MergeSort in the novice package.
    AFAIK, it is stable, although I never gave any thought to whether it was or not.

    Obviously it works floats, strings, or whatever! Duh!
    Nobody with any knowledge of computer science implements complicated data-structures that are specific to one type of data --- what a waste of time! My LIST data-structure is general-purpose --- it can contain any kind of data. You pass the xt of a comparison function into the sort function.

    I also have HeapSort for arrays --- AFAIK, it is not stable.
    I also have an LLRB tree that orders the data as it is inserted.

    I rely heavily on linked lists in my own programming.
    I don't use arrays or trees unless there is some compelling reason to do so. Linked lists work well most of the time --- I got the idea of having a
    single data-structure that is generally used from Factor that has "sequences" that it uses for pretty much all data.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to bob@cosy.com on Sun Sep 3 11:21:11 2023
    In article <b0ac03f1-f64f-4fef-aa9a-f48badb438c6n@googlegroups.com>,
    Bob Armstrong <bob@cosy.com> wrote:
    <SNIP>
    In any case , if anyone has an order preserving sort which applies to
    floats as well as ints & strings , let us know .
    This is trivial. As long as you can discriminate between objects
    they are different. Now add some more test to the comparison.
    : <c OVER @ OVER @ < DSSWAP < AND ;

    DSSWAP is supposed to swap a double and a single.
    [Normal people use either ROT or -ROT, but that is too hard.]

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Heinrich Hohl@21:1/5 to Bob Armstrong on Sun Sep 3 05:53:05 2023
    On Saturday, September 2, 2023 at 9:51:16 PM UTC+2, Bob Armstrong wrote:
    In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .

    An order-preserving sorting algorithm is also called a "stable" sorting algorithm.

    There are two good stable sorting algorithms:

    Insertion sort:
    An elementary sorting algorithms with sorting time of order O(n^2).
    Very simple algorithm. Extremely fast if the number of items to be sorted is small.
    Gets too slow if the number of items to be sorted exceeds a certain limit.

    Merge sort:
    An advanced sorting algorithm with sorting time of order O(n*log(n)).
    Top-down approaches (recursive) as well as bottom-up approaches (iterative)
    of merge sort are possible and work equally well. More complicated than insertion sort.
    Would be needlessly complicated if the number of data items is small.
    On the other hand, this algorithm can handle much larger data sets within reasonable time.

    I would suggest that you look at the algorithms and implement the sorting routines
    yourself. This is easier than adapting someone else's code.

    A good reference is:
    Hans Werner Lang: “Sequential and parallel sorting algorithms” https://hwlang.de/algorithmen/sortieren/algoen.htm

    Start with insertion sort and find out if the sorting time is still acceptable in the worst
    case scenario (large data set). Otherwise look at Merge sort. The recursive method of
    merge sort is slightly simpler to implement than the iterative one.

    Bubble sort and Gnome sort are other elementary sorting methods that are also stable.
    Both methods are flawed by design and should not be used in real applications.

    Henry

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Heinrich Hohl on Sun Sep 3 16:11:48 2023
    On Sunday, September 3, 2023 at 5:53:07 AM UTC-7, Heinrich Hohl wrote:
    Insertion sort:
    An elementary sorting algorithms with sorting time of order O(n^2).
    Very simple algorithm. Extremely fast if the number of items to be sorted is small.
    Gets too slow if the number of items to be sorted exceeds a certain limit.

    I have Insertion Sort for lists too. I used this in my <SWITCH construct primarily because it detects duplicate entries when the duplicate is first found so I can abort the compile with a helpful error message pointing at where the error occurred.
    Insertion Sort did prove to be quite slow for my processor simulator, but the slowness is at compile-time so it is just a minor nuisance.

    Merge sort:
    An advanced sorting algorithm with sorting time of order O(n*log(n)). Top-down approaches (recursive) as well as bottom-up approaches (iterative) of merge sort are possible and work equally well. More complicated than insertion sort.
    Would be needlessly complicated if the number of data items is small.
    On the other hand, this algorithm can handle much larger data sets within reasonable time.

    I would suggest that you look at the algorithms and implement the sorting routines
    yourself. This is easier than adapting someone else's code.

    Nobody has to adapt (rewrite) my code --- it is general-purpose so it can be put to use unmodified for whatever data you might have.

    Reading books on algorithms and studying an "advanced" sort algorithm
    such as Merge Sort is a waste of time --- this has been well known for decades. There is a lot of work involved in writing the code that I have already written.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Hugh Aguilar on Mon Sep 4 12:20:22 2023
    On 4/09/2023 9:11 am, Hugh Aguilar wrote:
    On Sunday, September 3, 2023 at 5:53:07 AM UTC-7, Heinrich Hohl wrote:
    Insertion sort:
    An elementary sorting algorithms with sorting time of order O(n^2).
    Very simple algorithm. Extremely fast if the number of items to be sorted is small.
    Gets too slow if the number of items to be sorted exceeds a certain limit.

    I have Insertion Sort for lists too. I used this in my <SWITCH construct primarily because it detects duplicate entries when the duplicate is first found so I can abort the compile with a helpful error message pointing at where the error occurred.
    Insertion Sort did prove to be quite slow for my processor simulator, but the slowness is at compile-time so it is just a minor nuisance.

    Merge sort:
    An advanced sorting algorithm with sorting time of order O(n*log(n)).
    Top-down approaches (recursive) as well as bottom-up approaches (iterative) >> of merge sort are possible and work equally well. More complicated than insertion sort.
    Would be needlessly complicated if the number of data items is small.
    On the other hand, this algorithm can handle much larger data sets within
    reasonable time.

    I would suggest that you look at the algorithms and implement the sorting routines
    yourself. This is easier than adapting someone else's code.

    Nobody has to adapt (rewrite) my code --- it is general-purpose so it can be put to use unmodified for whatever data you might have.

    Reading books on algorithms and studying an "advanced" sort algorithm
    such as Merge Sort is a waste of time --- this has been well known for decades.
    There is a lot of work involved in writing the code that I have already written.

    If thinking means anything, it is looking at what is useful to one and discarding
    what is not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bob Armstrong@21:1/5 to Heinrich Hohl on Sun Sep 10 23:03:40 2023
    On Sunday, September 3, 2023 at 6:53:07 AM UTC-6, Heinrich Hohl wrote:
    On Saturday, September 2, 2023 at 9:51:16 PM UTC+2, Bob Armstrong wrote:
    In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .
    An order-preserving sorting algorithm is also called a "stable" sorting algorithm.

    Merge sort:
    An advanced sorting algorithm with sorting time of order O(n*log(n)). Top-down approaches (recursive) as well as bottom-up approaches (iterative) of merge sort are possible and work equally well. More complicated than insertion sort.
    Would be needlessly complicated if the number of data items is small.
    On the other hand, this algorithm can handle much larger data sets within reasonable time.

    CoSy is a ` user level system ( altho in open-to-the-chip Forth ) .
    An ` ordinary person just wants a sort which works . Thus ` merge is the choice .

    I would suggest that you look at the algorithms and implement the sorting routines
    yourself. This is easier than adapting someone else's code.

    Win32Forth must have a decent one that I could virtually cut&paste .
    Thinking is a last resort . A limited resource .

    A good reference is:
    Hans Werner Lang: “Sequential and parallel sorting algorithms” https://hwlang.de/algorithmen/sortieren/algoen.htm
    Thanks . That does look rather clear if I have to resort to rolling my own .

    Start with insertion sort and find out if the sorting time is still acceptable in the worst
    case scenario (large data set). Otherwise look at Merge sort. The recursive method of
    merge sort is slightly simpler to implement than the iterative one.
    A great deal of the power of CoSy comes from recursive operations on its lists-of-lists .
    I moved from the APL and legacy https://cosy.com/K/CoSy.htm world . If it can't handle Big Data it's not competitive .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bob Armstrong@21:1/5 to Hugh Aguilar on Sun Sep 10 22:20:13 2023
    On Saturday, September 2, 2023 at 11:23:04 PM UTC-6, Hugh Aguilar wrote:
    On Saturday, September 2, 2023 at 12:51:16 PM UTC-7, Bob Armstrong wrote:
    In any case , if anyone has an order preserving sort which applies to floats
    as well as ints & strings , let us know .
    I have a MergeSort in the novice package.
    AFAIK, it is stable, although I never gave any thought to whether it was or not.

    You pass the xt of a comparison function into the sort function.
    That's essentially what's done with Helmar's algorithm .

    I rely heavily on linked lists in my own programming.
    I don't use arrays or trees unless there is some compelling reason to do so. Linked lists work well most of the time --- I got the idea of having a single data-structure that is generally used from Factor that has "sequences"
    that it uses for pretty much all data.

    I've disliked linked lists for 60 years . I consider them too complex and difficult to transform .
    CoSy is a vocabulary evolved from Arthur Whitney's K of simple lists of lists . I don't think it is possible to match the speed for many operations with linked lists .

    I looked at Factor back in http://cosy.com/CoSy/CoSy/ReCoSy20101200.html . And gave some comments on a presentation on Factor to SV-FIG last year , https://www.youtube.com/watch?v=cpB0b7fIQBg . Frankly I think it regressed thru excessive complexification
    over time .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From L W@21:1/5 to Bob Armstrong on Sat Sep 23 16:27:25 2023
    On Saturday, September 2, 2023 at 2:51:16 PM UTC-5, Bob Armstrong wrote:
    It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .

    Anybody got a good ` sort algorithm ?
    I've been working on getting all my accounting " business ready " in CoSy . It's only been in the last few years approaching the facilities I had in my legacy K ( Kx.com ) CoSy .
    See my SV-FIG presentation : https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
    I am surprised to find I never implemented a variant of the sort algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats .
    The fundamental algorithm , while impressively elegant has a more pervasive problem that it is not order preserving on identical elements . That means if sorting a table using the permutations associated with a sort , the equal elements in sorting of a
    later column can disrupt the order of previously sorted columns .
    APLs & K only return the sorting permutation , aka ` grade , rather than the sorted list itself . I've yet to figure out how to get the permutation ( to apply to all columns in a table ) w/o doing the sort itself .
    In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .

    Seems to me the only time sorting meant anything was to prepare data to be saved to a GCR tape. Even a tape-to-tape merge/purge from ancient times is essentially a source to destination sort only having to actually sort the introduced data. So let's
    modernize ... Loading the table elements, given a data standardization exists, into a b+tree index solves the issues. 1) Never have to sort in the first place. 2) Add or remove data any time. 3) Drill down ability. 4) On an Intel system the space is
    trivial. 5) Order added is preserved on duplicates. 6) Why are you sorting in the first place? 7) Is there a real need to sort? Well, besides a human data presentation?

    This comment comes on following an understanding of temperature measurement and dealing with the associated instrumentation, say an 'S' type thermocouple. The linearization process is to keep performing a 7th order polynomial against the acquisition
    data or calc it once and have a lookup table based on said values. Well, CM would say "Calc it", as memory was rather expensive and limited 'in the day', not so much now, "as there is likely enough processor to do so". When you have 32 bit processors
    that can run on a 3V lithium coin cell for a decade+ while transmitting sub-GHz data every minute or so the entire while, the battery will die of old age before you deplete it, energy usage is now paramount. Maybe you need to factor your concept on
    needing to sort ...

    Sorting seems from days past a way to benchmark a system ... And only thus.

    Just put the data into the index in the first place. The index is the table. Forward, backward, subsets, content, selective resolution, duplicate check, expansion and contraction, you decide the data selection. Sorting ... LOL

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to logicweavers@gmail.com on Sun Sep 24 11:57:19 2023
    In article <8e107102-2533-4e85-a077-5ab8d50db695n@googlegroups.com>,
    L W <logicweavers@gmail.com> wrote:
    On Saturday, September 2, 2023 at 2:51:16 PM UTC-5, Bob Armstrong wrote:
    It's been too long since I checked comp.lang.forth . comp.lang.apl has
    too little activity to bother with & I didn't realize that is far from
    true here .

    Anybody got a good ` sort algorithm ?
    I've been working on getting all my accounting " business ready " in
    CoSy . It's only been in the last few years approaching the facilities I
    had in my legacy K ( Kx.com ) CoSy .
    See my SV-FIG presentation : >https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
    I am surprised to find I never implemented a variant of the sort
    algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats
    .
    The fundamental algorithm , while impressively elegant has a more
    pervasive problem that it is not order preserving on identical elements
    . That means if sorting a table using the permutations associated with a
    sort , the equal elements in sorting of a later column can disrupt the
    order of previously sorted columns .
    APLs & K only return the sorting permutation , aka ` grade , rather
    than the sorted list itself . I've yet to figure out how to get the >permutation ( to apply to all columns in a table ) w/o doing the sort
    itself .
    In any case , if anyone has an order preserving sort which applies to >floats as well as ints & strings , let us know .

    Seems to me the only time sorting meant anything was to prepare data to
    be saved to a GCR tape. Even a tape-to-tape merge/purge from ancient
    times is essentially a source to destination sort only having to
    actually sort the introduced data. So let's modernize ... Loading the
    table elements, given a data standardization exists, into a b+tree index >solves the issues. 1) Never have to sort in the first place. 2) Add or >remove data any time. 3) Drill down ability. 4) On an Intel system
    the space is trivial. 5) Order added is preserved on duplicates. 6)
    Why are you sorting in the first place? 7) Is there a real need to
    sort? Well, besides a human data presentation?

    This comment comes on following an understanding of temperature
    measurement and dealing with the associated instrumentation, say an 'S'
    type thermocouple. The linearization process is to keep performing a
    7th order polynomial against the acquisition data or calc it once and
    have a lookup table based on said values. Well, CM would say "Calc it",
    as memory was rather expensive and limited 'in the day', not so much
    now, "as there is likely enough processor to do so". When you have 32
    bit processors that can run on a 3V lithium coin cell for a decade+
    while transmitting sub-GHz data every minute or so the entire while, the >battery will die of old age before you deplete it, energy usage is now >paramount. Maybe you need to factor your concept on needing to sort ...

    Sorting seems from days past a way to benchmark a system ... And only thus.

    Just put the data into the index in the first place. The index is the
    table. Forward, backward, subsets, content, selective resolution,
    duplicate check, expansion and contraction, you decide the data
    selection. Sorting ... LOL

    You can't give hard and fast rules with respect to sorting in
    general.

    One of my jobs was to sort (actually merge unsorted data unto) to the
    most prestigious Dutch dictionary (de dikke van Dalen). 1]
    1. It makes no sense to have an index into this kind of jobs.
    A dictionary is a stream of characters, only during sorting it is
    considered a heap of records.
    2. The rules for comparison were so complicated that the program code
    for comparison run over several pages. They went to great length to
    prevent a situation that paragraphs sorted equal.
    So "stable sorting" ... LOL

    1] This was an overnight job on the PDP. My job was to create an VAX
    program in c exactly equivalent to the previous assembler program.
    It run in minutes, and was actually io-bound.

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans Bezemer@21:1/5 to Bob Armstrong on Sun Sep 24 05:31:00 2023
    On Saturday, September 2, 2023 at 9:51:16 PM UTC+2, Bob Armstrong wrote:
    It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .

    Anybody got a good ` sort algorithm ?
    I got plenty. They come in two flavors, address based and index based. The address based ones work with any structure, as long as they exist as pointer arrays. The comparison uses a call back function:

    Algorithm Library
    Slow sort slowsort.4th
    Stooge sort stoosort.4th
    Cycle sort cyclsort.4th
    Pancake sort pancsort.4th
    Radix sort LSB radxsort.4th
    Bubble sort bublsort.4th
    Cocktail sort cocksort.4th
    Cocktail sort
    (improved) coc2sort.4th
    Simple sort simpsort.4th
    Simple sort
    (improved) ismpsort.4th
    Selection sort selcsort.4th
    Insertion sort instsort.4th
    Insertion sort
    (improved) ins2sort.4th
    Binary Insertion sort binssort.4th
    Oyelami sort (MDIS) oyelsort.4th
    Circle sort circsort.4th
    Circle sort (improved) cir2sort.4th
    Bitonic sort bitosort.4th
    Merge sort mergsort.4th
    Odd-even merge sort odevsort.4th
    Tim sort (simple) timsort.4th
    Shell sort shelsort.4th
    Shell sort (A033622) shelsort.4th
    Shell sort (A108870) shelsort.4th
    Comb sort com2sort.4th
    Heap sort hea2sort.4th
    Binary Quick sort binquick.4th
    Intro sort intrsort.4th
    Quick sort qsort.4th
    Quick sort
    (unsafe) qsort.4th

    The index based ones work with any structure as well, as long as they're random accessible (by using an index). Both the compare and the exchange words are callbacks:

    Algorithm Library
    Heap sort heapsort.4th
    Comb sort combsort.4th
    Selection sort sel2sort.4th
    Bubble sort bub2sort.4th
    Gnome sort gnomsort.4th
    Gnome sort (improved) gno2sort.4th

    They're written for 4tH, so some assembly may be required. You can find them here: https://sourceforge.net/p/forth-4th/code/HEAD/tree/trunk/4th.src/lib/

    Hans Bezemer

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)