• The Data Challenge Continues (With Big Prizes)

    From Farley Flud@21:1/5 to All on Sat Dec 2 11:15:29 2023
    The taxpayer-funded USDA FDC database is FUBAR.

    So far, with no help from any C.O.L.A. adherents, I have
    accurately laid bare the corruptions within the "food.csv"
    table.

    But there is yet more.

    Let's now take a look at the "food_nutrient.csv" file/table.
    This file also has problems.

    I won't say "corruptions" but there are definitely serious
    problems.

    Find those problems.

    The prize is $1,000,000,000,000 for the first person to
    accurately describe, in full detail, these problems.

    Am I joking?

    Nope, I am not. I am offering $1,000,000,000,000 because
    I know damned fucking well that no one will be able to
    do it.

    That's what Microslop/Apphole does to people. It makes
    them stupid.

    The employees at the USDA, who produced this POS database,
    all use Microslop/Apphole.

    Only the users of Gentoo or LFS will be able to find
    the solution.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to Farley Flud on Sat Dec 2 09:10:42 2023
    On 12/2/2023 6:15 AM, Farley Flud wrote:

    The taxpayer-funded USDA FDC database is FUBAR.

    So far, with no help from any C.O.L.A. adherents, I have
    accurately laid bare the corruptions within the "food.csv"
    table.


    Liar. I'm the one that found the bogus LF record in food.csv. And the food.csv file is not corrupted.

    Feeb lie and Feeb fail.



    That's what Microslop/Apphole does to people. It makes
    them stupid.

    The most ignorant statements ever made on cola have ALL come from the
    mouths of Linux advocates and "advocates" (like yourself).



    The employees at the USDA, who produced this POS database,
    all use Microslop/Apphole.

    That's all you use at work, too. You've never been able to secure Linux-related employment.



    It's not nearly a POS database, but like many if not most large
    datasets, there are issues:


    food_nutrient:

    * 20 id's have no value for the nutrient amount

    select count(fdc_id) as cnt
    from food_nutrient
    where amount is null or amount = '';


    * 7 id's contain negative values for the nutrient amount

    select count(fdc_id) as cnt from food_nutrient where amount < 0;


    * There appears to be a small amt of duplicate data:
    - the same nutrient_id is listed 2-4 times for a given fdc_id

    select fdc_id, nutrient_id, count(nutrient_id) as cnt
    from food_nutrient
    group by fdc_id, nutrient_id
    having count(nutrient_id) > 1
    order by count(nutrient_id) desc, fdc_id, nutrient_id;


    * Not necessarily invalid data, but the set of nutrients isn't
    consistent for each food item. Some food items have 159 nutrients
    listed, some have 1.

    select fdc_id, count(fdc_id) as cnt
    from food_nutrient
    group by fdc_id
    order by count(fdc_id) desc, fdc_id;


    Dupe data would be the most worrisome.
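    A minimal sketch of the four checks above, run from Python against a toy in-memory SQLite table. The table and column names come from the queries in this post; the sample rows are made up for illustration and are not taken from the USDA files.

```python
import sqlite3

def check_food_nutrient(conn):
    """Run the four data-quality checks against a food_nutrient table."""
    cur = conn.cursor()
    report = {}
    # rows with no value for the nutrient amount
    report["missing_amount"] = cur.execute(
        "select count(*) from food_nutrient "
        "where amount is null or amount = ''").fetchone()[0]
    # rows with a negative nutrient amount
    report["negative_amount"] = cur.execute(
        "select count(*) from food_nutrient where amount < 0").fetchone()[0]
    # the same nutrient_id listed more than once for a given fdc_id
    report["duplicates"] = cur.execute(
        "select fdc_id, nutrient_id, count(*) as cnt from food_nutrient "
        "group by fdc_id, nutrient_id having count(*) > 1 "
        "order by cnt desc, fdc_id, nutrient_id").fetchall()
    # number of nutrient rows per food item
    report["nutrients_per_food"] = cur.execute(
        "select fdc_id, count(*) as cnt from food_nutrient "
        "group by fdc_id order by cnt desc, fdc_id").fetchall()
    return report

def demo():
    # hypothetical sample data, only to exercise the queries
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table food_nutrient (fdc_id integer, nutrient_id integer, amount)")
    conn.executemany("insert into food_nutrient values (?, ?, ?)", [
        (1, 1003, 2.5),
        (1, 1004, 0.1),
        (1, 1004, 0.1),   # duplicate nutrient for the same food
        (2, 1003, None),  # missing amount (NULL)
        (2, 1005, ''),    # missing amount (empty string)
        (3, 1003, -1.0),  # negative amount
    ])
    return check_food_nutrient(conn)

if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

    Against the real data, the same function could be pointed at whatever database the csv was imported into, as long as the connection object accepts these queries.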

    I quit looking.

  • From Relf@21:1/5 to All on Sat Dec 2 06:56:50 2023
    Re: The USDA's FDC database.

    How many doughnuts/day should I be eating ?

  • From vallor@21:1/5 to Usenet@Jeff-Relf.Me on Sat Dec 2 15:20:39 2023
    On Sat, 02 Dec 2023 06:56:50 -0800 (Seattle), "Relf"
    <Usenet@Jeff-Relf.Me> wrote in <Jeff-Relf.Me@Dec.2--6.56am.Seattle.2023>:

    Re: The USDA's FDC database.

    How many doughnuts/day should I be eating ?

    What does your doctor say?

    --
    -v

  • From Tyrone@21:1/5 to All on Sat Dec 2 15:21:38 2023
    On Dec 2, 2023 at 6:15:29 AM EST, "Farley Flud" <ff@linux.rocks> wrote:

    To the surprise of no one here, Farley Fucktard Failed. Again.

  • From vallor@21:1/5 to Tyrone on Sat Dec 2 15:32:19 2023
    On Sat, 02 Dec 2023 15:21:38 +0000, Tyrone <none@none.none> wrote in <OkOdnToWm_4f1vb4nZ2dnZfqn_ednZ2d@supernews.com>:

    On Dec 2, 2023 at 6:15:29 AM EST, "Farley Flud" <ff@linux.rocks> wrote:

    To the surprise of no one here, Farley Fucktard Failed. Again.

    Didn't you hear? Linux Pooper once did some "GRADUATE-LEVEL, ACADEMIC, CUTTING-EDGE RESEARCH PROGRAMMING".

    Judging from the all-caps, I'd say he wrote his sooper secret
    project in Cobol.

    --
    -v

  • From Farley Flud@21:1/5 to DFS on Sat Dec 2 16:14:17 2023
    On Sat, 2 Dec 2023 09:10:42 -0500, DFS wrote:


    It's not nearly a POS database, but like many if not most large
    datasets, there are issues:


    food_nutrient:

    * 20 id's have no value for the nutrient amount

    select count(fdc_id) as cnt
    from food_nutrient
    where amount is null or amount = '';


    FAIL! BIG FUCKING FAIL!

    You did not notice that the nutrient_id for each of
    those 20 records is 2066, which does not exist in the
    linked nutrient.csv table. It's a nutrient that doesn't
    exist!

    Thus, a query involving joins that seeks all nutrients
    is going to bomb and bomb BIG.
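    A minimal sketch of how rows pointing at a nonexistent nutrient could be found with a left join. It assumes the key column of the nutrient table is named id, which is not stated in this thread, and the sample rows are invented for illustration.

```python
import sqlite3

def orphan_nutrient_ids(conn):
    """Return (nutrient_id, row_count) for ids used in food_nutrient
    that have no matching row in nutrient."""
    return conn.execute(
        "select fn.nutrient_id, count(*) "
        "from food_nutrient fn "
        "left join nutrient n on n.id = fn.nutrient_id "  # n.id is an assumed column name
        "where n.id is null "
        "group by fn.nutrient_id "
        "order by fn.nutrient_id").fetchall()

def demo():
    # hypothetical sample data, only to exercise the query
    conn = sqlite3.connect(":memory:")
    conn.execute("create table nutrient (id integer primary key, name text)")
    conn.execute(
        "create table food_nutrient (fdc_id integer, nutrient_id integer, amount real)")
    conn.executemany("insert into nutrient values (?, ?)",
                     [(1003, "Protein"), (1004, "Fat")])
    conn.executemany("insert into food_nutrient values (?, ?, ?)", [
        (1, 1003, 2.5),
        (1, 2066, None),  # nutrient 2066 is not in the nutrient table
        (2, 2066, None),
        (2, 1004, 0.3),
    ])
    return orphan_nutrient_ids(conn)

if __name__ == "__main__":
    print(demo())
```

    Note that a plain inner join between the two tables would simply leave such rows out of the result, so a left join like this one is the usual way to make them visible.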

    Furthermore, there is a related issue. The API which is
    distributed with the database does not indicate the max field
    length for any field nor does it indicate the max number of
    decimal places for any obvious numeric fields.

    Consequently, how is anyone supposed to design an EFFICIENT
    database without knowing the maximum field length or the
    max number of decimal places?

    It's all FUBAR.

    And it's all due to the fucking USDA "food scientists"
    being schooled only in Microslop junk software.

    So here is another challenge.

    Write a computer program that will determine the max
    length for all fields.

    Using the incomparable GNU/Linux native tools I already
    have done so and that is how I was able to discover all
    of this inexcusable corruption.


    It's not nearly a POS database


    Like you, it's totally FUBAR.

    And as with you, most are too stupid to notice because
    this corruption has been present for several years already.

  • From Tyrone@21:1/5 to vallor on Sat Dec 2 16:16:51 2023
    On Dec 2, 2023 at 10:32:19 AM EST, "vallor" <vallor@cultnix.org> wrote:

    On Sat, 02 Dec 2023 15:21:38 +0000, Tyrone <none@none.none> wrote in <OkOdnToWm_4f1vb4nZ2dnZfqn_ednZ2d@supernews.com>:

    On Dec 2, 2023 at 6:15:29 AM EST, "Farley Flud" <ff@linux.rocks> wrote:

    To the surprise of no one here, Farley Fucktard Failed. Again.

    Didn't you hear? Linux Pooper once did some "GRADUATE-LEVEL, ACADEMIC, CUTTING-EDGE RESEARCH PROGRAMMING".

    Judging from the all-caps, I'd say he wrote his sooper secret
    project in Cobol.

    LOL, good one. On 80 column punch cards using IBM 029 and 129 keypunch machines.

  • From Tyrone@21:1/5 to Farley Flud on Sat Dec 2 16:45:51 2023
    On Dec 2, 2023 at 11:14:17 AM EST, "Farley Flud" <ff@linux.rocks> wrote:


    So here is another challenge.

    Write a computer program that will determine the max
    length for all fields.

    My god, you are truly clueless about SQL. So what happens next year when a column is 1 byte longer than your current CHAR definition? Are you going to repeat this nonsense?

    This is what VARCHAR is for. VARCHAR(n) fixes this. This means the column can be a MAXIMUM of n characters. Unlike CHAR(n) where even "Hello" takes up n bytes of space, VARCHAR only uses enough space to hold the string, UP TO THE MAX.

    The only thing FUBAR here is you. You could use VARCHAR(500) in your CREATE TABLE without wasting any space. That's what I did when I imported this allegedly "corrupted csv file" into MariaDB. I was getting Truncation errors
    on a few records.

    Using the incomparable GNU/Linux native tools I already
    have done so and that is how I was able to discover all
    of this inexcusable corruption.

    Variable length fields are not "data corruption". In fact, that is the entire point of csv files. Fixed-field files exist, where each field has an explicit length, which of course wastes lots of space in the file.

    This is just another example of you not knowing what you are doing.

  • From Farley Flud@21:1/5 to Tyrone on Sat Dec 2 22:32:44 2023
    On Sat, 02 Dec 2023 16:45:51 +0000, Tyrone wrote:


    This is what VARCHAR is for.


    Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!

    Isn't that cute. Tyrone, the supreme dummy, is trying to "teach"
    me about SQL.

    Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!

    He's only attempting to save face after his disastrous flop
    with his database import.

    Tyrone is now only a laughingstock.

    We're all laughing at your lame ass, Tyrone.

    Anyone who can create records out of thin air, like you
    did, deserves only perpetual ridicule.

    Ha, ha, ha, ha, ha, ha, ha, ha, ha, ha!

  • From DFS@21:1/5 to Feeb "Farley Flud" Russell on Sun Dec 3 22:40:04 2023
    On 12/2/2023 11:14 AM, Feeb "Farley Flud" Russell wrote:

    On Sat, 2 Dec 2023 09:10:42 -0500, DFS wrote:


    And it's all due to the fucking USDA "food scientists"
    being schooled only in Microslop junk software.


    You can put junk data into junk db designs in every dbms.

    The master FoodCentral db might still be in Access, but it can't be in
    one file. The Access filesize limit is still a measly 2GB, and food_nutrient.csv alone hits right up against that limit. So the USDA
    probably has the tables split among multiple back-end .accdb files.
    Which would work fine, but they would need a front-end Access file to do
    join queries across multiple back-end files.

    Govt agencies can't always afford dedicated DBAs, etc, so the db
    could've been designed by a food scientist, who used Access because it
    was easy to get started with and was already on their computer. Now
    it's morphed into an important, published database, but inertia means it remains in Access.

    If I were there I'd move the data into SQLite, maybe PostgreSQL or
    MariaDB. They could still use Access to query and report on it, but the
    data would be in a more robust dbms that wouldn't choke on tiny file sizes.

    I actually sent them notes about data quality issues a few years ago,
    when you first brought this db up on cola, but got no response.


    So here is another challenge.

    Write a computer program that will determine the max
    length for all fields.

    You mean the max size/length of the existing data?


    Using the incomparable GNU/Linux native tools I already
    have done so

    Code or it didn't happen.


    Here's my python to analyze any query and determine max data widths in
    each column in the resultset, then output the data in perfectly-sized
    columns onscreen.

    =====================================================================
    import sqlite3

    # note: the posted excerpt started mid-function; the def line, its
    # name and arguments, and the 'try:' are assumed here
    def runquery(db, sql, showquery=False):
        #color codes are set up front so both error paths can use them
        red = '\033[91m'
        reset = '\033[0m'
        try:
            db.execute(sql)
        except (sqlite3.OperationalError) as sqliteerr:
            msg = 'query failed to execute: ' + str(sqliteerr)
            print(red + msg + reset)
            return

        #get data
        rows = db.fetchall()
        if len(rows) == 0:
            print(red + 'no data for that query' + reset)
            return

        #get max data width of each column
        #note: for dbms != SQLite, the 3rd column of the
        #cursor.description contains this data
        datawidths = [0]*len(rows[0])
        for row in rows:
            datawidths = [max(w,len(str(c))) for w,c in zip(datawidths,row)]
        #print(datawidths)

        #set column widths to wider of data or column label
        #also add column names to array from cursor description
        cw, columns = [],[]
        for i,colname in enumerate(db.description):
            cw.append(max(datawidths[i], len(colname[0])))
            columns.append(colname[0])
        #print(cw)

        #build print format string and separator line
        pf = ''
        for w in cw: pf += "%-"+"%ss " % (w,)
        sep = '=' * sum(cw)
        sep += '=' * (2 * len(cw))
        sep += '=' * 10
        #print(len(sep))

        #print column names and data
        #reprint header row every 45 lines
        print(sep)
        print('# ',pf % tuple(columns))
        print(sep)
        for i,row in enumerate(rows):
            if i > 45 and i % 46 == 0:
                print(sep)
                print('# ',pf % tuple(columns))
                print(sep)
            print("%-5s " % (i+1), pf % row)
        print(sep)
        if showquery == True:
            print('Query executed: ' + sql)
        print()
    =====================================================================

    A work of art.



    and that is how I was able to discover all
    of this inexcusable corruption.

    You should write a GPL tool to explode your inexcusable, corrupt skull.

  • From Farley Flud@21:1/5 to DFS on Mon Dec 4 09:31:32 2023
    On Sun, 3 Dec 2023 22:40:04 -0500, DFS wrote:


    Using the incomparable GNU/Linux native tools I already
    have done so

    Code or it didn't happen.


    The following is a Perl script to cut out unwanted fields
    and then find the max width of each remaining field.

    The field#'s in the argument list are the field to be retained.

    #! /usr/bin/perl -w
    # usage: cut-fields.pl infile.csv outfile.csv field0 field1 ...
    use warnings;
    use strict;
    use Text::CSV_XS;

    my $line="";
    my @fields;
    my $linecnt=0;
    my $i=0;
    my $numfields=$#ARGV-2;   # index of the last requested field
    my @fieldsize;
    my @fieldnum;
    my $fieldlen=0;
    my $csv = Text::CSV_XS->new({ binary => 1 });

    # record which input fields to keep and zero their max widths
    for($i = 0; $i <= $numfields; $i++) {
        $fieldnum[$i]=$ARGV[$i+2];
        $fieldsize[$i]=0;
    }

    open (IPF, "<$ARGV[0]") or die "Cannot open input file: $!\n";
    open (OPF, ">$ARGV[1]") or die "Cannot create output file: $!\n";

    $line=<IPF>;   # skip the header row
    while ($line=<IPF>) {
        chomp($line);
        $linecnt=$linecnt+1;
        $csv->parse($line);
        @fields = $csv->fields();
        $i=0;
        # all retained fields but the last are written with a ^ separator
        while ($i < $numfields) {
            print OPF "$fields[$fieldnum[$i]]^";
            $fieldlen = length $fields[$fieldnum[$i]];
            if($fieldlen>$fieldsize[$i]) { $fieldsize[$i]=$fieldlen; }
            $i=$i+1;
        }
        # the last retained field ends the output line
        print OPF "$fields[$fieldnum[$i]]\n";
        $fieldlen = length $fields[$fieldnum[$i]];
        if($fieldlen>$fieldsize[$i]) { $fieldsize[$i]=$fieldlen; }
    }
    close IPF or die "Failed to close input: $!\n";
    close OPF or die "Failed to close output: $!\n";
    print "Total line count: $linecnt\n";

    for($i = 0; $i <= $numfields; $i++) {
        print "Max field length $i: $fieldsize[$i]\n";
    }


    Using this script immediately reports "wide chars" in
    certain line numbers. These "wide chars" are the anomalous
    UTF-16 chars.
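    For comparison, a minimal sketch of the same max-width measurement using Python's csv module. It is not a port of the Perl script: it measures every column rather than a selected subset, writes no output file, and counts characters rather than bytes. The sample data is made up.

```python
import csv
import io

def max_field_widths(lines):
    """Return (header, widths), where widths[i] is the length of the
    longest value seen in column i of the CSV data."""
    reader = csv.reader(lines)
    header = next(reader)
    widths = [0] * len(header)
    for row in reader:
        # ignore any extra fields beyond the header's column count
        for i, value in enumerate(row[:len(widths)]):
            widths[i] = max(widths[i], len(value))
    return header, widths

# hypothetical sample; a real file would be opened with
# open(path, newline='', encoding='utf-8') and passed in the same way
sample = io.StringIO(
    'fdc_id,description\n'
    '1,"Cheese, cheddar"\n'
    '22,Milk\n'
)
header, widths = max_field_widths(sample)
for name, width in zip(header, widths):
    print(f"{name}: {width}")
```

    Because the csv module handles quoting, the comma inside "Cheese, cheddar" is counted as part of the field rather than as a separator.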

  • From Farley Flud@21:1/5 to DFS on Tue Dec 5 20:48:42 2023
    On Sun, 3 Dec 2023 22:40:04 -0500, DFS wrote:


    Here's my python to analyze any query and determine max data widths in
    each column in the resultset, then output the data in perfectly-sized
    columns onscreen.

    A work of art.


    Whoop dee doo! Let's do some cake decorating! I can make pretty
    patterns with frosting. How 'bout you?

    Ahahahahahahahahahahahahahahahahahaha!

    Only a girl scout or a faggot is concerned about arranging data
    in nice and neat columns.

    Which one is he?

    I'd say both.

    Ahahahahahahahahahahahahahahahahahaha!


    REAL MEN are very messy -- and very CREATIVE.

  • From Farley Flud@21:1/5 to All on Tue Dec 5 22:34:38 2023
    On Mon, 04 Dec 2023 09:31:32 +0000, Farley Flud wrote:

    I gots to fix my fantastic, inimitable Perl script.

    The default charset for Text::CSV_XS is ascii.

    So I's gots to changes it to UTF-8 thusly:

    open (IPF, '<:encoding(UTF-8)', $ARGV[0]) or die "Cannot open input file: $!\n";
    open (OPF, '>:encoding(UTF-8)', $ARGV[1]) or die "Cannot create output file: $!\n";

    Now it will correctly read those anomalous chars.

    The USDA, the fabled US government agency, managed
    to include 11 wide-chars in an otherwise ascii text
    file of 169656075 bytes.
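    A minimal sketch of how the handful of non-ASCII characters in a mostly-ASCII file could be located by line number. It assumes the file is UTF-8 encoded, as the fix above implies; the sample bytes are made up.

```python
def find_non_ascii(data: bytes, encoding: str = "utf-8"):
    """Return a list of (line_number, character) for every non-ASCII
    character in the given file contents."""
    hits = []
    for lineno, raw in enumerate(data.splitlines(), start=1):
        if raw.isascii():
            continue  # fast path: nothing to report on this line
        # undecodable bytes become U+FFFD and are reported too
        text = raw.decode(encoding, errors="replace")
        hits.extend((lineno, ch) for ch in text if ord(ch) > 127)
    return hits

# hypothetical sample; for a real file use open(path, 'rb').read()
sample = "id,description\n1,plain ascii\n2,jalape\u00f1o\n".encode("utf-8")
for lineno, ch in find_non_ascii(sample):
    print(f"line {lineno}: {ch!r} (U+{ord(ch):04X})")
```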

    Amazing!

    But considering the quality of "professional" programmers
    in the pool, such as Tyrone, maybe not so.

    Ahahahahahahahahahahahahahahahahaha!

  • From DFS@21:1/5 to Farley Flud on Tue Dec 5 18:03:31 2023
    On 12/5/2023 3:48 PM, Farley Flud wrote:
    On Sun, 3 Dec 2023 22:40:04 -0500, DFS wrote:


    Only a girl scout or a faggot is concerned about arranging data
    in nice and neat columns.

    I can only imagine the high-pitched shrieking you would emit if the
    output of htop wasn't lined up.



    REAL MEN are very messy -- and very CREATIVE.

    Did you run away from this challenge

    <u1_QI.17336$NQ1.11608@fx48.iad>

    because you were too messy, or because you were too lame?

  • From DFS@21:1/5 to Joel on Wed Dec 6 09:15:49 2023
    On 12/5/2023 3:58 PM, Joel wrote:
    Farley Flud <ff@linux.rocks> wrote:


    As high level language coding goes, DFS's work there isn't bad at all,
    but it's certainly not on the level of modern app development,

    You talking about modern GUI apps?



    and
    your coding isn't either, you do some interesting mathematical work,
    but it's just personal hobby coding,

    You should get in on the cola hobby coding. Throw down with us, in
    whatever language.

    I never even looked at C or Python until a few years ago, and only
    because it was talked about here.

    Writing C is a tedious chore, so I don't do much of it. Because of the performance I feel good when it's done, but I generally dread it.


    you clearly overstate your real
    tech knowledge, making unsupported claims about achievements
    particularly in your line of work.

    uh oh... you're onto Feeb.



    As impressive as installing Gentoo
    is, it's still merely working with common tools, it's not achievement.
    You would lose very little by running a more ordinary distro.

    Feeb delusionally believes his (cr)apps run 30% faster because he
    compiles them on and for his machine.

  • From DFS@21:1/5 to Larry "Farley Flud" Piet on Wed Dec 6 09:22:53 2023
    On 12/5/2023 3:48 PM, Larry "Farley Flud" Piet wrote:

    Only a girl scout or a faggot is concerned about arranging data
    in nice and neat columns.

    I'm a little disappointed, Feeb. The last few days we had a much better rapport going than usual, discussing the USDA database. But you're back
    to the real you: demented and nasty.



    * My Python console masterpiece at work:

    https://imgur.com/a/KcHDctm

    The column widths aren't fixed - they're set based on the width of the
    label or the data from the query, whichever is widest.



    * My C console app, which uses the terminal window width and the
    resultset width to align the output.

    Linux:
    struct winsize w = {0};
    ioctl(STDOUT_FILENO, TIOCGWINSZ, &w);
    return w.ws_col;

    Windows:
    CONSOLE_SCREEN_BUFFER_INFO csbi;
    GetConsoleScreenBufferInfo(GetStdHandle( STD_OUTPUT_HANDLE ),&csbi);
    return csbi.dwSize.X;

    That section is fugly code!

    Narrow the window and you get fewer columns and more rows.
    Widen the terminal window and you get more columns and fewer rows.
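    The C source isn't posted here, so the following is only a rough Python sketch of the behavior described: fit as many equal-width columns as the terminal allows and let the row count follow. The function name and the sample words are hypothetical.

```python
import math
import shutil

def layout(words, width):
    """Arrange words in as many equal-width columns as fit in
    `width` characters, filling row by row."""
    if not words:
        return []
    colwidth = max(len(w) for w in words) + 1   # one trailing space for readability
    ncols = max(1, width // colwidth)           # always show at least one column
    nrows = math.ceil(len(words) / ncols)
    lines = []
    for r in range(nrows):
        chunk = words[r * ncols:(r + 1) * ncols]
        lines.append("".join(w.ljust(colwidth) for w in chunk).rstrip())
    return lines

if __name__ == "__main__":
    sample = ["alpha", "beta", "gamma", "delta", "epsilon"]
    # falls back to 80 columns when the size cannot be determined
    width = shutil.get_terminal_size().columns
    print("\n".join(layout(sample, width)))
```

    Here shutil.get_terminal_size() stands in for the ioctl and GetConsoleScreenBufferInfo calls above, so one code path covers both Linux and Windows.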

    https://imgur.com/a/k2ole9F

    I'll sell you the code for $50,000.



    * My PyQt GUI app: the number of columns and rows changes depending on
    the width of the longest word in the resultset.

    https://imgur.com/a/SyHgloK



    It's perfection in all cases, of course.


    The PyQt was straightforward:
    ============================================================================
    #get longest word in results list
    longestword = 0
    for itm in reslist:
        if len(itm) > longestword:
            longestword = len(itm)

    #tablewidget width in pixels
    widgetwidth = frm.tblWords.geometry().width()

    #display space width is reduced by 30 pixels for the row numbers
    #and 30 pixels for the vertical scroll bar
    displaywidth = widgetwidth - 30 - 30

    #longest word in pixels (plus trailing space for readability)
    longestword = (longestword * 12) + 12

    #number of columns
    datacols = min(26,int(displaywidth/longestword))

    #nbr of words in search results
    listlen = len(reslist)

    #number of rows
    #math.ceil ensures the correct # of rows is shown
    datarows = math.ceil(listlen/datacols)

    #set width of each column
    colwidth = int(displaywidth/datacols)
    ============================================================================



    It's WELL beyond your ken, in C or python or PyQt.

    Stick with perl - it's dead like your plants, your eyes, your soul, and
    your career.

  • From DFS@21:1/5 to Joel on Wed Dec 6 19:04:11 2023
    On 12/6/2023 11:18 AM, Joel wrote:
    DFS <nospam@dfs.com> wrote:


    Writing C is a tedious chore, so I don't do much of it. Because of the
    performance I feel good when it's done, but I generally dread it.


    I don't find coding C challenging, but it is time consuming, and I
    just don't have any inspiration for something to develop. I don't
    think I've ever not found a program or app I needed, somewhere online.
    It just seems like if it could exist, it already does.


    Then you could develop your own copy of something that exists.

    I wrote a bunch of stuff in C, just to learn it:

    * import list of words, do various analyses and searches
    * anagram generator, taking input from the user
    * post the Linux kernel CREDITS file to a SQLite database table
    * calculate the entropy of text, based on this formula:
    https://www.shannonentropy.netmark.pl
    * sinner program to shut down that idiot Feeb
    * list .csv files in a directory. Choose one, it gets imported and
    analyzed and displayed. You can query the data, too.
    * split a file into N equal parts
    * write out one file per book of the bible from one big bible.txt
    * count words and letters
    * keep your hard drive active by writing a temp file every N
    seconds/minutes
    * list locale files, let you set your system locale
    * program to compute the mean, median, mode, variance, std deviation,
    skewness and kurtosis of a randomly generated set of integers
    * count words inside single quotes, dbl-quotes, parenthesis, etc.
    * solved various problems at https://projecteuler.net
    * Fibonacci generator
    * prime nbr finder
    * random nbr generator
    * create a pivot table from .csv input
    * word find/replace
    * word wrap
    * create arrays of random strings, add dupes, dedupe and combine the
    arrays
    * generic C program timing method
    * color-coded file lister


    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

  • From candycanearter07@21:1/5 to DFS on Wed Dec 6 19:32:37 2023
    On 12/6/23 18:04, DFS wrote:
    On 12/6/2023 11:18 AM, Joel wrote:
    DFS <nospam@dfs.com> wrote:


    Writing C is a tedious chore, so I don't do much of it.  Because of the
    performance I feel good when it's done, but I generally dread it.


    I don't find coding C challenging, but it is time consuming, and I
    just don't have any inspiration for something to develop.  I don't
    think I've ever not found a program or app I needed, somewhere online.
    It just seems like if it could exist, it already does.


    Then you could develop your own copy of something that exists.

    I wrote a bunch of stuff in C, just to learn it:

     * import list of words, do various analyses and searches
     * anagram generator, taking input from the user
     * post the Linux kernel CREDITS file to a SQLite database table
     * calculate the entropy of text, based on this formula:
        https://www.shannonentropy.netmark.pl
     * sinner program to shut down that idiot Feeb
     * list .csv files in a directory.  Choose one, it gets imported and
        analyzed and displayed.  You can query the data, too.
     * split a file into N equal parts
     * write out one file per book of the bible from one big bible.txt
     * count words and letters
     * keep your hard drive active by writing a temp file every N
        seconds/minutes
     * list locale files, let you set your system locale
     * program to compute the mean, median, mode, variance, std deviation,
        skewness and kurtosis of a randomly generated set of integers
     * count words inside single quotes, dbl-quotes, parenthesis, etc.
     * solved various problems at  https://projecteuler.net

    Thanks for the resource! I'd also recommend looking at advent of code
    for neat challenges.



    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Being well-rounded is important.
    --
    user <candycane> is generated from /dev/urandom

  • From rbowman@21:1/5 to DFS on Thu Dec 7 03:35:16 2023
    On Wed, 6 Dec 2023 09:15:49 -0500, DFS wrote:

    I never even looked at C or Python until a few years ago, and only
    because it was talked about here.

    Writing C is a tedious chore, so I don't do much of it. Because of the performance I feel good when it's done, but I generally dread it.

    You should try R. I'm taking a course to expand my horizons but you can
    do the same sort of stuff in Python with numpy, pandas, and so forth
    without the strange syntax. For that matter a lot of it can be done with
    the GNU Scientific library (GSL) and/or ATLAS. Both are written in C.

  • From rbowman@21:1/5 to Joel on Thu Dec 7 03:45:45 2023
    On Wed, 06 Dec 2023 11:18:42 -0500, Joel wrote:

    I don't find coding C challenging, but it is time consuming, and I just
    don't have any inspiration for something to develop. I don't think I've
    ever not found a program or app I needed, somewhere online.
    It just seems like if it could exist, it already does.

    At a personal level I like playing around with Arduinos and other edge
    devices. C/C++ is used in that world, although CircuitPython is a
    possibility.

    C gets better as you collect libraries or files you can link in. It's like
    a carpenter collecting tools over the years.

  • From rbowman@21:1/5 to DFS on Thu Dec 7 04:03:02 2023
    On Wed, 6 Dec 2023 19:04:11 -0500, DFS wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Fast to write, slow to execute. Don't get me wrong -- I like Python but I
    also realize its limitations.

    https://blog.enterprisedna.co/r-vs-python-the-real-differences/

    For many things R is even worse than Python. It probably doesn't make any difference for data analysts but while a user is waiting for a response
    speed is very important. About 500 ms is the threshold where they will
    start bitching about things being slow.

  • From rbowman@21:1/5 to All on Thu Dec 7 04:08:35 2023
    On Wed, 6 Dec 2023 19:32:37 -0600, candycanearter07 wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Being well-rounded is important.


    Precisely. In the physical world I've collected a lot of tools. I don't
    try to do everything with Vise-Grips and a flat head screwdriver.

  • From chrisv@21:1/5 to All on Thu Dec 7 07:12:43 2023
    Dumfsck wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Same for Basic, which I have often used. The QuickBASIC compiler is
    my all-time favorite M$ software.

    --
    "ALL non-idiots support the use of testing over compile-time warnings
    to determine if the code functions correctly. [chrisv is] of the few
    idiots who thinks otherwise." - DumFSck, lying shamelessly

  • From DFS@21:1/5 to All on Thu Dec 7 09:44:37 2023
    On 12/6/2023 8:32 PM, candycanearter07 wrote:


    Being well-rounded is important.


    For that I rely on the junk food aisle at the Dollar Tree.

    Recently found Mega-sized Smarties! Nice.

  • From DFS@21:1/5 to rbowman on Thu Dec 7 09:44:07 2023
    On 12/6/2023 11:03 PM, rbowman wrote:
    On Wed, 6 Dec 2023 19:04:11 -0500, DFS wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Fast to write, slow to execute.


    Yeah, I should've included, "and 1/10th the speed of execution."

    And that's not an exaggeration - I've seen that differential many times.

    pypy has closed the gap, though, and there are some situations where
    it's just as fast as C.

  • From DFS@21:1/5 to rbowman on Thu Dec 7 09:46:13 2023
    On 12/6/2023 11:08 PM, rbowman wrote:
    On Wed, 6 Dec 2023 19:32:37 -0600, candycanearter07 wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Being well-rounded is important.


    Precisely. In the physical world I've collected a lot of tools. I don't
    try to do everything with Vise-Grips and a flat head screwdriver.


    I recently disassembled an old wood bookcase from probably the 80's,
    held together with 36 shallow slot-head screws.

    Nightmare!

    Even though I had a drill, it took me about 45 minutes to disassemble
    because the flat head drill bit kept slipping out of the slot.

    I told the recipient to forget about the slotted screws, and buy
    Phillips-head.

  • From DFS@21:1/5 to rbowman on Thu Dec 7 09:49:21 2023
    On 12/6/2023 10:35 PM, rbowman wrote:
    On Wed, 6 Dec 2023 09:15:49 -0500, DFS wrote:

    I never even looked at C or Python until a few years ago, and only
    because it was talked about here.

    Writing C is a tedious chore, so I don't do much of it. Because of the
    performance I feel good when it's done, but I generally dread it.

    You should try R. I'm taking a course to expand my horizons but you can
    do the same sort of stuff in Python with numpy, pandas, and so forth
    without the strange syntax. For that matter a lot of it can be done with
    the GNU Scientific library (GSL) and/or ATLAS. Both are written in C.


    I did do a few hours of R (4.2.2) Studio introduction last year.

    Wish it was available in 1990, for my graduate course in linear regression.

  • From DFS@21:1/5 to Larry "Lord Master" Piet on Thu Dec 7 10:41:18 2023
    On 12/7/2023 9:43 AM, Larry "Lord Master" Piet wrote:

    On Wednesday, December 6, 2023 at 10:35:21 PM UTC-5, rbowman wrote:

    For that matter a lot of it can be done with
    the GNU Scientific library (GSL) and/or ATLAS. Both are written in C.


    Ahahahahaha! That's so rich!

    ATLAS happens to be a "tuned" LAS, i.e. it is specially tuned for maximum performance on a given hardware system.

    But most of these C.O.L.A. chumps, including the DuFuS Supremus, don't believe that tuning makes any difference. Being chumps, they continually claim that tuning is a waste of time and effort.

    I never said 'any difference'. Just not enough to dedicate significant
    time and effort.

    And I don't believe for a second your "tuning" makes programs in general
    run 30% faster.


    "In a majority of cases where there is a consistent performance
    difference, it’s often less than 1%, and almost always less than 5%. A
    small handful of things can see more than 10% improvements, though they
    are often things that are both very CPU intensive and which lend
    themselves well to vectorization by the compiler, which is often not
    things most people would be building locally even on Gentoo (it’s stuff
    like video games or physics simulation suites).

    The gap has narrowed over the years, likely due to GCC’s default
    behavior getting smarter. In my original testing almost a decade ago,
    almost half my benchmarks showed a measurable performance difference of
    more than 2%. In my most recent tests, about six months ago and using
    equivalent benchmarks, I only saw performance differences higher than 1%
    in highly synthetic benchmarks designed specifically to accentuate
    differences resulting from compiler optimization behavior."

    https://www.reddit.com/r/Gentoo/comments/10rs5ql/is_a_source_distribution_faster_than_a_binary_one/
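
    To make "lends itself well to vectorization" concrete, here is a minimal
    sketch of that kind of loop. The function name and the flag pairs
    mentioned below are illustrative assumptions, not anything posted in
    this thread.

```c
#include <stddef.h>

/* y[i] += a * x[i] for every element.
   Each iteration is independent of the others, and `restrict` promises the
   arrays do not overlap, so the compiler is free to emit SIMD instructions
   when the optimization level and target flags allow it. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

    Building this once with something like `gcc -O2` and once with
    `gcc -O3 -march=native` (real GCC options, but only an assumed stand-in
    for "generic" versus "tuned"), then timing it over a large n, is one way
    to see how much the flags buy on a given machine.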



    People who know, however, know damn well that tuning can make a BIG difference in the performance of C code.

    If tuning your FOSS dreck makes you happy, go for it. I'm OCD like that
    in some cases, too.

    But you're smart enough to realize the performance difference you gain
    is far offset by the time required to tune and compile the (cr)apps.



    That's another mark of GNU/Linux superiority. GNU/Linux *can* be highly tuned for performance whereas with Microslop Winblows the user gets
    whatever generic garbage that is packaged for him.

    "Untuned" Linux running my C console code is about 13% faster than
    Windows. I don't need 14% faster.


    One of my GNU/Linux systems is close to being 10-years-old yet, because
    it is maximally tuned, can beat the fuck out of current distros on current
    hardware.

    details required

  • From candycanearter07@21:1/5 to DFS on Thu Dec 7 11:56:25 2023
    On 12/7/23 08:46, DFS wrote:
    On 12/6/2023 11:08 PM, rbowman wrote:
    On Wed, 6 Dec 2023 19:32:37 -0600, candycanearter07 wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Being well-rounded is important.


    Precisely. In the physical world I've collected a lot of tools. I don't
    try to do everything with Vise-Grips and a flat head screwdriver.


    I recently disassembled an old wood bookcase from probably the 80's,
    held together with 36 shallow slot-head screws.

    Nightmare!

    Even though I had a drill, it took me about 45 minutes to disassemble
    because the flat head drill bit kept slipping out of the slot.

    I told the recipient to forget about the slotted screws, and buy Phillips-head.

    Did they end up buying one?
    --
    user <candycane> is generated from /dev/urandom

  • From candycanearter07@21:1/5 to chrisv on Thu Dec 7 11:57:28 2023
    On 12/7/23 07:12, chrisv wrote:
    Dumfsck wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Same for Basic, which I have often used. The QuickBASIC compiler is
    my all-time favorite M$ software.


    Do you still use BASIC nowadays?
    --
    user <candycane> is generated from /dev/urandom

  • From candycanearter07@21:1/5 to rbowman on Thu Dec 7 11:58:11 2023
    On 12/6/23 21:45, rbowman wrote:
    On Wed, 06 Dec 2023 11:18:42 -0500, Joel wrote:

    I don't find coding C challenging, but it is time consuming, and I just
    don't have any inspiration for something to develop. I don't think I've
    ever not found a program or app I needed, somewhere online.
    It just seems like if it could exist, it already does.

    At a personal level I like playing around with Arduinos and other edge devices. C/C++ is used in that world, although CircuitPython is a possibility.

    C gets better as you collect libraries or files you can link in. It's like
    a carpenter collecting tools over the years.

    Plus, learning all the optimizations and data structures is useful.
    --
    user <candycane> is generated from /dev/urandom

  • From DFS@21:1/5 to All on Thu Dec 7 13:48:35 2023
    On 12/7/2023 12:56 PM, candycanearter07 wrote:
    On 12/7/23 08:46, DFS wrote:
    On 12/6/2023 11:08 PM, rbowman wrote:
    On Wed, 6 Dec 2023 19:32:37 -0600, candycanearter07 wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Being well-rounded is important.


    Precisely. In the physical world I've collected a lot of tools. I don't
    try to do everything with Vise-Grips and a flat head screwdriver.


    I recently disassembled an old wood bookcase from probably the 80's,
    held together with 36 shallow slot-head screws.

    Nightmare!

    Even though I had a drill, it took me about 45 minutes to disassemble
    because the flat head drill bit kept slipping out of the slot.

    I told the recipient to forget about the slotted screws, and buy
    Phillips-head.

    Did they end up buying one?


    Engrish confused you?

  • From chrisv@21:1/5 to All on Thu Dec 7 15:39:23 2023
    candycanearter07 wrote:

    chrisv wrote:

    Dumfsck wrote:

    In all cases where it can be used - which is most - Python is 1/3 to
    1/5th the amount of work and lines of code.

    Same for Basic, which I have often used. The QuickBASIC compiler is
    my all-time favorite M$ software.

    Do you still use BASIC nowadays?

    It's been a few years. I try to use Python or Powershell, these days.
    Only for small things, as I don't use them enough to be as good as I
    was with Basic (or C, for that matter).

    --
    "i<=22? Look at the clueless wonder go! That bit of shit-code does
    two int comparisons on each iteration of the loop" - DumFSck,
    putting his ignorance on display

  • From rbowman@21:1/5 to All on Fri Dec 8 04:53:02 2023
    On Thu, 7 Dec 2023 11:58:11 -0600, candycanearter07 wrote:

    Plus, learning all the optimizations and data structures is useful.

    Yeah, structures are nice... We had one programmer who never moved past
    parallel arrays. He used them to implement Dijkstra's shortest path
    algorithm. It works but it's a maintenance nightmare. The same programmer
    also believed nothing could ever possibly go wrong.

    There are arguments for structures of arrays versus arrays of structures,
    but loose arrays floating around with descriptive names like a, b, and c
    leave something to be desired.
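
    For anyone who hasn't met the terms, here is a rough sketch of the two
    layouts applied to Dijkstra-style bookkeeping. The field and function
    names are invented for illustration; this is not the code described
    above.

```c
#include <stddef.h>

#define MAX_NODES 4

/* Array of structures: everything about one node lives together. */
struct node {
    int dist;     /* best known distance from the source */
    int prev;     /* predecessor on the shortest path, -1 if none */
    int visited;  /* nonzero once the node has been finalized */
};

/* Structure of arrays: one named container, one array per field. */
struct node_table {
    int dist[MAX_NODES];
    int prev[MAX_NODES];
    int visited[MAX_NODES];
};

/* Index of the unvisited node with the smallest distance, or -1 (AoS). */
int closest_aos(const struct node *nodes, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!nodes[i].visited &&
            (best < 0 || nodes[i].dist < nodes[best].dist))
            best = (int)i;
    }
    return best;
}

/* The same selection step over the SoA layout. */
int closest_soa(const struct node_table *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!t->visited[i] &&
            (best < 0 || t->dist[i] < t->dist[best]))
            best = (int)i;
    }
    return best;
}
```

    Either way the related arrays travel together under one name, which is
    the part that loose parallel arrays lose.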

  • From rbowman@21:1/5 to DFS on Fri Dec 8 05:24:48 2023
    On Thu, 7 Dec 2023 09:49:21 -0500, DFS wrote:

    On 12/6/2023 10:35 PM, rbowman wrote:
    On Wed, 6 Dec 2023 09:15:49 -0500, DFS wrote:

    I never even looked at C or Python until a few years ago, and only
    because it was talked about here.

    Writing C is a tedious chore, so I don't do much of it. Because of
    the performance I feel good when it's done, but I generally dread it.

    You should try R. I'm taking a course to expand my horizons but you
    can do the same sort of stuff in Python with numpy, pandas, and so
    forth without the strange syntax. For that matter a lot of it can be
    done with the GNU Scientific library (GSL) and/or ATLAS. Both are
    written in C.


    I did do a few hours of R (4.2.2) Studio introduction last year.

    Wish it was available in 1990, for my graduate course in linear
    regression.

    I've got RStudio installed but I tend to use the CLI for messing around.
    It is quirky, and it ties into the other thread about 0-based arrays:

    str(name)
    chr [1:4] "mandi" "amy" "nicole" "olivia"
    name[1]
    [1] "mandi"
    name[0]
    character(0)

    And, of course, R and RStudio work very nicely on Linux :)

  • From rbowman@21:1/5 to DFS on Fri Dec 8 05:42:02 2023
    On Thu, 7 Dec 2023 09:46:13 -0500, DFS wrote:

    I recently disassembled an old wood bookcase from probably the 80's,
    held together with 36 shallow slot-head screws.

    Nightmare!

    Even though I had a drill, it took me about 45 minutes to disassemble
    because the flat head drill bit kept slipping out of the slot.

    I told the recipient to forget about the slotted screws, and buy Phillips-head.

    I'd skip the Phillips and go with Robertson or Torx. Like flat blades,
    Phillips drivers run from #0000 to #5. My collection only goes up to #4,
    which fits the air cleaner screws on a Sportster. Use the wrong size and
    you'll strip out a Phillips head.

    Torx is the same. Harley, in a perverted moment, uses both #25 and #27
    Torx. #27 isn't in all the sets and a #25 *almost* works in a #27 until it
    strips out.

    https://garrettwade.com/product/gunsmith-screwdriver-set

    Hollow ground helps a lot if the screw head wasn't mangled originally. I
    don't have as fancy a set but I do have a Chapman set

    https://www.amazon.com/Chapman-MFG-8900-Screwdriver-Equipment/dp/B0002IT4WU

    but I don't use it for taking apart bookcases. A real joy is an old house
    where the screws have been painted over with successive layers of oil
    based paint.

  • From rbowman@21:1/5 to Lord Master on Fri Dec 8 05:45:53 2023
    On Thu, 7 Dec 2023 09:11:10 -0800 (PST), Lord Master wrote:

    Don't ever play with the Lord Master.

    Yes, oh Loud Masturbator,

  • From DFS@21:1/5 to Larry "Lord Master" Piet on Fri Dec 8 10:14:39 2023
    On 12/7/2023 12:11 PM, Larry "Lord Master" Piet wrote:

    On Thursday, December 7, 2023 at 10:41:21 AM UTC-5, DFS wrote:

    And I don't believe for a second your "tuning" makes programs in general
    run 30% faster.


    _You_ can't believe anything because you are qualified for nothing.

    Here is a benchmark using the Scimark4 program. (This is only a single threaded example.)

    The results using highly optimized compiler options on my Gentoo
    system and Xeon workstation:

    FFT Mflops: 2338.10 (N=1024)
    SOR Mflops: 2062.17 (100 x 100)
    MonteCarlo: Mflops: 943.78
    Sparse matmult Mflops: 2550.52 (N=1000, nz=5000)
    LU Mflops: 8320.08 (M=100, N=100)

    ************************************
    Composite Score: 3242.93
    ************************************

    Next, the software was built from generic, lowest-common-denominator compiler options
    just like the average Linux distro. The results:

    FFT Mflops: 1868.69 (N=1024)
    SOR Mflops: 1520.70 (100 x 100)
    MonteCarlo: Mflops: 762.05
    Sparse matmult Mflops: 2771.88 (N=1000, nz=5000)
    LU Mflops: 4883.40 (M=100, N=100)

    ************************************
    Composite Score: 2361.34
    ************************************

    Holey fucking moley! This is a 37% difference in performance.

    THIRTY SEVEN FUCKING PERCENT!!!

    THIRTY SEVEN FUCKING PERCENT!!!

    THIRTY SEVEN FUCKING PERCENT!!!

    Same machine. Same hardware. Same program. The only difference was in compile-time
    optimization.

    Furthermore, the second example still used the optimized glibc routines.
    If it were done with the common distro glibc the performance would have
    been even less.

    So, again, you are full of fucking shit.

    Everything you say, claim, report, or believe is full of fucking shit because _you_ are full of fucking shit.

    Don't ever play with the Lord Master.



    Hey dunce, read for comprehension: I said "programs in general".

    1) your one example, which you've posted to cola multiple times, is an
    exception. That's why it's the only one you've ever measured and
    posted.

    2) the only comparison that's meaningful is 'tuned and compiled' vs the
    standard binary that would be found on a Linux distro. Your test
    doesn't meet the terms of the comparison that people are interested
    in. And you purposely didn't reveal those optimizations or the
    "LCD" settings.

    3) There are hundreds and hundreds of Linux apps you could test. One
    proves nothing.
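
    For reference, the per-kernel ratios behind the two quoted Scimark runs
    can be recomputed with a few lines of C. This is a minimal sketch that
    uses only the Mflops figures posted above; the function and array names
    are made up for illustration.

```c
#include <stdio.h>

/* Percent change of the tuned figure relative to the generic one. */
double pct_gain(double tuned, double generic)
{
    return (tuned / generic - 1.0) * 100.0;
}

/* Mflops figures copied from the two Scimark runs quoted above. */
static const char *kernel[] = {
    "FFT", "SOR", "MonteCarlo", "Sparse matmult", "LU", "Composite"
};
static const double tuned_mflops[] = {
    2338.10, 2062.17, 943.78, 2550.52, 8320.08, 3242.93
};
static const double generic_mflops[] = {
    1868.69, 1520.70, 762.05, 2771.88, 4883.40, 2361.34
};

/* Print one line per kernel with its relative gain. */
void print_gains(void)
{
    for (int i = 0; i < 6; i++)
        printf("%-15s %+6.1f%%\n", kernel[i],
               pct_gain(tuned_mflops[i], generic_mflops[i]));
}
```

    On those figures the composite gain works out to roughly 37%, but it is
    an average over kernels that differ a lot: LU gains about 70%, while
    sparse matmult is about 8% slower in the tuned build.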


    Test SQLite:

    precompiled Linux binary: https://sqlite.org/2023/sqlite-tools-linux-x64-3440200.zip

    source code, including a configure script https://sqlite.org/2023/sqlite-autoconf-3440200.tar.gz

    how to compile: https://sqlite.org/howtocompile.html


    Here's a C script to create a test db and populate a table with 2M rows,
    then time a variety of db operations against that table:

    =========================================================================

    // install libsqlite3-dev
    // ie, apt install libsqlite3-dev

    // compilation: gcc -Wall source -o executable -lsqlite3

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <sqlite3.h>


    //from SQLite docs: used when retrieving data from SQLite
    int callback(void *na, int argc, char **argv, char **azColName)
    {
    na = 0;
    for (int i = 0; i < argc; i++) {
    printf("%s = %s\n", azColName[i], argv[i] ? argv[i] : "NULL");
    }
    printf("\n");
    return 0;
    }


    //program timing
    double elapsedtime(clock_t start) {
    return (clock() - (double)start)/CLOCKS_PER_SEC;
    }


    //get random number from within range
    int randNbr(int low, int high) {
    return (low + rand() / (RAND_MAX / (high - low + 1) + 1));
    }


    //test for leap year
    int isleapyear(int yr) {
    if(((yr%4==0) && (yr%100!=0)) || (yr%400==0))
    {return 0;}
    return -1;
    }

    //build random date in format YYYY-MM-DD
    //number of days in the month
    int mdays[12]={31,28,31,30,31,30,31,31,30,31,30,31};
    char *randDate() {

    //year
    int ryear = randNbr(2023,2023);

    //get random month
    int rmth = randNbr(1,12);

    //get random day from that month
    int rday = randNbr(1,mdays[rmth-1]);

    //leap years: February (month 2) gets 29 days
    if((isleapyear(ryear)==0) && (rmth==2)) {
    rday = randNbr(1,29);
    }


    //11 bytes: "YYYY-MM-DD" plus the terminating NUL that snprintf writes
    char *randDt = malloc(sizeof(char) * 11);
    snprintf(randDt, 11, "%d-%02d-%02d", ryear, rmth, rday);

    return randDt;
    }


    //build random string of random length
    char charset[62] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    short charsetLen = 62;

    char *randStr(int randLen) {
    //one extra byte for the terminating NUL written after the loop
    char *randStr = malloc(sizeof(char) * (randLen + 1));
    for(int i=0; i<randLen; i++) {
    randStr[i] = charset[randNbr(0,charsetLen-1)];
    }
    randStr[randLen] = '\0';
    return randStr;
    }

    int main(void) {

    sqlite3 *db;
    sqlite3_stmt *stmt;
    char *testDB = "tuning.sqlite";
    char *error_msg = 0;
    int rc;
    char sqlI[1024] = {0};
    char *posters[14] =
    {
    "DFS","Feeb","hh","Slimer","Joel","Freak","Relf","candycane",
    "RonB","Bloaty","shitv","Ahlstrom","bowman","Tyrone"
    };
    int min_age = 21, max_age = 76;
    int datarows = 2000000;

    //timing-related
    clock_t startmaster, start;

    //version
    printf("running SQLite %s\n", sqlite3_libversion());

    //delete existing db file
    if (access(testDB, F_OK) == 0) {
    int r = remove(testDB);
    if (r != 0) {
    printf("Unable to delete existing test database\n");
    return(1);
    }
    }


    //begin timing overall
    startmaster = clock();


    //create and open new db file
    rc = sqlite3_open(testDB, &db);
    printf("created database: %s\n",testDB);


    //begin timing create table and populate
    start = clock();

    //create table
    char *sql = "CREATE TABLE TEST(testid INTEGER PRIMARY KEY AUTOINCREMENT, poster TEXT NOT NULL, rnd TEXT DEFAULT 'na' NOT NULL, dt "
    "TEXT NOT NULL DEFAULT '0000-00-00', age INTEGER NOT NULL);";
    rc = sqlite3_exec(db, sql, 0, 0, &error_msg);
    rc = sqlite3_exec(db, "COMMIT;", 0, 0, &error_msg);
    printf("added test table\n");


    //rng
    srand(time(NULL));


    //populate table with test data
    //random data: poster name, 12-16 char string, 2023 date,
    //age between 21 and 76
    printf("adding test data: ");
    rc = sqlite3_exec(db, "BEGIN TRANSACTION;", 0, 0, &error_msg);
    for (int i=1; i<=datarows; i++) {

    char* randposter = posters[rand() % 14];
    char* randstring = randStr(randNbr(12,16));
    char* randdt = randDate();
    int randage = rand() % ((max_age + 1) - min_age) + min_age;
    sprintf(sqlI,"INSERT INTO TEST (poster, rnd, dt, age) VALUES ('%s','%s','%s',%d);", randposter, randstring, randdt, randage);
    rc = sqlite3_exec(db, sqlI, 0, 0, &error_msg);

    //release the per-row heap strings so 2M iterations don't leak
    free(randstring);
    free(randdt);

    if (i % (datarows/50) == 0) {
    printf("%dK ",i/1000);
    fflush(stdout);
    }
    }
    rc = sqlite3_exec(db, "COMMIT;", 0, 0, &error_msg);


    //count rows in test table
    sqlite3_prepare_v2(db, "SELECT COUNT(*) FROM TEST;", -1, &stmt, NULL);
    sqlite3_step(stmt);
    int rows = sqlite3_column_int(stmt, 0);
    sqlite3_finalize(stmt);

    //print time to create table and populate it
    printf("\n%'d rows added in %.1f seconds\n", rows, elapsedtime(start));

    //print top rows of data
    sql = "SELECT * FROM TEST LIMIT 5;";
    sqlite3_prepare_v2(db, sql, -1, &stmt, NULL);
    printf("First 5 rows\n");
    while (sqlite3_step(stmt) == SQLITE_ROW) {
    printf("%s %s %s %s %s\n", sqlite3_column_text(stmt, 0), sqlite3_column_text(stmt, 1), sqlite3_column_text(stmt, 2), sqlite3_column_text(stmt, 3), sqlite3_column_text(stmt, 4));
    }
    sqlite3_finalize(stmt);


    //create indexes
    start = clock();
    printf("creating indexes...\n");
    rc = sqlite3_exec(db, "CREATE INDEX IDX_POSTER ON TEST(POSTER);", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "CREATE INDEX IDX_DT ON TEST(DT);", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "CREATE INDEX IDX_AGE ON TEST(AGE);", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "CREATE UNIQUE INDEX UIDX_RND ON TEST(RND);", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "COMMIT;", 0, 0, &error_msg);
    printf("created 4 indexes in %.1f seconds\n", elapsedtime(start));


    //begin timing db operations
    start = clock();
    printf("executing db operations...\n");

    //subquery and row iteration
    printf(" - subquery and cursor iteration\n");
    sql = "SELECT * FROM TEST WHERE testid IN (SELECT testid FROM TEST WHERE LENGTH(rnd) = 14);";
    sqlite3_prepare_v2(db, sql, -1, &stmt, NULL);
    //step exactly once per row until the result set is exhausted
    while (sqlite3_step(stmt) == SQLITE_ROW) {}

    //update subsets
    printf(" - update subset\n");
    rc = sqlite3_exec(db, "UPDATE TEST SET rnd = rnd || ' (16 CHARS)' WHERE LENGTH(rnd) = 16;", 0, 0, &error_msg);

    //alter table, add new column, update
    printf(" - add column and update\n");
    rc = sqlite3_exec(db, "ALTER TABLE TEST ADD COLUMN newage INTEGER;", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "UPDATE TEST SET newage = age * 3 WHERE age < 25;", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "UPDATE TEST SET newage = age * 2 WHERE newage IS NULL;", 0, 0, &error_msg);

    //delete subsets
    printf(" - delete subsets\n");
    rc = sqlite3_exec(db, "DELETE FROM TEST WHERE (age < 24 OR age = 76);", 0, 0, &error_msg);
    rc = sqlite3_exec(db, "DELETE FROM TEST WHERE (poster = 'RonB' AND age > 50);", 0, 0, &error_msg);

    //final save
    rc = sqlite3_exec(db, "COMMIT;", 0, 0, &error_msg);


    //finish
    sqlite3_finalize(stmt);
    sqlite3_close(db);
    printf("Processing done in %.1f seconds.\n\n",elapsedtime(startmaster));
    return 0;
    }


    =========================================================================
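
    One caveat on the elapsedtime() helper in the script: clock() reports
    processor time used by the program, so on Linux any time the process
    spends blocked waiting on disk is not counted. A minimal sketch of
    measuring both, assuming a POSIX system with clock_gettime(); the
    function names here are hypothetical and not part of the script above.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes clock_gettime() */
#include <time.h>

/* CPU seconds consumed by this process since `start`
   (the same quantity the script's elapsedtime() reports). */
double cpu_seconds_since(clock_t start)
{
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Wall-clock seconds since `start`, read from the monotonic clock,
   so time spent waiting on I/O is included. */
double wall_seconds_since(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec)
         + (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}
```

    Taking a clock() reading and a clock_gettime(CLOCK_MONOTONIC, ...)
    reading at the same point and printing both figures at the end shows
    how much of a run was computation and how much was waiting.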

    You have to compile and run my script (about 16sec on my system), and
    post to cola the results using your optimized SQLite, and then using the
    precompiled SQLite binary. You also have to post your optimizations/flags.

    Then, when the performance delta is proven to be very small, you have to
    kiss my ass.

  • From Farley Flud@21:1/5 to DFS on Fri Dec 8 15:50:40 2023
    On Fri, 8 Dec 2023 10:14:39 -0500, DFS wrote:


    Here's a C script to create a test db and populate a table with 2M rows,
    then time a variety of db operations against that table:


    Ahahahahaha! The stupid clown shows his stupidity yet again.

    If this were baseball, he'd be batting 0.000.

    SQLite doesn't count because its operations involve file I/O
    and file I/O is always a significant bottleneck.

    To measure optimization effects one needs operations in memory
    only where cache utilization becomes paramount.

    Now put away your ball and glove. You're being bumped
    down to the minor league -- as a bat boy.

    Ahahahahahahahahahaha!

  • From DFS@21:1/5 to Farley Flud on Fri Dec 8 10:59:23 2023
    On 12/8/2023 10:50 AM, "Farley Flud" wrote:


    To measure optimization effects one needs operations in memory
    only where cache utilization becomes paramount.


    I figured you'd puss out.

    The 1% extra performance vs stock binaries is yet another reason NOT to
    waste huge time with Gentoo.

    pwned

  • From RabidPedagog@21:1/5 to DFS on Fri Dec 8 11:54:02 2023
    On 2023-12-08 10:59 a.m., DFS wrote:
    On 12/8/2023 10:50 AM, "Farley Flud" wrote:


    To measure optimization effects one needs operations in memory
    only where cache utilization becomes paramount.


    I figured you'd puss out.

    The 1% extra performance vs stock binaries is yet another reason NOT to
    waste huge time with Gentoo.

    pwned

    The time he saves from that 1% increase in the application's performance
    is instead wasted in compiling the software rather than using it.

    --
    TG: @RabidPedagog

  • From Farley Flud@21:1/5 to RabidPedagog on Fri Dec 8 17:50:54 2023
    On Fri, 8 Dec 2023 11:54:02 -0500, RabidPedagog wrote:


    The time he saves from that 1% increase in the application's performance
    is instead wasted in compiling the software rather than using it.


    Ahahahahaha! Tweedle Dee and Tweedle Dumb!

    The DuFuS Supremus and the Rabid Rat are two peas in a pod.
    They probably give each other lessons in stupidity.

    But what the fuck do you know about "using" software? Your
    Microslop software is using YOU. Microslop Defender, which
    you are helpless to stop, is constantly scanning all of your
    files and sending who knows what back to Redmond in an effort
    to keep THEIR, not YOUR, junk software pure and pristine.

    Furthermore, all of your files, which you also cannot stop,
    are being secreted away into the Microslop "cloud."

    Tweedle Dee and Tweedle Dumb to be sure.

    https://www.zurchers.com/cdn/shop/products/1111-157_d75faaf1-6d1a-45cd-b7b7-9aee3ca49a4c_550x505.jpeg?v=1487952041

    Who is the top and who is the bottom?

    Ahahahahahahahahahahahahahahahahahahahahaha!
