• Re: Archive Any And All Text Usenet

    From immibis@21:1/5 to Ross Finlayson on Sun Mar 10 20:06:44 2024
    XPost: news.software.nntp

    On 9/03/24 19:01, Ross Finlayson wrote:
    [snip] I wonder how to define an
    "archive any and all text usenet", AAATU,
    filesystem convention, as a sort of "Library
    Filesystem Format", LFF.

    The idea is that each "message", "post", has an ID,
    then as far as that's good, that each group
    in the hierarchy has a name, and that, each
    message has a date.  Then, the idea is to
    make an LFF, that makes a folder for a group,
    for a date, each its messages.

    a.b.c/YYYY/MMDD/HHMM/
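
    As an illustration of the layout quoted above, here is a minimal Python
    sketch that maps a group name and a posting time to the
    a.b.c/YYYY/MMDD/HHMM/ directory. The function name lff_path and the
    choice to normalise to UTC are assumptions made for this example; the
    proposal itself does not specify a time zone.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def lff_path(group: str, posted: datetime) -> PurePosixPath:
    """Return the group/YYYY/MMDD/HHMM directory for one message.

    `posted` must be timezone-aware; it is converted to UTC first
    (an assumption of this sketch, not part of the quoted proposal).
    """
    if posted.tzinfo is None:
        raise ValueError("posted must be a timezone-aware datetime")
    utc = posted.astimezone(timezone.utc)
    return (
        PurePosixPath(group)
        / utc.strftime("%Y")
        / utc.strftime("%m%d")
        / utc.strftime("%H%M")
    )


if __name__ == "__main__":
    when = datetime(2024, 3, 10, 20, 6, 44, tzinfo=timezone.utc)
    print(lff_path("news.software.nntp", when))
```

    Note that a cross-posted message would map to one such directory per
    group, so the layout needs either copies or links for those.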

    A filesystem is not a good match for all possible problems. Have you
    considered an SQL database, which IS a good match for a large number of
    problems?
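
    To make the SQL suggestion concrete, here is a small sketch using
    Python's built-in sqlite3 module. The table and column names (message,
    message_group, posted_utc and so on) and the helper functions are
    hypothetical, chosen only for illustration; nobody in this thread has
    proposed a particular schema. A separate group table is used so that a
    cross-posted message is stored once.

```python
import sqlite3

# Hypothetical schema: one row per message, one row per (message, group) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS message (
    message_id TEXT PRIMARY KEY,
    posted_utc TEXT NOT NULL,      -- ISO 8601 string, e.g. 2024-03-10T20:06:44
    raw        BLOB NOT NULL       -- headers + body as received
);
CREATE TABLE IF NOT EXISTS message_group (
    message_id TEXT NOT NULL REFERENCES message(message_id),
    newsgroup  TEXT NOT NULL,
    PRIMARY KEY (message_id, newsgroup)
);
CREATE INDEX IF NOT EXISTS idx_group ON message_group(newsgroup, message_id);
"""


def open_archive(path=":memory:"):
    """Open (or create) the archive database and ensure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store(conn, message_id, posted_utc, groups, raw):
    """Insert a message once, plus one link row per newsgroup."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO message VALUES (?, ?, ?)",
            (message_id, posted_utc, raw),
        )
        conn.executemany(
            "INSERT OR IGNORE INTO message_group VALUES (?, ?)",
            [(message_id, g) for g in groups],
        )


def ids_in_group(conn, group, day):
    """List message IDs posted to `group` on `day` (YYYY-MM-DD)."""
    rows = conn.execute(
        "SELECT m.message_id FROM message AS m "
        "JOIN message_group AS g ON g.message_id = m.message_id "
        "WHERE g.newsgroup = ? AND m.posted_utc LIKE ? "
        "ORDER BY m.posted_utc",
        (group, day + "%"),
    )
    return [r[0] for r in rows]
```

    The lookup by group and day corresponds roughly to listing one
    a.b.c/YYYY/MMDD/ folder in the filesystem layout.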

    There are very useful notions of "mbox" and
    "maildir", with the idea that LFF or "maillff",
    and mbox and maildir variously have a great
    affinity.

    These were a good idea when they were invented. SQL is a good idea now
    that it has also been invented. Most implementations do not suffer from
    the limitations you describe below, nor from other limitations you did
    not mention.

    Another system you might be interested in is BitTorrent. I believe
    Library Genesis (an illegal backup of all published books) uses this for
    resilience. They divided the entire library into some number of torrents
    and then told people to go and seed the torrents so that if the library
    goes away, it can be reconstructed. Yours wouldn't be illegal, of course.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Ross Finlayson on Sun Mar 10 19:23:19 2024
    XPost: news.software.nntp

    Ross Finlayson <ross.a.finlayson@gmail.com> wrote or quoted:
    The idea is that each "message", "post", has an ID,

    Special file systems for news storage, such as the
    Cyclical News Filesystem (CNFS), have been developed.

    But, as mentioned by immibis, SQL databases can be
    very efficient today when used by someone with an
    education in relational databases.

    For example, I have a filesystem here that sometimes
    starts to behave strangely or become slow once there
    are several tens of thousands of files in a single
    directory. Or maybe it's just the user interface, not
    the file system. But you should run some tests to see
    whether the fs can actually support your requirements.
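
    One way to run such a test is sketched below in Python: it creates a
    given number of small files in a single temporary directory and times
    both the creation and a full listing. The function name
    time_flat_directory and the ".eml" suffix are made up for this example,
    and the timings it reports depend entirely on the filesystem and
    hardware under test.

```python
import os
import tempfile
import time


def time_flat_directory(n_files: int) -> dict:
    """Create n_files tiny files in one directory; time creation and listing."""
    with tempfile.TemporaryDirectory() as root:
        t0 = time.perf_counter()
        for i in range(n_files):
            with open(os.path.join(root, f"{i:08d}.eml"), "wb") as fh:
                fh.write(b"x")
        t1 = time.perf_counter()
        # Count entries the way a directory browser or indexer would.
        count = sum(1 for _ in os.scandir(root))
        t2 = time.perf_counter()
    return {"files": count, "create_s": t1 - t0, "list_s": t2 - t1}


if __name__ == "__main__":
    for n in (1_000, 10_000, 50_000):
        print(n, time_flat_directory(n))
```

    Running it with increasing counts on the target filesystem shows whether
    the slowdown comes from the filesystem itself or from the tool used to
    browse it.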

  • From immibis@21:1/5 to David Chmelik on Wed Mar 13 03:57:24 2024
    XPost: news.software.nntp

    On 11/03/24 05:12, David Chmelik wrote:
    On Sat, 9 Mar 2024 10:01:52 -0800, Ross Finlayson wrote:

    Hello. I'd like to start with saying thanks to Usenet administrators
    and originators,
    Usenet has a lot of perceived value as a cultural artifact, and also a
    great experiment in free speech, association, and press.

    Here I'm mostly interested in text Usenet,
    not binaries, that text Usenet is a great artifact and experiment in
    speech, association,
    and press.

    When I saw this example that may have a lot of old Usenet, then it sort
    of aligned with an idea that started as an idea of vanity press, about
    an archive of a group.
    Now though, I wonder how to define an "archive any and all text usenet",
    AAATU,
    filesystem convention, as a sort of "Library Filesystem Format", LFF.
    [...]

    Sounds good; I'm interested in a full archive of the text newsgroups I use
    (1300+), but I don't know whether free Usenet servers even go back to when
    I started (1996, though I tried the Internet in a museum before Eternal
    September). I'm aware I could use commercial ones that may, but I don't
    know which, nor the cost/space. Is Google Groups the only one going back
    to 1981? I hope other servers managed to save that before Google
    disconnected from peers, or that some might turn up going back to 1979.

    Accessing some old binary ones would be nice also, but these days people
    use commercial servers for those, which probably didn't save even back to
    the '90s... an archive of those (even though I'm uninterested in most,
    other than a few relating to history of science, some types of
    art/graphics & music) would presumably be too large except for data
    centres.

    Giganews published the number on Reddit: 20 gigabits per second of new
    data. This is approximately one new server full of hard drives every few
    days. If your servers are the kind dedicated to holding as many hard
    drives as possible, then one a week.
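
    The arithmetic behind that estimate can be sketched in Python as
    follows. Only the 20 gigabits per second figure comes from the post; the
    server capacities (1000 TB and 1512 TB) are hypothetical values chosen
    to show what "every few days" and "one a week" would imply, and decimal
    units (1 TB = 10^12 bytes) are assumed.

```python
SECONDS_PER_DAY = 86_400


def tb_per_day(feed_gbit_per_s: float) -> float:
    """Convert a sustained feed in gigabits/second to terabytes/day."""
    bytes_per_second = feed_gbit_per_s * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def days_to_fill(feed_gbit_per_s: float, server_capacity_tb: float) -> float:
    """Days until a server of the given capacity (TB) is full."""
    return server_capacity_tb / tb_per_day(feed_gbit_per_s)


if __name__ == "__main__":
    print(tb_per_day(20))          # terabytes of new data per day
    print(days_to_fill(20, 1000))  # hypothetical 1000 TB server
    print(days_to_fill(20, 1512))  # hypothetical 1512 TB server
```

    At 20 gigabits per second the feed comes to about 216 TB per day, so a
    server holding roughly 1 PB fills in under five days and one holding
    roughly 1.5 PB fills in about a week.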

  • From CPMST@21:1/5 to All on Thu Mar 14 04:00:03 2024
    XPost: news.software.nntp

    It doesn't really have to be that way, in the case
    that basically Internet Messages here Usenet
    are "static assets" of a sort once arrived, if the
    so very many of them and with regards to their
    size, here that most text Usenet messages are
    on the order of linear in 4KiB header + body,
    while on the order of messages, each post.

    So one way to look at the facilities, of the system,
    is DB FS MQ WS, database filesystem message-queue
    <snip>

    A monster essay of google translate word salad.

    --
    we have to go back

  • From Blue-Maned_Hawk@21:1/5 to All on Thu Mar 14 14:23:45 2024
    XPost: news.software.nntp

    I've seen much worse. This is parseäble.


    --
    Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr.
    blue-maned_hawk.srht.site
    Has anyone ever really been far even as decided to use even go so want to
    look more like?
