• Re: Translation status robot (again?)

    From Thomas Lange@21:1/5 to All on Fri Jul 26 09:20:01 2024
    XPost: linux.debian.www

    Hi Camaleón,

    thanks for the info. I guess that www.d.o uses totally different
    code for generating the statistics. I'm not sure which data sources
    those three pages use, or whether the sources are the same or different.

    I would like to see most translation statistics on l10n.d.o or
    i18n.d.o and no longer on www.d.o. Not now, but in the long run
    (0.5-2 years maybe). Therefore it would be nice to get feedback on
    which statistics are only available on www.d.o and which data is
    already available on other domains, so that the latter can be
    removed from www.d.o.

    --
    regards Thomas
    (Debian web team)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Holger Wansing@21:1/5 to Thomas Lange on Sat Jul 27 00:00:01 2024
    XPost: linux.debian.www

    [ Be aware: long answer! ]

    Hi Thomas,

    Thomas Lange <lange@cs.uni-koeln.de> wrote (Fri, 26 Jul 2024 09:17:16 +0200):
    > Hi Camaleón,
    >
    > thanks for the info. I guess that www.d.o uses totally different
    > code for generating the statistics. I'm not sure which data sources
    > those three pages use, or whether the sources are the same or different.
    >
    > I would like to see most translation statistics on l10n.d.o or
    > i18n.d.o and no longer on www.d.o. Not now, but in the long run
    > (0.5-2 years maybe). Therefore it would be nice to get feedback on
    > which statistics are only available on www.d.o and which data is
    > already available on other domains, so that the latter can be
    > removed from www.d.o.

    I think what you are proposing above is a huge task.

    I personally lack the Perl skills for such changes, but I can share some
    knowledge (please ignore this if you are already aware of it all):

    On i18n.debian.org, there is a cron job running every second day
    that searches for po files (translation files) in all the packages in
    the Debian archive (separately for unstable and testing).

    This is a massive job, and it has to deal with many errors in packages
    where the filenames, formats etc. do not conform to what the cron job
    script expects.
    So the output of this cron job is full of errors and warnings whose
    cause lies not in the i18n server or the cron job, but in the packages.
    Apparently there are different assumptions about how to deal with
    translation files etc.

    Anyway, the result of this cron job is two big compressed text files,
    testing.gz and unstable.gz, in https://i18n.debian.org/material/data/

    They contain a dump of which translations exist for which language in
    which package, along with the translation status (what percentage of
    the file is translated).
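
    As a rough illustration of what "translation status" means here, the
    sketch below derives a percentage from per-file message counts. The
    three counts (translated, fuzzy, untranslated) are the figures that
    gettext's "msgfmt --statistics" reports; the function name and the
    rounding down are assumptions for this example and are not taken from
    the actual cron job script.

```python
def percent_translated(translated: int, fuzzy: int, untranslated: int) -> int:
    """Return the share of translated messages as a whole-number percentage.

    Hypothetical helper: fuzzy and untranslated messages both count as
    "not translated", and the result is rounded down.
    """
    total = translated + fuzzy + untranslated
    if total == 0:
        return 0  # an empty file has nothing to translate
    return translated * 100 // total


# Invented example counts, not real data from unstable.gz.
print(percent_translated(7, 2, 1))
```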

    These files are the basis for some Perl scripts, which are run during
    the webwml cron job that builds the website, including the l10n
    statistics we are talking about here (unstable.gz is used for these).
    The output of that webwml build run, the l10n statistics pages, is
    generally (looking at a page for a specific language like Spanish) a page
    with one line per package, for all packages in which a po file for Spanish
    is found (that's the "Packages with po-debconf support and for which
    translation is underway" section of the page).
    That line shows the current translation status of the file in the
    package in unstable!

    The package "multispeech", which Camaleón complained about in his latest
    mail, does not have an es.po file in the package in unstable.
    So there is no line showing anything regarding this file on the
    page for Spanish.

    This is a special case, because multispeech is a very new package that
    was added to Debian only recently, and therefore it lacks translations
    for many languages, currently including Spanish.
    And so you don't have an entry in this "Packages with po-debconf support
    and for which translation is underway" section.

    BUT: there is an entry for multispeech in the section "Packages with po-debconf
    support and for which translation is to do".
    And the translator (Camaleón) has acted on this: he translated the file,
    a bug report was filed, everything is fine.
    As soon as the package maintainer includes this file in the package and
    uploads the package with a new version number to the Debian archive,
    multispeech should show up on the l10n statistics page (ideally as
    "100% translated"), and that's it.
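
    The rule described above can be sketched as follows. This is only a
    minimal illustration, not the real Perl code: the function name, the
    data layout and the sample package data are all invented for the
    example.

```python
def split_by_section(po_langs_by_package, lang):
    """Sort packages into the two sections of a language's statistics page.

    po_langs_by_package maps a package name to the set of languages for
    which a po file ships in the package in unstable (hypothetical layout).
    """
    underway = []  # a po file for `lang` exists, so a status line is shown
    todo = []      # po-debconf support, but no po file for `lang` yet
    for package in sorted(po_langs_by_package):
        if lang in po_langs_by_package[package]:
            underway.append(package)
        else:
            todo.append(package)
    return underway, todo


# Invented sample data: multispeech ships no es.po in unstable yet.
sample = {
    "multispeech": set(),
    "some-other-package": {"es", "fr"},
}
underway, todo = split_by_section(sample, "es")
print("underway:", underway)
print("to do:", todo)
```

    Once a new upload adds es.po, the package moves from the "to do" list
    to the "underway" list on the next run; nothing else has to change.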



    Please note that what I described above is the basic way it works,
    as the page was originally developed!

    Some time later, more data was added to the page: all the data from
    the l10n robots, which act on the pseudo-URLs that translators
    send to their debian-l10n-LANG mailing lists to coordinate translations
    inside the teams (like https://l10n.debian.org/coordination/spanish/es.by_translator.html).
    This is an additional data source, which was injected into Debian's
    l10n statistics pages at https://www.debian.org/international/l10n/po-debconf/es.

    This addition brought some more columns to the page:
    Status, Translator, Date, Bug

    But this information can only be shown when a po file for the package
    (here: multispeech) exists in the package in unstable!
    Otherwise: no file -> no line with information shown. See above.
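
    In other words, the robot data is only attached to lines that already
    exist. A minimal sketch of that behaviour, assuming two simple
    dictionaries as input (the function name, the layouts and all sample
    values are hypothetical, not the real scripts or real data):

```python
def build_rows(po_percent, robot_info):
    """Build one table row per package that ships a po file in unstable.

    po_percent: package -> percent translated (from the unstable dump)
    robot_info: package -> dict with Status/Translator/Date/Bug columns
    Both layouts are hypothetical.
    """
    rows = []
    for package in sorted(po_percent):
        row = {"Package": package, "Percent": po_percent[package]}
        # Extra columns are attached only to rows that already exist.
        row.update(robot_info.get(package, {}))
        rows.append(row)
    return rows


# Invented sample data: the robot knows about multispeech, but the package
# in unstable has no es.po, so no row is produced for it.
po_percent = {"some-other-package": 100}
robot_info = {
    "multispeech": {"Status": "bts", "Translator": "Camaleón"},
    "some-other-package": {"Status": "done"},
}
rows = build_rows(po_percent, robot_info)
for row in rows:
    print(row)
```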




    Long story short:
    the repeated complaints about "Translation status page not updated (again)"
    and similar may look like an issue at first glance, but as far as I can
    see, there is no problem currently.
    It's just that the system works differently from how people think
    it does.

    There is probably room for optimization, but the status quo is working as intended.



    If some changes are planned:
    maybe we should revert the change that integrated the status data from the robot
    (https://l10n.debian.org/coordination/spanish/es.by_translator.html)
    into the centralized l10n statistics page (https://www.debian.org/international/l10n/po-debconf/es).
    Those statistics pages were originally not designed to deal with this data,
    and this is the reason for the inconsistencies that people are seeing.

    Of course, this would force people to look at two pages instead of only
    one to see the whole picture.
    But that would probably be an improvement nevertheless.
    It would divide the huge process of building this whole statistics beast
    into two smaller parts, allowing for easier maintenance and debugging in
    case of errors (or things that just seem to be errors, but are not).




    And finally:
    The page
    https://i18n.debian.org/nmu-radar/nmu_bypackage.html
    is currently pointless, since there is no one doing such NMU uploads for
    l10n updates (that was the role of bubulle, who is no longer active in Debian).
    So people should not care about this page.
    Maybe we should remove it? Or place a big fat warning about its status?
    It's annoying if different pages (with totally different approaches) are
    compared against each other, and the differences lead to such complaints.



    So long

    Holger




    --
    Holger Wansing <hwansing@mailbox.org>
    PGP-Fingerprint: 496A C6E8 1442 4B34 8508 3529 59F1 87CA 156E B076

  • From Helge Kreutzmann@21:1/5 to All on Sat Jul 27 15:20:01 2024
    XPost: linux.debian.www

    Hello Holger and all,
    On Fri, Jul 26, 2024 at 11:51:34PM +0200, Holger Wansing wrote:
    > And finally:
    > The page
    > https://i18n.debian.org/nmu-radar/nmu_bypackage.html
    > is currently pointless, since there is no one doing such NMU uploads for
    > l10n updates (that was the role of bubulle, who is no longer active in Debian).
    > So people should not care about this page.

    As stated earlier, I do care (a lot) about this page, and I do i18n
    NMUs when we are getting close to a release. I already stated this
    earlier on this list.

    I totally agree that I'm doing much less in this regard than bubulle
    did, so additional people working on this are more than welcome.

    > Maybe we should remove it? Or place a big fat warning about its status?

    It would be great if this page could be kept, because it guides me
    (amongst others) in my NMU decisions.

    > It's annoying if different pages (with totally different approaches) are
    > compared against each other, and the differences lead to such complaints.

    IIRC this page was unknown to several contributors on this list until
    I recently mentioned it. So I doubt that this single page is the
    source of confusion.

    And thanks for keeping the infrastructure up and running, despite all
    these difficulties (strange tool chain, scripts written by others,
    broken po(t) files and difficult handling by other Debian
    contributors).

    Greetings

    Helge

    --
    Dr. Helge Kreutzmann debian@helgefjell.de
    Dipl.-Phys. http://www.helgefjell.de/debian.php
    64bit GNU powered gpg signed mail preferred
    Help keep free software "libre": http://www.ffii.de/
  • From Holger Wansing@21:1/5 to Helge Kreutzmann on Sun Jul 28 14:20:01 2024
    XPost: linux.debian.www

    Hi,

    Helge Kreutzmann <debian@helgefjell.de> wrote (Sat, 27 Jul 2024 13:15:03 +0000):
    > Hello Holger and all,
    > On Fri, Jul 26, 2024 at 11:51:34PM +0200, Holger Wansing wrote:
    > > And finally:
    > > The page
    > > https://i18n.debian.org/nmu-radar/nmu_bypackage.html
    > > is currently pointless, since there is no one doing such NMU uploads for
    > > l10n updates (that was the role of bubulle, who is no longer active in Debian).
    > > So people should not care about this page.
    >
    > As stated earlier, I do care (a lot) about this page, and I do i18n
    > NMUs when we are getting close to a release. I already stated this
    > earlier on this list.

    Ok, I was not aware of you doing such NMUs, sorry!
    Thank you for that!

    > I totally agree that I'm doing much less in this regard than bubulle
    > did, so additional people working on this are more than welcome.

    I'm just in the process of applying for a DD (currently non-uploading DD),
    so once I get that done, there is probably potential for this :-)


    Holger



    --
    Holger Wansing <hwansing@mailbox.org>
    PGP-Fingerprint: 496A C6E8 1442 4B34 8508 3529 59F1 87CA 156E B076
