• Bug#1034091: RFP: whisper -- Robust Speech Recognition via Large-Scale

    From Petter Reinholdtsen@21:1/5 to All on Thu Feb 15 20:10:01 2024
    I just came across the article "Whispering in Norwegian: Navigating
    Orthographic and Dialectic Challenges" by Per E Kummervold, Javier de la
    Rosa, Freddy Wetjen, Rolv-Arild Braaten and Per Erik Solberg,
    <URL:https://arxiv.org/pdf/2402.01917.pdf>.

    I found this quote particularly interesting:

    Although the original PyTorch training code was not released by
    OpenAI, a collaborative effort with HuggingFace led to an alternative
    implementation in the Transformers library. This has also been
    adapted for Jax. The project participated in developing and
    open-sourcing training scripts for TPU-v4-pods, enabling dynamic
    changes to the training data during runtime (The National Library of
    Norway, 2024).

    The reference points to <URL: https://www.github.com/NbAiLab/nostram >.
    I have not investigated further. Perhaps the alternative implementation
    can be used to make a model from scratch and provide source for the
    files requested by the ftpmasters?

    Unrelated to this, there is an alternative implementation using the
    whisper models called whisper.cpp, available from
    <URL: https://github.com/ggerganov/whisper.cpp.git >. It might be
    easier to package than the OpenAI Whisper implementation.

    --
    Happy hacking
    Petter Reinholdtsen
