Forum: >>> Magnum BBS <<<

Rethink about who we (Debian) are in rapid dev cycle of deep learning

From M. Zhou@21:1/5 to All on Thu Jan 13 01:20:01 2022

Keywords: GPU computing support, AI applications & ML-Policy.

Deep learning is a new area. From our past discussions, we have already noted that it raises many new questions for Debian. For example, new AI applications may even challenge the definition of free software. In this article I share my latest thoughts on related topics across multiple domains, revisit some of my past forecasts, and offer some relevant development advice.

Note, the whole article only conveys my own personal opinion, and does not represent any official opinion of the Debian Project.

# Debian's GPU computing support -- how much should we do? ####################

The recent success of deep learning partly depends on the development of GPUs, which can compute matrix multiplications hundreds of times faster than a CPU. Thus, GPU computation is very valuable, and intuitively, supporting it as much as we can from the Debian side is valuable as well. Due to software license issues with a certain vendor, I have long been looking for the boundary -- how much should we do to support a certain type of GPU computation? Now I have finally figured out my own answer.

Debian is merely a _downstream_ in terms of providing GPU support to end users. As long as the upstream is willing to give us a chance (legally) and is easy to cooperate with, we can support that. Otherwise a dead end will soon be reached, unsurprisingly.

I've had some discussions with several fellow developers about suggesting that Debian buy some GPUs to extend its infrastructure for better GPU support.
The plan to put those ideas forward to a larger audience inside Debian has
been indefinitely postponed, because we know the requirement for a non-free
driver (there is no free alternative) would be a big problem.

Although my initial thought was to make Debian useful in more areas like GPU computing, I finally realized that by accepting new non-free blobs as an organization, we would further lose the core value written on our homepage -- "a complete free operating system".

My conclusion is: "Users with special demands can take care of themselves,
as we are unable to go far on our own." In terms of GPU computing, Debian
is providing a great system as a foundation for development and applications.

Of course, deep learning frameworks are regular software we are already familiar enough with. Their GPU support simply depends on whether the necessary drivers and libraries are maintained in Debian.

# AI Applications & ML-Policy #################################################

I predict that the ML-Policy [1] will work as a warning about potential issues rather than as practical guidance on packaging, because there are (and will be) long-standing issues that are hard to overcome and that make our packages not really useful without external components. Throughout the whole ML-Policy, I think the most valuable warning is the definition of "ToxicCandy Model", which identifies a software freedom trap for any developer interested in AI software.

Cool and useful stuff keeps emerging -- e.g., Facial Authentication for Linux
https://github.com/boltgolt/howdy
And it depends on some pre-trained models (licence: CC0-1.0):
https://github.com/davisking/dlib-models
People may still remember the past discussions on ML-Policy. When we treat pre-trained models as something like a picture or a song, they may enter our main archive. But when we try to exercise software freedom, things go wrong. For example, we can study a painting or a song and analyze it to learn something, but this does not work for pre-trained models. Without the training data there is not much of a way to study, learn from, or reproduce the pre-trained models. As per the definition in ML-Policy, the mentioned model is a ToxicCandy model.

Based on my interpretation, it means Debian might step aside from the world of AI applications to fully exercise software freedom. It's a pity but Debian's major role in the whole thing is a solid system.

Workarounds to address that pity are possible, for example the past "Debian User Package Repository" idea: distribute only package building scripts to end users, so they can build the corresponding packages locally. In this way the license issues and software freedom issues are bypassed, as the user has decided to accept the potential issues.

On the other hand, I'd advise people who want to package interesting AI applications to carefully evaluate whether they are mature enough -- and never package a purely academic research project. This is largely because our development cycle is much slower than the revolution cycle in the deep learning field. Something better may appear before it clears our NEW queue...

As for AI applications that require considerable computing power (GPU), the answer is rather distinct.

[1] https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/master/ML-Policy.rst

# Concluding Remarks

We maintain and provide a free operating system, and we value software freedom. My contribution here is to provide my understanding of the boundary between what we can do and what we can't do with respect to a new and interesting area. At least I learned a lot when thinking about this, and gained a deeper understanding of "what Debian is".

Debian is wonderful because it is one of the few places on earth where people will shout when software freedom is potentially infringed.
Indeed, Debian must have its own uniqueness in the eyes of every long-term member of the project.

Thank you for the excellent system, fellow developers.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Thomas Goirand@21:1/5 to M. Zhou on Thu Jan 13 12:20:02 2022

On 1/13/22 01:00, M. Zhou wrote:

> Thank you for the excellent system, fellow developers.

Thanks to you for all of your work in this field.

Cheers,

Thomas Goirand (zigo)


From Davide Prina@21:1/5 to M. Zhou on Thu Jan 13 22:50:01 2022

On 13/01/22 01:00, M. Zhou wrote:

> Cool and useful stuff keeps emerging -- e.g., Facial Authentication for Linux
> https://github.com/boltgolt/howdy

Note that privacy in the EU (European Union) is governed by the GDPR Regulation
and the ePrivacy Directive.
The Directive will be replaced by the ePrivacy Regulation, which will have
stricter rules (this will probably be approved this year).

Note: a Directive must be implemented in the national law of each EU state,
and each state can choose how to "implement" it. A Regulation becomes
effective law in every EU state simultaneously, with the same rules for all.
(In reality this applies not only to EU states, but also to all states
in the European Single Market that have not negotiated a special
exception for the field covered by the Regulation. For example,
Norway, which is not an EU state, is subject to the GDPR Regulation;
companies have been fined by the Norwegian privacy authority for violating GDPR.)

I have read that the new ePrivacy Regulation will introduce new strict
rules. For example, no one will be allowed to use AI for facial recognition
(only the police, and only in regulated cases), nor in a more generic
fashion, for example to identify the types of people taking part in a
demonstration (whether they are women or men, or mostly women or men,
their religion, the color of their skin, their country or region of
origin, ...).
Note: in reality, facial recognition in public spaces is already illegal today.

So facial recognition will be illegal for authenticating workers,
or for identifying clients in your shop, or...

Note also that some uses of data are currently illegal in the EU. For example, a
company used public photos to train an AI and was fined for it,
because it did not have user consent
for this data processing.

If I am not mistaken, other states outside the EU are also introducing
laws or privacy laws that limit AI usage.

All of this to say that AI in Debian may introduce not only license
problems, but also legal problems.

I think that if Debian gives users a general AI product that can be used
to train models, then it is the user's responsibility (it is the
user who selects what data to train on and how the training
data is used). But if Debian gives users a package that uses a trained
model to do something, then I think there must be at least a
disclaimer. So if there were a package frdm (Facial Recognition
Display Manager) that lets users authenticate with facial
recognition only, whoever installs or configures it would probably have to be
informed, and accept, that using this package can violate
the law in some states if it is not used only for personal use (or something similar).

I'm neither a legal expert nor a privacy expert.
But I would be interested to know what other people think about this,
especially if they are legal or privacy experts.

Ciao
Davide

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Free unofficial Italian translation@21:1/5 to All on Fri Jan 14 09:30:01 2022

Thanks Davide, for talking about this. This is not just a legal problem, but a de facto reality implemented in disregard of any right to freedom. In order to prevent artificial intelligence from being used against the privacy of third parties, it is necessary to eliminate "the opportunity" (following the model of the fraud triangle, which includes cyber fraud), informing people about "cyber insecurity" and the undesirable effects of databases. It might seem like a trivial solution, but sometimes the simplest tools are the best.
I personally thank the Debian teams and AI developers for their invaluable contribution.


From tomas@tuxteam.de@21:1/5 to Davide Prina on Fri Jan 14 10:20:02 2022

On Thu, Jan 13, 2022 at 10:07:05PM +0100, Davide Prina wrote:

> On 13/01/22 01:00, M. Zhou wrote:
>
>> Cool and useful stuff keeps emerging -- e.g., Facial Authentication for Linux
>> https://github.com/boltgolt/howdy
>
> note that EU (European Union) Privacy is managed by the GDPR Regulation and
> the ePrivacy Directive.
> The Directive will be replaced by the ePrivacy Regulation that will have
> more strict rules (probably this will be approved this year).

As far as I understand the GDPR won't restrict the tech itself, but only
its use. Which makes sense. Basically, no consent => no use, except in
very restricted scenarios (e.g. public security).

That said, to have a workable face recognition, you'll need a training
set (at least with current "solutions"), so you'll have to collect
consent from all those face "providers".

All the above said, I'm not a lawyer. Nor do I play one on TV :)

Cheers
--
t


From Free unofficial Italian translation@21:1/5 to All on Fri Jan 14 11:50:01 2022

The effectiveness of the privacy law depends on the context. In fact, in cases of public security or if crimes are in progress, the effectiveness of the privacy law is limited or, in more serious cases, not taken into consideration. But tools such as artificial intelligence are also used to commit abuses of power (and not just by private individuals). Unfortunately, there is no efficient preventive "defensive" strategy (and in general, preventive "defensive" strategies are never efficient). Laws against illegal forms of control exist, however Snowden is still in Russia (and Obama was a civil rights advocate). It is a paradox, but no written law can prevent injustice.


From Free unofficial Italian translation@21:1/5 to All on Fri Jan 14 12:40:02 2022

I don't know the law in Switzerland, but in your case you need to take into account the Worker Rights and not just the privacy law. Furthermore, the nature of the goods and services produced by the company must also be considered.


From Thomas Goirand@21:1/5 to Davide Prina on Fri Jan 14 12:20:02 2022

On 1/13/22 22:07, Davide Prina wrote:

> So facial recognition will be illegal for doing workers authentication
> or for identify clients in your shop or...

Let's say we have facial recognition to enter a data center, is this
illegal as well? Will that be also illegal in Switzerland?

Cheers,

Thomas Goirand (zigo)


From Gard Spreemann@21:1/5 to M. Zhou on Fri Jan 14 16:00:01 2022

Thank you for your work in this area, and wise thoughts, as always!

"M. Zhou" <lumin@debian.org> writes:

> My conclusion is: "Users with special demands can take care of themselves,
> as we are unable to go far on our own." In terms of GPU computing, Debian
> is providing a great system as a foundation for development and applications.

[…]

> Based on my interpretation, it means Debian might step aside from the world of
> AI applications to fully exercise software freedom. It's a pity but Debian's
> major role in the whole thing is a solid system.

I understand how you reach these conclusions, both from the POV of
hardware driver non-freedom and from the POV of the toxic candy problem
of trained models. And while I agree with your conclusions, I do worry
about the prospect of the lines blurring.

It's not unreasonable to expect that AI models become standard
components of certain classes of software relatively soon. No matter
our position on the matter, I suspect it will affect lots of
"non-special", "ordinary" software sooner rather than later. That is not
to say that that should change our position; it is just to say that I
think we should worry.

What do we do if/when an image compression scheme involving a deep
learning model becomes popular? What do we do if/when every new FOSS
game ships with an RL agent that takes 80 GPU-weeks of training to
reproduce (and upstream supports nvidia only)? When every new text
editor comes with an autocompleter based on some generative model that
upstream trained on an unclearly licensed scraping of a gazillion
webpages?

-- Gard


From M. Zhou@21:1/5 to Gard Spreemann on Fri Jan 14 17:00:02 2022

On Fri, 2022-01-14 at 15:35 +0100, Gard Spreemann wrote:

> I understand how you reach these conclusions, both from the POV of
> hardware driver non-freedom and from the POV of the toxic candy problem
> of trained models. And while I agree with your conclusions, I do worry
> about the prospect of the lines blurring.

Indeed. But I eventually figured out that "lazy evaluation" on this
problem is the most realistic solution for distribution developers.
I'm not worried about it. See the reason below.

> It's not unreasonable to expect that AI models become standard
> components of certain classes of software relatively soon. No matter
>
> [...]
>
> What do we do if/when an image compression scheme involving a deep
> learning model becomes popular? What do we do if/when every new FOSS
> game ships with an RL agent that takes 80 GPU-weeks of training to
> reproduce (and upstream supports nvidia only)? When every new text
> editor comes with an autocompleter based on some generative model that
> upstream trained on an unclearly licensed scraping of a gazillion
> webpages?

Indeed. Deep learning has been demonstrated to be effective in video
compression as well. However, research projects are not entering
Debian. Only implementations of industry standards enter
our archive. Only when a standard like H.267 (imagined) really
introduces deep learning as a part of the core algorithm should
we worry about the blurred borderline. However, even if that
eventually happens, upstreams such as videolan and ffmpeg will
have to think about GPL interpretation before we do.
There is already a historical example from ffmpeg where pre-trained
convolution kernels (in a header file) are excluded from the GPL
source code. And I bet even the ISO standards group would have to
think about the potential license/legal issues before introducing
that.

An RL agent that takes 80 GPU-weeks to train is also highly likely to
require a powerful GPU for inference when we play such a game.
I play lots of games, and what kind of open source game has
reached that level of GPU demand? Before that
comes true for free software games, it will first appear in
commercial titles, ahead of free software games by decades.

Generative models for code completion are already a widely known
problem, such as GitHub's Copilot. They are fancy and useful,
but before we really think about the blurred borderline, we
have already seen how controversial it was.

Let's step back a little bit. When what you said all comes true,
there will be some way for end users to install them onto
the system.
A relevant example is vscode. It is a prevalent editor, favored
by a large user group across all systems. vscode's
absence from the official repository is not stopping the upstream
from distributing their own .deb packages. I understand how
tricky it is to package in our archive. I believe the same
thing will happen for new fancy AI tools (e.g., the face
authentication for linux tool already has its own .deb package).

Let me quote a fellow developer: "In Debian we should
stop from chasing rabbits." To me, "lazy evaluation" of these
problems is seemingly the best strategy. Given Debian's
role in this ecosystem, thinking about serious issues before
our upstreams do is destined to make negligible technical progress.

When we really have to execute that "lazy evaluation", we
will not be unprepared, since the community is already aware of
the precautions and warnings.


From Davide Prina@21:1/5 to Thomas Goirand on Fri Jan 14 21:00:01 2022

On 14/01/22 12:02, Thomas Goirand wrote:

> On 1/13/22 22:07, Davide Prina wrote:
>
>> So facial recognition will be illegal for doing workers authentication
>> or for identify clients in your shop or...
>
> Let's say we have facial recognition to enter a data center, is this
> illegal as well? Will that be also illegal in Switzerland?

Switzerland is an anomaly: it is the state that gains the most advantage from
the European Single Market, but it is not in the European Single Market,
because each EU state has its own "contract" with it. I know that
the EU is trying to end these individual "contracts" and make Switzerland
join the European Single Market (I don't know whether they have already reached
an agreement).

If Switzerland joins the European Single Market, it will not be able to
participate in the making of new EU laws, but it will need to adopt
all the new EU laws that the European Single Market requires, for example
privacy laws.

Note: the ePrivacy Regulation is not yet approved, so it can be
changed before approval.

But under the current privacy law, the Italian privacy authority has forbidden
and fined a public administration that had started to use workers'
fingerprints as a method of letting them enter and exit the workplace.

If you know Italian, you can read the following (I picked a random article): http://www.lavorosi.it/rapporti-di-lavoro/riservatezza/garante-privacy-ordinanza-del-14012021-no-alluso-delle-impronte-digitali-dei-dipendenti-s/

Ciao
Davide


From tomas@tuxteam.de@21:1/5 to Free unofficial Italian translation on Fri Jan 14 20:20:01 2022

On Fri, Jan 14, 2022 at 05:33:32PM +0000, Free unofficial Italian translation - FUIT wrote:

> It may seem like a stupid question, but are there any open source programs based on artificial intelligence for the recognition and forensic analysis of the voice print?

Wikipedia [1] is your friend. From there: bob.bio.spear [2] (GPLv3),
ALIZE [3] (LGPL) (there may be others, of course).

Cheers
[1] https://en.wikipedia.org/wiki/Speaker_recognition
[2] https://pypi.org/project/bob.bio.spear/
[3] https://alize.univ-avignon.fr/mediawiki/index.php/Main_Page

--
tomás


From Davide Prina@21:1/5 to tomas@tuxteam.de on Sat Jan 15 12:20:02 2022

On 14/01/22 07:01, tomas@tuxteam.de wrote:

> On Thu, Jan 13, 2022 at 10:07:05PM +0100, Davide Prina wrote:
>
>> On 13/01/22 01:00, M. Zhou wrote:
>>
>>> Cool and useful stuff keeps emerging -- e.g., Facial Authentication for Linux
>>> https://github.com/boltgolt/howdy
>>
>> note that EU (European Union) Privacy is managed by the GDPR Regulation and
>> the ePrivacy Directive.
>> The Directive will be replaced by the ePrivacy Regulation that will have
>> more strict rules (probably this will be approved this year).
>
> As far as I understand the GDPR won't restrict the tech itself, but only
> its use. Which makes sense. Basically, no consent => no use, except in
> very restricted scenarios (e.g. public security).
>
> That said, to have a workable face recognition, you'll need a training
> set (at least with current "solutions"), so you'll have to collect
> consent from all those face "providers".

I think it is not that simple. The reply could be very long and
detailed; I will try to be concise and point out a few things
that I think are "interesting".

If you handle biometric data of EU citizens, you must also consider:

* citizens can revoke their consent: so you probably must withdraw your model
and generate a new one without the revoked data. But if you have saved
your model in a CVS/DVCS or similar... or you have distributed the
model... how can you do that?

* with the new ePrivacy legislation, in some cases consent has a
period of validity (I don't know whether this applies to this type of use)
and you need to have the consent renewed... or delete the data (there are
some exceptions, but I don't think they apply in this case;
and in any case these exceptions may only give a longer validity period)

* if you store and use biometric data, you have to inform the national
privacy authority and also get its OK for the use you are declaring. The
consent is valid only if you have completed this step beforehand.

* in theory, from the little I know about AI, a model is something
similar to an aggregation/anonymization... but for facial recognition, a
researcher has been able to extract original faces from a model used to
generate faces of non-existent people. Other researchers have
demonstrated that by combining anonymized data with public data,
they can identify some of the real people in the anonymized data. In these
cases the biometric data can be stored only in EU territory, and the
servers where it is stored must not be accessible from servers outside
EU territory (as in my previous reply, in reality there are other
territories outside the EU that count, if they are part of the...)

> All the above said, I'm not a lawyer. Nor do I play one on TV :)

I'm not a law/privacy expert, so I may be mistaken about something.

Ciao
Davide


From Andrey Rahmatullin@21:1/5 to Free unofficial Italian translation on Sat Jan 15 14:20:01 2022

On Fri, Jan 14, 2022 at 08:23:03PM +0000, Free unofficial Italian translation - FUIT wrote:

> I know wikipedia. I was hoping there was a forensic court expert for the voiceprint among you.

This sounds like a wrong topic for debian-project@.

--
WBR, wRAR


From tomas@tuxteam.de@21:1/5 to Davide Prina on Sat Jan 15 14:40:01 2022

On Sat, Jan 15, 2022 at 10:45:35AM +0100, Davide Prina wrote:

> On 14/01/22 07:01, tomas@tuxteam.de wrote:

[...]

>> That said, to have a workable face recognition, [...] you'll have to collect
>> consent from all those face "providers".
>
> I think that is not so simple. The reply can be very long and articulated, I will try to be very concise and let you know some points that I think can be very "interesting".

Basically, we do agree: perhaps "collect consent" was a bit sloppy and
suggested a one-time action. That wasn't what I wanted to convey -- for
each image you use in your training set, you'd have to keep enough
metadata to document the person's consent (and to make revocation
possible). At each change, you'd have to re-train your model (or do
something equivalent).
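
To make the bookkeeping above a bit more concrete, here is a minimal sketch in Python. Everything in it (the names `ConsentRecord`, `TrainingSet`, `needs_retrain`, and the idea of one record per image) is a hypothetical illustration, not taken from any existing tool or from any legal requirement; it only shows per-image consent metadata, revocation by person, and a flag saying the model would have to be re-trained.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class ConsentRecord:
    """Hypothetical per-image metadata documenting a person's consent."""
    image_id: str
    subject_id: str
    granted_at: datetime
    revoked_at: Optional[datetime] = None

    @property
    def active(self) -> bool:
        # Consent counts as active until it has been revoked.
        return self.revoked_at is None


@dataclass
class TrainingSet:
    """Hypothetical registry: one consent record per training image."""
    records: Dict[str, ConsentRecord] = field(default_factory=dict)
    needs_retrain: bool = False

    def add(self, image_id: str, subject_id: str) -> None:
        now = datetime.now(timezone.utc)
        self.records[image_id] = ConsentRecord(image_id, subject_id, now)
        self.needs_retrain = True  # the training data changed

    def revoke(self, subject_id: str) -> int:
        """Revoke all active consents of one person; return how many."""
        now = datetime.now(timezone.utc)
        count = 0
        for rec in self.records.values():
            if rec.subject_id == subject_id and rec.active:
                rec.revoked_at = now
                count += 1
        if count:
            self.needs_retrain = True  # the old model used revoked data
        return count

    def usable_images(self) -> List[str]:
        """Image ids that may still be used for (re-)training."""
        return sorted(r.image_id for r in self.records.values() if r.active)
```

A caller would add images as consent is collected, call `revoke()` when a person withdraws, and re-train from `usable_images()` whenever `needs_retrain` is set. What this sketch does not address is the harder part raised earlier in the thread: copies of an old model that have already been stored or distributed.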

Cheers
--
t


From Paul Wise@21:1/5 to M. Zhou on Sun Jan 16 03:30:01 2022

On Wed, 2022-01-12 at 19:00 -0500, M. Zhou wrote:

> I've had some discussions with several fellow developers on suggesting Debian to buy some GPUs to extend its infrastructures for better GPU support.

Was there a plan for what to use these GPUs for?

Were they needed for driver/other package building/testing?

Were they to be used for libre model training?

> Although my initial thought is to make Debian useful in more areas like GPU computing, I finally realized that by accepting new non-free blobs as an organization, we are further loosing our core value written on our homepage --
> "a complete free operating system".

This isn't any different to most modern hardware devices, which either
have non-free blobs embedded in them or have non-free blobs uploaded to
them or both. Even worse, server hardware often requires proprietary
software running in userspace to manage parts of the server. The modern hardware industry does not produce hardware that allows Debian to avoid
dealing with these blobs in some way. GPUs aren't any different here
IMO. Things may change with RISC-V, OpenBMC and other efforts though.

> I predict that the ML-Policy [1] will work as a warning on potential
> issues instead of some practical guidance on packaging

Mostly agreed with this section.

> Based on my interpretation, it means Debian might step aside from the
> world of AI applications to fully exercise software freedom. It's a
> pity but Debian's major role in the whole thing is a solid system.

I think we should simply follow our social contract and guidelines as
usual. Package useful things, but place them in contrib or non-free as appropriate depending on the situation. Advocate for the release of
libre training data, retraining from scratch, license changes etc.

PS: I note that we already have Toxic Candy models in Debian main.
For example the rnnoise model was trained from proprietary data
but is available in Debian source packages:

$ apt-file search -I dsc rnnoise

--
bye,
pabs

https://wiki.debian.org/PaulWise



From M. Zhou@21:1/5 to Paul Wise on Sun Jan 16 03:40:01 2022

Hi Paul,

Thanks for the additional questions.

On Sun, 2022-01-16 at 09:43 +0800, Paul Wise wrote:

On Wed, 2022-01-12 at 19:00 -0500, M. Zhou wrote:

I've had some discussions with several fellow developers on suggesting
Debian to buy some GPUs to extend its infrastructures for better GPU
support.

Was there a plan for what to use these GPUs for?

There is no specific plan, but I can list some possible uses if we had one.

Assuming it's an NVIDIA GPU, we can use it for

1. building and testing CUDA-related computational software,
such as tensorflow-cuda, pytorch-cuda, magma, etc.
(this demand is confirmed by us, the Debian Deep Learning Team)

2. building and testing some multimedia tools, such as ffmpeg
(when linked against NVIDIA's library, the resulting ffmpeg
binary is not redistributable).
(this demand is not confirmed with the multimedia team)

3. building and testing GPU acceleration for software such as
blender.
(not confirmed with the maintainer)

4. transcoding our videos (e.g., our DebConf videos).
(not confirmed with the DebConf team)

5. training neural networks?
(such demand should be quite rare, given my viewpoint
in the original post)

The problem is that nvidia-driver is non-free, and every
upper-layer application unavoidably depends on it. The open source
driver nouveau cannot do any of the above.
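
To see which kernel driver (for example nouveau, nvidia or amdgpu) is actually bound to a GPU on a given machine, something like the sketch below may help. It assumes the usual Linux sysfs layout under /sys/bus/pci/devices; the function name `gpu_drivers` is made up for this example.

```python
from pathlib import Path


def gpu_drivers(sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> bound kernel driver name (or None if unbound)
    for every display-class PCI device found under sysfs_root."""
    found = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        class_file = dev / "class"
        if not class_file.is_file():
            continue
        pci_class = class_file.read_text().strip()
        # PCI base class 0x03 means "display controller".
        if not pci_class.startswith("0x03"):
            continue
        driver_link = dev / "driver"
        # "driver" is a symlink to the driver directory when one is bound.
        found[dev.name] = driver_link.resolve().name if driver_link.exists() else None
    return found


if __name__ == "__main__":
    for address, driver in gpu_drivers().items():
        print(address, driver or "(no driver bound)")
```

This only reports what is bound at the moment; it says nothing about whether that driver can run CUDA, ROCm or OpenCL workloads.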

Assuming it's an AMD GPU, we can use it for

1. building and testing ROCm (AMD's open-source counterpart
to CUDA). It looks like the amdgpu driver in the kernel
is enough to drive ROCm without requiring a non-free blob.
(I'm not sure whether firmware is still required)
(people on debian-ai@l.d.o have recently been working on packaging it)

2. building and testing deep learning frameworks that have
added ROCm support, such as pytorch.

3. building and testing any software with OpenCL support, such as
opencv, etc., so we don't have to do everything with pocl.

4. 5. same as NVIDIA's 4 and 5.

Assuming it's an Intel GPU,

I simply don't know. Let's wait and see.
Intel is working on SYCL (a higher-level programming model that
grew out of OpenCL); Intel's implementation is called DPC++
upstream. Intel has not yet merged its SYCL support into upstream
LLVM.

Were they needed for driver/other package building/testing?

A non-free driver is required for NVIDIA GPUs. Unfortunately, for
industry users (especially machine learning users), NVIDIA GPUs
are the most widely supported and mature option.

The kernel already has a driver for AMD GPUs. I'm just not sure
whether firmware is required to run ROCm, OpenCL, etc.
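
One way to narrow down the firmware question is to ask `modinfo` which firmware files a kernel module declares. The sketch below wraps `modinfo -F firmware <module>`; the helper names are made up for this example. A non-empty list only means the module may request those files on some hardware, so it does not by itself settle what ROCm or OpenCL need on a particular GPU.

```python
import shutil
import subprocess
from typing import List, Optional


def parse_firmware_list(modinfo_output: str) -> List[str]:
    """Return the non-empty lines of `modinfo -F firmware` output."""
    return [line.strip() for line in modinfo_output.splitlines() if line.strip()]


def declared_firmware(module: str) -> Optional[List[str]]:
    """List firmware files declared by a kernel module.

    Returns None when modinfo is not on PATH or does not know the module
    (on Debian, modinfo may live in /sbin, outside a normal user's PATH).
    """
    modinfo = shutil.which("modinfo")
    if modinfo is None:
        return None
    result = subprocess.run(
        [modinfo, "-F", "firmware", module],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return parse_firmware_list(result.stdout)


if __name__ == "__main__":
    files = declared_firmware("amdgpu")
    if files is None:
        print("modinfo could not answer for amdgpu on this system")
    else:
        print(f"amdgpu declares {len(files)} firmware file(s)")
```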

Were they to be used for libre model training?

Once we finish packaging the deep learning frameworks with the
specific hardware support, we can do so, provided that
we have the corresponding "libre" data.

This isn't any different to most modern hardware devices, which either
have non-free blobs embedded in them or have non-free blobs uploaded to
them or both. Even worse, server hardware often requires proprietary
software running in userspace to manage parts of the server. The modern
hardware industry does not produce hardware that allows Debian to avoid
dealing with these blobs in some way. GPUs aren't any different here
IMO. Things may change with RISC-V, OpenBMC and other efforts though.

I still remember the microcode example from the last discussion,
and it's true. But proprietary server software is unavoidable
if the server is to be fully functional, whereas a GPU is not.
An infra server can be fully functional without a GPU, so the
GPU is not unavoidable.

Based on my interpretation, it means Debian might step aside from the
world of AI applications to fully exercise software freedom. It's a
pity but Debian's major role in the whole thing is a solid system.

I think we should simply follow our social contract and guidelines as
usual. Package useful things, but place them in contrib or non-free as
appropriate depending on the situation. Advocate for the release of
libre training data, retraining from scratch, license changes etc.

Yes, recalling our initial motivation and principles is a very good
idea when facing complicated issues. I fully agree.

PS: I note that we already have Toxic Candy models in Debian main.
For example the rnnoise model was trained from proprietary data
but is available in Debian source packages:

$ apt-file search -I dsc rnnoise

Well... right. I've seen related bug reports. Thanks!

