• Re: Mass svn-to-git migration - Progress report

    From Mattia Rizzolo@21:1/5 to Diederik de Haas on Wed Apr 12 17:30:01 2023
    On Thu, Apr 06, 2023 at 11:02:26PM +0200, Diederik de Haas wrote:
    The idea I had/have is to do the conversion and place them in a separate namespace/group, ready to be picked up by a prospective maintainer.

    I've now created a group for that on Salsa under which the converted repos can
    be stored: https://salsa.debian.org/groups/alioth-to-salsa-migration-team (*)

    That way a prospective maintainer can use that as *a* source, but can also use
    other sources like f.e. "gbp import-dscs" to create/rewrite a/the proper (git)
    history to their liking for the to be adopted package before it gets placed in
    the 'normal' Salsa structure.


    This is all fine and dandy, but one matter that doesn't seem to be
    mentioned in anything I've read is how/if you are going to match the SVN usernames with proper git-ready authorship information (i.e. email
    addresses).

    Did you prepare such a matching list? Like, I have a 111-line authors
    file that I used in the past for some conversions I did (coming from the debian-med team, fwiw), but I'm sure that if you are going to properly
    convert collab-maint you'll need a much longer list.

    --
    regards,
    Mattia Rizzolo

    GPG Key: 66AE 2B4A FCCF 3F52 DA18 4D18 4B04 3FCD B944 4540
    More about me: https://mapreri.org
    Launchpad user: https://launchpad.net/~mapreri
    Debian QA page: https://qa.debian.org/developer.php?login=mattia

  • From Diederik de Haas@21:1/5 to Mattia Rizzolo on Wed Apr 12 17:49:46 2023
    On Wednesday, 12 April 2023 17:12:14 CEST Mattia Rizzolo wrote:
    This is all fine and dandy, but one matter that doesn't seem to be
    mentioned in anything I've read is how/if you are going to match the SVN usernames with proper git-ready authorship information (i.e. email addresses).

    In my OP I explicitly mentioned the following:

    On Friday, 27 January 2023 19:56:37 CEST Diederik de Haas wrote:
    - If it needs to be done, isn't it way better to do a mass-migration of all the repos which haven't been converted yet? There may be a high similarity between the various SVN repos, but all the projects within one SVN repo likely share many things? Like svn-user to git-user mapping?

    And Jelmer responded with this:

    On Friday, 27 January 2023 20:14:43 CEST Jelmer Vernooij wrote:
    * lintian-brush has a mapping of team names to locations in salsa,
    here: https://salsa.debian.org/jelmer/lintian-brush/-/blob/master/lintian_brush/salsa.py

    While I haven't made such a list yet, it surely was considered.
    As for the how, that'll likely be a manual task in which I will try to construct such a mapping file by (a lot of) searching.

    Or am I misunderstanding your question?

    Did you prepare such a matching list? Like, I have a 111-line authors
    file that I used in the past for some conversions I did (coming from the debian-med team, fwiw), but I'm sure that if you are going to properly convert collab-maint you'll need a much longer list.

    One of the reasons for posting about this is the hope that someone else who may have experience with a SVN-to-Git migration would share their experiences, tools and other things that could benefit such a migration, like f.e. an authors list. So if you could share that, that would be most welcome :-)

    Cheers,
    Diederik

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mattia Rizzolo@21:1/5 to Diederik de Haas on Wed Apr 12 19:10:01 2023
    On Wed, Apr 12, 2023 at 05:49:46PM +0200, Diederik de Haas wrote:
    - If it needs to be done, isn't it way better to do a mass-migration of all the repos which haven't been converted yet? There may be a high similarity between the various SVN repos, but all the projects within one SVN repo likely share many things? Like svn-user to git-user mapping?


    Guess I missed this one, sorry!

    On Friday, 27 January 2023 20:14:43 CEST Jelmer Vernooij wrote:
    * lintian-brush has a mapping of team names to locations in salsa,
    here: https://salsa.debian.org/jelmer/lintian-brush/-/blob/master/lintian_brush/salsa.py

    I don't think team names are so relevant to my concern, though…?

    While I haven't made such a list yet, it surely was considered.
    As for the how, that'll likely be a manual task in which I will try to construct such a mapping file by (a lot of) searching.

    Indeed, filling all of them can be quite annoying.

    Did you prepare such a matching list? Like, I have a 111-line authors
    file that I used in the past for some conversions I did (coming from the debian-med team, fwiw), but I'm sure that if you are going to properly convert collab-maint you'll need a much longer list.

    One of the reasons for posting about this is the hope that someone else who may have experience with a SVN-to-Git migration would share their experiences,
    tools and other things that could benefit such a migration, like f.e. an authors list. So if you could share that, that would be most welcome :-)

    What I used in the past is this script:


    | if [ -z "$SVN_URL" ]; then
    |     export SVN_URL=svn://svn.debian.org/svn/debian-med/trunk/packages/
    | fi
    | if [ $# != 1 ] ; then
    |     echo "Usage: $0 <package to convert>"
    |     exit
    | fi
    | export PKG=$1
    |
    | git svn clone \
    |     ${SVN_URL}/${PKG} \
    |     -T /trunk/${PKG} \
    |     --tags tags \
    |     --trunk trunk \
    |     --authors-file=debian-med-authors \
    |     --prefix=svn-import/ \
    |     --no-metadata \
    |     ${PKG} 2>&1 | tee >> svn2git_${PKG}.log



    I'll attach the authors-file that I used (which has some addition to the
    one from the debian-med repo, although probably I also don't have the
    latest version) in a separate private email, since it's likely better to
    not spread a collection of email addresses to a public mailing list.
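
    For anyone who has to build such a file from scratch, here is a minimal sketch (not part of the script above) of collecting the SVN usernames that still need a mapping. It assumes the usual `svn log --quiet` header lines of the form "rN | user | date" and the git-svn authors-file syntax "login = Name <email>". The function names and the example.invalid placeholder addresses are made up for illustration; real names and addresses still have to be filled in by hand.

```shell
#!/bin/sh
# Sketch only: list the distinct SVN usernames found in `svn log --quiet`
# output read from stdin (header lines look like "r42 | user | date ...").
svn_usernames() {
    awk -F' [|] ' '/^r[0-9]+ [|] / { print $2 }' | sort -u
}

# Turn usernames (one per line on stdin) into authors-file skeleton entries.
# The name and address written here are placeholders, not real data.
authors_skeleton() {
    while IFS= read -r user; do
        if [ -n "$user" ]; then
            printf '%s = %s <%s@example.invalid>\n' "$user" "$user" "$user"
        fi
    done
}

# Assumed invocation (needs network access to the SVN server):
#   svn log --quiet "$SVN_URL" | svn_usernames | authors_skeleton > authors-skeleton
```

    Running this once per SVN repository rather than per package should keep the list of usernames to look up reasonably short, since projects in one repository tend to share committers.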

    Another possibly nice authors-file could be the one that was used in the DPMT/PAPT migration. Stefano Rivera <stefanor@debian.org> took care of
    the PAPT one, so maybe email him to see if he has a compiled list of
    authors?



    So, overall, really nothing particularly complex.

    --
    regards,
    Mattia Rizzolo

    GPG Key: 66AE 2B4A FCCF 3F52 DA18 4D18 4B04 3FCD B944 4540
    More about me: https://mapreri.org
    Launchpad user: https://launchpad.net/~mapreri
    Debian QA page: https://qa.debian.org/developer.php?login=mattia

  • From Diederik de Haas@21:1/5 to All on Wed Apr 12 19:35:30 2023
    On Wednesday, 12 April 2023 18:45:31 CEST Jelmer Vernooij wrote:
    I think the git repositories should just be created directly under
    debian/. There's no reason not to, it makes it easier to find them
    (not everybody will know about the alioth-to-salsa-migration-team),
    and removes the need to create a separate group and to fork them to
    debian/ later. There's not much controversial there - it's just pure
    git to git.

    If debian/control points to the alioth-to-salsa-migration-team repo, it shouldn't be hard to find.

    On the subversion migrations, the hardest part is in dealing
    with the differences in habits between svn-buildpackage and git-buildpackage (or whatever you're expecting people to use). You'll need
    to convert svn-buildpackage settings to e.g. git-buildpackage somehow, and e.g. potentially weave in the upstream sources
    (lots of packages in svn ship just debian/ whereas in git it's common
    to include the upstream source). If you don't do that, then
    it's probably better to not do a conversion from svn at all, but e.g.
    do a straight import from the archive (e.g. based on the dgit
    repository).

    Then it's better to abort the migration attempt altogether and do a straight import from the archive.

    I wanted to convert id3lib to git and then use that to *learn* packaging with git-buildpackage (and related stuff). But if I first have to learn that and also
    svn-buildpackage, which I really don't want, then it's kind of pointless for me. It would also increase the amount of work 10 or 100 fold?
    And I have no idea how I would _assume_ how other people would use it as I know virtually nothing about (git) packaging myself.

    The idea I had with the a-t-s-m-t repos is that I would not (have to) assume anything about how a prospective maintainer would want to do their packaging, but leave that choice entirely up to them. That's why I wanted to leave open the option of them being able to rewrite the git history to suit their needs before the repo would be moved to f.e. the debian/ namespace.
    All they would get is the old SVN repo, but converted to git.

    Diederik
  • From Jelmer Vernooĳ@21:1/5 to Diederik de Haas on Wed Apr 12 19:20:01 2023
    On Thu, Apr 06, 2023 at 11:02:26PM +0200, Diederik de Haas wrote:
    Hi,

    On Friday, 27 January 2023 20:14:43 CEST Jelmer Vernooij wrote:
    I've been looking at how to do a mass conversion. There's about 375 packages
    still listed as being on alioth (~100 in SVN, ~267 in Git, the rest in something else).
    https://janitor.debian.net/cupboard/result-codes/hosted-on-alioth?campaign=unchanged&include_transient=off&include_historical=off

    That page now returns 747 packages?

    I think I already did some post-processing of that list. Some packages
    have been migrated to salsa, but have not had any uploads since - so
    the janitor can't find them yet.

    What still needs to happen is:

    * The mapping still needs to be tied together with the import script, to
    generate correct URLs to push to and set the Vcs-* headers
    appropriately

    I'm not sure what to do with packages where the owning user or team
    is not on salsa. Add them to the "debian" group?

    * The import script supports just git right now, not svn. There's ~8
    repositories in a VCS other than SVN or Git, which we could just
    migrate manually.

    After sufficient procrastination I started working on this again ;-)

    The idea I had/have is to do the conversion and place them in a separate namespace/group, ready to be picked up by a prospective maintainer.

    I've now created a group for that on Salsa under which the converted repos can
    be stored: https://salsa.debian.org/groups/alioth-to-salsa-migration-team (*)

    That way a prospective maintainer can use that as *a* source, but can also use
    other sources like f.e. "gbp import-dscs" to create/rewrite a/the proper (git)
    history to their liking for the to be adopted package before it gets placed in
    the 'normal' Salsa structure.

    There's also a practical reason (for me) as f.e. the id3lib repo is stored under `collab-maint/deb-maint/id3lib` and having a group on Salsa allows me to
    create subgroups and subsubgroups under which to store the git repo(s).

    I'm open to suggestions how to structure the converted repos and possible (git) repos created to support this mass migration.
    I have attached the document I've written thus far wrt Subversion, but that really needs to be put under version control and possibly/likely split up (and
    linked from a README.md document?).

    *) I've also added/invited Jelmer to that group as Owner, possibly for practical reasons, certainly for the bus-factor reason.

    I think the git repositories should just be created directly under
    debian/. There's no reason not to, it makes it easier to find them
    (not everybody will know about the alioth-to-salsa-migration-team),
    and removes the need to create a separate group and to fork them to
    debian/ later. There's not much controversial there - it's just pure
    git to git.

    On the subversion migrations, the hardest part is in dealing
    with the differences in habits between svn-buildpackage and git-buildpackage (or whatever you're expecting people to use). You'll need
    to convert svn-buildpackage settings to e.g. git-buildpackage somehow, and e.g. potentially weave in the upstream sources
    (lots of packages in svn ship just debian/ whereas in git it's common
    to include the upstream source). If you don't do that, then
    it's probably better to not do a conversion from svn at all, but e.g.
    do a straight import from the archive (e.g. based on the dgit
    repository).

    Jelmer

  • From Jelmer Vernooĳ@21:1/5 to Diederik de Haas on Wed Apr 12 23:00:02 2023
    On Wed, Apr 12, 2023 at 07:35:30PM +0200, Diederik de Haas wrote:
    On Wednesday, 12 April 2023 18:45:31 CEST Jelmer Vernooij wrote:
    I think the git repositories should just be created directly under
    debian/. There's no reason not to, it makes it easier to find them
    (not everybody will know about the alioth-to-salsa-migration-team),
    and removes the need to create a separate group and to fork them to
    debian/ later. There's not much controversial there - it's just pure
    git to git.

    If debian/control points to the alioth-to-salsa-migration-team repo, it shouldn't be hard to find.

    It seems simpler to just immediately push to debian/ where we would
    eventually expect the majority of repositories to end up anyway. It
    may be simple to fork, it's simpler to not even have to fork.

    (I don't think that holds for the SVN imports, where people probably
    want to do a lot of post-processing)

    If you're planning to update debian/control in a way that's visible to
    people you'll have to do NMUs. It seems bad form to do NMUs updating
    the Vcs-Git field to point to a repository that the original
    maintainer doesn't have access to.
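
    To make the Vcs-* part concrete, here is a rough sketch of rewriting the fields in debian/control once a target location is known. It is only an illustration: the function name is hypothetical, "debian/id3lib" is just an example project path, and it rewrites existing Vcs-Git and Vcs-Browser fields without adding missing ones or touching other Vcs-* fields.

```shell
#!/bin/sh
# Sketch only: point the Vcs-Git and Vcs-Browser fields of a control file
# at a Salsa project.
#   $1: path to the control file
#   $2: Salsa project path, e.g. "debian/id3lib" (example value)
set_vcs_fields() {
    control=$1
    project=$2
    url="https://salsa.debian.org/$project"
    # Write to a temporary file first, then replace the original on success.
    sed -e "s|^Vcs-Git:.*|Vcs-Git: $url.git|" \
        -e "s|^Vcs-Browser:.*|Vcs-Browser: $url|" \
        "$control" > "$control.new" && mv "$control.new" "$control"
}

# Assumed usage from the top of an unpacked source package:
#   set_vcs_fields debian/control debian/id3lib
```

    The change only becomes visible to users after an upload, which is where the NMU concern above comes in.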

    On the subversion migrations, the hardest part is in dealing
    with the differences in habits between svn-buildpackage and git-buildpackage
    (or whatever you're expecting people to use). You'll need
    to convert svn-buildpackage settings to e.g. git-buildpackage somehow, and e.g. potentially weave in the upstream sources
    (lots of packages in svn ship just debian/ whereas in git it's common
    to include the upstream source). If you don't do that, then
    it's probably better to not do a conversion from svn at all, but e.g.
    do a straight import from the archive (e.g. based on the dgit
    repository).

    Then it's better to abort the migration attempt altogether and do a straight import from the archive.

    I wanted to convert id3lib to git and then use that to *learn* packaging with git-buildpackage (and related stuff). But if I first have to learn that and also
    svn-buildpackage, which I really don't want, then it's kind of pointless for me. It would also increase the amount of work 10 or 100 fold?
    And I have no idea how I would _assume_ how other people would use it as I know virtually nothing about (git) packaging myself.

    The idea I had with the a-t-s-m-t repos is that I would not (have to) assume anything about how a prospective maintainer would want to do their packaging, but leave that choice entirely up to them. That's why I wanted to leave open the option of them being able to rewrite the git history to suit their needs before the repo would be moved to f.e. the debian/ namespace.
    All they would get is the old SVN repo, but converted to git.

    FWIW I think there's value in making conversions of the SVN
    repositories available as Git repositories, if they're not usable
    as-is; it at least removes some of the steps in migrating packaging
    from SVN to Git, and provides a start for migrations. But there's more post-processing / conversion necessary beyond that for packages to
    function like other Debian packages in Git.

    Cheers,

    Jelmer

  • From Diederik de Haas@21:1/5 to All on Tue May 2 19:54:21 2023
    On Tuesday, 2 May 2023 19:52:03 CEST Jelmer Vernooij wrote:
    On Tue, May 02, 2023 at 07:37:00PM +0200, Diederik de Haas wrote:
    Hi Jelmer,

    My email program ate the contents of the email including the attachment. This is the private email Mattia sent us. Could you forward it to me?

    Here it is:

    Awesome, thanks :)

    Another possibly nice authors-file could be the one that was used in the DPMT/PAPT migration. Stefano Rivera <stefanor@debian.org> took care of
    the PAPT one, so maybe email him to see if he has a compiled list of authors?

    If you do, could you possibly share with me the final result?
    Thank you!

    Will do.

    Cheers,
    Diederik