• Questions about packaging the 'googleapis' project

    From Oliver Reiche@21:1/5 to All on Tue Jun 13 19:30:01 2023
    Dear mentors,

    Wookey and I are trying to come up with a sane concept to package the googleapis project [1]. During our initial investigation a few
    questions came up that we would like to discuss publicly.


    BACKGROUND:

    'googleapis' is a collection of protocol buffer [2] files (protocol
    buffers being an interface description language for stable binary
    encoding of data) and gRPC service files. From those files, bindings
    for a variety of target languages (Python, Ruby, Java, C++ etc.) can
    be generated using the 'protoc' compiler with gRPC plugins. Upstream
    _does_ offer a Makefile for generating those bindings, although it is
    apparently quite outdated (it ignores the protos in the subfolder
    'grafeas'). However, this Makefile only generates source files (and
    headers), which is fine for Python etc. but not particularly useful
    for Java/C++ etc. Furthermore, compiling these source files yourself
    can be quite tedious, because you need to know the dependency
    structure within the project, and this structure changes rather
    frequently. Example:

    Depending on 'google/longrunning/operations.pb.cc' requires you to also
    compile and link
    - google/api/annotations.pb.cc
    - google/rpc/status.pb.cc
    for an older version of the project (143084a2624b6591ee1f9d23e7f5241856642f4d).

    Whereas on current master, you additionally need to compile and link google/api/client.pb.cc.

    Most users probably do not want to deal with such internal
    dependencies and would just like to run:
        apt install libgoogleapis-dev
        pkg-config --libs googleapis_longrunning

    Therefore, Wookey's idea was to also compile the Java/C++ bindings and
    package the resulting libraries. Here is where things become
    difficult:
    - We do not have a build description for most of the bindings (some
      subfolders have Bazel BUILD files, but most do not).
    - We are talking about ~3,500 proto files. Building all of them
      results in extremely large files:
        - jar: ~160MB
        - shared lib: ~3GB (with debug info), ~180MB (after stripping)
        - static lib: ~11GB (with debug info), ~600MB (after stripping)


    QUESTIONS:

    1. Due to the missing build description, is it ok if the maintainer
    provides a Makefile for building the C++ libraries in ./debian?

    2. With such large libraries, I guess it makes sense to split them up.
    I think a good approach to group proto files (for separation into
    different libraries) would be to look at their 'package' identifier
    (like a namespace, can be read from the file). Some packages are
    "sub packages" (e.g., grafeas.v1beta1.discovery) that might cause
    cyclic dependencies. Therefore, I would suggest using a heuristic
    that cuts off the package ID at the first segment that matches
    '^v[1-9]+' (e.g., grafeas.v1beta1, resulting in
    libgoogleapis_grafeas_v1beta1.{a,so}). Doing this will result in
    'only' 413 different packages/libraries. What do you think about this
    approach?
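
A minimal sketch of the cut-off heuristic from question 2. The pattern '^v[1-9]+' and the example grafeas.v1beta1.discovery come from the text above; the function names are hypothetical, and the "libgoogleapis_" prefix with dots replaced by underscores is only inferred from the single example library name given, so treat it as an assumption.

```python
import re

# Pattern quoted in the text: a segment starting with 'v' followed by digits 1-9.
VERSION_SEGMENT = re.compile(r"^v[1-9]+")


def library_group(package_id: str) -> str:
    """Cut the package ID off after the first segment that looks like a version."""
    segments = package_id.split(".")
    for index, segment in enumerate(segments):
        if VERSION_SEGMENT.match(segment):
            return ".".join(segments[: index + 1])
    # No version-like segment found: keep the full package ID.
    return package_id


def library_name(package_id: str) -> str:
    """Assumed naming scheme: 'libgoogleapis_' plus the group, dots as underscores."""
    return "libgoogleapis_" + library_group(package_id).replace(".", "_")


print(library_name("grafeas.v1beta1.discovery"))
```

Packages without a version-like segment are returned unchanged. Whether a leading "google." should be stripped from the library name is left open here.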

    3. What granularity should we use for packaging? Should we provide
    these separated libraries via
    - a single debian package and a single dev package?
    - a debian package and dev package per library?
    - a debian package per library, but a single dev package for all headers?

    4. Such a Makefile (and control file) will be quite lengthy. My
    current solution is to use a Python script for analysing the proto
    files, grouping them according to their package id, building up a
    dependency graph, checking it for cycles, and finally generating the
    Makefile (and control file/pkg-config files etc.). With upcoming
    library releases this script could be extended and rerun.
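
The analysis step in question 4 could look roughly like the sketch below. It is not the actual script: the regular expressions only cover simple 'package' and 'import' statements, all function names are hypothetical, and the proto contents in the usage part are shortened stand-ins rather than the real files.

```python
import re
from collections import defaultdict

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r'^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;', re.MULTILINE
)


def parse_proto(text):
    """Return (package, [imported paths]) for the text of one .proto file."""
    package = PACKAGE_RE.search(text)
    return (package.group(1) if package else ""), IMPORT_RE.findall(text)


def build_graph(protos):
    """Map each package to the set of other packages it imports.

    `protos` maps a proto path to the content of that file.
    """
    parsed = {path: parse_proto(text) for path, text in protos.items()}
    graph = defaultdict(set)
    for package, imports in parsed.values():
        graph[package]  # make sure packages without dependencies appear too
        for imported in imports:
            if imported in parsed:  # ignore imports from outside the tree
                dep = parsed[imported][0]
                if dep != package:
                    graph[package].add(dep)
    return dict(graph)


def find_cycle(graph):
    """Return one dependency cycle as a list of packages, or None."""
    state = {}  # node -> "active" while on the DFS stack, "done" afterwards
    stack = []

    def visit(node):
        state[node] = "active"
        stack.append(node)
        for dep in sorted(graph.get(node, ())):
            if state.get(dep) == "active":
                return stack[stack.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = "done"
        return None

    for node in sorted(graph):
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Shortened stand-ins for three proto files (not the real contents).
protos = {
    "google/longrunning/operations.proto": (
        'syntax = "proto3";\n'
        "package google.longrunning;\n"
        'import "google/api/annotations.proto";\n'
        'import "google/rpc/status.proto";\n'
    ),
    "google/api/annotations.proto": "package google.api;\n",
    "google/rpc/status.proto": "package google.rpc;\n",
}
graph = build_graph(protos)
print(sorted(graph["google.longrunning"]), find_cycle(graph))
```

The resulting graph is what the Makefile, the control file and the pkg-config files would be generated from. Grouping by the cut-off package ID would happen before the edges are added.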

    5. The Java bindings are considerably smaller. In my opinion, those
    could be provided in a single debian package, containing a single jar
    file. What do you think?

    6. As the googleapis repository is not versioned, it is hard to judge
    which protoc version is compatible with the current proto source base.
    I was talking to an ex-Googler and he told me I should look at the
    PiperOrigin-RevId (shown in some of the commits). That's their
    internal linear commit counter. According to him, we should look up
    the protobuf-compiler version that is currently packaged in the
    target release and note its PiperOrigin-RevId. Then we should look
    for that ID in the googleapis commits and package the revision that
    fulfils the condition:
        PiperId(googleapis) <= PiperId(protoc)
    According to him, that's what has been tested at Google internally and
    is guaranteed to work. The same applies to the packaged
    protobuf-compiler-grpc, which is also a build dependency of
    googleapis. Do you think this is a valid approach?
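
To make the condition in question 6 concrete, here is a small sketch that picks the newest googleapis commit whose PiperOrigin-RevId does not exceed the one of the packaged protoc. It assumes the ID appears in the commit message as a line of the form "PiperOrigin-RevId: <number>". The commit hashes, messages and IDs in the usage part are made up for illustration, and the function names are hypothetical.

```python
import re

# Assumed trailer format in commit messages: "PiperOrigin-RevId: 123456".
REV_ID_RE = re.compile(r"^PiperOrigin-RevId:\s*(\d+)\s*$", re.MULTILINE)


def piper_id(message):
    """Return the PiperOrigin-RevId of a commit message, or None if absent."""
    match = REV_ID_RE.search(message)
    return int(match.group(1)) if match else None


def pick_revision(commits, protoc_piper_id):
    """Return the newest commit with PiperId(googleapis) <= PiperId(protoc).

    `commits` is an iterable of (sha, message) pairs, newest first.
    Commits without a PiperOrigin-RevId are skipped.
    """
    for sha, message in commits:
        rev = piper_id(message)
        if rev is not None and rev <= protoc_piper_id:
            return sha
    return None


# Made-up history, newest first (hashes and IDs are not real).
commits = [
    ("ccc333", "feat: add new API\n\nPiperOrigin-RevId: 300"),
    ("bbb222", "chore: manual fix without an ID"),
    ("aaa111", "docs: update comments\n\nPiperOrigin-RevId: 200"),
]
print(pick_revision(commits, protoc_piper_id=250))
```

The (sha, message) pairs could be collected from the git history of the googleapis checkout. That part is left out to keep the sketch self-contained.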

    7. With no version given, what version should we use for this package?


    That's all for now. Any suggestions are very welcome. Many thanks!

    Oliver


    [1] https://github.com/googleapis/googleapis
    [2] https://protobuf.dev/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Wise@21:1/5 to Oliver Reiche on Wed Jun 14 09:40:02 2023
    On Tue, 2023-06-13 at 19:22 +0200, Oliver Reiche wrote:

    1. Due to the missing build description, is it ok if the maintainer
    provides a Makefile for building the C++ libraries in ./debian?
    ...
    4. Such a Makefile (and control file) will be quite lengthy.

    The upstream repo seems to use bazel as its build system, at least
    according to the README. Is that not usable here? The bazel tool
    appears to be packaged in bazel-bootstrap in Debian.

    I note that some of the BUILD.bazel files are themselves generated
    files, so you will also need to figure out how to regenerate them.

       $ git grep -l 'This file was automatically generated by' | wc -l
       396

    7. With no version given, what version should we use for this package?

    I noticed that upstream has a common-protos-1_3_1 tag, perhaps they
    could be convinced to add a version number?

    Or use the default version number uscan uses for unversioned repos.

    --
    bye,
    pabs

    https://wiki.debian.org/PaulWise


  • From David Given@21:1/5 to Paul Wise on Wed Jun 14 11:30:01 2023
    On Wed, 14 Jun 2023 at 09:35, Paul Wise <pabs@debian.org> wrote:
    [...]

    The upstream repo seems to use bazel as its build system, at least
    according to the README. Is that not usable here? The bazel tool
    appears to be packaged in bazel-bootstrap in Debian.


    bazel-bootstrap is very old, unfortunately --- it's bazel 4.2.3, while the current version is 6.something and has moved on a lot since then. bazel's really hard to package.

    The sad thing is that bazel, which is the least bad build system I've ever used, works in a way that's completely antithetical to how Debian wants to build things: it doesn't want to use host software for *anything* and will, e.g., download and recompile libraries you refer to at build time. So you
    end up with mostly static binaries containing embedded copies of any
    libraries you refer to. It even likes to download its own compiler rather
    than using the host one. This is all for perfectly good reasons, but those reasons don't work for Debian.

    It ought to be possible to write bazel files which *do* produce Debian-compliant binaries, but it'll be tough (you'd basically need to replicate the ability to install Debian packages into the bazel build tree
    in order to use them as dependencies), and of course any third party
    software won't be doing that anyway so the package maintainer would need to rewrite large chunks of the build system.


  • From Oliver Reiche@21:1/5 to dg@cowlark.com on Wed Jun 14 14:20:01 2023
    On Wed, Jun 14, 2023 at 11:27 AM David Given <dg@cowlark.com> wrote:

    The sad thing is that bazel, which is the least bad build system I've ever used, works in a way that's completely antithetical to how Debian wants to build things: it doesn't want to use host software for anything and will, e.g., download and recompile
    libraries you refer to at build time. So you end up with mostly static binaries containing embedded copies of any libraries you refer to. It even likes to download its own compiler rather than using the host one. This is all for perfectly good reasons,
    but those reasons don't work for Debian.


    Indeed, the bazel files are not an option, unfortunately. Also, the
    bazel build does not expose the dependency structure among the
    produced libraries. So we still need to analyse the proto files anyway
    to generate the correct dependencies in the control file (and while
    we're at it, also pkg-config files). And if we do that already, we
    might as well extend this to generate the missing build description
    (Makefile), which is really not that hard. (TBH: I would rather use
    the Justbuild build system for compiling the proto files, but it's not
    packaged yet and it has a build dependency on googleapis...)
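
As an illustration of generating pkg-config files from the analysed dependencies, a sketch could look like this. The module name googleapis_longrunning is taken from the pkg-config call earlier in the thread; the names of the required modules, the install paths, the description text and the version string are placeholders, and the function name is hypothetical.

```python
def render_pkgconfig(name, requires, version="0"):
    """Render the text of a .pc file for one generated library.

    `requires` lists the pkg-config modules this library depends on.
    Paths and version are placeholders; a real package would use the
    multiarch library directory and a proper version.
    """
    lines = [
        "prefix=/usr",
        "libdir=${prefix}/lib",
        "includedir=${prefix}/include",
        "",
        f"Name: {name}",
        f"Description: Generated C++ bindings for {name}",
        f"Version: {version}",
    ]
    if requires:
        lines.append("Requires: " + ", ".join(sorted(requires)))
    lines.append(f"Libs: -L${{libdir}} -l{name}")
    lines.append("Cflags: -I${includedir}")
    return "\n".join(lines) + "\n"


# Hypothetical dependency names, derived from the analysed proto imports.
pc_text = render_pkgconfig(
    "googleapis_longrunning", {"googleapis_api", "googleapis_rpc"}
)
print(pc_text)
```

With the dependencies listed under Requires, pkg-config resolves the transitive link flags, so users would not need to know the internal structure themselves.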
