• Ftn I/Os documentation best practices

    From Don Y@21:1/5 to All on Sun Jun 26 12:35:46 2022
    I add a boilerplate to each function definition that
    declares constraints on inputs, expectations of outputs,
    performance issues, etc. I use this to add invariants
    to the code to detect/enforce these conditions.

    But, there is nothing that ensures that I've done
    this -- other than discipline.

    I'm looking at ways to create an IDL that will allow
    for more specific criteria to be included in the
    declaration that could also drive the IDL compiler
    to add suitable invariants as applicable.

    [This makes RPC much more effective but can also
    benefit traditional ftn invocations]

    Any pointers to similar schemes? I've been looking
    through CORBA et al. for hints but they seem to
    focus on bigger machines (where there is more tolerance
    over data types and more overhead expected).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Don Y on Mon Jun 27 09:36:21 2022
    On 26/06/2022 21:35, Don Y wrote:
    > I add a boilerplate to each function definition that
    > declares constraints on inputs, expectations of outputs,
    > performance issues, etc.  I use this to add invariants
    > to the code to detect/enforce these conditions.

    > But, there is nothing that ensures that I've done
    > this -- other than discipline.

    > I'm looking at ways to create an IDL that will allow
    > for more specific criteria to be included in the
    > declaration that could also drive the IDL compiler
    > to add suitable invariants as applicable.

    > [This makes RPC much more effective but can also
    > benefit traditional ftn invocations]

    > Any pointers to similar schemes?  I've been looking
    > through CORBA et al. for hints but they seem to
    > focus on bigger machines (where there is more tolerance
    > over data types and more overhead expected).

    What programming language are you using? If your answer is "C", it's wrong.

    If you are just putting these things in comments, then they will get out
    of sync with the code. The best you can do is writing something like a
    Python script that will read the C code and check for the pattern of
    comments.

    If you want something really useful, you need a programming language
    that will let you write the contracts in the language itself - then they
    can be checked and enforced. Ada, D, and Scala are examples. C++ has
    a Boost.Contract library, and language support for contracts is due in
    C++23 (last I heard - but it might be delayed again).

  • From Grant Edwards@21:1/5 to David Brown on Mon Jun 27 15:18:07 2022
    On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
    > On 26/06/2022 21:35, Don Y wrote:
    >> I add a boilerplate to each function definition that
    >> declares constraints on inputs, expectations of outputs,
    >> performance issues, etc.

    > What programming language are you using? If your answer is "C",
    > it's wrong.

    > If you are just putting these things in comments, then they will get out
    > of sync with the code.

    I'd have to agree. I've worked with many projects and third-party
    libraries over the decades which had a big template of comments for
    every function which described the input/output parameters, return
    value, global variables used, and so on.

    Often these templates generated documents by using something like
    Doxygen.

    And on _every_single_one_ of those projects and libraries, the
    comments were wrong often enough that nobody who knew which way was up
    paid any attention to them. If you wanted to know what the parameters
    were for, what the function returned, and so on, you read the C code.

    A lot of the time, even the numbers and names of the parameters
    described in the template didn't match the code.

    The auto-generated PDF documents and HTML web site looked nice, though.

    --
    Grant

  • From David Brown@21:1/5 to Grant Edwards on Mon Jun 27 19:52:44 2022
    On 27/06/2022 17:18, Grant Edwards wrote:
    > On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
    >> On 26/06/2022 21:35, Don Y wrote:
    >>> I add a boilerplate to each function definition that
    >>> declares constraints on inputs, expectations of outputs,
    >>> performance issues, etc.

    >> What programming language are you using? If your answer is "C",
    >> it's wrong.

    >> If you are just putting these things in comments, then they will get out
    >> of sync with the code.

    > I'd have to agree. I've worked with many projects and third-party
    > libraries over the decades which had a big template of comments for
    > every function which described the input/output parameters, return
    > value, global variables used, and so on.

    > Often these templates generated documents by using something like
    > Doxygen.

    > And on _every_single_one_ of those projects and libraries, the
    > comments were wrong often enough that nobody who knew which way was up
    > paid any attention to them. If you wanted to know what the parameters
    > were for, what the function returned, and so on, you read the C code.

    > A lot of the time, even the numbers and names of the parameters
    > described in the template didn't match the code.

    > The auto-generated PDF documents and HTML web site looked nice, though.


    Accuracy of such in-code documentation varies, but there is generally no
    way to check it automatically. That's one of the reasons it is better
    to use constructs in the programming language, where possible, rather
    than documentation and comments. For preconditions, postconditions and
    invariants, you need a language that has support for contracts. For
    other languages, usually the best you can do is careful choice of names
    and types, along with assert statements.
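
As a rough illustration of the "names, types and assert statements" approach in plain C, a precondition and a postcondition can be written as assertions around the body. This is only a sketch: the function names `days_in_month` and `is_leap_year` are invented for the example and do not come from the thread, and the checks vanish when the code is built with `NDEBUG` defined.

```c
#include <assert.h>

/* Hypothetical example: these names are illustrative only. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Returns the number of days in the given month (1..12) of a Gregorian year. */
int days_in_month(int year, int month)
{
    static const int days[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    /* Precondition: the caller must pass a valid month. */
    assert(month >= 1 && month <= 12);

    int result = days[month - 1];
    if (month == 2 && is_leap_year(year))
        result = 29;

    /* Postcondition: the result is always a plausible month length. */
    assert(result >= 28 && result <= 31);
    return result;
}
```

Unlike a comment, the assertion fails loudly when the code and the stated constraint disagree, but nothing forces the developer to write it in the first place.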

    Still, Doxygen-like comments in code are usually better synchronised
    with the code than external documentation!

  • From Don Y@21:1/5 to Grant Edwards on Mon Jun 27 14:34:32 2022
    On 6/27/2022 8:18 AM, Grant Edwards wrote:
    > On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
    >> On 26/06/2022 21:35, Don Y wrote:
    >>> I add a boilerplate to each function definition that
    >>> declares constraints on inputs, expectations of outputs,
    >>> performance issues, etc.

    >> What programming language are you using? If your answer is "C",
    >> it's wrong.

    >> If you are just putting these things in comments, then they will get out
    >> of sync with the code.

    > I'd have to agree. I've worked with many projects and third-party
    > libraries over the decades which had a big template of comments for
    > every function which described the input/output parameters, return
    > value, global variables used, and so on.

    You perhaps missed the balance of my post:

    "I use this to add invariants to the code to detect/enforce
    these conditions."
    ...
    "I'm looking at ways to create an IDL that will allow for
    more specific criteria to be included in the declaration
    that could also drive the IDL compiler to add suitable
    invariants as applicable."

    I.e., a "specification language" (I am currently using an enhanced
    form of OCL) FROM WHICH the IDL compiler can create the code -- in
    whatever language binding is selected AT COMPILE TIME.

    So, if I say:
        month > 0
        AND
        month < 13
    as constraints *in* the function's "prototype", then
    the IDL compiler generates an invariant that throws a
    "range error" OR panics (depending on IDL compiler switch)
    AT RUN TIME if the function is invoked with the "month"
    parameter not compliant with those constraints.
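
A minimal sketch of what a generated C stub for those two constraints might look like, assuming a C language binding. Every name here (`IDL_PANIC_ON_VIOLATION`, `IDL_ERR_RANGE`, `CONSTRAINT`, `set_month_stub`, `set_month_impl`) is hypothetical; this is not the output of any actual IDL compiler, only one way the "range error OR panic" switch could be realised.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; they illustrate the idea only. */
enum { IDL_OK = 0, IDL_ERR_RANGE = -1 };

#ifdef IDL_PANIC_ON_VIOLATION
/* "Panic" policy: report the violated constraint and stop. */
#define CONSTRAINT(cond)                                              \
    do {                                                              \
        if (!(cond)) {                                                \
            fprintf(stderr, "constraint violated: %s\n", #cond);      \
            abort();                                                  \
        }                                                             \
    } while (0)
#else
/* "Range error" policy: refuse the call and return an error code. */
#define CONSTRAINT(cond)                                              \
    do {                                                              \
        if (!(cond))                                                  \
            return IDL_ERR_RANGE;                                     \
    } while (0)
#endif

/* The function as the developer implemented it (placeholder body). */
static int set_month_impl(int month, int *stored)
{
    *stored = month;
    return IDL_OK;
}

/* Stub for the declared constraints "month > 0 AND month < 13":
 * an out-of-range value never reaches the implementation. */
int set_month_stub(int month, int *stored)
{
    CONSTRAINT(month > 0);
    CONSTRAINT(month < 13);
    return set_month_impl(month, stored);
}
```

Built without `IDL_PANIC_ON_VIOLATION`, a call with month 13 returns `IDL_ERR_RANGE` and leaves `*stored` untouched; built with it, the same call aborts.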

    The OCL *documents* the calling constraints of the
    function (and its return values) in a language neutral
    manner. I.e., you could create an ASM binding for the
    IDL compiler's output and the programmer would be
    none the wiser.

    The advantage of driving the code generator this way is
    the "documentation" creates the code -- if you don't
    *document* (declare) a constraint, then it isn't enforced.

    It ensures the code and documentation agree and that
    every bit of documentation has a corresponding bit of
    code (but not necessarily the other way around)

    > Often these templates generated documents by using something like
    > Doxygen.

    > And on _every_single_one_ of those projects and libraries, the
    > comments were wrong often enough that nobody who knew which way was up
    > paid any attention to them. If you wanted to know what the parameters
    > were for, what the function returned, and so on, you read the C code.

    You *always* read the code. The OCL declarations *are* effectively
    code; the stub generated *will* reference "month" and not "moth"
    or "monday" (or whatever). But, they are formally expressed in a
    syntax defined by the "specification language" (~OCL in my case).

    Invoking the exemplar with a month of "13" could possibly work
    within the body of the function, as implemented -- perhaps treating
    this as year++ with month=1 -- but the invariant won't let the
    value *into* the function. Because the intent was *not* to invoke
    the function with a bogus month value.

    19A0 is not 2000!

    The whole point is to encourage the developer to codify (in OCL)
    the constraints on the code so that the IDL compiler can create
    the actual instruction sequence (in the language bound to that set
    of command line switches) to enforce those constraints.

    *But*, you are still reliant on discipline; if the developer
    doesn't declare those constraints, then the compiler can't create
    any code to do this and simply is resigned to creating the code
    to marshal arguments and pack the message for transport.

    One can casually inspect the IDL files to see if there is an
    abundance -- or a dearth -- of constraints without having to
    parse countless source files. The IDL files *generate* the
    "header" files so you can't skip that step.

    Additionally, it can generate the server-side stubs (in whichever
    language binding is appropriate *there*) to unpack and parse
    the message, convert the arguments to whatever format is "native"
    for the server (knowing that their values are "legitimized" by
    the client-side stub) and hand them off to the server-side
    function.

    [similarly handling the return message]
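
The marshal/unpack step can be sketched as a matched pair of C functions. The wire format assumed here (a 16-bit opcode followed by a 16-bit month, both big-endian) and all names (`OP_SET_MONTH`, `pack_set_month`, `unpack_set_month`) are invented for illustration and are not taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: 16-bit opcode, then 16-bit month, big-endian. */
enum { MSG_SIZE = 4, OP_SET_MONTH = 7 };

/* Client side: pack the (already checked) argument into the buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t pack_set_month(uint8_t *buf, size_t len, uint16_t month)
{
    if (len < MSG_SIZE)
        return 0;
    buf[0] = (uint8_t)(OP_SET_MONTH >> 8);
    buf[1] = (uint8_t)(OP_SET_MONTH & 0xFF);
    buf[2] = (uint8_t)(month >> 8);
    buf[3] = (uint8_t)(month & 0xFF);
    return MSG_SIZE;
}

/* Server side: unpack into the server's native int.
 * Returns 0 on success, -1 if the message is malformed. */
int unpack_set_month(const uint8_t *buf, size_t len, int *month)
{
    if (len != MSG_SIZE)
        return -1;
    unsigned op = ((unsigned)buf[0] << 8) | buf[1];
    if (op != OP_SET_MONTH)
        return -1;
    *month = (int)(((unsigned)buf[2] << 8) | buf[3]);
    return 0;
}
```

Writing bytes explicitly rather than copying a struct keeps the format independent of each side's endianness and padding, which matters when the client and server use different language bindings.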

    > A lot of the time, even the numbers and names of the parameters
    > described in the template didn't match the code.

    > The auto-generated PDF documents and HTML web site looked nice, though.

    There's no point in generating "prose" from such a specification.
    What are you going to do, pretty-print the generated stubs? Or,
    the OCL-expressed constraints?

  • From Stephen Pelc@21:1/5 to All on Tue Jun 28 08:30:35 2022
    On 27 Jun 2022 at 17:18:07 CEST, "Grant Edwards" <invalid@invalid.invalid> wrote:

    >> If you are just putting these things in comments, then they will get out
    >> of sync with the code.

    > I'd have to agree. I've worked with many projects and third-party
    > libraries over the decades which had a big template of comments for
    > every function which described the input/output parameters, return
    > value, global variables used, and so on.

    > Often these templates generated documents by using something like
    > Doxygen.

    For the last 20 years or so, virtually all our manuals have been created
    by our own "literate programming" system called DocGen. DocGen is
    optimised for Forth, but it would not be a big job to write a version for C.

    DocGen diverges from Doxygen and friends in several ways. In
    particular it does not need template blocks. If your C code is so bad
    that another programmer cannot read the declaration, you need far
    more help than DocGen or Doxygen can give you. The main entry
    for a function follows the declaration:

    float someFunc( int how, double x, double y )
    // *G The purpose of *\c{someFunc} is ...
    // ** ...
    {
    ...
    }

    The lines starting // *x are formal comments to be processed by
    DocGen. The *X parts are formatting commands, and the *\<name>{}
    parts are text macros.
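
To make the comment convention concrete, here is a small C helper that recognises a formal comment line of the shape described above and separates the command character from its text. It is a guess at the line format based only on this description; `formal_comment` is a hypothetical name and this is not how DocGen itself is implemented.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of DocGen: decides whether a source line is a
 * formal comment of the form "// *X text". On success it stores the command
 * character X in *cmd and returns a pointer to the text that follows;
 * otherwise it returns NULL. */
const char *formal_comment(const char *line, char *cmd)
{
    static const char prefix[] = "// *";

    /* Skip leading indentation. */
    while (*line == ' ' || *line == '\t')
        line++;

    if (strncmp(line, prefix, sizeof prefix - 1) != 0)
        return NULL;
    line += sizeof prefix - 1;

    if (*line == '\0' || *line == '\n')
        return NULL;            /* "// *" with no command character */
    *cmd = *line++;

    /* Skip the blanks separating the command from its text. */
    while (*line == ' ')
        line++;
    return line;
}
```

Ordinary `//` comments are ignored, so descriptive comments and formal documentation can sit side by side in the same source file.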

    The ideas behind DocGen are that the code and the documentation
    are never separated, and that the DocGen portion is not much larger
    than the descriptive comments you should have in your code anyway.
    Keeping the code in sync with the documentation is a matter of
    company culture and management.

    Whenever we receive third party code to include in our products,
    we *always* DocGen it before release and we *always* find some
    bugs. Overall, I estimate that writing the documentation alongside
    the code costs about 10% extra, paid for by the reduction in bug level.

    Stephen
    --
    Stephen Pelc, stephen@vfxforth.com
    MicroProcessor Engineering, Ltd. - More Real, Less Time
    133 Hill Lane, Southampton SO15 5AF, England
    tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
    http://www.mpeforth.com - free VFX Forth downloads

  • From David Brown@21:1/5 to Don Y on Tue Jun 28 13:48:03 2022
    On 27/06/2022 23:34, Don Y wrote:
    > On 6/27/2022 8:18 AM, Grant Edwards wrote:

    >> The auto-generated PDF documents and HTML web site looked nice, though.

    > There's no point in generating "prose" from such a specification.
    > What are you going to do, pretty-print the generated stubs?  Or,
    > the OCL-expressed constraints?

    That is /exactly/ what you do with tools like Doxygen - it extracts
    /interface/ information (function prototypes, type declarations, etc.),
    strips it of implementation-specific details, merges the comments (which
    should hopefully be in sync with the code), and generates clear,
    readable, searchable, cross-referenced documentation.

    You use tools like that precisely so that people using your library or
    code do /not/ read the C code. You don't even have to read the header
    files.

    And if you are formalising your prototypes with some kind of interface
    description language to include preconditions, postconditions and
    invariants, then you want them included in the generated documentation.
    Ideally, that's what people will read, rather than the IDL source code
    or the generated C headers.


    The key point of separation of interfaces and implementations is that
    people using the code should /only/ use the documented interfaces, and
    not rely on anything involved in the implementation. So make the
    information about those interfaces clear and precise - such as good
    quality generated documentation - and make it accurate - such as by
    using an IDL.

  • From Don Y@21:1/5 to Stephen Pelc on Tue Jun 28 05:49:41 2022
    On 6/28/2022 1:30 AM, Stephen Pelc wrote:
    > On 27 Jun 2022 at 17:18:07 CEST, "Grant Edwards" <invalid@invalid.invalid> wrote:

    >>> If you are just putting these things in comments, then they will get out
    >>> of sync with the code.

    >> I'd have to agree. I've worked with many projects and third-party
    >> libraries over the decades which had a big template of comments for
    >> every function which described the input/output parameters, return
    >> value, global variables used, and so on.

    >> Often these templates generated documents by using something like
    >> Doxygen.

    > For the last 20 years or so, virtually all our manuals have been created
    > by our own "literate programming" system called DocGen. DocGen is
    > optimised for Forth, but it would not be a big job to write a version for C.

    > DocGen diverges from Doxygen and friends in several ways. In
    > particular it does not need template blocks. If your C code is so bad
    > that another programmer cannot read the declaration, you need far
    > more help than DocGen or Doxygen can give you. The main entry
    > for a function follows the declaration:

    > float someFunc( int how, double x, double y )
    > // *G The purpose of *\c{someFunc} is ...
    > // ** ...
    > {
    > ...
    > }

    > The lines starting // *x are formal comments to be processed by
    > DocGen. The *X parts are formatting commands, and the *\<name>{}
    > parts are text macros.

    > The ideas behind DocGen are that the code and the documentation
    > are never separated, and that the DocGen portion is not much larger
    > than the descriptive comments you should have in your code anyway.
    > Keeping the code in sync with the documentation is a matter of
    > company culture and management.

    > Whenever we receive third party code to include in our products,
    > we *always* DocGen it before release and we *always* find some
    > bugs. Overall, I estimate that writing the documentation alongside
    > the code costs about 10% extra, paid for by the reduction in bug level.

    I do this by using a specific "paragraph tag" in FrameMaker documents
    (e.g., "Code") and then have a simple utility that extracts all thusly
    tagged paragraphs to create the "source file" -- which is then
    compiled <however>.

    [FM files are relatively easy to parse and the format has been
    consistent for many releases; I wouldn't think of this sort of
    approach with MSWord acting as "container"!]

    It adds an extra step to the process (because the source doesn't exist
    until extracted from the document).

    But, it is ill-suited to producing "manuals" as the presentation
    must be linear with the code; you can't tangle/weave to arrange
    the code in a different order than the documentation.

    OTOH, it is excellent for mixing multimedia with "code"; I can put
    an illustration between "if" and "then". Or, a sound snippet to
    indicate what a particular (audio) waveform -- expressed as an
    array of floats -- *sounds* like adjacent to those constants.
    This is particularly helpful with domain-specific constructs,
    mechanisms and phenomena with which a generic programmer might
    not have prior experience.

    I document the "rationale" and "strategy" behind the code, elsewhere.
    That can take the "30,000 ft view" of the code and usually needs
    infrequent maintenance. E.g., why was Q12.4 format chosen? Show
    me the error analysis behind that choice relative to other formats.
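
For readers unfamiliar with the notation: Q12.4 is a fixed-point format with 12 integer bits and 4 fraction bits, so its resolution is 1/16. The sketch below assumes an unsigned 16-bit representation (the thread does not say whether its values are signed), and the function names are invented for illustration. It shows the kind of quantity an error analysis would start from.

```c
#include <assert.h>
#include <stdint.h>

/* Assumes an unsigned 16-bit Q12.4 layout (12 integer bits, 4 fraction bits). */
#define Q12_4_FRAC_BITS 4
#define Q12_4_SCALE (1 << Q12_4_FRAC_BITS)   /* 16 steps per unit */

/* Convert to Q12.4 with round-to-nearest.
 * The caller must keep x within 0 <= x <= 4095.9375. */
uint16_t q12_4_from_double(double x)
{
    return (uint16_t)(x * Q12_4_SCALE + 0.5);
}

/* Convert a Q12.4 value back to floating point. */
double q12_4_to_double(uint16_t q)
{
    return (double)q / Q12_4_SCALE;
}

/* Absolute quantization error of representing x in Q12.4. */
double q12_4_error(double x)
{
    double d = q12_4_to_double(q12_4_from_double(x)) - x;
    return d < 0 ? -d : d;
}
```

With round-to-nearest the error is bounded by half a step, 1/32 = 0.03125; comparing that bound against the application's accuracy requirement is the sort of justification the rationale document would record.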

    Keeping modules short and supporting other non-text annotations
    makes it relatively easy for folks to understand the specifics of
    an implementation.

    But, all of these techniques (yours included) rely on discipline.
    There's nothing that mechanically verifies the code and comments
    agree. Even semi-automatic mechanisms rely on the developer
    having *created* them (e.g., #including an audio file that
    was generated by extracting those floats and converting them
    to audio). Too often, the "solution" is simply to remove
    comments rather than ensuring they are maintained.

    Sadly, my experience has been that folks aren't keen on keeping
    docs and code in sync and the more documentation, the less it
    tends to track the code. Especially for projects that "evolved"
    instead of being "designed". (each refactor requiring a substantial
    reframing of the commentary)

  • From Stephen Pelc@21:1/5 to Don Y on Tue Jun 28 14:35:50 2022
    On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:
    >> The ideas behind DocGen are that the code and the documentation
    >> are never separated, and that the DocGen portion is not much larger
    >> than the descriptive comments you should have in your code anyway.
    >> Keeping the code in sync with the documentation is a matter of
    >> company culture and management.

    > Sadly, my experience has been that folks aren't keen on keeping
    > docs and code in sync and the more documentation, the less it
    > tends to track the code. Especially for projects that "evolved"
    > instead of being "designed". (each refactor requiring a substantial
    > reframing of the commentary)

    As others have said it needs discipline. Discipline comes from
    management. As the boss, I have made it quite clear that use
    of DocGen is a requirement to work at the company. In turn
    it is my job to ensure that people know how to use the tool.

    Stephen


  • From Don Y@21:1/5 to Stephen Pelc on Tue Jun 28 11:33:42 2022
    On 6/28/2022 7:35 AM, Stephen Pelc wrote:
    > On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:
    >>> The ideas behind DocGen are that the code and the documentation
    >>> are never separated, and that the DocGen portion is not much larger
    >>> than the descriptive comments you should have in your code anyway.
    >>> Keeping the code in sync with the documentation is a matter of
    >>> company culture and management.

    >> Sadly, my experience has been that folks aren't keen on keeping
    >> docs and code in sync and the more documentation, the less it
    >> tends to track the code. Especially for projects that "evolved"
    >> instead of being "designed". (each refactor requiring a substantial
    >> reframing of the commentary)

    > As others have said it needs discipline. Discipline comes from
    > management. As the boss, I have made it quite clear that use
    > of DocGen is a requirement to work at the company. In turn
    > it is my job to ensure that people know how to use the tool.

    You can "legislate" the use of a tool or adherence to a standard.
    But, these are subjective issues -- not like "derate all caps by
    40%" (which can be independently, mathematically verified). You
    rely on individual "employees" for their judgement as to the
    effectiveness of their documentation. Likewise, the efficacy
    of their test/validation efforts.

    EVERY employer and client I've ever worked with has had formal
    standards regarding code "style", documentation, testing, etc.
    "The Boss" in these cases has ranged from accountants, to
    mechanical engineers, to electrical engineers ("no longer
    practicing"), to economists. I.e., they can mandate but aren't
    qualified to evaluate the quality of the work performed.

    You can have peers review each others' work. But, I've not seen
    that improve the work of folks who just don't have the drive
    to "do better". (And I can't remember anyone EVER being fired
    for incompetence!)

    The true test of this is handing the design to another party
    (i.e., SELLING the design) and seeing how well the new owner
    can come up to speed on the product. If you have staff available
    "later" that can be consulted wrt their previous work on a
    design, then folks need not completely rely on print documentation.

  • From Stephen Pelc@21:1/5 to Don Y on Wed Jun 29 12:36:57 2022
    On 28 Jun 2022 at 20:33:42 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:

    > On 6/28/2022 7:35 AM, Stephen Pelc wrote:
    >> As others have said it needs discipline. Discipline comes from
    >> management. As the boss, I have made it quite clear that use
    >> of DocGen is a requirement to work at the company. In turn
    >> it is my job to ensure that people know how to use the tool.

    > You can "legislate" the use of a tool or adherence to a standard.
    > But, these are subjective issues -- not like "derate all caps by
    > 40%" (which can be independently, mathematically verified). You
    > rely on individual "employees" for their judgement as to the
    > effectiveness of their documentation. Likewise, the efficacy
    > of their test/validation efforts.

    Followed by lots more pointless whining.

    Changing company culture is really hard, even for my own
    company. I'm an electronics engineer by training, and I have
    been writing software since 1967, and I still write production
    code.

    I may not have fired people directly for not being good enough,
    but I have certainly strongly encouraged them to get another job.

    Stephen


  • From Don Y@21:1/5 to Stephen Pelc on Wed Jun 29 06:39:49 2022
    On 6/29/2022 5:36 AM, Stephen Pelc wrote:
    > On 28 Jun 2022 at 20:33:42 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:

    >> On 6/28/2022 7:35 AM, Stephen Pelc wrote:
    >>> As others have said it needs discipline. Discipline comes from
    >>> management. As the boss, I have made it quite clear that use
    >>> of DocGen is a requirement to work at the company. In turn
    >>> it is my job to ensure that people know how to use the tool.

    >> You can "legislate" the use of a tool or adherence to a standard.
    >> But, these are subjective issues -- not like "derate all caps by
    >> 40%" (which can be independently, mathematically verified). You
    >> rely on individual "employees" for their judgement as to the
    >> effectiveness of their documentation. Likewise, the efficacy
    >> of their test/validation efforts.

    > Followed by lots more pointless whining.

    First-hand examples of how "discipline" doesn't work, in practice.

    If you've been "lucky", then "good for you". You're likely the
    Exception and not the Rule. You've led a blessed existence. So,
    likely aren't competent to comment on life with "less angelic"
    employees.

    Please let us know your progress on addressing world hunger...

    > Changing company culture is really hard, even for my own
    > company. I'm an electronics engineer by training, and I have
    > been writing software since 1967, and I still write production
    > code.

    Now, imagine your products hosted code written by other people.
    Outside of your organization. What sort of reach do you have
    into THEIR corporate culture? Do you act as PHYSICAL gatekeeper
    and prohibit "unblessed" code from being installed on your
    products? Do *you* take on the job of creating every application
    and hardware module that any of your users might conceivably want
    (because you trust your own efforts, exclusively)?

    How eager will you be for your customers to have their experiences
    with YOUR product tainted by those other "components"? Will they
    be sophisticated enough to know that the quality issues arise not
    from YOUR portion of the work but from the efforts of others?
    Will they be able to determine *which* others (so they can excise them)?

    [Imagine the resolver in your PC being unreliably written by X.
    Will the user recognize that the resolver's faults are the
    reason behind the poor performance of the browser? Or, flaws
    in the filesystem implementation the reason for application
    failures/data loss? Or...]

    I want mechanisms that make it easy for people to "do the right
    thing", despite their inclination to do otherwise. I'm not
    keen on waiting for them to "see the light". Nor do I have
    the ability to coerce them to do so.

    But, if I make The Right Path easier to follow than the "wrong"
    ones, they are more likely to follow it out of laziness/self-interest.

    > I may not have fired people directly for not being good enough,

    Why not? Especially if it's YOUR company? Imagine the hesitance
    to doing so when The Boss is just an employee of some corporate
    entity -- not *his* name above the door.

    > but I have certainly strongly encouraged them to get another job.

    So, you "reworked" any work they did for you up until the time
    of their departure? Or, did you just let it slide -- *into* your
    products?
