• mmap vs. read

    From Steve Keller@21:1/5 to All on Fri Feb 8 12:40:19 2019
    XPost: comp.unix.programmer

    AFAIU, reading files using mmap(2) has some performance benefits
    compared to read(2). If a number of processes read the same file and
    each process mmap()s the file into its address space to read it, then
    only one copy of the file is in memory. OTOH, if the processes malloc
    some memory and use read() to fill it with file data, the memory is
    not shared, because (1) it will be aligned differently in these
    processes and (2) each process writes to the memory causing a private
    copy to be created.

    So I think one should prefer mmap() to access files, but how can
    errors be handled portably, then? On file I/O errors I get an error
    return code from read() (e.g. EIO), but with mmap() I typically get a
    SIGSEGV. How should I handle this?

    Steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From blt_uYh21j@xvjhmg9ueyj23p1690akks_m@21:1/5 to Casper H.S. Dik on Fri Feb 8 17:32:33 2019
    XPost: comp.unix.programmer

    On 08 Feb 2019 16:15:44 GMT
    Casper H.S. Dik <Casper.Dik@OrSPaMcle.COM> wrote:
    Richard Kettlewell <invalid@invalid.invalid> writes:

    Steve Keller <keller@no.invalid> writes:
    AFAIU, reading files using mmap(2) has some performance benefits
    compared to read(2). If a number of processes read the same file and
    each process mmap()s the file into its address space to read it, then
    only one copy of the file is in memory. OTOH, if the processes malloc
    some memory and use read() to fill it with file data, the memory is
    not shared, because (1) it will be aligned differently in these
    processes and (2) each process writes to the memory causing a private
    copy to be created.

    So I think one should prefer mmap() to access files,

    Profile first; historically at least mmap was not reliably faster than
    read/write. Fiddling with page tables can be quite expensive.

    Yeah, though over time, memory closer to the CPU (cache, memory, page
    tables) has become much faster, and CPUs have become faster more quickly
    still. Storage, however, has lagged behind.

    Aren't the higher-level I/O routines, e.g. fread() etc., supposed to be
    written to use the best access method on a given architecture?

  • From Kaz Kylheku@21:1/5 to Steve Keller on Fri Feb 8 19:09:35 2019
    XPost: comp.unix.programmer

    On 2019-02-08, Steve Keller <keller@no.invalid> wrote:
    AFAIU, reading files using mmap(2) has some performance benefits
    compared to read(2).

    This is not always the case. Basically, the file has to be large enough
    to amortize the overhead of setting up a new mapping.

    A program that repeatedly processes files by reading them into buffers
    from malloc can perform better, because malloc can efficiently re-use
    liberated memory without having to make system calls.

    A program that repeatedly processes small files using mmap is constantly
    making calls to mmap and munmap. These are expensive, and additionally
    so because they manipulate the address space.

    Basically, the cost of the mmap operation has to be amortized somehow:
    the best situation is that very large files are processed, and
    infrequently so. It also helps if the access pattern is random rather
    than sequential.
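
    The buffer-recycling approach described above can be sketched as
    follows. This is only an illustration under stated assumptions: the
    names struct reusable_buf and read_file_into() are hypothetical, the
    4096-byte starting capacity is arbitrary, and the buffer is grown by
    doubling. The point is that once the buffer is large enough, later
    files need no further allocation and no address-space changes.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one buffer kept alive across many files. */
struct reusable_buf {
    char  *data;
    size_t cap;   /* bytes allocated */
    size_t len;   /* bytes currently holding file data */
};

/* Read all of `path` into `b`, growing the buffer only when it is too small.
 * Returns 0 on success, -1 on error with errno describing the failure. */
int read_file_into(const char *path, struct reusable_buf *b)
{
    assert(path != NULL && b != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    b->len = 0;
    for (;;) {
        if (b->len == b->cap) {               /* buffer full: double it */
            size_t ncap = b->cap ? b->cap * 2 : 4096;
            char *p = realloc(b->data, ncap);
            if (p == NULL) {
                close(fd);
                errno = ENOMEM;
                return -1;
            }
            b->data = p;
            b->cap = ncap;
        }
        ssize_t n = read(fd, b->data + b->len, b->cap - b->len);
        if (n < 0) {
            if (errno == EINTR)
                continue;                     /* interrupted: retry */
            int saved = errno;                /* e.g. EIO */
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)
            break;                            /* end of file */
        b->len += (size_t)n;
    }
    close(fd);
    return 0;
}
```

    A caller would zero-initialize one struct reusable_buf, pass it to
    read_file_into() for each file in turn, and free() its data member
    once at the end.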

    If a number of processes read the same file and
    each process mmap()s the file into its address space to read it, then
    only one copy of the file is in memory. OTOH, if the processes malloc
    some memory and use read() to fill it with file data, the memory is
    not shared, because (1) it will be aligned differently in these
    processes and (2) each process writes to the memory causing a private
    copy to be created.

    However, often we can process an arbitrarily large file with only a
    small buffer of a few kilobytes. That includes random access, which is
    achieved by seeking around in the file.

    Ten processes passing over the same gigabyte file using 4 kilobyte
    buffers are allocating only 40 kilobytes in total.
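
    As an illustration of the small-buffer approach, here is a minimal
    sketch that passes over a file of any size with one 4096-byte stack
    buffer. The function name count_byte_streaming() and the task
    (counting occurrences of one byte value) are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Count occurrences of `target` in `path` using a fixed 4 KiB buffer.
 * Returns the count, or -1 on error with errno set. */
long long count_byte_streaming(const char *path, unsigned char target)
{
    assert(path != NULL);

    unsigned char buf[4096];     /* the only file-data memory this uses */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long long count = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted: retry */
            int saved = errno;   /* I/O errors such as EIO arrive here */
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)
            break;               /* end of file */
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == target)
                count++;
    }
    close(fd);
    return count;
}
```

    For random access with the same buffer, lseek() before read(), or
    pread() with an explicit offset, would take the place of the
    sequential loop.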

    Ten processes mmapping the same gigabyte file means a gigabyte memory
    map exists. The madvise system call can help here.

    (To present a balanced view, we must observe that mmap doesn't have to
    map the entire file at once, either. Also, a mapping can be destroyed
    piece-wise, rather than all at once: munmap can be called on portions of
    a mapping that we know we are not going to touch.)
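
    A sketch of the first point, mapping only a window of the file at a
    time, might look like this. The name count_byte_windowed() and the
    window_pages parameter are hypothetical. It uses posix_madvise(),
    the POSIX-standardized counterpart of madvise, purely as a hint.
    Offsets passed to mmap must be page-aligned, which holds here
    because every window except possibly the last is a whole number of
    pages.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count occurrences of `target` in `path`, mapping at most `window_pages`
 * pages at a time. Returns the count, or -1 on error with errno set. */
long long count_byte_windowed(const char *path, unsigned char target,
                              size_t window_pages)
{
    assert(path != NULL && window_pages > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t window = window_pages * (size_t)page;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    long long count = 0;
    off_t off = 0;                       /* always a multiple of the page size */
    while (off < st.st_size) {
        off_t remaining = st.st_size - off;
        size_t len = remaining < (off_t)window ? (size_t)remaining : window;

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        /* Advisory only; the result is deliberately ignored. */
        (void)posix_madvise(p, len, POSIX_MADV_SEQUENTIAL);

        for (size_t i = 0; i < len; i++)
            if (p[i] == target)
                count++;

        munmap(p, len);                  /* drop this window before the next */
        off += (off_t)len;
    }
    close(fd);
    return count;
}
```

    Note that with small windows this pays the mmap/munmap cost once
    per window, which is exactly the overhead discussed earlier, so the
    window size is a trade-off rather than a free choice.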

    So I think one should prefer mmap() to access files, but how can
    errors be handled portably, then? On file I/O errors I get an error
    return code from read() (e.g. EIO), but with mmap() I typically get a
    SIGSEGV. How should I handle this?

    In a utility program that can just bail on errors, you don't have to
    bother too much. Fetch the size of the file up front (for instance,
    call stat(file, &stbuf) and take stbuf.st_size). Then map just that
    size. If the file happens to shrink, let the chips fall where they may.
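
    The "fetch the size, then map just that much" recipe could look
    roughly like this. The name map_whole_file() is hypothetical, and
    the sketch uses fstat() on the already-open descriptor instead of
    stat() on the path, so that the size and the mapping refer to the
    same file. An empty file is reported as success with a NULL pointer
    and length 0, since a zero-length mmap is not allowed.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map all of `path` read-only. On success returns 0 and stores the mapping
 * in *data and its size in *len; on failure returns -1 with errno set. */
int map_whole_file(const char *path, const unsigned char **data, size_t *len)
{
    assert(path != NULL && data != NULL && len != NULL);
    *data = NULL;
    *len = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat stbuf;
    if (fstat(fd, &stbuf) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (stbuf.st_size == 0) {            /* nothing to map */
        close(fd);
        return 0;
    }

    void *p = mmap(NULL, (size_t)stbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    int saved = errno;
    close(fd);                           /* the mapping outlives the descriptor */
    if (p == MAP_FAILED) {
        errno = saved;
        return -1;
    }
    *data = p;
    *len = (size_t)stbuf.st_size;
    return 0;
}
```

    The caller releases the mapping with munmap((void *)data, len) when
    done. Nothing here guards against the file shrinking afterwards.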

    In a robust application, you have to deal with the SIGBUS if you access
    the mapping beyond the end of the file.

    Signal handling for SIGBUS is about as portable as mmap itself: either
    way, you're writing a POSIX application.
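
    One common way to turn the SIGBUS into an ordinary error return is
    sigsetjmp()/siglongjmp() around the access. The following is a
    sketch under several assumptions, not a definitive implementation:
    all names (struct mapped_file, mapped_open(), mapped_read()) are
    hypothetical, it is single-threaded only because the jump buffer is
    a single static object, and it reports the fault as EIO by choice.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct mapped_file {
    const unsigned char *data;
    size_t len;
};

static sigjmp_buf bus_jmp;
static volatile sig_atomic_t bus_armed;

static void on_sigbus(int sig)
{
    if (bus_armed)
        siglongjmp(bus_jmp, 1);   /* fault came from a guarded read */
    signal(sig, SIG_DFL);         /* not ours: fall back to the default action */
    raise(sig);
}

/* Install the SIGBUS handler and map `path` read-only.
 * Returns 0 on success, -1 on error with errno set. */
int mapped_open(const char *path, struct mapped_file *mf)
{
    assert(path != NULL && mf != NULL);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) < 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (st.st_size == 0) {        /* a zero-length mapping is not allowed */
        close(fd);
        errno = EINVAL;
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    int saved = errno;
    close(fd);
    if (p == MAP_FAILED) {
        errno = saved;
        return -1;
    }
    mf->data = p;
    mf->len = (size_t)st.st_size;
    return 0;
}

/* Copy n bytes starting at offset off out of the mapping.
 * Returns 0, or -1 with errno EINVAL (bad range) or EIO (SIGBUS caught). */
int mapped_read(const struct mapped_file *mf, size_t off, void *dst, size_t n)
{
    assert(mf != NULL && dst != NULL);
    if (off > mf->len || n > mf->len - off) {
        errno = EINVAL;
        return -1;
    }
    if (sigsetjmp(bus_jmp, 1) != 0) {   /* re-entered via siglongjmp */
        bus_armed = 0;
        errno = EIO;
        return -1;
    }
    bus_armed = 1;
    /* Volatile reads keep the accesses between the two bus_armed stores. */
    const volatile unsigned char *s = mf->data + off;
    unsigned char *d = dst;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    bus_armed = 0;
    return 0;
}
```

    With this in place, a file that is truncated after mapped_open()
    makes mapped_read() fail with EIO instead of killing the process,
    which is close to the read() behaviour the original question asked
    for.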

  • From Kaz Kylheku@21:1/5 to Richard Kettlewell on Fri Feb 8 19:13:41 2019
    XPost: comp.unix.programmer

    On 2019-02-08, Richard Kettlewell <invalid@invalid.invalid> wrote:
    Steve Keller <keller@no.invalid> writes:
    AFAIU, reading files using mmap(2) has some performance benefits
    compared to read(2). If a number of processes read the same file and
    each process mmap()s the file into its address space to read it, then
    only one copy of the file is in memory. OTOH, if the processes malloc
    some memory and use read() to fill it with file data, the memory is
    not shared, because (1) it will be aligned differently in these
    processes and (2) each process writes to the memory causing a private
    copy to be created.

    So I think one should prefer mmap() to access files,

    Profile first; historically at least mmap was not reliably faster than
    read/write. Fiddling with page tables can be quite expensive.

    I saw exactly this recently on current PC hardware running Ubuntu 18.

    There is a Debian patch for bsdiff which converts it from malloced
    buffers to use mmap. (The patch has a bug in the unmapping, which I
    fixed: it uses the compressed size of the source file to unmap it,
    rather than the original size.)

    I converted the bsdiff utility into a shared library, to use as a
    subroutine in a program which calls it millions of times for small-ish
    files.

    The original read() version was found to be faster than the mmap()
    version, so we dropped the patch instead of fixing its bug.

    I hypothesized the poorer performance to be caused by the repeated
    mapping and unmapping calls which manipulate the virtual address space
    and require trips to the kernel. Whereas the malloced buffers can be
    recycled without trips to the kernel or tweaking of the address space.
