• Understanding the working of Shared memory using mmap()

    From Pavankumar S V@21:1/5 to All on Thu Mar 9 04:19:50 2023
    Hello,
    As per my understanding, the working of mmap() can be described
    briefly like this: when a process (let's call it process1) calls
    mmap() on a regular file, the file's pages are first read into the
    page cache. The region of the page cache that holds the file is then
    mapped into process1's virtual address space (this memory region is
    called a memory-mapped file).
    If another process (let's call it process2) calls mmap() on the same
    file, the same page cache pages that were mapped into process1 get
    mapped into process2's virtual address space.
    When the processes want to access the file, they simply access this
    memory-mapped region, which is much faster. Data modified by process1
    can also be seen by process2.
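    To make the picture concrete, here is a rough sketch of the kind of
    setup I have in mind (the file name /tmp/shared.dat, the 4096-byte
    length and the fork()-based arrangement are only for illustration;
    two unrelated processes calling mmap() on the same file should behave
    the same way):

        /* Hypothetical sketch, not a real program: parent ("process1") and
         * child ("process2") map the same file with MAP_SHARED, so both see
         * the same page cache page. Error handling is kept minimal. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0644);
            if (fd == -1) { perror("open"); return 1; }
            ftruncate(fd, 4096);               /* make sure one page exists */

            char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            if (fork() == 0) {                 /* "process2": only reads    */
                sleep(1);                      /* crude ordering for demo   */
                printf("process2 sees: %s\n", p);
                _exit(0);
            }

            strcpy(p, "written by process1");  /* "process1": writes        */
            wait(NULL);
            munmap(p, 4096);
            close(fd);
            return 0;
        }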

    I have a query here. Please clarify it:
    When process1 wants to write some data to the file, it writes to this
    memory-mapped file. These dirty pages, which are private to process1,
    should then be copied to the page cache. When will the kernel do this
    copying to the page cache, and how frequently? Is there a way for the
    process to control it?
    (My concern is: if there is a slight delay in performing this copying,
    it will delay process2, which is mapped to the same file, from reading
    the modified data.)
    The copying from the page cache to the underlying file happens like
    this, as per my understanding:
    Once the page cache is modified, the dirty pages are eventually
    flushed to disk automatically, based on some conditions, by the
    pdflush threads. A process can also trigger this flushing explicitly
    with the msync() system call.
    So, I want to understand how these things happen when copying from
    the memory-mapped file to the page cache.
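    (For reference, this is the kind of explicit msync() call I am
    thinking of; the file name and the 4096-byte length are just
    placeholders:)

        /* Hypothetical sketch: dirty a MAP_SHARED page, then ask the kernel
         * to write it back to the file. MS_SYNC blocks until the write-back
         * has completed; MS_ASYNC only schedules it. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0644);
            if (fd == -1) { perror("open"); return 1; }
            ftruncate(fd, 4096);

            char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            strcpy(p, "dirty data");           /* page is now dirty        */

            if (msync(p, 4096, MS_SYNC) == -1) /* flush it to the file now */
                perror("msync");

            munmap(p, 4096);
            close(fd);
            return 0;
        }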

    Thanks in Advance

  • From Richard Kettlewell@21:1/5 to Pavankumar S V on Fri Mar 10 16:30:02 2023
    Pavankumar S V <pavankumarsv96@gmail.com> writes:

    > Hello,
    > As per my understanding, the working of mmap() can be described
    > briefly like this: when a process (let's call it process1) calls
    > mmap() on a regular file, the file's pages are first read into the
    > page cache. The region of the page cache that holds the file is then
    > mapped into process1's virtual address space (this memory region is
    > called a memory-mapped file).
    > If another process (let's call it process2) calls mmap() on the same
    > file, the same page cache pages that were mapped into process1 get
    > mapped into process2's virtual address space.
    > When the processes want to access the file, they simply access this
    > memory-mapped region, which is much faster. Data modified by
    > process1 can also be seen by process2.
    >
    > I have a query here. Please clarify it:
    > When process1 wants to write some data to the file, it writes to
    > this memory-mapped file. These dirty pages, which are private to
    > process1, should then be copied to the page cache. When will the
    > kernel do this copying to the page cache, and how frequently? Is
    > there a way for the process to control it?
    > (My concern is: if there is a slight delay in performing this
    > copying, it will delay process2, which is mapped to the same file,
    > from reading the modified data.)
    > The copying from the page cache to the underlying file happens like
    > this, as per my understanding:
    > Once the page cache is modified, the dirty pages are eventually
    > flushed to disk automatically, based on some conditions, by the
    > pdflush threads. A process can also trigger this flushing explicitly
    > with the msync() system call.
    > So, I want to understand how these things happen when copying from
    > the memory-mapped file to the page cache.

    My understanding is that there is no such copy. The page in the page
    cache is added directly to the process’s virtual address space. The
    only copies are when flushing a dirty page to disk, or when
    duplicating a page during copy-on-write with a private mapping.
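
    A quick way to see this is to map the same file twice with MAP_SHARED:
    both mappings refer to the same page cache page, so a store through
    one is visible through the other immediately, with no msync() in
    between. Rough sketch (file name and length are arbitrary):

        /* Hypothetical sketch: the same file mapped twice with MAP_SHARED
         * gives two views of the same page cache page, so a store through
         * one mapping is visible through the other immediately, without
         * any msync() in between. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0644);
            if (fd == -1) { perror("open"); return 1; }
            ftruncate(fd, 4096);

            char *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
            char *b = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
            if (a == MAP_FAILED || b == MAP_FAILED) { perror("mmap"); return 1; }

            strcpy(a, "stored via mapping a");
            printf("mapping b sees: %s\n", b); /* same page, same data */

            munmap(a, 4096);
            munmap(b, 4096);
            close(fd);
            return 0;
        }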

    --
    https://www.greenend.org.uk/rjk/

  • From Rainer Weikusat@21:1/5 to Pavankumar S V on Mon Mar 13 17:35:08 2023
    Pavankumar S V <pavankumarsv96@gmail.com> writes:
    > As per my understanding, the working of mmap() can be described
    > briefly like this: when a process (let's call it process1) calls
    > mmap() on a regular file, the file's pages are first read into the
    > page cache. The region of the page cache that holds the file is then
    > mapped into process1's virtual address space (this memory region is
    > called a memory-mapped file). If another process (let's call it
    > process2) calls mmap() on the same file, the same page cache pages
    > that were mapped into process1 get mapped into process2's virtual
    > address space.
    > When the processes want to access the file, they simply access this
    > memory-mapped region, which is much faster. Data modified by
    > process1 can also be seen by process2.

    It *may* be faster. But address space manipulations, page faults
    occurring while populating some part of the virtual address space of
    a process, and cache and TLB misses are all expensive operations;
    hence, it may well not be.

    > I have a query here. Please clarify it:
    > When process1 wants to write some data to the file, it writes to
    > this memory-mapped file. These dirty pages, which are private to
    > process1, should then be copied to the page cache. When will the
    > kernel do this copying to the page cache, and how frequently?

    Not at all. If the mapping is done as MAP_PRIVATE, the process gets
    its own copy of each page as soon as it starts writing to it. For
    MAP_SHARED mappings, all processes mapping the same file, as well as
    the kernel page cache, share the same page.
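
    Rough illustration of the difference (file name and length are
    arbitrary): a store through a MAP_PRIVATE mapping lands in a
    process-private copy-on-write page and is never seen through a
    MAP_SHARED mapping of the same file, while a store through the
    MAP_SHARED mapping goes to the shared page cache page.

        /* Hypothetical sketch: a store through a MAP_PRIVATE mapping lands
         * in a copy-on-write page private to the process and never reaches
         * the page cache page or the file; a store through a MAP_SHARED
         * mapping goes to the shared page cache page. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0644);
            if (fd == -1) { perror("open"); return 1; }
            ftruncate(fd, 4096);

            char *shr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
            char *prv = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE, fd, 0);
            if (shr == MAP_FAILED || prv == MAP_FAILED) { perror("mmap"); return 1; }

            strcpy(prv, "private write");      /* COW copy, invisible elsewhere */
            printf("shared mapping sees: \"%s\"\n", shr);

            strcpy(shr, "shared write");       /* goes to the page cache page   */
            printf("shared mapping sees: \"%s\"\n", shr);

            munmap(shr, 4096);
            munmap(prv, 4096);
            close(fd);
            return 0;
        }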
