[PATCH v3 1/1] process_madvise.2: Add process_madvise man page

Michael Kerrisk (man-pages) mtk.manpages at gmail.com
Sat Feb 13 22:04:23 UTC 2021


Hello Suren,

On 2/2/21 11:12 PM, Suren Baghdasaryan wrote:
> Hi Michael,
> 
> On Tue, Feb 2, 2021 at 2:45 AM Michael Kerrisk (man-pages)
> <mtk.manpages at gmail.com> wrote:
>>
>> Hello Suren (and Minchan and Michal)
>>
>> Thank you for the revisions!
>>
>> I've applied this patch, and done a few light edits.
> 
> Thanks!
> 
>>
>> However, I have a question about undocumented pieces in *madvise(2)*,
>> as well as one other question. See below.
>>
>> On 2/2/21 6:30 AM, Suren Baghdasaryan wrote:
>>> Initial version of process_madvise(2) manual page. Initial text was
>>> extracted from [1], amended after fix [2] and more details added using
>>> man pages of madvise(2) and process_vm_read(2) as examples. It also
>>> includes the changes to required permission proposed in [3].
>>>
>>> [1] https://lore.kernel.org/patchwork/patch/1297933/
>>> [2] https://lkml.org/lkml/2020/12/8/1282
>>> [3] https://patchwork.kernel.org/project/selinux/patch/20210111170622.2613577-1-surenb@google.com/#23888311
>>>
>>> Signed-off-by: Suren Baghdasaryan <surenb at google.com>
>>> Reviewed-by: Michal Hocko <mhocko at suse.com>
>>> ---
>>> changes in v2:
>>> - Changed description of MADV_COLD per Michal Hocko's suggestion
>>> - Applied fixes suggested by Michael Kerrisk
>>> changes in v3:
>>> - Added Michal's Reviewed-by
>>> - Applied additional fixes suggested by Michael Kerrisk
>>>
>>> NAME
>>>     process_madvise - give advice about use of memory to a process
>>>
>>> SYNOPSIS
>>>     #include <sys/uio.h>
>>>
>>>     ssize_t process_madvise(int pidfd,
>>>                            const struct iovec *iovec,
>>>                            unsigned long vlen,
>>>                            int advice,
>>>                            unsigned int flags);
>>>
>>> DESCRIPTION
>>>     The process_madvise() system call is used to give advice or directions
>>>     to the kernel about the address ranges of another process or the calling
>>>     process. It provides the advice to the address ranges described by iovec
>>>     and vlen. The goal of such advice is to improve system or application
>>>     performance.
>>>
>>>     The pidfd argument is a PID file descriptor (see pidfd_open(2)) that
>>>     specifies the process to which the advice is to be applied.
>>>
>>>     The pointer iovec points to an array of iovec structures, defined in
>>>     <sys/uio.h> as:
>>>
>>>     struct iovec {
>>>         void  *iov_base;    /* Starting address */
>>>         size_t iov_len;     /* Number of bytes to transfer */
>>>     };
>>>
>>>     The iovec structure describes address ranges beginning at iov_base address
>>>     and with the size of iov_len bytes.
>>>
>>>     The vlen represents the number of elements in the iovec structure.
>>>
>>>     The advice argument is one of the values listed below.
>>>
>>>   Linux-specific advice values
>>>     The following Linux-specific advice values have no counterparts in the
>>>     POSIX-specified posix_madvise(3), and may or may not have counterparts
>>>     in the madvise(2) interface available on other implementations.
>>>
>>>     MADV_COLD (since Linux 5.4.1)
>>
>> I just noticed these version numbers now, and thought: they can't be
>> right (because the system call appeared only in v5.11). So I removed
>> them. But, of course in another sense the version numbers are (nearly)
>> right, since these advice values were added for madvise(2) in Linux 5.4.
>> However, they are not documented in the madvise(2) manual page. Is it
>> correct to assume that MADV_COLD and MADV_PAGEOUT have exactly the same
>> meaning in madvise(2) (but just for the calling process, of course)?
> 
> Correct. They should be added in the madvise(2) man page as well IMHO.

So, I decided to move the description of MADV_COLD and MADV_PAGEOUT
to madvise(2) and refer to that page from the process_madvise(2)
page. This avoids repeating the same information in two places.
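
For illustration, here is a minimal sketch of the madvise(2) side of this,
where the advice applies to the calling process only. Nothing in it comes
from the patch: the helper name advise_self_cold() is hypothetical, and the
fallback values for MADV_COLD and MADV_PAGEOUT are assumed from the generic
Linux UAPI headers, for C libraries whose <sys/mman.h> predates Linux 5.4.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallbacks (generic Linux UAPI values) for older C library headers. */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: hint that the pages in [addr, addr + len) of the
 * calling process are cold, making them more likely reclaim targets.
 * Returns 0 on success, or -1 with errno set (for example EINVAL on a
 * kernel that does not know MADV_COLD).
 */
static int advise_self_cold(void *addr, size_t len)
{
    assert(addr != NULL);
    return madvise(addr, len, MADV_COLD);
}
```

Because MADV_COLD is nondestructive, the contents of the range are
unchanged whether or not the kernel accepts the hint.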

>>>         Deactive a given range of pages which will make them a more probable
>>
>> I changed: s/Deactive/Deactivate/
> 
> thanks!
> 
>>
>>>         reclaim target should there be a memory pressure. This is a
>>>         nondestructive operation. The advice might be ignored for some pages
>>>         in the range when it is not applicable.
>>>
>>>     MADV_PAGEOUT (since Linux 5.4.1)
>>>         Reclaim a given range of pages. This is done to free up memory occupied
>>>         by these pages. If a page is anonymous it will be swapped out. If a
>>>         page is file-backed and dirty it will be written back to the backing
>>>         storage. The advice might be ignored for some pages in the range when
>>>         it is not applicable.
>>
>> [...]
>>
>>>     The hint might be applied to a part of iovec if one of its elements points
>>>     to an invalid memory region in the remote process. No further elements will
>>>     be processed beyond that point.
>>
>> Is the above scenario the one that leads to the partial advice case described in
>> RETURN VALUE? If yes, perhaps I should add some words to make that clearer.
> 
> Correct. This describes the case when partial advice happens.

Thanks. I added a few words to clarify this.
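
A caller could detect the partial-advice case roughly as sketched below.
This is only an illustration under stated assumptions: the names
raw_process_madvise(), iov_total_len() and first_unadvised() are
hypothetical, the wrapper goes through syscall(2) and reports ENOSYS if the
installed headers do not define SYS_process_madvise, and the check relies
on a successful call returning the number of bytes advised.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical raw wrapper; fails with ENOSYS if the syscall number is unknown. */
static ssize_t raw_process_madvise(int pidfd, const struct iovec *iov,
                                   unsigned long vlen, int advice,
                                   unsigned int flags)
{
#ifdef SYS_process_madvise
    return syscall(SYS_process_madvise, pidfd, iov, vlen, advice, flags);
#else
    (void) pidfd; (void) iov; (void) vlen; (void) advice; (void) flags;
    errno = ENOSYS;
    return -1;
#endif
}

/* Total number of bytes described by the n elements of iov. */
static size_t iov_total_len(const struct iovec *iov, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += iov[i].iov_len;
    return total;
}

/*
 * Given the byte count returned by the call, return the index of the first
 * element that was not fully advised, or n if every element was covered.
 * A negative count (the call failed outright) maps to index 0.
 */
static size_t first_unadvised(const struct iovec *iov, size_t n,
                              ssize_t advised)
{
    size_t done = 0;

    assert(iov != NULL || n == 0);
    if (advised < 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (done + iov[i].iov_len > (size_t) advised)
            return i;
        done += iov[i].iov_len;
    }
    return n;
}
```

If the value returned by raw_process_madvise() is smaller than
iov_total_len(), first_unadvised() identifies the element at which
processing stopped; no later element will have been processed.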


>> You can see the light edits that I made in
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e3ce016472a1b3ec5dffdeb23c98b9fef618a97b
>> and following that I restructured DESCRIPTION a little in
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3aac0708a9acee5283e091461de6a8410bc921a6
> 
> The edits LGTM.

Thanks for checking them.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


