Changelog in Linux kernel 5.10.220

bpf: In bpf_task_fd_query use fget_task [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:22 2020 -0600

    bpf: In bpf_task_fd_query use fget_task
    
    [ Upstream commit b48845af0152d790a54b8ab78cc2b7c07485fc98 ]
    
    Use the helper fget_task to simplify bpf_task_fd_query.
    
    As well as simplifying the code this removes one unnecessary increment of
    struct files_struct.  This unnecessary increment of files_struct.count can
    result in exec unnecessarily unsharing files_struct and breaking posix
    locks, and it can result in fget_light having to fall back to fget,
    reducing performance.
    
    This simplification comes from the observation, made when discussing[1]
    exec and posix file locks, that none of the callers of get_files_struct
    actually need to call get_files_struct.
    
    [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-5-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-5-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dnotify: use fsnotify group lock helpers [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:21 2022 +0300

    dnotify: use fsnotify group lock helpers
    
    [ Upstream commit aabb45fdcb31f00f1e7cae2bce83e83474a87c03 ]
    
    Before commit 9542e6a643fc6 ("nfsd: Containerise filecache laundrette")
    nfsd would close open files in direct reclaim context.  There is no
    guarantee that other memory shrinkers don't do the same and no
    guarantee that future shrinkers won't do that.
    
    For example, if overlayfs implemented an inode cache, or if fscache
    kept open files to cached objects, inode shrinkers could end up closing
    open files to the underlying fs.
    
    Direct reclaim from dnotify mark allocation context may try to close
    open files that have dnotify marks of the same group and hit a deadlock
    on mark_mutex.
    
    Set the FSNOTIFY_GROUP_NOFS flag to prevent going into direct reclaim
    from allocations under dnotify group lock and use the safe group lock
    helpers.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-11-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Documentation: Add missing documentation for EXPORT_OP flags [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Aug 25 15:04:23 2023 -0400

    Documentation: Add missing documentation for EXPORT_OP flags
    
    [ Upstream commit b38a6023da6a12b561f0421c6a5a1f7624a1529c ]
    
    The commits that introduced these flags neglected to update the
    Documentation/filesystems/nfs/exporting.rst file.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exec: Don't open code get_close_on_exec [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Dec 9 15:42:57 2020 -0600

    exec: Don't open code get_close_on_exec
    
    [ Upstream commit 878f12dbb8f514799d126544d59be4d2675caac3 ]
    
    Al Viro pointed out that using the phrase "close_on_exec(fd,
    rcu_dereference_raw(current->files->fdt))" instead of wrapping it in
    rcu_read_lock(), rcu_read_unlock() is a very questionable
    optimization[1].
    
    Once wrapped with rcu_read_lock()/rcu_read_unlock() that phrase
    becomes equivalent to the helper function get_close_on_exec, so
    simplify the code and make it more robust by simply using
    get_close_on_exec.
    
    [1] https://lkml.kernel.org/r/20201207222214.GA4115853@ZenIV.linux.org.uk
    Suggested-by: Al Viro <viro@ftp.linux.org.uk>
    Link: https://lkml.kernel.org/r/87k0tqr6zi.fsf_-_@x220.int.ebiederm.org
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exec: Move unshare_files to fix posix file locking during exec [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:18 2020 -0600

    exec: Move unshare_files to fix posix file locking during exec
    
    [ Upstream commit b6043501289ebf169ae19b810a882d517377302f ]
    
    Many moons ago the binfmts were doing some very questionable things
    with file descriptors and an unsharing of the file descriptor table
    was added to make things better[1][2].  The helper steal_locks was
    added to avoid breaking the userspace programs[3][4][6].
    
    Unfortunately it turned out that steal_locks did not work for network
    file systems[5], so it was removed to see if anyone would
    complain[7][8].  It was thought at the time that NPTL would not be
    affected as the unshare_files happened after the other threads were
    killed[8].  Unfortunately because there was an unshare_files in
    binfmt_elf.c before the threads were killed this analysis was
    incorrect.
    
    This unshare_files in binfmt_elf.c resulted in the unshare_files
    happening whenever threads were present.  Which led to unshare_files
    being moved to the start of do_execve[9].
    
    Later the problems were rediscovered and the suggested approach was to
    readd steal_locks under a different name[10].  I happened to be
    reviewing patches and I noticed that this approach was a step
    backwards[11].
    
    I proposed simply moving unshare_files[12] and it was pointed
    out that moving unshare_files without auditing the code was
    also unsafe[13].
    
    There were then several attempts to solve this[14][15][16] and I even
    posted this set of changes[17].  Unfortunately because auditing all of
    execve is time consuming this change did not make it in at the time.
    
    Well now that I am cleaning up exec I have made the time to read
    through all of the binfmts and the only playing with file descriptors
    is either the security modules closing them in
    security_bprm_committing_creds or is in the generic code in fs/exec.c.
    None of it happens before begin_new_exec is called.
    
    So move unshare_files into begin_new_exec, after the point of no
    return.  If memory is very very very low and the application calling
    exec is sharing file descriptor tables between processes we might fail
    past the point of no return.  Which is unfortunate but no different
    than any of the other places where we allocate memory after the point
    of no return.
    
    This movement allows another process that shares the file table, or
    another thread of the same process, to close files or change their
    close-on-exec behavior while racing with execve and so cause some
    unexpected things to happen.  There is only one time of check to time
    of use race and it is just there so that execve fails instead of
    an interpreter failing when it tries to open the file it is supposed
    to be interpreting.   Failing later if userspace is being silly is
    not a problem.
    
    With this change the following description from the removal
    of steal_locks[8] finally becomes true.
    
        Apps using NPTL are not affected, since all other threads are killed before
        execve.
    
        Apps using LinuxThreads are only affected if they
    
          - have multiple threads during exec (LinuxThreads doesn't kill other
            threads, the app may do it with pthread_kill_other_threads_np())
          - rely on POSIX locks being inherited across exec
    
        Both conditions are documented, but not their interaction.
    
        Apps using clone() natively are affected if they
    
          - use clone(CLONE_FILES)
          - rely on POSIX locks being inherited across exec
    
    I have investigated some paths to make it possible to solve this
    without moving unshare_files but they all look more complicated[18].
    
    Reported-by: Daniel P. Berrangé <berrange@redhat.com>
    Reported-by: Jeff Layton <jlayton@redhat.com>
    History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
    [1] 02cda956de0b ("[PATCH] unshare_files")
    [2] 04e9bcb4d106 ("[PATCH] use new unshare_files helper")
    [3] 088f5d7244de ("[PATCH] add steal_locks helper")
    [4] 02c541ec8ffa ("[PATCH] use new steal_locks helper")
    [5] https://lkml.kernel.org/r/E1FLIlF-0007zR-00@dorka.pomaz.szeredi.hu
    [6] https://lkml.kernel.org/r/0060321191605.GB15997@sorel.sous-sol.org
    [7] https://lkml.kernel.org/r/E1FLwjC-0000kJ-00@dorka.pomaz.szeredi.hu
    [8] c89681ed7d0e ("[PATCH] remove steal_locks()")
    [9] fd8328be874f ("[PATCH] sanitize handling of shared descriptor tables in failing execve()")
    [10] https://lkml.kernel.org/r/20180317142520.30520-1-jlayton@kernel.org
    [11] https://lkml.kernel.org/r/87r2nwqk73.fsf@xmission.com
    [12] https://lkml.kernel.org/r/87bmfgvg8w.fsf@xmission.com
    [13] https://lkml.kernel.org/r/20180322111424.GE30522@ZenIV.linux.org.uk
    [14] https://lkml.kernel.org/r/20180827174722.3723-1-jlayton@kernel.org
    [15] https://lkml.kernel.org/r/20180830172423.21964-1-jlayton@kernel.org
    [16] https://lkml.kernel.org/r/20180914105310.6454-1-jlayton@kernel.org
    [17] https://lkml.kernel.org/r/87a7ohs5ow.fsf@xmission.com
    [18] https://lkml.kernel.org/r/87pn8c1uj6.fsf_-_@x220.int.ebiederm.org
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-1-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-1-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
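
    As a point of reference for the behavior discussed above, the userspace
    sketch below (not part of the commit) shows the simple single-process
    case: a child takes a POSIX lock, calls exec, and the parent probes with
    F_GETLK to see whether the lock is still held.  The helper name
    lock_survives_exec, the scratch path and the use of sleep as the exec
    target are assumptions made for illustration.  It does not reproduce the
    CLONE_FILES or multi-threaded cases that the commit addresses.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a POSIX lock taken by a child is still held by that child
 * after it calls exec, 0 if the lock is gone, -1 if the demo could not run.
 */
int lock_survives_exec(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";    /* hypothetical scratch file */
    int sync_pipe[2];
    int result = -1;
    int fd = mkstemp(path);                 /* not close-on-exec */

    if (fd < 0)
        return -1;
    if (pipe(sync_pipe) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    /* The write end closes on a successful exec, so the parent reads EOF. */
    fcntl(sync_pipe[1], F_SETFD, FD_CLOEXEC);

    pid_t pid = fork();
    if (pid == 0) {
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        char failed = 'E';

        close(sync_pipe[0]);
        if (fcntl(fd, F_SETLK, &fl) == 0)
            execlp("sleep", "sleep", "30", (char *)NULL);
        /* Reached only if locking or exec failed: tell the parent. */
        if (write(sync_pipe[1], &failed, 1) < 0)
            _exit(126);
        _exit(127);
    }
    close(sync_pipe[1]);

    if (pid > 0) {
        char c;

        if (read(sync_pipe[0], &c, 1) == 0) {
            /* F_GETLK reports a conflicting lock and its owner, if any. */
            struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

            if (fcntl(fd, F_GETLK, &probe) == 0)
                result = (probe.l_type != F_UNLCK && probe.l_pid == pid);
        }
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
    }
    close(sync_pipe[0]);
    close(fd);
    unlink(path);
    return result;
}
```

    A caller would simply check the return value of lock_survives_exec().
    The close-on-exec pipe is only a synchronization aid so that the parent
    probes the lock after the exec has actually happened.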

exec: Remove reset_files_struct [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:20 2020 -0600

    exec: Remove reset_files_struct
    
    [ Upstream commit 950db38ff2c01b7aabbd7ab4a50b7992750fa63d ]
    
    Now that exec no longer needs to restore the previous value of current->files
    on error there are no more callers of reset_files_struct so remove it.
    
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-3-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-3-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exec: Simplify unshare_files [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:19 2020 -0600

    exec: Simplify unshare_files
    
    [ Upstream commit 1f702603e7125a390b5cdf5ce00539781cfcc86a ]
    
    Now that exec no longer needs to return the unshared files to their
    previous value there is no reason to return displaced.
    
    Instead when unshare_fd creates a copy of the file table, call
    put_files_struct before returning from unshare_files.
    
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-2-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-2-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exit: Implement kthread_exit [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Mon Nov 22 10:27:36 2021 -0600

    exit: Implement kthread_exit
    
    [ Upstream commit bbda86e988d4c124e4cfa816291cbd583ae8bfb1 ]
    
    The way the per task_struct exit_code is used by kernel threads is not
    quite compatible with how it is used by userspace applications.  The low
    byte of the userspace exit_code value encodes the exit signal.  While
    kthreads just use the value as an int holding ordinary kernel function
    exit status like -EPERM.
    
    Add kthread_exit to clearly separate the two kinds of uses.
    
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Stable-dep-of: ca3574bd653a ("exit: Rename module_put_and_exit to module_put_and_kthread_exit")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
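
    The userspace half of that distinction can be observed from a parent
    process.  The sketch below is a minimal illustration, not kernel code:
    it forks a child that either exits normally or dies from a signal and
    returns the raw wait status, whose low bits carry the terminating
    signal.  The helper name child_wait_status is hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child.  With sig == 0 the child exits normally with `code`;
 * otherwise it raises `sig` on itself.  Returns the raw wait status,
 * or -1 if fork or waitpid failed.
 */
int child_wait_status(int code, int sig)
{
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig)
            raise(sig);
        _exit(code);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

    The status is meant to be decoded with WIFEXITED/WEXITSTATUS and
    WIFSIGNALED/WTERMSIG rather than read as a plain integer, which is the
    mismatch with kthread return values such as -EPERM that the commit
    describes.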

exit: Rename module_put_and_exit to module_put_and_kthread_exit [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Dec 3 11:00:19 2021 -0600

    exit: Rename module_put_and_exit to module_put_and_kthread_exit
    
    [ Upstream commit ca3574bd653aba234a4b31955f2778947403be16 ]
    
    Update module_put_and_exit to call kthread_exit instead of do_exit.
    
    Change the name to reflect this change in functionality.  All of the
    users of module_put_and_exit are causing the current kthread to exit
    so this change makes it clear what is happening.  There is no
    functional change.
    
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exportfs: Add a function to return the raw output from fh_to_dentry() [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Mon Nov 30 17:03:17 2020 -0500

    exportfs: Add a function to return the raw output from fh_to_dentry()
    
    [ Upstream commit d045465fc6cbfa4acfb5a7d817a7c1a57a078109 ]
    
    In order to allow nfsd to accept return values that are not
    acceptable to overlayfs and others, add a new function.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exportfs: use pr_debug for unreachable debug statements [+ + +]

Author: David Disseldorp <ddiss@suse.de>
Date:   Fri Oct 21 14:24:14 2022 +0200

    exportfs: use pr_debug for unreachable debug statements
    
    [ Upstream commit 427505ffeaa464f683faba945a88d3e3248f6979 ]
    
    expfs.c has a bunch of dprintk statements which are unusable due to:
     #define dprintk(fmt, args...) do{}while(0)
    Use pr_debug so that they can be enabled dynamically.
    Also make some minor changes to the debug statements to fix some
    incorrect types, and remove __func__ which can be handled by dynamic
    debug separately.
    
    Signed-off-by: David Disseldorp <ddiss@suse.de>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
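
    The difference between a compiled-out macro and one that can be
    switched on at run time can be sketched in plain userspace C.  This is
    only an analogy: demo_pr_debug, demo_debug_enabled and demo_debug_buf
    are hypothetical names, and the kernel's pr_debug is driven by dynamic
    debug rather than by a single flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old expfs.c style: expands to nothing, so the message can never appear. */
#define dprintk(fmt, args...) do {} while (0)

static int demo_debug_enabled;      /* hypothetical runtime switch */
static char demo_debug_buf[128];    /* stands in for the kernel log */

/* Runtime-gated variant, loosely analogous to pr_debug. */
#define demo_pr_debug(fmt, ...)                                         \
    do {                                                                \
        if (demo_debug_enabled)                                         \
            snprintf(demo_debug_buf, sizeof(demo_debug_buf),            \
                     fmt, ##__VA_ARGS__);                               \
    } while (0)

/* Emits one message through each macro; returns the captured length. */
size_t demo_lookup(int enable, unsigned long ino)
{
    demo_debug_enabled = enable;
    demo_debug_buf[0] = '\0';
    dprintk("demo: ino=%lu\n", ino);            /* compiled out */
    demo_pr_debug("reconnect ino=%lu", ino);    /* emitted if enabled */
    return strlen(demo_debug_buf);
}
```

    Because the dprintk arguments are discarded by the preprocessor, a
    wrong format specifier in such a statement is never diagnosed by the
    compiler, which may be how the incorrect types mentioned above went
    unnoticed.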

fanotify: Add helpers to decide whether to report FID/DFID [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:38 2021 -0300

    fanotify: Add helpers to decide whether to report FID/DFID
    
    [ Upstream commit 4bd5a5c8e6e5cd964e9738e6ef87f6c2cb453edf ]
    
    Now that there is an event that reports FID records even for a zeroed
    file handle, wrap the logic that decides whether to issue the records
    into helper functions.  This shouldn't have any impact on the code, but
    simplifies further patches.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-24-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: add pidfd support to the fanotify API [+ + +]

Author: Matthew Bobrowski <repnop@google.com>
Date:   Sun Aug 8 15:26:25 2021 +1000

    fanotify: add pidfd support to the fanotify API
    
    [ Upstream commit af579beb666aefb17e9a335c12c788c92932baf1 ]
    
    Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
    allows userspace applications to control whether a pidfd information
    record containing a pidfd is to be returned alongside the generic
    event metadata for each event.
    
    If FAN_REPORT_PIDFD is enabled for a notification group, an additional
    struct fanotify_event_info_pidfd object type will be supplied
    alongside the generic struct fanotify_event_metadata for a single
    event. This functionality is analogous to that of FAN_REPORT_FID in
    terms of how the event structure is supplied to a userspace
    application. Usage of FAN_REPORT_PIDFD with
    FAN_REPORT_FID/FAN_REPORT_DFID_NAME is permitted, and in this case a
    struct fanotify_event_info_pidfd object will likely follow any struct
    fanotify_event_info_fid object.
    
    Currently, the usage of the FAN_REPORT_TID flag is not permitted along
    with FAN_REPORT_PIDFD as the pidfd API currently only supports the
    creation of pidfds for thread-group leaders. Additionally, usage of
    the FAN_REPORT_PIDFD flag is limited to privileged processes only
    i.e. event listeners that are running with the CAP_SYS_ADMIN
    capability. Attempting to supply the FAN_REPORT_TID initialization
    flag with FAN_REPORT_PIDFD or creating a notification group without
    CAP_SYS_ADMIN will result in -EINVAL being returned to the caller.
    
    In the event of a pidfd creation error, there are two types of error
    values that can be reported back to the listener. There is
    FAN_NOPIDFD, which will be reported in cases where the process
    responsible for generating the event has terminated prior to the event
    listener being able to read the event. Then there is FAN_EPIDFD, which
    will be reported when a more generic pidfd creation error has occurred
    when fanotify calls pidfd_create().
    
    Link: https://lore.kernel.org/r/5f9e09cff7ed62bfaa51c1369e0f7ea5f16a91aa.1628398044.git.repnop@google.com
    Signed-off-by: Matthew Bobrowski <repnop@google.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
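
    A listener using this flag might look roughly like the sketch below.
    It is a hedged illustration rather than reference code: the helper
    names (pidfd_init_flags_ok, open_pidfd_group, classify_pidfd) and the
    enum are hypothetical, and the #ifndef fallbacks are assumptions for
    building against headers that predate FAN_REPORT_PIDFD.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/fanotify.h>

/* Fallbacks for older headers (assumed to match the upstream UAPI values). */
#ifndef FAN_REPORT_TID
#define FAN_REPORT_TID 0x00000100
#endif
#ifndef FAN_REPORT_PIDFD
#define FAN_REPORT_PIDFD 0x00000080
#endif
#ifndef FAN_NOPIDFD
#define FAN_NOPIDFD (-1)
#endif
#ifndef FAN_EPIDFD
#define FAN_EPIDFD (-2)
#endif

enum demo_pidfd_state {
    DEMO_PIDFD_VALID,
    DEMO_PIDFD_PROCESS_GONE,    /* FAN_NOPIDFD */
    DEMO_PIDFD_CREATE_ERROR     /* FAN_EPIDFD */
};

/* Mirrors the rule that FAN_REPORT_TID cannot accompany FAN_REPORT_PIDFD. */
int pidfd_init_flags_ok(unsigned int flags)
{
    return !((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID));
}

/* Returns a fanotify group fd, or -1 with errno set (e.g. if unprivileged). */
int open_pidfd_group(void)
{
    unsigned int flags = FAN_CLASS_NOTIF | FAN_CLOEXEC | FAN_REPORT_PIDFD;

    assert(pidfd_init_flags_ok(flags));
    return fanotify_init(flags, O_RDONLY);
}

/* Interpret the pidfd field of a pidfd information record. */
enum demo_pidfd_state classify_pidfd(int pidfd)
{
    if (pidfd == FAN_NOPIDFD)
        return DEMO_PIDFD_PROCESS_GONE;
    if (pidfd == FAN_EPIDFD)
        return DEMO_PIDFD_CREATE_ERROR;
    return DEMO_PIDFD_VALID;
}
```

    A listener would call classify_pidfd() on each record before using
    the descriptor, and close every valid pidfd once it is done with the
    event.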

fanotify: Allow file handle encoding for unhashed events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:29 2021 -0300

    fanotify: Allow file handle encoding for unhashed events
    
    [ Upstream commit 74fe4734897a2da2ae2a665a5e622cd490d36eaf ]
    
    Allow passing a NULL hash to fanotify_encode_fh and avoid calculating
    the hash if not needed.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-15-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Allow users to request FAN_FS_ERROR events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:43 2021 -0300

    fanotify: Allow users to request FAN_FS_ERROR events
    
    [ Upstream commit 9709bd548f11a092d124698118013f66e1740f9b ]
    
    Wire up the FAN_FS_ERROR event in the fanotify_mark syscall, allowing
    user space to request the monitoring of FAN_FS_ERROR events.
    
    These events are limited to filesystem marks, so check it is the
    case in the syscall handler.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-29-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: cleanups for fanotify_mark() input validations [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jun 29 17:42:09 2022 +0300

    fanotify: cleanups for fanotify_mark() input validations
    
    [ Upstream commit 8afd7215aa97f8868d033f6e1d01a276ab2d29c0 ]
    
    Create helper fanotify_may_update_existing_mark() for checking for
    conflicts between existing mark flags and fanotify_mark() flags.
    
    Use variable mark_cmd to make the checks for mark command bits
    cleaner.
    
    Link: https://lore.kernel.org/r/20220629144210.2983229-3-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: configurable limits via sysfs [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 13:29:20 2021 +0200

    fanotify: configurable limits via sysfs
    
    [ Upstream commit 5b8fea65d197f408bb00b251c70d842826d6b70b ]
    
    fanotify has some hardcoded limits. The only APIs to escape those limits
    are FAN_UNLIMITED_QUEUE and FAN_UNLIMITED_MARKS.
    
    Allow finer grained tuning of the system limits via sysfs tunables under
    /proc/sys/fs/fanotify, similar to tunables under /proc/sys/fs/inotify,
    with some minor differences.
    
    - max_queued_events - global system tunable for group queue size limit.
      Like the inotify tunable with the same name, it defaults to 16384 and
      applies on initialization of a new group.
    
    - max_user_marks - user ns tunable for marks limit per user.
      Like the inotify tunable named max_user_watches, on a machine with
      sufficient RAM it defaults to 1048576 in init userns and can be
      further limited per containing user ns.
    
    - max_user_groups - user ns tunable for number of groups per user.
      Like the inotify tunable named max_user_instances, it defaults to 128
      in init userns and can be further limited per containing user ns.
    
    The slightly different tunable names used for fanotify are derived from
    the "group" and "mark" terminology used in the fanotify man pages and
    throughout the code.
    
    Considering the fact that the default value for max_user_watches was
    increased in kernel v5.10 from 8192 to 1048576, leaving the legacy
    fanotify limit of 8192 marks per group in addition to the max_user_marks
    limit makes little sense, so the per group marks limit has been removed.
    
    Note that when a group is initialized with FAN_UNLIMITED_MARKS, its own
    marks are not accounted in the per user marks account, so in effect the
    limit of max_user_marks is only for the collection of groups that are
    not initialized with FAN_UNLIMITED_MARKS.
    
    Link: https://lore.kernel.org/r/20210304112921.3996419-2-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
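
    The three tunables named above can be inspected from userspace with
    ordinary file reads.  The sketch below is a minimal example under the
    assumption that the files hold a single decimal value; the helper
    names read_sysctl_long and print_fanotify_limits are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read one decimal value from a sysctl-style file; -1 if unavailable. */
long read_sysctl_long(const char *path)
{
    long value = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print the fanotify limits listed in the commit message. */
void print_fanotify_limits(void)
{
    static const char *const names[] = {
        "max_queued_events", "max_user_marks", "max_user_groups",
    };
    char path[128];
    size_t i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        snprintf(path, sizeof(path), "/proc/sys/fs/fanotify/%s", names[i]);
        printf("%s = %ld\n", names[i], read_sysctl_long(path));
    }
}
```

    On kernels without this change the files do not exist and each value
    is printed as -1.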

fanotify: create helper fanotify_mark_user_flags() [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:23 2022 +0300

    fanotify: create helper fanotify_mark_user_flags()
    
    [ Upstream commit 4adce25ccfff215939ee465b8c0aa70526d5c352 ]
    
    To translate from fsnotify mark flags to user visible flags.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-13-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: do not allow setting dirent events in mask of non-dir [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Sat May 7 11:00:28 2022 +0300

    fanotify: do not allow setting dirent events in mask of non-dir
    
    [ Upstream commit ceaf69f8eadcafb323392be88e7a5248c415d423 ]
    
    Dirent events (create/delete/move) are only reported on watched
    directory inodes, but in fanotify as well as in legacy inotify, it was
    always allowed to set them on non-dir inode, which does not result in
    any meaningful outcome.
    
    Until kernel v5.17, dirent events in fanotify also differed from events
    "on child" (e.g. FAN_OPEN) in the information provided in the event.
    For example, FAN_OPEN could be set in the mask of a non-dir or the mask
    of its parent and event would report the fid of the child regardless of
    the marked object.
    By contrast, FAN_DELETE is not reported if the child is marked and the
    child fid was not reported in the events.
    
    Since kernel v5.17, with fanotify group flag FAN_REPORT_TARGET_FID, the
    fid of the child is reported with dirent events, like events "on child",
    which may create confusion for users expecting the same behavior as
    events "on child" when setting events in the mask on a child.
    
    The desired semantics of setting dirent events in the mask of a child
    are not clear, so for now, deny this action for a group initialized
    with flag FAN_REPORT_TARGET_FID and for the new event FAN_RENAME.
    We may relax this restriction in the future if we decide on the
    semantics and implement them.
    
    Fixes: d61fd650e9d2 ("fanotify: introduce group flag FAN_REPORT_TARGET_FID")
    Fixes: 8cc3b1ccd930 ("fanotify: wire up FAN_RENAME event")
    Link: https://lore.kernel.org/linux-fsdevel/20220505133057.zm5t6vumc4xdcnsg@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220507080028.219826-1-amir73il@gmail.com
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Emit generic error info for error event [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:42 2021 -0300

    fanotify: Emit generic error info for error event
    
    [ Upstream commit 130a3c742107acff985541c28360c8b40203559c ]
    
    The error info is a record sent to users on FAN_FS_ERROR events
    documenting the type of error.  It also carries an error count,
    documenting how many errors were observed since the last reporting.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-28-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
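
    On the listener side, the error record has to be located among the
    information records that follow the event metadata.  The sketch below
    walks such a buffer.  The demo_* structs are local stand-ins assumed to
    mirror the layout of the UAPI header and error record, and
    find_error_record is a hypothetical helper, so treat this as an
    illustration rather than the definitive parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INFO_TYPE_ERROR 5  /* assumed value of FAN_EVENT_INFO_TYPE_ERROR */

/* Local mirrors of the UAPI record layouts (assumption, see lead-in). */
struct demo_info_header {
    uint8_t  info_type;
    uint8_t  pad;
    uint16_t len;           /* length of the whole record, header included */
};

struct demo_info_error {
    struct demo_info_header hdr;
    int32_t  error;         /* type of error */
    uint32_t error_count;   /* errors observed since the last report */
};

/*
 * Walk the info records in buf[0..len) and copy the first error record
 * into *out.  Returns 1 if one was found, 0 otherwise.
 */
int find_error_record(const unsigned char *buf, size_t len,
                      struct demo_info_error *out)
{
    size_t off = 0;

    assert(buf != NULL && out != NULL);
    while (len - off >= sizeof(struct demo_info_header)) {
        struct demo_info_header hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len < sizeof(hdr) || hdr.len > len - off)
            return 0;       /* malformed or truncated record */
        if (hdr.info_type == DEMO_INFO_TYPE_ERROR &&
            hdr.len >= sizeof(*out)) {
            memcpy(out, buf + off, sizeof(*out));
            return 1;
        }
        off += hdr.len;     /* skip records of other types */
    }
    return 0;
}
```

    Copying with memcpy avoids alignment assumptions about the read
    buffer, and the length checks keep the walk inside the buffer even if
    a record is truncated.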

fanotify: enable "evictable" inode marks [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:27 2022 +0300

    fanotify: enable "evictable" inode marks
    
    [ Upstream commit 5f9d3bd520261fd7a850818c71809fd580e0f30c ]
    
    Now that the direct reclaim path is handled we can enable evictable
    inode marks.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-17-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Encode empty file handle when no inode is provided [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:30 2021 -0300

    fanotify: Encode empty file handle when no inode is provided
    
    [ Upstream commit 272531ac619b374ab474e989eb387162fded553f ]
    
    Instead of failing, encode an invalid file handle in fanotify_encode_fh
    if no inode is provided.  This bogus file handle will be reported by
    FAN_FS_ERROR for non-inode errors.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-16-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: factor out helper fanotify_mark_update_flags() [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:24 2022 +0300

    fanotify: factor out helper fanotify_mark_update_flags()
    
    [ Upstream commit 8998d110835e3781ccd3f1ae061a590b4aaba911 ]
    
    Handle FAN_MARK_IGNORED_SURV_MODIFY flag change in a helper that
    is called after updating the mark mask.
    
    Replace the added and removed return values and helper variables with
    a bool recalc return value and helper variable, which makes the code a
    bit easier to follow.
    
    Rename flags argument to fan_flags to emphasize the difference from
    mark->flags.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-14-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: fix incorrect fmode_t casts [+ + +]

Author: Vasily Averin <vasily.averin@linux.dev>
Date:   Sun May 22 15:08:02 2022 +0300

    fanotify: fix incorrect fmode_t casts
    
    [ Upstream commit dccd855771b37820b6d976a99729c88259549f85 ]
    
    Fixes sparse warnings:
    fs/notify/fanotify/fanotify_user.c:267:63: sparse:
     warning: restricted fmode_t degrades to integer
    fs/notify/fanotify/fanotify_user.c:1351:28: sparse:
     warning: restricted fmode_t degrades to integer
    
    FMODE_NONOTIFY has a bitwise fmode_t type and requires the __force
    attribute for any casts.
    
    Signed-off-by: Vasily Averin <vvs@openvz.org>
    Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/9adfd6ac-1b89-791e-796b-49ada3293985@openvz.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: fix permission model of unprivileged group [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon May 24 16:53:21 2021 +0300

    fanotify: fix permission model of unprivileged group
    
    [ Upstream commit a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ]
    
    Reporting event->pid should depend on the privileges of the user that
    initialized the group, not the privileges of the user reading the
    events.
    
    Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
    group was initialized by an unprivileged user.
    
    To be on the safe side, the permissions to set up filesystem and mount
    marks now require that both the user that initialized the group and
    the user setting up the mark have CAP_SYS_ADMIN.
    
    Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiA77_P5vtv7e83g0+9d7B5W9ZTE4GfQEYbWmfT1rA=VA@mail.gmail.com/
    Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
    Cc: <Stable@vger.kernel.org> # v5.12+
    Link: https://lore.kernel.org/r/20210524135321.2190062-1-amir73il@gmail.com
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Fold event size calculation to its own function [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:20 2021 -0300

    fanotify: Fold event size calculation to its own function
    
    [ Upstream commit b9928e80dda84b349ba8de01780b9bef2fc36ffa ]
    
    Every time this function is invoked, its result is immediately added to
    FAN_EVENT_METADATA_LEN, since there is no need to just calculate the
    length of info records. This minor clean up folds the rest of the
    calculation into the function, which now operates in terms of events,
    returning the size of the entire event, including metadata.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-6-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: implement "evictable" inode marks [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:25 2022 +0300

    fanotify: implement "evictable" inode marks
    
    [ Upstream commit 7d5e005d982527e4029b0139823d179986e34cdc ]
    
    When an inode mark is created with flag FAN_MARK_EVICTABLE, it will not
    pin the marked inode to inode cache, so when inode is evicted from cache
    due to memory pressure, the mark will be lost.
    
    When an inode mark with flag FAN_MARK_EVICTABLE is updated without using
    this flag, the marked inode is pinned to inode cache.
    
    When an inode mark is updated with flag FAN_MARK_EVICTABLE but an
    existing mark already has the inode pinned, the mark update fails with
    error EEXIST.
    
    Evictable inode marks can be used to set up inode marks with ignored mask
    to suppress events from uninteresting files or directories in a lazy
    manner, upon receiving the first event, without having to iterate all
    the uninteresting files or directories beforehand.
    
    The evictable inode mark feature allows performing this lazy mark setup
    without exhausting the system memory with pinned inodes.
    
    This change does not enable the feature yet.
    
    Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiRDpuS=2uA6+ZUM7yG9vVU-u212tkunBmSnP_u=mkv=Q@mail.gmail.com/
    Link: https://lore.kernel.org/r/20220422120327.3459282-15-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
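
    The three update rules in this entry can be sketched as a small state
    machine. This is a model only: sketch_mark and sketch_update_mark are
    hypothetical names, and pins_inode simply stands for "the mark keeps
    the inode in the inode cache".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an inode mark. */
struct sketch_mark {
    bool exists;      /* a mark is already attached to the inode */
    bool pins_inode;  /* the mark holds the inode in the inode cache */
};

/* Create or update a mark; `evictable` models FAN_MARK_EVICTABLE. */
static int sketch_update_mark(struct sketch_mark *mark, bool evictable)
{
    assert(mark != NULL);

    if (!mark->exists) {
        /* New mark: an evictable mark does not pin the inode. */
        mark->exists = true;
        mark->pins_inode = !evictable;
        return 0;
    }
    if (evictable) {
        /* An already pinned inode cannot be made evictable again. */
        return mark->pins_inode ? -EEXIST : 0;
    }
    /* Updating without the flag pins the inode from now on. */
    mark->pins_inode = true;
    return 0;
}
```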

fanotify: introduce a generic info record copying helper [+ + +]

Author: Matthew Bobrowski <repnop@google.com>
Date:   Sun Aug 8 15:25:58 2021 +1000

    fanotify: introduce a generic info record copying helper
    
    [ Upstream commit 0aca67bb7f0d8c997dfef8ff0bfeb0afb361f0e6 ]
    
    The copy_info_records_to_user() helper allows for the separation of
    info record copying routines/conditionals from copy_event_to_user(),
    which reduces the overall clutter within this function. This becomes
    especially true as we start introducing additional info records in the
    future i.e. struct fanotify_event_info_pidfd. On success, this helper
    returns the total amount of bytes that have been copied into the user
    supplied buffer and on error, a negative value is returned to the
    caller.
    
    The newly defined macro FANOTIFY_INFO_MODES can be used to obtain info
    record types that have been enabled for a specific notification
    group. This macro becomes useful in the subsequent patch when the
    FAN_REPORT_PIDFD initialization flag is introduced.
    
    Link: https://lore.kernel.org/r/8872947dfe12ce8ae6e9a7f2d49ea29bc8006af0.1628398044.git.repnop@google.com
    Signed-off-by: Matthew Bobrowski <repnop@google.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: introduce FAN_MARK_IGNORE [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jun 29 17:42:10 2022 +0300

    fanotify: introduce FAN_MARK_IGNORE
    
    [ Upstream commit e252f2ed1c8c6c3884ab5dd34e003ed21f1fe6e0 ]
    
    This flag is a new way to configure ignore mask which allows adding and
    removing the event flags FAN_ONDIR and FAN_EVENT_ON_CHILD in ignore mask.
    
    The legacy FAN_MARK_IGNORED_MASK flag would always ignore events on
    directories and would ignore events on children depending on whether
    the FAN_EVENT_ON_CHILD flag was set in the (non ignored) mask.
    
    FAN_MARK_IGNORE can be used to ignore events on children without setting
    FAN_EVENT_ON_CHILD in the mark's mask and will not ignore events on
    directories unconditionally, only when FAN_ONDIR is set in ignore mask.
    
    The new behavior is non-downgradable.  After calling fanotify_mark() with
    FAN_MARK_IGNORE once, calling fanotify_mark() with FAN_MARK_IGNORED_MASK
    on the same object will return EEXIST error.
    
    Setting the event flags with FAN_MARK_IGNORE on a non-dir inode mark
    has no meaning and will return ENOTDIR error.
    
    The meaning of FAN_MARK_IGNORED_SURV_MODIFY is preserved with the new
    FAN_MARK_IGNORE flag, but with a few semantic differences:
    
    1. FAN_MARK_IGNORED_SURV_MODIFY is required for filesystem and mount
       marks and on an inode mark on a directory. Omitting this flag
       will return EINVAL or EISDIR error.
    
    2. An ignore mask on a non-directory inode that survives modify could
       never be downgraded to an ignore mask that does not survive modify.
       With new FAN_MARK_IGNORE semantics we make that rule explicit -
       trying to update a surviving ignore mask without the flag
       FAN_MARK_IGNORED_SURV_MODIFY will return EEXIST error.
    
    The convenience macro FAN_MARK_IGNORE_SURV is added for
    (FAN_MARK_IGNORE | FAN_MARK_IGNORED_SURV_MODIFY), because the
    common case should use short constant names.
    
    Link: https://lore.kernel.org/r/20220629144210.2983229-4-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
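
    The error rules listed in this entry can be read as one decision
    function. The C sketch below is a model, not the kernel code: all
    sketch_* names are hypothetical, the booleans stand in for the real
    flag bits, and the split "EINVAL for filesystem and mount marks,
    EISDIR for a directory inode mark" is an assumption, since the commit
    text only says "EINVAL or EISDIR".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_obj {
    SKETCH_INODE_FILE,   /* inode mark on a non-directory */
    SKETCH_INODE_DIR,    /* inode mark on a directory */
    SKETCH_MOUNT,
    SKETCH_FILESYSTEM,
};

struct sketch_req {
    enum sketch_obj obj;
    bool ignore;         /* FAN_MARK_IGNORE */
    bool legacy_ignore;  /* FAN_MARK_IGNORED_MASK */
    bool surv_modify;    /* FAN_MARK_IGNORED_SURV_MODIFY */
    bool event_flags;    /* FAN_ONDIR or FAN_EVENT_ON_CHILD in ignore mask */
};

struct sketch_state {
    bool has_new_ignore;   /* FAN_MARK_IGNORE was used on this object */
    bool ignore_survives;  /* existing ignore mask survives modify */
};

static int sketch_check_ignore(const struct sketch_req *req,
                               const struct sketch_state *st)
{
    assert(req != NULL && st != NULL);

    if (req->legacy_ignore)          /* the new behavior cannot be downgraded */
        return st->has_new_ignore ? -EEXIST : 0;
    if (!req->ignore)
        return 0;
    if (req->obj == SKETCH_INODE_FILE && req->event_flags)
        return -ENOTDIR;             /* event flags are meaningless on a file */
    if (!req->surv_modify) {
        if (req->obj == SKETCH_MOUNT || req->obj == SKETCH_FILESYSTEM)
            return -EINVAL;          /* assumed mapping, see lead-in */
        if (req->obj == SKETCH_INODE_DIR)
            return -EISDIR;          /* assumed mapping, see lead-in */
        if (st->ignore_survives)
            return -EEXIST;          /* a surviving mask stays surviving */
    }
    return 0;
}
```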

fanotify: introduce group flag FAN_REPORT_TARGET_FID [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:29 2021 +0200

    fanotify: introduce group flag FAN_REPORT_TARGET_FID
    
    [ Upstream commit d61fd650e9d206a71fda789f02a1ced4b19944c4 ]
    
    FAN_REPORT_FID is ambiguous in that it reports the fid of the child for
    some events and the fid of the parent for create/delete/move events.
    
    The new FAN_REPORT_TARGET_FID flag is an implicit request to report
    the fid of the target object of the operation (a.k.a the child inode)
    also in create/delete/move events in addition to the fid of the parent
    and the name of the child.
    
    To reduce the test matrix for uninteresting use cases, the new
    FAN_REPORT_TARGET_FID flag requires both FAN_REPORT_NAME and
    FAN_REPORT_FID.  The convenience macro FAN_REPORT_DFID_NAME_TARGET
    combines FAN_REPORT_TARGET_FID with all the required flags.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-4-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: limit number of event merge attempts [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 12:48:26 2021 +0200

    fanotify: limit number of event merge attempts
    
    [ Upstream commit b8cd0ee8cda68a888a317991c1e918a8cba1a568 ]
    
    Event merges are expensive when event queue size is large, so limit the
    linear search to 128 merge tests.
    
    In combination with 128 size hash table, there is a potential to merge
    with up to 16K events in the hashed queue.
    
    Link: https://lore.kernel.org/r/20210304104826.3993892-6-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
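
    The 16K figure in this entry follows from 128 buckets times 128 merge
    tests per bucket. The sketch below shows a bounded search over one
    bucket. It is a simplified model with hypothetical names
    (sketch_event, sketch_find_merge) and a plain singly linked list. It
    does not claim to match the kernel's list type or traversal order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_BUCKETS    128  /* hash table size from the commit text */
#define SKETCH_MAX_MERGE_TESTS 128  /* merge test limit from the commit text */

struct sketch_event {
    uint32_t key;               /* merge key of the queued event */
    struct sketch_event *next;  /* next event in the same bucket */
};

/*
 * Look for a queued event with the same key, giving up after
 * SKETCH_MAX_MERGE_TESTS comparisons. Returns the match or NULL.
 */
static struct sketch_event *sketch_find_merge(struct sketch_event *bucket,
                                              uint32_t key)
{
    int tests = 0;

    for (struct sketch_event *ev = bucket; ev != NULL; ev = ev->next) {
        if (++tests > SKETCH_MAX_MERGE_TESTS)
            break;              /* search budget exhausted */
        if (ev->key == key)
            return ev;
    }
    return NULL;
}
```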

fanotify: minor cosmetic adjustments to fid labels [+ + +]

Author: Matthew Bobrowski <repnop@google.com>
Date:   Sun Aug 8 15:25:32 2021 +1000

    fanotify: minor cosmetic adjustments to fid labels
    
    [ Upstream commit d3424c9bac893bd06f38a20474cd622881d384ca ]
    
    With the idea to support additional info record types in the future
    i.e. fanotify_event_info_pidfd, it's a good idea to rename some of the
    labels assigned to some of the existing fid related functions,
    parameters, etc which more accurately represent the intent behind
    their usage.
    
    For example, copy_info_to_user() was defined with a generic function
    label, which arguably reads as being supportive of different info
    record types, however the parameter list for this function is
    explicitly tailored towards the creation and copying of the
    fanotify_event_info_fid records. This same point applies to the macro
    defined as FANOTIFY_INFO_HDR_LEN.
    
    With fanotify_event_info_len(), we change the parameter label so that
    the function implies that it can be extended to calculate the length
    for additional info record types.
    
    Link: https://lore.kernel.org/r/7c3ec33f3c718dac40764305d4d494d858f59c51.1628398044.git.repnop@google.com
    Signed-off-by: Matthew Bobrowski <repnop@google.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: mix event info and pid into merge key hash [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 12:48:24 2021 +0200

    fanotify: mix event info and pid into merge key hash
    
    [ Upstream commit 7e3e5c6943994943eb76cab2d3a1806bc10b9045 ]
    
    Improve the merge key hash by mixing more values relevant for merge.
    
    For example, all FAN_CREATE name events in the same dir used to have the
    same merge key based on the dir inode.  With this change the created
    file name is mixed into the merge key.
    
    The object id that was used as merge key is redundant to the event info
    so it is no longer mixed into the hash.
    
    Permission events are not hashed, so no need to hash their info.
    
    Link: https://lore.kernel.org/r/20210304104826.3993892-4-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Pre-allocate pool of error events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:34 2021 -0300

    fanotify: Pre-allocate pool of error events
    
    [ Upstream commit 734a1a5eccc5f7473002b0669f788e135f1f64aa ]
    
    Pre-allocate slots for file system errors to have greater chances of
    succeeding, since error events can happen in GFP_NOFS context.  This
    patch introduces a group-wide mempool of error events, shared by all
    FAN_FS_ERROR marks in this group.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-20-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: prepare for setting event flags in ignore mask [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jun 29 17:42:08 2022 +0300

    fanotify: prepare for setting event flags in ignore mask
    
    [ Upstream commit 31a371e419c885e0f137ce70395356ba8639dc52 ]
    
    Setting the flags FAN_ONDIR and FAN_EVENT_ON_CHILD in ignore mask has no
    effect.  The FAN_EVENT_ON_CHILD flag in mask implicitly applies to ignore
    mask and ignore mask is always implicitly applied to events on directories.
    
    Define a mark flag that replaces this legacy behavior with logic of
    applying the ignore mask according to event flags in ignore mask.
    
    Implement the new logic to prepare for supporting an ignore mask that
    ignores events on children and ignore mask that does not ignore events
    on directories.
    
    To emphasize the change in terminology, also rename ignored_mask mark
    member to ignore_mask and use accessors to get only the effective
    ignored events or the ignored events and flags.
    
    This change in terminology finally aligns with the "ignore mask"
    language in man pages and in most of the comments.
    
    Link: https://lore.kernel.org/r/20220629144210.2983229-2-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: record either old name new name or both for FAN_RENAME [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:35 2021 +0200

    fanotify: record either old name new name or both for FAN_RENAME
    
    [ Upstream commit 2bfbcccde6e7a787feabad4645f628f963fe0663 ]
    
    We do not want to report the dirfid+name of a directory whose
    inode/sb are not watched, because watcher may not have permissions
    to see the directory content.
    
    Use an internal iter_info to indicate to fanotify_alloc_event()
    which marks of this group are watching FAN_RENAME, so it can decide
    if we need to record only the old parent+name, new parent+name or both.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-10-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    [JK: Modified code to pass around only mask of mark types matching
    generated event]
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: record old and new parent and name in FAN_RENAME event [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:34 2021 +0200

    fanotify: record old and new parent and name in FAN_RENAME event
    
    [ Upstream commit 3982534ba5ce45e890b2f5ef5e7372c1accd14c7 ]
    
    In the special case of FAN_RENAME event, we record both the old
    and new parent and name.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-9-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: reduce event objectid to 29-bit hash [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 12:48:23 2021 +0200

    fanotify: reduce event objectid to 29-bit hash
    
    [ Upstream commit 8988f11abb820bacfcc53d498370bfb30f792ec4 ]
    
    objectid is only used by fanotify backend and it is just an optimization
    for event merge before comparing all fields in event.
    
    Move the objectid member from common struct fsnotify_event into struct
    fanotify_event and reduce it to 29-bit hash to cram it together with the
    3-bit event type.
    
    Events of different types are never merged, so the combination of event
    type and hash form a 32-bit key for fast compare of events.
    
    This reduces the size of events by one pointer and paves the way for
    adding hashed queue support for fanotify.
    
    Link: https://lore.kernel.org/r/20210304104826.3993892-3-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
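
    The 3-bit type plus 29-bit hash layout described in this entry can be
    sketched with explicit shifts and masks. The names below are
    hypothetical, and placing the type in the top bits is a choice made
    for this sketch, not something the commit text specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TYPE_BITS 3u
#define SKETCH_HASH_BITS (32u - SKETCH_TYPE_BITS)              /* 29 */
#define SKETCH_HASH_MASK ((UINT32_C(1) << SKETCH_HASH_BITS) - 1)

/* Pack a 3-bit event type and a 29-bit hash into one 32-bit key. */
static uint32_t sketch_merge_key(uint32_t type, uint32_t hash)
{
    assert(type < (UINT32_C(1) << SKETCH_TYPE_BITS));
    return (type << SKETCH_HASH_BITS) | (hash & SKETCH_HASH_MASK);
}

/*
 * Fast pre-check before comparing all event fields: different keys mean
 * the events cannot be merged. Equal keys still need a full comparison.
 */
static bool sketch_keys_match(uint32_t key_a, uint32_t key_b)
{
    return key_a == key_b;
}
```

    Because the type is part of the key, two events of different types
    never pass the pre-check, even when their hashes are identical.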

fanotify: refine the validation checks on non-dir inode mask [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Jun 27 20:47:19 2022 +0300

    fanotify: refine the validation checks on non-dir inode mask
    
    [ Upstream commit 8698e3bab4dd7968666e84e111d0bfd17c040e77 ]
    
    Commit ceaf69f8eadc ("fanotify: do not allow setting dirent events in
    mask of non-dir") added restrictions about setting dirent events in the
    mask of a non-dir inode mark, which does not make any sense.
    
    For backward compatibility, these restrictions were added only to new
    (v5.17+) APIs.
    
    It also does not make any sense to set the flags FAN_EVENT_ON_CHILD or
    FAN_ONDIR in the mask of a non-dir inode.  Add these flags to the
    dir-only restriction of the new APIs as well.
    
    Move the check of the dir-only flags for new APIs into the helper
    fanotify_events_supported(), which is only called for FAN_MARK_ADD,
    because there is no need to error on an attempt to remove the dir-only
    flags from non-dir inode.
    
    Fixes: ceaf69f8eadc ("fanotify: do not allow setting dirent events in mask of non-dir")
    Link: https://lore.kernel.org/linux-fsdevel/20220627113224.kr2725conevh53u4@quack3.lan/
    Link: https://lore.kernel.org/r/20220627174719.2838175-1-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Remove obsoleted fanotify_event_has_path() [+ + +]

Author: Gaosheng Cui <cuigaosheng1@huawei.com>
Date:   Mon Sep 26 10:30:18 2022 +0800

    fanotify: Remove obsoleted fanotify_event_has_path()
    
    [ Upstream commit 7a80bf902d2bc722b4477442ee772e8574603185 ]
    
    All uses of fanotify_event_has_path() have
    been removed since commit 9c61f3b560f5 ("fanotify: break up
    fanotify_alloc_event()"). It is now unused, so remove it.
    
    Link: https://lore.kernel.org/r/20220926023018.1505270-1-cuigaosheng1@huawei.com
    Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [ cel: resolved merge conflict ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: remove variable set but not used [+ + +]

Author: Yang Li <yang.lee@linux.alibaba.com>
Date:   Thu Jan 20 13:57:22 2022 +0100

    fanotify: remove variable set but not used
    
    [ Upstream commit 217663f101a56ef77f82273818253fff082bf503 ]
    
    The code that uses the pointer info has been removed in 7326e382c21e
    ("fanotify: report old and/or new parent+name in FAN_RENAME event"),
    and fanotify_event_info() doesn't change 'event', so the declaration and
    assignment of info can be removed.
    
    Eliminate the following clang warning:
    fs/notify/fanotify/fanotify_user.c:161:24: warning: variable ‘info’ set
    but not used
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Report fid info for file related file system errors [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:41 2021 -0300

    fanotify: Report fid info for file related file system errors
    
    [ Upstream commit 936d6a38be39177495af38497bf8da1c6128fa1b ]
    
    Plumb the pieces to add a FID report to error records.  Since all error
    event memory must be pre-allocated, we pre-allocate the maximum file
    handle size possible, such that it should always fit.
    
    For errors that don't expose a file handle, report it with an invalid
    FID. Internally we use zero-length FILEID_ROOT file handle for passing
    the information (which we report as zero-length FILEID_INVALID file
    handle to userspace) so we update the handle reporting code to deal with
    this case correctly.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-27-krisman@collabora.com
    Link: https://lore.kernel.org/r/20211025192746.66445-25-krisman@collabora.com
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    [Folded two patches into one to make series bisectable]
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: report old and/or new parent+name in FAN_RENAME event [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:36 2021 +0200

    fanotify: report old and/or new parent+name in FAN_RENAME event
    
    [ Upstream commit 7326e382c21e9c23c89c88369afdc90b82a14da8 ]
    
    In the special case of FAN_RENAME event, we report old or new or both
    old and new parent+name.
    
    A single info record will be reported if either the old or new dir
    is watched and two records will be reported if both old and new dir
    (or their filesystem) are watched.
    
    The old and new parent+name are reported using new info record types
    FAN_EVENT_INFO_TYPE_{OLD,NEW}_DFID_NAME, so if a single info record
    is reported, it is clear to the application, to which dir entry the
    fid+name info is referring to.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-11-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
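
    The "one record or two" rule in this entry can be sketched as follows.
    The enum values are hypothetical stand-ins for
    FAN_EVENT_INFO_TYPE_{OLD,NEW}_DFID_NAME, and the two booleans abstract
    "the old/new dir (or its filesystem) is watched by this group".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags, one per info record type. */
enum {
    SKETCH_OLD_DFID_NAME = 1u << 0,
    SKETCH_NEW_DFID_NAME = 1u << 1,
};

/* Decide which parent+name records a FAN_RENAME event carries. */
static unsigned int sketch_rename_records(bool old_dir_watched,
                                          bool new_dir_watched)
{
    unsigned int records = 0;

    /* A rename that touches no watched dir produces no event here. */
    assert(old_dir_watched || new_dir_watched);

    if (old_dir_watched)
        records |= SKETCH_OLD_DFID_NAME;
    if (new_dir_watched)
        records |= SKETCH_NEW_DFID_NAME;
    return records;
}
```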

fanotify: Require fid_mode for any non-fd event [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:31 2021 -0300

    fanotify: Require fid_mode for any non-fd event
    
    [ Upstream commit 4fe595cf1c80e7a5af4d00c4da29def64aff57a2 ]
    
    Like inode events, FAN_FS_ERROR will require fid mode.  Therefore,
    convert the verification during fanotify_mark(2) to require fid for any
    non-fd event.  This means fid_mode will not only be required for inode
    events, but for any event that doesn't provide a descriptor.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-17-krisman@collabora.com
    Suggested-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Reserve UAPI bits for FAN_FS_ERROR [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:33 2021 -0300

    fanotify: Reserve UAPI bits for FAN_FS_ERROR
    
    [ Upstream commit 8d11a4f43ef4679be0908026907a7613b33d7127 ]
    
    FAN_FS_ERROR allows reporting of event type FS_ERROR to userspace, which
    is a mechanism to report file system wide problems via fanotify.  This
    commit preallocates userspace visible bits to match the FS_ERROR event.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-19-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Split fsid check from other fid mode checks [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:21 2021 -0300

    fanotify: Split fsid check from other fid mode checks
    
    [ Upstream commit 8299212cbdb01a5867e230e961f82e5c02a6de34 ]
    
    FAN_FS_ERROR will require fsid, but not necessarily require the
    filesystem to expose a file handle.  Split those checks into different
    functions, so they can be used separately when setting up an event.
    
    While there, update a comment about tmpfs having 0 fsid, which is no
    longer true.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-7-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Support enqueueing of error events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:35 2021 -0300

    fanotify: Support enqueueing of error events
    
    [ Upstream commit 83e9acbe13dc1b767f91b5c1350f7a65689b26f6 ]
    
    Once an error event is triggered, enqueue it in the notification group,
    similarly to what is done for other events.  FAN_FS_ERROR is not
    handled specially, since the memory is now handled by a preallocated
    mempool.
    
    For now, make the event unhashed.  A future patch implements merging of
    this kind of event.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-21-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: support limited functionality for unprivileged users [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 13:29:21 2021 +0200

    fanotify: support limited functionality for unprivileged users
    
    [ Upstream commit 7cea2a3c505e87a9d6afc78be4a7f7be636a73a7 ]
    
    Add limited support for unprivileged fanotify groups.
    An unprivileged user is not allowed to get an open file descriptor in
    the event nor the process pid of another process.  An unprivileged user
    cannot request permission events, cannot set mount/filesystem marks and
    cannot request unlimited queue/marks.
    
    This enables the limited functionality similar to inotify when watching a
    set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without
    requiring CAP_SYS_ADMIN privileges.
    
    The FAN_REPORT_DFID_NAME init flag provides a method for an unprivileged
    listener watching a set of directories (with FAN_EVENT_ON_CHILD) to monitor
    all changes inside those directories.
    
    This typically requires that the listener keeps a map of watched directory
    fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at()
    before starting to watch for changes.
    
    When getting an event, the reported fid of the parent should be resolved
    to dirfd and fstatat(2) with dirfd and name should be used to query the
    state of the filesystem entry.
    
    Link: https://lore.kernel.org/r/20210304112921.3996419-3-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Support merging of error events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:36 2021 -0300

    fanotify: Support merging of error events
    
    [ Upstream commit 8a6ae64132fd27a944faed7bc38484827609eb76 ]
    
    Error events (FAN_FS_ERROR) against the same file system can be merged
    by simply iterating the error count.  The hash is taken from the fsid,
    without considering the FH.  This means that only the first error object
    is reported.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-22-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
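
    The merge rule in this entry, same filesystem means bump the counter
    and keep the first error, can be sketched as below. The struct layout
    and the sketch_* names are hypothetical and only model the fields the
    commit text mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_fsid {
    int val[2];
};

struct sketch_error_event {
    struct sketch_fsid fsid;
    int error;               /* first error seen, never overwritten */
    unsigned int err_count;  /* number of errors folded into this event */
};

static bool sketch_same_fs(const struct sketch_fsid *a,
                           const struct sketch_fsid *b)
{
    return a->val[0] == b->val[0] && a->val[1] == b->val[1];
}

/* Try to fold `incoming` into `queued`. Returns true if it was merged. */
static bool sketch_merge_error(struct sketch_error_event *queued,
                               const struct sketch_error_event *incoming)
{
    assert(queued != NULL && incoming != NULL);

    if (!sketch_same_fs(&queued->fsid, &incoming->fsid))
        return false;
    queued->err_count++;     /* only the count changes on merge */
    return true;
}
```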

fanotify: Support null inode event in fanotify_dfid_inode [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:28 2021 -0300

    fanotify: Support null inode event in fanotify_dfid_inode
    
    [ Upstream commit 12f47bf0f0990933d95d021d13d31bda010648fd ]
    
    FAN_FS_ERROR doesn't support DFID, but this function is still called for
    every event.  The problem is that it is not capable of handling null
    inodes, which now can happen in case of superblock error events.  For
    this case, just returning dir will be enough.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-14-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: support secondary dir fh and name in fanotify_info [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:33 2021 +0200

    fanotify: support secondary dir fh and name in fanotify_info
    
    [ Upstream commit 3cf984e950c1c3f41d407ed31db33beb996be132 ]
    
    Allow storing a secondary dir fh and name tuple in fanotify_info.
    This will be used to store the new parent and name information in
    FAN_RENAME event.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-8-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: use fsnotify group lock helpers [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:26 2022 +0300

    fanotify: use fsnotify group lock helpers
    
    [ Upstream commit e79719a2ca5c61912c0493bc1367db52759cf6fd ]
    
    Direct reclaim from fanotify mark allocation context may try to evict
    inodes with evictable marks of the same group and hit this deadlock:
    
    [<0>] fsnotify_destroy_mark+0x1f/0x3a
    [<0>] fsnotify_destroy_marks+0x71/0xd9
    [<0>] __destroy_inode+0x24/0x7e
    [<0>] destroy_inode+0x2c/0x67
    [<0>] dispose_list+0x49/0x68
    [<0>] prune_icache_sb+0x5b/0x79
    [<0>] super_cache_scan+0x11c/0x16f
    [<0>] shrink_slab.constprop.0+0x23e/0x40f
    [<0>] shrink_node+0x218/0x3e7
    [<0>] do_try_to_free_pages+0x12a/0x2d2
    [<0>] try_to_free_pages+0x166/0x242
    [<0>] __alloc_pages_slowpath.constprop.0+0x30c/0x903
    [<0>] __alloc_pages+0xeb/0x1c7
    [<0>] cache_grow_begin+0x6f/0x31e
    [<0>] fallback_alloc+0xe0/0x12d
    [<0>] ____cache_alloc_node+0x15a/0x17e
    [<0>] kmem_cache_alloc_trace+0xa1/0x143
    [<0>] fanotify_add_mark+0xd5/0x2b2
    [<0>] do_fanotify_mark+0x566/0x5eb
    [<0>] __x64_sys_fanotify_mark+0x21/0x24
    [<0>] do_syscall_64+0x6d/0x80
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    Set the FSNOTIFY_GROUP_NOFS flag to prevent going into direct reclaim
    from allocations under fanotify group lock and use the safe group lock
    helpers.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-16-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: use helpers to parcel fanotify_info buffer [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:32 2021 +0200

    fanotify: use helpers to parcel fanotify_info buffer
    
    [ Upstream commit 1a9515ac9e55e68d733bab81bd408463ab1e25b1 ]
    
    fanotify_info buffer is parceled into variable sized records, so the
    records must be written in order: dir_fh, file_fh, name.
    
    Use helpers to assert that order and make fanotify_alloc_name_event()
    a bit more generic to allow empty dir_fh record and to allow expanding
    to more records (i.e. name2) soon.
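
    A minimal userspace sketch of the idea, assuming a simplified layout:
    struct info_model and the set_*() helpers below are hypothetical
    stand-ins, not the kernel's struct fanotify_info or its real helpers.
    Each offset is derived from the lengths already stored, which is why
    the records have to be written in order, and each length must fit a
    u8 field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not the kernel's struct fanotify_info. */
struct info_model {
	uint8_t dir_fh_len;
	uint8_t file_fh_len;
	uint8_t name_len;
	unsigned char buf[3 * UINT8_MAX];	/* records packed back to back */
};

/* Each offset is the sum of the lengths of the records stored before it. */
size_t file_fh_offset(const struct info_model *info)
{
	return info->dir_fh_len;
}

size_t name_offset(const struct info_model *info)
{
	return (size_t)info->dir_fh_len + info->file_fh_len;
}

/* dir_fh goes first: no later record may exist yet. A zero length is allowed. */
void set_dir_fh(struct info_model *info, const void *fh, size_t len)
{
	assert(info->file_fh_len == 0 && info->name_len == 0);
	assert(len <= UINT8_MAX);	/* must fit the u8 length field */
	memcpy(info->buf, fh, len);
	info->dir_fh_len = (uint8_t)len;
}

/* file_fh goes second: the name must not have been written yet. */
void set_file_fh(struct info_model *info, const void *fh, size_t len)
{
	assert(info->name_len == 0);
	assert(len <= UINT8_MAX);
	memcpy(info->buf + file_fh_offset(info), fh, len);
	info->file_fh_len = (uint8_t)len;
}

/* The name goes last, after both file handle records. */
void set_name(struct info_model *info, const char *name, size_t len)
{
	assert(len <= UINT8_MAX);
	memcpy(info->buf + name_offset(info), name, len);
	info->name_len = (uint8_t)len;
}
```

    Writing the name before file_fh in this model trips the ordering
    assertion instead of silently corrupting the earlier record.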
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-7-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: use macros to get the offset to fanotify_info buffer [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:31 2021 +0200

    fanotify: use macros to get the offset to fanotify_info buffer
    
    [ Upstream commit 2d9374f095136206a02eb0b6cd9ef94632c1e9f7 ]
    
    The fanotify_info buffer contains up to two file handles and a name.
    Use macros to simplify the code that accesses the different items within
    the buffer.
    
    Add assertions to verify that stored fh len and name len do not overflow
    the u8 stored value in fanotify_info header.
    
    Remove the unused fanotify_info_len() helper.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-6-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: WARN_ON against too large file handles [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:40 2021 -0300

    fanotify: WARN_ON against too large file handles
    
    [ Upstream commit 572c28f27a269f88e2d8d7b6b1507f114d637337 ]
    
    struct fanotify_error_event, at least, is preallocated and isn't able to
    handle arbitrarily large file handles.  Future-proof the code by
    complaining loudly if a handle larger than MAX_HANDLE_SZ is ever found.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-26-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: wire up FAN_RENAME event [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:37 2021 +0200

    fanotify: wire up FAN_RENAME event
    
    [ Upstream commit 8cc3b1ccd930fe6971e1527f0c4f1bdc8cb56026 ]
    
    FAN_RENAME is the successor of FAN_MOVED_FROM and FAN_MOVED_TO
    and can be used to get the old and new parent+name information in
    a single event.
    
    FAN_MOVED_FROM and FAN_MOVED_TO are still supported for backward
    compatibility, but it makes little sense to use them together with
    FAN_RENAME in the same group.
    
    FAN_RENAME uses special info type records to report the old and
    new parent+name, so reporting only old and new parent id is less
    useful and was not implemented.
    Therefore, FAN_RENAME requires a group with flag FAN_REPORT_NAME.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-12-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify: Wrap object_fh inline space in a creator macro [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:37 2021 -0300

    fanotify: Wrap object_fh inline space in a creator macro
    
    [ Upstream commit 2c5069433a3adc01ff9c5673567961bb7f138074 ]
    
    fanotify_error_event would duplicate this sequence of declarations that
    already exist elsewhere with a slightly different size.  Create a helper
    macro to avoid code duplication.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-23-krisman@collabora.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fanotify_user: use upper_32_bits() to verify mask [+ + +]

Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Mar 25 09:37:43 2021 +0100

    fanotify_user: use upper_32_bits() to verify mask
    
    [ Upstream commit 22d483b99863202e3631ff66fa0f3c2302c0f96f ]
    
    I don't see an obvious reason why the upper 32 bit check needs to be
    open-coded this way. Switch to upper_32_bits() which is more idiomatic and
    should conceptually be the same check.
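
    A small userspace sketch of why the two forms agree.  The macro below
    mirrors the kernel's upper_32_bits() definition as a stand-in; the
    open-coded form and both function names are assumptions made for
    illustration, not a quote of the fanotify code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's upper_32_bits() macro. */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))

/* Open-coded form: the mask is acceptable only if no high bit is set. */
bool mask_ok_open_coded(uint64_t mask)
{
	return !(mask & ((uint64_t)0xffffffff << 32));
}

/* The same check expressed with the helper. */
bool mask_ok_helper(uint64_t mask)
{
	return upper_32_bits(mask) == 0;
}
```

    Both predicates reject exactly the masks that have any of bits 32..63
    set, so switching between them does not change behavior.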
    
    Cc: Amir Goldstein <amir73il@gmail.com>
    Cc: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20210325083742.2334933-1-brauner@kernel.org
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Factor files_lookup_fd_locked out of fcheck_files [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:25 2020 -0600

    file: Factor files_lookup_fd_locked out of fcheck_files
    
    [ Upstream commit 120ce2b0cd52abe73e8b16c23461eb14df5a87d8 ]
    
    To make it easy to tell where files->file_lock protection is being
    used when looking up a file, create files_lookup_fd_locked.  Only allow
    this function to be called with the file_lock held.
    
    Update the callers of fcheck and fcheck_files that are called with the
    files->file_lock held to call files_lookup_fd_locked instead.
    
    Hopefully this makes it easier to quickly understand what is going on.
    
    The need for better names became apparent in the last round of
    discussion of this set of changes[1].
    
    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-8-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Implement task_lookup_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:28 2020 -0600

    file: Implement task_lookup_fd_rcu
    
    [ Upstream commit 3a879fb38082125cc0d8aa89b70c7f3a7cdf584b ]
    
    As a companion to lookup_fd_rcu implement task_lookup_fd_rcu for
    querying an arbitrary process about a specific file.
    
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200818103713.aw46m7vprsy4vlve@wittgenstein
    Link: https://lkml.kernel.org/r/20201120231441.29911-11-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Implement task_lookup_next_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:31 2020 -0600

    file: Implement task_lookup_next_fd_rcu
    
    [ Upstream commit e9a53aeb5e0a838f10fcea74235664e7ad5e6e1a ]
    
    As a companion to fget_task and task_lookup_fd_rcu implement
    task_lookup_next_fd_rcu that will return the struct file for the first
    file descriptor number that is equal or greater than the fd argument
    value, or NULL if there is no such struct file.
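
    The lookup contract described above can be modelled in userspace as
    follows.  This is only a sketch over a plain array: struct file_model,
    lookup_next_fd() and count_open_fds() are hypothetical names, and the
    task_lock and RCU protection of the real function are not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct file. */
struct file_model {
	int id;
};

/*
 * Return the first non-NULL entry whose index is >= *fd and store that
 * index back into *fd, or return NULL if no such entry exists.
 */
struct file_model *lookup_next_fd(struct file_model **table,
				  unsigned int size, unsigned int *fd)
{
	unsigned int i;

	for (i = *fd; i < size; i++) {
		if (table[i]) {
			*fd = i;
			return table[i];
		}
	}
	return NULL;
}

/* Iterate over every open descriptor, skipping the holes in the table. */
unsigned int count_open_fds(struct file_model **table, unsigned int size)
{
	unsigned int fd, n = 0;

	for (fd = 0; lookup_next_fd(table, size, &fd); fd++)
		n++;
	return n;
}
```

    The caller advances fd by one after each hit, so a sparse table is
    walked without visiting the same descriptor twice.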
    
    This allows file descriptors of foreign processes to be iterated
    through safely, without needing to increment the count on files_struct.
    
    Some concern[1] has been expressed that this function takes the task_lock
    for each iteration and thus for each file descriptor.  The place
    where this function will be called in a commonly used code path is
    listing /proc/<pid>/fd.  I did some small benchmarks and did not see
    any measurable performance differences.  For ordinary users ls is
    likely to stat each of the directory entries and tid_fd_mode called
    from tid_fd_revalidate has always taken the task lock for each file
    descriptor.  So this does not look like it will be a big change in
    practice.
    
    At some point it will probably be worth changing put_files_struct to
    free files_struct after an rcu grace period so that task_lock won't be
    needed at all.
    
    [1] https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com
    v1: https://lkml.kernel.org/r/20200817220425.9389-9-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-14-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: In f_dupfd read RLIMIT_NOFILE once. [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:36 2020 -0600

    file: In f_dupfd read RLIMIT_NOFILE once.
    
    Simplify the code, and remove the chance of races by reading
    RLIMIT_NOFILE only once in f_dupfd.
    
    Pass the read value of RLIMIT_NOFILE into alloc_fd which is the other
    location the rlimit was read in f_dupfd.  As f_dupfd is the only
    caller of alloc_fd, changing alloc_fd is trivially safe.
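
    A rough userspace sketch of the pattern, under stated assumptions:
    nofile_limit stands in for the rlimit that another thread may change,
    and alloc_fd_model() and f_dupfd_model() are hypothetical simplifications
    that ignore the descriptor table entirely.

```c
#include <errno.h>

/* Hypothetical stand-in for RLIMIT_NOFILE; it may change at any time. */
unsigned long nofile_limit = 1024;

unsigned long read_nofile_limit(void)
{
	return nofile_limit;
}

/* Toy allocator: hand out 'start' if it lies below the limit passed in. */
int alloc_fd_model(unsigned int start, unsigned long end)
{
	return start < end ? (int)start : -EMFILE;
}

int f_dupfd_model(unsigned int from)
{
	/* Read the limit once and reuse that same value for both checks. */
	unsigned long nofile = read_nofile_limit();

	if (from >= nofile)
		return -EINVAL;
	return alloc_fd_model(from, nofile);
}
```

    Because the range check and the allocation see the same value, a
    concurrent change of the limit cannot make the two disagree.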
    
    Further this causes alloc_fd to take all of the same arguments as
    __alloc_fd except for the files_struct argument.
    
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-15-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-19-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Merge __alloc_fd into alloc_fd [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:37 2020 -0600

    file: Merge __alloc_fd into alloc_fd
    
    [ Upstream commit aa384d10f3d06d4b85597ff5df41551262220e16 ]
    
    The function __alloc_fd was added to support binder[1].  With binder
    fixed[2] there are no more users.
    
    As alloc_fd just calls __alloc_fd with "files=current->files",
    merge them together by transforming the files parameter into a
    local variable initialized to current->files.
    
    [1] dcfadfa4ec5a ("new helper: __alloc_fd()")
    [2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-16-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-20-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Merge __fd_install into fd_install [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:35 2020 -0600

    file: Merge __fd_install into fd_install
    
    [ Upstream commit d74ba04d919ebe30bf47406819c18c6b50003d92 ]
    
    The function __fd_install was added to support binder[1].  With binder
    fixed[2] there are no more users.
    
    As fd_install just calls __fd_install with "files=current->files",
    merge them together by transforming the files parameter into a
    local variable initialized to current->files.
    
    [1] f869e8a7f753 ("expose a low-level variant of fd_install() for binder")
    [2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-14-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-18-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Rename __close_fd to close_fd and remove the files parameter [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:38 2020 -0600

    file: Rename __close_fd to close_fd and remove the files parameter
    
    [ Upstream commit 8760c909f54a82aaa6e76da19afe798a0c77c3c3 ]
    
    The function __close_fd was added to support binder[1].  Now that
    binder has been fixed to no longer need __close_fd[2] all calls
    to __close_fd pass current->files.
    
    Therefore transform the files parameter into a local variable
    initialized to current->files, and rename __close_fd to close_fd to
    reflect this change, and keep it in sync with the similar changes to
    __alloc_fd, and __fd_install.
    
    This removes the need for callers to care about the extra care that
    needs to be taken if anything except current->files is passed, by
    limiting the callers to only operating on current->files.
    
    [1] 483ce1d4b8c3 ("take descriptor-related part of close() to file.c")
    [2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-17-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-21-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Rename __fcheck_files to files_lookup_fd_raw [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Dec 10 12:39:54 2020 -0600

    file: Rename __fcheck_files to files_lookup_fd_raw
    
    [ Upstream commit bebf684bf330915e6c96313ad7db89a5480fc9c2 ]
    
    The function fcheck despite its comment is poorly named
    as it has no callers that only check its return value.
    All of fcheck's callers use the returned file descriptor.
    The same is true for fcheck_files and __fcheck_files.
    
    A new less confusing name is needed.  In addition the names
    of these functions are confusing as they do not report
    the kind of locks that are needed to be held when these
    functions are called, making them error prone to use.
    
    To remedy this I am making the base function name lookup_fd
    and will add prefixes and suffixes to indicate the rest
    of the context.
    
    Name the function (previously called __fcheck_files) that proceeds
    from a struct files_struct, looks up the struct file of a file
    descriptor, and requires its callers to verify all of the appropriate
    locks are held files_lookup_fd_raw.
    
    The need for better names became apparent in the last round of
    discussion of this set of changes[1].
    
    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-7-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Rename fcheck lookup_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:27 2020 -0600

    file: Rename fcheck lookup_fd_rcu
    
    [ Upstream commit 460b4f812a9d473d4b39d87d37844f9fc30a9eb3 ]
    
    Also remove the confusing comment about checking if a fd exists.  I
    could not find one instance in the entire kernel that still matches
    the description or the reason for the name fcheck.
    
    The need for better names became apparent in the last round of
    discussion of this set of changes[1].
    
    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Replace fcheck_files with files_lookup_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:26 2020 -0600

    file: Replace fcheck_files with files_lookup_fd_rcu
    
    [ Upstream commit f36c2943274199cb8aef32ac96531ffb7c4b43d0 ]
    
    This change renames fcheck_files to files_lookup_fd_rcu.  All of the
    remaining callers take the rcu_read_lock before calling this function
    so the _rcu suffix is appropriate.  This change also tightens up the
    debug check to verify that all callers hold the rcu_read_lock.
    
    All callers that used to call fcheck_files with the files->file_lock
    held have now been changed to call files_lookup_fd_locked.
    
    This change of name has helped remind me of which locks and which
    guarantees are in place helping me to catch bugs later in the
    patchset.
    
    The need for better names became apparent in the last round of
    discussion of this set of changes[1].
    
    [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-9-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

file: Replace ksys_close with close_fd [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:39 2020 -0600

    file: Replace ksys_close with close_fd
    
    [ Upstream commit 1572bfdf21d4d50e51941498ffe0b56c2289f783 ]
    
    Now that ksys_close is exactly identical to close_fd replace
    the one caller of ksys_close with close_fd.
    
    [1] https://lkml.kernel.org/r/20200818112020.GA17080@infradead.org
    Suggested-by: Christoph Hellwig <hch@infradead.org>
    Link: https://lkml.kernel.org/r/20201120231441.29911-22-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

filelock: add a new locks_inode_context accessor function [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 16 09:02:30 2022 -0500

    filelock: add a new locks_inode_context accessor function
    
    [ Upstream commit 401a8b8fd5acd51582b15238d72a8d0edd580e9f ]
    
    There are a number of places in the kernel that are accessing the
    inode->i_flctx field without smp_load_acquire. This is required to
    ensure that the caller doesn't see a partially-initialized structure.
    
    Add a new accessor function for it to make this clear and convert all of
    the relevant accesses in locks.c to use it. Also, convert
    locks_free_lock_context to use the helper as well instead of just doing
    a "bare" assignment.
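
    The ordering requirement can be illustrated in userspace with C11
    atomics.  This is a sketch only: struct inode_model, publish_ctx() and
    inode_lock_context() are hypothetical, and memory_order_acquire is used
    here as the userspace analogue of smp_load_acquire.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-inode lock context. */
struct lock_context_model {
	int nr_locks;
};

struct inode_model {
	_Atomic(struct lock_context_model *) i_flctx;
};

/* Accessor: the acquire load pairs with the release in publish_ctx(). */
struct lock_context_model *inode_lock_context(struct inode_model *inode)
{
	return atomic_load_explicit(&inode->i_flctx, memory_order_acquire);
}

/*
 * Publisher: finish initializing the context, then install it only if
 * no context has been installed yet.  Returns nonzero on success.
 */
int publish_ctx(struct inode_model *inode, struct lock_context_model *ctx)
{
	struct lock_context_model *expected = NULL;

	ctx->nr_locks = 0;
	return atomic_compare_exchange_strong_explicit(&inode->i_flctx,
						       &expected, ctx,
						       memory_order_release,
						       memory_order_relaxed);
}
```

    A reader that goes through the accessor either sees NULL or a context
    whose fields were fully written before it was published.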
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:25 2022 -0700

    fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict
    
    [ Upstream commit 2443da2259e97688f93d64d17ab69b15f466078a ]
    
    Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to
    lock_manager_operations to allow the lock manager to take appropriate
    action to resolve the lock conflict if possible.
    
    A new field, lm_mod_owner, is also added to lock_manager_operations.
    The lm_mod_owner is used by the fs/lock code to make sure the lock
    manager module such as nfsd, is not freed while lock conflict is being
    resolved.
    
    lm_lock_expirable checks and returns true to indicate that the lock
    conflict can be resolved, else returns false. This callback must be
    called with the flc_lock held so it cannot block.
    
    lm_expire_lock is called to resolve the lock conflict if the returned
    value from lm_lock_expirable is true. This callback is called without
    the flc_lock held since it's allowed to block. Upon returning from
    this callback, the lock conflict should be resolved and the caller is
    expected to restart the conflict check from the beginning of the list.
    
    A lock manager, such as the NFSv4 courteous server, uses this callback to
    resolve the conflict by destroying the lock owner, or the NFSv4 courtesy
    client (a client that has expired but is allowed to maintain its state)
    that owns the lock.
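
    The check, expire and restart sequence described above can be sketched
    in userspace as follows.  Everything here is a hypothetical model:
    struct lock_model and the two callbacks only imitate the roles of
    lm_lock_expirable and lm_expire_lock, and the flc_lock itself appears
    only in comments.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one entry on a lock list. */
struct lock_model {
	int owner;
	bool expirable;	/* answer given by the "expirable" check */
	bool active;
};

/* Called with the list lock held, so it must not block. */
bool lock_expirable(const struct lock_model *l)
{
	return l->expirable;
}

/* Called without the list lock held, so it may block. */
void expire_lock(struct lock_model *l)
{
	l->active = false;
}

/* Return the first conflict that cannot be resolved, or NULL if none. */
struct lock_model *find_conflict(struct lock_model *locks, size_t n,
				 int requester)
{
	size_t i;

retry:
	/* The list lock would be taken here. */
	for (i = 0; i < n; i++) {
		struct lock_model *l = &locks[i];

		if (!l->active || l->owner == requester)
			continue;
		if (!lock_expirable(l))
			return l;	/* genuine conflict */
		/* The list lock would be dropped here before expiring. */
		expire_lock(l);
		goto retry;	/* the list may have changed: rescan it */
	}
	return NULL;
}
```

    Each pass through the retry label removes one expirable lock, so the
    loop terminates once only genuine conflicts, or none, remain.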
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/lock: add helper locks_owner_has_blockers to check for blockers [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:24 2022 -0700

    fs/lock: add helper locks_owner_has_blockers to check for blockers
    
    [ Upstream commit 591502c5cb325b1c6ec59ab161927d606b918aa0 ]
    
    Add helper locks_owner_has_blockers to check if there are any blockers
    for a given lockowner.
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock. [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Sat Feb 12 10:12:52 2022 -0800

    fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock.
    
    [ Upstream commit 9d6647762b9c6b555bc83d97d7c93be6057a990f ]
    
    Update lock usage of lock_manager_operations' functions to reflect
    the changes in commit 6109c85037e5 ("locks: add a dedicated spinlock
    to protect i_flctx lists").
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/lockd: convert comma to semicolon [+ + +]

Author: Zheng Yongjun <zhengyongjun3@huawei.com>
Date:   Fri Dec 11 16:41:58 2020 +0800

    fs/lockd: convert comma to semicolon
    
    [ Upstream commit 3316fb80a0b4c1fef03a3eb1a7f0651e2133c429 ]
    
    Replace a comma between expression statements by a semicolon.
    
    Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/notify: constify path [+ + +]

Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Thu Aug 4 12:57:38 2022 -0400

    fs/notify: constify path
    
    [ Upstream commit d5bf88895f24686641c39420ee6df716dc1d95d8 ]
    
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs: add file and path permissions helpers [+ + +]

Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Jan 21 14:19:22 2021 +0100

    fs: add file and path permissions helpers
    
    [ Upstream commit 02f92b3868a1b34ab98464e76b0e4e060474ba10 ]
    
    Add two simple helpers to check permissions on a file and path
    respectively and convert over some callers. It simplifies quite a few
    codepaths and also reduces the churn in later patches quite a bit.
    Christoph also correctly points out that this makes codepaths (e.g.
    ioctls) way easier to follow that would otherwise have to do more
    complex argument passing than necessary.
    
    Link: https://lore.kernel.org/r/20210121131959.646623-4-christian.brauner@ubuntu.com
    Cc: David Howells <dhowells@redhat.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: linux-fsdevel@vger.kernel.org
    Suggested-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: James Morris <jamorris@linux.microsoft.com>
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs: inotify: Fix typo in inotify comment [+ + +]

Author: Oliver Ford <ojford@gmail.com>
Date:   Wed May 18 15:59:59 2022 +0100

    fs: inotify: Fix typo in inotify comment
    
    [ Upstream commit c05787b4c2f80a3bebcb9cdbf255d4fa5c1e24e1 ]
    
    Correct spelling in comment.
    
    Signed-off-by: Oliver Ford <ojford@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220518145959.41-1-ojford@gmail.com
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Add helper to detect overflow_event [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:23 2021 -0300

    fsnotify: Add helper to detect overflow_event
    
    [ Upstream commit 808967a0a4d2f4ce6a2005c5692fffbecaf018c1 ]
    
    Similarly to fanotify_is_perm_event and friends, provide a helper
    predicate to say whether a mask is of an overflow event.
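
    Such a predicate is a one-line mask test.  A userspace sketch, where
    the FS_Q_OVERFLOW value and the name is_overflow_event() are
    assumptions made for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the queue overflow bit; check the kernel headers. */
#define FS_Q_OVERFLOW 0x00004000u

/* True if the event mask has the queue overflow bit set. */
bool is_overflow_event(uint32_t mask)
{
	return (mask & FS_Q_OVERFLOW) != 0;
}
```

    Callers can then test for overflow events by name instead of repeating
    the mask comparison at each site.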
    
    Link: https://lore.kernel.org/r/20211025192746.66445-9-krisman@collabora.com
    Suggested-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Add wrapper around fsnotify_add_event [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:24 2021 -0300

    fsnotify: Add wrapper around fsnotify_add_event
    
    [ Upstream commit 1ad03c3a326a86e259389592117252c851873395 ]
    
    fsnotify_add_event is growing in number of parameters, which in most
    cases are just passed a NULL pointer.  So, split out a new
    fsnotify_insert_event function to clean things up for users who don't
    need an insert hook.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-10-krisman@collabora.com
    Suggested-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: allow adding an inode mark without pinning inode [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:22 2022 +0300

    fsnotify: allow adding an inode mark without pinning inode
    
    [ Upstream commit c3638b5b13740fa31762d414bbce8b7a694e582a ]
    
    fsnotify_add_mark() and variants implicitly take a reference on inode
    when attaching a mark to an inode.
    
    Make that behavior opt-out with the mark flag FSNOTIFY_MARK_FLAG_NO_IREF.
    
    Instead of taking the inode reference when attaching connector to inode
    and dropping the inode reference when detaching connector from inode,
    take the inode reference on attach of the first mark that wants to hold
    an inode reference and drop the inode reference on detach of the last
    mark that wants to hold an inode reference.
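
    The first-attach and last-detach rule can be modelled with a simple
    counter.  This is a deliberately simplified sketch: the structures and
    the nr_ref_marks counter are hypothetical and are not claimed to match
    how the kernel tracks which marks want an inode reference.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the inode and the mark connector. */
struct inode_model {
	int refcount;
};

struct connector_model {
	struct inode_model *inode;
	unsigned int nr_ref_marks;	/* marks that want an inode reference */
};

/* Take the inode reference only for the first mark that wants one. */
void attach_mark(struct connector_model *conn, bool no_iref)
{
	if (no_iref)
		return;
	if (conn->nr_ref_marks++ == 0)
		conn->inode->refcount++;
}

/* Drop the inode reference only when the last such mark goes away. */
void detach_mark(struct connector_model *conn, bool no_iref)
{
	if (no_iref)
		return;
	if (--conn->nr_ref_marks == 0)
		conn->inode->refcount--;
}
```

    A connector that only ever carries marks with the opt-out flag never
    touches the inode reference count in this model.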
    
    Backends can "upgrade" an existing mark to take an inode reference, but
    cannot "downgrade" a mark with inode reference to release the reference.
    
    This leaves the choice to the backend whether or not to pin the inode
    when adding an inode mark.
    
    This is intended to be used when adding a mark with ignored mask that is
    used for optimization in cases where group can afford getting unneeded
    events and reinstate the mark with ignored mask when inode is accessed
    again after being evicted.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-12-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 12:48:22 2021 +0200

    fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
    
    [ Upstream commit 6f73171e192366ff7c98af9fb50615ef9615f8a7 ]
    
    Current code has an assumption that fsnotify_notify_queue_is_empty() is
    called to verify that queue is not empty before trying to peek or remove
    an event from queue.
    
    Remove this assumption by moving the fsnotify_notify_queue_is_empty()
    into the functions, allow them to return NULL value and check return
    value by all callers.
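
    A userspace sketch of the resulting contract, using a hypothetical
    singly linked queue: the peek and remove helpers do the emptiness check
    themselves and return NULL, so callers only test the return value.

```c
#include <stddef.h>

/* Hypothetical event and queue types for illustration. */
struct event_model {
	struct event_model *next;
	int id;
};

struct queue_model {
	struct event_model *head;
	struct event_model *tail;
};

/* Append an event at the tail of the queue. */
void queue_event(struct queue_model *q, struct event_model *ev)
{
	ev->next = NULL;
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
}

/* Return the first event without removing it, or NULL if empty. */
struct event_model *peek_first_event(struct queue_model *q)
{
	return q->head;
}

/* Remove and return the first event, or NULL if the queue is empty. */
struct event_model *remove_first_event(struct queue_model *q)
{
	struct event_model *ev = peek_first_event(q);

	if (!ev)
		return NULL;
	q->head = ev->next;
	if (!q->head)
		q->tail = NULL;
	ev->next = NULL;
	return ev;
}
```

    A drain loop then becomes a plain while loop on the return value of
    remove_first_event(), with no separate emptiness test.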
    
    This is a prep patch for multi event queues.
    
    Link: https://lore.kernel.org/r/20210304104826.3993892-2-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: clarify contract for create event hooks [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Oct 25 16:27:18 2021 -0300

    fsnotify: clarify contract for create event hooks
    
    [ Upstream commit dabe729dddca550446e9cc118c96d1f91703345b ]
    
    Clarify argument names and contract for fsnotify_create() and
    fsnotify_mkdir() to reflect the anomaly of kernfs, which leaves dentries
    negative after mkdir/create.
    
    Remove the WARN_ON(!inode) in audit code that were added by the Fixes
    commit under the wrong assumption that dentries cannot be negative after
    mkdir/create.
    
    Fixes: aa93bdc5500c ("fsnotify: use helpers to access data by data_type")
    Link: https://lore.kernel.org/linux-fsdevel/87mtp5yz0q.fsf@collabora.com/
    Link: https://lore.kernel.org/r/20211025192746.66445-4-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: clarify object type argument [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:27 2021 +0200

    fsnotify: clarify object type argument
    
    [ Upstream commit ad69cd9972e79aba103ba5365de0acd35770c265 ]
    
    In preparation for separating object type from iterator type, rename
    some 'type' arguments in functions to 'obj_type' and remove the unused
    interface to clear marks by object type mask.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-2-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: consistent behavior for parent not watching children [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed May 11 22:02:13 2022 +0300

    fsnotify: consistent behavior for parent not watching children
    
    [ Upstream commit e730558adffb88a52e562db089e969ee9510184a ]
    
    The logic for handling events on child in groups that have a mark on
    the parent inode, but without FS_EVENT_ON_CHILD flag in the mask is
    duplicated in several places and inconsistent.
    
    Move the logic into the preparation of mark type iterator, so that the
    parent mark type will be excluded from all mark type iterations in that
    case.
    
    This results in several subtle changes of behavior, hopefully all
    desired changes of behavior, for example:
    
    - Group A has a mount mark with FS_MODIFY in mask
    - Group A has a mark with ignore mask that does not survive FS_MODIFY
      and does not watch children on directory D.
    - Group B has a mark with FS_MODIFY in mask that does watch children
      on directory D.
    - FS_MODIFY event on file D/foo should not clear the ignore mask of
      group A, but before this change it does
    
    And if group A ignore mask was set to survive FS_MODIFY:
    - FS_MODIFY event on file D/foo should be reported to group A on account
      of the mount mark, but before this change it is wrongly ignored
    
    Fixes: 2f02fd3fa13e ("fanotify: fix ignore mask logic for events on child and on dir")
    Reported-by: Jan Kara <jack@suse.com>
    Link: https://lore.kernel.org/linux-fsdevel/20220314113337.j7slrb5srxukztje@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220511190213.831646-3-amir73il@gmail.com
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: count all objects with attached connectors [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Tue Aug 10 18:12:19 2021 +0300

    fsnotify: count all objects with attached connectors
    
    [ Upstream commit ec44610fe2b86daef70f3f53f47d2a2542d7094f ]
    
    Rename s_fsnotify_inode_refs to s_fsnotify_connectors and count all
    objects with attached connectors, not only inodes with attached
    connectors.
    
    This will be used to optimize fsnotify() calls on sb without any
    type of marks.
    
    Link: https://lore.kernel.org/r/20210810151220.285179-4-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: count s_fsnotify_inode_refs for attached connectors [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Tue Aug 10 18:12:18 2021 +0300

    fsnotify: count s_fsnotify_inode_refs for attached connectors
    
    [ Upstream commit 11fa333b58ba1518e7c69fafb6513a0117f8fe33 ]
    
    Instead of incrementing s_fsnotify_inode_refs when detaching connector
    from inode, increment it earlier when attaching connector to inode.
    Next patch is going to use s_fsnotify_inode_refs to count all objects
    with attached connectors.
    
    Link: https://lore.kernel.org/r/20210810151220.285179-3-amir73il@gmail.com
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: create helpers for group mark_mutex lock [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:17 2022 +0300

    fsnotify: create helpers for group mark_mutex lock
    
    [ Upstream commit 43b245a788e2d8f1bb742668a9bdace02fcb3e96 ]
    
    Create helpers to take and release the group mark_mutex lock.
    
    Define a flag FSNOTIFY_GROUP_NOFS in fsnotify_group that determines
    if the mark_mutex lock is fs reclaim safe or not.  If not safe, the
    lock helpers take the lock and disable direct fs reclaim.
    
    In that case we annotate the mutex with a different lockdep class to
    express to lockdep that an allocation of mark of an fs reclaim safe group
    may take the group lock of another "NOFS" group to evict inodes.
    
    For now, converted only the callers in common code and no backend
    defines the NOFS flag.  It is intended to be set by fanotify for
    evictable marks support.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-7-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Don't insert unmergeable events in hashtable [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:19 2021 -0300

    fsnotify: Don't insert unmergeable events in hashtable
    
    [ Upstream commit cc53b55f697fe5aa98bdbfdfe67c6401da242155 ]
    
    Some events, like the overflow event, are not mergeable, so they are not
    hashed.  But, when failing inside fsnotify_add_event for lack of space,
    fsnotify_add_event() still calls the insert hook, which adds the
    overflow event to the merge list.  Add a check to prevent any kind of
    unmergeable event from being inserted in the hashtable.
    
    Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge")
    Link: https://lore.kernel.org/r/20211025192746.66445-5-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
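
A toy model of the added check, assuming a bounded queue that reports a single overflow event once it is full. This is a Python sketch rather than the kernel implementation; `Event`, `Group`, `_insert_hook` and `merge_table` are hypothetical names chosen for illustration.

```python
class Event:
    def __init__(self, name, mergeable=True):
        self.name = name
        self.mergeable = mergeable


class Group:
    """Toy notification group with a bounded queue and a merge table."""

    def __init__(self, max_events):
        self.max_events = max_events
        self.queue = []
        self.merge_table = {}
        self.overflow_event = Event("overflow", mergeable=False)

    def _insert_hook(self, event):
        # Backend hook that records the event for later merge lookups.
        self.merge_table.setdefault(event.name, []).append(event)

    def add_event(self, event):
        if len(self.queue) >= self.max_events:
            # Queue is full: queue the overflow event instead, at most once.
            if self.overflow_event in self.queue:
                return
            event = self.overflow_event
        self.queue.append(event)
        # The check this commit describes: only mergeable events are hashed.
        if event.mergeable:
            self._insert_hook(event)
```

In this model the overflow event reaches the queue but never the merge table, which is the behavior the commit message asks for.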

fsnotify: Fix comment typo [+ + +]

Author: Xin Gao <gaoxin@cdjrlc.com>
Date:   Sat Jul 23 03:46:39 2022 +0800

    fsnotify: Fix comment typo
    
    [ Upstream commit feee1ce45a5666bbdb08c5bb2f5f394047b1915b ]
    
    The double `if' is duplicated in line 104, remove one.
    
    Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220722194639.18545-1-gaoxin@cdjrlc.com
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: fix merge with parent's ignored mask [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Feb 23 17:14:37 2022 +0200

    fsnotify: fix merge with parent's ignored mask
    
    [ Upstream commit 4f0b903ded728c505850daf2914bfc08841f0ae6 ]
    
    fsnotify_parent() does not consider the parent's mark at all unless
    the parent inode shows interest in events on children and in the
    specific event.
    
    So unless parent added an event to both its mark mask and ignored mask,
    the event will not be ignored.
    
    Fix this by declaring the interest of an object in an event when the
    event is in either a mark mask or ignored mask.
    
    Link: https://lore.kernel.org/r/20220223151438.790268-2-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
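
The fix can be pictured as a change in how a mark declares interest. The following Python sketch is a simplified model, not kernel code: the `Mark` class and both functions are hypothetical, and the bit value of `FS_MODIFY` is used only for illustration.

```python
FS_MODIFY = 0x00000002  # bit value used for illustration in this sketch


class Mark:
    def __init__(self, mask=0, ignored_mask=0):
        self.mask = mask
        self.ignored_mask = ignored_mask


def interest_before(mark):
    # Behavior described as broken: only the mark mask declares interest,
    # so a parent that merely ignores an event is never consulted.
    return mark.mask


def interest_after(mark):
    # Fixed behavior: an event present in either mask counts as interest,
    # so the parent's ignored mask gets a chance to take effect.
    return mark.mask | mark.ignored_mask
```

A parent mark with `FS_MODIFY` only in its ignored mask shows no interest under `interest_before` and shows interest under `interest_after`.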

fsnotify: fix sb_connectors leak [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Sep 9 14:56:34 2021 +0300

    fsnotify: fix sb_connectors leak
    
    [ Upstream commit 4396a73115fc8739083536162e2228c0c0c3ed1a ]
    
    Fix a leak in s_fsnotify_connectors counter in case of a race between
    concurrent add of new fsnotify mark to an object.
    
    The task that lost the race fails to drop the counter before freeing
    the unused connector.
    
    Following umount() hangs in fsnotify_sb_delete()/wait_var_event(),
    because s_fsnotify_connectors never drops to zero.
    
    Fixes: ec44610fe2b8 ("fsnotify: count all objects with attached connectors")
    Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
    Link: https://lore.kernel.org/linux-fsdevel/20210907063338.ycaw6wvhzrfsfdlp@xzhoux.usersys.redhat.com/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: generate FS_RENAME event with rich information [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:30 2021 +0200

    fsnotify: generate FS_RENAME event with rich information
    
    [ Upstream commit e54183fa7047c15819bc155f4c58501d9a9a3489 ]
    
    The dnotify FS_DN_RENAME event is used to request notification about
    a move within the same parent directory and was always coupled with
    the FS_MOVED_FROM event.
    
    Rename the FS_DN_RENAME event flag to FS_RENAME, decouple it from
    FS_MOVED_FROM and report it with the moved dentry instead of the moved
    inode, so it has the information about both old and new parent and name.
    
    Generate the FS_RENAME event regardless of same parent dir and apply
    the "same parent" rule in the generic fsnotify_handle_event() helper
    that is used to call backends with ->handle_inode_event() method
    (i.e. dnotify).  The ->handle_inode_event() method is not rich enough to
    report both old and new parent and name anyway.
    
    The enriched event is reported to fanotify over the ->handle_event()
    method with the old and new dir inode marks in marks array slots for
    ITER_TYPE_INODE and a new iter type slot ITER_TYPE_INODE2.
    
    The enriched event will be used for reporting old and new parent+name to
    fanotify groups with FAN_RENAME events.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-5-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: introduce mark type iterator [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed May 11 22:02:12 2022 +0300

    fsnotify: introduce mark type iterator
    
    [ Upstream commit 14362a2541797cf9df0e86fb12dcd7950baf566e ]
    
    fsnotify_foreach_iter_mark_type() is used to reduce boilerplate code
    of iterating all marks of a specific group interested in an event
    by consulting the iterator report_mask.
    
    Use an open coded version of that iterator in fsnotify_iter_next()
    that collects all marks of the current iteration group without
    consulting the iterator report_mask.
    
    At the moment, the two iterator variants are the same, but this
    decoupling will allow us to exclude some of the group's marks from
    reporting the event, for example for an event on a child when the inode
    mark on the parent did not request to watch events on children.
    
    Fixes: 2f02fd3fa13e ("fanotify: fix ignore mask logic for events on child and on dir")
    Reported-by: Jan Kara <jack@suse.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220511190213.831646-2-amir73il@gmail.com
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: make allow_dups a property of the group [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:16 2022 +0300

    fsnotify: make allow_dups a property of the group
    
    [ Upstream commit f3010343d9e119da35ee864b3a28993bb5c78ed7 ]
    
    Instead of passing the allow_dups argument to fsnotify_add_mark()
    as an argument, define the group flag FSNOTIFY_GROUP_DUPS to express
    the allow_dups behavior and set this behavior at group creation time
    for all calls of fsnotify_add_mark().
    
    Rename the allow_dups argument to generic add_flags argument for future
    use.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-6-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: optimize FS_MODIFY events with no ignored masks [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Feb 23 17:14:38 2022 +0200

    fsnotify: optimize FS_MODIFY events with no ignored masks
    
    [ Upstream commit 04e317ba72d07901b03399b3d1525e83424df5b3 ]
    
    fsnotify() treats FS_MODIFY events specially - it does not skip them
    even if the FS_MODIFY event does not appear in the object's fsnotify
    mask.  This is because send_to_group() checks if FS_MODIFY needs to
    clear ignored mask of marks.
    
    The common case is that an object does not have any mark with ignored
    mask and in particular, that it does not have a mark with ignored mask
    and without the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag.
    
    Set FS_MODIFY in object's fsnotify mask during fsnotify_recalc_mask()
    if object has a mark with an ignored mask and without the
    FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag and remove the special
    treatment of FS_MODIFY in fsnotify(), so that FS_MODIFY events could
    be optimized in the common case.
    
    Call fsnotify_recalc_mask() from fanotify after adding or removing an
    ignored mask from a mark without FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY
    or when adding the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag to a mark
    with ignored mask (the flag cannot be removed by fanotify uapi).
    
    Performance results for doing 10000000 write(2)s to tmpfs:
    
                                    vanilla         patched
    without notification mark       25.486+-1.054   24.965+-0.244
    with notification mark          30.111+-0.139   26.891+-1.355
    
    So we can see the overhead of notification subsystem has been
    drastically reduced.
    
    Link: https://lore.kernel.org/r/20220223151438.790268-3-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
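
A simplified model of the mask recalculation described above. This Python sketch is an assumption-laden illustration, not the kernel code: `Mark`, `recalc_object_mask` and `should_skip` are hypothetical names, `ignored_surv_modify` stands in for the FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY flag, and the bit values are illustrative. Any other contribution of the ignored mask to the object mask is deliberately left out.

```python
FS_ACCESS = 0x1  # illustrative bit values for this sketch
FS_MODIFY = 0x2


class Mark:
    def __init__(self, mask=0, ignored_mask=0, ignored_surv_modify=False):
        self.mask = mask
        self.ignored_mask = ignored_mask
        self.ignored_surv_modify = ignored_surv_modify


def recalc_object_mask(marks):
    """Union of mark masks, plus FS_MODIFY where an ignored mask may need clearing."""
    object_mask = 0
    for mark in marks:
        object_mask |= mark.mask
        if mark.ignored_mask and not mark.ignored_surv_modify:
            # FS_MODIFY must reach this mark so its ignored mask can be cleared.
            object_mask |= FS_MODIFY
    return object_mask


def should_skip(event, object_mask):
    # With FS_MODIFY folded into the object mask when needed, FS_MODIFY no
    # longer requires special treatment: one generic test covers all events.
    return (event & object_mask) == 0
```

In the common case of no ignored masks, `should_skip(FS_MODIFY, ...)` is true and the event is dropped early, which is where the measured saving comes from.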

fsnotify: optimize the case of no marks of any type [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Tue Aug 10 18:12:20 2021 +0300

    fsnotify: optimize the case of no marks of any type
    
    [ Upstream commit e43de7f0862b8598cd1ef440e3b4701cd107ea40 ]
    
    Add a simple check in the inline helpers to avoid calling fsnotify()
    and __fsnotify_parent() in case there are no marks of any type
    (inode/sb/mount) for an inode's sb, so there can be no objects
    of any type interested in the event.
    
    Link: https://lore.kernel.org/r/20210810151220.285179-5-amir73il@gmail.com
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
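
The fast path can be modeled as a counter test in front of the expensive call. This is a Python sketch under stated assumptions: `SuperBlock`, its `connectors` counter and `notify` are hypothetical stand-ins for the per-sb connector count and the inline helpers, not the kernel API.

```python
class SuperBlock:
    """Toy superblock that counts objects with attached connectors."""

    def __init__(self):
        self.connectors = 0
        self.slow_path_calls = 0

    def attach_connector(self):
        self.connectors += 1

    def detach_connector(self):
        self.connectors -= 1


def notify(sb, event):
    """Return True if the event went down the slow path, False if skipped."""
    if sb.connectors == 0:
        # No inode, sb or mount marks on this sb: nobody can be interested.
        return False
    sb.slow_path_calls += 1
    return True
```

The model also hints at why the counter must stay balanced: if a detach is ever missed, the count never returns to zero and the fast path is lost for that sb.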

fsnotify: pass data_type to fsnotify_name() [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Oct 25 16:27:16 2021 -0300

    fsnotify: pass data_type to fsnotify_name()
    
    [ Upstream commit 9baf93d68bcc3d0a6042283b82603c076e25e4f5 ]
    
    Align the arguments of fsnotify_name() to those of fsnotify().
    
    Link: https://lore.kernel.org/r/20211025192746.66445-2-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [ cel: adjust fsnotify_delete as well, a37d9a17f099 is already applied ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: pass dentry instead of inode data [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Oct 25 16:27:17 2021 -0300

    fsnotify: pass dentry instead of inode data
    
    [ Upstream commit fd5a3ff49a19aa69e2bc1e26e98037c2d778e61a ]
    
    Define a new data type to pass for event - FSNOTIFY_EVENT_DENTRY.
    Use it to pass the dentry instead of its ->d_inode where available.
    
    This is needed in preparation to the refactor to retrieve the super
    block from the data field.  In some cases (i.e. mkdir in kernfs), the
    data inode comes from a negative dentry, such that no super block
    information would be available. By receiving the dentry itself, instead
    of the inode, fsnotify can derive the super block even on these cases.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-3-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    [Expand explanation in commit message]
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: pass flags argument to fsnotify_alloc_group() [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:15 2022 +0300

    fsnotify: pass flags argument to fsnotify_alloc_group()
    
    [ Upstream commit 867a448d587e7fa845bceaf4ee1c632448f2a9fa ]
    
    Add flags argument to fsnotify_alloc_group(), define and use the flag
    FSNOTIFY_GROUP_USER in inotify and fanotify instead of the helper
    fsnotify_alloc_user_group() to indicate user allocation.
    
    Although the flag FSNOTIFY_GROUP_USER is currently not used after group
    allocation, we store the flags argument in the group struct for future
    use of other group flags.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-5-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Pass group argument to free_event [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:27 2021 -0300

    fsnotify: Pass group argument to free_event
    
    [ Upstream commit 330ae77d2a5b0af32c0f29e139bf28ec8591de59 ]
    
    For group-wide mempool backed events, like FS_ERROR, the free_event
    callback will need to reference the group's mempool to free the memory.
    Wire that argument into the current callers.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-13-krisman@collabora.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Protect fsnotify_handle_inode_event from no-inode events [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:26 2021 -0300

    fsnotify: Protect fsnotify_handle_inode_event from no-inode events
    
    [ Upstream commit 24dca90590509a7a6cbe0650100c90c5b8a3468a ]
    
    FAN_FS_ERROR allows events without inodes - i.e. for file system-wide
    errors.  Even though fsnotify_handle_inode_event is not currently used
    by fanotify, this patch protects other backends from cases where neither
    inode nor dir are provided.  Also document the constraints of the
    interface (inode and dir cannot be both NULL).
    
    Link: https://lore.kernel.org/r/20211025192746.66445-12-krisman@collabora.com
    Suggested-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: remove redundant parameter judgment [+ + +]

Author: Bang Li <libang.linuxer@gmail.com>
Date:   Fri Mar 11 23:12:40 2022 +0800

    fsnotify: remove redundant parameter judgment
    
    [ Upstream commit f92ca72b0263d601807bbd23ed25cbe6f4da89f4 ]
    
    iput() has already judged the incoming parameter, so there is no need to
    repeat the judgment here.
    
    Link: https://lore.kernel.org/r/20220311151240.62045-1-libang.linuxer@gmail.com
    Signed-off-by: Bang Li <libang.linuxer@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: remove unused declaration [+ + +]

Author: Gaosheng Cui <cuigaosheng1@huawei.com>
Date:   Fri Sep 9 11:38:28 2022 +0800

    fsnotify: remove unused declaration
    
    [ Upstream commit f847c74d6e89f10926db58649a05b99237258691 ]
    
    fsnotify_alloc_event_holder() and fsnotify_destroy_event_holder()
    have been removed since commit 7053aee26a35 ("fsnotify: do not share
    events between notification groups"), so remove their declarations.
    
    Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
    Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: replace igrab() with ihold() on attach connector [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Tue Aug 10 18:12:17 2021 +0300

    fsnotify: replace igrab() with ihold() on attach connector
    
    [ Upstream commit 09ddbe69c9925b42cb9529f60678c25b241d8b18 ]
    
    We must have a reference on inode, so ihold is cheaper.
    
    Link: https://lore.kernel.org/r/20210810151220.285179-2-amir73il@gmail.com
    Reviewed-by: Matthew Bobrowski <repnop@google.com>
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Retrieve super block from the data field [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:25 2021 -0300

    fsnotify: Retrieve super block from the data field
    
    [ Upstream commit 29335033c574a15334015d8c4e36862cff3d3384 ]
    
    Some file system events (i.e. FS_ERROR) might not be associated with an
    inode or directory.  For these, we can retrieve the super block from the
    data field.  But, since the super_block is available in the data field
    on every event type, simplify the code to always retrieve it from there,
    through a new helper.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-11-krisman@collabora.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: separate mark iterator type from object type enum [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Mon Nov 29 22:15:28 2021 +0200

    fsnotify: separate mark iterator type from object type enum
    
    [ Upstream commit 1c9007d62bea6fd164285314f7553f73e5308863 ]
    
    They are two different types that use the same enum, so this is confusing.
    
    Use the object type to indicate the type of object a mark is attached to
    and the iter type to indicate the type of watch.
    
    A group can have two different watches of the same object type (parent
    and child watches) that match the same event.
    
    Link: https://lore.kernel.org/r/20211129201537.1932819-3-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: Support FS_ERROR event type [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:32 2021 -0300

    fsnotify: Support FS_ERROR event type
    
    [ Upstream commit 9daa811073fa19c08e8aad3b90f9235fed161acf ]
    
    Expose a new type of fsnotify event for filesystems to report errors for
    userspace monitoring tools.  fanotify will send this type of
    notification for FAN_FS_ERROR events.  This also introduces a helper for
    generating the new event.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-18-krisman@collabora.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fsnotify: use hash table for faster events merge [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Thu Mar 4 12:48:25 2021 +0200

    fsnotify: use hash table for faster events merge
    
    [ Upstream commit 94e00d28a680dff18805ca472b191364347d2234 ]
    
    In order to improve event merge performance, hash events in a 128 size
    hash table by the event merge key.
    
    The fanotify_event size grows by two pointers, but we just reduced its
    size by removing the objectid member, so overall its size is increased
    by one pointer.
    
    Permission events and overflow event are not merged so they are also
    not hashed.
    
    Link: https://lore.kernel.org/r/20210304104826.3993892-5-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
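
The idea of hashing by merge key can be sketched as follows. This is a Python model, not the fanotify code: `Event`, `EventQueue` and the use of Python's built-in `hash()` are assumptions for illustration, as is merging by OR-ing the masks. Only the bucket count of 128 comes from the commit message.

```python
MERGE_HASH_SIZE = 128  # bucket count stated in the commit message


class Event:
    def __init__(self, key, mask, mergeable=True):
        self.key = key
        self.mask = mask
        self.mergeable = mergeable


class EventQueue:
    def __init__(self):
        self.events = []
        self.buckets = [[] for _ in range(MERGE_HASH_SIZE)]

    def add(self, event):
        """Queue an event, or merge it into a queued event with an equal key."""
        if not event.mergeable:
            # Stand-in for permission and overflow events: queued, never hashed.
            self.events.append(event)
            return event
        bucket = self.buckets[hash(event.key) % MERGE_HASH_SIZE]
        for queued in bucket:
            # Only one bucket is scanned instead of the whole queue.
            if queued.key == event.key:
                queued.mask |= event.mask
                return queued
        bucket.append(event)
        self.events.append(event)
        return event
```

The gain in this model is that a merge candidate is searched for in one short bucket rather than by walking every queued event.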

inotify, memcg: account inotify instances to kmemcg [+ + +]

Author: Shakeel Butt <shakeel.butt@linux.dev>
Date:   Sat Dec 19 20:46:08 2020 -0800

    inotify, memcg: account inotify instances to kmemcg
    
    [ Upstream commit ac7b79fd190b02e7151bc7d2b9da692f537657f3 ]
    
    Currently the fs sysctl inotify/max_user_instances is used to limit the
    number of inotify instances on the system. For systems running multiple
    workloads, the per-user namespace sysctl max_inotify_instances can be
    used to further partition inotify instances. However there is no easy
    way to set a sensible system level max limit on inotify instances and
    further partition it between the workloads. It is much easier to charge
    the underlying resource (i.e. memory) behind the inotify instances to
    the memcg of the workload and let their memory limits limit the number
    of inotify instances they can create.
    
    With inotify instances charged to memcg, the admin can simply set
    max_user_instances to INT_MAX and let the memcg limits of the jobs limit
    their inotify instances.
    
    Link: https://lore.kernel.org/r/20201220044608.1258123-1-shakeelb@google.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Shakeel Butt <shakeelb@google.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

inotify: Don't force FS_IN_IGNORED [+ + +]

Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date:   Mon Oct 25 16:27:22 2021 -0300

    inotify: Don't force FS_IN_IGNORED
    
    [ Upstream commit e0462f91d24756916fded4313d508e0fc52f39c9 ]
    
    According to Amir:
    
    "FS_IN_IGNORED is completely internal to inotify and there is no need
    to set it in i_fsnotify_mask at all, so if we remove the bit from the
    output of inotify_arg_to_mask() no functionality will change and we will
    be able to overload the event bit for FS_ERROR."
    
    This is done in preparation to overload FS_ERROR with the notification
    mechanism in fanotify.
    
    Link: https://lore.kernel.org/r/20211025192746.66445-8-krisman@collabora.com
    Suggested-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

inotify: Increase default inotify.max_user_watches limit to 1048576 [+ + +]

Author: Waiman Long <longman@redhat.com>
Date:   Sun Nov 8 22:59:31 2020 -0500

    inotify: Increase default inotify.max_user_watches limit to 1048576
    
    [ Upstream commit 92890123749bafc317bbfacbe0a62ce08d78efb7 ]
    
    The default value of inotify.max_user_watches sysctl parameter was set
    to 8192 since the introduction of the inotify feature in 2005 by
    commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
    small for many modern use cases. As a result, users have to explicitly set
    it to a larger value to make it work.
    
    After some searching around the web, these are the
    inotify.max_user_watches values used by some projects:
     - vscode:  524288
     - dropbox support: 100000
     - users on stackexchange: 12228
     - lsyncd user: 2000000
     - code42 support: 1048576
     - monodevelop: 16384
     - tectonic: 524288
     - openshift origin: 65536
    
    Each watch point adds an inotify_inode_mark structure to an inode to
    be watched. It also pins the watched inode.
    
    Modeled after the epoll.max_user_watches behavior to adjust the default
    value according to the amount of addressable memory available, make
    inotify.max_user_watches behave in a similar way to make it use no more
    than 1% of addressable memory within the range [8192, 1048576].
    
    We estimate the amount of memory used by an inotify mark as the size of
    inotify_inode_mark plus two times the size of struct inode (we double
    the inode size to cover the additional filesystem private inode part).
    That means that a 64-bit system with 128GB or more memory will likely
    have the maximum value of 1048576 for inotify.max_user_watches. This
    default should be big enough for most use cases.
    
    Link: https://lore.kernel.org/r/20201109035931.4740-1-longman@redhat.com
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
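
The sizing rule above amounts to a clamp over a simple division. The Python sketch below reproduces only the arithmetic stated in the commit message; the function names are hypothetical, and the structure sizes a caller passes in are assumptions, since the real values depend on kernel configuration and architecture.

```python
MIN_WATCHES = 8192
MAX_WATCHES = 1048576


def watch_cost(mark_size, inode_size):
    # One mark plus twice the inode size, per the estimate in the commit text.
    return mark_size + 2 * inode_size


def default_max_user_watches(addressable_bytes, watch_cost_bytes):
    """Watches that fit in 1% of addressable memory, clamped to the range."""
    budget = addressable_bytes // 100
    watches = budget // watch_cost_bytes
    return max(MIN_WATCHES, min(watches, MAX_WATCHES))
```

For example, with an assumed mark size of 80 bytes and an assumed inode size of 600 bytes, `watch_cost(80, 600)` is 1280 bytes per watch. With that cost, small machines stay at the 8192 floor and a machine with 128 GiB of addressable memory reaches the 1048576 ceiling, consistent with the estimate in the commit message.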

inotify: move control flags from mask to mark flags [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:13 2022 +0300

    inotify: move control flags from mask to mark flags
    
    [ Upstream commit 38035c04f5865c4ef9597d6beed6a7178f90f64a ]
    
    The inotify control flags in the mark mask (e.g. FS_IN_ONE_SHOT) are not
    relevant to object interest mask, so move them to the mark flags.
    
    This frees up some bits in the object interest mask.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-3-amir73il@gmail.com
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

inotify: use fsnotify group lock helpers [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:18 2022 +0300

    inotify: use fsnotify group lock helpers
    
    [ Upstream commit 642054b87058019be36033f73c3e48ffff1915aa ]
    
    inotify inode marks pin the inode so there is no need to set the
    FSNOTIFY_GROUP_NOFS flag.
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-8-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kallsyms: only build {,module_}kallsyms_on_each_symbol when required [+ + +]

Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Feb 2 13:13:27 2021 +0100

    kallsyms: only build {,module_}kallsyms_on_each_symbol when required
    
    [ Upstream commit 3e3552056ab42f883d7723eeb42fed712b66bacf ]
    
    kallsyms_on_each_symbol and module_kallsyms_on_each_symbol are only used
    by the livepatching code, so don't build them if livepatching is not
    enabled.
    
    Reviewed-by: Miroslav Benes <mbenes@suse.cz>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jessica Yu <jeyu@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kallsyms: refactor {,module_}kallsyms_on_each_symbol [+ + +]

Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Feb 2 13:13:26 2021 +0100

    kallsyms: refactor {,module_}kallsyms_on_each_symbol
    
    [ Upstream commit 013c1667cf78c1d847152f7116436d82dcab3db4 ]
    
    Require an explicit call to module_kallsyms_on_each_symbol to look
    for symbols in modules instead of the call from kallsyms_on_each_symbol,
    and acquire module_mutex inside of module_kallsyms_on_each_symbol instead
    of leaving that up to the caller.  Note that this slightly changes the
    behavior for the livepatch code in that the symbols from vmlinux are not
    iterated anymore if objname is set, but that actually is the desired
    behavior in this case.
    
    Reviewed-by: Petr Mladek <pmladek@suse.com>
    Acked-by: Miroslav Benes <mbenes@suse.cz>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jessica Yu <jeyu@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kcmp: In get_file_raw_ptr use task_lookup_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:30 2020 -0600

    kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
    
    [ Upstream commit ed77e80e14a3cd55c73848b9e8043020e717ce12 ]
    
    Modify get_file_raw_ptr to use task_lookup_fd_rcu.  The helper
    task_lookup_fd_rcu does the work of taking the task lock and verifying
    that task->files != NULL and then calls files_lookup_fd_rcu.  So let's
    use the helper to make a simpler implementation of get_file_raw_ptr.
    
    Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
    Link: https://lkml.kernel.org/r/20201120231441.29911-13-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kcmp: In kcmp_epoll_target use fget_task [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:21 2020 -0600

    kcmp: In kcmp_epoll_target use fget_task
    
    [ Upstream commit f43c283a89a7dc531a47d4b1e001503cf3dc3234 ]
    
    Use the helper fget_task and simplify the code.
    
    As well as simplifying the code this removes one unnecessary increment of
    struct files_struct.  This unnecessary increment of files_struct.count can
    result in exec unnecessarily unsharing files_struct and breaking posix
    locks, and it can result in fget_light having to fallback to fget reducing
    performance.
    
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-4-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-4-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Linux: Keep read and write fds with each nlm_file [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Aug 23 16:44:00 2021 -0400

    Keep read and write fds with each nlm_file
    
    [ Upstream commit 7f024fcd5c97dc70bb9121c80407cf3cf9be7159 ]
    
    We shouldn't really be using a read-only file descriptor to take a write
    lock.
    
    Most filesystems will put up with it.  But NFS, for example, won't.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kernel/pid.c: implement additional checks upon pidfd_create() parameters [+ + +]

Author: Matthew Bobrowski <repnop@google.com>
Date:   Sun Aug 8 15:25:05 2021 +1000

    kernel/pid.c: implement additional checks upon pidfd_create() parameters
    
    [ Upstream commit 490b9ba881e2c6337bb09b68010803ae98e59f4a ]
    
    By adding the pidfd_create() declaration to linux/pid.h, we
    effectively expose this function to the rest of the kernel. In order
    to avoid any unintended behavior, or set false expectations upon this
    function, ensure that constraints are forced upon each of the passed
    parameters. This includes the checking of whether the passed struct
    pid is a thread-group leader as pidfd creation is currently limited to
    such pid types.
    
    Link: https://lore.kernel.org/r/2e9b91c2d529d52a003b8b86c45f866153be9eb5.1628398044.git.repnop@google.com
    Signed-off-by: Matthew Bobrowski <repnop@google.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kernel/pid.c: remove static qualifier from pidfd_create() [+ + +]

Author: Matthew Bobrowski <repnop@google.com>
Date:   Sun Aug 8 15:24:33 2021 +1000

    kernel/pid.c: remove static qualifier from pidfd_create()
    
    [ Upstream commit c576e0fcd6188d0edb50b0fb83f853433ef4819b ]
    
    With the idea of returning pidfds from the fanotify API, we need to
    expose a mechanism for creating pidfds. We drop the static qualifier
    from pidfd_create() and add its declaration to linux/pid.h so that the
    pidfd_create() helper can be called from other kernel subsystems
    i.e. fanotify.
    
    Link: https://lore.kernel.org/r/0c68653ec32f1b7143301f0231f7ed14062fd82b.1628398044.git.repnop@google.com
    Signed-off-by: Matthew Bobrowski <repnop@google.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Linux: Linux 5.10.220 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Fri Jun 21 14:54:16 2024 +0200

    Linux 5.10.220
    
    Link: https://lore.kernel.org/r/20240618123407.280171066@linuxfoundation.org
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lockd: change the proc_handler for nsm_use_hostnames [+ + +]

Author: Jia He <hejianet@gmail.com>
Date:   Tue Aug 3 12:59:37 2021 +0200

    lockd: change the proc_handler for nsm_use_hostnames
    
    [ Upstream commit d02a3a2cb25d384005a6e3446a445013342024b7 ]
    
    nsm_use_hostnames is a module parameter that is also exported through
    sysctl procfs so that it can be changed from userspace. But the
    minimal unit for a sysctl procfs read/write is sizeof(int).
    On a big-endian system, converting between bool and int will cause
    errors for proc items.
    
    This patch uses a new proc_handler, proc_dobool, to fix it.
    
    Signed-off-by: Jia He <hejianet@gmail.com>
    Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
    [thuth: Fix typo in commit message]
    Signed-off-by: Thomas Huth <thuth@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
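
Why the bool/int mismatch above only bites on big-endian systems can be illustrated in portable C. This is a sketch under stated assumptions, not the kernel's proc_dobool: the function names are hypothetical, and the big-endian layout is simulated with a byte array so the example behaves the same on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Write v into buf in big-endian order (most significant byte first). */
void store_be32(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/*
 * Problem case: an int-sized write lands on a 1-byte bool, so the bool
 * only sees the byte at the lowest address of the int.
 */
bool flag_seen_through_first_byte(uint32_t written)
{
    unsigned char buf[4];

    store_be32(buf, written);
    return buf[0] != 0;
}

/* Safer idea: handle the value as a full int, then convert it to bool. */
bool flag_seen_through_full_int(uint32_t written)
{
    return written != 0;
}
```

Writing 1 through the first variant yields false, because the only set bit sits in the last byte of the big-endian layout. Converting from the whole int gives the expected true.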

lockd: Common NLM XDR helpers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:50:52 2021 -0400

    lockd: Common NLM XDR helpers
    
    [ Upstream commit a6a63ca5652ea05637ecfe349f9e895031529556 ]
    
    Add a .h file containing xdr_stream-based XDR helpers common to both
    NLMv3 and NLMv4.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Create a simplified .vs_dispatch method for NLM requests [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:50:46 2021 -0400

    lockd: Create a simplified .vs_dispatch method for NLM requests
    
    [ Upstream commit a9ad1a8090f58b2ed1774dd0f4c7cdb8210a3793 ]
    
    To enable xdr_stream-based encoding and decoding, create a bespoke
    RPC dispatch function for the lockd service.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: detect and reject lock arguments that overflow [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Aug 1 15:57:26 2022 -0400

    lockd: detect and reject lock arguments that overflow
    
    [ Upstream commit 6930bcbfb6ceda63e298c6af6d733ecdf6bd4cde ]
    
    lockd doesn't currently vet the start and length in nlm4 requests like
    it should, and can end up generating lock requests with arguments that
    overflow when passed to the filesystem.
    
    The NLM4 protocol uses unsigned 64-bit arguments for both start and
    length, whereas struct file_lock tracks the start and end as loff_t
    values. By the time we get around to calling nlm4svc_retrieve_args,
    we've lost the information that would allow us to determine if there was
    an overflow.
    
    Start tracking the actual start and len for NLM4 requests in the
    nlm_lock. In nlm4svc_retrieve_args, vet these values to ensure they
    won't cause an overflow, and return NLM4_FBIG if they do.
    
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=392
    Reported-by: Jan Kasiak <j.kasiak@gmail.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Cc: <stable@vger.kernel.org> # 5.14+
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
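
The kind of check described above can be sketched as follows. This is a minimal illustration, not the code in nlm4svc_retrieve_args: the function name is hypothetical, OFFSET_MAX is modelled as INT64_MAX, and a length of 0 is assumed to mean "to end of file".

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the largest offset a signed 64-bit loff_t can hold. */
#define OFFSET_MAX_U64 ((uint64_t)INT64_MAX)

/*
 * Hypothetical helper: report whether an unsigned 64-bit (start, len)
 * pair from the wire cannot be represented as signed 64-bit offsets.
 */
bool nlm4_range_overflows(uint64_t start, uint64_t len)
{
    /* The start itself must fit in a signed offset. */
    if (start > OFFSET_MAX_U64)
        return true;

    /* The last byte of the range (start + len - 1) must fit as well.
     * Comparing against OFFSET_MAX - start avoids wrapping the sum. */
    if (len != 0 && len - 1 > OFFSET_MAX_U64 - start)
        return true;

    return false;
}
```

A caller following the commit's approach would reject the request (the commit returns NLM4_FBIG) whenever such a check reports an overflow.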

lockd: don't attempt blocking locks on nfs reexports [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Aug 20 17:02:05 2021 -0400

    lockd: don't attempt blocking locks on nfs reexports
    
    [ Upstream commit b840be2f00c0bc00d993f8f76e251052b83e4382 ]
    
    As in the v4 case, it doesn't work well to block waiting for a lock on
    an nfs filesystem.
    
    As in the v4 case, that means we're depending on the client to poll.
    It's probably incorrect to depend on that, but I *think* clients do poll
    in practice.  In any case, it's an improvement over hanging the lockd
    thread indefinitely as we currently are.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: drop inappropriate svc_get() from lockd_get() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Sat Jun 3 07:14:14 2023 +1000

    lockd: drop inappropriate svc_get() from lockd_get()
    
    [ Upstream commit 665e89ab7c5af1f2d260834c861a74b01a30f95f ]
    
    The below-mentioned patch was intended to simplify refcounting on the
    svc_serv used by lockd.  The goal was to only ever have a single
    reference from the single thread.  To that end we dropped a call to
    lockd_start_svc() (except when creating thread) which would take a
    reference, and dropped the svc_put(serv) that would drop that reference.
    
    Unfortunately we didn't also remove the svc_get() from
    lockd_create_svc() in the case where the svc_serv already existed.
    So after the patch:
     - on the first call the svc_serv was allocated and the one reference
       was given to the thread, so there are no extra references
     - on subsequent calls svc_get() was called so there is now an extra
       reference.
    This is clearly not consistent.
    
    The inconsistency is also clear in the current code, in that lockd_get()
    takes *two* references, one on nlmsvc_serv and one by incrementing
    nlmsvc_users.  This clearly does not match lockd_put().
    
    So: drop that svc_get() from lockd_get() (which used to be in
    lockd_create_svc()).
    
    Reported-by: Ido Schimmel <idosch@idosch.org>
    Closes: https://lore.kernel.org/linux-nfs/ZHsI%2FH16VX9kJQX1@shredder/T/#u
    Fixes: b73a2972041b ("lockd: move lockd_start_svc() call into lockd_create_svc()")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Tested-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: ensure we use the correct file descriptor when unlocking [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Nov 11 14:36:37 2022 -0500

    lockd: ensure we use the correct file descriptor when unlocking
    
    [ Upstream commit 69efce009f7df888e1fede3cb2913690eb829f52 ]
    
    Shared locks are set on O_RDONLY descriptors and exclusive locks are set
    on O_WRONLY ones. nlmsvc_unlock however calls vfs_lock_file twice, once
    for each descriptor, but it doesn't reset fl_file. Ensure that it does.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: fix failure to cleanup client locks [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue Jan 18 17:00:51 2022 -0500

    lockd: fix failure to cleanup client locks
    
    [ Upstream commit d19a7af73b5ecaac8168712d18be72b9db166768 ]
    
    In my testing, we're sometimes hitting the request->fl_flags & FL_EXISTS
    case in posix_lock_inode, presumably just by random luck since we're not
    actually initializing fl_flags here.
    
    This probably didn't matter before commit 7f024fcd5c97 ("Keep read and
    write fds with each nlm_file") since we wouldn't previously unlock
    unless we knew there were locks.
    
    But now it causes lockd to give up on removing more locks.
    
    We could just initialize fl_flags, but really it seems dubious to be
    calling vfs_lock_file with random values in some of the fields.
    
    Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: fixed checkpatch.pl nit ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: fix file selection in nlmsvc_cancel_blocked [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Nov 11 14:36:38 2022 -0500

    lockd: fix file selection in nlmsvc_cancel_blocked
    
    [ Upstream commit 9f27783b4dd235ef3c8dbf69fc6322777450323c ]
    
    We currently do a lock_to_openmode call based on the arguments from the
    NLM_UNLOCK call, but that will always set the fl_type of the lock to
    F_UNLCK, and the O_RDONLY descriptor is always chosen.
    
    Fix it to use the file_lock from the block instead.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
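
The mapping from lock type to descriptor that the commit above refers to can be sketched like this. It is an illustration only: the function name is hypothetical, and the rule (write locks use the write-only descriptor, everything else the read-only one) is inferred from the commit messages in this changelog.

```c
#include <fcntl.h>

/*
 * Hypothetical stand-in for a lock_to_openmode-style helper: pick the
 * open mode of the descriptor that should carry a lock of this type.
 */
int openmode_for_lock_type(int lock_type)
{
    /* Only an exclusive (write) lock needs the write-only descriptor. */
    return lock_type == F_WRLCK ? O_WRONLY : O_RDONLY;
}
```

Under this rule a request whose type has already been set to F_UNLCK always selects O_RDONLY, which is why the type has to come from the original blocked lock rather than from the unlock arguments.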

lockd: fix nlm_close_files [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Jul 11 14:30:14 2022 -0400

    lockd: fix nlm_close_files
    
    [ Upstream commit 1197eb5906a5464dbaea24cac296dfc38499cc00 ]
    
    This loop condition tries a bit too hard to be clever. Just test for
    the two indices we care about explicitly.
    
    Cc: J. Bruce Fields <bfields@fieldses.org>
    Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: fix server crash on reboot of client holding lock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue Jan 18 17:00:16 2022 -0500

    lockd: fix server crash on reboot of client holding lock
    
    [ Upstream commit 6e7f90d163afa8fc2efd6ae318e7c20156a5621f ]
    
    I thought I was iterating over the array when actually the iteration is
    over the values contained in the array?
    
    Ugh, keep it simple.
    
    Symptoms were a null dereference in vfs_lock_file() when an NFSv3 client
    that previously held a lock came back up and sent a notify.
    
    Reported-by: Jonathan Woithe <jwoithe@just42.net>
    Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: introduce lockd_put() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: introduce lockd_put()
    
    [ Upstream commit 865b674069e05e5779fcf8cf7a166d2acb7e930b ]
    
    There is some cleanup that is duplicated in lockd_down() and the failure
    path of lockd_up().
    Factor these out into a new lockd_put() and call it from both places.
    
    lockd_put() does *not* take the mutex - that must be held by the caller.
    It decrements nlmsvc_users and if that reaches zero, it cleans up.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
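
The user-counting pattern described above can be sketched in plain C. This is a simplified model, not lockd's code: the struct, its fields, and the function names are hypothetical, and the mutex that the caller must hold is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a service shared by several users. */
struct service {
    unsigned int users;  /* plays the role of nlmsvc_users */
    bool running;        /* true while the service exists */
};

/* Take a user reference; the first user brings the service up. */
void service_get(struct service *s)
{
    if (s->users == 0)
        s->running = true;
    s->users++;
}

/* Drop a user reference; the last user tears the service down.
 * As with lockd_put(), any locking is the caller's responsibility. */
void service_put(struct service *s)
{
    assert(s->users > 0);
    if (--s->users == 0)
        s->running = false;
}
```

Keeping a single counter on each side makes the get and put paths exact inverses, which is the property the later lockd_get() rename and the svc_get() fix rely on.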

lockd: introduce nlmsvc_serv [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: introduce nlmsvc_serv
    
    [ Upstream commit 2840fe864c91a0fe822169b1fbfddbcac9aeac43 ]
    
    lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it
    wants the 'struct svc_serv', and when it doesn't want it exactly it can
    get to what it wants from the serv.
    
    This patch is a first step to removing nlmsvc_task and nlmsvc_rqst.  It
    introduces nlmsvc_serv to store the 'struct svc_serv*'.  This is set as
    soon as the serv is created, and cleared only when it is destroyed.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: move from strlcpy with unused retval to strscpy [+ + +]

Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date:   Thu Aug 18 23:01:16 2022 +0200

    lockd: move from strlcpy with unused retval to strscpy
    
    [ Upstream commit 97f8e62572555f8ad578d7b1739ba64d5d2cac0f ]
    
    Follow the advice of the below link and prefer 'strscpy' in this
    subsystem. Conversion is 1:1 because the return value is not used.
    Generated by a coccinelle script.
    
    Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: move lockd_start_svc() call into lockd_create_svc() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: move lockd_start_svc() call into lockd_create_svc()
    
    [ Upstream commit b73a2972041bee70eb0cbbb25fa77828c63c916b ]
    
    lockd_start_svc() only needs to be called once, just after the svc is
    created.  If the start fails, the svc is discarded too.
    
    It thus makes sense to call lockd_start_svc() from lockd_create_svc().
    This allows us to remove the test against nlmsvc_rqst at the start of
    lockd_start_svc() - it must always be NULL.
    
    lockd_up() only held an extra reference on the svc until a thread was
    created - then it dropped it.  The thread - and thus the extra reference
    - will remain until kthread_stop() is called.
    Now that the thread is created in lockd_create_svc(), the extra
    reference can be dropped there.  So the 'serv' variable is no longer
    needed in lockd_up().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: move svc_exit_thread() into the thread [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: move svc_exit_thread() into the thread
    
    [ Upstream commit 6a4e2527a63620a820c4ebf3596b57176da26fb3 ]
    
    The normal place to call svc_exit_thread() is from the thread itself
    just before it exits.
    Do this for lockd.
    
    This means that nlmsvc_rqst is not used outside of lockd_start_svc(),
    so it can be made local to that function, and renamed to 'rqst'.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Remove stale comments [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:50:40 2021 -0400

    lockd: Remove stale comments
    
    [ Upstream commit 99cdf57b33e68df7afc876739c93a11f0b1ba807 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: rename lockd_create_svc() to lockd_get() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: rename lockd_create_svc() to lockd_get()
    
    [ Upstream commit ecd3ad68d2c6d3ae178a63a2d9a02c392904fd36 ]
    
    lockd_create_svc() already does an svc_get() if the service already
    exists, so it is more like a "get" than a "create".
    
    So:
     - Move the increment of nlmsvc_users into the function as well
     - rename to lockd_get().
    
    It is now the inverse of lockd_put().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: set file_lock start and end when decoding nlm4 testargs [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Mar 14 06:20:58 2023 -0400

    lockd: set file_lock start and end when decoding nlm4 testargs
    
    [ Upstream commit 7ff84910c66c9144cc0de9d9deed9fb84c03aff0 ]
    
    Commit 6930bcbfb6ce dropped the setting of the file_lock range when
    decoding a nlm_lock off the wire. This causes the client side grant
    callback to miss matching blocks and reject the lock, only to rerequest
    it 30s later.
    
    Add a helper function to set the file_lock range from the start and end
    values that the protocol uses, and have the nlm_lock decoder call that to
    set up the file_lock args properly.
    
    Fixes: 6930bcbfb6ce ("lockd: detect and reject lock arguments that overflow")
    Reported-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Tested-by: Amir Goldstein <amir73il@gmail.com>
    Cc: stable@vger.kernel.org #6.0
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
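
A helper of the kind described above might look like the following sketch. It is not the kernel's implementation: the struct and function names are hypothetical, OFFSET_MAX is modelled as INT64_MAX, the start is assumed to have been vetted already, and a length of 0 is assumed to mean "to end of file".

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the largest offset a signed 64-bit loff_t can hold. */
#define OFFSET_MAX_U64 ((uint64_t)INT64_MAX)

/* Hypothetical inclusive byte range, like fl_start / fl_end. */
struct byte_range {
    int64_t start;
    int64_t end;
};

/* Convert a wire-format (start, len) pair into an inclusive range. */
struct byte_range range_from_start_len(uint64_t start, uint64_t len)
{
    struct byte_range r;
    /* Unsigned arithmetic wraps in a well-defined way. */
    uint64_t last = start + len - 1;

    assert(start <= OFFSET_MAX_U64);
    r.start = (int64_t)start;

    /* Zero length, a wrapped sum, or an end beyond the signed range
     * all map to "up to the maximum offset". */
    if (len == 0 || last < start || last > OFFSET_MAX_U64)
        r.end = INT64_MAX;
    else
        r.end = (int64_t)last;
    return r;
}
```

For example, start 10 with length 5 covers bytes 10 through 14, while start 10 with length 0 extends to the maximum offset.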

lockd: set fl_owner when unlocking files [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Jul 11 14:30:13 2022 -0400

    lockd: set fl_owner when unlocking files
    
    [ Upstream commit aec158242b87a43d83322e99bc71ab4428e5ab79 ]
    
    Unlocking a POSIX lock on an inode with vfs_lock_file only works if
    the owner matches. Ensure we set it in the request.
    
    Cc: J. Bruce Fields <bfields@fieldses.org>
    Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: set missing fl_flags field when retrieving args [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Nov 11 14:36:36 2022 -0500

    lockd: set missing fl_flags field when retrieving args
    
    [ Upstream commit 75c7940d2a86d3f1b60a0a265478cb8fc887b970 ]
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: set other missing fields when unlocking files [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sun Nov 6 14:02:39 2022 -0500

    lockd: set other missing fields when unlocking files
    
    [ Upstream commit 18ebd35b61b4693a0ddc270b6d4f18def232e770 ]
    
    vfs_lock_file() expects the struct file_lock to be fully initialised by
    the caller. Re-exported NFSv3 has been seen to Oops if the fl_file field
    is NULL.
    
    Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=216582
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: simplify management of network status notifiers [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: simplify management of network status notifiers
    
    [ Upstream commit 5a8a7ff57421b7de3ae72019938ffb5daaee36e7 ]
    
    Now that the network status notifiers use nlmsvc_serv rather than
    nlmsvc_rqst the management can be simplified.
    
    Notifier unregistration synchronises with any pending notifications so
    providing we unregister before nlm_serv is freed no further interlock
    is required.
    
    So we move the unregister call to just before the thread is killed
    (which destroys the service) and just before the service is destroyed in
    the failure-path of lockd_up().
    
    Then nlm_ntf_refcnt and nlm_ntf_wq can be removed.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: update nlm_lookup_file reexport comment [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Aug 20 17:02:02 2021 -0400

    lockd: update nlm_lookup_file reexport comment
    
    [ Upstream commit b661601a9fdf1af8516e1100de8bba84bd41cca4 ]
    
    Update comment to reflect that we *do* allow reexport, whether it's a
    good idea or not....
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 CANCEL arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:16 2021 -0400

    lockd: Update the NLMv1 CANCEL arguments decoder to use struct xdr_stream
    
    [ Upstream commit f4e08f3ac8c4945ea54a740e3afcf44b34e7cf44 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 FREE_ALL arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:46 2021 -0400

    lockd: Update the NLMv1 FREE_ALL arguments decoder to use struct xdr_stream
    
    [ Upstream commit 14e105256b9dcdf50a003e2e9a0da77e06770a4b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 LOCK arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:10 2021 -0400

    lockd: Update the NLMv1 LOCK arguments decoder to use struct xdr_stream
    
    [ Upstream commit c1adb8c672ca2b085c400695ef064547d77eda29 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 nlm_res arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:28 2021 -0400

    lockd: Update the NLMv1 nlm_res arguments decoder to use struct xdr_stream
    
    [ Upstream commit 16ddcabe6240c4fb01c97f6fce6c35ddf8626ad5 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:04 2021 -0400

    lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream
    
    [ Upstream commit e96735a6980574ecbdb24c760b8d294095e47074 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 SHARE arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:40 2021 -0400

    lockd: Update the NLMv1 SHARE arguments decoder to use struct xdr_stream
    
    [ Upstream commit 890939e1266b9adf3b0acd5e0385b39813cb8f11 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:10 2021 -0400

    lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream
    
    [ Upstream commit 529ca3a116e8978575fec061a71fa6865a344891 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 SM_NOTIFY arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:34 2021 -0400

    lockd: Update the NLMv1 SM_NOTIFY arguments decoder to use struct xdr_stream
    
    [ Upstream commit 137e05e2f735f696e117553f7fa5ef8fb09953e1 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 TEST arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:04 2021 -0400

    lockd: Update the NLMv1 TEST arguments decoder to use struct xdr_stream
    
    [ Upstream commit 2fd0c67aabcf0f8821450b00ee511faa0b7761bf ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:58 2021 -0400

    lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream
    
    [ Upstream commit adf98a4850b9ede9fc174c78a885845fb08499a5 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 UNLOCK arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:22 2021 -0400

    lockd: Update the NLMv1 UNLOCK arguments decoder to use struct xdr_stream
    
    [ Upstream commit c27045d302b022ed11d24a2653bceb6af56c6327 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 void argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:50:58 2021 -0400

    lockd: Update the NLMv1 void argument decoder to use struct xdr_stream
    
    [ Upstream commit cc1029b51273da5b342683e9ae14ab4eeaa15997 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv1 void results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:51:52 2021 -0400

    lockd: Update the NLMv1 void results encoder to use struct xdr_stream
    
    [ Upstream commit e26ec898b68b2ab64f379ba0fc0a615b2ad41f40 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:34 2021 -0400

    lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream
    
    [ Upstream commit 1e1f38dcf3c031715191e1fd26f70a0affca4dbd ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:53:04 2021 -0400

    lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream
    
    [ Upstream commit 3049e974a7c7cfa0c15fb807f4a3e75b2ab8517a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:28 2021 -0400

    lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream
    
    [ Upstream commit 0e5977af4fdc277984fca7d8c2e0c880935775a0 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:46 2021 -0400

    lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream
    
    [ Upstream commit b4c24b5a41da63e5f3a9b6ea56cbe2a1efe49579 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:53:23 2021 -0400

    lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream
    
    [ Upstream commit 447c14d48968d0d4c2733c3f8052cb63aa1deb38 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:58 2021 -0400

    lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream
    
    [ Upstream commit 7cf96b6d0104b12aa30961901879e428884b1695 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:53:29 2021 -0400

    lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream
    
    [ Upstream commit 0ff5b50ab1f7f39862d0cdf6803978d31b27f25e ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:52 2021 -0400

    lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream
    
    [ Upstream commit bc3665fd718b325cfff3abd383b00d1a87e028dc ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:22 2021 -0400

    lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream
    
    [ Upstream commit 345b4159a075b15dc4ae70f1db90fa8abf85d2e7 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:53:17 2021 -0400

    lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream
    
    [ Upstream commit 1beef1473ccaa70a2d54f9e76fba5f534931ea23 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:40 2021 -0400

    lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream
    
    [ Upstream commit d76d8c25cea794f65615f3a2324052afa4b5f900 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:52:16 2021 -0400

    lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream
    
    [ Upstream commit 7956521aac58e434a05cf3c68c1b66c1312e5649 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: Update the NLMv4 void results encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jun 3 16:53:11 2021 -0400

    lockd: Update the NLMv4 void results encoder to use struct xdr_stream
    
    [ Upstream commit ec757e423b4fcd6e5ea4405d1e8243c040458d78 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: use locks_inode_context helper [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 16 09:19:43 2022 -0500

    lockd: use locks_inode_context helper
    
    [ Upstream commit 98b41ffe0afdfeaa1439a5d6bd2db4a94277e31b ]
    
    lockd currently doesn't access i_flctx safely. This requires a
    smp_load_acquire, as the pointer is set via cmpxchg (a release
    operation).
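    
    The pairing described above can be sketched in userspace C11 atomics.
    This is only an illustration, not the kernel helper: the sketch_*
    names and types are hypothetical, and the compare-and-swap here uses
    acq_rel ordering as a stand-in for the kernel's cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the lock context and the inode that owns it. */
struct sketch_ctx {
    int nr_locks;
};

struct sketch_inode {
    _Atomic(struct sketch_ctx *) flctx;   /* NULL until first use */
};

/* Reader: an acquire load, so fields written before publication are visible. */
static struct sketch_ctx *sketch_get_ctx(struct sketch_inode *inode)
{
    assert(inode != NULL);
    return atomic_load_explicit(&inode->flctx, memory_order_acquire);
}

/* Writer: initialise the object fully, then publish it with a
 * compare-and-swap that has release semantics on success. */
static struct sketch_ctx *sketch_install_ctx(struct sketch_inode *inode)
{
    struct sketch_ctx *expected = NULL;
    struct sketch_ctx *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return sketch_get_ctx(inode);
    ctx->nr_locks = 0;
    if (!atomic_compare_exchange_strong_explicit(&inode->flctx, &expected, ctx,
                                                 memory_order_acq_rel,
                                                 memory_order_acquire)) {
        free(ctx);          /* lost the race; use the winner's context */
        return expected;
    }
    return ctx;
}
```

    A plain (non-acquire) load in the reader would leave nothing that
    orders the load of the pointer before loads of the fields behind it.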
    
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Anna Schumaker <anna@kernel.org>
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

lockd: use svc_set_num_threads() for thread start and stop [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    lockd: use svc_set_num_threads() for thread start and stop
    
    [ Upstream commit 6b044fbaab02292fedb17565dbb3f2528083b169 ]
    
    svc_set_num_threads() does everything that lockd_start_svc() does, except
    set sv_maxconn.  It also (when passed 0) finds the threads and
    stops them with kthread_stop().
    
    So move the setting for sv_maxconn, and use svc_set_num_threads().
    
    We now don't need nlmsvc_task.
    
    Now that we use svc_set_num_threads() it makes sense to set svo_module.
    This requests that the thread exits with module_put_and_exit().
    Also fix the documentation for svo_module to make this explicit.
    
    svc_prepare_thread is now only used where it is defined, so it can be
    made static.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: address merge conflict with fd2468fa1301 ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

module: unexport find_module and module_mutex [+ + +]

Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Feb 2 13:13:24 2021 +0100

    module: unexport find_module and module_mutex
    
    [ Upstream commit 089049f6c9956c5cf1fc89fe10229c76e99f4bef ]
    
    find_module is not used by modular code any more, and random driver code
    has no business calling it to start with.
    
    Reviewed-by: Miroslav Benes <mbenes@suse.cz>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jessica Yu <jeyu@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

module: use RCU to synchronize find_module [+ + +]

Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Feb 2 13:13:25 2021 +0100

    module: use RCU to synchronize find_module
    
    [ Upstream commit a006050575745ca2be25118b90f1c37f454ac542 ]
    
    Allow for an RCU-sched critical section around find_module, following
    the lower level find_module_all helper, and switch the two callers
    outside of module.c to use such a RCU-sched critical section instead
    of module_mutex.
    
    Reviewed-by: Petr Mladek <pmladek@suse.com>
    Acked-by: Miroslav Benes <mbenes@suse.cz>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jessica Yu <jeyu@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

namei: introduce struct renamedata [+ + +]

Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Jan 21 14:19:32 2021 +0100

    namei: introduce struct renamedata
    
    [ Upstream commit 9fe61450972d3900bffb1dc26a17ebb9cdd92db2 ]
    
    In order to handle idmapped mounts we will extend the vfs rename helper
    to take two new arguments in follow-up patches. Since this operation
    already takes a bunch of arguments, add a simple struct renamedata and
    make the current helper use it before we extend it.
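    
    A minimal sketch of the parameter-object pattern this describes.
    The sketch_* types, fields and helper are hypothetical and only
    illustrate the shape of the change; they are not the kernel's
    struct renamedata or vfs_rename().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's types carry far more state. */
struct sketch_dir { const char *path; };
struct sketch_entry { const char *name; };

/* All rename arguments gathered into one parameter object. */
struct sketch_renamedata {
    struct sketch_dir *old_dir;
    struct sketch_entry *old_entry;
    struct sketch_dir *new_dir;
    struct sketch_entry *new_entry;
    unsigned int flags;
};

/* The helper takes a single pointer, so adding a field to the struct
 * later does not change this signature or its existing callers. */
static int sketch_rename(const struct sketch_renamedata *rd)
{
    assert(rd != NULL);
    if (!rd->old_dir || !rd->old_entry || !rd->new_dir || !rd->new_entry)
        return -1;              /* incomplete request */
    rd->new_entry->name = rd->old_entry->name;
    return 0;
}
```

    Callers fill the struct with designated initializers, so arguments
    added later default to zero at call sites that do not set them.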
    
    Link: https://lore.kernel.org/r/20210121131959.646623-14-christian.brauner@ubuntu.com
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: David Howells <dhowells@redhat.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: linux-fsdevel@vger.kernel.org
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: Add a private local dispatcher for NFSv4 callback operations [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 15 15:52:25 2021 -0400

    NFS: Add a private local dispatcher for NFSv4 callback operations
    
    [ Upstream commit 7d34c96217cf3c2d37ca0a56ca0bc3c3bef1e189 ]
    
    The client's NFSv4 callback service is the only remaining user of
    svc_generic_dispatch().
    
    Note that the NFSv4 callback service doesn't use the .pc_encode and
    .pc_decode callouts in any substantial way, so they are removed.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfs: block notification on fs with its own ->lock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Dec 16 12:20:13 2021 -0500

    nfs: block notification on fs with its own ->lock
    
    [ Upstream commit 40595cdc93edf4110c0f0c0b06f8d82008f23929 ]
    
    NFSv4.1 supports an optional lock notification feature which notifies
    the client when a lock becomes available.  (Normally NFSv4 clients just
    poll for locks if necessary.)  To make that work, we need to request a
    blocking lock from the filesystem.
    
    We turned that off for NFS in commit f657f8eef3ff ("nfs: don't atempt
    blocking locks on nfs reexports") [sic] because it actually blocks the
    nfsd thread while waiting for the lock.
    
    Thanks to Vasily Averin for pointing out that NFS isn't the only
    filesystem with that problem.
    
    Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
    does the right thing.  Simplest is just to assume that any filesystem
    that defines its own ->lock is not safe to request a blocking lock from.
    
    So, this patch mostly reverts commit f657f8eef3ff ("nfs: don't atempt
    blocking locks on nfs reexports") [sic] and commit b840be2f00c0 ("lockd:
    don't attempt blocking locks on nfs reexports"), and instead uses a
    check of ->lock (Vasily's suggestion) to decide whether to support
    blocking lock notifications on a given filesystem.  Also add a little
    documentation.
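    
    The ->lock check can be illustrated with a small userspace sketch.
    The sketch_* types and the SKETCH_FL_SLEEP flag are hypothetical
    stand-ins, not the nfsd or VFS definitions; only the decision rule
    (a filesystem-private lock method means no blocking request) comes
    from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FL_SLEEP 0x1u   /* hypothetical "caller may block" flag */

struct sketch_file;

/* Hypothetical file operations table; only the lock hook matters here. */
struct sketch_file_ops {
    int (*lock)(struct sketch_file *file, int cmd);
};

struct sketch_file {
    const struct sketch_file_ops *ops;
};

/* Drop the blocking request when the filesystem supplies its own lock
 * method; a NULL hook means the generic POSIX lock path is used. */
static unsigned int sketch_lock_flags(const struct sketch_file *file,
                                      unsigned int flags)
{
    assert(file != NULL && file->ops != NULL);
    if (file->ops->lock)
        flags &= ~SKETCH_FL_SLEEP;
    return flags;
}

/* Example of a filesystem-private lock method that might block. */
static int sketch_private_lock(struct sketch_file *file, int cmd)
{
    (void)file;
    (void)cmd;
    return 0;
}
```

    The rule is deliberately conservative: it cannot tell a "good"
    private lock method from one that blocks the calling thread.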
    
    Perhaps someday we could add back an export flag to allow
    filesystems with "good" ->lock methods to support blocking lock
    notifications.
    
    Reported-by: Vasily Averin <vvs@virtuozzo.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: Description rewritten to address checkpatch nits ]
    [ cel: Fixed warning when SUNRPC debugging is disabled ]
    [ cel: Fixed NULL check ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfs: don't allow reexport reclaims [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Aug 20 17:02:06 2021 -0400

    nfs: don't allow reexport reclaims
    
    [ Upstream commit bb0a55bb7148a49e549ee992200860e7a040d3a5 ]
    
    In the reexport case, nfsd is currently passing along locks with the
    reclaim bit set.  The client sends a new lock request, which is granted
    if there's currently no conflict--even if it's possible a conflicting
    lock could have been briefly held in the interim.
    
    We don't currently have any way to safely grant reclaim, so for now
    let's just deny them all.
    
    I'm doing this by passing the reclaim bit to nfs and letting it fail the
    call, with the idea that eventually the client might be able to do
    something more forgiving here.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfs: don't atempt blocking locks on nfs reexports [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Aug 20 17:02:04 2021 -0400

    nfs: don't atempt blocking locks on nfs reexports
    
    [ Upstream commit f657f8eef3ff870552c9fd2839e0061046f44618 ]
    
    NFS implements blocking locks by blocking inside its lock method.  In
    the reexport case, this blocks the nfs server thread, which could lead
    to deadlocks since an nfs server thread might be required to unlock the
    conflicting lock.  It also causes a crash, since the nfs server thread
    assumes it can free the lock when its lm_notify lock callback is called.
    
    Ideal would be to make the nfs lock method return without blocking in
    this case, but for now it works just not to attempt blocking locks.  The
    difference is just that the original client will have to poll (as it
    does in the v4.0 case) instead of getting a callback when the lock's
    available.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: fix nfs_fetch_iversion() [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Wed Mar 24 15:32:21 2021 -0400

    NFS: fix nfs_fetch_iversion()
    
    [ Upstream commit b876d708316bf9b6b9678eb2beb289b93cfe6369 ]
    
    The change attribute is always set by all NFS client versions so get rid
    of the open-coded version.
    
    Fixes: 3cc55f4434b4 ("nfs: use change attribute for NFS re-exports")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: Remove unused callback void decoder [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 15 15:52:31 2021 -0400

    NFS: Remove unused callback void decoder
    
    [ Upstream commit c35a810ce59524971c4a3b45faed4d0121e5a305 ]
    
    Clean up: The callback RPC dispatcher no longer invokes these call
    outs, although svc_process_common() relies on seeing a .pc_encode
    function.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: restore module put when manager exits. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Thu Jun 23 14:47:34 2022 +1000

    NFS: restore module put when manager exits.
    
    [ Upstream commit 080abad71e99d2becf38c978572982130b927a28 ]
    
    Commit f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") removed
    calls to module_put_and_kthread_exit() from threads that acted as SUNRPC
    servers and had a related svc_serv_ops structure.  This was correct.
    
    It ALSO removed the module_put_and_kthread_exit() call from
    nfs4_run_state_manager() which is NOT a SUNRPC service.
    
    Consequently every time the NFSv4 state manager runs the module count
    increments and won't be decremented.  So the nfsv4 module cannot be
    unloaded.
    
    So restore the module_put_and_kthread_exit() call.
    
    Fixes: f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: switch the callback service back to non-pooled. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    NFS: switch the callback service back to non-pooled.
    
    [ Upstream commit 23a1a573c61ccb5e7829c1f5472d3e025293a031 ]
    
    Now that thread management is consistent there is no need for
    nfs-callback to use svc_create_pooled() as introduced in Commit
    df807fffaabd ("NFSv4.x/callback: Create the callback service through
    svc_create_pooled").  So switch back to svc_create().
    
    If service pools were configured, but the number of threads was left at
    '1', nfs callback may not work reliably when svc_create_pooled() is used.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfs: use change attribute for NFS re-exports [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Jan 29 14:26:29 2021 -0500

    nfs: use change attribute for NFS re-exports
    
    [ Upstream commit 3cc55f4434b421d37300aa9a167ace7d60b45ccf ]
    
    When exporting NFS, we may as well use the real change attribute
    returned by the original server instead of faking up a change attribute
    from the ctime.
    
    Note we can't do that by setting I_VERSION--that would also turn on the
    logic in iversion.h which treats the lower bit specially, and that
    doesn't make sense for NFS.
    
    So instead we define a new export operation for filesystems like NFS
    that want to manage the change attribute themselves.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Linux: NFSD add vfs_fsync after async copy is done [+ + +]

Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Wed May 19 14:48:27 2021 -0400

    NFSD add vfs_fsync after async copy is done
    
    [ Upstream commit eac0b17a77fbd763d305a5eaa4fd1119e5a0fe0d ]
    
    Currently, the server does all copies as NFS_UNSTABLE. For synchronous
    copies the Linux client will append a COMMIT to the COPY compound, but for
    async copies it does not (because COMMIT needs to be done after all
    bytes are copied and not as a reply to the COPY operation).
    
    However, in order to save the client doing a COMMIT as a separate
    RPC, the server can reply with an NFS_FILE_SYNC copy. This patch
    adds a vfs_fsync() call at the end of the async copy.
    
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Linux: NFSD enforce filehandle check for source file in COPY [+ + +]

Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Fri Aug 19 15:16:36 2022 -0400

    NFSD enforce filehandle check for source file in COPY
    
    [ Upstream commit 754035ff79a14886e68c0c9f6fa80adb21f12b53 ]
    
    If the passed in filehandle for the source file in the COPY operation
    is not a regular file, the server MUST return NFS4ERR_WRONG_TYPE.
    
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd/nfs3: remove unused macro nfsd3_fhandleres [+ + +]

Author: Alex Shi <alexs@kernel.org>
Date:   Fri Nov 6 13:40:57 2020 +0800

    nfsd/nfs3: remove unused macro nfsd3_fhandleres
    
    [ Upstream commit 71fd721839a74d945c242299f6be29a246fc2131 ]
    
    The macro is unused, remove it to tame gcc warning:
    fs/nfsd/nfs3proc.c:702:0: warning: macro "nfsd3_fhandleres" is not used
    [-Wunused-macros]
    
    Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
    Cc: "J. Bruce Fields" <bfields@fieldses.org>
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd4: add refcount for nfsd4_blocked_lock [+ + +]

Author: Vasily Averin <vasily.averin@linux.dev>
Date:   Fri Dec 17 09:49:39 2021 +0300

    nfsd4: add refcount for nfsd4_blocked_lock
    
    [ Upstream commit 47446d74f1707049067fee038507cdffda805631 ]
    
    nbl allocated in nfsd4_lock can be released in several ways:
    directly in nfsd4_lock(), via nfs4_laundromat(), via another nfs
    command RELEASE_LOCKOWNER or via nfsd4_callback.
    This structure should be refcounted to be used and released correctly
    in all these cases.
    
    Refcount is initialized to 1 during allocation and is incremented
    when nbl is added into nbl_list/nbl_lru lists.
    
    Usually nbl is linked into both lists together, so only one refcount
    is used for both lists.
    
    However, nfsd4_lock() should keep in mind that nbl can be present
    in only one of the lists. This can happen if nbl was already handled
    by nfs4_laundromat/nfsd4_callback/etc.
    
    Refcount is decremented if vfs_lock_file() returns FILE_LOCK_DEFERRED,
    because nbl can be handled already by nfs4_laundromat/nfsd4_callback/etc.
    
    Refcount is not changed in find_blocked_lock() because it reuses the
    reference released when nbl is removed from the lists.
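    
    The counting rules above can be sketched as follows. This is a
    single-threaded illustration with hypothetical sketch_* names; the
    real structure sits on two lists protected by a lock, which this
    sketch collapses into one on_lists flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blocked-lock object. */
struct sketch_nbl {
    atomic_int refcount;
    bool on_lists;      /* both lists share a single reference */
};

/* Allocation hands the caller the initial reference. */
static struct sketch_nbl *sketch_nbl_alloc(void)
{
    struct sketch_nbl *nbl = calloc(1, sizeof(*nbl));

    if (nbl)
        atomic_init(&nbl->refcount, 1);
    return nbl;
}

/* Drop one reference; returns true if the object was freed. */
static bool sketch_nbl_put(struct sketch_nbl *nbl)
{
    int old = atomic_fetch_sub(&nbl->refcount, 1);

    assert(old > 0);
    if (old == 1) {
        free(nbl);
        return true;
    }
    return false;
}

/* Linking into the lists takes one extra reference for the lists. */
static void sketch_nbl_enlist(struct sketch_nbl *nbl)
{
    atomic_fetch_add(&nbl->refcount, 1);
    nbl->on_lists = true;
}

/* Unlinking does not touch the count: a caller that gets true inherits
 * the lists' reference and must eventually drop it. */
static bool sketch_nbl_delist(struct sketch_nbl *nbl)
{
    if (!nbl->on_lists)
        return false;
    nbl->on_lists = false;
    return true;
}
```

    Whichever path unlinks the object first owns the lists' reference,
    so a second path that finds it already unlinked has nothing to drop.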
    
    Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd4: don't query change attribute in v2/v3 case [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Nov 30 17:46:17 2020 -0500

    nfsd4: don't query change attribute in v2/v3 case
    
    [ Upstream commit 942b20dc245590327ee0187c15c78174cd96dd52 ]
    
    inode_query_iversion() has side effects, and there's no point calling it
    when we're not even going to use it.
    
    We check whether we're currently processing a v4 request by checking
    fh_maxsize, which is arguably a little hacky; we could add a flag to
    svc_fh instead.
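    
    To illustrate why a query with side effects is worth skipping, here
    is a simplified, non-atomic sketch of a change counter whose lowest
    bit records that the value has been observed. The sketch_* names
    and the is_v4 parameter are hypothetical; the sketch only assumes
    that a query marks the counter so that the next modification has to
    bump it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_QUERIED 1ull   /* lowest bit: the value has been observed */

struct sketch_inode {
    uint64_t raw_version;     /* counter in the upper bits, flag in bit 0 */
};

/* Reading the counter is not free: it sets the "queried" flag. */
static uint64_t sketch_query_version(struct sketch_inode *inode)
{
    inode->raw_version |= SKETCH_QUERIED;
    return inode->raw_version >> 1;
}

/* A modification bumps the counter only if someone queried it since the
 * last bump; returns whether a bump happened. */
static bool sketch_maybe_inc_version(struct sketch_inode *inode)
{
    if (!(inode->raw_version & SKETCH_QUERIED))
        return false;
    inode->raw_version = (inode->raw_version + 2) & ~SKETCH_QUERIED;
    return true;
}

/* Only requests that will use the change attribute pay for the query. */
static uint64_t sketch_change_attr(struct sketch_inode *inode, bool is_v4)
{
    return is_v4 ? sketch_query_version(inode) : 0;
}
```

    With the query skipped for non-v4 requests, later modifications do
    not need to bump the counter on their behalf.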
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd4: Expose the callback address and state of each NFS4 client [+ + +]

Author: Dave Wysochanski <dwysocha@redhat.com>
Date:   Wed Jun 2 13:51:39 2021 -0400

    nfsd4: Expose the callback address and state of each NFS4 client
    
    [ Upstream commit 3518c8666f15cdd5d38878005dab1d589add1c19 ]
    
    In addition to the client's address, display the callback channel
    state and address in the 'info' file.
    
    Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd4: remove obselete comment [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue Oct 26 12:56:55 2021 -0400

    nfsd4: remove obselete comment
    
    [ Upstream commit 80479eb862102f9513e93fcf726c78cc0be2e3b2 ]
    
    Mandatory locking has been removed.  And the rest of this comment is
    redundant with the code.
    
    Reported-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd4: simplify process_lookup1 [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:37 2021 -0500

    nfsd4: simplify process_lookup1
    
    [ Upstream commit 33311873adb0d55c287b164117b5b4bb7b1bdc40 ]
    
    This STALE_CLIENTID check is redundant with the one in
    lookup_clientid().
    
    There's a difference in behavior in case of memory allocation
    failure, which I think isn't a big deal.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: A semicolon is not needed after a switch statement. [+ + +]

Author: Tom Rix <trix@redhat.com>
Date:   Sun Nov 1 07:32:34 2020 -0800

    NFSD: A semicolon is not needed after a switch statement.
    
    [ Upstream commit 25fef48bdbe7cac5ba5577eab6a750e1caea43bc ]
    
    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a couple more nfsd_clid_expired call sites [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:06 2021 -0400

    NFSD: Add a couple more nfsd_clid_expired call sites
    
    [ Upstream commit 2958d2ee71021b6c44212ec6c2a39cc71d9cd4a9 ]
    
    Improve observation of NFSv4 lease expiry.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a helper that encodes NFSv3 directory offset cookies [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 13 16:53:17 2020 -0500

    NFSD: Add a helper that encodes NFSv3 directory offset cookies
    
    [ Upstream commit d52532002ffa217ad3fa4c3ba86c95203d21dd21 ]
    
    Refactor: Add helper function similar to nfs3svc_encode_cookie3().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a helper that encodes NFSv3 directory offset cookies [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 10 09:57:14 2020 -0500

    NFSD: Add a helper that encodes NFSv3 directory offset cookies
    
    [ Upstream commit a161e6c76aeba835e475a2f27dbbe5c37e565e94 ]
    
    Refactor: De-duplicate identical code that handles encoding of
    directory offset cookies across page boundaries.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a helper to decode channel_attrs4 [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 15:35:05 2020 -0500

    NFSD: Add a helper to decode channel_attrs4
    
    [ Upstream commit 3a3f1fbacb0960b628e5a9f07c78287312f7a99d ]
    
    De-duplicate some code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a helper to decode nfs_impl_id4 [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 15:21:55 2020 -0500

    NFSD: Add a helper to decode nfs_impl_id4
    
    [ Upstream commit 10ff84228197f47401833495ba19a50131323b4a ]
    
    Refactor for clarity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a helper to decode state_protect4_a [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 2 15:19:12 2020 -0500

    NFSD: Add a helper to decode state_protect4_a
    
    [ Upstream commit 523ec6ed6fb80fd1537d748a06bffd060a8b3235 ]
    
    Refactor for clarity.
    
    Also, remove a stale comment. Commit ed94164398c9 ("nfsd: implement
    machine credential support for some operations") added support for
    SP4_MACH_CRED, so state_protect_a is no longer completely ignored.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a mechanism to wait for a DELEGRETURN [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:14:00 2022 -0400

    NFSD: Add a mechanism to wait for a DELEGRETURN
    
    [ Upstream commit c035362eb935fe9381d9d1cc453bc2a37460e24c ]
    
    Subsequent patches will use this mechanism to wake up an operation
    that is waiting for a client to return a delegation.
    
    The new tracepoint records whether the wait timed out or was
    properly awoken by the expected DELEGRETURN:
    
                nfsd-1155  [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out)
    
    Suggested-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations [+ + +]

Author: Jeff Layton <jeff.layton@primarydata.com>
Date:   Mon Nov 30 17:03:14 2020 -0500

    nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations
    
    [ Upstream commit daab110e47f8d7aa6da66923e3ac1a8dbd2b2a72 ]
    
    With NFSv3 nfsd will always attempt to send along WCC data to the
    client. This generally involves saving off the in-core inode information
    prior to doing the operation on the given filehandle, and then issuing a
    vfs_getattr to it after the op.
    
    Some filesystems (particularly clustered or networked ones) have an
    expensive ->getattr inode operation. Atomicity is also often difficult
    or impossible to guarantee on such filesystems. For those, we're best
    off not trying to provide WCC information to the client at all, and to
    simply allow it to poll for that information as needed with a GETATTR
    RPC.
    
    This patch adds a new flags field to struct export_operations, and
    defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate
    that nfsd should not attempt to provide WCC info in NFSv3 replies. It
    also adds a blurb about the new flags field and flag to the exporting
    documentation.
    
    The server will also now skip collecting this information for NFSv2 as
    well, since that info is never used there anyway.
    
    Note that this patch does not add this flag to any filesystem
    export_operations structures. This was originally developed to allow
    reexporting nfs via nfsd.
    
    Other filesystems may want to consider enabling this flag too. It's hard
    to tell however which ones have export operations to enable export via
    knfsd and which ones mostly rely on them for open-by-filehandle support,
    so I'm leaving that up to the individual maintainers to decide. I am
    cc'ing the relevant lists for those filesystems that I think may want to
    consider adding this though.
    
    Cc: HPDD-discuss@lists.01.org
    Cc: ceph-devel@vger.kernel.org
    Cc: cluster-devel@redhat.com
    Cc: fuse-devel@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
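
The flag check described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: the struct name, the helper name, and the bit value are assumptions made for the example, and the real definitions live in the kernel's exportfs header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value, chosen for this sketch only. */
#define SKETCH_EXPORT_OP_NOWCC 0x1UL

/* Stand-in for struct export_operations, reduced to the new flags field. */
struct sketch_export_operations {
	unsigned long flags;
};

/*
 * Returns true when the server should gather pre-op and post-op
 * attributes (WCC data) for a reply, false when the filesystem has
 * opted out by setting the NOWCC flag.
 */
static bool sketch_wants_wcc(const struct sketch_export_operations *ops)
{
	assert(ops != NULL);
	return !(ops->flags & SKETCH_EXPORT_OP_NOWCC);
}
```

A filesystem with an expensive ->getattr would set the flag in its operations structure, and the caller would skip both the pre-op save and the post-op vfs_getattr when the helper returns false.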

NFSD: Add a nfsd4_file_hash_remove() helper [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:34 2022 -0400

    NFSD: Add a nfsd4_file_hash_remove() helper
    
    [ Upstream commit 3341678f2fd6106055cead09e513fad6950a0d19 ]
    
    Refactor to relocate hash deletion operation to a helper function
    that is close to most other nfs4_file data structure operations.
    
    The "noinline" annotation will become useful in a moment when the
    hlist_del_rcu() is replaced with a more complex rhash remove
    operation. It also guarantees that hash remove operations can be
    traced with "-p function -l remove_nfs4_file_locked".
    
    This also simplifies the organization of forward declarations: the
    to-be-added rhashtable and its param structure will be defined
    /after/ put_nfs4_file().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a separate decoder for ssv_sp_parms [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 11:17:50 2020 -0500

    NFSD: Add a separate decoder for ssv_sp_parms
    
    [ Upstream commit 547bfeb4cd8d491aabbd656d5a6f410cb4249b4e ]
    
    Refactor for clarity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a separate decoder to handle state_protect_ops [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 11:13:00 2020 -0500

    NFSD: Add a separate decoder to handle state_protect_ops
    
    [ Upstream commit 2548aa784d760567c2a77cbd8b7c55b211167c37 ]
    
    Refactor for clarity and de-duplication of code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Add a tracepoint for errors in nfsd4_clone_file_range() [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sat Dec 18 20:38:00 2021 -0500

    nfsd: Add a tracepoint for errors in nfsd4_clone_file_range()
    
    [ Upstream commit a2f4c3fa4db94ba44d32a72201927cfd132a8e82 ]
    
    Since a clone error commit can cause the boot verifier to change,
    we should trace those errors.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    [ cel: Addressed a checkpatch.pl splat in fs/nfsd/vfs.h ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add a tracepoint to record directory entry encoding [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Mar 5 13:57:40 2021 -0500

    NFSD: Add a tracepoint to record directory entry encoding
    
    [ Upstream commit 6019ce0742ca55d3e45279a19b07d1542747a098 ]
    
    Enable watching the progress of directory encoding to capture the
    timing of any issues with reading or encoding a directory. The
    new tracepoint captures dirent encoding for all NFS versions.
    
    For example, here's what a few NFSv4 directory entries might look
    like:
    
    nfsd-989   [002]   468.596265: nfsd_dirent:          fh_hash=0x5d162594 ino=2 name=.
    nfsd-989   [002]   468.596267: nfsd_dirent:          fh_hash=0x5d162594 ino=1 name=..
    nfsd-989   [002]   468.596299: nfsd_dirent:          fh_hash=0x5d162594 ino=3827 name=zlib.c
    nfsd-989   [002]   468.596325: nfsd_dirent:          fh_hash=0x5d162594 ino=3811 name=xdiff
    nfsd-989   [002]   468.596351: nfsd_dirent:          fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h
    nfsd-989   [002]   468.596377: nfsd_dirent:          fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd4_encode_nfstime4() helper [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jun 12 10:13:39 2023 -0400

    NFSD: Add an nfsd4_encode_nfstime4() helper
    
    [ Upstream commit 262176798b18b12fd8ab84c94cfece0a6a652476 ]
    
    Clean up: de-duplicate some common code.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Acked-by: Tom Talpey <tom@talpey.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd4_read::rd_eof field [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:08:57 2022 -0400

    NFSD: Add an nfsd4_read::rd_eof field
    
    [ Upstream commit 24c7fb85498eda1d4c6b42cc4886328429814990 ]
    
    Refactor: Make the EOF result available in the entire NFSv4 READ
    path.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd_cb_lm_notify tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:08 2021 -0400

    NFSD: Add an nfsd_cb_lm_notify tracepoint
    
    [ Upstream commit 2cde7f8118f0fea29ad73ddcf28817f95adeffd5 ]
    
    When the server kicks off a CB_LM_NOTIFY callback, record its
    arguments so we can better observe asynchronous locking behavior.
    For example:
    
                nfsd-998   [002]  1471.705873: nfsd_cb_notify_lock:  addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8950b23a
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Cc: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd_cb_offload tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:14 2021 -0400

    NFSD: Add an nfsd_cb_offload tracepoint
    
    [ Upstream commit 87512386e951ee28ba2e7ef32b843ac97621d371 ]
    
    Record the arguments of CB_OFFLOAD callbacks so we can better
    observe asynchronous copy-offload behavior. For example:
    
    nfsd-995   [008]  7721.934222: nfsd_cb_offload:
            addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8739113a
            count=116528 status=0
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Cc: Olga Kornievskaia <kolga@netapp.com>
    Cc: Dai Ngo <Dai.Ngo@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd_cb_probe tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:26 2021 -0400

    NFSD: Add an nfsd_cb_probe tracepoint
    
    [ Upstream commit 4ade892ae1c35527584decb7fa026553d53cd03f ]
    
    Record a tracepoint event when the server performs a callback
    probe. This event can be enabled as a group with other nfsd_cb
    tracepoints.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an nfsd_file_fsync tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 3 16:22:48 2022 -0400

    NFSD: Add an nfsd_file_fsync tracepoint
    
    [ Upstream commit d7064eaf688cfe454c50db9f59298463d80d403c ]
    
    Add a tracepoint to capture the number of filecache-triggered fsync
    calls and which files needed it. Also, record when an fsync triggers
    a write verifier reset.
    
    Examples:
    
    <...>-97    [007]   262.505611: nfsd_file_free:       inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400
    <...>-97    [007]   262.505612: nfsd_file_fsync:      inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400 ret=0
    <...>-97    [007]   262.505623: nfsd_file_free:       inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00
    <...>-97    [007]   262.505624: nfsd_file_fsync:      inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00 ret=0
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:46:51 2022 -0400

    NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
    
    [ Upstream commit 4d1ea8455716ca070e3cd85767e6f6a562a58b1b ]
    
    NFSv4 operations manage the lifetime of nfsd_file items they use by
    means of NFSv4 OPEN and CLOSE. Hence there's no need for them to be
    garbage collected.
    
    Introduce a mechanism to enable garbage collection for nfsd_file
    items used only by NFSv2/3 callers.
    
    Note that the change in nfsd_file_put() ensures that both CLOSE and
    DELEGRETURN will actually close out and free an nfsd_file on last
    reference of a non-garbage-collected file.
    
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=394
    Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an RPC authflavor tracepoint display helper [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:23 2021 -0400

    NFSD: Add an RPC authflavor tracepoint display helper
    
    [ Upstream commit 87b2394d60c32c158ebb96ace4abee883baf1239 ]
    
    To be used in subsequent patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an xdr_stream-based decoder for NFSv2/3 ACLs [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 10:38:46 2020 -0500

    NFSD: Add an xdr_stream-based decoder for NFSv2/3 ACLs
    
    [ Upstream commit 6bb844b4eb6e3b109a2fdaffb60e6da722dc4356 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 14:55:05 2020 -0500

    NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs
    
    [ Upstream commit 8edc0648880a151026fe625fa1b76772b5766f68 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add cb_lost tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:43 2021 -0400

    NFSD: Add cb_lost tracepoint
    
    [ Upstream commit 806d65b617d89be887fe68bfa051f78143669cd7 ]
    
    Provide more clarity about when the callback channel is in trouble.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add common helpers to decode void args and encode void results [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 5 14:48:29 2020 -0500

    NFSD: Add common helpers to decode void args and encode void results
    
    [ Upstream commit 788f7183fba86b46074c16e7d57ea09302badff4 ]
    
    Start off the conversion to xdr_stream by de-duplicating the functions
    that decode void arguments and encode void results.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add courteous server support for thread with only delegation [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:21 2022 -0700

    NFSD: add courteous server support for thread with only delegation
    
    [ Upstream commit 66af25799940b26efd41ea6e648f75c41a48a2c2 ]
    
    This patch provides courteous server support for delegation only.
    Only an expired client with delegation but no conflict and no open
    or lock state is allowed to be in COURTESY state.
    
    A delegation conflict with a COURTESY/EXPIRABLE client is resolved by
    setting it to EXPIRABLE, queuing work for the laundromat, and returning
    delay to the caller. The conflict is resolved when the laundromat runs
    and expires the EXPIRABLE client while the NFS client retries the
    OPEN request. A local thread request that gets a conflict does the
    retry in _break_lease.
    
    A client in COURTESY or EXPIRABLE state is allowed to reconnect and
    continues to have access to its state. Access to the nfs4_client by
    the reconnecting thread and the laundromat is serialized via the
    client_lock.
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
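
The conflict handling described in this commit message can be pictured as a small state transition. The sketch below is a user-space illustration under stated assumptions: the enum, struct, and function names are hypothetical, locking via client_lock is omitted, and a boolean stands in for queuing laundromat work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the kernel's own state constants differ. */
enum sketch_client_state {
	SKETCH_ACTIVE,
	SKETCH_COURTESY,
	SKETCH_EXPIRABLE,
};

struct sketch_courtesy_client {
	enum sketch_client_state state;
	bool laundromat_queued;	/* stand-in for queued laundromat work */
};

/*
 * Called when a delegation held by clp conflicts with a new request.
 * Returns true when the caller should be told to retry later ("delay"),
 * which is the case for COURTESY and EXPIRABLE clients. An ACTIVE
 * client is left untouched and the function returns false.
 */
static bool sketch_deleg_conflict(struct sketch_courtesy_client *clp)
{
	assert(clp != NULL);
	if (clp->state != SKETCH_COURTESY && clp->state != SKETCH_EXPIRABLE)
		return false;
	clp->state = SKETCH_EXPIRABLE;
	clp->laundromat_queued = true;
	return true;
}
```

The retried OPEN is expected to succeed once the laundromat has run and expired the EXPIRABLE client.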

NFSD: add delegation reaper to react to low memory condition [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Nov 16 19:44:47 2022 -0800

    NFSD: add delegation reaper to react to low memory condition
    
    [ Upstream commit 44df6f439a1790a5f602e3842879efa88f346672 ]
    
    The delegation reaper is called by the nfsd memory shrinker's
    'count' callback. It scans the client list and sends the
    courtesy CB_RECALL_ANY to the clients that hold delegations.
    
    To avoid flooding the clients with CB_RECALL_ANY requests, the
    delegation reaper sends only one CB_RECALL_ANY request to each
    client per 5 seconds.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    [ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
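
The five-second limit can be illustrated with a per-client timestamp check. This is a minimal user-space sketch, assuming hypothetical names throughout and wall-clock seconds in place of the kernel's time source. It is not the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Interval taken from the commit message; the macro name is made up. */
#define SKETCH_RECALL_INTERVAL_SEC 5

struct sketch_deleg_client {
	bool   recall_sent;	/* false until the first recall goes out */
	time_t last_recall;	/* time of the most recent recall */
};

/*
 * Returns true and records the time when a CB_RECALL_ANY may be sent
 * to this client now. Returns false when one was already sent less
 * than SKETCH_RECALL_INTERVAL_SEC seconds ago.
 */
static bool sketch_may_send_recall(struct sketch_deleg_client *clp, time_t now)
{
	assert(clp != NULL);
	if (clp->recall_sent &&
	    now - clp->last_recall < SKETCH_RECALL_INTERVAL_SEC)
		return false;
	clp->recall_sent = true;
	clp->last_recall = now;
	return true;
}
```

A reaper walking the client list would call the helper once per client that holds delegations and skip any client for which it returns false.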

NFSD: Add documenting comment for nfsd4_release_lockowner() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun May 22 12:34:38 2022 -0400

    NFSD: Add documenting comment for nfsd4_release_lockowner()
    
    [ Upstream commit 043862b09cc00273e35e6c3a6389957953a34207 ]
    
    And return explicit nfserr values that match what is documented in the
    new comment / API contract.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Add errno mapping for EREMOTEIO [+ + +]

Author: Jeff Layton <jeff.layton@primarydata.com>
Date:   Sat Dec 18 20:37:55 2021 -0500

    nfsd: Add errno mapping for EREMOTEIO
    
    [ Upstream commit a2694e51f60c5a18c7e43d1a9feaa46d7f153e65 ]
    
    The NFS client can occasionally return EREMOTEIO when signalling issues
    with the server.  ...map to NFSERR_IO.
    
    Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
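
An errno-to-NFS-status mapping of this kind is commonly written as a lookup table. The sketch below is a user-space illustration only: the table contents, names, and the fallback for unknown values are assumptions of the example, and the status numbers are the NFSv3 values from RFC 1813.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, used only if <errno.h> lacks it */
#endif

/* NFSv3 status numbers (RFC 1813), in host byte order for simplicity. */
enum {
	SKETCH_NFS_OK       = 0,
	SKETCH_NFSERR_PERM  = 1,
	SKETCH_NFSERR_NOENT = 2,
	SKETCH_NFSERR_IO    = 5,
};

/* Hypothetical, heavily shortened mapping table. */
static const struct {
	int nfserr;
	int syserr;
} sketch_errtbl[] = {
	{ SKETCH_NFS_OK,        0 },
	{ SKETCH_NFSERR_PERM,  -EPERM },
	{ SKETCH_NFSERR_NOENT, -ENOENT },
	{ SKETCH_NFSERR_IO,    -EIO },
	{ SKETCH_NFSERR_IO,    -EREMOTEIO },	/* the new mapping */
};

/* Map a negative errno to an NFS status; unknown values become IO here. */
static int sketch_nfserrno(int syserr)
{
	size_t i;

	assert(syserr <= 0);
	for (i = 0; i < sizeof(sketch_errtbl) / sizeof(sketch_errtbl[0]); i++)
		if (sketch_errtbl[i].syserr == syserr)
			return sketch_errtbl[i].nfserr;
	return SKETCH_NFSERR_IO;
}
```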

NFSD: Add helper for decoding locker4 [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:16:52 2020 -0500

    NFSD: Add helper for decoding locker4
    
    [ Upstream commit 8918cc0d2b72db9997390626010b182c4500d749 ]
    
    Refactor for clarity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to decode NFSv4 verifiers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:34:01 2020 -0500

    NFSD: Add helper to decode NFSv4 verifiers
    
    [ Upstream commit 796dd1c6b680959ac968b52aa507911b288b1749 ]
    
    This helper will be used to simplify decoders in subsequent
    patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to decode OPEN's createhow4 argument [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:37:42 2020 -0500

    NFSD: Add helper to decode OPEN's createhow4 argument
    
    [ Upstream commit bf33bab3c4182cdd795983f14de5606e82fab377 ]
    
    Refactor for clarity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to decode OPEN's open_claim4 argument [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:45:04 2020 -0500

    NFSD: Add helper to decode OPEN's open_claim4 argument
    
    [ Upstream commit 1708e50b0145f393acbec9e319bdf0e33f765d25 ]
    
    Refactor for clarity.
    
    Note that op_fname is the only instance of an NFSv4 filename stored
    in a struct xdr_netobj. Convert it to a u32/char * pair so that the
    new nfsd4_decode_filename() helper can be used.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to decode OPEN's openflag4 argument [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:41:21 2020 -0500

    NFSD: Add helper to decode OPEN's openflag4 argument
    
    [ Upstream commit e6ec04b27bfb4869c0e35fbcf24333d379f101d5 ]
    
    Refactor for clarity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to set up the pages where the dirlist is encoded [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 09:50:23 2020 -0500

    NFSD: Add helper to set up the pages where the dirlist is encoded
    
    [ Upstream commit 40116ebd0934cca7e46423bdb3397d3d27eb9fb9 ]
    
    De-duplicate some code that is used by both READDIR and READDIRPLUS
    to build the dirlist in the Reply. Because this code is not related
    to decoding READ arguments, it is moved to a more appropriate spot.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helper to set up the pages where the dirlist is encoded [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 13 17:03:49 2020 -0500

    NFSD: Add helper to set up the pages where the dirlist is encoded
    
    [ Upstream commit 788cd46ecf83ee2d561cb4e754e276dc8089b787 ]
    
    Add a helper similar to nfsd3_init_dirlist_pages().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add helpers to decode a clientid4 and an NFSv4 state owner [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:25:02 2020 -0500

    NFSD: Add helpers to decode a clientid4 and an NFSv4 state owner
    
    [ Upstream commit 144e82694092ff80b5e64749d6822cd8947587f2 ]
    
    These helpers will also be used to simplify decoders in subsequent
    patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd4_send_cb_offload() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:41:12 2022 -0400

    NFSD: Add nfsd4_send_cb_offload()
    
    [ Upstream commit e72f9bc006c08841c46d27747a4debc747a8fe13 ]
    
    Refactor for legibility.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_clid_confirmed tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:48 2021 -0400

    NFSD: Add nfsd_clid_confirmed tracepoint
    
    [ Upstream commit 7e3b32ace6094aadfa2e1e54ca4c6bbfd07646af ]
    
    This replaces a dprintk call site in order to get greater visibility
    on when client IDs are confirmed or re-used. Simple example:
    
                nfsd-995   [000]   126.622975: nfsd_compound:        xid=0x3a34e2b1 opcnt=1
                nfsd-995   [000]   126.623005: nfsd_cb_args:         addr=192.168.2.51:45901 client 60958e3b:9213ef0e prog=1073741824 ident=1
                nfsd-995   [000]   126.623007: nfsd_compound_status: op=1/1 OP_SETCLIENTID status=0
                nfsd-996   [001]   126.623142: nfsd_compound:        xid=0x3b34e2b1 opcnt=1
      >>>>      nfsd-996   [001]   126.623146: nfsd_clid_confirmed:  client 60958e3b:9213ef0e
                nfsd-996   [001]   126.623148: nfsd_cb_probe:        addr=192.168.2.51:45901 client 60958e3b:9213ef0e state=UNKNOWN
                nfsd-996   [001]   126.623154: nfsd_compound_status: op=1/1 OP_SETCLIENTID_CONFIRM status=0
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_clid_cred_mismatch tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:29 2021 -0400

    NFSD: Add nfsd_clid_cred_mismatch tracepoint
    
    [ Upstream commit 27787733ef44332fce749aa853f2749d141982b0 ]
    
    Record when a client tries to establish a lease record but uses an
    unexpected credential. This is often a sign of a configuration
    problem.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_clid_destroyed tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:00 2021 -0400

    NFSD: Add nfsd_clid_destroyed tracepoint
    
    [ Upstream commit c41a9b7a906fb872f8b2b1a34d2a1d5ef7f94adb ]
    
    Record client-requested termination of client IDs.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_clid_reclaim_complete tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:54 2021 -0400

    NFSD: Add nfsd_clid_reclaim_complete tracepoint
    
    [ Upstream commit cee8aa074281e5269d8404be2b6388bb29ea8efc ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_clid_verf_mismatch tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:36 2021 -0400

    NFSD: Add nfsd_clid_verf_mismatch tracepoint
    
    [ Upstream commit 744ea54c869cebe41fbad5f53f8a8ca5d93a5c97 ]
    
    Record when a client presents a different boot verifier than the
    one we know about. Typically this is a sign the client has
    rebooted, but sometimes it signals a conflicting client ID, which
    the client's administrator will need to address.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add nfsd_file_lru_dispose_list() helper [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:18 2022 -0400

    NFSD: Add nfsd_file_lru_dispose_list() helper
    
    [ Upstream commit 0bac5a264d9a923f5b01f3521e1519a8d0358342 ]
    
    Refactor the invariant part of nfsd_file_lru_walk_list() into a
    separate helper function.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add posix ACLs to struct nfsd_attrs [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: add posix ACLs to struct nfsd_attrs
    
    [ Upstream commit c0cbe70742f4a70893cd6e5f6b10b6e89b6db95b ]
    
    pacl and dpacl pointers are added to struct nfsd_attrs, which requires
    that we have an nfsd_attrs_free() function to free them.
    Those nfsv4 functions that can set ACLs now set up these pointers
    based on the passed in NFSv4 ACL.
    
    nfsd_setattr() sets the acls as appropriate.
    
    Errors are handled as with security labels.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add security label to struct nfsd_attrs [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: add security label to struct nfsd_attrs
    
    [ Upstream commit d6a97d3f589a3a46a16183e03f3774daee251317 ]
    
    nfsd_setattr() now sets a security label if provided, and nfsv4 provides
    it in the 'open' and 'create' paths and the 'setattr' path.
    If setting the label failed (including because the kernel doesn't
    support labels), an error field in 'struct nfsd_attrs' is set, and the
    caller can respond.  The open/create callers clear
    FATTR4_WORD2_SECURITY_LABEL in the returned attr set in this case.
    The setattr caller returns the error.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add shrinker to reap courtesy clients on low memory condition [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Sep 14 08:54:26 2022 -0700

    NFSD: add shrinker to reap courtesy clients on low memory condition
    
    [ Upstream commit 7746b32f467b3813fb61faaab3258de35806a7ac ]
    
    Add courtesy_client_reaper to react to low memory condition triggered
    by the system memory shrinker.
    
    The delayed_work for the courtesy_client_reaper is scheduled on
    the shrinker's count callback using the laundry_wq.
    
    The shrinker's scan callback is not used for expiring the courtesy
    clients due to potential deadlocks.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    [ cel: adjusted to apply without e33c267ab70d ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: add some comments to nfsd_file_do_acquire [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Jan 5 07:15:12 2023 -0500

    nfsd: add some comments to nfsd_file_do_acquire
    
    [ Upstream commit b680cb9b737331aad271feebbedafb865504e234 ]
    
    David Howells mentioned that he found this bit of code confusing, so
    sprinkle in some comments to clarify.
    
    Reported-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add SPDX header for fs/nfsd/trace.c [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Aug 27 16:09:53 2020 -0400

    NFSD: Add SPDX header for fs/nfsd/trace.c
    
    [ Upstream commit f45a444cfe582b85af937a30d35d68d9a84399dd ]
    
    Clean up.
    
    The file was contributed in 2014 by Christoph Hellwig in commit
    31ef83dc0538 ("nfsd: add trace events").
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add support for lock conflict to courteous server [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:26 2022 -0700

    NFSD: add support for lock conflict to courteous server
    
    [ Upstream commit 27431affb0dbc259ac6ffe6071243a576c8f38f1 ]
    
    This patch allows an expired client with lock state to be in the
    COURTESY state. A lock conflict with a COURTESY client is resolved by
    the fs/lock code using the lm_lock_expirable and lm_expire_lock
    callbacks in struct lock_manager_operations.
    
    If the conflicting client is in the COURTESY state, set it to
    EXPIRABLE and schedule the laundromat to run immediately to expire
    the client. The lm_expire_lock callback waits for the laundromat to
    flush its work queue before returning to the caller.
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add support for sending CB_RECALL_ANY [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Nov 16 19:44:46 2022 -0800

    NFSD: add support for sending CB_RECALL_ANY
    
    [ Upstream commit 3959066b697b5dfbb7141124ae9665337d4bc638 ]
    
    Add XDR encode and decode function for CB_RECALL_ANY.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: add support for share reservation conflict to courteous server [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:22 2022 -0700

    NFSD: add support for share reservation conflict to courteous server
    
    [ Upstream commit 3d69427151806656abf129342028f3f4e5e1fee0 ]
    
    This patch allows an expired client with open state to be in the
    COURTESY state. A share/access conflict with a COURTESY client is
    resolved by setting the COURTESY client to the EXPIRABLE state,
    scheduling the laundromat to run, and returning nfserr_jukebox to
    the requesting client.
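    
    The conflict handling described above can be sketched in userspace C.
    This is only an illustration: every sketch_* name is hypothetical, the
    laundromat is reduced to a flag, and the behavior for a still-active
    holder is an assumption of the sketch, not something this commit
    message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the real states and helpers live in fs/nfsd. */
enum sketch_client_state {
	SKETCH_ACTIVE,
	SKETCH_COURTESY,   /* lease expired, state kept as a courtesy */
	SKETCH_EXPIRABLE,  /* marked for the laundromat to expire */
};

enum sketch_status { SKETCH_SHARE_DENIED, SKETCH_JUKEBOX };

struct sketch_client {
	enum sketch_client_state state;
};

/* Stands in for scheduling the laundromat work item. */
static bool sketch_laundromat_scheduled;

/*
 * Called when a new open conflicts with a share reservation held by
 * "holder". Returns the status to send to the requesting client.
 */
static enum sketch_status
sketch_resolve_share_conflict(struct sketch_client *holder)
{
	assert(holder);
	if (holder->state == SKETCH_ACTIVE)
		return SKETCH_SHARE_DENIED;   /* assumption: live client wins */

	if (holder->state == SKETCH_COURTESY)
		holder->state = SKETCH_EXPIRABLE;
	sketch_laundromat_scheduled = true;   /* ask the reaper to run */
	return SKETCH_JUKEBOX;                /* requester should retry later */
}
```

    The requester sees a retryable status rather than a hard denial, which
    gives the laundromat time to expire the courtesy client.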
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Add support for the birth time attribute [+ + +]

Author: Ondrej Valousek <ondrej.valousek.xm@renesas.com>
Date:   Tue Jan 11 13:08:42 2022 +0100

    nfsd: Add support for the birth time attribute
    
    [ Upstream commit e377a3e698fb56cb63f6bddbebe7da76dc37e316 ]
    
    For filesystems that support the "btime" timestamp (most modern
    filesystems do), share it via the kernel nfsd. Btime support for the
    NFS client has already been added by Trond recently.
    
    Suggested-by: Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Ondrej Valousek <ondrej.valousek.xm@renesas.com>
    [ cel: addressed some whitespace/checkpatch nits ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add tracepoints for EXCHANGEID edge cases [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:19 2021 -0400

    NFSD: Add tracepoints for EXCHANGEID edge cases
    
    [ Upstream commit e8f80c5545ec5794644b48537449e48b009d608d ]
    
    Some of the most common cases are traced. Enough infrastructure is
    now in place that more can be added later, as needed.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add tracepoints for SETCLIENTID edge cases [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:13 2021 -0400

    NFSD: Add tracepoints for SETCLIENTID edge cases
    
    [ Upstream commit 237f91c85acef206a33bc02f3c4e856128fd7994 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add tracepoints in nfsd4_decode/encode_compound() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 21 11:36:42 2020 -0500

    NFSD: Add tracepoints in nfsd4_decode/encode_compound()
    
    [ Upstream commit 08281341be8ebc97ee47999812bcf411942baa1e ]
    
    For troubleshooting purposes, record failures to decode NFSv4
    operation arguments and encode operation results.
    
    trace_nfsd_compound_decode_err() replaces the dprintk() call sites
    that are embedded in READ_* macros that are about to be removed.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add tracepoints in nfsd_dispatch() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 13:00:29 2020 -0400

    NFSD: Add tracepoints in nfsd_dispatch()
    
    [ Upstream commit 0dfdad1c1d1b77b9b085f4da390464dd0ac5647a ]
    
    For troubleshooting purposes, record GARBAGE_ARGS and CANT_ENCODE
    failures.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Add tracepoints to report NFSv4 callback completions [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:13:54 2022 -0400

    NFSD: Add tracepoints to report NFSv4 callback completions
    
    [ Upstream commit 1035d65446a018ca2dd179e29a2fcd6d29057781 ]
    
    Wireshark has always been lousy about dissecting NFSv4 callbacks,
    especially NFSv4.0 backchannel requests. Add tracepoints so we
    can surgically capture these events in the trace log.
    
    Tracepoints are time-stamped and ordered so that we can now observe
    the timing relationship between a CB_RECALL Reply and the client's
    DELEGRETURN Call. Example:
    
                nfsd-1153  [002]   211.986391: nfsd_cb_recall:       addr=192.168.1.67:45767 client 62ea82e4:fee7492a stateid 00000003:00000001
    
                nfsd-1153  [002]   212.095634: nfsd_compound:        xid=0x0000002c opcnt=2
                nfsd-1153  [002]   212.095647: nfsd_compound_status: op=1/2 OP_PUTFH status=0
                nfsd-1153  [002]   212.095658: nfsd_file_put:        hash=0xf72 inode=0xffff9291148c7410 ref=3 flags=HASHED|REFERENCED may=READ file=0xffff929103b3ea00
                nfsd-1153  [002]   212.095661: nfsd_compound_status: op=2/2 OP_DELEGRETURN status=0
       kworker/u25:8-148   [002]   212.096713: nfsd_cb_recall_done:  client 62ea82e4:fee7492a stateid 00000003:00000001 status=0
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Adjust cb_shutdown tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:49 2021 -0400

    NFSD: Adjust cb_shutdown tracepoint
    
    [ Upstream commit b200f0e35338b052976b6c5759e4f77a3013e6f6 ]
    
    Show when the upper layer requested a shutdown. RPC tracepoints can
    already show when rpc_shutdown_client() is called.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: allow disabling NFSv2 at compile time [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Oct 18 07:47:56 2022 -0400

    nfsd: allow disabling NFSv2 at compile time
    
    [ Upstream commit 2f3a4b2ac2f28b9be78ad21f401f31e263845214 ]
    
    rpc.nfsd stopped supporting NFSv2 a year ago. Take the next logical
    step toward deprecating it and allow NFSv2 support to be compiled out.
    
    Add a new CONFIG_NFSD_V2 option that can be turned off and rework the
    CONFIG_NFSD_V?_ACL option dependencies. Add a description that
    discourages enabling it.
    
    Also, change the description of CONFIG_NFSD to state that the always-on
    version is now 3 instead of 2.
    
    Finally, add an #ifdef around "case 2:" in __write_versions. When NFSv2
    is disabled at compile time, this should make the kernel ignore attempts
    to disable it at runtime, but still error out when trying to enable it.
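    
    A rough userspace sketch of the "#ifdef around case 2:" idea follows.
    SKETCH_NFSD_V2 and all sketch_* names are hypothetical stand-ins, and
    the table of enabled versions is an assumption made for illustration;
    it is not the code in __write_versions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Build with -DSKETCH_NFSD_V2 to model a kernel that includes NFSv2. */

#define SKETCH_MAX_VERS 3

/* Indexed by version number; only version 3 starts enabled. */
static bool sketch_vers_enabled[SKETCH_MAX_VERS + 1] = { [3] = true };

static bool sketch_version_enabled(unsigned int vers)
{
	assert(vers <= SKETCH_MAX_VERS);
	return sketch_vers_enabled[vers];
}

/* Returns 0 on success, -EINVAL when asked to enable a missing version. */
static int sketch_write_version(unsigned int vers, bool enable)
{
	switch (vers) {
#ifdef SKETCH_NFSD_V2
	case 2:
#endif
	case 3:
		sketch_vers_enabled[vers] = enable;
		return 0;
	default:
		/* Not built in: disabling is a harmless no-op, enabling fails. */
		return enable ? -EINVAL : 0;
	}
}
```

    Built without SKETCH_NFSD_V2, sketch_write_version(2, false) returns 0
    and sketch_write_version(2, true) returns -EINVAL, because version 2
    falls through to the default label.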
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Tom Talpey <tom@talpey.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: allow filesystems to opt out of subtree checking [+ + +]

Author: Jeff Layton <jeff.layton@primarydata.com>
Date:   Mon Nov 30 17:03:15 2020 -0500

    nfsd: allow filesystems to opt out of subtree checking
    
    [ Upstream commit ba5e8187c55555519ae0b63c0fb681391bc42af9 ]
    
    When we start allowing NFS to be reexported, we will have some problems
    with subtree checking. In principle, we could allow it, but it would
    mean encoding parent info in the filehandles, and there may not be
    enough space for that in an NFSv3 filehandle.
    
    To enforce this at export upcall time, we add a new export_ops flag
    that declares the filesystem ineligible for subtree checking.
    
    Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: allow nfsd_file_get to sanely handle a NULL pointer [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jan 6 10:33:47 2023 -0500

    nfsd: allow nfsd_file_get to sanely handle a NULL pointer
    
    [ Upstream commit 70f62231cdfd52357836733dd31db787e0412ab2 ]
    
    ...and remove some now-useless NULL pointer checks in its callers.
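    
    As an illustration only, a NULL-tolerant "get" helper might look like
    the sketch below. The struct and function names are hypothetical, and a
    plain int stands in for the kernel's atomic reference count.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_file {
	int refcount;   /* plain int here; the kernel uses an atomic refcount */
};

/* Take a reference; returns NULL for a NULL argument or a dead object. */
static struct sketch_file *sketch_file_get(struct sketch_file *f)
{
	if (f && f->refcount > 0) {
		f->refcount++;
		return f;
	}
	return NULL;
}

/* A caller no longer needs its own "if (f)" test before the get. */
static int sketch_use(struct sketch_file *maybe_null)
{
	struct sketch_file *f = sketch_file_get(maybe_null);

	if (!f)
		return -1;
	assert(f->refcount >= 2);   /* the original reference plus ours */
	f->refcount--;              /* drop our reference again */
	return 0;
}
```

    Folding the NULL test into the helper keeps each call site to a single
    check on the returned pointer.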
    
    Suggested-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: allow reaping files still under writeback [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Feb 15 06:53:54 2023 -0500

    nfsd: allow reaping files still under writeback
    
    [ Upstream commit dcb779fcd4ed5984ad15991d574943d12a8693d1 ]
    
    On most filesystems, there is no reason to delay reaping an nfsd_file
    just because its underlying inode is still under writeback. nfsd just
    relies on client activity or the local flusher threads to do writeback.
    
    The main exception is NFS, which flushes all of its dirty data on last
    close. Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to
    signal that they do this, and only skip closing files under writeback on
    such filesystems.
    
    Also, remove a redundant NULL file pointer check in
    nfsd_file_check_writeback, and clean up nfs's export op flag
    definitions.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: always drop directory lock in nfsd_unlink() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: always drop directory lock in nfsd_unlink()
    
    [ Upstream commit b677c0c63a135a916493c064906582e9f3ed4802 ]
    
    Some error paths in nfsd_unlink() allow it to exit without unlocking the
    directory.  This is not a problem in practice as the directory will be
    unlocked by fh_put(), but it is untidy and potentially confusing.
    
    This allows us to remove all the fh_unlock() calls that are immediately
    after nfsd_unlink() calls.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 28 10:16:42 2022 -0400

    NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create()
    
    [ Upstream commit 14ee45b70dd0d9ae76fb066cd8c0652d657353f6 ]
    
    Clean up: The "out" label already invokes fh_drop_write().
    
    Note that fh_drop_write() is already careful not to invoke
    mnt_drop_write() if either it has already been done or there is
    nothing to drop. Therefore no change in behavior is expected.
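    
    The "careful" behavior noted above amounts to an idempotent drop. A
    minimal sketch, with hypothetical sketch_* types in place of struct
    svc_fh and the mount's writer count:

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_mount { int writers; };

struct sketch_fh {
	struct sketch_mount *mnt;
	bool want_write;   /* true while this handle holds write access */
};

static void sketch_fh_want_write(struct sketch_fh *fh)
{
	if (fh->want_write)
		return;
	fh->mnt->writers++;
	fh->want_write = true;
}

static void sketch_fh_drop_write(struct sketch_fh *fh)
{
	if (!fh->want_write)
		return;            /* already dropped, or never taken */
	fh->want_write = false;
	fh->mnt->writers--;
	assert(fh->mnt->writers >= 0);
}
```

    Because the flag is cleared on the first drop, a second call from a
    shared exit label leaves the writer count unchanged.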
    
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Mar 31 16:31:19 2023 -0400

    NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
    
    [ Upstream commit 804d8e0a6e54427268790472781e03bc243f4ee3 ]
    
    OPDESC() simply indexes into nfsd4_ops[] by the op's operation
    number, without range checking that value. It assumes callers are
    careful to avoid calling it with an out-of-bounds opnum value.
    
    nfsd4_decode_compound() is not so careful, and can invoke OPDESC()
    with opnum set to OP_ILLEGAL, which is 10044 -- well beyond the end
    of nfsd4_ops[].
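    
    The hazard can be illustrated with a small userspace sketch. The table
    and helpers are hypothetical; only the value 10044 for the illegal
    opcode comes from the text above. The checked wrapper is one possible
    way to guard the lookup, not necessarily the one this commit uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
enum { SKETCH_OP_FIRST = 3, SKETCH_OP_LAST = 5, SKETCH_OP_ILLEGAL = 10044 };

struct sketch_opdesc {
	const char *name;
};

static const struct sketch_opdesc sketch_ops[SKETCH_OP_LAST + 1] = {
	[3] = { "ACCESS" },
	[4] = { "CLOSE" },
	[5] = { "COMMIT" },
};

/* Unchecked lookup: the caller must guarantee opnum is in range. */
static const struct sketch_opdesc *sketch_opdesc(unsigned int opnum)
{
	assert(opnum <= SKETCH_OP_LAST);
	return &sketch_ops[opnum];
}

/* Checked wrapper: returns NULL instead of indexing out of bounds. */
static const struct sketch_opdesc *sketch_opdesc_checked(unsigned int opnum)
{
	if (opnum < SKETCH_OP_FIRST || opnum > SKETCH_OP_LAST)
		return NULL;
	return sketch_opdesc(opnum);
}
```

    With the wrapper, sketch_opdesc_checked(SKETCH_OP_ILLEGAL) yields NULL
    rather than reading far past the end of the table.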
    
    Reported-by: Jeff Layton <jlayton@kernel.org>
    Fixes: f4f9ef4a1b0a ("nfsd4: opdesc will be useful outside nfs4proc.c")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Avoid clashing function prototypes [+ + +]

Author: Kees Cook <kees@kernel.org>
Date:   Fri Dec 2 12:48:59 2022 -0800

    NFSD: Avoid clashing function prototypes
    
    [ Upstream commit e78e274eb22d966258a3845acc71d3c5b8ee2ea8 ]
    
    When built with Control Flow Integrity, function prototypes between
    caller and function declaration must match. These mismatches are visible
    at compile time with the new -Wcast-function-type-strict in Clang[1].
    
    There were 97 warnings produced by NFS. For example:
    
    fs/nfsd/nfs4xdr.c:2228:17: warning: cast from '__be32 (*)(struct nfsd4_compoundargs *, struct nfsd4_access *)' (aka 'unsigned int (*)(struct nfsd4_compoundargs *, struct nfsd4_access *)') to 'nfsd4_dec' (aka 'unsigned int (*)(struct nfsd4_compoundargs *, void *)') converts to incompatible function type [-Wcast-function-type-strict]
            [OP_ACCESS]             = (nfsd4_dec)nfsd4_decode_access,
                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    The enc/dec callbacks were defined as passing "void *" as the second
    argument, but were being implicitly cast to a new type. Replace the
    argument with union nfsd4_op_u, and perform explicit member selection
    in the function body. There are no resulting binary differences.
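    
    A reduced sketch of the resulting pattern is shown here. The types and
    names are hypothetical and far smaller than the real union nfsd4_op_u;
    the point is that every callback shares one prototype, so the table
    needs no casts.

```c
#include <assert.h>

/* Hypothetical argument types standing in for the per-operation structs. */
struct sketch_access { unsigned int mask; };
struct sketch_close  { unsigned int seqid; };

/* One union covering every per-operation argument type. */
union sketch_op_u {
	struct sketch_access access;
	struct sketch_close  close;
};

/* All decoders share this exact prototype. */
typedef unsigned int (*sketch_dec)(union sketch_op_u *u);

static unsigned int sketch_decode_access(union sketch_op_u *u)
{
	struct sketch_access *access = &u->access; /* explicit member selection */

	access->mask = 0x3f;
	return 0;
}

static unsigned int sketch_decode_close(union sketch_op_u *u)
{
	struct sketch_close *cl = &u->close;

	cl->seqid = 7;
	return 0;
}

enum { SKETCH_OP_ACCESS, SKETCH_OP_CLOSE, SKETCH_OP_COUNT };

/* No function-pointer casts: each entry already has type sketch_dec. */
static const sketch_dec sketch_decoders[SKETCH_OP_COUNT] = {
	[SKETCH_OP_ACCESS] = sketch_decode_access,
	[SKETCH_OP_CLOSE]  = sketch_decode_close,
};

static unsigned int sketch_decode(unsigned int op, union sketch_op_u *u)
{
	assert(op < SKETCH_OP_COUNT);
	return sketch_decoders[op](u);
}
```

    An indirect call through sketch_decoders[] now matches the callee's
    declared type exactly, which is what a prototype-checking CFI scheme
    verifies.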
    
    Changes were made mechanically using the following Coccinelle script,
    with minor by-hand fixes for members that didn't already match their
    existing argument name:
    
    @find@
    identifier func;
    type T, opsT;
    identifier ops, N;
    @@
    
     opsT ops[] = {
            [N] = (T) func,
     };
    
    @already_void@
    identifier find.func;
    identifier name;
    @@
    
     func(...,
    -void
    +union nfsd4_op_u
     *name)
     {
            ...
     }
    
    @proto depends on !already_void@
    identifier find.func;
    type T;
    identifier name;
    position p;
    @@
    
     func@p(...,
            T name
     ) {
            ...
       }
    
    @script:python get_member@
    type_name << proto.T;
    member;
    @@
    
    coccinelle.member = cocci.make_ident(type_name.split("_", 1)[1].split(' ',1)[0])
    
    @convert@
    identifier find.func;
    type proto.T;
    identifier proto.name;
    position proto.p;
    identifier get_member.member;
    @@
    
     func@p(...,
    -       T name
    +       union nfsd4_op_u *u
     ) {
    +       T name = &u->member;
            ...
       }
    
    @cast@
    identifier find.func;
    type T, opsT;
    identifier ops, N;
    @@
    
     opsT ops[] = {
            [N] =
    -       (T)
            func,
     };
    
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Cc: Jeff Layton <jlayton@kernel.org>
    Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
    Cc: linux-nfs@vger.kernel.org
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Avoid some useless tests [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Thu Sep 1 07:27:11 2022 +0200

    nfsd: Avoid some useless tests
    
    [ Upstream commit d44899b8bb0b919f923186c616a84f0e70e04772 ]
    
    memdup_user() can't return NULL, so there is no point in checking for it.
    
    Simplify some tests accordingly.
    
    Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Batch release pages during splice read [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jun 28 17:24:27 2021 -0400

    NFSD: Batch release pages during splice read
    
    [ Upstream commit 496d83cf0f2fa70cfe256c2499e2d3523d3868f3 ]
    
    Large splice reads call put_page() repeatedly. put_page() is
    relatively expensive to call, so replace it with the new
    svc_rqst_replace_page() helper to help amortize that cost.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: call nfsd_last_thread() before final nfsd_put() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Fri Dec 15 11:56:31 2023 +1100

    nfsd: call nfsd_last_thread() before final nfsd_put()
    
    [ Upstream commit 2a501f55cd641eb4d3c16a2eab0d678693fac663 ]
    
    If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put()
    without calling nfsd_last_thread().  This leaves nn->nfsd_serv pointing
    to a structure that has been freed.
    
    So remove 'static' from nfsd_last_thread() and call it when the
    nfsd_serv is about to be destroyed.
    
    Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: call op_release, even when op_func returns an error [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Mar 27 06:21:37 2023 -0400

    nfsd: call op_release, even when op_func returns an error
    
    [ Upstream commit 15a8b55dbb1ba154d82627547c5761cac884d810 ]
    
    For ops with "trivial" replies, nfsd4_encode_operation will shortcut
    most of the encoding work and skip to just marshalling up the status.
    One of the things it skips is calling op_release. This could cause a
    memory leak in the layoutget codepath if there is an error at an
    inopportune time.
    
    Have the compound processing engine always call op_release, even when
    op_func sets an error in op->status. With this change, we also need
    nfsd4_block_get_device_info_scsi to set the gd_device pointer to NULL
    on error to avoid a double free.
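    
    The interaction between "always release" and the NULL assignment can be
    sketched as follows. All names are hypothetical and malloc()/free()
    stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_op {
	int status;
	void *device;  /* owned by the op; freed by sketch_op_release() */
};

/* Operation body: allocates, then fails partway through when asked to. */
static void sketch_op_func(struct sketch_op *op, int fail)
{
	op->device = malloc(16);
	if (!op->device) {
		op->status = -1;
		return;
	}
	if (fail) {
		free(op->device);
		op->device = NULL;   /* release must not free it again */
		op->status = -1;
		return;
	}
	op->status = 0;
}

static void sketch_op_release(struct sketch_op *op)
{
	free(op->device);        /* free(NULL) is a no-op */
	op->device = NULL;
}

static int sketch_process(struct sketch_op *op, int fail)
{
	assert(op);
	sketch_op_func(op, fail);
	sketch_op_release(op);   /* always, regardless of op->status */
	return op->status;
}
```

    Without the NULL assignment in the error path, the unconditional
    release would free the same pointer a second time.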
    
    Reported-by: Zhi Li <yieli@redhat.com>
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2181403
    Fixes: 34b1744c91cc ("nfsd4: define ->op_release for compound ops")
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Cap rsize_bop result based on send buffer size [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 1 15:29:55 2022 -0400

    NFSD: Cap rsize_bop result based on send buffer size
    
    [ Upstream commit 76ce4dcec0dc08a032db916841ddc4e3998be317 ]
    
    Since before the git era, NFSD has conserved the number of pages
    held by each nfsd thread by combining the RPC receive and send
    buffers into a single array of pages. This works because there are
    no cases where an operation needs a large RPC Call message and a
    large RPC Reply at the same time.
    
    Once an RPC Call has been received, svc_process() updates
    svc_rqst::rq_res to describe the part of rq_pages that can be
    used for constructing the Reply. This means that the send buffer
    (rq_res) shrinks when the received RPC record containing the RPC
    Call is large.
    
    Add an NFSv4 helper that computes the size of the send buffer. It
    replaces svc_max_payload() in spots where svc_max_payload() returns
    a value that might be larger than the remaining send buffer space.
    Callers who need to know the transport's actual maximum payload size
    will continue to use svc_max_payload().
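    
    The capping logic is essentially a minimum of two limits. The sketch
    below uses a hypothetical request descriptor whose field names are
    assumptions; it does not reproduce the helper added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request descriptor; field names are illustrative only. */
struct sketch_rqst {
	uint32_t transport_max_payload; /* what the transport could carry */
	uint32_t send_buf_total;        /* bytes set aside for the Reply */
	uint32_t send_buf_used;         /* bytes of the Reply already built */
};

static uint32_t sketch_send_space(const struct sketch_rqst *rq)
{
	assert(rq->send_buf_used <= rq->send_buf_total);
	return rq->send_buf_total - rq->send_buf_used;
}

/* Upper bound for one operation's payload: never more than what is left. */
static uint32_t sketch_capped_payload(const struct sketch_rqst *rq)
{
	uint32_t space = sketch_send_space(rq);

	return space < rq->transport_max_payload ?
		space : rq->transport_max_payload;
}
```

    When a large Call has shrunk the send buffer, the remaining space is
    the binding limit; otherwise the transport maximum is.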
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Capture every CB state transition [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:31 2021 -0400

    NFSD: Capture every CB state transition
    
    [ Upstream commit 8476c69a7fa0f1f9705ec0caa4e97c08b5045779 ]
    
    We were missing one.
    
    As a clean-up, add a helper that sets the new CB state and fires
    a tracepoint. The tracepoint fires only when the state changes, to
    help reduce trace log noise.
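    
    A helper of this shape can be sketched in a few lines. The names are
    hypothetical, and a counter stands in for the tracepoint.

```c
#include <assert.h>

enum sketch_cb_state {
	SKETCH_CB_UNKNOWN,
	SKETCH_CB_UP,
	SKETCH_CB_DOWN,
	SKETCH_CB_FAULT,
};

struct sketch_client {
	enum sketch_cb_state cb_state;
	unsigned int trace_events; /* stands in for a tracepoint firing */
};

/* Single place where the callback state changes. */
static void sketch_set_cb_state(struct sketch_client *clp,
				enum sketch_cb_state newstate)
{
	assert(clp);
	if (clp->cb_state == newstate)
		return;            /* no transition: stay quiet */
	clp->cb_state = newstate;
	clp->trace_events++;
}
```

    Routing every assignment through one helper makes it hard to miss a
    transition, and the early return suppresses repeated events for an
    unchanged state.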
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
    
    [ Upstream commit 927bfc5600cd6333c9ef9f090f19e66b7d4c8ee1 ]
    
    nfsd_create() usually returns with the directory still locked.
    nfsd_symlink() usually returns with it unlocked.  This is clumsy.
    
    Until recently nfsd_create() needed to keep the directory locked until
    ACLs and security label had been set.  These are now set inside
    nfsd_create() (in nfsd_setattr()) so this need is gone.
    
    So change nfsd_create() and nfsd_symlink() to always unlock, and remove
    any fh_unlock() calls that follow calls to these functions.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Change the way the expected length of a fattr4 is checked [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:09:13 2020 -0500

    NFSD: Change the way the expected length of a fattr4 is checked
    
    [ Upstream commit 081d53fe0b43c47c36d1832b759bf14edde9cdbb ]
    
    Because the fattr4 is now managed in an xdr_stream, all that is
    needed is to store the initial position of the stream before
    decoding the attribute list. Then the actual length of the list
    is computed using the final stream position, after decoding is
    complete.
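    
    The position-based length check can be sketched with a toy stream. The
    stream type and the wire layout (a byte length, a count, then that many
    32-bit values) are assumptions made for illustration; they are not the
    kernel's xdr_stream or the real fattr4 encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stream; not the kernel's struct xdr_stream. */
struct sketch_stream {
	const uint32_t *words;
	size_t nwords;
	size_t pos;      /* index of the next word to read */
};

static int sketch_read_u32(struct sketch_stream *s, uint32_t *out)
{
	if (s->pos >= s->nwords)
		return -1;
	*out = s->words[s->pos++];
	return 0;
}

/* Returns 0 on success, -1 on a short buffer or a length mismatch. */
static int sketch_decode_attrs(struct sketch_stream *s, uint32_t *sum)
{
	uint32_t declared_len, count, val;
	size_t start;

	assert(s && sum);
	if (sketch_read_u32(s, &declared_len) < 0)
		return -1;

	start = s->pos;                 /* remember where the list begins */
	if (sketch_read_u32(s, &count) < 0)
		return -1;
	*sum = 0;
	while (count--) {
		if (sketch_read_u32(s, &val) < 0)
			return -1;
		*sum += val;
	}

	/* Actual length comes from the final stream position. */
	if ((s->pos - start) * sizeof(uint32_t) != declared_len)
		return -1;
	return 0;
}
```

    No running byte count is threaded through the decoder; the difference
    between the two stream positions is the length that was consumed.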
    
    No behavior change is expected.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up _lm_ operation names [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Feb 16 11:26:06 2022 -0500

    NFSD: Clean up _lm_ operation names
    
    [ Upstream commit 35aff0678f99b0623bb72d50112de9e163a19559 ]
    
    The common practice is to name function instances the same as the
    method names, but with a uniquifying prefix. Commit aef9583b234a
    ("NFSD: Get reference of lockowner when coping file_lock") missed
    this -- the new function names should both have been of the form
    "nfsd4_lm_*".
    
    Before more lock manager operations are added in NFSD, rename these
    two functions for consistency.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up after updating NFSv2 ACL decoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 17:49:16 2020 -0400

    NFSD: Clean up after updating NFSv2 ACL decoders
    
    [ Upstream commit baadce65d6ee3032b921d9c043ba808bc69d6b13 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up after updating NFSv2 ACL encoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 15 14:31:42 2020 -0500

    NFSD: Clean up after updating NFSv2 ACL encoders
    
    [ Upstream commit 83d0b84572775a29f800de67a1b9b642a5376bc3 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up after updating NFSv3 ACL decoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 09:56:52 2020 -0400

    NFSD: Clean up after updating NFSv3 ACL decoders
    
    [ Upstream commit 9cee763ee654ce8622d673b8e32687d738e24ace ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up after updating NFSv3 ACL encoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 15 15:09:16 2020 -0500

    NFSD: Clean up after updating NFSv3 ACL encoders
    
    [ Upstream commit 1416f435303d81070c6bcf5a4a9b4ed0f7a9f013 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up find_or_add_file() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:41 2022 -0400

    NFSD: Clean up find_or_add_file()
    
    [ Upstream commit 9270fc514ba7d415636b23bcb937573a1ce54f6a ]
    
    Remove the call to find_file_locked() in insert_nfs4_file(). Tracing
    shows that over 99% of these calls return NULL. Thus it is not worth
    the expense of the extra bucket list traversal. insert_file() already
    deals correctly with the case where the item is already in the hash
    bucket.
    
    Since nfsd4_file_hash_insert() is now just a wrapper around
    insert_file(), move the meat of insert_file() into
    nfsd4_file_hash_insert() and get rid of it.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: clean up mounted_on_fileid handling [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Sep 8 12:31:07 2022 -0400

    nfsd: clean up mounted_on_fileid handling
    
    [ Upstream commit 6106d9119b6599fa23dc556b429d887b4c2d9f62 ]
    
    We only need the inode number for this, not a full rack of attributes.
    Rename this function, make it take a pointer to a u64 instead of
    struct kstat, and change it to just request STATX_INO.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    [ cel: renamed get_mounted_on_ino() ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfs4_preprocess_stateid_op() call sites [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:46:57 2022 -0400

    NFSD: Clean up nfs4_preprocess_stateid_op() call sites
    
    [ Upstream commit eeff73f7c1c583f79a401284f46c619294859310 ]
    
    Remove the lame-duck dprintk()s around nfs4_preprocess_stateid_op()
    call sites.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfs4svc_encode_compoundres() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:19 2022 -0400

    NFSD: Clean up nfs4svc_encode_compoundres()
    
    [ Upstream commit 9993a66317fc9951322483a9edbfae95a640b210 ]
    
    In today's Linux NFS server implementation, the NFS dispatcher
    initializes each XDR result stream, and the NFSv4 .pc_func and
    .pc_encode methods all use xdr_stream-based encoding. This keeps
    rq_res.len automatically updated. There is no longer a need for
    the WARN_ON_ONCE() check in nfs4svc_encode_compoundres().
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd3_proc_create() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Mar 25 14:47:54 2022 -0400

    NFSD: Clean up nfsd3_proc_create()
    
    [ Upstream commit e61568599c9ad638fdaba150fee07d7065e31851 ]
    
    As near as I can tell, mode bit masking and setting S_IFREG is
    already done by do_nfsd_create() and vfs_create(). The NFSv4 path
    (do_open_lookup), for example, does not bother with this special
    processing.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd4_encode_readlink() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:09:23 2022 -0400

    NFSD: Clean up nfsd4_encode_readlink()
    
    [ Upstream commit 99b002a1fa00d90e66357315757e7277447ce973 ]
    
    Similar changes to nfsd4_encode_readv(), all bundled into a single
    patch.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd4_init_file() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:28 2022 -0400

    NFSD: Clean up nfsd4_init_file()
    
    [ Upstream commit 81a21fa3e7fdecb3c5b97014f0fc5a17d5806cae ]
    
    Name this function more consistently. I'm going to use nfsd4_file_
    and nfsd4_file_hash_ for these helpers.
    
    Change the @fh parameter to be a const pointer for better type safety.
    
    Finally, move the hash insertion operation to the caller. This is
    typical for most other "init_object" type helpers, and it is where
    most of the other nfs4_file hash table operations are located.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Clean up nfsd_file_put() [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Thu Mar 31 09:54:02 2022 -0400

    nfsd: Clean up nfsd_file_put()
    
    [ Upstream commit 999397926ab3f78c7d1235cc4ca6e3c89d2769bf ]
    
    Make it a little less racy, by removing the refcount_read() test. Then
    remove the redundant 'is_hashed' variable.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd_open_verified() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Mar 27 16:46:47 2022 -0400

    NFSD: Clean up nfsd_open_verified()
    
    [ Upstream commit f4d84c52643ae1d63a8e73e2585464470e7944d1 ]
    
    Its only caller always passes S_IFREG as the @type parameter. As an
    additional clean-up, add a kerneldoc comment.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd_splice_actor() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Apr 7 16:48:24 2022 -0400

    NFSD: Clean up nfsd_splice_actor()
    
    [ Upstream commit 91e23b1c39820bfed642119ff6b6ef9f43cf09ce ]
    
    nfsd_splice_actor() checks that the page being spliced does not
    match the previous element in the svc_rqst::rq_pages array. We
    believe this is to prevent a double put_page() in cases where the
    READ payload is partially contained in the xdr_buf's head buffer.
    
    However, the NFSD READ proc functions no longer place any part of
    the READ payload in the head buffer, in order to properly support
    NFS/RDMA READ with Write chunks. Therefore, simplify the logic in
    nfsd_splice_actor() to remove this unnecessary check.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up nfsd_vfs_write() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Dec 28 14:19:41 2021 -0500

    NFSD: Clean up nfsd_vfs_write()
    
    [ Upstream commit 33388b3aefefd4d83764dab8038cb54068161a44 ]
    
    The RWF_SYNC and !RWF_SYNC arms are now exactly alike except that
    the RWF_SYNC arm resets the boot verifier twice in a row. Fix that
    redundancy and de-duplicate the code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up NFSDDBG_FACILITY macro [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Mar 5 14:22:32 2021 -0500

    NFSD: Clean up NFSDDBG_FACILITY macro
    
    [ Upstream commit 219a170502b3d597c52eeec088aee8fbf7b90da5 ]
    
    These are no longer needed because there are no dprintk() call sites
    in these files.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: clean up potential nfsd_file refcount leaks in COPY codepath [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Jan 17 14:38:31 2023 -0500

    nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
    
    [ Upstream commit 6ba434cb1a8d403ea9aad1b667c3ea3ad8b3191f ]
    
    There are two different flavors of the nfsd4_copy struct. One is
    embedded in the compound and is used directly in synchronous copies. The
    other is dynamically allocated, refcounted and tracked in the client
    structure. For the embedded one, the cleanup just involves releasing any
    nfsd_files held on its behalf. For the async one, the cleanup is a bit
    more involved, and we need to dequeue it from lists, unhash it, etc.
    
    There is at least one potential refcount leak in this code now. If the
    kthread_create call fails, then both the src and dst nfsd_files in the
    original nfsd4_copy object are leaked.
    
    The cleanup in this codepath is also sort of weird. In the async copy
    case, we'll have up to four nfsd_file references (src and dst for both
    flavors of copy structure). They are both put at the end of
    nfsd4_do_async_copy, even though the ones held on behalf of the embedded
    one outlive that structure.
    
    Change it so that we always clean up the nfsd_file refs held by the
    embedded copy structure before nfsd4_copy returns. Rework
    cleanup_async_copy to handle both inter and intra copies. Eliminate
    nfsd4_cleanup_intra_ssc since it now becomes a no-op.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up splice actor [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jun 28 16:34:20 2021 -0400

    NFSD: Clean up splice actor
    
    [ Upstream commit c7e0b781b73c2e26e442ed71397cc2bc5945a732 ]
    
    A few useful observations:
    
     - The value in @size is never modified.
    
     - splice_desc.len is an unsigned int, and so is xdr_buf.page_len.
       An implicit cast to size_t is unnecessary.
    
     - The computation of .page_len is the same in all three arms
       of the "if" statement, so hoist it out to make it clear that
       the operation is an unconditional invariant.
    
    The resulting function is 18 bytes shorter on my system (-Os).
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up SPLICE_OK in nfsd4_encode_read() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:08:51 2022 -0400

    NFSD: Clean up SPLICE_OK in nfsd4_encode_read()
    
    [ Upstream commit c738b218a2e5a753a336b4b7fee6720b902c7ace ]
    
    Do the test_bit() once -- this reduces the number of locked-bus
    operations and makes the function a little easier to read.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up the nfsd_net::nfssvc_boot field [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Dec 29 14:43:16 2021 -0500

    NFSD: Clean up the nfsd_net::nfssvc_boot field
    
    [ Upstream commit 91d2e9b56cf5c80f9efc530d494968369a8a0e0d ]
    
    There are two boot-time fields in struct nfsd_net: one called
    boot_time and one called nfssvc_boot. The latter is used only to
    form write verifiers, but its documenting comment declares:
    
            /* Time of server startup */
    
    Since commit 27c438f53e79 ("nfsd: Support the server resetting the
    boot verifier"), this field can be reset at any time; it's no
    longer tied to server restart. So that comment is stale.
    
    Also, according to pahole, struct timespec64 is 16 bytes long on
    x86_64. The nfssvc_boot field is used only to form a write verifier,
    which is 8 bytes long.
    
    Let's clarify this situation by manufacturing an 8-byte verifier
    in nfs_reset_boot_verifier() and storing only that in struct
    nfsd_net.
    
    We're grabbing 128 bits of time, so compress all of those into a
    64-bit verifier instead of throwing out the high-order bits.
    In the future, the siphash_key can be re-used for other hashed
    objects per-nfsd_net.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up the show_nf_flags() macro [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Mar 27 16:43:03 2022 -0400

    NFSD: Clean up the show_nf_flags() macro
    
    [ Upstream commit bb283ca18d1e67c82d22a329c96c9d6036a74790 ]
    
    The flags are defined using C macros, so TRACE_DEFINE_ENUM is
    unnecessary.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up the show_nf_may macro [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Aug 19 12:56:40 2020 -0400

    NFSD: Clean up the show_nf_may macro
    
    [ Upstream commit b76278ae68848cea13b325d247aa5cf31c87edac ]
    
    Display all currently possible NFSD_MAY permission flags.
    
    Move and rename show_nf_may with a more generic name because the
    NFSD_MAY permission flags are used in other places besides the file
    cache.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up unused code after rhashtable conversion [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:36 2022 -0400

    NFSD: Clean up unused code after rhashtable conversion
    
    [ Upstream commit 0ec8e9d1539a7b8109a554028bbce441052f847e ]
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Clean up WRITE arg decoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:07 2022 -0400

    NFSD: Clean up WRITE arg decoders
    
    [ Upstream commit d4da5baa533215b14625458e645056baf646bb2e ]
    
    xdr_stream_subsegment() already returns a boolean value.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: close cached files prior to a REMOVE or RENAME that would replace target [+ + +]

Author: Jeff Layton <jeff.layton@primarydata.com>
Date:   Mon Nov 30 17:03:16 2020 -0500

    nfsd: close cached files prior to a REMOVE or RENAME that would replace target
    
    [ Upstream commit 7f84b488f9add1d5cca3e6197c95914c7bd3c1cf ]
    
    It's not uncommon for some workloads to do a bunch of I/O to a file and
    delete it just afterward. If knfsd has a cached open file however, then
    the file may still be open when the dentry is unlinked. If the
    underlying filesystem is nfs, then that could trigger it to do a
    sillyrename.
    
    On a REMOVE or RENAME scan the nfsd_file cache for open files that
    correspond to the inode, and proactively unhash and put their
    references. This should prevent any delete-on-last-close activity from
    occurring, solely due to knfsd's open file cache.
    
    This must be done synchronously though so we use the variants that call
    flush_delayed_fput. There are deadlock possibilities if you call
    flush_delayed_fput while holding locks, however. In the case of
    nfsd_rename, we don't even do the lookups of the dentries to be renamed
    until we've locked for rename.
    
    Once we've figured out what the target dentry is for a rename, check to
    see whether there are cached open files associated with it. If there
    are, then unwind all of the locking, close them all, and then reattempt
    the rename.
    
    None of this is really necessary for "typical" filesystems though. It's
    mostly of use for NFS, so declare a new export op flag and use that to
    determine whether to close the files beforehand.
    
    Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    [ cel: adjusted to apply to 5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Combine XDR error tracepoints [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 21 12:11:45 2021 -0400

    NFSD: Combine XDR error tracepoints
    
    [ Upstream commit 70e94d757b3e1f46486d573729d84c8955c81dce ]
    
    Clean up: The garbage_args and cant_encode tracepoints report the
    same information as each other, so combine them into a single
    tracepoint class to reduce code duplication and slightly reduce the
    size of trace.o.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: COMMIT operations must not return NFS?ERR_INVAL [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jan 24 15:50:31 2022 -0500

    NFSD: COMMIT operations must not return NFS?ERR_INVAL
    
    [ Upstream commit 3f965021c8bc38965ecb1924f570c4842b33d408 ]
    
    Since, well, forever, the Linux NFS server's nfsd_commit() function
    has returned nfserr_inval when the passed-in byte range arguments
    were non-sensical.
    
    However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
    are permitted to return only the following non-zero status codes:
    
          NFS3ERR_IO
          NFS3ERR_STALE
          NFS3ERR_BADHANDLE
          NFS3ERR_SERVERFAULT
    
    NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
    is not listed in the COMMIT row of Table 6 in RFC 8881.
    
    RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
    specify when it can or should be used.
    
    Instead of dropping or failing a COMMIT request in a byte range that
    is not supported, turn it into a valid request by treating one or
    both arguments as zero. Offset zero means start-of-file, count zero
    means until-end-of-file, so we only ever extend the commit range.
    NFS servers are always allowed to commit more and sooner than
    requested.
    
    The range check is no longer bounded by NFS_OFFSET_MAX, but rather
    by the value that is returned in the maxfilesize field of the NFSv3
    FSINFO procedure or the NFSv4 maxfilesize file attribute.
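    
    The clamping described above can be sketched in userspace C as
    follows. This is an illustration only, not the kernel code: the
    names commit_range and clamp_commit_range() are hypothetical, and
    maxbytes stands in for the filesystem's maximum file size, assumed
    here to be no larger than INT64_MAX.

```c
#include <stdint.h>

/* Hypothetical result type: an inclusive byte range to sync. */
struct commit_range {
	int64_t start;
	int64_t end;
};

/*
 * Convert a client-provided (offset, count) pair into a (start, end)
 * range. Anything that falls outside the supported file size is
 * widened rather than rejected: start falls back to 0 (start-of-file)
 * and end falls back to INT64_MAX (until-end-of-file).
 *
 * Assumption: maxbytes <= INT64_MAX, so offset + count cannot wrap.
 */
static struct commit_range clamp_commit_range(uint64_t offset,
					      uint32_t count,
					      uint64_t maxbytes)
{
	struct commit_range r = { 0, INT64_MAX };

	if (offset < maxbytes) {
		r.start = (int64_t)offset;
		/* count == 0 means "through end of file". */
		if (count && offset + count - 1 < maxbytes)
			r.end = (int64_t)(offset + count - 1);
	}
	return r;
}
```

    Because every fallback widens the range, the sketch never commits
    less than the client asked for, which matches the rule that a
    server may commit more and sooner than requested.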
    
    Note that this change results in a new pynfs failure:
    
    CMT4     st_commit.testCommitOverflow                             : RUNNING
    CMT4     st_commit.testCommitOverflow                             : FAILURE
               COMMIT with offset + count overflow should return
               NFS4ERR_INVAL, instead got NFS4_OK
    
    IMO the test is not correct as written: RFC 8881 does not allow the
    COMMIT operation to return NFS4ERR_INVAL.
    
    Reported-by: Dan Aloni <dan.aloni@vastdata.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Constify @fh argument of knfsd_fh_hash() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:25 2021 -0400

    NFSD: Constify @fh argument of knfsd_fh_hash()
    
    [ Upstream commit 1736aec82a15cb5d4b3bbe0b2fbae0ede66b1a1a ]
    
    Enable knfsd_fh_hash() to be invoked in functions where the
    filehandle pointer is a const.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Convert filecache to rhltable [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 24 15:09:04 2022 -0500

    NFSD: Convert filecache to rhltable
    
    [ Upstream commit c4c649ab413ba6a785b25f0edbb12f617c87db2a ]
    
    While we were converting the nfs4_file hashtable to use the kernel's
    resizable hashtable data structure, Neil Brown observed that the
    list variant (rhltable) would be better for managing nfsd_file items
    as well. The nfsd_file hash table will contain multiple entries for
    the same inode -- these should be kept together on a list. And, it
    could be possible for exotic or malicious client behavior to cause
    the hash table to resize itself on every insertion.
    
    A nice simplification is that rhltable_lookup() can return a list
    that contains only nfsd_file items that match a given inode, which
    enables us to eliminate specialized hash table helper functions and
    use the default functions provided by the rhashtable implementation.
    
    Since we are now storing nfsd_file items for the same inode on a
    single list, that effectively reduces the number of hash entries
    that have to be tracked in the hash table. The minimum bucket count
    is therefore lowered.
    
    Light testing with fstests generic/531 shows no regressions.
    
    Suggested-by: Neil Brown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Convert the filecache to use rhashtable [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:30 2022 -0400

    NFSD: Convert the filecache to use rhashtable
    
    [ Upstream commit ce502f81ba884c1fe45dc0ebddbcaaa4ec0fc5fb ]
    
    Enable the filecache hash table to start small, then grow with the
    workload. Smaller server deployments benefit because there should
    be lower memory utilization. Larger server deployments should see
    improved scaling with the number of open files.
    
    Suggested-by: Jeff Layton <jlayton@kernel.org>
    Suggested-by: Dave Chinner <david@fromorbit.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: copy the whole verifier in nfsd_copy_write_verifier [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Feb 14 10:07:59 2023 -0500

    NFSD: copy the whole verifier in nfsd_copy_write_verifier
    
    [ Upstream commit 90d2175572470ba7f55da8447c72ddd4942923c4 ]
    
    Currently, we're only memcpy'ing the first __be32. Ensure we copy into
    both words.
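    
    A minimal userspace sketch of this class of bug, assuming the
    verifier is an array of two 32-bit words. The type be32_t and both
    helper names are hypothetical and are not taken from the kernel
    source; the point is only that sizeof(*verf) covers one word.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's __be32 type. */
typedef uint32_t be32_t;

/*
 * Buggy shape: verf is a pointer to the first word, so sizeof(*verf)
 * is the size of one word and the second word is never written.
 */
static void copy_verifier_first_word(be32_t *verf, const be32_t *src)
{
	memcpy(verf, src, sizeof(*verf));
}

/* Fixed shape: copy both words of the two-word verifier. */
static void copy_verifier_whole(be32_t *verf, const be32_t *src)
{
	memcpy(verf, src, 2 * sizeof(*verf));
}
```

    With the first helper, the second word of the destination keeps
    whatever value it held before the call.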
    
    Fixes: 91d2e9b56cf5 ("NFSD: Clean up the nfsd_net::nfssvc_boot field")
    Reported-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: COPY with length 0 should copy to end of file [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Mar 18 20:03:23 2021 -0400

    nfsd: COPY with length 0 should copy to end of file
    
    [ Upstream commit 792a5112aa90e59c048b601c6382fe3498d75db7 ]
    
    From https://tools.ietf.org/html/rfc7862#page-65
    
            A count of 0 (zero) requests that all bytes from ca_src_offset
            through EOF be copied to the destination.
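    
    One way to express that rule, as a userspace sketch. The helper
    name effective_copy_count() and its parameters are hypothetical,
    and the kernel fix may well be implemented differently (for
    example, by treating zero as "no upper bound" and letting the copy
    loop stop at EOF).

```c
#include <stdint.h>

/*
 * Return the number of bytes a COPY should transfer.
 * A count of 0 means "everything from src_offset through EOF".
 */
static uint64_t effective_copy_count(uint64_t src_size,
				     uint64_t src_offset,
				     uint64_t count)
{
	if (count != 0)
		return count;
	/* Offset at or past EOF: nothing left to copy. */
	if (src_offset >= src_size)
		return 0;
	return src_size - src_offset;
}
```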
    
    Reported-by: <radchenkoy@gmail.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 13 16:57:44 2020 -0500

    NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder
    
    [ Upstream commit 8141d6a2bb6c655ff0c0b81ced80d9025f03e926 ]
    
    Clean up: Counting the bytes used by each returned directory entry
    seems less brittle to me than trying to measure consumed pages after
    the fact.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 9 13:13:21 2020 -0500

    NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder
    
    [ Upstream commit a1409e2de4f11034c8eb30775cc3e37039a4ef13 ]
    
    Clean up: Counting the bytes used by each returned directory entry
    seems less brittle to me than trying to measure consumed pages after
    the fact.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: cstate->session->se_client -> cstate->clp [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:45 2021 -0500

    nfsd: cstate->session->se_client -> cstate->clp
    
    [ Upstream commit ec59659b4972ec25851aa03b4b5baba6764a62e4 ]
    
    I'm not sure why we're writing this out the hard way in so many places.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: De-duplicate hash bucket indexing [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 30 19:19:57 2021 -0400

    NFSD: De-duplicate hash bucket indexing
    
    [ Upstream commit 378a6109dd142a678f629b740f558365150f60f9 ]
    
    Clean up: The details of finding the right hash bucket are exactly
    the same in both nfsd_cache_lookup() and nfsd_cache_update().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Dec 28 14:26:03 2021 -0500

    NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id)
    
    [ Upstream commit 2c445a0e72cb1fbfbdb7f9473c53556ee27c1d90 ]
    
    Since this pointer is used repeatedly, move it to a stack variable.
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: De-duplicate net_generic(SVC_NET(rqstp), nfsd_net_id) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Dec 28 12:41:32 2021 -0500

    NFSD: De-duplicate net_generic(SVC_NET(rqstp), nfsd_net_id)
    
    [ Upstream commit fb7622c2dbd1aa41133a8c73e1137b833c074519 ]
    
    Since this pointer is used repeatedly, move it to a stack variable.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: De-duplicate nfsd4_decode_bitmap4() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Dec 13 10:20:45 2021 -0500

    NFSD: De-duplicate nfsd4_decode_bitmap4()
    
    [ Upstream commit cd2e999c7c394ae916d8be741418b3c6c1dddea8 ]
    
    Clean up. Trond points out that xdr_stream_decode_uint32_array()
    does the same thing as nfsd4_decode_bitmap4().
    
    Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Decode NFSv4 birth time attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Jul 10 14:46:04 2022 -0400

    NFSD: Decode NFSv4 birth time attribute
    
    [ Upstream commit 5b2f3e0777da2a5dd62824bbe2fdab1d12caaf8f ]
    
    NFSD has advertised support for the NFSv4 time_create attribute
    since commit e377a3e698fb ("nfsd: Add support for the birth time
    attribute").
    
    Igor Mammedov reports that Mac OS clients attempt to set the NFSv4
    birth time attribute via OPEN(CREATE) and SETATTR if the server
    indicates that it supports it, but since the above commit was
    merged, those attempts now fail.
    
    Table 5 in RFC 8881 lists the time_create attribute as one that can
    be both set and retrieved, but the above commit did not add server
    support for clients to provide a time_create attribute. IMO that's
    a bug in our implementation of the NFSv4 protocol, which this commit
    addresses.
    
    Whether NFSD silently ignores the new birth time or actually sets it
    is another matter. I haven't found another filesystem service in the
    Linux kernel that enables users or clients to modify a file's birth
    time attribute.
    
    This commit reflects my (perhaps incorrect) understanding of whether
    Linux users can set a file's birth time. NFSD will now recognize a
    time_create attribute but it ignores its value. It clears the
    time_create bit in the returned attribute bitmask to indicate that
    the value was not used.
    
    Reported-by: Igor Mammedov <imammedo@redhat.com>
    Fixes: e377a3e698fb ("nfsd: Add support for the birth time attribute")
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: delay unmount source's export after inter-server copy completed. [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Fri May 21 15:09:37 2021 -0400

    NFSD: delay unmount source's export after inter-server copy completed.
    
    [ Upstream commit f4e44b393389c77958f7c58bf4415032b4cda15b ]
    
    Currently the source's export is mounted and unmounted on every
    inter-server copy operation. This patch is an enhancement to delay
    the unmount of the source export for a certain period of time to
    eliminate the mount and unmount overhead on subsequent copy operations.
    
    After a copy operation completes, a work entry is added to the
    delayed unmount list with an expiration time. This list is serviced
    by the laundromat thread to unmount the export of the expired entries.
    Each time the export is being used again, its expiration time is
    extended and the entry is re-inserted to the tail of the list.
    
    The unmount task and the mount operation of the copy request are
    synced to make sure the export is not unmounted while it's being
    used.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Demote a WARN to a pr_warn() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:23:45 2022 -0400

    NFSD: Demote a WARN to a pr_warn()
    
    [ Upstream commit ca3f9acb6d3faf78da2b63324f7c737dbddf7f69 ]
    
    The call trace doesn't add much value, but it sure is noisy.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Deprecate NFS_OFFSET_MAX [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jan 25 15:57:45 2022 -0500

    NFSD: Deprecate NFS_OFFSET_MAX
    
    [ Upstream commit c306d737691ef84305d4ed0d302c63db2932f0bb ]
    
    NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there
    was a kernel-wide OFFSET_MAX value. As a clean up, replace the last
    few uses of it with its generic equivalent, and get rid of it.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: destroy percpu stats counters after reply cache shutdown [+ + +]

Author: Julian Schroeder <jumaco@amazon.com>
Date:   Mon May 23 18:52:26 2022 +0000

    nfsd: destroy percpu stats counters after reply cache shutdown
    
    [ Upstream commit fd5e363eac77ef81542db77ddad0559fa0f9204e ]
    
    Upon nfsd shutdown any pending DRC cache is freed. DRC cache use is
    tracked via a percpu counter. In the current code the percpu counter
    is destroyed before the cache is freed. If any pending cache is still present,
    percpu_counter_add is called with a percpu counter==NULL. This causes
    a kernel crash.
    The solution is to destroy the percpu counter after the cache is freed.
    
    Fixes: e567b98ce9a4b ("nfsd: protect concurrent access to nfsd stats counters")
    Signed-off-by: Julian Schroeder <jumaco@amazon.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: discard fh_locked flag and fh_lock/fh_unlock [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: discard fh_locked flag and fh_lock/fh_unlock
    
    [ Upstream commit dd8dd403d7b223cc77ee89d8d09caf045e90e648 ]
    
    As all inode locking is now fully balanced, fh_put() does not need to
    call fh_unlock().
    fh_lock() and fh_unlock() are no longer used, so discard them.
    These are the only real users of ->fh_locked, so discard that too.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't allow nfsd threads to be signalled. [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jun 3 10:35:02 2024 -0400

    nfsd: don't allow nfsd threads to be signalled.
    
    [ Upstream commit 3903902401451b1cd9d797a8c79769eb26ac7fe5 ]
    
    The original implementation of nfsd used signals to stop threads during
    shutdown.
    In Linux 2.3.46pre5 nfsd gained the ability to shut down threads
    internally if it was asked to run "0" threads.  After this user-space
    transitioned to using "rpc.nfsd 0" to stop nfsd and sending signals to
    threads was no longer an important part of the API.
    
    In commit 3ebdbe5203a8 ("SUNRPC: discard svo_setup and rename
    svc_set_num_threads_sync()") (v5.17-rc1~75^2~41) we finally removed the
    use of signals for stopping threads, using kthread_stop() instead.
    
    This patch makes the "obvious" next step and removes the ability to
    signal nfsd threads - or any svc threads.  nfsd stops allowing signals
    and we don't check for their delivery any more.
    
    This will allow for some simplification in later patches.
    
    A change worth noting is in nfsd4_ssc_setup_dul().  There was previously
    a signal_pending() check which would only succeed when the thread was
    being shut down.  It should really have tested kthread_should_stop() as
    well.  Now it just does the latter, not the former.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't call locks_release_private() twice concurrently [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Wed Jan 31 11:17:40 2024 +1100

    nfsd: don't call locks_release_private() twice concurrently
    
    [ Upstream commit 05eda6e75773592760285e10ac86c56d683be17f ]
    
    It is possible for free_blocked_lock() to be called twice concurrently,
    once from nfsd4_lock() and once from nfsd4_release_lockowner() calling
    remove_blocked_locks().  This is why a kref was added.
    
    It is perfectly safe for locks_delete_block() and kref_put() to be
    called in parallel as they use locking or atomicity respectively as
    protection.  However locks_release_private() has no locking.  It is
    safe for it to be called twice sequentially, but not concurrently.
    
    This patch moves that call from free_blocked_lock() where it could race
    with itself, to free_nbl() where it cannot.  This will slightly delay
    the freeing of private info or release of the owner - but not by much.
    It is arguably more natural for this freeing to happen in free_nbl()
    where the structure itself is freed.
    
    This bug was found by code inspection - it has not been seen in practice.
    
    Fixes: 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't destroy global nfs4_file table in per-net shutdown [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Sat Feb 11 07:50:08 2023 -0500

    nfsd: don't destroy global nfs4_file table in per-net shutdown
    
    [ Upstream commit 4102db175b5d884d133270fdbd0e59111ce688fc ]
    
    The nfs4_file table is global, so shutting it down when a containerized
    nfsd is shut down is wrong and can lead to double-frees. Tear down the
    nfs4_file_rhltable in nfs4_state_shutdown instead of
    nfs4_state_shutdown_net.
    
    Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects")
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017
    Reported-by: JianHong Yin <jiyin@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't free files unconditionally in __nfsd_file_cache_purge [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jan 20 14:52:14 2023 -0500

    nfsd: don't free files unconditionally in __nfsd_file_cache_purge
    
    [ Upstream commit 4bdbba54e9b1c769da8ded9abd209d765715e1d6 ]
    
    nfsd_file_cache_purge is called when the server is shutting down, in
    which case, tearing things down is generally fine, but it also gets
    called when the exports cache is flushed.
    
    Instead of walking the cache and freeing everything unconditionally,
    handle it the same as when we have a notification of conflicting access.
    
    Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
    Reported-by: Ruben Vestergaard <rubenv@drcmr.dk>
    Reported-by: Torkil Svensgaard <torkil@drcmr.dk>
    Reported-by: Shachar Kagan <skagan@nvidia.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Tested-by: Shachar Kagan <skagan@nvidia.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't fsync nfsd_files on last close [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Feb 7 12:02:46 2023 -0500

    nfsd: don't fsync nfsd_files on last close
    
    [ Upstream commit 4c475eee02375ade6e864f1db16976ba0d96a0a2 ]
    
    Most of the time, NFSv4 clients issue a COMMIT before the final CLOSE of
    an open stateid, so with NFSv4, the fsync in the nfsd_file_free path is
    usually a no-op and doesn't block.
    
    We have a customer running knfsd over very slow storage (XFS over Ceph
    RBD). They were using the "async" export option because performance was
    more important than data integrity for this application. That export
    option turns NFSv4 COMMIT calls into no-ops. Due to the fsync in this
    codepath however, their final CLOSE calls would still stall (since a
    CLOSE effectively became a COMMIT).
    
    I think this fsync is not strictly necessary. We only use that result to
    reset the write verifier. Instead of fsync'ing all of the data when we
    free an nfsd_file, we can just check for writeback errors when one is
    acquired and when it is freed.
    
    If the client never comes back, then it'll never see the error anyway
    and there is no point in resetting it. If an error occurs after the
    nfsd_file is removed from the cache but before the inode is evicted,
    then it will reset the write verifier on the next nfsd_file_acquire,
    (since there will be an unseen error).
    
    The only exception here is if something else opens and fsyncs the file
    during that window. Given that local applications work with this
    limitation today, I don't see that as an issue.
    
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2166658
    Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
    Reported-and-tested-by: Pierguido Lambri <plambri@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't hand out delegation on setuid files being opened for write [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jan 27 07:09:33 2023 -0500

    nfsd: don't hand out delegation on setuid files being opened for write
    
    [ Upstream commit 826b67e6376c2a788e3a62c4860dcd79500a27d5 ]
    
    We had a bug report that xfstest generic/355 was failing on NFSv4.0.
    This test sets various combinations of setuid/setgid modes and tests
    whether DIO writes will cause them to be stripped.
    
    What I found was that the server did properly strip those bits, but
    the client didn't notice because it held a delegation that was not
    recalled. The recall didn't occur because the client itself was the
    one generating the activity and we avoid recalls in that case.
    
    Clearing setuid bits is an "implicit" activity. The client didn't
    specifically request that we do that, so we need the server to issue a
    CB_RECALL, or avoid the situation entirely by not issuing a delegation.
    
    The easiest fix here is to simply not give out a delegation if the file
    is being opened for write, and the mode has the setuid and/or setgid bit
    set. Note that there is a potential race between the mode and lease
    being set, so we test for this condition both before and after setting
    the lease.
    
    This patch fixes generic/355, generic/683 and generic/684 for me. (Note
    that 355 fails only on v4.0, and 683 and 684 require NFSv4.2 to run and
    fail).
    
    Reported-by: Boyang Xue <bxue@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
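
    The condition described above can be written as a small predicate. This
    is a minimal sketch, assuming a hypothetical helper named
    deleg_blocked_by_setid() that takes the file mode and a flag for write
    access; the real patch works on nfsd's own open and file structures.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns true when a delegation should be withheld
 * because the file is opened for write and has setuid and/or setgid set.
 */
static bool deleg_blocked_by_setid(mode_t mode, bool open_for_write)
{
    return open_for_write && (mode & (S_ISUID | S_ISGID)) != 0;
}
```

    As the commit message notes, the mode can change while the lease is being
    set, so a check like this would be evaluated both before and after the
    lease is set.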

nfsd: don't ignore high bits of copy count [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Mar 18 20:03:22 2021 -0400

    nfsd: don't ignore high bits of copy count
    
    [ Upstream commit e7a833e9cc6c3b58fe94f049d2b40943cba07086 ]
    
    Note size_t is 32-bit on a 32-bit architecture, but cp_count is defined
    by the protocol to be 64 bit, so we could be turning a large copy into a
    0-length copy here.
    
    Reported-by: <radchenkoy@gmail.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
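
    The truncation can be reproduced in portable C. In this sketch the type
    size32_t and both function names are assumptions made for illustration:
    size32_t models size_t on a 32-bit architecture, and max_chunk is an
    arbitrary per-call limit rather than a value taken from nfsd.

```c
#include <stdint.h>

/* Stand-in for size_t on a 32-bit architecture (assumption). */
typedef uint32_t size32_t;

/* Buggy pattern: the 64-bit count is narrowed before it is clamped. */
static uint64_t chunk_len_truncating(uint64_t count, uint64_t max_chunk)
{
    size32_t n = (size32_t)count;   /* the high 32 bits are dropped here */

    return n < max_chunk ? n : max_chunk;
}

/* Safer pattern: clamp while still in 64 bits, narrow afterwards if needed. */
static uint64_t chunk_len_clamped(uint64_t count, uint64_t max_chunk)
{
    return count < max_chunk ? count : max_chunk;
}
```

    With a count of exactly 4 GiB (1 << 32), the first function yields a
    0-length chunk while the second yields max_chunk.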

nfsd: don't kill nfsd_files because of lease break error [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Jan 5 07:15:11 2023 -0500

    nfsd: don't kill nfsd_files because of lease break error
    
    [ Upstream commit c6593366c0bf222be9c7561354dfb921c611745e ]
    
    An error from break_lease is non-fatal, so we needn't destroy the
    nfsd_file in that case. Just put the reference like we normally would
    and return the error.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't open-code clear_and_wake_up_bit [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Jan 5 07:15:09 2023 -0500

    nfsd: don't open-code clear_and_wake_up_bit
    
    [ Upstream commit b8bea9f6cdd7236c7c2238d022145e9b2f8aac22 ]
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't replace page in rq_pages if it's a continuation of last page [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Mar 17 13:13:08 2023 -0400

    nfsd: don't replace page in rq_pages if it's a continuation of last page
    
    [ Upstream commit 27c934dd8832dd40fd34776f916dc201e18b319b ]
    
    The splice read calls nfsd_splice_actor to put the pages containing file
    data into the svc_rqst->rq_pages array. It's possible however to get a
    splice result that only has a partial page at the end, if (e.g.) the
    filesystem hands back a short read that doesn't cover the whole page.
    
    nfsd_splice_actor will plop the partial page into its rq_pages array and
    return. Then later, when nfsd_splice_actor is called again, the
    remainder of the page may end up being filled out. At this point,
    nfsd_splice_actor will put the page into the array _again_ corrupting
    the reply. If this is done enough times, rq_next_page will overrun the
    array and corrupt the trailing fields -- the rq_respages and
    rq_next_page pointers themselves.
    
    If we've already added the page to the array in the last pass, don't add
    it to the array a second time when dealing with a splice continuation.
    This was originally handled properly in nfsd_splice_actor, but commit
    91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check
    for it.
    
    Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()")
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Reported-by: Dario Lesca <d.lesca@solinos.it>
    Tested-by: David Critch <dcritch@redhat.com>
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
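
    The continuation check can be illustrated with a simplified page array.
    This is a sketch under assumptions: struct reply_pages, MAX_PAGES and
    reply_add_page() are hypothetical stand-ins for rq_pages, its bound and
    the splice actor, and plain pointers stand in for struct page.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 8

/* Hypothetical reply page array; stands in for rq_pages / rq_next_page. */
struct reply_pages {
    const void *pages[MAX_PAGES];
    size_t next;    /* index of the next free slot */
};

/*
 * Append a page unless it is a continuation of the page added last.
 * Returns 1 if a slot was consumed, 0 if the page was already present.
 */
static int reply_add_page(struct reply_pages *rp, const void *page)
{
    if (rp->next > 0 && rp->pages[rp->next - 1] == page)
        return 0;   /* rest of a partial page: do not add it again */
    assert(rp->next < MAX_PAGES);
    rp->pages[rp->next++] = page;
    return 1;
}
```

    Without the comparison against the last slot, every continuation would
    consume another slot, which is how the index could eventually run past
    the end of the array.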

nfsd: don't take fi_lock in nfsd_break_deleg_cb() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Feb 5 13:22:39 2024 +1100

    nfsd: don't take fi_lock in nfsd_break_deleg_cb()
    
    [ Upstream commit 5ea9a7c5fe4149f165f0e3b624fe08df02b6c301 ]
    
    A recent change to check_for_locks() changed it to take ->flc_lock while
    holding ->fi_lock.  This creates a lock inversion (reported by lockdep)
    because there is a case where ->fi_lock is taken while holding
    ->flc_lock.
    
    ->flc_lock is held across ->fl_lmops callbacks, and
    nfsd_break_deleg_cb() is one of those and does take ->fi_lock.  However
    it doesn't need to.
    
    Prior to v4.17-rc1~110^2~22 ("nfsd: create a separate lease for each
    delegation") nfsd_break_deleg_cb() would walk the ->fi_delegations list
    and so needed the lock.  Since then it doesn't walk the list and doesn't
    need the lock.
    
    Two actions are performed under the lock.  One is to call
    nfsd_break_one_deleg which calls nfsd4_run_cb().  These don't act on
    the nfs4_file at all, so don't need the lock.
    
    The other is to set ->fi_had_conflict which is in the nfs4_file.
    This field is only ever set here (except when initialised to false)
    so there is no possible problem with multiple threads racing when
    setting it.
    
    The field is tested twice in nfs4_set_delegation().  The first test does
    not hold a lock and is documented as an opportunistic optimisation, so
    it doesn't impose any need to hold ->fi_lock while setting
    ->fi_had_conflict.
    
    The second test in nfs4_set_delegation() *is* made under ->fi_lock, so
    removing the locking when ->fi_had_conflict is set could make a change.
    The change could only be interesting if ->fi_had_conflict tested as
    false even though nfsd_break_one_deleg() ran before ->fi_lock was
    unlocked.  i.e. while hash_delegation_locked() was running.
    As hash_delegation_locked() doesn't interact in any way with nfsd4_run_cb()
    there can be no importance to this interaction.
    
    So this patch removes the locking from nfsd_break_one_deleg() and moves
    the final test on ->fi_had_conflict out of the locked region to make it
    clear that locking isn't important to the test.  It is still tested
    *after* vfs_setlease() has succeeded.  This might be significant, and as
    vfs_setlease() takes ->flc_lock and nfsd_break_one_deleg() is called
    under ->flc_lock, this "after" is a true ordering provided by a spinlock.
    
    Fixes: edcf9725150e ("nfsd: fix RELEASE_LOCKOWNER")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: don't take/put an extra reference when putting a file [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Jan 18 12:31:37 2023 -0500

    nfsd: don't take/put an extra reference when putting a file
    
    [ Upstream commit b2ff1bd71db2a1b193a6dde0845adcd69cbcf75e ]
    
    The last thing that filp_close does is an fput, so don't bother taking
    and putting the extra reference.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: drop fh argument from alloc_init_deleg [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: drop fh argument from alloc_init_deleg
    
    [ Upstream commit bbf936edd543e7220f60f9cbd6933b916550396d ]
    
    Currently, we pass the fh of the opened file down through several
    functions so that alloc_init_deleg can pass it to delegation_blocked.
    The filehandle of the open file is available in the nfs4_file however,
    so there's no need to pass it in a separate argument.
    
    Drop the argument from alloc_init_deleg, nfs4_open_delegation and
    nfs4_set_delegation.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: drop fname and flen args from nfsd_create_locked() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Sep 6 10:42:19 2022 +1000

    NFSD: drop fname and flen args from nfsd_create_locked()
    
    [ Upstream commit 9558f9304ca1903090fa5d995a3269a8e82804b4 ]
    
    nfsd_create_locked() does not use the "fname" and "flen" arguments, so
    drop them from declaration and all callers.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: drop support for ancient filehandles [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 2 11:15:29 2021 +1000

    NFSD: drop support for ancient filehandles
    
    [ Upstream commit c645a883df34ee10b884ec921e850def54b7f461 ]
    
    Filehandles not in the "new" or "version 1" format have not been handed
    out for new mounts since Linux 2.4 which was released 20 years ago.
    I think it is safe to say that no such file handles are still in use,
    and that we can drop support for them.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: drop the nfsd_put helper [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Jan 3 08:36:52 2024 -0500

    nfsd: drop the nfsd_put helper
    
    [ Upstream commit 64e6304169f1e1f078e7f0798033f80a7fb0ea46 ]
    
    It's not safe to call nfsd_put once nfsd_last_thread has been called, as
    that function will zero out the nn->nfsd_serv pointer.
    
    Drop the nfsd_put helper altogether and open-code the svc_put in its
    callers instead. That allows us to not be reliant on the value of that
    pointer when handling an error.
    
    Fixes: 2a501f55cd64 ("nfsd: call nfsd_last_thread() before final nfsd_put()")
    Reported-by: Zhi Li <yieli@redhat.com>
    Cc: NeilBrown <neilb@suse.de>
    Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Drop TRACE_DEFINE_ENUM for NFSD4_CB_<state> macros [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:56:37 2021 -0400

    NFSD: Drop TRACE_DEFINE_ENUM for NFSD4_CB_<state> macros
    
    [ Upstream commit 167145cc64ce4b4b177e636829909a6b14004f9e ]
    
    TRACE_DEFINE_ENUM() is necessary for enum {} but not for C macros.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: eliminate the NFSD_FILE_BREAK_* flags [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jul 29 17:01:07 2022 -0400

    nfsd: eliminate the NFSD_FILE_BREAK_* flags
    
    [ Upstream commit 23ba98de6dcec665e15c0ca19244379bb0d30932 ]
    
    We had a report from the spring Bake-a-thon of data corruption in some
    nfstest_interop tests. Looking at the traces showed the NFS server
    allowing a v3 WRITE to proceed while a read delegation was still
    outstanding.
    
    Currently, we only set NFSD_FILE_BREAK_* flags if
    NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc.
    NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for
    COMMIT ops, where we need a writeable filehandle but don't need to
    break read leases.
    
    It doesn't make any sense to consult that flag when allocating a file
    since the file may be used on subsequent calls where we do want to break
    the lease (and the usage of it here seems to be reverse from what it
    should be anyway).
    
    Also, after calling nfsd_open_break_lease, we don't want to clear the
    BREAK_* bits. A lease could end up being set on it later (more than
    once) and we need to be able to break those leases as well.
    
    This means that the NFSD_FILE_BREAK_* flags now just mirror
    NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just
    drop those flags and unconditionally call nfsd_open_break_lease every
    time.
    
    Reported-by: Olga Kornieskaia <kolga@netapp.com>
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360
    Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd)
    Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro
    Cc: <stable@vger.kernel.org> # 5.4.x
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: enhance inter-server copy cleanup [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Sun Dec 18 16:55:53 2022 -0800

    NFSD: enhance inter-server copy cleanup
    
    [ Upstream commit df24ac7a2e3a9d0bc68f1756a880e50bfe4b4522 ]
    
    Currently nfsd4_setup_inter_ssc returns the vfsmount of the source
    server's export when the mount completes. After the copy is done
    nfsd4_cleanup_inter_ssc is called with the vfsmount of the source
    server and it searches nfsd_ssc_mount_list for a matching entry
    to do the clean up.
    
    The problems with this approach are (1) the need to search the
    nfsd_ssc_mount_list and (2) the code has to handle the case where
    the matching entry is not found which looks ugly.
    
    The enhancement is that nfsd4_setup_inter_ssc, instead of returning
    the vfsmount, returns the nfsd4_ssc_umount_item, which has the
    vfsmount embedded in it. When nfsd4_cleanup_inter_ssc is called,
    it is passed the nfsd4_ssc_umount_item directly to do the
    clean up, so no searching is needed and there is no need to handle
    the 'not found' case.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    [ cel: adjusted whitespace and variable/function names ]
    Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Enhance the nfsd_cb_setup tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:02 2021 -0400

    NFSD: Enhance the nfsd_cb_setup tracepoint
    
    [ Upstream commit 9f57c6062bf3ce2c6ab9ba60040b34e8134ef259 ]
    
    Display the transport protocol and authentication flavor so admins
    can see what they might be getting wrong.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Ensure nf_inode is never dereferenced [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:27:09 2022 -0400

    NFSD: Ensure nf_inode is never dereferenced
    
    [ Upstream commit 427f5f83a3191cbf024c5aea6e5b601cdf88d895 ]
    
    The documenting comment for struct nf_file states:
    
    /*
     * A representation of a file that has been opened by knfsd. These are hashed
     * in the hashtable by inode pointer value. Note that this object doesn't
     * hold a reference to the inode by itself, so the nf_inode pointer should
     * never be dereferenced, only used for comparison.
     */
    
    Replace the two existing dereferences to make the comment always
    true.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: extra checks when freeing delegation stateids [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Sep 26 14:41:02 2022 -0400

    nfsd: extra checks when freeing delegation stateids
    
    [ Upstream commit 895ddf5ed4c54ea9e3533606d7a8b4e4f27f95ef ]
    
    We've had some reports of problems in the refcounting for delegation
    stateids that we've yet to track down. Add some extra checks to ensure
    that we've removed the object from various lists before freeing it.
    
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Extract the svcxdr_init_encode() helper [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 27 15:53:42 2020 -0400

    NFSD: Extract the svcxdr_init_encode() helper
    
    [ Upstream commit bddfdbcddbe267519cd36aeb115fdf8620980111 ]
    
    NFSD initializes an encode xdr_stream only after the RPC layer has
    already inserted the RPC Reply header. Thus it behaves differently
    than xdr_init_encode does, which assumes the passed-in xdr_buf is
    entirely devoid of content.
    
    nfs4proc.c has this server-side stream initialization helper, but
    it is visible only to the NFSv4 code. Move this helper to a place
    that can be accessed by NFSv2 and NFSv3 server XDR functions.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: find_cpntf_state cleanup [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:42 2021 -0500

    nfsd: find_cpntf_state cleanup
    
    [ Upstream commit 47fdb22dacae78f37701d82a94c16a014186d34e ]
    
    I think this unusual use of struct compound_state could cause confusion.
    
    It's not that much more complicated just to open-code this stateid
    lookup.
    
    The only change in behavior should be a different error return in the
    case the copy is using a source stateid that is a revoked delegation,
    but I doubt that matters.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: squashed in fix reported by Coverity ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Finish converting the NFSv2 GETACL result encoder [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Oct 16 11:47:02 2022 -0400

    NFSD: Finish converting the NFSv2 GETACL result encoder
    
    [ Upstream commit ea5021e911d3479346a75ac9b7d9dcd751b0fb99 ]
    
    The xdr_stream conversion inadvertently left some code that set the
    page_len of the send buffer. The XDR stream encoders should handle
    this automatically now.
    
    This oversight adds garbage past the end of the Reply message.
    Clients typically ignore the garbage, but NFSD does not need to send
    it, as it leaks stale memory contents onto the wire.
    
    Fixes: f8cba47344f7 ("NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Finish converting the NFSv3 GETACL result encoder [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Oct 16 11:47:08 2022 -0400

    NFSD: Finish converting the NFSv3 GETACL result encoder
    
    [ Upstream commit 841fd0a3cb490eae5dfd262eccb8c8b11d57f8b8 ]
    
    For some reason, the NFSv2 GETACL result encoder was fully converted
    to use the new nfs_stream_encode_acl(), but the NFSv3 equivalent was
    not similarly converted.
    
    Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix a regression in nfsd_setattr() [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Thu Feb 15 20:24:50 2024 -0500

    nfsd: Fix a regression in nfsd_setattr()
    
    [ Upstream commit 6412e44c40aaf8f1d7320b2099c5bdd6cb9126ac ]
    
    Commit bb4d53d66e4b ("NFSD: use (un)lock_inode instead of
    fh_(un)lock for file operations") broke the NFSv3 pre/post op
    attributes behaviour when doing a SETATTR rpc call by stripping out
    the calls to fh_fill_pre_attrs() and fh_fill_post_attrs().
    
    Fixes: bb4d53d66e4b ("NFSD: use (un)lock_inode instead of fh_(un)lock for file operations")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Message-ID: <20240216012451.22725-1-trondmy@kernel.org>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix a warning for nfsd_file_close_inode [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Thu Sep 30 15:44:42 2021 -0400

    nfsd: Fix a warning for nfsd_file_close_inode
    
    [ Upstream commit 19598141f40dff728dd50799e510805261f48850 ]
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix a write performance regression [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Thu Mar 31 09:54:01 2022 -0400

    nfsd: Fix a write performance regression
    
    [ Upstream commit 6b8a94332ee4f7d9a8ae0cbac7609f79c212f06c ]
    
    The call to filemap_flush() in nfsd_file_put() is there to ensure that
    we clear out any writes belonging to a NFSv3 client relatively quickly
    and avoid situations where the file can't be evicted by the garbage
    collector. It also ensures that we detect write errors quickly.
    
    The problem is this causes a regression in performance for some
    workloads.
    
    So try to improve matters by deferring writeback until we're ready to
    close the file, and need to detect errors so that we can force the
    client to resend.
    
    Tested-by: Jan Kara <jack@suse.cz>
    Fixes: b6669305d35a ("nfsd: Reduce the number of calls to nfsd_file_gc()")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Link: https://lore.kernel.org/all/20220330103457.r4xrhy2d6nhtouzk@quack3.lan
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix comments about spinlock handling with delegations [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Sep 26 12:38:45 2022 -0400

    nfsd: fix comments about spinlock handling with delegations
    
    [ Upstream commit 25fbe1fca14142beae6c882f7906510363d42bff ]
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Feb 3 13:18:34 2023 -0500

    nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
    
    [ Upstream commit dcd779dc46540e174a6ac8d52fbed23593407317 ]
    
    The nested if statements here make no sense, as you can never reach
    the "else" branch in the nested statement. Fix the error handling for
    when there is a courtesy client that holds a conflicting deny mode.
    
    Fixes: 3d6942715180 ("NFSD: add support for share reservation conflict to courteous server")
    Reported-by: 張智諺 <cc85nod@gmail.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix crash on COPY_NOTIFY with special stateid [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Wed Jan 5 14:15:03 2022 -0500

    nfsd: fix crash on COPY_NOTIFY with special stateid
    
    [ Upstream commit 074b07d94e0bb6ddce5690a9b7e2373088e8b33a ]
    
    RTM says "If the special ONE stateid is passed to
    nfs4_preprocess_stateid_op(), it returns status=0 but does not set
    *cstid. nfsd4_copy_notify() depends on stid being set if status=0, and
    thus can crash if the client sends the right COPY_NOTIFY RPC."
    
    RFC 7862 says "The cna_src_stateid MUST refer to either open or locking
    states provided earlier by the server.  If it is invalid, then the
    operation MUST fail."
    
    The RFC doesn't specify an error, and the choice doesn't matter much as
    this is clearly illegal client behavior, but bad_stateid seems
    reasonable.
    
    Simplest is just to guarantee that nfs4_preprocess_stateid_op, called
    with non-NULL cstid, errors out if it can't return a stateid.
    
    Reported-by: rtm@csail.mit.edu
    Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation")
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
    Tested-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix creation time serialization order [+ + +]

Author: Tavian Barnes <tavianator@tavianator.com>
Date:   Fri Jun 23 17:09:06 2023 -0400

    nfsd: Fix creation time serialization order
    
    [ Upstream commit d7dbed457c2ef83709a2a2723a2d58de43623449 ]
    
    In nfsd4_encode_fattr(), TIME_CREATE was being written out after all
    other times.  However, they should be written out in an order that
    matches the bit flags in bmval1, which in this case are
    
        #define FATTR4_WORD1_TIME_ACCESS        (1UL << 15)
        #define FATTR4_WORD1_TIME_CREATE        (1UL << 18)
        #define FATTR4_WORD1_TIME_DELTA         (1UL << 19)
        #define FATTR4_WORD1_TIME_METADATA      (1UL << 20)
        #define FATTR4_WORD1_TIME_MODIFY        (1UL << 21)
    
    so TIME_CREATE should come second.
    
    I noticed this on a FreeBSD NFSv4.2 client, which supports creation
    times.  On this client, file times were weirdly permuted.  With this
    patch applied on the server, times looked normal on the client.
    
    Fixes: e377a3e698fb ("nfsd: Add support for the birth time attribute")
    Link: https://unix.stackexchange.com/q/749605/56202
    Signed-off-by: Tavian Barnes <tavianator@tavianator.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
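
    The ordering rule can be shown with a short loop over the bitmap word.
    This is a sketch only: the bit definitions are copied from the commit
    message, while encode_order() is a hypothetical helper that records bit
    numbers instead of encoding real attribute values.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions copied from the commit message above. */
#define FATTR4_WORD1_TIME_ACCESS        (1UL << 15)
#define FATTR4_WORD1_TIME_CREATE        (1UL << 18)
#define FATTR4_WORD1_TIME_DELTA         (1UL << 19)
#define FATTR4_WORD1_TIME_METADATA      (1UL << 20)
#define FATTR4_WORD1_TIME_MODIFY        (1UL << 21)

/*
 * Write the bit number of every attribute set in bmval1 into out[],
 * lowest bit first, and return how many entries were written.
 */
static size_t encode_order(uint32_t bmval1, unsigned int *out, size_t cap)
{
    size_t n = 0;

    for (unsigned int bit = 0; bit < 32 && n < cap; bit++) {
        if (bmval1 & (UINT32_C(1) << bit))
            out[n++] = bit;
    }
    return n;
}
```

    For a mask containing TIME_ACCESS, TIME_CREATE and TIME_MODIFY the
    resulting order is bits 15, 18, 21, so the creation time is emitted
    second, matching the order a client expects when decoding.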

nfsd: fix double fget() bug in __write_ports_addfd() [+ + +]

Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Mon May 29 14:35:55 2023 +0300

    nfsd: fix double fget() bug in __write_ports_addfd()
    
    [ Upstream commit c034203b6a9dae6751ef4371c18cb77983e30c28 ]
    
    The bug here is that you cannot rely on getting the same socket
    from multiple calls to fget() because userspace can influence
    that.  This is a kind of double fetch bug.
    
    The fix is to delete the svc_alien_sock() function and instead do
    the checking inside the svc_addsock() function.
    
    Fixes: 3064639423c4 ("nfsd: check passed socket's net matches NFSd superblock's one")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: NeilBrown <neilb@suse.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix error return code in nfsd4_interssc_connect() [+ + +]

Author: Wei Yongjun <weiyongjun1@huawei.com>
Date:   Fri Jun 4 10:12:37 2021 +0000

    NFSD: Fix error return code in nfsd4_interssc_connect()
    
    [ Upstream commit 54185267e1fe476875e649bb18e1c4254c123305 ]
    
    'status' is overwritten with 0 after nfsd4_ssc_setup_dul(), which causes
    0 to be returned when vfs_kern_mount() fails. Fix this by returning
    nfserr_nodev in that error case.
    
    Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.")
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix error return code in nfsd_file_cache_init() [+ + +]

Author: Huang Guobin <huangguobin4@huawei.com>
Date:   Wed Nov 25 03:39:33 2020 -0500

    nfsd: Fix error return code in nfsd_file_cache_init()
    
    [ Upstream commit 231307df246eb29f30092836524ebb1fcb8f5b25 ]
    
    Fix to return PTR_ERR() error code from the error handling case instead of
    0 in function nfsd_file_cache_init(), as done elsewhere in this function.
    
    Fixes: 65294c1f2c5e7 ("nfsd: add a new struct file caching facility to nfsd")
    Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix exposure in nfsd4_decode_bitmap() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 14 15:16:04 2021 -0500

    NFSD: Fix exposure in nfsd4_decode_bitmap()
    
    [ Upstream commit c0019b7db1d7ac62c711cda6b357a659d46428fe ]
    
    rtm@csail.mit.edu reports:
    > nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC
    > directs it to do so. This can cause nfsd4_decode_state_protect4_a()
    > to write client-supplied data beyond the end of
    > nfsd4_exchange_id.spo_must_allow[] when called by
    > nfsd4_decode_exchange_id().
    
    Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond
    @bmlen.
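
    A minimal sketch of a bounded decode loop, assuming the client-supplied
    words are already available in an array.  decode_bitmap() and its
    parameters are hypothetical and are not the kernel's XDR helpers; the
    point is only that the store index never reaches bmlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical decoder: words[] holds `count` bitmap words sent by the
 * client.  At most `bmlen` of them are stored in bmval[]; any extra
 * words are ignored, and words the client did not send are zeroed.
 */
void decode_bitmap(const uint32_t *words, size_t count,
		   uint32_t *bmval, size_t bmlen)
{
	size_t i;

	assert(bmval != NULL || bmlen == 0);
	for (i = 0; i < bmlen; i++)	/* loop bound is bmlen, not count */
		bmval[i] = (i < count) ? words[i] : 0;
}
```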
    
    Reported-by: rtm@csail.mit.edu
    Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix fall-through warnings for Clang [+ + +]

Author: Gustavo A. R. Silva <gustavoars@kernel.org>
Date:   Fri Nov 20 12:26:40 2020 -0600

    nfsd: Fix fall-through warnings for Clang
    
    [ Upstream commit 76c50eb70d8e1133eaada0013845619c36345fbc ]
    
    In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
    warnings by explicitly adding a couple of break statements instead of
    just letting the code fall through to the next case.
    
    Link: https://github.com/KSPP/linux/issues/115
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix handling of cached open files in nfsd4_open codepath [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Jan 5 14:55:56 2023 -0500

    nfsd: fix handling of cached open files in nfsd4_open codepath
    
    [ Upstream commit 0b3a551fa58b4da941efeb209b3770868e2eddd7 ]
    
    Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a
    regular NFSv4 file") added the ability to cache an open fd over a
    compound. There are a couple of problems with the way this currently
    works:
    
    It's racy, as a newly-created nfsd_file can end up with its PENDING bit
    cleared while the nf is hashed, and the nf_file pointer is still zeroed
    out. Other tasks can find it in this state and they expect to see a
    valid nf_file, and can oops if nf_file is NULL.
    
    Also, there is no guarantee that we'll end up creating a new nfsd_file
    if one is already in the hash. If an extant entry is in the hash with a
    valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with
    the value of op_file and the old nf_file will leak.
    
    Fix both issues by making a new nfsd_file_acquire_opened variant that
    takes an optional file pointer. If one is present when this is called,
    we'll take a new reference to it instead of trying to open the file. If
    the nfsd_file already has a valid nf_file, we'll just ignore the
    optional file and pass the nfsd_file back as-is.
    
    Also rework the tracepoints a bit to allow for an "opened" variant and
    don't try to avoid counting acquisitions in the case where we already
    have a cached open file.
    
    Fixes: fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file")
    Cc: Trond Myklebust <trondmy@hammerspace.com>
    Reported-by: Stanislav Saner <ssaner@redhat.com>
    Reported-and-Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>
    Reported-and-Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix handling of oversized NFSv4 COMPOUND requests [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 5 15:33:32 2022 -0400

    NFSD: Fix handling of oversized NFSv4 COMPOUND requests
    
    [ Upstream commit 7518a3dc5ea249d4112156ce71b8b184eb786151 ]
    
    If an NFS server returns NFS4ERR_RESOURCE on the first operation in
    an NFSv4 COMPOUND, there's no way for a client to know where the
    problem is and then simplify the compound to make forward progress.
    
    So instead, make NFSD process as many operations in an oversized
    COMPOUND as it can and then return NFS4ERR_RESOURCE on the first
    operation it did not process.
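
    A minimal sketch of that behavior, under the assumption that the server
    has a fixed per-COMPOUND operation limit.  run_compound(), the result
    struct and the ST_* status names are hypothetical stand-ins, not NFSD
    symbols.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for NFS4_OK / NFS4ERR_RESOURCE. */
enum sketch_status { ST_OK, ST_RESOURCE };

struct compound_result {
	size_t processed;		/* operations actually executed */
	enum sketch_status status;	/* status of the last operation */
};

/*
 * Execute operations in order until the limit is reached, then fail
 * the first operation that was not processed instead of rejecting the
 * whole COMPOUND up front.
 */
struct compound_result run_compound(size_t opcnt, size_t max_ops)
{
	struct compound_result res = { 0, ST_OK };
	size_t i;

	for (i = 0; i < opcnt; i++) {
		if (i >= max_ops) {
			res.status = ST_RESOURCE;
			break;
		}
		res.processed++;	/* stand-in for running operation i */
	}
	return res;
}
```

    A client that sees ST_RESOURCE on operation N knows that operations
    before N completed and can resend the remainder in a smaller COMPOUND.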
    
    pynfs NFSv4.0 COMP6 exercises this case, but checks only for the
    COMPOUND status code, not whether the server has processed any
    of the operations.
    
    pynfs NFSv4.1 SEQ6 and SEQ7 exercise the NFSv4.1 case, which detects
    too many operations per COMPOUND by checking against the limits
    negotiated when the session was created.
    
    Suggested-by: Bruce Fields <bfields@fieldses.org>
    Fixes: 0078117c6d91 ("nfsd: return RESOURCE not GARBAGE_ARGS on too many ops")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix ia_size underflow [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jan 31 13:01:53 2022 -0500

    NFSD: Fix ia_size underflow
    
    [ Upstream commit e6faac3f58c7c4176b66f63def17a34232a17b0e ]
    
    iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
    NFSv4 both define file size as an unsigned 64-bit type. Thus there
    is a range of valid file size values an NFS client can send that is
    already larger than Linux can handle.
    
    Currently decode_fattr4() dumps a full u64 value into ia_size. If
    that value happens to be larger than S64_MAX, then ia_size
    underflows. I'm about to fix up the NFSv3 behavior as well, so let's
    catch the underflow in the common code path: nfsd_setattr().
    
    Cc: stable@vger.kernel.org
    [ cel: context adjusted, 2f221d6f7b88 has not been applied ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix inconsistent indenting [+ + +]

Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Date:   Thu Dec 2 16:35:42 2021 +0800

    NFSD: Fix inconsistent indenting
    
    [ Upstream commit 1e37d0e5bda45881eea1bec4b812def72c7d4aea ]
    
    Eliminate the following smatch warning:
    
    fs/nfsd/nfs4xdr.c:4766 nfsd4_encode_read_plus_hole() warn: inconsistent
    indenting.
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix kernel test robot warning in SSC code [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Thu Jun 3 20:02:26 2021 -0400

    nfsd: fix kernel test robot warning in SSC code
    
    [ Upstream commit f47dc2d3013c65631bf8903becc7d88dc9d9966e ]
    
    Fix by initializing the nfsd4_ssc_umount_item pointer with NULL instead of 0.
    Change the return type of nfsd4_ssc_setup_dul() from int to __be32.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: fix leaked reference count of nfsd4_ssc_umount_item [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon Jan 23 21:34:13 2023 -0800

    NFSD: fix leaked reference count of nfsd4_ssc_umount_item
    
    [ Upstream commit 34e8f9ec4c9ac235f917747b23a200a5e0ec857b ]
    
    The reference count of nfsd4_ssc_umount_item is not decremented
    on error conditions. This prevents the laundromat from unmounting
    the vfsmount of the source file.
    
    This patch decrements the reference count of nfsd4_ssc_umount_item
    on error.
    
    Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.")
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix licensing header in filecache.c [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 31 09:53:26 2022 -0400

    NFSD: Fix licensing header in filecache.c
    
    [ Upstream commit 3f054211b29c0fa06dfdcab402c795fd7e906be1 ]
    
    Add a missing SPDX header.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix net-namespace logic in __nfsd_file_cache_purge [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Oct 31 11:49:21 2022 -0400

    nfsd: fix net-namespace logic in __nfsd_file_cache_purge
    
    [ Upstream commit d3aefd2b29ff5ffdeb5c06a7d3191a027a18cdb8 ]
    
    If the namespace doesn't match the one in "net", then we'll continue,
    but that doesn't cause another rhashtable_walk_next call, so it will
    loop infinitely.
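
    A minimal sketch of the loop shape that avoids this kind of bug, using
    a plain array instead of an rhashtable walk.  struct entry and
    count_matching() are hypothetical; the relevant detail is that the
    cursor advances before any `continue` can be taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry tagged with the namespace it belongs to. */
struct entry {
	int net_id;
};

/* Count the entries that belong to net_id, skipping all others. */
size_t count_matching(const struct entry *tbl, size_t n, int net_id)
{
	size_t i = 0, hits = 0;

	assert(tbl != NULL || n == 0);
	while (i < n) {
		const struct entry *e = &tbl[i++];	/* always advance */

		if (e->net_id != net_id)
			continue;	/* safe: cursor already moved on */
		hits++;
	}
	return hits;
}
```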
    
    Fixes: ce502f81ba88 ("NFSD: Convert the filecache to use rhashtable")
    Reported-by: Petr Vorel <pvorel@suse.cz>
    Link: https://lore.kernel.org/ltp/Y1%2FP8gDAcWC%2F+VR3@pevik/
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix nfsd_file_unhash_and_dispose [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Sep 30 16:56:02 2022 -0400

    nfsd: fix nfsd_file_unhash_and_dispose
    
    [ Upstream commit 8d0d254b15cc5b7d46d85fb7ab8ecede9575e672 ]
    
    nfsd_file_unhash_and_dispose() is called for two reasons:
    
    We're either shutting down and purging the filecache, or we've gotten a
    notification about a file delete, so we want to go ahead and unhash it
    so that it'll get cleaned up when we close.
    
    We're either walking the hashtable or doing a lookup in it and we
    don't take a reference in either case. What we want to do in both cases
    is to try and unhash the object and put it on the dispose list if that
    was successful. If it's no longer hashed, then we don't want to touch
    it, with the assumption being that something else is already cleaning
    up the sentinel reference.
    
    Instead of trying to selectively decrement the refcount in this
    function, just unhash it, and if that was successful, move it to the
    dispose list. Then, the disposal routine will just clean that up as
    usual.
    
    Also, just make this a void function, drop the WARN_ON_ONCE, and the
    comments about deadlocking since the nature of the purported deadlock
    is no longer clear.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jan 25 15:59:57 2022 -0500

    NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
    
    [ Upstream commit a648fdeb7c0e17177a2280344d015dba3fbe3314 ]
    
    iattr::ia_size is a loff_t, so these NFSv3 procedures must be
    careful to deal with incoming client size values that are larger
    than S64_MAX without corrupting the value.
    
    Silently capping the value results in storing a different value
    than the client passed in which is unexpected behavior, so remove
    the min_t() check in decode_sattr3().
    
    Note that RFC 1813 permits only the WRITE procedure to return
    NFS3ERR_FBIG. We believe that NFSv3 reference implementations
    also return NFS3ERR_FBIG when ia_size is too large.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix NULL dereference in nfs3svc_encode_getaclres [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jul 1 20:06:56 2021 -0400

    nfsd: fix NULL dereference in nfs3svc_encode_getaclres
    
    [ Upstream commit ab1016d39cc052064e32f25ad18ef8767a0ee3b8 ]
    
    In error cases the dentry may be NULL.
    
    Before 20798dfe249a, the encoder also checked dentry and
    d_really_is_positive(dentry), but that looks like overkill to me--zero
    status should be enough to guarantee a positive dentry.
    
    This isn't the first time we've seen an error-case NULL dereference
    hidden in the initialization of a local variable in an xdr encoder.  But
    I went back through the other recent rewrites and didn't spot any
    similar bugs.
    
    Reported-by: JianHong Yin <jiyin@redhat.com>
    Reviewed-by: Chuck Lever III <chuck.lever@oracle.com>
    Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder...")
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix null-ptr-deref in nfsd_fill_super() [+ + +]

Author: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Date:   Sat May 21 12:08:45 2022 +0800

    nfsd: Fix null-ptr-deref in nfsd_fill_super()
    
    [ Upstream commit 6f6f84aa215f7b6665ccbb937db50860f9ec2989 ]
    
    KASAN reports a null-ptr-deref as follows:
    
      BUG: KASAN: null-ptr-deref in nfsd_fill_super+0xc6/0xe0 [nfsd]
      Write of size 8 at addr 000000000000005d by task a.out/852
    
      CPU: 7 PID: 852 Comm: a.out Not tainted 5.18.0-rc7-dirty #66
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       kasan_report+0xab/0x120
       ? nfsd_mkdir+0x71/0x1c0 [nfsd]
       ? nfsd_fill_super+0xc6/0xe0 [nfsd]
       nfsd_fill_super+0xc6/0xe0 [nfsd]
       ? nfsd_mkdir+0x1c0/0x1c0 [nfsd]
       get_tree_keyed+0x8e/0x100
       vfs_get_tree+0x41/0xf0
       __do_sys_fsconfig+0x590/0x670
       ? fscontext_read+0x180/0x180
       ? anon_inode_getfd+0x4f/0x70
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    This can be reproduced by concurrent operations:
            1. fsopen(nfsd)/fsconfig
            2. insmod/rmmod nfsd
    
    Since the nfsd file system is registered before nfsd_net is allocated,
    a caller may get the file_system_type and use nfsd_net before it is
    allocated, causing the null-ptr-deref.
    
    So init_nfsd() should call register_filesystem() last.
    
    Fixes: bd5ae9288d64 ("nfsd: register pernet ops last, unregister first")
    Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: fix possible oops when nfsd/pool_stats is closed. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Sep 12 11:25:00 2023 +1000

    NFSD: fix possible oops when nfsd/pool_stats is closed.
    
    [ Upstream commit 88956eabfdea7d01d550535af120d4ef265b1d02 ]
    
    If /proc/fs/nfsd/pool_stats is open when the last nfsd thread exits, then
    when the file is closed a NULL pointer is dereferenced.
    This is because nfsd_pool_stats_release() assumes that the
    pointer to the svc_serv cannot become NULL while a reference is held.
    
    This used to be the case but a recent patch split nfsd_last_thread() out
    from nfsd_put(), and clearing the pointer is done in nfsd_last_thread().
    
    This is easily reproduced by running
       rpc.nfsd 8 ; ( rpc.nfsd 0;true) < /proc/fs/nfsd/pool_stats
    
    Fortunately nfsd_pool_stats_release() has easy access to the svc_serv
    pointer, and so can call svc_put() on it directly.
    
    Fixes: 9f28a971ee9f ("nfsd: separate nfsd_last_thread() from nfsd_put()")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix potential use-after-free in nfsd_file_put() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue May 31 19:49:01 2022 -0400

    NFSD: Fix potential use-after-free in nfsd_file_put()
    
    [ Upstream commit b6c71c66b0ad8f2b59d9bc08c7a5079b110bec01 ]
    
    nfsd_file_put_noref() can free @nf, so don't dereference @nf
    immediately upon return from nfsd_file_put_noref().
    
    Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
    Fixes: 999397926ab3 ("nfsd: Clean up nfsd_file_put()")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Apr 19 10:53:18 2023 -0700

    NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop
    
    [ Upstream commit 147abcacee33781e75588869e944ddb07528a897 ]
    
    The following request sequence to the same file causes the NFS client and
    server to get into an infinite loop with COMMIT and NFS4ERR_DELAY:
    
    OPEN
    REMOVE
    WRITE
    COMMIT
    
    Problem reported by recall11, recall12, recall14, recall20, recall22,
    recall40, recall42, recall48, recall50 of nfstest suite.
    
    This patch restores the handling of the race between nfsd_file_do_acquire
    and unlink to what it was before the regression.
    
    Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: fix problems with cleanup on errors in nfsd4_copy [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Tue Jan 31 11:12:29 2023 -0800

    NFSD: fix problems with cleanup on errors in nfsd4_copy
    
    [ Upstream commit 81e722978ad21072470b73d8f6a50ad62c7d5b7d ]
    
    When nfsd4_copy fails to allocate memory for async_copy->cp_src, or
    nfs4_init_copy_state fails, it calls cleanup_async_copy to do the
    cleanup for the async_copy which causes page fault since async_copy
    is not yet initialized.
    
    This patch rearranges the order of initializing the fields in
    async_copy and adds checks in cleanup_async_copy to skip un-initialized
    fields.
    
    Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
    Fixes: 87689df69491 ("NFSD: Shrink size of struct nfsd4_copy")
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix READDIR buffer overflow [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Dec 16 11:12:11 2021 -0500

    NFSD: Fix READDIR buffer overflow
    
    [ Upstream commit 53b1119a6e5028b125f431a0116ba73510d82a72 ]
    
    If a client sends a READDIR count argument that is too small (say,
    zero), then the buffer size calculation in the new init_dirlist
    helper functions results in an underflow, allowing the XDR stream
    functions to write beyond the actual buffer.
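
    A minimal sketch of the arithmetic, assuming the reply reserves a fixed
    number of bytes before directory entries are encoded.  REPLY_OVERHEAD
    and both function names are hypothetical; the clamped variant shows one
    way to avoid the wrap and is not necessarily the fix applied in NFSD.

```c
#include <stdint.h>

/* Hypothetical fixed overhead reserved in the reply, in bytes. */
#define REPLY_OVERHEAD 16u

/* Unchecked: wraps to a huge value when count < REPLY_OVERHEAD. */
uint32_t dirlist_buflen_unchecked(uint32_t count)
{
	return count - REPLY_OVERHEAD;
}

/* Checked: a too-small count yields an empty buffer instead. */
uint32_t dirlist_buflen(uint32_t count)
{
	return count > REPLY_OVERHEAD ? count - REPLY_OVERHEAD : 0;
}
```

    With a count of zero the unchecked version returns a length close to
    4 GiB, which is what lets later pointer arithmetic run past the real
    buffer.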
    
    This calculation has always been suspect. NFSD has never sanity-
    checked the READDIR count argument, but the old entry encoders
    managed the problem correctly.
    
    With the commits below, entry encoding changed, exposing the
    underflow to the pointer arithmetic in xdr_reserve_space().
    
    Modern NFS clients attempt to retrieve as much data as possible
    for each READDIR request. Also, we have no unit tests that
    exercise the behavior of READDIR at the lower bound of @count
    values. Thus this case was missed during testing.
    
    Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
    Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream")
    Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix reads with a non-zero offset that don't end on a page boundary [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 23 14:14:32 2022 -0500

    NFSD: Fix reads with a non-zero offset that don't end on a page boundary
    
    [ Upstream commit ac8db824ead0de2e9111337c401409d010fba2f0 ]
    
    This was found when virtual machines with nfs-mounted qcow2 disks
    failed to boot properly.
    
    Reported-by: Anders Blomdell <anders.blomdell@control.lth.se>
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142132
    Fixes: bfbfb6182ad1 ("nfsd_splice_actor(): handle compound pages")
    [ cel: "‘for’ loop initial declarations are only allowed in C99 or C11 mode" ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: fix regression with setting ACLs. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 8 12:08:40 2022 +1000

    NFSD: fix regression with setting ACLs.
    
    [ Upstream commit 00801cd92d91e94aa04d687f9bb9a9104e7c3d46 ]
    
    A recent patch moved ACL setting into nfsd_setattr().
    Unfortunately it didn't work as nfsd_setattr() aborts early if
    iap->ia_valid is 0.
    
    Remove this test, and instead avoid calling notify_change() when
    ia_valid is 0.
    
    This means that nfsd_setattr() will now *always* lock the inode.
    Previously it didn't if only an ATTR_MODE change was requested on a
    symlink (see Commit 15b7a1b86d66 ("[PATCH] knfsd: fix setattr-on-symlink
    error return")). I don't think this change really matters.
    
    Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix RELEASE_LOCKOWNER [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 22 14:58:16 2024 +1100

    nfsd: fix RELEASE_LOCKOWNER
    
    [ Upstream commit edcf9725150e42beeca42d085149f4c88fa97afd ]
    
    The test on so_count in nfsd4_release_lockowner() is nonsense and
    harmful.  Revert to using check_for_locks(), changing that to not sleep.
    
    First: harmful.
    As is documented in the kdoc comment for nfsd4_release_lockowner(), the
    test on so_count can transiently return a false positive resulting in a
    return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
    clearly a protocol violation and with the Linux NFS client it can cause
    incorrect behaviour.
    
    If RELEASE_LOCKOWNER is sent while some other thread is still
    processing a LOCK request which failed because, at the time that request
    was received, the given owner held a conflicting lock, then the nfsd
    thread processing that LOCK request can hold a reference (conflock) to
    the lock owner that causes nfsd4_release_lockowner() to return an
    incorrect error.
    
    The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
    never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
    it knows that the error is impossible.  It assumes the lock owner was in
    fact released so it feels free to use the same lock owner identifier in
    some later locking request.
    
    When it does reuse a lock owner identifier for which a previous RELEASE
    failed, it will naturally use a lock_seqid of zero.  However the server,
    which didn't release the lock owner, will expect a larger lock_seqid and
    so will respond with NFS4ERR_BAD_SEQID.
    
    So clearly it is harmful to allow a false positive, which testing
    so_count allows.
    
    The test is nonsense because ... well... it doesn't mean anything.
    
    so_count is the sum of three different counts.
    1/ the set of states listed on so_stateids
    2/ the set of active vfs locks owned by any of those states
    3/ various transient counts such as for conflicting locks.
    
    When it is tested against '2' it is clear that one of these is the
    transient reference obtained by find_lockowner_str_locked().  It is not
    clear what the other one is expected to be.
    
    In practice, the count is often 2 because there is precisely one state
    on so_stateids.  If there were more, this would fail.
    
    In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
    In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
    all the lock states being removed, and so the lockowner being discarded
    (it is removed when there are no more references which usually happens
    when the lock state is discarded).  When nfsd4_release_lockowner() finds
    that the lock owner doesn't exist, it returns success.
    
    The other case shows an so_count of '2' and precisely one state listed
    in so_stateids.  It appears that the Linux client uses a separate lock
    owner for each file resulting in one lock state per lock owner, so this
    test on '2' is safe.  For another client it might not be safe.
    
    So this patch changes check_for_locks() to use the (newish)
    find_any_file_locked() so that it doesn't take a reference on the
    nfs4_file and so never calls nfsd_file_put(), and so never sleeps.  With
    this check it is safe to restore the use of check_for_locks() rather
    than testing so_count against the mysterious '2'.
    
    Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Cc: stable@vger.kernel.org # v6.2+
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix returned READDIR offset cookie [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 10 10:24:39 2020 -0500

    NFSD: Fix returned READDIR offset cookie
    
    [ Upstream commit 0a8f37fb34a96267c656f7254e69bb9a2fc89fe4 ]
    
    Code inspection shows that the server's NFSv3 READDIR implementation
    handles offset cookies slightly differently than the NFSv2 READDIR,
    NFSv3 READDIRPLUS, and NFSv4 READDIR implementations,
    and there doesn't seem to be any need for this difference.
    
    As a clean up, I copied the logic from nfsd3_proc_readdirplus().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix space and spelling mistake [+ + +]

Author: Zhang Jiaming <jiaming@nfschina.com>
Date:   Thu Jun 23 16:20:05 2022 +0800

    NFSD: Fix space and spelling mistake
    
    [ Upstream commit f532c9ff103897be0e2a787c0876683c3dc39ed3 ]
    
    Add a blank space after ','.
    Change 'succesful' to 'successful'.
    
    Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix sparse warning [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 13 16:44:20 2021 -0400

    NFSD: Fix sparse warning
    
    [ Upstream commit c2f1c4bd20621175c581f298b4943df0cffbd841 ]
    
    /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: warning: incorrect type in assignment (different base types)
    /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24:    expected restricted __be32 [usertype] status
    /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24:    got int
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix sparse warning in nfssvc.c [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Dec 18 12:28:23 2020 -0500

    NFSD: Fix sparse warning in nfssvc.c
    
    [ Upstream commit d6c9e4368cc6a61bf25c9c72437ced509c854563 ]
    
    fs/nfsd/nfssvc.c:36:6: warning: symbol 'inter_copy_offload_enable' was not declared. Should it be static?
    
    The parameter was added by commit ce0887ac96d3 ("NFSD add nfs4 inter
    ssc to nfsd4_copy"). Relocate it into the source file that uses it,
    and make it static. This approach is similar to the
    nfs4_disable_idmapping, cltrack_prog, and cltrack_legacy_disable
    module parameters.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix strncpy() fortify warning [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:03 2022 -0400

    NFSD: Fix strncpy() fortify warning
    
    [ Upstream commit 5304877936c0a67e1a01464d113bae4c81eacdb6 ]
    
    In function ‘strncpy’,
        inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3,
        inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11:
    /home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation]
       52 | #define __underlying_strncpy    __builtin_strncpy
          |                                 ^
    /home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’
       89 |         return __underlying_strncpy(p, q, size);
          |                ^~~~~~~~~~~~~~~~~~~~
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix the behavior of READ near OFFSET_MAX [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Feb 4 15:19:34 2022 -0500

    NFSD: Fix the behavior of READ near OFFSET_MAX
    
    [ Upstream commit 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 ]
    
    Dan Aloni reports:
    > Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
    > the RPC read layers") on the client, a read of 0xfff is aligned up
    > to server rsize of 0x1000.
    >
    > As a result, in a test where the server has a file of size
    > 0x7fffffffffffffff, and the client tries to read from the offset
    > 0x7ffffffffffff000, the read causes loff_t overflow in the server
    > and it returns an NFS code of EINVAL to the client. The client as
    > a result indefinitely retries the request.
    
    The Linux NFS client does not handle NFS?ERR_INVAL, even though all
    NFS specifications permit servers to return that status code for a
    READ.
    
    Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
    and return a short result. Set the EOF flag in the result to prevent
    the client from retrying the READ request. This behavior appears to
    be consistent with Solaris NFS servers.
    
    Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
    must be converted to loff_t internally before use -- an implicit
    type cast is not adequate for this purpose. Otherwise VFS checks
    against sb->s_maxbytes do not work properly.
    
    Reported-by: Dan Aloni <dan.aloni@vastdata.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
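
    The range handling described in this entry can be sketched in userspace
    C. This is a minimal illustration, not the kernel code:
    SKETCH_OFFSET_MAX and clamp_read_range() are hypothetical names, and the
    sketch assumes loff_t is a signed 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OFFSET_MAX; assumes loff_t is signed 64-bit. */
#define SKETCH_OFFSET_MAX INT64_MAX

/*
 * Clamp a wire-format READ range so that offset + count cannot exceed
 * SKETCH_OFFSET_MAX. An out-of-range request becomes a short (possibly
 * empty) read instead of an error.
 */
static void clamp_read_range(uint64_t *offset, uint32_t *count)
{
	const uint64_t max = (uint64_t)SKETCH_OFFSET_MAX;

	assert(offset && count);

	if (*offset > max)
		*offset = max;
	if (*count > max - *offset)
		*count = (uint32_t)(max - *offset);
}
```

    With the values from the report (offset 0x7ffffffffffff000, count
    0x1000), the count is reduced to 0xfff, so the sum stays within range.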

NFSD: Fix the filecache LRU shrinker [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:24 2022 -0400

    NFSD: Fix the filecache LRU shrinker
    
    [ Upstream commit edead3a55804739b2e4af0f35e9c7326264e7b22 ]
    
    Without LRU item rotation, the shrinker visits only a few items on
    the end of the LRU list, and those would always be long-term OPEN
    files for NFSv4 workloads. That makes the filecache shrinker
    completely ineffective.
    
    Adopt the same strategy as the inode LRU by using LRU_ROTATE.
    
    Suggested-by: Dave Chinner <david@fromorbit.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix typo "accesible" [+ + +]

Author: Ricardo Ribalda <ribalda@chromium.org>
Date:   Thu Mar 18 21:22:21 2021 +0100

    nfsd: Fix typo "accesible"
    
    [ Upstream commit 34a624931b8c12b435b5009edc5897e4630107bc ]
    
    Trivial fix.
    
    Cc: linux-nfs@vger.kernel.org
    Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Mon Nov 30 17:03:18 2020 -0500

    nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE
    
    [ Upstream commit 2e19d10c1438241de32467637a2a411971547991 ]
    
    If the underlying filesystem times out, then we want knfsd to return
    NFSERR_JUKEBOX/DELAY rather than NFSERR_STALE.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix up the filecache laundrette scheduling [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 2 14:44:50 2022 -0400

    nfsd: fix up the filecache laundrette scheduling
    
    [ Upstream commit 22ae4c114f77b55a4c5036e8f70409a0799a08f8 ]
    
    We don't really care whether there are hashed entries when it comes to
    scheduling the laundrette. They might all be non-gc entries, after all.
    We only want to schedule it if there are entries on the LRU.
    
    Switch to using list_lru_count, and move the check into
    nfsd_file_gc_worker. The other callsite in nfsd_file_put doesn't need to
    count entries, since it only schedules the laundrette after adding an
    entry to the LRU.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: fix use-after-free in nfsd4_ssc_setup_dul() [+ + +]

Author: Xingyuan Mo <hdthky0@gmail.com>
Date:   Thu Jan 12 00:24:53 2023 +0800

    NFSD: fix use-after-free in nfsd4_ssc_setup_dul()
    
    [ Upstream commit e6cf91b7b47ff82b624bdfe2fdcde32bb52e71dd ]
    
    If signal_pending() returns true, schedule_timeout() will not be executed,
    causing the waiting task to remain in the wait queue.
    Fixed by adding a call to finish_wait(), which ensures that the waiting
    task will always be removed from the wait queue.
    
    Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.")
    Signed-off-by: Xingyuan Mo <hdthky0@gmail.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Sat Nov 5 09:49:26 2022 -0400

    nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint
    
    [ Upstream commit bdd6b5624c62d0acd350d07564f1c82fe649235f ]
    
    When we fail to insert into the hashtable with a non-retryable error,
    we'll free the object and then goto out_status. If the tracepoint is
    enabled, it'll end up accessing the freed object when it tries to
    grab the fields out of it.
    
    Set nf to NULL after freeing it to avoid the issue.
    
    Fixes: 243a5263014a ("nfsd: rework hashtable handling in nfsd_do_file_acquire")
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <error27@gmail.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
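
    The pattern behind this fix (clear the pointer after freeing so a later
    consumer on the common exit path cannot touch freed memory) can be
    sketched in userspace C. All names below are hypothetical, and the
    "tracepoint" is only a function that tolerates a NULL object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct cache_obj {
	int id;
};

/* Stand-in for a tracepoint: reads fields only when the object exists. */
static int trace_obj_id(const struct cache_obj *obj)
{
	return obj ? obj->id : -1;
}

/*
 * Returns the id seen by the "tracepoint". When insertion fails, the
 * object is freed and the pointer cleared before the common exit path.
 */
static int acquire_sketch(int id, bool insert_fails)
{
	struct cache_obj *obj = malloc(sizeof(*obj));
	int traced;

	assert(id >= 0);
	if (!obj)
		return -1;
	obj->id = id;

	if (insert_fails) {
		free(obj);
		obj = NULL;	/* without this, the trace reads freed memory */
	}

	traced = trace_obj_id(obj);
	free(obj);		/* free(NULL) is a no-op */
	return traced;
}
```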

nfsd: fix using the correct variable for sizeof() [+ + +]

Author: Jakob Koschel <jakobkoschel@gmail.com>
Date:   Sat Mar 19 21:27:04 2022 +0100

    nfsd: fix using the correct variable for sizeof()
    
    [ Upstream commit 4fc5f5346592cdc91689455d83885b0af65d71b8 ]
    
    While the original code is valid, it is not the obvious choice for the
    sizeof() call. In preparation for limiting the scope of the list
    iterator variable, change the sizeof() to take the size of the
    destination.
    
    Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
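
    A minimal sketch of the idiom, using hypothetical types rather than the
    structures touched by the commit: the copy is sized by the destination,
    so it no longer depends on the list iterator variable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_stateid {
	unsigned char data[16];
};

struct sketch_entry {
	struct sketch_stateid sid;
	struct sketch_entry *next;
};

/* Copy the stateid of the first list entry into *dst; 0 on success. */
static int copy_first_sid(const struct sketch_entry *head,
			  struct sketch_stateid *dst)
{
	const struct sketch_entry *pos = head;

	assert(dst != NULL);
	if (pos == NULL)
		return -1;

	/* Size the copy by the destination, not by the iterator "pos". */
	memcpy(dst, &pos->sid, sizeof(*dst));
	return 0;
}
```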

NFSD: Fix whitespace [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 21 16:41:32 2022 -0400

    NFSD: Fix whitespace
    
    [ Upstream commit 26320d7e317c37404c811603d50d811132aef78c ]
    
    Clean up: Pull case arms back one tab stop to conform to every other
    switch statement in fs/nfsd/nfs4proc.c.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Fix zero-length NFSv3 WRITEs [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Dec 21 11:52:06 2021 -0500

    NFSD: Fix zero-length NFSv3 WRITEs
    
    [ Upstream commit 6a2f774424bfdcc2df3e17de0cefe74a4269cad5 ]
    
    The Linux NFS server currently responds to a zero-length NFSv3 WRITE
    request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE
    with NFS4_OK and count of zero.
    
    RFC 1813 says of the WRITE procedure's @count argument:
    
    count
             The number of bytes of data to be written. If count is
             0, the WRITE will succeed and return a count of 0,
             barring errors due to permissions checking.
    
    RFC 8881 has similar language for NFSv4, though NFSv4 removed the
    explicit @count argument because that value is already contained in
    the opaque payload array.
    
    The synthetic client pynfs's WRT4 and WRT15 tests do emit zero-
    length WRITEs to exercise this spec requirement. Commit fdec6114ee1f
    ("nfsd4: zero-length WRITE should succeed") addressed the same
    problem there with the same fix.
    
    But interestingly the Linux NFS client does not appear to emit zero-
    length WRITEs, instead squelching them. I'm not aware of a test that
    can generate such WRITEs for NFSv3, so I wrote a naive C program to
    generate a zero-length WRITE and test this fix.
    
    Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders")
    Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
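
    The RFC 1813 behavior quoted in this entry can be sketched as follows.
    write_sketch() and its parameters are hypothetical; only the two status
    values are taken from RFC 1813. The short-payload check is an assumption
    added for contrast, not a description of the kernel's decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Status values as defined for NFSv3 in RFC 1813. */
enum { SKETCH_NFS3_OK = 0, SKETCH_NFS3ERR_IO = 5 };

/*
 * Hypothetical WRITE handler. "payload_len" is the number of bytes that
 * actually arrived; "count" is the WRITE procedure's count argument.
 */
static int write_sketch(uint32_t count, size_t payload_len, uint32_t *written)
{
	assert(written != NULL);
	*written = 0;

	/* A zero-length WRITE succeeds and reports a count of 0. */
	if (count == 0)
		return SKETCH_NFS3_OK;

	/* Assumption: a payload shorter than count is treated as an error. */
	if (payload_len < count)
		return SKETCH_NFS3ERR_IO;

	*written = count;	/* pretend the data was written */
	return SKETCH_NFS3_OK;
}
```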

NFSD: Flesh out a documenting comment for filecache.c [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 1 13:30:46 2022 -0400

    NFSD: Flesh out a documenting comment for filecache.c
    
    [ Upstream commit b3276c1f5b268ff56622e9e125b792b4c3dc03ac ]
    
    Record what we've learned recently about the NFSD filecache in a
    documenting comment so our future selves don't forget what all this
    is for.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: grant read delegations to clients holding writes [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Apr 16 14:00:18 2021 -0400

    nfsd: grant read delegations to clients holding writes
    
    [ Upstream commit aba2072f452346d56a462718bcde93d697383148 ]
    
    It's OK to grant a read delegation to a client that holds a write,
    as long as it's the only client holding the write.
    
    We originally tried to do this in commit 94415b06eb8a ("nfsd4: a
    client's own opens needn't prevent delegations"), which had to be
    reverted in commit 6ee65a773096 ("Revert "nfsd4: a client's own
    opens needn't prevent delegations"").
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: handle errors better in write_ports_addfd() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    NFSD: handle errors better in write_ports_addfd()
    
    [ Upstream commit 89b24336f03a8ba560e96b0c47a8434a7fa48e3c ]
    
    If write_ports_add() fails, we shouldn't destroy the serv, unless we had
    only just created it.  So if there are any permanent sockets already
    attached, leave the serv in place.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: hash nfs4_files by inode number [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Apr 16 14:00:15 2021 -0400

    nfsd: hash nfs4_files by inode number
    
    [ Upstream commit f9b60e2209213fdfcc504ba25a404977c5d08b77 ]
    
    The nfs4_file structure is per-filehandle, not per-inode, because the
    spec requires open and other state to be per filehandle.
    
    But it will turn out to be convenient for nfs4_files associated with the
    same inode to be hashed to the same bucket, so let's hash on the inode
    instead of the filehandle.
    
    Filehandle aliasing is rare, so that shouldn't have much performance
    impact.
    
    (If you have a ton of exported filesystems, though, and all of them have
    a root with inode number 2, could that get you an overlong hash chain?
    Perhaps this (and the v4 open file cache) should be hashed on the inode
    pointer instead.)
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 30 17:06:21 2021 -0400

    NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()
    
    [ Upstream commit dae9a6cab8009e526570e7477ce858dcdfeb256e ]
    
    Refactor.
    
    Now that the NFSv2 and NFSv3 XDR decoders have been converted to
    use xdr_streams, the WRITE decoder functions can use
    xdr_stream_subsegment() to extract the WRITE payload into its own
    xdr_buf, just as the NFSv4 WRITE XDR decoder currently does.
    
    That makes it possible to pass the first kvec, pages array + length,
    page_base, and total payload length via a single function parameter.
    
    The payload's page_base is not yet assigned or used, but will be in
    subsequent patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: helper for laundromat expiry calculations [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue Mar 2 10:46:23 2021 -0500

    nfsd: helper for laundromat expiry calculations
    
    [ Upstream commit 7f7e7a4006f74b031718055a0751c70c2e3d5e7e ]
    
    We do this same logic repeatedly, and it's easy to get the sense of the
    comparison wrong.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Hook up the filecache stat file [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:58 2022 -0400

    NFSD: Hook up the filecache stat file
    
    [ Upstream commit 2e6c6e4c4375bfd3defa5b1ff3604d9f33d1c936 ]
    
    There has always been the capability of exporting filecache metrics
    via /proc, but it was never hooked up. Let's surface these metrics
    to enable better observability of the filecache.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: ignore requests to disable unsupported versions [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Oct 18 07:47:54 2022 -0400

    nfsd: ignore requests to disable unsupported versions
    
    [ Upstream commit 8e823bafff2308753d430566256c83d8085952da ]
    
    The kernel currently errors out if you attempt to enable or disable a
    version that it doesn't recognize. Change it to ignore attempts to
    disable an unrecognized version. If we don't support it, then there is
    no harm in doing so.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Tom Talpey <tom@talpey.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: improve stateid access bitmask documentation [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue Dec 7 17:32:21 2021 -0500

    nfsd: improve stateid access bitmask documentation
    
    [ Upstream commit 3dcd1d8aab00c5d3a0a3725253c86440b1a0f5a7 ]
    
    The use of the bitmaps is confusing.  Add a cross-reference to make it
    easier to find the existing comment.  Add an updated reference with URL
    to make it quicker to look up.  And a bit more editorializing about the
    value of this.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Sep 2 18:18:16 2022 -0400

    NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND
    
    [ Upstream commit 80e591ce636f3ae6855a0ca26963da1fdd6d4508 ]
    
    When attempting an NFSv4 mount, a Solaris NFSv4 client builds a
    single large COMPOUND that chains a series of LOOKUPs to get to the
    pseudo filesystem root directory that is to be mounted. The Linux
    NFS server's current maximum of 16 operations per NFSv4 COMPOUND is
    not large enough to ensure that this works for paths that are more
    than a few components deep.
    
    Since NFSD_MAX_OPS_PER_COMPOUND is mostly a sanity check, and most
    NFSv4 COMPOUNDS are between 3 and 6 operations (thus they do not
    trigger any re-allocation of the operation array on the server),
    increasing this maximum should result in little to no impact.
    
    The ops array can get large now, so allocate it via vmalloc() to
    help ensure memory fragmentation won't cause an allocation failure.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=216383
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Initialize pointer ni with NULL and not plain integer 0 [+ + +]

Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Sat Sep 25 23:58:41 2021 +0100

    NFSD: Initialize pointer ni with NULL and not plain integer 0
    
    [ Upstream commit 8e70bf27fd20cc17e87150327a640e546bfbee64 ]
    
    Pointer ni is being initialized with plain integer zero. Fix
    this by initializing with NULL.
    
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Instantiate a struct file when creating a regular NFSv4 file [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Mar 30 10:30:54 2022 -0400

    NFSD: Instantiate a struct file when creating a regular NFSv4 file
    
    [ Upstream commit fb70bf124b051d4ded4ce57511dfec6d3ebf2b43 ]
    
    There have been reports of races that cause NFSv4 OPEN(CREATE) to
    return an error even though the requested file was created. NFSv4
    does not provide a status code for this case.
    
    To mitigate some of these problems, reorganize the NFSv4
    OPEN(CREATE) logic to allocate resources before the file is actually
    created, and open the new file while the parent directory is still
    locked.
    
    Two new APIs are added:
    
    + Add an API that works like nfsd_file_acquire() but does not open
    the underlying file. The OPEN(CREATE) path can use this API when it
    already has an open file.
    
    + Add an API that is kin to dentry_open(). NFSD needs to create a
    file and grab an open "struct file *" atomically. The
    alloc_empty_file() has to be done before the inode create. If it
    fails (for example, because the NFS server has exceeded its
    max_files limit), we avoid creating the file and can still return
    an error to the NFS client.
    
    BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: JianHong Yin <jiyin@redhat.com>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: introduce struct nfsd_attrs [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: introduce struct nfsd_attrs
    
    [ Upstream commit 7fe2a71dda349a1afa75781f0cc7975be9784d15 ]
    
    The attributes that nfsd might want to set on a file include 'struct
    iattr' as well as an ACL and security label.
    The latter two are passed around quite separately from the first, in
    part because they are only needed for NFSv4.  This leads to some
    clumsiness in the code, such as the attributes NOT being set in
    nfsd_create_setattr().
    
    We need to keep the directory locked until all attributes are set to
    ensure the file is never visible without all its attributes.  This need
    combined with the inconsistent handling of attributes leads to more
    clumsiness.
    
    As a first step towards tidying this up, introduce 'struct nfsd_attrs'.
    This is passed (by reference) to vfs.c functions that work with
    attributes, and is assembled by the various nfs*proc functions which
    call them.  As yet only iattr is included, but future patches will
    expand this.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Invoke svc_encode_result_payload() in "read" NFSD encoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 5 10:24:19 2020 -0500

    NFSD: Invoke svc_encode_result_payload() in "read" NFSD encoders
    
    [ Upstream commit 76e5492b161f555c0fb69cad9eb39a7d8467f5fe ]
    
    Have the NFSD encoders annotate the boundaries of every
    direct-data-placement eligible result data payload. Then change
    svcrdma to use that annotation instead of the xdr->page_len
    when handling Write chunks.
    
    For NFSv4 on RDMA, that enables the ability to recognize multiple
    result payloads per compound. This is a pre-requisite for supporting
    multiple Write chunks per RPC transaction.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: keep track of the number of courtesy clients in the system [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Sep 14 08:54:25 2022 -0700

    NFSD: keep track of the number of courtesy clients in the system
    
    [ Upstream commit 3a4ea23d86a317c4b68b9a69d51f7e84e1e04357 ]
    
    Add counter nfs4_courtesy_client_count to nfsd_net to keep track
    of the number of courtesy clients in the system.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: keep track of the number of v4 clients in the system [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Fri Jul 15 16:54:52 2022 -0700

    NFSD: keep track of the number of v4 clients in the system
    
    [ Upstream commit 0926c39515aa065a296e97dfc8790026f1e53f86 ]
    
    Add counter nfs4_client_count to keep track of the total number
    of v4 clients, including courtesy clients, in the system.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Leave open files out of the filecache LRU [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:17 2022 -0400

    NFSD: Leave open files out of the filecache LRU
    
    [ Upstream commit 4a0e73e635e3f36b616ad5c943e3d23debe4632f ]
    
    There have been reports of problems when running fstests generic/531
    against Linux NFS servers with NFSv4. The NFS server that hosts the
    test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
    Analysis shows that:
    
    fs/nfsd/filecache.c
     482                 ret = list_lru_walk(&nfsd_file_lru,
     483                                 nfsd_file_lru_cb,
     484                                 &head, LONG_MAX);
    
    causes nfsd_file_gc() to walk the entire length of the filecache LRU
    list every time it is called (which is quite frequently). The walk
    holds a spinlock the entire time that prevents other nfsd threads
    from accessing the filecache.
    
    What's more, for NFSv4 workloads, none of the items that are visited
    during this walk may be evicted, since they are all files that are
    held OPEN by NFS clients.
    
    Address this by ensuring that open files are not kept on the LRU
    list.
    
    Reported-by: Frank van der Linden <fllinden@amazon.com>
    Reported-by: Wang Yugui <wangyugui@e16-tech.com>
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
    Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: limit the number of v4 clients to 1024 per 1GB of system memory [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Fri Jul 15 16:54:53 2022 -0700

    NFSD: limit the number of v4 clients to 1024 per 1GB of system memory
    
    [ Upstream commit 4271c2c0887562318a0afef97d32d8a71cbe0743 ]
    
    Currently there is no limit on how many v4 clients are supported
    by the system. On systems with a small memory configuration, a very
    large number of clients can create memory shortage conditions that
    keep the system from functioning properly.
    
    This patch enforces a limit of 1024 NFSv4 clients, including courtesy
    clients, per 1GB of system memory.  When the number of clients
    reaches the limit, requests that create new clients are returned
    with NFS4ERR_DELAY and the laundromat is kick-started to trim old
    clients. Due to the overhead of the upcall to remove the client
    record, the maximum number of clients the laundromat removes on
    each run is limited to 128. This is done to ensure the laundromat
    can still process the other tasks in a timely manner.
    
    Since there is now a limit on the number of clients, the 24-hr
    idle time limit for courtesy clients is no longer needed and has
    been removed.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
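
    The arithmetic behind the limit can be sketched as below. The constants
    1024 and 128 come from the commit message; the function names, the
    totalram/mem_unit parameters (modeled on struct sysinfo), and the floor
    for machines with less than 1GB are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CLIENTS_PER_GB	1024u	/* from the commit message */
#define SKETCH_REAP_PER_RUN	128u	/* from the commit message */

/* Client limit for a machine with totalram units of mem_unit bytes each. */
static uint64_t max_v4_clients(uint64_t totalram, uint32_t mem_unit)
{
	uint64_t gb, limit;

	assert(mem_unit > 0);
	gb = (totalram * mem_unit) >> 30;	/* whole GB, rounded down */
	limit = gb * SKETCH_CLIENTS_PER_GB;

	/* Assumption: machines below 1GB still get one GB's worth. */
	return limit > SKETCH_CLIENTS_PER_GB ? limit : SKETCH_CLIENTS_PER_GB;
}

/* Number of clients a single laundromat run may remove. */
static uint64_t reap_batch(uint64_t candidates)
{
	return candidates < SKETCH_REAP_PER_RUN ? candidates : SKETCH_REAP_PER_RUN;
}
```

    For example, a 4GB machine would allow 4096 clients under this sketch,
    and a backlog of 500 expirable clients would be trimmed 128 at a time.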

nfsd: Log client tracking type log message as info instead of warning [+ + +]

Author: Paul Menzel <pmenzel@molgen.mpg.de>
Date:   Fri Mar 12 22:03:00 2021 +0100

    nfsd: Log client tracking type log message as info instead of warning
    
    [ Upstream commit f988a7b71d1e66e63f79cd59c763875347943a7a ]
    
    `printk()`, by default, uses the log level warning, which leaves the
    user reading
    
        NFSD: Using UMH upcall client tracking operations.
    
    wondering what to do about it (`dmesg --level=warn`).
    
    Several client tracking methods are tried, and expected to fail. That’s
    why a message is printed only on success. It might be interesting for
    users to know the chosen method, so use info-level instead of
    debug-level.
    
    Cc: linux-nfs@vger.kernel.org
    Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: make a copy of struct iattr before calling notify_change [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed May 17 12:26:44 2023 -0400

    nfsd: make a copy of struct iattr before calling notify_change
    
    [ Upstream commit d53d70084d27f56bcdf5074328f2c9ec861be596 ]
    
    notify_change can modify the iattr structure. In particular it can
    end up setting ATTR_MODE when ATTR_KILL_SUID is already set, causing
    a BUG() if the same iattr is passed to notify_change more than once.
    
    Make a copy of the struct iattr before calling notify_change.
    
    Reported-by: Zhi Li <yieli@redhat.com>
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2207969
    Tested-by: Zhi Li <yieli@redhat.com>
    Fixes: 34b91dda7124 ("NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY")
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Make it possible to use svc_set_num_threads_sync [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    NFSD: Make it possible to use svc_set_num_threads_sync
    
    [ Upstream commit 3409e4f1e8f239f0ed81be0b068ecf4e73e2e826 ]
    
    nfsd cannot currently use svc_set_num_threads_sync.  It instead
    uses svc_set_num_threads which does *not* wait for threads to all
    exit, and has a separate mechanism (nfsd_shutdown_complete) to wait
    for completion.
    
    The reason that nfsd is unlike other services is that nfsd threads can
    exit separately from svc_set_num_threads being called - they die on
    receipt of SIGKILL.  Also, when the last thread exits, the service must
    be shut down (sockets closed).
    
    For this, the nfsd_mutex needs to be taken, and as that mutex needs to
    be held while svc_set_num_threads is called, the one cannot wait for
    the other.
    
    This patch changes the nfsd thread so that it can drop the ref on the
    service without blocking on nfsd_mutex, so that svc_set_num_threads_sync
    can be used:
     - if it can drop a non-last reference, it does that.  This does not
       trigger shutdown and does not require a mutex.  This will likely
       happen for all but the last thread signalled, and for all threads
       being shut down by nfsd_shutdown_threads()
     - if it can get the mutex without blocking (trylock), it does that
       and then drops the reference.  This will likely happen for the
       last thread killed by SIGKILL
     - Otherwise there might be an unrelated task holding the mutex,
       possibly in another network namespace, or nfsd_shutdown_threads()
       might be just about to get a reference on the service, after which
       we can drop ours safely.
       We cannot conveniently get wakeup notifications on these events,
       and we are unlikely to need to, so we sleep briefly and check again.
    
    With this we can discard nfsd_shutdown_complete and
    nfsd_complete_shutdown(), and switch to svc_set_num_threads_sync.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Make nfs4_put_copy() static [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:35 2022 -0400

    NFSD: Make nfs4_put_copy() static
    
    [ Upstream commit 8ea6e2c90bb0eb74a595a12e23a1dff9abbc760a ]
    
    Clean up: All call sites are in fs/nfsd/nfs4proc.c.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Make nfsd4_ops::opnum a u32 [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 22 12:49:52 2020 -0500

    NFSD: Make nfsd4_ops::opnum a u32
    
    [ Upstream commit 3a237b4af5b7b0e77588e120554077cab3341943 ]
    
    Avoid passing a "pointer to int" argument to xdr_stream_decode_u32.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:14:25 2022 -0400

    NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY
    
    [ Upstream commit 5f5f8b6d655fd947e899b1771c2f7cb581a06764 ]
    
    nfsd_unlink() can kick off a CB_RECALL (via
    vfs_unlink() -> leases_conflict()) if a delegation is present.
    Before returning NFS4ERR_DELAY, give the client holding that
    delegation a chance to return it and then retry the nfsd_unlink()
    once.
    
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:14:19 2022 -0400

    NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY
    
    [ Upstream commit 68c522afd0b1936b48a03a4c8b81261e7597c62d ]
    
    nfsd_rename() can kick off a CB_RECALL (via
    vfs_rename() -> leases_conflict()) if a delegation is present.
    Before returning NFS4ERR_DELAY, give the client holding that
    delegation a chance to return it and then retry the nfsd_rename()
    again, once.
    
    This version of the patch handles renaming an existing file,
    but does not deal with renaming onto an existing file. That
    case will still always trigger an NFS4ERR_DELAY.
    
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: make nfsd4_run_cb a bool return function [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Sep 26 14:41:01 2022 -0400

    nfsd: make nfsd4_run_cb a bool return function
    
    [ Upstream commit b95239ca4954a0d48b19c09ce7e8f31b453b4216 ]
    
    queue_work can return false and not queue anything, if the work is
    already queued. If that happens in the case of a CB_RECALL, we'll have
    taken an extra reference to the stid that will never be put. Ensure we
    throw a warning in that case.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
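
Illustration (not part of the upstream commit): a minimal user-space model of why a boolean return matters when queueing can silently do nothing. The types and functions are hypothetical stand-ins. The commit message only says a warning is emitted; dropping the reference in the failure branch is an assumption made here to show what a caller can do once the result is visible.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a work item and a reference-counted stid. */
struct sketch_work { bool pending; };
struct sketch_stid { int refcount; };

/* Models the contract described above: false means "already queued". */
static bool sketch_queue_work(struct sketch_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Returning the result lets callers notice that nothing was queued. */
static bool sketch_run_cb(struct sketch_work *w)
{
	return sketch_queue_work(w);
}

static bool sketch_recall(struct sketch_stid *s, struct sketch_work *w)
{
	s->refcount++;			/* reference meant for the queued callback */
	if (!sketch_run_cb(w)) {
		s->refcount--;		/* assumption: undo it, nothing will run */
		return false;
	}
	return true;
}
```

Calling sketch_recall() twice on the same work item leaves the count raised by one, not two, because the second call sees the false return.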

NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:14:13 2022 -0400

    NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY
    
    [ Upstream commit 34b91dda7124fc3259e4b2ae53e0c933dedfec01 ]
    
    nfsd_setattr() can kick off a CB_RECALL (via
    notify_change() -> break_lease()) if a delegation is present. Before
    returning NFS4ERR_DELAY, give the client holding that delegation a
    chance to return it and then retry the nfsd_setattr() again, once.
    
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: make nfsd_stats.th_cnt atomic_t [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    nfsd: make nfsd_stats.th_cnt atomic_t
    
    [ Upstream commit 9b6c8c9bebccd5fb785c306b948c08874a88874d ]
    
    This allows us to move the updates for th_cnt out of the mutex.
    This is a step towards reducing mutex coverage in nfsd().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
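
Illustration (not part of the upstream commit): a counter that is updated atomically does not need a surrounding mutex. The sketch uses C11 <stdatomic.h> in user space as an analogue of the kernel's atomic_t; the struct and function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a statistics block with a thread counter. */
struct sketch_stats {
	atomic_int th_cnt;
};

static void sketch_stats_init(struct sketch_stats *s)
{
	atomic_init(&s->th_cnt, 0);
}

/* Safe to call from any thread without holding a lock. */
static void sketch_thread_start(struct sketch_stats *s)
{
	atomic_fetch_add(&s->th_cnt, 1);
}

static void sketch_thread_exit(struct sketch_stats *s)
{
	atomic_fetch_sub(&s->th_cnt, 1);
}

static int sketch_thread_count(struct sketch_stats *s)
{
	return atomic_load(&s->th_cnt);
}
```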

nfsd: map EBADF [+ + +]

Author: Peng Tao <tao.peng@primarydata.com>
Date:   Sat Dec 18 20:37:54 2021 -0500

    nfsd: map EBADF
    
    [ Upstream commit b3d0db706c77d02055910fcfe2f6eb5155ff9d5e ]
    
    Now that we have an open file cache, it is possible that another client
    deletes the file and DP will not know about it. Then IO to MDS would
    fail with BADSTATEID and knfsd would start state recovery, which
    should fail as well and then nfs read/write will fail with EBADF.
    And it triggers a WARN() in nfserrno().
    
    -----------[ cut here ]------------
    WARNING: CPU: 0 PID: 13529 at fs/nfsd/nfsproc.c:758 nfserrno+0x58/0x70 [nfsd]()
    nfsd: non-standard errno: -9
    modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_connt
    pata_acpi floppy
    CPU: 0 PID: 13529 Comm: nfsd Tainted: G        W       4.1.5-00307-g6e6579b #7
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
     0000000000000000 00000000464e6c9c ffff88079085fba8 ffffffff81789936
     0000000000000000 ffff88079085fc00 ffff88079085fbe8 ffffffff810a08ea
     ffff88079085fbe8 ffff88080f45c900 ffff88080f627d50 ffff880790c46a48
     Call Trace:
     [<ffffffff81789936>] dump_stack+0x45/0x57
     [<ffffffff810a08ea>] warn_slowpath_common+0x8a/0xc0
     [<ffffffff810a0975>] warn_slowpath_fmt+0x55/0x70
     [<ffffffff81252908>] ? splice_direct_to_actor+0x148/0x230
     [<ffffffffa02fb8c0>] ? fsid_source+0x60/0x60 [nfsd]
     [<ffffffffa02f9918>] nfserrno+0x58/0x70 [nfsd]
     [<ffffffffa02fba57>] nfsd_finish_read+0x97/0xb0 [nfsd]
     [<ffffffffa02fc7a6>] nfsd_splice_read+0x76/0xa0 [nfsd]
     [<ffffffffa02fcca1>] nfsd_read+0xc1/0xd0 [nfsd]
     [<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
     [<ffffffffa03073da>] nfsd3_proc_read+0xba/0x150 [nfsd]
     [<ffffffffa02f7a03>] nfsd_dispatch+0xc3/0x210 [nfsd]
     [<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
     [<ffffffffa0232913>] svc_process_common+0x453/0x6f0 [sunrpc]
     [<ffffffffa0232cc3>] svc_process+0x113/0x1b0 [sunrpc]
     [<ffffffffa02f740f>] nfsd+0xff/0x170 [nfsd]
     [<ffffffffa02f7310>] ? nfsd_destroy+0x80/0x80 [nfsd]
     [<ffffffff810bf3a8>] kthread+0xd8/0xf0
     [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
     [<ffffffff817912a2>] ret_from_fork+0x42/0x70
     [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
    
    Signed-off-by: Peng Tao <tao.peng@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: minor nfsd4_change_attribute cleanup [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Nov 30 17:46:16 2020 -0500

    nfsd: minor nfsd4_change_attribute cleanup
    
    [ Upstream commit 4b03d99794eeed27650597a886247c6427ce1055 ]
    
    Minor cleanup, no change in behavior.
    
    Also pull out a common helper that'll be useful elsewhere.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Modernize nfsd4_release_lockowner() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun May 22 12:07:18 2022 -0400

    NFSD: Modernize nfsd4_release_lockowner()
    
    [ Upstream commit bd8fdb6e545f950f4654a9a10d7e819ad48146e5 ]
    
    Refactor: Use existing helpers that other lock operations use. This
    change removes several automatic variables, so re-organize the
    variable declarations for readability.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Move copy offload callback arguments into a separate structure [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:41:18 2022 -0400

    NFSD: Move copy offload callback arguments into a separate structure
    
    [ Upstream commit a11ada99ce93a79393dc6683d22f7915748c8f6b ]
    
    Refactor so that CB_OFFLOAD arguments can be passed without
    allocating a whole struct nfsd4_copy object. On my system (x86_64)
    this removes another 96 bytes from struct nfsd4_copy.
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:23 2022 -0700

    NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd
    
    [ Upstream commit d76cc46b37e123e8d245cc3490978dbda56f979d ]
    
    This patch moves create/destroy of laundry_wq from nfs4_state_start
    and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent
    the laundromat from being freed while a thread is processing a
    conflicting lock.
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Move documenting comment for nfsd4_process_open2() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Mar 23 13:55:37 2022 -0400

    NFSD: Move documenting comment for nfsd4_process_open2()
    
    [ Upstream commit 7e2ce0cc15a509b859199235a2bad9cece00f67a ]
    
    Clean up nfsd4_open() by converting a large comment at the only
    call site for nfsd4_process_open2() to a kerneldoc comment in
    front of that function.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: move filehandle format declarations out of "uapi". [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 2 11:14:47 2021 +1000

    NFSD: move filehandle format declarations out of "uapi".
    
    [ Upstream commit ef5825e3cf0d0af657f5fb4dd86d750ed42fee0a ]
    
    A small part of the declarations concerning the filehandle format is
    currently in the "uapi" include directory:
       include/uapi/linux/nfsd/nfsfh.h
    
    There is a lot more to the filehandle format, including "enum fid_type"
    and "enum nfsd_fsid" which are not exported via "uapi".
    
    This small part of the filehandle definition is of minimal use outside
    of the kernel, and I can find no evidence that any other code is using
    it. Certainly nfs-utils and wireshark (the most likely candidates) do not
    use these declarations.
    
    So move it out of "uapi" by copying the content from
      include/uapi/linux/nfsd/nfsfh.h
    into
      fs/nfsd/nfsfh.h
    
    A few unnecessary "#include" directives are not copied, and neither is
    the #define of fh_auth, which is annotated as being for userspace only.
    
    The copyright claims in the uapi file are identical to those in the nfsd
    file, so there is no need to copy those.
    
    The "__u32" style integer types are only needed in "uapi".  In
    kernel-only code we can use the more familiar "u32" style.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Move fill_pre_wcc() and fill_post_wcc() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Dec 24 14:36:49 2021 -0500

    NFSD: Move fill_pre_wcc() and fill_post_wcc()
    
    [ Upstream commit fcb5e3fa012351f3b96024c07bc44834c2478213 ]
    
    These functions are related to file handle processing and have
    nothing to do with XDR encoding or decoding. Also they are no longer
    NFSv3-specific. As a clean-up, move their definitions to a more
    appropriate location. WCC is also an NFSv3-specific term, so rename
    them as general-purpose helpers.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: move from strlcpy with unused retval to strscpy [+ + +]

Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date:   Thu Aug 18 23:01:14 2022 +0200

    NFSD: move from strlcpy with unused retval to strscpy
    
    [ Upstream commit 72f78ae00a8e5d7abe13abac8305a300f6afd74b ]
    
    Follow the advice of the below link and prefer 'strscpy' in this
    subsystem. Conversion is 1:1 because the return value is not used.
    Generated by a coccinelle script.
    
    Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
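
Illustration (not part of the upstream commit): a user-space stand-in for the semantics that make strscpy preferable. It never reads past the destination size, always NUL-terminates, and reports truncation as -E2BIG. sketch_strscpy is a hypothetical reimplementation for demonstration, not the kernel function.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Copy at most size - 1 characters and always terminate dst.
 * Returns the number of characters copied, or -E2BIG if src did not fit
 * (or if size is 0, in which case dst is not touched).
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	/* src[i] is the first character that was not copied. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

When the return value is ignored, as in the call sites converted by this commit, the visible result in the destination buffer is the same as with strlcpy.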

nfsd: move fsnotify on client creation outside spinlock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Tue May 25 14:53:44 2021 -0400

    nfsd: move fsnotify on client creation outside spinlock
    
    [ Upstream commit 934bd07fae7e55232845f909f78873ab8678ca74 ]
    
    This was causing a "sleeping function called from invalid context"
    warning.
    
    I don't think we need the test_and_set_bit() here; clients move from
    unconfirmed to confirmed only once, under the client_lock.
    
    The (conf == unconf) is a way to check whether we're in that confirming
    case, hopefully that's not too obscure.
    
    Fixes: 472d155a0631 "nfsd: report client confirmation status in "info" file"
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Move nfsd_file_trace_alloc() tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:49 2022 -0400

    NFSD: Move nfsd_file_trace_alloc() tracepoint
    
    [ Upstream commit b40a2839470cd62ed68c4a32d72a18ee8975b1ac ]
    
    Avoid recording the allocation of an nfsd_file item that is
    immediately released because a matching item was already
    inserted in the hash.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: move nfserrno() to vfs.c [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Oct 18 07:47:55 2022 -0400

    nfsd: move nfserrno() to vfs.c
    
    [ Upstream commit cb12fae1c34b1fa7eaae92c5aadc72d86d7fae19 ]
    
    nfserrno() is common to all nfs versions, but nfsproc.c is specifically
    for NFSv2. Move it to vfs.c, and the prototype to vfs.h.
    
    While we're in here, remove the #ifdef EDQUOT check in this function.
    It's apparently a holdover from the initial merge of the nfsd code in
    1997. No other place in the kernel checks that that symbol is defined
    before using it, so I think we can dispense with it here.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: move some commit_metadata()s outside the inode lock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri May 14 18:21:37 2021 -0400

    nfsd: move some commit_metadata()s outside the inode lock
    
    [ Upstream commit eeeadbb9bd5652c47bb9b31aa9ad8b4f1b4aa8b3 ]
    
    The commit may be time-consuming and there's no need to hold the lock
    for it.
    
    More of these are possible; these were just some easy ones.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Move svc_serv_ops::svo_function into struct svc_serv [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Feb 16 12:16:27 2022 -0500

    NFSD: Move svc_serv_ops::svo_function into struct svc_serv
    
    [ Upstream commit 37902c6313090235c847af89c5515591261ee338 ]
    
    Hoist svo_function back into svc_serv and remove struct
    svc_serv_ops, since the struct is now devoid of fields.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: narrow nfsd_mutex protection in nfsd thread [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    NFSD: narrow nfsd_mutex protection in nfsd thread
    
    [ Upstream commit 9d3792aefdcda71d20c2b1ecc589c17ae71eb523 ]
    
    There is nothing happening in the start of nfsd() that requires
    protection by the mutex, so don't take it until shutting down the thread
    - which does still require protection - but only for nfsd_put().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: address merge conflict with fd2468fa1301 ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Never call nfsd_file_gc() in foreground paths [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:30 2022 -0400

    NFSD: Never call nfsd_file_gc() in foreground paths
    
    [ Upstream commit 6df19411367a5fb4ef61854cbd1af269c077f917 ]
    
    The checks in nfsd_file_acquire() and nfsd_file_put() that directly
    invoke filecache garbage collection are intended to keep cache
    occupancy between a low- and high-watermark. The reason to limit the
    capacity of the filecache is to keep filecache lookups reasonably
    fast.
    
    However, invoking garbage collection at those points has some
    undesirable negative impacts. Files that are held open by NFSv4
    clients often push the occupancy of the filecache over these
    watermarks. At that point:
    
    - Every call to nfsd_file_acquire() and nfsd_file_put() results in
      an LRU walk. This has the same effect on lookup latency as long
      chains in the hash table.
    - Garbage collection will then run on every nfsd thread, causing a
      lot of unnecessary lock contention.
    - Limiting cache capacity pushes out files used only by NFSv3
      clients, which are the type of files the filecache is supposed to
      help.
    
    To address those negative impacts, remove the direct calls to the
    garbage collector. Subsequent patches will address maintaining
    lookup efficiency as cache capacity increases.
    
    Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
    Suggested-by: Dave Chinner <david@fromorbit.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: nfsd_file_hash_remove can compute hashval [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:03 2022 -0400

    NFSD: nfsd_file_hash_remove can compute hashval
    
    [ Upstream commit cb7ec76e73ff6640241c8f1f2f35c81d4005a2d6 ]
    
    Remove an unnecessary use of nf_hashval.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entries [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jan 6 10:39:00 2023 -0500

    nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entries
    
    [ Upstream commit 6c31e4c98853a4ba47355ea151b36a77c42b7734 ]
    
    Since v4 files are expected to be long-lived, there's little value in
    closing them out of the cache when there is conflicting access.
    
    Change the comparator to also match the gc value in the key. Change both
    of the current users of that key to set the gc value in the key to
    "true".
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: nfsd_file_put() can sleep [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed May 11 13:02:21 2022 -0400

    NFSD: nfsd_file_put() can sleep
    
    [ Upstream commit 08af54b3e5729bc1d56ad3190af811301bdc37a1 ]
    
    Now that there are no more callers of nfsd_file_put() that might
    hold a spin lock, ensure the lockdep infrastructure can catch
    newly introduced calls to nfsd_file_put() made while a spinlock
    is held.
    
    Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aada
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:50 2022 -0400

    NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
    
    [ Upstream commit 8755326399f471ec3b31e2ab8c5074c0d28a0fb5 ]
    
    Remove an unnecessary usage of nf_hashval.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: nfserrno(-ENOMEM) is nfserr_jukebox [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:09 2022 -0400

    NFSD: nfserrno(-ENOMEM) is nfserr_jukebox
    
    [ Upstream commit bb4d842722b84a2731257054b6405f2d866fc5f3 ]
    
    Suggested-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
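
Illustration (not part of the upstream commit): a table-driven errno translation in the spirit of nfserrno(), with -ENOMEM mapped to a "jukebox" (try again later) status as the subject line states. The enum values and function name are hypothetical, and falling back to an I/O error for unlisted errno values is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical protocol status codes for this sketch only. */
enum sketch_nfserr {
	SKETCH_NFS_OK,
	SKETCH_NFSERR_NOENT,
	SKETCH_NFSERR_IO,
	SKETCH_NFSERR_JUKEBOX,
};

static enum sketch_nfserr sketch_nfserrno(int err)
{
	static const struct {
		enum sketch_nfserr nfserr;
		int syserr;
	} map[] = {
		{ SKETCH_NFS_OK, 0 },
		{ SKETCH_NFSERR_NOENT, -ENOENT },
		{ SKETCH_NFSERR_IO, -EIO },
		{ SKETCH_NFSERR_JUKEBOX, -ENOMEM },	/* transient: client retries */
	};
	size_t i;

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (map[i].syserr == err)
			return map[i].nfserr;
	return SKETCH_NFSERR_IO;	/* assumption: unlisted errno values */
}
```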

NFSD: NFSv4 CLOSE should release an nfsd_file immediately [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:27:02 2022 -0400

    NFSD: NFSv4 CLOSE should release an nfsd_file immediately
    
    [ Upstream commit 5e138c4a750dc140d881dab4a8804b094bbc08d2 ]
    
    The last close of a file should enable other accessors to open and
    use that file immediately. Leaving the file open in the filecache
    prevents other users from accessing that file until the filecache
    garbage-collects the file -- sometimes that takes several seconds.
    
    Reported-by: Wang Yugui <wangyugui@e16-tech.com>
    Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: No longer record nf_hashval in the trace log [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:37 2022 -0400

    NFSD: No longer record nf_hashval in the trace log
    
    [ Upstream commit 54f7df7094b329ca35d9f9808692bb16c48b13e9 ]
    
    I'm about to replace nfsd_file_hashtbl with an rhashtable. The
    individual hash values will no longer be visible or relevant, so
    remove them from the tracepoints.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: only call fh_unlock() once in nfsd_link() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: only call fh_unlock() once in nfsd_link()
    
    [ Upstream commit e18bcb33bc5b69bccc2b532075aa00bb49cc01c5 ]
    
    On non-error paths, nfsd_link() calls fh_unlock() twice.  This is safe
    because fh_unlock() records that the unlock has been done and doesn't
    repeat it.
    However it makes the code a little confusing and interferes with changes
    that are planned for directory locking.
    
    So rearrange the code to ensure fh_unlock() is called exactly once if
    fh_lock() was called.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
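
Illustration (not part of the upstream commit): an unlock helper that records its own state tolerates a second call, but a single exit path that unlocks exactly once is easier to follow. The struct and functions below are hypothetical and only model the pattern.

```c
#include <stdbool.h>

/* Hypothetical file handle with a "locked" flag, as the message describes. */
struct sketch_fh {
	bool locked;
	int real_unlocks;	/* how many times the lock was actually dropped */
};

static void sketch_fh_lock(struct sketch_fh *fh)
{
	fh->locked = true;
}

/* Idempotent: a repeated call is a no-op because the state is recorded. */
static void sketch_fh_unlock(struct sketch_fh *fh)
{
	if (!fh->locked)
		return;
	fh->locked = false;
	fh->real_unlocks++;
}

/* Rearranged flow: one unlock if and only if the lock was taken. */
static int sketch_link(struct sketch_fh *fh, bool fail_early, bool fail_op)
{
	int err = 0;

	if (fail_early)
		return -1;		/* lock never taken, nothing to undo */
	sketch_fh_lock(fh);
	if (fail_op)
		err = -2;
	sketch_fh_unlock(fh);		/* shared by success and failure paths */
	return err;
}
```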

nfsd: only call inode_query_iversion in the I_VERSION case [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Nov 30 17:46:14 2020 -0500

    nfsd: only call inode_query_iversion in the I_VERSION case
    
    [ Upstream commit 70b87f77294d16d3e567056ba4c9ee2b091a5b50 ]
    
    inode_query_iversion() can modify i_version.  Depending on the exported
    filesystem, that may not be safe.  For example, if you're re-exporting
    NFS, NFS stores the server's change attribute in i_version and does not
    expect it to be modified locally.  This has been observed causing
    unnecessary cache invalidations.
    
    The way a filesystem indicates that it's OK to call
    inode_query_iversion() is by setting SB_I_VERSION.
    
    So, move the I_VERSION check out of encode_change(), where it's used
    only in GETATTR responses, to nfsd4_change_attribute(), which is
    also called for pre- and post- operation attributes.
    
    (Note we could also pull the NFSEXP_V4ROOT case into
    nfsd4_change_attribute() as well.  That would actually be a no-op,
    since pre/post attrs are only used for metadata-modifying operations,
    and V4ROOT exports are read-only.  But we might make the change in
    the future just for simplicity.)
    
    Reported-by: Daire Byrne <daire@dneg.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: only fill out return pointer on success in nfsd4_lookup_stateid [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Sep 26 12:38:44 2022 -0400

    nfsd: only fill out return pointer on success in nfsd4_lookup_stateid
    
    [ Upstream commit 4d01416ab41540bb13ec4a39ac4e6c4aa5934bc9 ]
    
    In the case of a revoked delegation, we still fill out the pointer even
    when returning an error, which is bad form. Only overwrite the pointer
    on success.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
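
Illustration (not part of the upstream commit): a lookup that writes its output parameter only on success, so a caller that ignores or mishandles the error never sees a pointer to a rejected object. The table, error values, and names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical state object; "revoked" models a revoked delegation. */
struct sketch_stid {
	int revoked;
};

/* Returns 0 on success; *out is written only in that case. */
static int sketch_lookup(struct sketch_stid *table, size_t n, size_t idx,
			 struct sketch_stid **out)
{
	struct sketch_stid *found;

	if (idx >= n)
		return -1;		/* not found, *out untouched */
	found = &table[idx];
	if (found->revoked)
		return -2;		/* revoked, *out untouched */
	*out = found;
	return 0;
}
```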

NFSD: Optimize DRC bucket pruning [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 20 15:25:21 2021 -0400

    NFSD: Optimize DRC bucket pruning
    
    [ Upstream commit 8847ecc9274a14114385d1cb4030326baa0766eb ]
    
    DRC bucket pruning is done by nfsd_cache_lookup(), which is part of
    every NFSv2 and NFSv3 dispatch (i.e., it's done while the client is
    waiting).
    
    I added a trace_printk() in prune_bucket() to see just how long
    it takes to prune. Here are two ends of the spectrum:
    
     prune_bucket: Scanned 1 and freed 0 in 90 ns, 62 entries remaining
     prune_bucket: Scanned 2 and freed 1 in 716 ns, 63 entries remaining
    ...
     prune_bucket: Scanned 75 and freed 74 in 34149 ns, 1 entries remaining
    
    Pruning latency is noticeable on fast transports with fast storage.
    By noticeable, I mean that the latency measured here in the worst
    case is the same order of magnitude as the round trip time for
    cached server operations.
    
    We could do something like moving expired entries to an expired list
    and then free them later instead of freeing them right in
    prune_bucket(). But simply limiting the number of entries that can
    be pruned by a lookup is simple and retains more entries in the
    cache, making the DRC somewhat more effective.
    
    Comparison with a 70/30 fio 8KB 12 thread direct I/O test:
    
    Before:
    
      write: IOPS=61.6k, BW=481MiB/s (505MB/s)(14.1GiB/30001msec); 0 zone resets
    
    WRITE:
            1848726 ops (30%)
            avg bytes sent per op: 8340 avg bytes received per op: 136
            backlog wait: 0.635158  RTT: 0.128525   total execute time: 0.827242 (milliseconds)
    
    After:
    
      write: IOPS=63.0k, BW=492MiB/s (516MB/s)(14.4GiB/30001msec); 0 zone resets
    
    WRITE:
            1891144 ops (30%)
            avg bytes sent per op: 8340 avg bytes received per op: 136
            backlog wait: 0.616114  RTT: 0.126842   total execute time: 0.805348 (milliseconds)
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Optimize nfsd4_encode_fattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:08:45 2022 -0400

    NFSD: Optimize nfsd4_encode_fattr()
    
    [ Upstream commit ab04de60ae1cc64ae16b77feae795311b97720c7 ]
    
    write_bytes_to_xdr_buf() is a generic way to place a variable-length
    data item in an already-reserved spot in the encoding buffer.
    
    However, it is costly. In nfsd4_encode_fattr(), it is unnecessary
    because the data item is fixed in size and the buffer destination
    address is always word-aligned.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Optimize nfsd4_encode_operation() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:08:38 2022 -0400

    NFSD: Optimize nfsd4_encode_operation()
    
    [ Upstream commit 095a764b7afb06c9499b798c04eaa3cbf70ebe2d ]
    
    write_bytes_to_xdr_buf() is a generic way to place a variable-length
    data item in an already-reserved spot in the encoding buffer.
    However, it is costly, and here, it is unnecessary because the
    data item is fixed in size, the buffer destination address is
    always word-aligned, and the destination location is already in
    @p.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
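
Illustration (not part of the upstream commit): when the reserved slot is a single word-aligned 32-bit item, a direct big-endian store gives the same bytes as a generic byte copy into the buffer. This applies equally to the nfsd4_encode_fattr() entry above and the nfsd4_encode_readv() entry below. Both helpers are hypothetical user-space stand-ins and do not reproduce the kernel's xdr_buf handling.

```c
#include <arpa/inet.h>	/* htonl */
#include <stdint.h>
#include <string.h>

/* Generic path: copy len bytes to an arbitrary byte offset. */
static void sketch_write_bytes(uint8_t *buf, size_t offset,
			       const void *data, size_t len)
{
	memcpy(buf + offset, data, len);
}

/* Direct path: the slot pointer was saved when the space was reserved. */
static void sketch_store_word(uint32_t *slot, uint32_t value)
{
	*slot = htonl(value);	/* XDR integers are big-endian on the wire */
}
```

The direct path needs no offset arithmetic or length handling, which is the saving the commit message describes.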

NFSD: Optimize nfsd4_encode_readv() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:09:04 2022 -0400

    NFSD: Optimize nfsd4_encode_readv()
    
    [ Upstream commit 28d5bc468efe74b790e052f758ce083a5015c665 ]
    
    write_bytes_to_xdr_buf() is pretty expensive to use for inserting
    an XDR data item that is always 1 XDR_UNIT at an address that is
    always XDR word-aligned.
    
    Since both the readv and splice read paths encode EOF and maxcount
    values, move both to a common code path.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Pack struct nfsd4_compoundres [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:36 2022 -0400

    NFSD: Pack struct nfsd4_compoundres
    
    [ Upstream commit 9f553e61bd36c1048543ac2f6945103dd2f742be ]
    
    Remove a couple of 4-byte holes on platforms with 64-bit pointers.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: pass range end to vfs_fsync_range() instead of count [+ + +]

Author: Brian Foster <bfoster@redhat.com>
Date:   Wed Nov 16 10:28:36 2022 -0500

    NFSD: pass range end to vfs_fsync_range() instead of count
    
    [ Upstream commit 79a1d88a36f77374c77fd41a4386d8c2736b8704 ]
    
    _nfsd_copy_file_range() calls vfs_fsync_range() with an offset and
    count (bytes written), but the former wants the start and end bytes
    of the range to sync. Fix it up.
    
    Fixes: eac0b17a77fb ("NFSD add vfs_fsync after async copy is done")
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Tested-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Pass the target nfsd_file to nfsd_commit() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:46:38 2022 -0400

    NFSD: Pass the target nfsd_file to nfsd_commit()
    
    [ Upstream commit c252849082ff525af18b4f253b3c9ece94e951ed ]
    
    In a moment I'm going to introduce separate nfsd_file types, one of
    which is garbage-collected; the other, not. The garbage-collected
    variety is to be used by NFSv2 and v3, and the non-garbage-collected
    variety is to be used by NFSv4.
    
    nfsd_commit() is invoked by both NFSv3 and NFSv4 consumers. We want
    nfsd_commit() to find and use the correct variety of cached
    nfsd_file object for the NFS version that is in use.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Prevent a possible oops in the nfs_dirent() tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jun 25 11:12:49 2021 -0400

    NFSD: Prevent a possible oops in the nfs_dirent() tracepoint
    
    [ Upstream commit 7b08cf62b1239a4322427d677ea9363f0ab677c6 ]
    
    The double copy of the string is a mistake, plus __assign_str()
    uses strlen(), which is wrong to do on a string that isn't
    guaranteed to be NUL-terminated.
    
    Fixes: 6019ce0742ca ("NFSD: Add a tracepoint to record directory entry encoding")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Prevent truncation of an unlinked inode from blocking access to its directory [+ + +]

Author: Yu Hsiang Huang <nickhuang@synology.com>
Date:   Fri May 14 11:58:29 2021 +0800

    nfsd: Prevent truncation of an unlinked inode from blocking access to its directory
    
    [ Upstream commit e5d74a2d0ee67ae00edad43c3d7811016e4d2e21 ]
    
    Truncation of an unlinked inode may take a long time waiting for I/O, and
    it doesn't have to prevent access to the directory. Thus, let truncation
    occur outside the directory's mutex, just like do_unlinkat() does.
    
    Signed-off-by: Yu Hsiang Huang <nickhuang@synology.com>
    Signed-off-by: Bing Jing Chang <bingjingc@synology.com>
    Signed-off-by: Robbie Ko <robbieko@synology.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Propagate some error code returned by memdup_user() [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Thu Sep 1 07:27:19 2022 +0200

    nfsd: Propagate some error code returned by memdup_user()
    
    [ Upstream commit 30a30fcc3fc1ad4c5d017c9fcb75dc8f59e7bdad ]
    
    Propagate the error code returned by memdup_user() instead of a hard coded
    -EFAULT.
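
    The pattern can be sketched in user space as below. The
    err_ptr()/is_err()/ptr_err() helpers are stand-ins for the
    kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), and fake_memdup(),
    load_buffer() and FAKE_ALLOC_LIMIT are hypothetical names used
    only for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO	 4095
#define FAKE_ALLOC_LIMIT 1024	/* hypothetical: larger requests "fail" */

/* User-space stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(). */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical stand-in for memdup_user(): can fail in two ways. */
static void *fake_memdup(const void *src, size_t len)
{
	void *p;

	if (!src)
		return err_ptr(-EFAULT);
	if (len > FAKE_ALLOC_LIMIT)
		return err_ptr(-ENOMEM);
	p = malloc(len);
	if (!p)
		return err_ptr(-ENOMEM);
	memcpy(p, src, len);
	return p;
}

/* Returns whatever error the duplication reported, not a fixed code. */
static long load_buffer(const void *src, size_t len, void **out)
{
	void *buf = fake_memdup(src, len);

	if (is_err(buf))
		return ptr_err(buf);
	*out = buf;
	return 0;
}
```

    With a hard-coded -EFAULT, the caller could not tell an
    allocation failure apart from a bad source pointer.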
    
    Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Protect against filesystem freezing [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 6 10:43:47 2023 -0500

    NFSD: Protect against filesystem freezing
    
    [ Upstream commit fd9a2e1d513823e840960cb3bc26d8b7749d4ac2 ]
    
    Flole observes this WARNING on occasion:
    
    [1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0
    
    Reported-by: <flole@flole.de>
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123
    Fixes: 73da852e3831 ("nfsd: use vfs_iter_read/write")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Protect against send buffer overflow in NFSv2 READ [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 1 15:10:18 2022 -0400

    NFSD: Protect against send buffer overflow in NFSv2 READ
    
    [ Upstream commit 401bc1f90874280a80b93f23be33a0e7e2d1f912 ]
    
    Since before the git era, NFSD has conserved the number of pages
    held by each nfsd thread by combining the RPC receive and send
    buffers into a single array of pages. This works because there are
    no cases where an operation needs a large RPC Call message and a
    large RPC Reply at the same time.
    
    Once an RPC Call has been received, svc_process() updates
    svc_rqst::rq_res to describe the part of rq_pages that can be
    used for constructing the Reply. This means that the send buffer
    (rq_res) shrinks when the received RPC record containing the RPC
    Call is large.
    
    A client can force this shrinkage on TCP by sending a correctly-
    formed RPC Call header contained in an RPC record that is
    excessively large. The full maximum payload size cannot be
    constructed in that case.
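
    One way to guard against this is to clamp the requested count to
    the space that is really available. The sketch below assumes such
    a clamp; the patch itself is not quoted in this changelog, and
    clamp_read_count() and its parameter names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: limit a client-requested READ count to both
 * the protocol's maximum payload and the space left in the send
 * buffer after a large RPC Call has been received.
 */
static uint32_t clamp_read_count(uint32_t requested, uint32_t max_payload,
				 uint32_t send_buf_len)
{
	uint32_t count = requested;

	if (count > max_payload)
		count = max_payload;
	if (count > send_buf_len)
		count = send_buf_len;
	return count;
}
```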
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Protect against send buffer overflow in NFSv2 READDIR [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 1 15:10:05 2022 -0400

    NFSD: Protect against send buffer overflow in NFSv2 READDIR
    
    [ Upstream commit 00b4492686e0497fdb924a9d4c8f6f99377e176c ]
    
    Restore the previous limit on the @count argument to prevent a
    buffer overflow attack.
    
    Fixes: 53b1119a6e50 ("NFSD: Fix READDIR buffer overflow")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Protect against send buffer overflow in NFSv3 READ [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 1 15:10:24 2022 -0400

    NFSD: Protect against send buffer overflow in NFSv3 READ
    
    [ Upstream commit fa6be9cc6e80ec79892ddf08a8c10cabab9baf38 ]
    
    Since before the git era, NFSD has conserved the number of pages
    held by each nfsd thread by combining the RPC receive and send
    buffers into a single array of pages. This works because there are
    no cases where an operation needs a large RPC Call message and a
    large RPC Reply at the same time.
    
    Once an RPC Call has been received, svc_process() updates
    svc_rqst::rq_res to describe the part of rq_pages that can be
    used for constructing the Reply. This means that the send buffer
    (rq_res) shrinks when the received RPC record containing the RPC
    Call is large.
    
    A client can force this shrinkage on TCP by sending a correctly-
    formed RPC Call header contained in an RPC record that is
    excessively large. The full maximum payload size cannot be
    constructed in that case.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Protect against send buffer overflow in NFSv3 READDIR [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 1 15:10:12 2022 -0400

    NFSD: Protect against send buffer overflow in NFSv3 READDIR
    
    [ Upstream commit 640f87c190e0d1b2a0fcb2ecf6d2cd53b1c41991 ]
    
    Since before the git era, NFSD has conserved the number of pages
    held by each nfsd thread by combining the RPC receive and send
    buffers into a single array of pages. This works because there are
    no cases where an operation needs a large RPC Call message and a
    large RPC Reply message at the same time.
    
    Once an RPC Call has been received, svc_process() updates
    svc_rqst::rq_res to describe the part of rq_pages that can be
    used for constructing the Reply. This means that the send buffer
    (rq_res) shrinks when the received RPC record containing the RPC
    Call is large.
    
    A client can force this shrinkage on TCP by sending a correctly-
    formed RPC Call header contained in an RPC record that is
    excessively large. The full maximum payload size cannot be
    constructed in that case.
    
    Thanks to Aleksi Illikainen and Kari Hulkko for uncovering this
    issue.
    
    Reported-by: Ben Ronallo <Benjamin.Ronallo@synopsys.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: protect concurrent access to nfsd stats counters [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jan 6 09:52:35 2021 +0200

    nfsd: protect concurrent access to nfsd stats counters
    
    [ Upstream commit e567b98ce9a4b35b63c364d24828a9e5cd7a8179 ]
    
    nfsd stats counters can be updated by concurrent nfsd threads without any
    protection.
    
    Convert some nfsd_stats and nfsd_net struct members to use percpu counters.
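
    The idea behind a per-CPU counter can be sketched in user space
    as a counter split into slots. This is only an illustration of
    the sharding idea, not the kernel's percpu_counter
    implementation; struct sharded_counter, NR_SLOTS and both helpers
    are hypothetical.

```c
#include <stdint.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs */

struct sharded_counter {
	int64_t slot[NR_SLOTS];
};

/* Each updater adds only to its own slot, so updaters share no word. */
static void sharded_add(struct sharded_counter *c, unsigned int cpu,
			int64_t n)
{
	c->slot[cpu % NR_SLOTS] += n;
}

/* A reader folds all slots into one total. */
static int64_t sharded_sum(const struct sharded_counter *c)
{
	int64_t total = 0;
	unsigned int i;

	for (i = 0; i < NR_SLOTS; i++)
		total += c->slot[i];
	return total;
}
```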
    
    The longest_chain* members of struct nfsd_net remain unprotected.
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: put the export reference in nfsd4_verify_deleg_dentry [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Nov 8 11:23:11 2022 -0500

    nfsd: put the export reference in nfsd4_verify_deleg_dentry
    
    [ Upstream commit 50256e4793a5e5ab77703c82a47344ad2e774a59 ]
    
    nfsd_lookup_dentry returns an export reference in addition to the dentry
    ref. Ensure that we put it too.
    
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866
    Fixes: 876c553cb410 ("NFSD: verify the opened dentry after setting a delegation")
    Reported-by: Yongcheng Yang <yoyang@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Record NFSv4 pre/post-op attributes as non-atomic [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Mon Nov 30 23:14:27 2020 -0500

    nfsd: Record NFSv4 pre/post-op attributes as non-atomic
    
    [ Upstream commit 716a8bc7f706eeef80ab42c99d9f210eda845c81 ]
    
    For the case of NFSv4, specify to the client that the pre/post-op
    attributes were not recorded atomically with the main operation.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Record number of flush calls [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:45 2022 -0400

    NFSD: Record number of flush calls
    
    [ Upstream commit df2aff524faceaf743b7c5ab0f4fb86cb511f782 ]
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:22:44 2022 -0400

    NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing
    
    [ Upstream commit 3fdc546462348b8a497c72bc894e0cde9f10fc40 ]
    
    Have SunRPC clear everything except for the iops array. Then have
    each NFSv4 XDR decoder clear its own argument before decoding.
    
    Now individual operations may have a large argument struct while not
    penalizing the vast majority of operations with a small struct.
    
    And, clearing the argument structure occurs as the argument fields
    are initialized, enabling the CPU to do write combining on that
    memory. In some cases, clearing is not even necessary because all
    of the fields in the argument structure are initialized by the
    decoder.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: reduce locking in nfsd_lookup() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: reduce locking in nfsd_lookup()
    
    [ Upstream commit 19d008b46941b8c668402170522e0f7a9258409c ]
    
    nfsd_lookup() takes an exclusive lock on the parent inode, but no
    callers want the lock and it may not be needed at all if the
    result is in the dcache.
    
    Change nfsd_lookup_dentry() to not take the lock, and call
    lookup_one_len_locked() which takes the lock only if needed.
    
    nfsd4_open() currently expects the lock to still be held, but that isn't
    necessary as nfsd_validate_delegated_dentry() provides required
    guarantees without the lock.
    
    NOTE: NFSv4 requires directory changeinfo for OPEN even when a create
      wasn't requested and no change happened.  Now that nfsd_lookup()
      doesn't use fh_lock(), we need to explicitly fill the attributes
      when no create happens.  A new fh_fill_both_attrs() is provided
      for that task.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jan 15 09:28:44 2021 -0500

    NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations
    
    [ Upstream commit 76ed0dd96eeb2771b21bf5dcbd88326ef89ee0ed ]
    
    During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances
    rq_next_page to the full size of the client-requested buffer, then
    releases all those pages at the end of the request. The next request
    to use that nfsd thread has to refill the pages.
    
    NFSD does this even when the dirlist in the reply is small. With
    NFSv3 clients that send READDIR operations with large buffer sizes,
    that can be 256 put_page/alloc_page pairs per READDIR request, even
    though those pages often remain unused.
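
    The figure of 256 is consistent with a 1 MiB READDIR buffer and
    4 KiB pages. Both sizes are assumptions made for this sketch and
    are not stated in the commit; pages_for_buffer() is a
    hypothetical helper.

```c
#include <stddef.h>

/* Number of pages needed to hold buf_bytes, rounding up. */
static size_t pages_for_buffer(size_t buf_bytes, size_t page_bytes)
{
	return (buf_bytes + page_bytes - 1) / page_bytes;
}
```

    Under those assumptions, pages_for_buffer(1048576, 4096) gives
    the 256 pages that would be released and refilled per request.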
    
    We can save some work by not releasing dirlist buffer pages that
    were not used to form the READDIR Reply. I've left the NFSv2 code
    alone since there are never more than three pages involved in an
    NFSv2 READDIR Reply.
    
    Eventually we should nail down why these pages need to be released
    at all in order to avoid allocating and releasing pages
    unnecessarily.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor __nfsd_file_close_inode() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:57 2022 -0400

    NFSD: Refactor __nfsd_file_close_inode()
    
    [ Upstream commit a845511007a63467fee575353c706806c21218b1 ]
    
    The code that computes the hashval is the same in both callers.
    
    To prevent them from going stale, reframe the documenting comments
    to remove descriptions of the underlying hash table structure, which
    is about to be replaced.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor common code out of dirlist helpers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:22:56 2022 -0400

    NFSD: Refactor common code out of dirlist helpers
    
    [ Upstream commit 98124f5bd6c76699d514fbe491dd95265369cc99 ]
    
    The dust has settled a bit and it's become obvious what code is
    totally common between nfsd_init_dirlist_pages() and
    nfsd3_init_dirlist_pages(). Move that common code to SUNRPC.
    
    The new helper brackets the existing xdr_init_decode_pages() API.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor find_file() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:47 2022 -0400

    NFSD: Refactor find_file()
    
    [ Upstream commit 15424748001a9b5ea62b3e6ad45f0a8b27f01df9 ]
    
    find_file() is now the only caller of find_file_locked(), so just
    fold these two together.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:47 2022 -0400

    NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
    
    [ Upstream commit 24d796ea383b8a4c8234e06d1b14bbcd371192ea ]
    
    The @src parameter is sometimes a pointer to a struct nfsd_file and
    sometimes a pointer to struct file hiding in a phony struct
    nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter
    is always an explicit struct file.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:53 2022 -0400

    NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
    
    [ Upstream commit 478ed7b10d875da2743d1a22822b9f8a82df8f12 ]
    
    Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A
    subsequent patch will modify one of the new call sites to avoid
    the need to manufacture the phony struct nfsd_file.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd4_do_copy() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:59 2022 -0400

    NFSD: Refactor nfsd4_do_copy()
    
    [ Upstream commit 3b7bf5933cada732783554edf0dc61283551c6cf ]
    
    Refactor: Now that nfsd4_do_copy() no longer calls the cleanup
    helpers, plumb the use of struct file pointers all the way down to
    _nfsd_copy_file_range().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd_create_setattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 28 16:10:17 2022 -0400

    NFSD: Refactor nfsd_create_setattr()
    
    [ Upstream commit 5f46e950c395b9c14c282b53ba78c5fd46d6c256 ]
    
    I'd like to move do_nfsd_create() out of vfs.c. Therefore
    nfsd_create_setattr() needs to be made publicly visible.
    
    Note that both call sites in vfs.c commit both the new object and
    its parent directory, so just combine those common metadata commits
    into nfsd_create_setattr().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd_file_gc() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:25 2022 -0400

    NFSD: Refactor nfsd_file_gc()
    
    [ Upstream commit 3bc6d3470fe412f818f9bff6b71d1be3a76af8f3 ]
    
    Refactor nfsd_file_gc() to use the new list_lru helper.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd_file_lru_scan() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:31 2022 -0400

    NFSD: Refactor nfsd_file_lru_scan()
    
    [ Upstream commit 39f1d1ff8148902c5692ffb0e1c4479416ab44a7 ]
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor nfsd_setattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 8 18:14:07 2022 -0400

    NFSD: Refactor nfsd_setattr()
    
    [ Upstream commit c0aa1913db57219e91a0a8832363cbafb3a9cf8f ]
    
    Move code that will be retried (in a subsequent patch) into a helper
    function.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor NFSv3 CREATE [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 28 13:29:23 2022 -0400

    NFSD: Refactor NFSv3 CREATE
    
    [ Upstream commit df9606abddfb01090d5ece7dcc2441d848f690f0 ]
    
    The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to
    diverge such that it makes sense to split do_nfsd_create() into one
    version for NFSv3 and one for NFSv4.
    
    As a first step, copy do_nfsd_create() to nfs3proc.c and remove
    NFSv4-specific logic.
    
    One immediate legibility benefit is that the logic for handling
    NFSv3 createhow is now quite straightforward. NFSv4 createhow
    has some subtleties that IMO do not belong in generic code.
    
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Refactor NFSv4 OPEN(CREATE) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 28 14:47:34 2022 -0400

    NFSD: Refactor NFSv4 OPEN(CREATE)
    
    [ Upstream commit 254454a5aa4a9f696d6bae080c08d5863e650f49 ]
    
    Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic.
    
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: refactor set_client [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:41 2021 -0500

    nfsd: refactor set_client
    
    [ Upstream commit 7950b5316e40d99dcb85ab81a2d1dbb913d7c1c8 ]
    
    This'll be useful elsewhere.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Nov 16 19:44:45 2022 -0800

    NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
    
    [ Upstream commit a1049eb47f20b9eabf9afb218578fff16b4baca6 ]
    
    Refactor courtesy_client_reaper into a generic low-memory
    shrinker so it can be used for other purposes.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: refactoring v4 specific code to a helper in nfs4state.c [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Fri Jul 15 16:54:51 2022 -0700

    NFSD: refactoring v4 specific code to a helper in nfs4state.c
    
    [ Upstream commit 6867137ebcf4155fe25f2ecf7c29b9fb90a76d1d ]
    
    This patch moves the v4-specific code from nfsd_init_net() to the
    nfsd4_init_leases_net() helper in nfs4state.c.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Jan 11 12:17:09 2023 -0800

    NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time
    
    [ Upstream commit f385f7d244134246f984975ed34cd75f77de479f ]
    
    Currently the nfsd-client shrinker is registered and unregistered at
    the time the nfsd module is loaded and unloaded. The problem with this
    is the shrinker is being registered before all of the relevant fields
    in nfsd_net are initialized when nfsd is started. This can lead to an
    oops when memory is low and the shrinker is called while nfsd is not
    running.
    
    This patch moves the register/unregister of nfsd-client shrinker from
    module load/unload time to nfsd startup/shutdown time.
    
    Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
    Reported-by: Mike Galbraith <efault@gmx.de>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    [ cel: adjusted to apply without e33c267ab70d ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Relocate nfsd4_decode_opaque() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 11:41:55 2020 -0500

    NFSD: Relocate nfsd4_decode_opaque()
    
    [ Upstream commit 5dcbfabb676b2b6d97767209cf707eb463ca232a ]
    
    Enable nfsd4_decode_opaque() to be used in more decoders, and
    replace the READ* macros in nfsd4_decode_opaque().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove "inline" directives on op_rsize_bop helpers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:25 2022 -0400

    NFSD: Remove "inline" directives on op_rsize_bop helpers
    
    [ Upstream commit 6604148cf961b57fc735e4204f8996536da9253c ]
    
    These helpers are always invoked indirectly, so the compiler can't
    inline these anyway. While we're updating the synopses of these
    helpers, defensively convert their parameters to const pointers.
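
    A small sketch of why "inline" has little effect here: helpers
    reached through a table of function pointers are called
    indirectly. All names below (struct demo_op, demo_ops, the
    demo_*_rsize helpers) are hypothetical and do not mirror the
    NFSD code.

```c
#include <stdint.h>

struct demo_op {
	/* Invoked only through this pointer, never by name. */
	uint32_t (*rsize)(const uint32_t *arg);
};

static uint32_t demo_small_rsize(const uint32_t *arg)
{
	(void)arg;	/* fixed-size reply: argument unused */
	return 8;
}

static uint32_t demo_scaled_rsize(const uint32_t *arg)
{
	return 8 + *arg;	/* reply grows with the argument */
}

static const struct demo_op demo_ops[] = {
	{ .rsize = demo_small_rsize },
	{ .rsize = demo_scaled_rsize },
};

/* Dispatch through the table; the callee is chosen at run time. */
static uint32_t demo_reply_size(unsigned int opnum, const uint32_t *arg)
{
	return demo_ops[opnum].rsize(arg);
}
```

    The const-qualified parameter also documents that a helper only
    reads its argument.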
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove argument length checking in nfsd_dispatch() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 10:08:19 2020 -0400

    NFSD: Remove argument length checking in nfsd_dispatch()
    
    [ Upstream commit 5650682e16f41722f735b7beeb2dbc3411dfbeb6 ]
    
    Now that the argument decoders for NFSv2 and NFSv3 use the
    xdr_stream mechanism, the version-specific length checking logic in
    nfsd_dispatch() is no longer necessary.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove be32_to_cpu() from DRC hash function [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 30 19:10:03 2021 -0400

    NFSD: Remove be32_to_cpu() from DRC hash function
    
    [ Upstream commit 7578b2f628db27281d3165af0aa862311883a858 ]
    
    Commit 7142b98d9fd7 ("nfsd: Clean up drc cache in preparation for
    global spinlock elimination"), billed as a clean-up, added
    be32_to_cpu() to the DRC hash function without explanation. That
    commit removed two comments that state that byte-swapping in the
    hash function is unnecessary without explaining whether there was
    a need for that change.
    
    On some Intel CPUs, the swab32 instruction is known to cause a CPU
    pipeline stall. be32_to_cpu() does not add extra randomness, since
    the hash multiplication is done /before/ shifting to the high-order
    bits of the result.
    
    As a micro-optimization, remove the unnecessary transform from the
    DRC hash function.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove CONFIG_NFSD_V3 [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Feb 6 12:25:47 2022 -0500

    NFSD: Remove CONFIG_NFSD_V3
    
    [ Upstream commit 5f9a62ff7d2808c7b56c0ec90f3b7eae5872afe6 ]
    
    Eventually support for NFSv2 in the Linux NFS server is to be
    deprecated and then removed.
    
    However, NFSv2 is the "always supported" version that is available
    as soon as CONFIG_NFSD is set.  Before NFSv2 support can be removed,
    we need to choose a different "always supported" version.
    
    This patch removes CONFIG_NFSD_V3 so that NFSv3 is always supported,
    as NFSv2 is today. When NFSv2 support is removed, NFSv3 will become
    the only "always supported" NFS version.
    
    The defconfigs still need to be updated to remove CONFIG_NFSD_V3=y.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove do_nfsd_create() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Mar 28 15:36:58 2022 -0400

    NFSD: Remove do_nfsd_create()
    
    [ Upstream commit 1c388f27759c5d9271d4fca081f7ee138986eb7d ]
    
    Now that its two callers have their own version-specific instance of
    this function, do_nfsd_create() is no longer used.
    
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove dprintk call sites from tail of nfsd4_open() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Mar 30 14:28:51 2022 -0400

    NFSD: Remove dprintk call sites from tail of nfsd4_open()
    
    [ Upstream commit f67a16b147045815b6aaafeef8663e5faeb6d569 ]
    
    Clean up: These relics are not likely to benefit server
    administrators.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove extra "0x" in tracepoint format specifier [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Sep 4 15:06:26 2020 -0400

    NFSD: Remove extra "0x" in tracepoint format specifier
    
    [ Upstream commit 3a90e1dff16afdae6e1c918bfaff24f4d0f84869 ]
    
    Clean up: %p adds its own 0x already.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove kmalloc from nfsd4_do_async_copy() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:41:06 2022 -0400

    NFSD: Remove kmalloc from nfsd4_do_async_copy()
    
    [ Upstream commit ad1e46c9b07b13659635ee5405f83ad0df143116 ]
    
    Instead of manufacturing a phony struct nfsd_file, pass the
    struct file returned by nfs42_ssc_open() directly to
    nfsd4_do_copy().
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove lockdep assertion from unhash_and_release_locked() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:44 2022 -0400

    NFSD: Remove lockdep assertion from unhash_and_release_locked()
    
    [ Upstream commit f53cef15dddec7203df702cdc62e554190385450 ]
    
    IIUC, holding the hash bucket lock is needed only in
    nfsd_file_unhash, and there is already a lockdep assertion there.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove macros that are no longer used [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 11:12:18 2020 -0500

    NFSD: Remove macros that are no longer used
    
    [ Upstream commit 5cfc822f3e77b0477e6602d399116130317f537a ]
    
    Now that all the NFSv4 decoder functions have been converted to
    make direct calls to the xdr helpers, remove the unused C macros.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove nfsd4_prepare_cb_recall() declaration [+ + +]

Author: Gaosheng Cui <cuigaosheng1@huawei.com>
Date:   Fri Sep 9 14:59:10 2022 +0800

    nfsd: remove nfsd4_prepare_cb_recall() declaration
    
    [ Upstream commit 18224dc58d960c65446971930d0487fc72d00598 ]
    
    nfsd4_prepare_cb_recall() has been removed since
    commit 0162ac2b978e ("nfsd: introduce nfsd4_callback_ops"),
    so remove it.
    
    Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove nfsd_file::nf_hashval [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:10 2022 -0400

    NFSD: Remove nfsd_file::nf_hashval
    
    [ Upstream commit f0743c2b25c65debd4f599a7c861428cd9de5906 ]
    
    The value in this field can always be computed from nf_inode, thus
    it is no longer used.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove redundant assignment to pointer 'this' [+ + +]

Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Thu May 13 16:16:39 2021 +0100

    nfsd: remove redundant assignment to pointer 'this'
    
    [ Upstream commit e34c0ce9136a0fe96f0f547898d14c44f3c9f147 ]
    
    The pointer 'this' is being initialized with a value that is never read
    and it is being updated later with a new value. The initialization is
    redundant and can be removed.
    
    Addresses-Coverity: ("Unused value")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove redundant assignment to variable host_err [+ + +]

Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Mon Oct 10 21:24:23 2022 +0100

    NFSD: Remove redundant assignment to variable host_err
    
    [ Upstream commit 69eed23baf877bbb1f14d7f4df54f89807c9ee2a ]
    
    Variable host_err is assigned a value that is never read; it is
    re-assigned in every execution path of the switch statement that
    follows. The assignment is redundant and can be removed.
    
    Cleans up clang-scan warning:
    warning: Value stored to 'host_err' is never read [deadcode.DeadStores]
    
    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
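
The dead store described in the entry above can be illustrated with a minimal userspace sketch. The enum, the function name, and the error codes chosen here are hypothetical and are not taken from the nfsd source; the point is only that when every branch of a switch assigns the variable, an initial assignment before the switch is never read.

```c
#include <errno.h>

/* Hypothetical request kinds, used only for this illustration. */
enum req_kind { REQ_READ, REQ_WRITE, REQ_OTHER };

/*
 * Every branch of the switch assigns host_err, so an initializer such as
 * "int host_err = -EINVAL;" placed before the switch would be a dead
 * store of the kind static analyzers report.
 */
int handle_request(enum req_kind kind)
{
	int host_err;	/* deliberately left uninitialized */

	switch (kind) {
	case REQ_READ:
		host_err = 0;
		break;
	case REQ_WRITE:
		host_err = -EROFS;
		break;
	default:
		host_err = -EINVAL;
		break;
	}
	return host_err;
}
```

Because the default branch also assigns host_err, no path reaches the return statement with an indeterminate value.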

nfsd: remove redundant assignment to variable len [+ + +]

Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Tue Jun 28 22:25:25 2022 +0100

    nfsd: remove redundant assignment to variable len
    
    [ Upstream commit 842e00ac3aa3b4a4f7f750c8ab54f8578fc875d3 ]
    
    Variable len is assigned the value zero, but that value is never
    read because len is re-assigned later. The assignment is redundant
    and can be removed.
    
    Cleans up clang scan-build warning:
    fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read
    
    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: remove redundant variable status [+ + +]

Author: Jinpeng Cui <cui.jinpeng2@zte.com.cn>
Date:   Wed Aug 31 14:20:02 2022 +0000

    NFSD: remove redundant variable status
    
    [ Upstream commit 4ab3442ca384a02abf8b1f2b3449a6c547851873 ]
    
    Return the value directly from fh_verify(), do_open_permission(),
    and exp_pseudoroot() instead of passing it through the redundant
    variable status.
    
    Reported-by: Zeal Robot <zealci@zte.com.cn>
    Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove svc_serv_ops::svo_module [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Feb 16 12:31:09 2022 -0500

    NFSD: Remove svc_serv_ops::svo_module
    
    [ Upstream commit f49169c97fceb21ad6a0aaf671c50b0f520f15a5 ]
    
    struct svc_serv_ops is about to be removed.
    
    Neil Brown says:
    > I suspect svo_module can go as well - I don't think the thread is
    > ever the thing that primarily keeps a module active.
    
    A random sample of kthread_create() callers shows sunrpc is the only
    one that manages module reference count in this way.
    
    Suggested-by: Neil Brown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepoints [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:32 2021 -0400

    NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepoints
    
    [ Upstream commit 1d2bf65983a137121c165a7e69b2885572954915 ]
    
    Clean up: These are noise in properly working systems. If you really
    need to observe the operation of the callback mechanism, use the
    sunrpc:rpc* tracepoints along with the workqueue tracepoints.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove the pages_flushed statistic from filecache [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 2 14:44:47 2022 -0400

    nfsd: remove the pages_flushed statistic from filecache
    
    [ Upstream commit 1f696e230ea5198e393368b319eb55651828d687 ]
    
    We're counting mapping->nrpages, but not all of those are necessarily
    dirty. We don't really have a simple way to count just the dirty pages,
    so just remove this stat since it's not accurate.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove trace_nfsd_clid_inuse_err [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:55:42 2021 -0400

    NFSD: Remove trace_nfsd_clid_inuse_err
    
    [ Upstream commit 0bfaacac57e64aa342f865b8ddcab06ca59a6f83 ]
    
    This tracepoint has been replaced by nfsd_clid_cred_mismatch and
    nfsd_clid_verf_mismatch, and can simply be removed.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove unused function [+ + +]

Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Date:   Thu Apr 15 16:38:24 2021 +0800

    nfsd: remove unused function
    
    [ Upstream commit 363f8dd5eecd6c67fe9840ef6065440f0ee7df3a ]
    
    Fix the following clang warning:
    
    fs/nfsd/nfs4state.c:6276:1: warning: unused function 'end_offset'
    [-Wunused-function].
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove unused nfsd4_compoundargs::cachetype field [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:30 2022 -0400

    NFSD: Remove unused nfsd4_compoundargs::cachetype field
    
    [ Upstream commit 77e378cf2a595d8e39cddf28a31efe6afd9394a0 ]
    
    This field was added by commit 1091006c5eb1 ("nfsd: turn on reply
    cache for NFSv4") but was never put to use.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove unused NFSv2 directory entry encoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 15 14:30:13 2020 -0500

    NFSD: Remove unused NFSv2 directory entry encoders
    
    [ Upstream commit 8a2cf9f5709cc20a1114a7d22655928314fc86f8 ]
    
    Clean up.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Remove unused NFSv3 directory entry encoders [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 13 11:27:13 2020 -0500

    NFSD: Remove unused NFSv3 directory entry encoders
    
    [ Upstream commit 1411934627f9fe31a36ac8c43179ce9b63edce5c ]
    
    Clean up.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove unused set_client argument [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:43 2021 -0500

    nfsd: remove unused set_client argument
    
    [ Upstream commit f71475ba8c2a77fff8051903cf4b7d826c3d1693 ]
    
    Every caller is setting this argument to false, so we don't need it.
    
    Also cut this comment a bit and remove an unnecessary warning.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: remove unused stats counters [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jan 6 09:52:34 2021 +0200

    nfsd: remove unused stats counters
    
    [ Upstream commit 1b76d1df1a3683b6b23cd1c813d13c5e6a9d35e5 ]
    
    Commit 501cb1849f86 ("nfsd: rip out the raparms cache") removed the
    code that updates the read-ahead cache stats counters.
    Commit 8bbfa9f3889b ("knfsd: remove the nfsd thread busy histogram")
    removed the code that updates the thread busy stats counters back in
    2009, and the code that updated the filehandle cache stats was removed
    back in 2002.
    
    Remove the unused stats counters from nfsd_stats struct and print
    hardcoded zeros in /proc/net/rpc/nfsd.
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: remove vanity comments [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Wed Jul 28 08:56:09 2021 +1000

    NFSD: remove vanity comments
    
    [ Upstream commit ea49dc79002c416a9003f3204bc14f846a0dbcae ]
    
    Including one's name in copyright claims is appropriate.  Including it
    in random comments is just vanity.  After 2 decades, it is time for
    these to be gone.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: removed unused argument in nfsd_startup_generic() [+ + +]

Author: Vasily Averin <vasily.averin@linux.dev>
Date:   Thu Apr 15 15:00:58 2021 +0300

    nfsd: removed unused argument in nfsd_startup_generic()
    
    [ Upstream commit 70c5307564035c160078401f541c397d77b95415 ]
    
    Since commit 501cb1849f86 ("nfsd: rip out the raparms cache"),
    nrservs is not used in nfsd_startup_generic().
    
    Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Rename boot verifier functions [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Dec 30 10:22:05 2021 -0500

    NFSD: Rename boot verifier functions
    
    [ Upstream commit 3988a57885eeac05ef89f0ab4d7e47b52fbcf630 ]
    
    Clean up: These functions handle what the specs call a write
    verifier, which in the Linux NFS server implementation is now
    divorced from the server's boot instance.
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: rename lookup_clientid->set_client [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:40 2021 -0500

    nfsd: rename lookup_clientid->set_client
    
    [ Upstream commit 460d27091ae2c23e7ac959a61cd481c58832db58 ]
    
    I think this is a better name, and I'm going to reuse elsewhere the code
    that does the lookup itself.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Rename the fields in copy_stateid_t [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 22 13:10:35 2022 -0400

    NFSD: Rename the fields in copy_stateid_t
    
    [ Upstream commit 781fde1a2ba2391f31142f46f964cf1148ca1791 ]
    
    Code maintenance: The name of the copy_stateid_t::sc_count field
    collides with the sc_count field in struct nfs4_stid, making the
    latter difficult to grep for when auditing stateid reference
    counting.
    
    No behavior change expected.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Reorder the fields in struct nfsd4_op [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:28 2022 -0400

    NFSD: Reorder the fields in struct nfsd4_op
    
    [ Upstream commit d314309425ad5dc1b6facdb2d456580fb5fa5e3a ]
    
    Pack the fields to reduce the size of struct nfsd4_op, which is used
    as an array in struct nfsd4_compoundargs.
    
    sizeof(struct nfsd4_op):
    Before: /* size: 672, cachelines: 11, members: 5 */
    After:  /* size: 640, cachelines: 10, members: 5 */
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
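
As a rough illustration of why reordering fields can shrink a structure, the sketch below declares two structs with identical members in different orders. The struct and member names are hypothetical and do not reflect the real layout of struct nfsd4_op; the sizes depend on the target ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two 1-byte members separated by an 8-byte member,
 * so the compiler must pad after each small member. */
struct op_loose {
	uint8_t  replay;
	uint64_t status;
	uint8_t  opnum;
};

/* Same members with the largest first, so both small members share
 * a single padded tail. */
struct op_packed {
	uint64_t status;
	uint8_t  replay;
	uint8_t  opnum;
};

size_t loose_size(void)  { return sizeof(struct op_loose); }
size_t packed_size(void) { return sizeof(struct op_packed); }
```

On a typical 64-bit ABI where uint64_t is 8-byte aligned, the first layout occupies 24 bytes and the second 16. When the structure is an array element, as the commit notes for struct nfsd4_op, the saving is multiplied by the number of elements.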

nfsd: reorganize filecache.c [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 2 14:44:48 2022 -0400

    nfsd: reorganize filecache.c
    
    [ Upstream commit 8214118589881b2d390284410c5ff275e7a5e03c ]
    
    In a coming patch, we're going to rework how the filecache refcounting
    works. Move some code around in the function to reduce the churn in the
    later patches, and rename some of the functions with (hopefully) clearer
    names: nfsd_file_flush becomes nfsd_file_fsync, and
    nfsd_file_unhash_and_dispose is renamed to nfsd_file_unhash_and_queue.
    
    Also, the nfsd_file_put_final tracepoint is renamed to nfsd_file_free,
    to better match the name of the function from which it's called.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace boolean fields in struct nfsd4_copy [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:41 2022 -0400

    NFSD: Replace boolean fields in struct nfsd4_copy
    
    [ Upstream commit 1913cdf56cb5bfbc8170873728d13598cbecda23 ]
    
    Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy()
    with an atomic bitop.
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
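
The idea of folding several boolean members into one flags word that is updated with an atomic bit operation can be sketched in userspace C11. This is an analogy only: the struct, the flag names, and the helpers are hypothetical, and C11 atomic_fetch_or() stands in for kernel bit operations such as test_and_set_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers; the kernel patch defines its own. */
enum { COPY_F_STOPPED = 0, COPY_F_SYNCHRONOUS = 1 };

struct copy_state {
	atomic_ulong flags;	/* one word replaces several bool members */
};

/* Atomically set a bit and report whether it was already set. */
bool copy_test_and_set(struct copy_state *c, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or(&c->flags, mask) & mask) != 0;
}

/* Read a single flag without modifying the word. */
bool copy_test(struct copy_state *c, unsigned int bit)
{
	return (atomic_load(&c->flags) & (1UL << bit)) != 0;
}
```

A caller that wants "stop once" semantics can check the return value of copy_test_and_set(): only the first caller sees false, so no separate lock is needed around the check and the update.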

NFSD: replace delayed_work with work_struct for nfsd_client_shrinker [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Wed Jan 11 16:06:51 2023 -0800

    NFSD: replace delayed_work with work_struct for nfsd_client_shrinker
    
    [ Upstream commit 7c24fa225081f31bc6da6a355c1ba801889ab29a ]
    
    Since nfsd4_state_shrinker_count always calls mod_delayed_work with
    0 delay, we can replace delayed_work with work_struct to save some
    space and overhead.
    
    Also add a call to cancel_work after unregistering the shrinker
    in nfs4_state_shutdown_net.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_access() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:12:27 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_access()
    
    [ Upstream commit d169a6a9e5fd7f9e4b74e5e5d2e5a4fd0f84ef05 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_backchannel_ctl() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:14:35 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_backchannel_ctl()
    
    [ Upstream commit 0f81d96098f8eb707afe2f8d5c3fe0f9316ef5ce ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_bind_conn_to_session() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:16:23 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_bind_conn_to_session()
    
    [ Upstream commit 571e0451c4de0a545960ffaea16d969931afc563 ]
    
    A dedicated sessionid4 decoder is introduced that will be used by
    other operation decoders in subsequent patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_cb_sec() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:09:34 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_cb_sec()
    
    [ Upstream commit 1a99440807bfc66597aaa2e0f0213c319b023e34 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_clone() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:46:46 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_clone()
    
    [ Upstream commit 3dfd0b0e15671e2b4047ccb9222432f0b2d930be ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_close() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:18:23 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_close()
    
    [ Upstream commit d3d2f38154571e70d5806b5c5264bf61c101ea15 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_commit() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:19:51 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_commit()
    
    [ Upstream commit cbd9abb3706e96563b36af67595707a7054ab693 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_compound() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 11:07:06 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_compound()
    
    [ Upstream commit d9b74bdac6f24afc3101b6a5b6f59842610c9c94 ]
    
    And clean-up: Now that we have removed the DECODE_TAIL macro from
    nfsd4_decode_compound(), we observe that there's no benefit for
    nfsd4_decode_compound() to return nfs_ok or nfserr_bad_xdr only to
    have its sole caller convert those values to one or zero,
    respectively. Have nfsd4_decode_compound() return 1/0 instead.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_copy() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:49:37 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_copy()
    
    [ Upstream commit e8febea7190bcbd1e608093acb67f2a5009556aa ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_copy_notify() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 21 14:19:24 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_copy_notify()
    
    [ Upstream commit f9a953fb369bbd2135ccead3393ec1ef66544471 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_create() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:24:10 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_create()
    
    [ Upstream commit 000dfa18b3df9c62df5f768f9187cf1a94ded71d ]
    
    A dedicated decoder for component4 is introduced here, which will be
    used by other operation decoders in subsequent patches.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_create_session() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:52:44 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_create_session()
    
    [ Upstream commit 81243e3fe37ed547fc4ed8aab1cec2865540bb18 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_delegreturn() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 21 14:11:58 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_delegreturn()
    
    [ Upstream commit 95e6482cedfc0785b85db49b72a05323bbf41750 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:15:09 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid()
    
    [ Upstream commit c95f2ec3490586cbb33badc8f4c82d6aa4955078 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_destroy_session() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 13:50:55 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_destroy_session()
    
    [ Upstream commit 94e254af1f873b4b551db4c4549294f2c4d385ef ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_fallocate() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:44:05 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_fallocate()
    
    [ Upstream commit 6aef27aaeae7611f98af08205acc79f5a8f3aa59 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_fattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 12:56:05 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_fattr()
    
    [ Upstream commit d1c263a031e876ac3ca5223c728e4d98ed50b3c0 ]
    
    Let's be more careful to avoid overrunning the memory that backs
    the bitmap array. This requires updating the synopsis of
    nfsd4_decode_fattr().
    
    Bruce points out that a server needs to be careful to return nfs_ok
    when a client presents bitmap bits the server doesn't support. This
    includes bits in bitmap words the server might not yet support.
    
    The current READ* based implementation is good about that, but that
    requirement hasn't been documented.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
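
The two requirements in the entry above (never write past the end of the bitmap array, and still succeed when the client sends bitmap words the server does not support) can be sketched as a small userspace decoder. This is not the kernel implementation: decode_bitmap4(), be32_at(), and the return convention are hypothetical, and the wire format assumed here is an XDR array of big-endian 32-bit words preceded by a 32-bit word count.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one big-endian 32-bit XDR word. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode a counted bitmap from buf into bmval[0..bmlen-1].
 * Returns 0 on success, -1 if the buffer is too short for the
 * advertised word count. Words beyond bmlen are consumed but ignored,
 * so unsupported bits do not cause an error.
 */
int decode_bitmap4(const uint8_t *buf, size_t len,
		   uint32_t *bmval, size_t bmlen)
{
	size_t off = 4;
	uint32_t count, i;

	if (len < 4)
		return -1;
	count = be32_at(buf);
	if (count > (len - off) / 4)
		return -1;	/* truncated input */

	memset(bmval, 0, bmlen * sizeof(*bmval));
	for (i = 0; i < count; i++) {
		uint32_t word = be32_at(buf + off);

		off += 4;
		if (i < bmlen)
			bmval[i] = word;	/* never index past bmlen */
	}
	return 0;
}
```

Checking the count against the remaining buffer length before the loop keeps the reads in bounds, and the i < bmlen test keeps the writes in bounds regardless of what the client sends.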

NFSD: Replace READ* macros in nfsd4_decode_free_stateid() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 1 13:38:27 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_free_stateid()
    
    [ Upstream commit aec387d5909304810d899f7d90ae57df33f3a75c ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_getattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 14:40:20 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_getattr()
    
    [ Upstream commit f759eff260f1f0b0f56531517762f27ee3233506 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 15:03:50 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo()
    
    [ Upstream commit 044959715f370b24870c95df3940add8710c5a29 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_layoutcommit() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:40:07 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_layoutcommit()
    
    [ Upstream commit 5185980d8a23001c2317c290129ab7ab20067e20 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_layoutget() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 15:06:04 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_layoutget()
    
    [ Upstream commit c8e88e3aa73889421461f878cd569ef84f231ceb ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_layoutreturn() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:42:25 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_layoutreturn()
    
    [ Upstream commit 645fcad371420913c30e9aca80fc0a38f3acf432 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_link() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:01:24 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_link()
    
    [ Upstream commit 5c505d128691c70991b766dd6a3faf49fa59ecfb ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_listxattrs() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 11:04:02 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_listxattrs()
    
    [ Upstream commit 2212036cadf4da3c4b0e4bd2a9a8c3d78617ab4f ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_lock() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:29:27 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_lock()
    
    [ Upstream commit 7c59deed5cd2e1cfc6cbecf06f4584ac53755f53 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_lockt() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:31:44 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_lockt()
    
    [ Upstream commit 0a146f04aa0fa7a57aaed3913d1c2732b3853f31 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_locku() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 13:33:28 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_locku()
    
    [ Upstream commit ca9cf9fc27f8f722e9eb2763173ba01f6ac3dad1 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_lookup() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:02:40 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_lookup()
    
    [ Upstream commit 3d5877e8e03f60d7cc804d7b230ff9c00c9c07bd ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_nl4_server() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 18:05:06 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_nl4_server()
    
    [ Upstream commit f49e4b4d58cc835d8bd0cc9663f7b9c5497e0e7e ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_offload_status() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 21 14:21:25 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_offload_status()
    
    [ Upstream commit 2846bb0525a73e00b3566fda535ea6a5879e2971 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_open() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Nov 1 12:04:06 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_open()
    
    [ Upstream commit 61e5e0b3ec713d1365008c8af3fe5fdd262e2a60 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_open_confirm() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:18:57 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_open_confirm()
    
    [ Upstream commit 06bee693a1f1cb774b91000f05a6e183c257d8e9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_open_downgrade() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:21:01 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_open_downgrade()
    
    [ Upstream commit dca71651f097ea608945d7a66bf62761a630de9a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_putfh() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:23:02 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_putfh()
    
    [ Upstream commit a73bed98413b1d9eb4466f776a56d2fde8b3b2c9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_read() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:28:24 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_read()
    
    [ Upstream commit 3909c3bc604688503e31ddceb429dc156c4720c1 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_readdir() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:30:59 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_readdir()
    
    [ Upstream commit 0dfaf2a371436860ace6af889e6cd8410ee63164 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 15:02:11 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete()
    
    [ Upstream commit 0d6467844d437e07db1e76d96176b1a55401018c ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_release_lockowner() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 13:42:25 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_release_lockowner()
    
    [ Upstream commit a4a80c15ca4dd998ab5cbe87bd856c626a318a80 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_remove() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:04:36 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_remove()
    
    [ Upstream commit b7f5fbf219aecda98e32de305551e445f9438899 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_rename() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:05:58 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_rename()
    
    [ Upstream commit ba881a0a5342b3aaf83958901ebe3fe752eaab46 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_renew() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:08:50 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_renew()
    
    [ Upstream commit d12f90458dc8c11734ba44ec88f109bf8de86ff0 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_secinfo() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:09:42 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_secinfo()
    
    [ Upstream commit d0abdae5191a916d767164f6fc6c0e2e814a20a7 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:33:12 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name()
    
    [ Upstream commit 53d70873e37c09a582167ed73d1858e3a2af0157 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_seek() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:54:47 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_seek()
    
    [ Upstream commit 9d32b412fe0a6186cc57789d218e8f8299454ae2 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_sequence() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:55:19 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_sequence()
    
    [ Upstream commit cf907b11326d9360877d6c6ea8f75e1b29f39f2f ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_setattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 21 14:14:59 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_setattr()
    
    [ Upstream commit 44592fe9479d8d4b88594365ab825f7b07afdf7c ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_setclientid() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:35:02 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_setclientid()
    
    [ Upstream commit 92fa6c08c251d52d0d7b46066ecf87b96a0c4b8f ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_setclientid_confirm() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 15:12:33 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_setclientid_confirm()
    
    [ Upstream commit d1ca55149d67e5896f89a30053f5d83c002ac10e ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_setxattr() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:59:57 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_setxattr()
    
    [ Upstream commit 403366a7e8e2930002157525cd44add7fa01bca9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_share_access() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:54:48 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_share_access()
    
    [ Upstream commit 9aa62f5199749b274454b6d7d914c9b2a5e77031 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_share_deny() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 16 17:56:17 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_share_deny()
    
    [ Upstream commit b07bebd9eb9842e2d0dea87efeb92884556e55b0 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_test_stateid() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:57:44 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_test_stateid()
    
    [ Upstream commit b7a0c8f6e741bf9dee0d24e69d3ce51fa4ccce78 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_verify() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:40:32 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_verify()
    
    [ Upstream commit 67cd453eeda86be90f83a0f4798f33832cf2d98c ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_write() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 14:44:28 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_write()
    
    [ Upstream commit 244e2befcba80f42c65293b6c56282bb78f9f417 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros in nfsd4_decode_xattr_name() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 4 10:56:52 2020 -0500

    NFSD: Replace READ* macros in nfsd4_decode_xattr_name()
    
    [ Upstream commit 830c71502ae0ae1677ac6c08ffbcf85a6e7b2937 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 acl attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:02:54 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 acl attribute
    
    [ Upstream commit c941a96823cf52e742606b486b81ab346bf111c9 ]
    
    Refactor for clarity and to move infrequently-used code out of line.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 mode attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:54:26 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 mode attribute
    
    [ Upstream commit 1c8f0ad7dd35fd12307904036c7c839f77b6e3f9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 owner attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:56:42 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 owner attribute
    
    [ Upstream commit 9853a5ac9be381917e9be0b4133cd4ac5a7ad875 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 owner_group attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:58:18 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 owner_group attribute
    
    [ Upstream commit 393c31dd27f83adb06b07a1b5f0a5b8966a0f01e ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 security label attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 14:05:51 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 security label attribute
    
    [ Upstream commit dabe91828f92cd493e9e75efbc10f9878d2a73fe ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 size attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 13:47:16 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 size attribute
    
    [ Upstream commit 2ac1b9b2afbbacf597dbec722b23b6be62e4e41e ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 time_set attributes [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 14:01:08 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 time_set attributes
    
    [ Upstream commit 1c3eff7ea4a98c642134ee493001ae13b79ff38c ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace READ* macros that decode the fattr4 umask attribute [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 19 14:07:43 2020 -0500

    NFSD: Replace READ* macros that decode the fattr4 umask attribute
    
    [ Upstream commit 66f0476c704c86d44aa9da19d4753df66f2dbc96 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace the "init once" mechanism [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:16 2022 -0400

    NFSD: Replace the "init once" mechanism
    
    [ Upstream commit c7b824c3d06c85e054caf86e227255112c5e3c38 ]
    
    In a moment, the nfsd_file_hashtbl global will be replaced with an
    rhashtable. Replace the one or two spots that need to check if the
    hash table is available. We can easily reuse the SHUTDOWN flag for
    this purpose.
    
    Document that this mechanism relies on callers to hold the
    nfsd_mutex to prevent init, shutdown, and purging from running
    concurrently.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Replace the internals of the READ_BUF() macro [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 3 11:54:23 2020 -0500

    NFSD: Replace the internals of the READ_BUF() macro
    
    [ Upstream commit c1346a1216ab5cb04a265380ac9035d91b16b6d5 ]
    
    Convert the READ_BUF macro in nfs4xdr.c from open code to instead
    use the new xdr_stream-style decoders already in use by the encode
    side (and by the in-kernel NFS client implementation). Once this
    conversion is done, each individual NFSv4 argument decoder can be
    independently cleaned up to replace these macros with C code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
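
As a rough userspace illustration of the xdr_stream decoding style this commit moves toward: the decoder asks the stream for a bounds-checked span and gets NULL on overrun. The sketch below is not kernel code. `struct toy_xdr`, `toy_inline_decode()` and `toy_decode_u32()` are hypothetical names that only mimic the idea of xdr_inline_decode(), including rounding each item up to a 4-byte XDR unit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an xdr_stream over one flat buffer. */
struct toy_xdr {
    const uint8_t *p;    /* next undecoded byte */
    const uint8_t *end;  /* one past the last valid byte */
};

/*
 * Return a pointer to the next nbytes of data and advance the stream,
 * or NULL if the buffer is too short. XDR items are padded to a
 * multiple of 4 bytes, so the stream advances by the padded length.
 */
static const uint8_t *toy_inline_decode(struct toy_xdr *x, size_t nbytes)
{
    size_t padded = (nbytes + 3) & ~(size_t)3;
    const uint8_t *q = x->p;

    if (padded < nbytes || (size_t)(x->end - x->p) < padded)
        return NULL;
    x->p += padded;
    return q;
}

/* Decode one big-endian 32-bit value; returns 0 on success, -1 on overrun. */
static int toy_decode_u32(struct toy_xdr *x, uint32_t *val)
{
    const uint8_t *q = toy_inline_decode(x, 4);

    if (!q)
        return -1;
    *val = ((uint32_t)q[0] << 24) | ((uint32_t)q[1] << 16) |
           ((uint32_t)q[2] << 8) | (uint32_t)q[3];
    return 0;
}
```

The point of the design is that every decoder reports a short buffer through a return value instead of a hidden goto inside a macro.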

NFSD: Replace the nfsd_deleg_break tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:20 2021 -0400

    NFSD: Replace the nfsd_deleg_break tracepoint
    
    [ Upstream commit 17d76ddf76e4972411402743eea7243d9a46f4f9 ]
    
    Renamed so it can be enabled as a set with the other nfsd_cb_
    tracepoints. And, consistent with those tracepoints, report the
    address of the client, the client ID the server has given it, and
    the state ID being recalled.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Report average age of filecache items [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:12 2022 -0400

    NFSD: Report average age of filecache items
    
    [ Upstream commit 904940e94a887701db24401e3ed6928a1d4e329f ]
    
    This is a measure of how long items stay in the filecache, to help
    assess how efficient the cache is.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: report client confirmation status in "info" file [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Sat Mar 20 09:38:04 2021 +1100

    nfsd: report client confirmation status in "info" file
    
    [ Upstream commit 472d155a0631bd1a09b5c0c275a254e65605d683 ]
    
    mountd can now monitor clients appearing and disappearing in
    /proc/fs/nfsd/clients, and will log these events, in lieu of the logging
    of mount/unmount events for NFSv3.
    
    Currently it cannot distinguish between unconfirmed clients (which might
    be transient and totally uninteresting) and confirmed clients.
    
    So add a "status: " line which reports either "confirmed" or
    "unconfirmed", and use fsnotify to report that the info file
    has been modified.
    
    This requires a bit of infrastructure to keep the dentry for the "info"
    file.  There is no need to take a counted reference as the dentry must
    remain around until the client is removed.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Report count of calls to nfsd_file_acquire() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:23:59 2022 -0400

    NFSD: Report count of calls to nfsd_file_acquire()
    
    [ Upstream commit 29d4bdbbb910f33d6058d2c51278f00f656df325 ]
    
    Count the number of successful acquisitions that did not create a
    file (ie, acquisitions that do not result in a compulsory cache
    miss). This count can be compared directly with the reported hit
    count to compute a hit ratio.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
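
A monitoring tool could combine the two counters roughly as follows. This is a minimal sketch: the function name and the choice to report a percentage are assumptions, and how the counters are read from the kernel is not shown.

```c
#include <stdint.h>

/*
 * Hit ratio as a percentage: hits divided by the acquisitions that
 * could have been hits. Returns 0 when nothing has been acquired yet,
 * to avoid dividing by zero.
 */
static double hit_ratio_percent(uint64_t hits, uint64_t acquisitions)
{
    if (acquisitions == 0)
        return 0.0;
    return 100.0 * (double)hits / (double)acquisitions;
}
```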

NFSD: Report count of freed filecache items [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:05 2022 -0400

    NFSD: Report count of freed filecache items
    
    [ Upstream commit d63293272abb51c02457f1017dfd61c3270d9ae3 ]
    
    Surface the count of freed nfsd_file items.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Report filecache LRU size [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:23:52 2022 -0400

    NFSD: Report filecache LRU size
    
    [ Upstream commit 0fd244c115f0321fc5e34ad2291f2a572508e3f7 ]
    
    Surface the NFSD filecache's LRU list length to help field
    troubleshooters monitor filecache issues.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: report per-export stats [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Jan 6 09:52:36 2021 +0200

    nfsd: report per-export stats
    
    [ Upstream commit 20ad856e47323e208ae8d6a9ecfe5bf0be6f505e ]
    
    Collect some nfsd stats per export in addition to the global stats.
    
    A new nfsdfs export_stats file is created.  It uses the same ops as the
    exports file to iterate the export entries and we use the file's name to
    determine the reported info per export.  For example:
    
     $ cat /proc/fs/nfsd/export_stats
     # Version 1.1
     # Path Client Start-time
     #      Stats
     /test  localhost       92
            fh_stale: 0
            io_read: 9
            io_write: 1
    
    Every export entry reports the start time when stats collection
    started, so stats collecting scripts can know if stats were reset
    between samples.
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
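
A collector might use the start time like this when computing deltas between two samples of the same export. The sketch assumes the values have already been parsed out of export_stats. `struct export_sample` and `io_read_delta()` are hypothetical names, not part of any kernel or nfs-utils interface.

```c
#include <stdbool.h>

/* One parsed sample for a single export (hypothetical layout). */
struct export_sample {
    long start_time;        /* "Start-time" column */
    unsigned long io_read;  /* "io_read" counter */
};

/*
 * Compute how much io_read grew between two samples. Returns false,
 * leaving *delta untouched, when the start times differ, because the
 * counters were reset in between and a delta would be meaningless.
 */
static bool io_read_delta(const struct export_sample *prev,
                          const struct export_sample *cur,
                          unsigned long *delta)
{
    if (prev->start_time != cur->start_time)
        return false;
    *delta = cur->io_read - prev->io_read;
    return true;
}
```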

NFSD: Report the number of items evicted by the LRU walk [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:38 2022 -0400

    NFSD: Report the number of items evicted by the LRU walk
    
    [ Upstream commit 94660cc19c75083af046b0f8362e3d3bc2eba21d ]
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: reshuffle some code [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Apr 16 14:00:17 2021 -0400

    nfsd: reshuffle some code
    
    [ Upstream commit ebd9d2c2f5a7ebaaed2d7bb4dee148755f46033d ]
    
    No change in behavior, I'm just moving some code around to avoid forward
    references in a following patch.
    
    (To do someday: figure out how to split up nfs4state.c.  It's big and
    disorganized.)
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Restore NFSv4 decoding's SAVEMEM functionality [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Dec 18 12:28:58 2020 -0500

    NFSD: Restore NFSv4 decoding's SAVEMEM functionality
    
    [ Upstream commit 7b723008f9c95624c848fad661c01b06e47b20da ]
    
    While converting the NFSv4 decoder to use xdr_stream-based XDR
    processing, I removed the old SAVEMEM() macro. This macro wrapped
    a bit of logic that avoided a memory allocation by recognizing when
    the decoded item resides in a linear section of the Receive buffer.
    In that case, it returned a pointer into that buffer instead of
    allocating a bounce buffer.
    
    The bounce buffer is necessary only when xdr_inline_decode() has
    placed the decoded item in the xdr_stream's scratch buffer, which
    disappears the next time xdr_inline_decode() is called with that
    xdr_stream. That happens only if the data item crosses a page
    boundary in the receive buffer, an exceedingly rare occurrence.
    
    Allocating a bounce buffer every time results in a minor performance
    regression that was introduced by the recent NFSv4 decoder overhaul.
    Let's restore the previous behavior. On average, it saves about 1.5
    kmalloc() calls per COMPOUND.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
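
The decision described above can be sketched in userspace C as follows. This only illustrates the idea and is not the kernel helper: `struct toy_stream` and `toy_savemem()` are hypothetical names, and malloc() stands in for whatever allocator the decoder uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream: only the scratch buffer address matters here. */
struct toy_stream {
    void *scratch;  /* reused by every decode call, so contents are transient */
};

/*
 * Return a pointer to the decoded item that stays valid after further
 * decoding. If the item already lives in the receive buffer (the common
 * case), return it as-is. Only an item sitting in the scratch buffer is
 * copied into a freshly allocated bounce buffer.
 */
static void *toy_savemem(const struct toy_stream *xdr, void *p, size_t len)
{
    void *copy;

    if (p != xdr->scratch)
        return p;

    copy = malloc(len);
    if (!copy)
        return NULL;
    memcpy(copy, p, len);
    return copy;
}
```

In this sketch the caller must know whether it received a copy in order to free it; a real decoder would more likely tie such allocations to the lifetime of the request.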

nfsd: Retry once in nfsd_open on an -EOPENSTALE return [+ + +]

Author: Jeff Layton <jeff.layton@primarydata.com>
Date:   Sat Dec 18 20:37:56 2021 -0500

    nfsd: Retry once in nfsd_open on an -EOPENSTALE return
    
    [ Upstream commit 12bcbd40fd931472c7fc9cf3bfe66799ece93ed8 ]
    
    If we get back -EOPENSTALE from an NFSv4 open, then we either got some
    unhandled error or the inode we got back was not the same as the one
    associated with the dentry.
    
    We really have no recourse in that situation other than to retry the
    open, and if it fails to just return nfserr_stale back to the client.
    
    Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
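
The retry-once policy can be sketched in plain C as below. All names are hypothetical. `TOY_EOPENSTALE` is defined locally because EOPENSTALE is a kernel-internal error code that userspace headers do not provide, and `toy_open()` is a stub that simulates an open which reports staleness a configurable number of times.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_EOPENSTALE 518  /* local stand-in for the kernel-internal code */

/* Stub state: how many more times the open should report staleness. */
struct toy_open_state {
    int stale_left;
    int calls;
};

/* Simulated open: fails with -TOY_EOPENSTALE while stale_left > 0. */
static int toy_open(struct toy_open_state *st)
{
    st->calls++;
    if (st->stale_left > 0) {
        st->stale_left--;
        return -TOY_EOPENSTALE;
    }
    return 0;
}

/* Retry the open exactly once on staleness, then give up with -ESTALE. */
static int toy_open_retry_once(struct toy_open_state *st)
{
    bool retried = false;
    int err;

retry:
    err = toy_open(st);
    if (err == -TOY_EOPENSTALE) {
        if (!retried) {
            retried = true;
            goto retry;
        }
        err = -ESTALE;
    }
    return err;
}
```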

nfsd: return error if nfs4_setacl fails [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Nov 7 06:58:41 2022 -0500

    nfsd: return error if nfs4_setacl fails
    
    [ Upstream commit 01d53a88c08951f88f2a42f1f1e6568928e0590e ]
    
    With the addition of POSIX ACLs to struct nfsd_attrs, we no longer
    return an error if setting the ACL fails. Ensure we return the na_aclerr
    error on SETATTR if there is one.
    
    Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs")
    Cc: Neil Brown <neilb@suse.de>
    Reported-by: Yongcheng Yang <yoyang@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately" [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:46:44 2022 -0400

    NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately"
    
    [ Upstream commit dcf3f80965ca787c70def402cdf1553c93c75529 ]
    
    This reverts commit 5e138c4a750dc140d881dab4a8804b094bbc08d2.
    
    That commit attempted to make files available to other users as soon
    as all NFSv4 clients were done with them, rather than waiting until
    the filecache LRU had garbage collected them.
    
    It gets the reference counting wrong, for one thing.
    
    But it also misses that DELEGRETURN should release a file in the
    same fashion. In fact, any nfsd_file_put() on a file held open
    by an NFSv4 client needs potentially to release the file
    immediately...
    
    Clear the way for implementing that idea.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: rework hashtable handling in nfsd_do_file_acquire [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Oct 4 15:41:10 2022 -0400

    nfsd: rework hashtable handling in nfsd_do_file_acquire
    
    [ Upstream commit 243a5263014a30436c93ed3f1f864c1da845455e ]
    
    nfsd_file is RCU-freed, so we need to hold the rcu_read_lock long enough
    to get a reference after finding it in the hash. Take the
    rcu_read_lock() and call rhashtable_lookup directly.
    
    Switch to using rhashtable_lookup_insert_key as well, and use the usual
    retry mechanism if we hit an -EEXIST. Rename the "retry" bool to
    open_retry, and eliminate the insert_err goto target.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: rework refcounting in filecache [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Sun Dec 11 06:19:33 2022 -0500

    nfsd: rework refcounting in filecache
    
    [ Upstream commit ac3a2585f018f10039b4a856dcb122da88c1c1c9 ]
    
    The filecache refcounting is a bit non-standard for something searchable
    by RCU, in that we maintain a sentinel reference while it's hashed. This
    in turn requires that we have to do things differently in the "put"
    depending on whether it's hashed, which we believe to have led to races.
    
    There are other problems in here too. nfsd_file_close_inode_sync can end
    up freeing an nfsd_file while there are still outstanding references to
    it, and there are a number of subtle ToC/ToU races.
    
    Rework the code so that the refcount is what drives the lifecycle. When
    the refcount goes to zero, then unhash and rcu free the object. A task
    searching for a nfsd_file is allowed to bump its refcount, but only if
    it's not already 0. Ensure that we don't make any other changes to it
    until a reference is held.
    
    With this change, the LRU carries a reference. Take special care to deal
    with it when removing an entry from the list, and ensure that we only
    repurpose the nf_lru list_head when the refcount is 0 to ensure
    exclusive access to it.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
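
The rule that a searcher may take a reference only if the count is not already zero can be sketched with C11 atomics. This is a userspace illustration under stated assumptions, not the filecache code: `struct toy_obj`, `toy_get_unless_zero()` and `toy_put()` are hypothetical names, and the unhash and RCU-free steps are reduced to a return value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; a real one would carry more state. */
struct toy_obj {
    atomic_int refcount;
};

/*
 * Try to take a reference. Fails if the count has already dropped to
 * zero, because the object is then on its way to being freed and must
 * not be revived.
 */
static bool toy_get_unless_zero(struct toy_obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Drop a reference. Returns true when this was the last one, which is
 * the point where the caller would unhash and free the object.
 */
static bool toy_put(struct toy_obj *o)
{
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```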

nfsd: rpc_peeraddr2str needs rcu lock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Jun 14 11:20:49 2021 -0400

    nfsd: rpc_peeraddr2str needs rcu lock
    
    [ Upstream commit 05570a2b01117209b500e1989ce8f1b0524c489f ]
    
    I'm not even sure cl_xprt can change here, but we're getting "suspicious
    RCU usage" warnings, and other rpc_peeraddr2str callers are taking the
    rcu lock.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Save location of NFSv4 COMPOUND status [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 13 10:40:59 2021 -0400

    NFSD: Save location of NFSv4 COMPOUND status
    
    [ Upstream commit 3b0ebb255fdc49a3d340846deebf045ef58ec744 ]
    
    Refactor: Currently nfs4svc_encode_compoundres() relies on the NFS
    dispatcher to pass in the buffer location of the COMPOUND status.
    Instead, save that buffer location in struct nfsd4_compoundres.
    
    The compound tag follows immediately after.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: separate nfsd_last_thread() from nfsd_put() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Jul 31 16:48:32 2023 +1000

    nfsd: separate nfsd_last_thread() from nfsd_put()
    
    [ Upstream commit 9f28a971ee9fdf1bf8ce8c88b103f483be610277 ]
    
    Now that the last nfsd thread is stopped by an explicit act of calling
    svc_set_num_threads() with a count of zero, we only have a limited
    number of places that can happen, and don't need to call
    nfsd_last_thread() in nfsd_put().
    
    So separate that out and call it at the two places where the number of
    threads is set to zero.
    
    Move the clearing of ->nfsd_serv and the call to svc_xprt_destroy_all()
    into nfsd_last_thread(), as they are really part of the same action.
    
    nfsd_put() is now a thin wrapper around svc_put(), so make it a static
    inline.
    
    nfsd_put() cannot be called after nfsd_last_thread(), so in a couple of
    places we have to use svc_put() instead.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Separate tracepoints for acquire and create [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:43 2022 -0400

    NFSD: Separate tracepoints for acquire and create
    
    [ Upstream commit be0230069fcbf7d332d010b57c1d0cfd623a84d6 ]
    
    These tracepoints collect different information: the create case does
    not open a file, so there's no nf_file available.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: set attributes when creating symlinks [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: set attributes when creating symlinks
    
    [ Upstream commit 93adc1e391a761441d783828b93979b38093d011 ]
    
    The NFS protocol includes attributes when creating symlinks.
    Linux does store attributes for symlinks and allows them to be set,
    though they are not used for permission checking.
    
    NFSD currently doesn't set standard (struct iattr) attributes when
    creating symlinks, but for NFSv4 it does set ACLs and security labels.
    This is inconsistent.
    
    To improve consistency, pass the provided attributes into nfsd_symlink()
    and call nfsd_create_setattr() to set them.
    
    NOTE: this results in a behaviour change for all NFS versions when the
    client sends non-default attributes with a SYMLINK request. With the
    Linux client, the only attributes are:
            attr.ia_mode = S_IFLNK | S_IRWXUGO;
            attr.ia_valid = ATTR_MODE;
    so the final outcome will be unchanged. Other clients might send
    different attributes, and if they did they probably expect them to be
    honoured.
    
    We ignore any error from nfsd_create_setattr().  It isn't really clear
    what should be done if a file is successfully created, but the
    attributes cannot be set.  NFS doesn't allow partial success to be
    reported.  Reporting failure is probably more misleading than reporting
    success, so the status is ignored.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Set PF_LOCAL_THROTTLE on local filesystems only [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Mon Nov 30 17:03:19 2020 -0500

    nfsd: Set PF_LOCAL_THROTTLE on local filesystems only
    
    [ Upstream commit 01cbf3853959feec40ec9b9a399e12a021cd4d81 ]
    
    Don't set PF_LOCAL_THROTTLE on remote filesystems like NFS, since they
    aren't expected to ever be subject to double buffering.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Set up an rhashtable for the filecache [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:26:23 2022 -0400

    NFSD: Set up an rhashtable for the filecache
    
    [ Upstream commit fc22945ecc2a0a028f3683115f98a922d506c284 ]
    
    Add code to initialize and tear down an rhashtable. The rhashtable
    is not used yet.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Show state of courtesy client in client info [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:27 2022 -0700

    NFSD: Show state of courtesy client in client info
    
    [ Upstream commit e9488d5ae13c0a72223c507e2508dc2ac66cad4f ]
    
    Update client_info_show to show state of courtesy client
    and seconds since last renew.
    
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Shrink size of struct nfsd4_copy [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:22 2022 -0400

    NFSD: Shrink size of struct nfsd4_copy
    
    [ Upstream commit 87689df694916c40e8e6c179ab1c8710f65cb6c6 ]
    
    struct nfsd4_copy is part of struct nfsd4_op, which resides in an
    8-element array.
    
    sizeof(struct nfsd4_op):
    Before: /* size: 1696, cachelines: 27, members: 5 */
    After:  /* size: 672, cachelines: 11, members: 5 */
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
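
The cacheline counts quoted above are consistent with rounding the structure size up to whole 64-byte lines. The sketch below reproduces that arithmetic and the saving across the 8-element array. The 64-byte line size is an assumption that happens to match the quoted figures, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed cache line size; 64 bytes matches the figures quoted above. */
#define TOY_CACHELINE_BYTES 64

/* Number of cache lines an object of the given size spans, rounded up. */
static size_t toy_cachelines(size_t size)
{
    return (size + TOY_CACHELINE_BYTES - 1) / TOY_CACHELINE_BYTES;
}

/* Bytes saved across an array of n objects when each one shrinks. */
static size_t toy_array_saving(size_t before, size_t after, size_t n)
{
    return (before - after) * n;
}
```

With the sizes from this commit, each nfsd4_op shrinks by 1024 bytes, so an 8-element array shrinks by 8192 bytes.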

NFSD: Shrink size of struct nfsd4_copy_notify [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jul 27 14:40:16 2022 -0400

    NFSD: Shrink size of struct nfsd4_copy_notify
    
    [ Upstream commit 09426ef2a64ee189ca1e3298f1e874842dbf35ea ]
    
    struct nfsd4_copy_notify is part of struct nfsd4_op, which resides
    in an 8-element array.
    
    sizeof(struct nfsd4_op):
    Before: /* size: 2208, cachelines: 35, members: 5 */
    After:  /* size: 1696, cachelines: 27, members: 5 */
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: silence extraneous printk on nfsd.ko insertion [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Jul 20 08:39:23 2022 -0400

    nfsd: silence extraneous printk on nfsd.ko insertion
    
    [ Upstream commit 3a5940bfa17fb9964bf9688b4356ca643a8f5e2d ]
    
    This printk pops every time nfsd.ko gets plugged in. Most kmods don't do
    that and this one is not very informative. Olaf's email address seems to
    be defunct at this point anyway. Just drop it.
    
    Cc: Olaf Kirch <okir@suse.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: Simplify code around svc_exit_thread() call in nfsd() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Jul 31 16:48:31 2023 +1000

    nfsd: Simplify code around svc_exit_thread() call in nfsd()
    
    [ Upstream commit 18e4cf915543257eae2925671934937163f5639b ]
    
    Previously a thread could exit asynchronously (due to a signal) so some
    care was needed to hold nfsd_mutex over the last svc_put() call.  Now a
    thread can only exit when svc_set_num_threads() is called, and this is
    always called under nfsd_mutex.  So no care is needed.
    
    Not only is the mutex held when a thread exits now, but the svc refcount
    is elevated, so the svc_put() in svc_exit_thread() will never be a final
    put, so the mutex isn't even needed at this point in the code.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: simplify locking for network notifier. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    NFSD: simplify locking for network notifier.
    
    [ Upstream commit d057cfec4940ce6eeffa22b4a71dec203b06cd55 ]
    
    nfsd currently maintains an open-coded read/write semaphore (refcount
    and wait queue) for each network namespace to ensure the nfs service
    isn't shut down while the notifier is running.
    
    This is excessive.  As there is unlikely to be contention between
    notifiers and they run without sleeping, a single spinlock is sufficient
    to avoid problems.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: ensure nfsd_notifier_lock is static ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: simplify nfsd4_change_info [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Nov 30 17:46:15 2020 -0500

    nfsd: simplify nfsd4_change_info
    
    [ Upstream commit b2140338d8dca827ad9e83f3e026e9d51748b265 ]
    
    It doesn't make sense to carry all these extra fields around.  Just
    make everything into a change attribute from the start.
    
    This is just cleanup, there should be no change in behavior.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: simplify nfsd4_check_open_reclaim [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:44 2021 -0500

    nfsd: simplify nfsd4_check_open_reclaim
    
    [ Upstream commit 1722b04624806ced51693f546edb83e8b2297a77 ]
    
    The set_client() was already taken care of by process_open1().
    
    The comments here are mostly redundant with the code.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: simplify nfsd_renew [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:39 2021 -0500

    nfsd: simplify nfsd_renew
    
    [ Upstream commit b4587eb2cf4b6271f67fb93b75f7de2a2026e853 ]
    
    You can take the single-exit thing too far, I think.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: simplify per-net file cache management [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Wed Dec 1 10:58:14 2021 +1100

    NFSD: simplify per-net file cache management
    
    [ Upstream commit 1463b38e7cf34d4cc60f41daff459ad807b2e408 ]
    
    We currently have a 'laundrette' for closing cached files - a different
    work-item for each network-namespace.
    
    These 'laundrettes' (aka struct nfsd_fcache_disposal) are currently on a
    list, and are freed using rcu.
    
    The list is not necessary as we have a per-namespace structure (struct
    nfsd_net) which can hold a link to the nfsd_fcache_disposal.
    The use of kfree_rcu is also unnecessary as the cache is cleaned of all
    files associated with a given namespace, and no new files can be added,
    before the nfsd_fcache_disposal is freed.
    
    So add a '->fcache_disposal' link to nfsd_net, and discard the list
    management and rcu usage.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: simplify process_lock [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Thu Jan 21 17:57:38 2021 -0500

    nfsd: simplify process_lock
    
    [ Upstream commit a9d53a75cf574d6aa41f3cb4968fffe4f64e0fad ]
    
    Similarly, this STALE_CLIENTID check is already handled by:
    
    nfs4_preprocess_confirmed_seqid_op()->
            nfs4_preprocess_seqid_op()->
                    nfsd4_lookup_stateid()->
                            set_client()->
                                    STALE_CLIENTID()
    
    (This may cause it to return a different error in some cases where
    there are multiple things wrong; pynfs test SEQ10 regressed on this
    commit because of that, but I think that's the test's fault, and I've
    fixed it separately.)
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Simplify READ_PLUS [+ + +]

Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Tue Sep 13 14:01:51 2022 -0400

    NFSD: Simplify READ_PLUS
    
    [ Upstream commit eeadcb75794516839078c28b3730132aeb700ce6 ]
    
    Chuck had suggested reverting READ_PLUS so it returns a single DATA
    segment covering the requested read range. This prepares the server for
    a future "sparse read" function so support can easily be added without
    needing to rip out the old READ_PLUS code at the same time.
    
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Simplify starting_len [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:09:10 2022 -0400

    NFSD: Simplify starting_len
    
    [ Upstream commit 071ae99feadfc55979f89287d6ad2c6a315cb46d ]
    
    Clean-up: Now that nfsd4_encode_readv() does not have to encode the
    EOF or rd_length values, it no longer needs to subtract 8 from
    @starting_len.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: simplify struct nfsfh [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 2 11:16:32 2021 +1000

    NFSD: simplify struct nfsfh
    
    [ Upstream commit d8b26071e65e80a348602b939e333242f989221b ]
    
    Most of the fields in 'struct knfsd_fh' are 2 levels deep (a union and a
    struct) and are accessed using macros like:
    
     #define fh_FOO fh_base.fh_new.fb_FOO
    
    This patch makes the union and struct anonymous, so that "fh_FOO" can be
    a name directly within 'struct knfsd_fh' and the #defines aren't needed.
    
    The file handle as a whole is sometimes accessed as "fh_base" or
    "fh_base.fh_pad", neither of which is a particularly helpful name.
    As the struct holding the filehandle is now anonymous, we
    cannot use the name of that, so we union it with 'fh_raw' and use that
    where the raw filehandle is needed.  fh_raw also ensures the structure is
    large enough for the largest possible filehandle.
    
    fh_raw is a 'char' array, removing any need to cast it for memcpy etc.
    
    SVCFH_fmt() is simplified using the "%ph" printk format.  This
    changes the appearance of filehandles in dprintk() debugging, making
    them a little more precise.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
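
The anonymous union and struct layout described above can be sketched in userspace C11. This is a simplified illustration, not the kernel definition: struct demo_fh, DEMO_FHSIZE, the member list, and demo_fh_copy() are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FHSIZE 128	/* hypothetical maximum handle size, in bytes */

/* Hypothetical, simplified stand-in for struct knfsd_fh. */
struct demo_fh {
	unsigned int fh_size;	/* number of valid bytes in the handle */
	union {
		char fh_raw[DEMO_FHSIZE];	/* the handle as raw bytes */
		struct {
			/* Illustrative members, reachable directly as fh->fh_version etc. */
			uint8_t fh_version;
			uint8_t fh_auth_type;
			uint8_t fh_fsid_type;
			uint8_t fh_fileid_type;
		};
	};
};

/* fh_raw guarantees room for the largest possible handle. */
static_assert(sizeof(struct demo_fh) >= DEMO_FHSIZE, "fh_raw sizes the union");

/* Copy a handle; fh_raw is a char array, so memcpy() needs no cast. */
static void demo_fh_copy(struct demo_fh *dst, const struct demo_fh *src)
{
	assert(src->fh_size <= DEMO_FHSIZE);
	dst->fh_size = src->fh_size;
	memcpy(dst->fh_raw, src->fh_raw, src->fh_size);
}
```

Because the union and struct have no names, no accessor macros are needed: a member such as fh_version and the byte fh_raw[0] refer to the same storage.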

nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparator [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Jan 6 10:39:01 2023 -0500

    nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparator
    
    [ Upstream commit d69b8dbfd0866abc5ec84652cc1c10fc3d4d91ef ]
    
    test_bit returns bool, so we can just compare the result of that to the
    key->gc value without the "!!".
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
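
A minimal userspace sketch of why the "!!" is redundant once the bit test returns bool. demo_test_bit(), struct demo_key, and DEMO_FILE_GC are hypothetical stand-ins, not the kernel's test_bit() or the nfsd lookup key.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_FILE_GC 3	/* hypothetical bit number of the "gc" flag */

/* Hypothetical stand-in for a bool-returning bit test. */
static bool demo_test_bit(unsigned int nr, const unsigned long *addr)
{
	assert(nr < sizeof(*addr) * CHAR_BIT);
	/* Conversion to bool yields exactly 0 or 1, whatever the bit position. */
	return (*addr >> nr) & 1UL;
}

/* Hypothetical lookup key carrying the wanted "gc" value. */
struct demo_key {
	bool gc;
};

/* True when the flag word and the key agree on the gc bit. */
static bool demo_gc_matches(const unsigned long *flags,
			    const struct demo_key *key)
{
	/* No "!!" needed: both sides are already bool. */
	return demo_test_bit(DEMO_FILE_GC, flags) == key->gc;
}
```

By contrast, a raw mask such as flags & (1UL << DEMO_FILE_GC) evaluates to 8 rather than 1 when the bit is set, which is the case where normalising with "!!" is required before comparing against a bool.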

nfsd: simplify the delayed disposal list code [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Apr 14 17:31:44 2023 -0400

    nfsd: simplify the delayed disposal list code
    
    [ Upstream commit 92e4a6733f922f0fef1d0995f7b2d0eaff86c7ea ]
    
    When queueing a dispose list to the appropriate "freeme" lists, it
    pointlessly queues the objects one at a time to an intermediate list.
    
    Remove a few helpers and just open code a list_move to make it more
    clear and efficient. Better document the resulting functions with
    kerneldoc comments.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Skip extra computation for RC_NOCACHE case [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Sep 28 11:39:02 2021 -0400

    NFSD: Skip extra computation for RC_NOCACHE case
    
    [ Upstream commit 0f29ce32fbc56cfdb304eec8a4deb920ccfd89c3 ]
    
    Force the compiler to skip unneeded initialization for cases that
    don't need those values. For example, NFSv4 COMPOUND operations are
    RC_NOCACHE.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
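
The general pattern, returning before any per-request values are computed when the request will not be cached, can be sketched as follows. This is an illustration under assumptions, not the nfsd reply cache code: the enum, demo_checksum(), and demo_cache_lookup() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache types, loosely modelled on RC_NOCACHE and friends. */
enum demo_cachetype { DEMO_RC_NOCACHE, DEMO_RC_REPLBUFF };

/* Counts how often the (comparatively expensive) checksum runs. */
static unsigned int demo_checksum_calls;

static uint32_t demo_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	demo_checksum_calls++;
	for (size_t i = 0; i < len; i++)
		sum = sum * 31u + buf[i];
	return sum;
}

/* Returns 1 if the request takes part in caching, 0 if it is skipped. */
static int demo_cache_lookup(enum demo_cachetype type, const uint8_t *buf,
			     size_t len, uint32_t *csum_out)
{
	assert(csum_out != NULL);

	/* Check the cheap condition first, before computing anything. */
	if (type == DEMO_RC_NOCACHE)
		return 0;

	*csum_out = demo_checksum(buf, len);
	return 1;
}
```

With the check placed first, a DEMO_RC_NOCACHE request never reaches demo_checksum().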

nfsd: skip some unnecessary stats in the v4 case [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Jan 29 14:27:01 2021 -0500

    nfsd: skip some unnecessary stats in the v4 case
    
    [ Upstream commit 428a23d2bf0ca8fd4d364a464c3e468f0e81671e ]
    
    In the typical case of v4 and an i_version-supporting filesystem, we can
    skip a stat which is only required to fake up a change attribute from
    ctime.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Streamline the rare "found" case [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Sep 28 11:40:59 2021 -0400

    NFSD: Streamline the rare "found" case
    
    [ Upstream commit add1511c38166cf1036765f8c4aa939f0275a799 ]
    
    Move a rarely called function call site out of the hot path.
    
    This is an exceptionally small improvement because the compiler
    inlines most of the functions that nfsd_cache_lookup() calls.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Trace boot verifier resets [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Dec 28 14:27:56 2021 -0500

    NFSD: Trace boot verifier resets
    
    [ Upstream commit 75acacb6583df0b9328dc701d8eeea05af49b8b5 ]
    
    According to commit bbf2f098838a ("nfsd: Reset the boot verifier on
    all write I/O errors"), the Linux NFS server forces all clients to
    resend pending unstable writes if any server-side write or commit
    operation encounters an error (say, ENOSPC). This is a rare and
    quite exceptional event that could require administrative recovery
    action, so it should be made trace-able. Example trace event:
    
    nfsd-938   [002]  7174.945558: nfsd_writeverf_reset: boot_time=        61cc920d xid=0xdcd62036 error=-28 new verifier=0x08aecc6142515904
    
    [ cel: adjusted to apply to v5.10.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Trace delegation revocations [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:09 2022 -0400

    NFSD: Trace delegation revocations
    
    [ Upstream commit a1c74569bbde91299f24535abf711be5c84df9de ]
    
    Delegation revocation is an exceptional event that is not otherwise
    visible externally (eg, no network traffic is emitted). Generate a
    trace record when it occurs so that revocation can be observed or
    other activity can be triggered. Example:
    
    nfsd-1104  [005]  1912.002544: nfsd_stid_revoke:        client 633c9343:4e82788d stateid 00000003:00000001 ref=2 type=DELEG
    
    Trace infrastructure is provided for subsequent additional tracing
    related to nfs4_stid activity.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Trace filecache LRU activity [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:11 2022 -0400

    NFSD: Trace filecache LRU activity
    
    [ Upstream commit c46203acddd9b9200dbc53d0603c97355fd3a03b ]
    
    Observe the operation of garbage collection and the lifetime of
    filecache items.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Trace filecache opens [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sun Mar 27 16:42:20 2022 -0400

    NFSD: Trace filecache opens
    
    [ Upstream commit 0122e882119ddbd9efa6edfeeac3f5c704a7aeea ]
    
    Instrument calls to nfsd_open_verified() to get a sense of the
    filecache hit rate.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Trace stateids returned via DELEGRETURN [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:03 2022 -0400

    NFSD: Trace stateids returned via DELEGRETURN
    
    [ Upstream commit 20eee313ff4b8a7e71ae9560f5c4ba27cd763005 ]
    
    Handing out a delegation stateid is recorded with the
    nfsd_deleg_read tracepoint, but there isn't a matching tracepoint
    for recording when the stateid is returned.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: track filehandle aliasing in nfs4_files [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Apr 16 14:00:16 2021 -0400

    nfsd: track filehandle aliasing in nfs4_files
    
    [ Upstream commit a0ce48375a367222989c2618fe68bf34db8c7bb7 ]
    
    It's unusual but possible for multiple filehandles to point to the same
    file.  In that case, we may end up with multiple nfs4_files referencing
    the same inode.
    
    For delegation purposes it will turn out to be useful to flag those
    cases.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: unregister shrinker when nfsd_init_net() fails [+ + +]

Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date:   Mon Oct 10 14:59:02 2022 +0900

    NFSD: unregister shrinker when nfsd_init_net() fails
    
    [ Upstream commit bd86c69dae65de30f6d47249418ba7889809e31a ]
    
    syzbot is reporting a UAF read at register_shrinker_prepared() [1], because
    commit 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on
    low memory condition") missed that nfsd4_leases_net_shutdown() from
    nfsd_exit_net() is called only when nfsd_init_net() succeeded.
    If nfsd_init_net() fails due to nfsd_reply_cache_init() failure,
    register_shrinker() from nfsd4_init_leases_net() has to be undone
    before nfsd_init_net() returns.
    
    Link: https://syzkaller.appspot.com/bug?extid=ff796f04613b4c84ad89 [1]
    Reported-by: syzbot <syzbot+ff796f04613b4c84ad89@syzkaller.appspotmail.com>
    Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Fixes: 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on low memory condition")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
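
The fix follows the usual unwind-on-error idiom: a step that succeeded earlier in an init function must be undone if a later step fails, because the matching exit function is never called for a failed init. A minimal userspace sketch, in which every demo_ name is a hypothetical stand-in for the corresponding nfsd function:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int demo_shrinkers_registered;	/* currently registered shrinkers */
static bool demo_fail_reply_cache;	/* test knob: force the later step to fail */

static int demo_register_shrinker(void)
{
	demo_shrinkers_registered++;
	return 0;
}

static void demo_unregister_shrinker(void)
{
	assert(demo_shrinkers_registered > 0);
	demo_shrinkers_registered--;
}

static int demo_reply_cache_init(void)
{
	return demo_fail_reply_cache ? -ENOMEM : 0;
}

/* Either every step succeeds, or nothing stays registered on return. */
static int demo_init_net(void)
{
	int ret;

	ret = demo_register_shrinker();
	if (ret)
		return ret;

	ret = demo_reply_cache_init();
	if (ret)
		goto out_unregister;

	return 0;

out_unregister:
	demo_unregister_shrinker();
	return ret;
}
```

Without the out_unregister label, a failed demo_reply_cache_init() would leave the shrinker registered and pointing at state that is about to be freed.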

nfsd: Unregister the cld notifier when laundry_wq create failed [+ + +]

Author: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Date:   Sat May 21 12:08:44 2022 +0800

    nfsd: Unregister the cld notifier when laundry_wq create failed
    
    [ Upstream commit 62fdb65edb6c43306c774939001f3a00974832aa ]
    
    If creating laundry_wq fails, the cld notifier should be unregistered.
    
    Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update ACCESS3arg decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 14:32:04 2020 -0400

    NFSD: Update ACCESS3arg decoder to use struct xdr_stream
    
    [ Upstream commit 3b921a2b14251e9e203f1e8af76e8ade79f50e50 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: update comment over __nfsd_file_cache_purge [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Jan 26 12:21:16 2023 -0500

    nfsd: update comment over __nfsd_file_cache_purge
    
    [ Upstream commit 972cc0e0924598cb293b919d39c848dc038b2c28 ]
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update COMMIT3arg decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 14:41:56 2020 -0400

    NFSD: Update COMMIT3arg decoder to use struct xdr_stream
    
    [ Upstream commit c8d26a0acfe77f0880e0acfe77e4209cf8f3a38b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: update create verifier comment [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Fri Oct 15 14:42:11 2021 -0400

    nfsd: update create verifier comment
    
    [ Upstream commit 2336d696862186fd4a6ddd1ea0cb243b3e32847c ]
    
    I don't know if that Solaris behavior matters any more or if it's still
    possible to look up that bug ID any more.  The XFS behavior's definitely
    still relevant, though; any but the most recent XFS filesystems will
    lose the top bits.
    
    Reported-by: Frank S. Filz <ffilzlnx@mindspring.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update file_hashtbl() helpers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:22 2022 -0400

    NFSD: Update file_hashtbl() helpers
    
    [ Upstream commit 3fe828caddd81e68e9d29353c6e9285a658ca056 ]
    
    Enable callers to use const pointers for type safety.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update GETATTR3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 14:30:02 2020 -0400

    NFSD: Update GETATTR3args decoder to use struct xdr_stream
    
    [ Upstream commit 9575363a9e4c8d7e2f9ba5e79884d623fff0be6f ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update nfsd_cb_args tracepoint [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri May 14 15:57:39 2021 -0400

    NFSD: Update nfsd_cb_args tracepoint
    
    [ Upstream commit d6cbe98ff32aef795462a309ef048cfb89d1a11d ]
    
    Clean-up: Re-order the display of IP address and client ID to be
    consistent with other _cb_ tracepoints.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update NFSv2 diropargs decoding to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 14:33:24 2020 -0400

    NFSD: Update NFSv2 diropargs decoding to use struct xdr_stream
    
    [ Upstream commit 6d742c1864c18f143ea2031f1ed66bcd8f4812de ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 19:46:58 2020 -0400

    NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream
    
    [ Upstream commit 7f87fc2d34d475225e78b7f5c4eabb121f4282b2 ]
    
    The benefit of the xdr_stream helpers is that they transparently
    handle encoding an XDR data item that crosses page boundaries.
    Most of the open-coded logic to do that here can be eliminated.
    
    A sub-buffer and sub-stream are set up as a sink buffer for the
    directory entry encoder. As an entry is encoded, it is added to
    the end of the content in this buffer/stream. The total length of
    the directory list is tracked in the buffer's @len field.
    
    When it comes time to encode the Reply, the sub-buffer is merged
    into rq_res's page array at the correct place using
    xdr_write_pages().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update READ3arg decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 14:34:40 2020 -0400

    NFSD: Update READ3arg decoder to use struct xdr_stream
    
    [ Upstream commit be63bd2ac6bbf8c065a0ef6dfbea76934326c352 ]
    
    The code that sets up rq_vec is refactored so that it is now
    adjacent to the nfsd_read() call site where it is used.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update READDIR3args decoders to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 13:23:52 2020 -0400

    NFSD: Update READDIR3args decoders to use struct xdr_stream
    
    [ Upstream commit 9cedc2e64c296efb3bebe93a0ceeb5e71e8d722d ]
    
    As an additional clean up, neither nfsd3_proc_readdir() nor
    nfsd3_proc_readdirplus() makes use of the dircount argument, so
    remove it from struct nfsd3_readdirargs.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update READLINK3arg decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Oct 24 12:51:18 2020 -0400

    NFSD: Update READLINK3arg decoder to use struct xdr_stream
    
    [ Upstream commit 224c1c894e48cd72e4dd9fb6311be80cbe1369b0 ]
    
    The NFSv3 READLINK request takes a single filehandle, so it can
    re-use GETATTR's decoder.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
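
A stream-style decoder for a single NFSv3 file handle can be sketched in userspace as below. This is only an illustration of bounds-checked cursor decoding: struct demo_stream and the demo_ functions are hypothetical, with demo_inline_decode() loosely modelled on the idea behind the kernel's xdr_inline_decode(). The 64-byte limit is the NFSv3 file handle maximum from RFC 1813.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NFS3_FHSIZE 64	/* NFSv3 file handles are at most 64 bytes */

/* Hypothetical decode cursor: current position and end of the buffer. */
struct demo_stream {
	const uint8_t *p;
	const uint8_t *end;
};

/* Reserve nbytes from the stream, or return NULL if too little remains. */
static const uint8_t *demo_inline_decode(struct demo_stream *xdr, size_t nbytes)
{
	const uint8_t *p = xdr->p;

	assert(xdr->p <= xdr->end);
	if ((size_t)(xdr->end - xdr->p) < nbytes)
		return NULL;
	xdr->p += nbytes;
	return p;
}

/* Decode one big-endian 32-bit XDR integer. */
static int demo_decode_u32(struct demo_stream *xdr, uint32_t *val)
{
	const uint8_t *p = demo_inline_decode(xdr, 4);

	if (!p)
		return -1;
	*val = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return 0;
}

/* Decode a variable-length opaque file handle, padded to 4 bytes. */
static int demo_decode_fh3(struct demo_stream *xdr, uint8_t *fh,
			   uint32_t *fh_len)
{
	const uint8_t *p;
	uint32_t len;

	if (demo_decode_u32(xdr, &len) < 0)
		return -1;
	if (len == 0 || len > DEMO_NFS3_FHSIZE)
		return -1;
	p = demo_inline_decode(xdr, (len + 3u) & ~3u);
	if (!p)
		return -1;
	memcpy(fh, p, len);
	*fh_len = len;
	return 0;
}
```

Any procedure whose only argument is a file handle could share a decoder of this shape, which is the kind of re-use the commit describes for READLINK and GETATTR.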

NFSD: Update the CREATE3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 15:56:11 2020 -0400

    NFSD: Update the CREATE3args decoder to use struct xdr_stream
    
    [ Upstream commit 6b3a11960d898b25a30103cc6a2ff0b24b90a83b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the GETATTR3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 11:58:41 2020 -0400

    NFSD: Update the GETATTR3res encoder to use struct xdr_stream
    
    [ Upstream commit 2c42f804d30f6a8d86665eca84071b316821ea08 ]
    
    As an additional clean up, some renaming is done to more closely
    reflect the data type and variable names used in the NFSv3 XDR
    definition provided in RFC 1813. "attrstat" is an NFSv2 thingie.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the LINK3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 13:26:32 2020 -0400

    NFSD: Update the LINK3args decoder to use struct xdr_stream
    
    [ Upstream commit efaa1e7c2c7475f0a9bbeb904d9aba09b73dd52a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the MKDIR3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 17:02:16 2020 -0400

    NFSD: Update the MKDIR3args decoder to use struct xdr_stream
    
    [ Upstream commit 83374c278db193f3e8b2608b45da1132b867a760 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the MKNOD3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 17:04:03 2020 -0400

    NFSD: Update the MKNOD3args decoder to use struct xdr_stream
    
    [ Upstream commit f8a38e2d6c885f9d7cd03febc515d36293de4a5b ]
    
    This commit removes the last usage of the original decode_sattr3(),
    so it is removed as a clean-up.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 ACL ACCESS argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:49:29 2020 -0500

    NFSD: Update the NFSv2 ACL ACCESS argument decoder to use struct xdr_stream
    
    [ Upstream commit 64063892efc1daa3a48882673811ff327ba75ed5 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 14:52:09 2020 -0500

    NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream
    
    [ Upstream commit 07f5c2963c04b11603e9667f89bb430c132e9cc1 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 ACL GETATTR argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:46:50 2020 -0500

    NFSD: Update the NFSv2 ACL GETATTR argument decoder to use struct xdr_stream
    
    [ Upstream commit 571d31f37a57729c9d3463b5a692a84e619b408a ]
    
    Since the ACL GETATTR procedure is the same as the normal GETATTR
    procedure, simply re-use nfssvc_decode_fhandleargs.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 14:49:57 2020 -0500

    NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream
    
    [ Upstream commit 8d2009a10b3abaa12a39deb4876b215714993fe8 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 15:28:59 2020 -0400

    NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream
    
    [ Upstream commit 92b54a4fa4224e6116eb0d87a39dd05af23fcdfa ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 CREATE argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:43:58 2020 -0400

    NFSD: Update the NFSv2 CREATE argument decoder to use struct xdr_stream
    
    [ Upstream commit 7dcf65b91ecaf60ce593e7859ae2b29b7c46ccbd ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 16:44:16 2020 -0400

    NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream
    
    [ Upstream commit e3b4ef221ac57c08341c97a10c8a81c041f76716 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 GETACL argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:32:04 2020 -0500

    NFSD: Update the NFSv2 GETACL argument decoder to use struct xdr_stream
    
    [ Upstream commit 635a45d34706400c59c3b18ca9fccba195147bda ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 14:38:47 2020 -0500

    NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream
    
    [ Upstream commit f8cba47344f794b54373189bec23195b51020faf ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 GETATTR argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:14:23 2020 -0400

    NFSD: Update the NFSv2 GETATTR argument decoder to use struct xdr_stream
    
    [ Upstream commit ebcd8e8b28535b643a4c06685bd363b3b73a96af ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 LINK argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:34:24 2020 -0400

    NFSD: Update the NFSv2 LINK argument decoder to use struct xdr_stream
    
    [ Upstream commit 77edcdf91f6245a9881b84e4e101738148bd039a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READ argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:15:51 2020 -0400

    NFSD: Update the NFSv2 READ argument decoder to use struct xdr_stream
    
    [ Upstream commit 8c293ef993c8df0b1bea9ecb0de6eb96dec3ac9d ]
    
    The code that sets up rq_vec is refactored so that it is now
    adjacent to the nfsd_read() call site where it is used.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 16:40:11 2020 -0400

    NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream
    
    [ Upstream commit a6f8d9dc9e44b51303d9abde4643460137d19b28 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READDIR argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Oct 19 14:15:51 2020 -0400

    NFSD: Update the NFSv2 READDIR argument decoder to use struct xdr_stream
    
    [ Upstream commit 8688361ae2edb8f7e61d926dc5000c9a44f29370 ]
    
    As an additional clean up, move code not related to XDR decoding
    into readdir's .pc_func call out.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 14 13:45:35 2020 -0500

    NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream
    
    [ Upstream commit f5dcccd647da513a89f3b6ca392b0c1eb050b9fc ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 16:49:01 2020 -0400

    NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream
    
    [ Upstream commit 94c8f8c682a6497af7ea71351b18f637c6337d42 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READLINK argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:21:25 2020 -0400

    NFSD: Update the NFSv2 READLINK argument decoder to use struct xdr_stream
    
    [ Upstream commit 1fcbd1c9456ba129d38420e345e91c4b6363db47 ]
    
    If the code that sets up the sink buffer for nfsd_readlink() is
    moved adjacent to the nfsd_readlink() call site that uses it, then
    the only argument is a file handle, and the fhandle decoder can be
    used instead.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 15:41:09 2020 -0400

    NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream
    
    [ Upstream commit d9014b0f8fae11f22a3d356553844e06ddcdce4a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 RENAME argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:35:41 2020 -0400

    NFSD: Update the NFSv2 RENAME argument decoder to use struct xdr_stream
    
    [ Upstream commit 62aa557efb81ea3339fabe7f5b1a343e742bbbdf ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:56:26 2020 -0500

    NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream
    
    [ Upstream commit 68519ff2a1c72c67fcdc4b81671acda59f420af9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:37:35 2020 -0500

    NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream
    
    [ Upstream commit 427eab3ba22891845265f9a3846de6ac152ec836 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 14:47:56 2020 -0500

    NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream
    
    [ Upstream commit 778f068fa0c0846b650ebdb8795fd51b5badc332 ]
    
    The SETACL result encoder is exactly the same as the NFSv2
    attrstatres encoder.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 SETATTR argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:39:06 2020 -0400

    NFSD: Update the NFSv2 SETATTR argument decoder to use struct xdr_stream
    
    [ Upstream commit 2fdd6bd293b9e7dda61220538b2759fbf06f5af0 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 stat encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 11:08:02 2020 -0400

    NFSD: Update the NFSv2 stat encoder to use struct xdr_stream
    
    [ Upstream commit a887eaed2a964754334cd3f8c5fe87e413e68fef ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 23 19:01:38 2020 -0400

    NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream
    
    [ Upstream commit bf15229f2ced4f14946eef958336f764e30f8efb ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 SYMLINK argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:46:03 2020 -0400

    NFSD: Update the NFSv2 SYMLINK argument decoder to use struct xdr_stream
    
    [ Upstream commit 09f75a5375ac61f4adb94da0accc1cfc60eb4f2b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv2 WRITE argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 21 12:18:36 2020 -0400

    NFSD: Update the NFSv2 WRITE argument decoder to use struct xdr_stream
    
    [ Upstream commit a51b5b737a0be93fae6ea2a18df03ab2359a3f4b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 13:56:58 2020 -0400

    NFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream
    
    [ Upstream commit 907c38227fb57f5c537491ca76dd0b9636029393 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:35:46 2020 -0400

    NFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream
    
    [ Upstream commit 5ef2826c761079e27904c85034df34e601b82d94 ]
    
    As an additional clean up, encode_wcc_data() is removed because it
    is now no longer used.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:27:23 2020 -0400

    NFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream
    
    [ Upstream commit 78315b36781d259dcbdc102ff22c3f2f25712223 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 DIROPargs decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 15:42:33 2020 -0400

    NFSD: Update the NFSv3 DIROPargs decoder to use struct xdr_stream
    
    [ Upstream commit 54d1d43dc709f58be38d278bfc38e9bfb38d35fc ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 13:42:13 2020 -0400

    NFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream
    
    [ Upstream commit 0a139d1b7f327010acc36e8162936d3108c7addb ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 6 13:08:45 2020 -0500

    NFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream
    
    [ Upstream commit 8b7044984fd6eeadf72285e3617116bd15e9e676 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 GETACL argument decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Nov 17 11:52:04 2020 -0500

    NFSD: Update the NFSv3 GETACL argument decoder to use struct xdr_stream
    
    [ Upstream commit 05027eafc266487c6e056d10ab352861df95b5d4 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 16:11:42 2020 -0500

    NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream
    
    [ Upstream commit 20798dfe249a01ad1b12eec7dbc572db5003244a ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:08:29 2020 -0400

    NFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream
    
    [ Upstream commit 4d74380a446f75eebb2171687d9b8baf0025bdf1 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 14:46:58 2020 -0400

    NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream
    
    [ Upstream commit 5cf353354af1a385f29dec4609a1532d32c83a25 ]
    
    Also, clean up: Rename the encoder function to match the name of
    the result structure in RFC 1813, consistent with other encoder
    function names in nfs3xdr.c. "diropres" is an NFSv2 thingie.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 6 13:15:09 2020 -0500

    NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream
    
    [ Upstream commit ded04a587f6ceaaba3caefad4021f2212b46c9ff ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:23:50 2020 -0400

    NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream
    
    [ Upstream commit cc9bcdad7773c295375e66c892c7ac00524706f2 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 19:31:48 2020 -0400

    NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream
    
    [ Upstream commit e4ccfe3014de435984939a3d84b7f241d3b57b0d ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:18:40 2020 -0400

    NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream
    
    [ Upstream commit 9a9c8923b3efd593d0e6a405efef9d58c6e6804b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:33:05 2020 -0400

    NFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream
    
    [ Upstream commit 89d79e9672dfa6d0cc416699c16f2d312da58ff2 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 18 16:21:24 2020 -0500

    NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream
    
    [ Upstream commit 15e432bf0cfd1e6aebfa9ffd4e0cc2ff4f3ae2db ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:12:38 2020 -0400

    NFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream
    
    [ Upstream commit 70f8e839859a994e324e1d18889f8319bbd5bff9 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 15:26:31 2020 -0400

    NFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream
    
    [ Upstream commit ecb7a085ac15a8844ebf12fca6ae51ce71ac9b3b ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the RENAME3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 15:44:12 2020 -0400

    NFSD: Update the RENAME3args decoder to use struct xdr_stream
    
    [ Upstream commit d181e0a4bef36ee74d1338e5b5c2561d7463a5d0 ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the SETATTR3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 15:48:22 2020 -0400

    NFSD: Update the SETATTR3args decoder to use struct xdr_stream
    
    [ Upstream commit 9cde9360d18d8b352b737d10f90f2aecccf93dbe ]
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update the SYMLINK3args decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 20 16:01:16 2020 -0400

    NFSD: Update the SYMLINK3args decoder to use struct xdr_stream
    
    [ Upstream commit da39201637297460c13134c29286a00f3a1c92fe ]
    
    Similar to the WRITE decoder, code that checks the sanity of the
    payload size is re-wired to work with xdr_stream infrastructure.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Update WRITE3arg decoder to use struct xdr_stream [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Oct 22 11:14:55 2020 -0400

    NFSD: Update WRITE3arg decoder to use struct xdr_stream
    
    [ Upstream commit c43b2f229a01969a7ccf94b033c5085e0ec2040c ]
    
    As part of the update, open code that sanity-checks the size of the
    data payload against the length of the RPC Call message has to be
    re-implemented to use xdr_stream infrastructure.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: use (un)lock_inode instead of fh_(un)lock for file operations [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
    
    [ Upstream commit bb4d53d66e4b8c8b8e5634802262e53851a2d2db ]
    
    When locking a file to access ACLs and xattrs etc, use explicit locking
    with inode_lock() instead of fh_lock().  This means that the calls to
    fh_fill_pre/post_attr() are also explicit which improves readability and
    allows us to place them only where they are needed.  Only the xattr
    calls need pre/post information.
    
    When locking a file we don't need I_MUTEX_PARENT as the file is not a
    parent of anything, so we can use inode_lock() directly rather than the
    inode_lock_nested() call that fh_lock() uses.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Use const pointers as parameters to fh_ helpers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:16 2022 -0400

    NFSD: Use const pointers as parameters to fh_ helpers
    
    [ Upstream commit b48f8056c034f28dd54668399f1d22be421b0bef ]
    
    Enable callers to use const pointers where they are able to.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
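
    As a rough user-space illustration of what const-qualified helper
    parameters allow, the sketch below compares two file handles that the
    caller holds only through const pointers. The names demo_fh and
    demo_fh_match() are hypothetical stand-ins, not the kernel's
    definitions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified file handle: a length plus raw bytes. */
struct demo_fh {
	unsigned int size;
	unsigned char data[128];
};

/*
 * Both parameters are const, so callers that only hold a
 * const struct demo_fh * can use the helper without a cast.
 */
static bool demo_fh_match(const struct demo_fh *a, const struct demo_fh *b)
{
	if (a->size != b->size)
		return false;
	return memcmp(a->data, b->data, a->size) == 0;
}
```

    A helper declared this way also documents that it does not modify
    either handle.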

nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops [+ + +]

Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date:   Fri Sep 23 00:31:52 2022 +0800

    nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops
    
    [ Upstream commit 0cfb0c4228a5c8e2ed2b58f8309b660b187cef02 ]
    
    Use DEFINE_PROC_SHOW_ATTRIBUTE helper macro to simplify the code.
    
    Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
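
    The idea behind such helper macros can be sketched in user space: the
    author writes only a "show" function and a macro generates the
    boilerplate "open" wrapper and the operations table. Everything below
    (DEMO_DEFINE_SHOW_ATTRIBUTE, struct demo_fops, demo_stats_show) is a
    hypothetical analogue; the kernel macros wire up seq_file callbacks
    instead.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical operations table with a single entry point. */
struct demo_fops {
	int (*open)(char *buf, size_t len, void *priv);
};

/*
 * Given a function called <name>_show, generate <name>_open and a
 * constant <name>_fops table that points at it.
 */
#define DEMO_DEFINE_SHOW_ATTRIBUTE(name)				\
static int name##_open(char *buf, size_t len, void *priv)		\
{									\
	return name##_show(buf, len, priv);				\
}									\
static const struct demo_fops name##_fops = {				\
	.open = name##_open,						\
}

/* The only code the author has to write by hand. */
static int demo_stats_show(char *buf, size_t len, void *priv)
{
	const int *hits = priv;

	return snprintf(buf, len, "hits: %d\n", *hits);
}

DEMO_DEFINE_SHOW_ATTRIBUTE(demo_stats);
```

    Calling demo_stats_fops.open() with a buffer and a pointer to an int
    formats the counter through the generated wrapper.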

nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops [+ + +]

Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date:   Fri Sep 23 00:31:54 2022 +0800

    nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops
    
    [ Upstream commit 1d7f6b302b75ff7acb9eb3cab0c631b10cfa7542 ]
    
    Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
    
    inode is converted from seq_file->file instead of seq_file->private in
    client_info_show().
    
    Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops [+ + +]

Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date:   Fri Sep 23 00:31:53 2022 +0800

    nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops
    
    [ Upstream commit 9beeaab8e05d353d709103cafa1941714b4d5d94 ]
    
    Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
    
    Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
    [ cel: reduce line length ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops [+ + +]

Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date:   Fri Sep 23 00:31:56 2022 +0800

    nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops
    
    [ Upstream commit 1342f9dd3fc219089deeb2620f6790f19b4129b1 ]
    
    Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
    
    Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops [+ + +]

Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date:   Fri Sep 23 00:31:55 2022 +0800

    nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops
    
    [ Upstream commit 64776611a06322b99386f8dfe3b3ba1aa0347a38 ]
    
    Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
    
    nfsd_net is converted from seq_file->file instead of seq_file->private in
    nfsd_reply_cache_stats_show().
    
    Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
    [ cel: reduce line length ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Use DEFINE_SPINLOCK() for spinlock [+ + +]

Author: Guobin Huang <huangguobin4@huawei.com>
Date:   Tue Apr 6 20:08:18 2021 +0800

    NFSD: Use DEFINE_SPINLOCK() for spinlock
    
    [ Upstream commit b73ac6808b0f7994a05ebc38571e2e9eaf98a0f4 ]
    
    spinlock can be initialized automatically with DEFINE_SPINLOCK()
    rather than explicitly calling spin_lock_init().
    
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
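
    A user-space analogue of static lock initialization, assuming C11
    atomics: the lock is fully initialized by its definition, so no
    separate init call is needed before first use. DEMO_DEFINE_SPINLOCK
    and the demo_* names are hypothetical and only mimic the shape of the
    kernel macro.

```c
#include <stdatomic.h>

/* Hypothetical minimal spinlock built on a C11 atomic_flag. */
struct demo_spinlock {
	atomic_flag locked;
};

/* Define and statically initialize a lock in one step. */
#define DEMO_DEFINE_SPINLOCK(name) \
	struct demo_spinlock name = { ATOMIC_FLAG_INIT }

static DEMO_DEFINE_SPINLOCK(demo_lock);
static int demo_counter;

static void demo_spin_lock(struct demo_spinlock *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;
}

static void demo_spin_unlock(struct demo_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* No runtime init call is required before the first lock. */
static int demo_increment(void)
{
	int v;

	demo_spin_lock(&demo_lock);
	v = ++demo_counter;
	demo_spin_unlock(&demo_lock);
	return v;
}
```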

NFSD: use explicit lock/unlock for directory ops [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: use explicit lock/unlock for directory ops
    
    [ Upstream commit debf16f0c671cb8db154a9ebcd6014cfff683b80 ]
    
    When creating or unlinking a name in a directory use explicit
    inode_lock_nested() instead of fh_lock(), and explicit calls to
    fh_fill_pre_attrs() and fh_fill_post_attrs().  This is already done
    for renames, with lock_rename() as the explicit locking.
    
    Also move the 'fill' calls closer to the operation that might change the
    attributes.  This way they are avoided on some error paths.
    
    For the v2-only code in nfsproc.c, the fill calls are not replaced as
    they aren't needed.
    
    Making the locking explicit will simplify proposed future changes to
    locking for directories.  It also makes it easily visible exactly where
    pre/post attributes are used - not all callers of fh_lock() actually
    need the pre/post attributes.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    [ cel: backported to 5.10.y, prior to idmapped mounts ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: use fsnotify group lock helpers [+ + +]

Author: Amir Goldstein <amir73il@gmail.com>
Date:   Fri Apr 22 15:03:20 2022 +0300

    nfsd: use fsnotify group lock helpers
    
    [ Upstream commit b8962a9d8cc2d8c93362e2f684091c79f702f6f3 ]
    
    Before commit 9542e6a643fc6 ("nfsd: Containerise filecache laundrette")
    nfsd would close open files in direct reclaim context and that could
    cause a deadlock when fsnotify mark allocation went into direct reclaim
    and nfsd shrinker tried to free existing fsnotify marks.
    
    To avoid issues like this in future code, set the FSNOTIFY_GROUP_NOFS
    flag on nfsd fsnotify group to prevent going into direct reclaim from
    fsnotify_add_inode_mark().
    
    Link: https://lore.kernel.org/r/20220422120327.3459282-10-amir73il@gmail.com
    Suggested-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: use locks_inode_context helper [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 16 09:36:07 2022 -0500

    nfsd: use locks_inode_context helper
    
    [ Upstream commit 77c67530e1f95ac25c7075635f32f04367380894 ]
    
    nfsd currently doesn't access i_flctx safely everywhere. This requires a
    smp_load_acquire, as the pointer is set via cmpxchg (a release
    operation).
    
    Acked-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
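
    The pairing described above can be sketched with C11 atomics: a
    pointer is published with a compare-and-exchange that has release
    semantics and is read back with an acquire load. The demo_* types and
    functions are hypothetical user-space stand-ins, not the kernel
    helper itself.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct demo_lock_context {
	int nr_locks;
};

struct demo_inode {
	_Atomic(struct demo_lock_context *) flctx;
};

/* Reader side: the acquire load pairs with the release in the publisher. */
static struct demo_lock_context *
demo_locks_inode_context(struct demo_inode *inode)
{
	return atomic_load_explicit(&inode->flctx, memory_order_acquire);
}

/* Publisher side: install a context exactly once. */
static struct demo_lock_context *demo_get_context(struct demo_inode *inode)
{
	struct demo_lock_context *expected = NULL;
	struct demo_lock_context *ctx = demo_locks_inode_context(inode);
	struct demo_lock_context *new;

	if (ctx)
		return ctx;

	new = calloc(1, sizeof(*new));
	if (!new)
		return NULL;

	if (atomic_compare_exchange_strong_explicit(&inode->flctx, &expected,
						    new, memory_order_acq_rel,
						    memory_order_acquire))
		return new;

	/* Lost the race: free our copy and use the winner's. */
	free(new);
	return expected;
}
```

    A plain, unordered load in place of the acquire load would let a
    reader observe the pointer before the initialized contents it points
    to.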

NFSD: Use only RQ_DROPME to signal the need to drop a reply [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Nov 26 15:55:30 2022 -0500

    NFSD: Use only RQ_DROPME to signal the need to drop a reply
    
    [ Upstream commit 9315564747cb6a570e99196b3a4880fb817635fd ]
    
    Clean up: NFSv2 has the only two usages of rpc_drop_reply in the
    NFSD code base. Since NFSv2 is going away at some point, replace
    these in order to simplify the "drop this reply?" check in
    nfsd_dispatch().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Use rhashtable for managing nfs4_file objects [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Oct 28 10:47:53 2022 -0400

    NFSD: Use rhashtable for managing nfs4_file objects
    
    [ Upstream commit d47b295e8d76a4d69f0e2ea0cd8a79c9d3488280 ]
    
    fh_match() is costly, especially when filehandles are large (as is
    the case for NFSv4). It needs to be used sparingly when searching
    data structures. Unfortunately, with common workloads, I see
    multiple thousands of objects stored in file_hashtbl[], which has
    just 256 buckets, making its bucket hash chains quite lengthy.
    
    Walking long hash chains with the state_lock held blocks other
    activity that needs that lock. Sizable hash chains are a common
    occurrence once the server has handed out some delegations, for
    example -- IIUC, each delegated file is held open on the server by
    an nfs4_file object.
    
    To help mitigate the cost of searching with fh_match(), replace the
    nfs4_file hash table with an rhashtable, which can dynamically
    resize its bucket array to minimize hash chain length.
    
    The result of this modification is an improvement in the latency of
    NFSv4 operations, and the reduction of nfsd CPU utilization due to
    eliminating the cost of multiple calls to fh_match() and reducing
    the CPU cache misses incurred while walking long hash chains in the
    nfs4_file hash table.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Use set_bit(RQ_DROPME) [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Jan 7 10:15:35 2023 -0500

    NFSD: Use set_bit(RQ_DROPME)
    
    [ Upstream commit 5304930dbae82d259bcf7e5611db7c81e7a42eff ]
    
    The premise that "Once an svc thread is scheduled and executing an
    RPC, no other processes will touch svc_rqst::rq_flags" is false.
    svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd
    threads when determining which thread to wake up next.
    
    Fixes: 9315564747cb ("NFSD: Use only RQ_DROPME to signal the need to drop a reply")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
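
    Why an atomic bit operation matters when another context can touch
    the same flags word can be sketched with C11 atomics. The flag values
    and demo_* names are hypothetical; the kernel uses set_bit() on
    rq_flags rather than this API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits sharing one word. */
#define DEMO_RQ_BUSY	(1ul << 0)
#define DEMO_RQ_DROPME	(1ul << 1)

struct demo_rqst {
	atomic_ulong flags;
};

/*
 * Atomic read-modify-write: a concurrent update to another bit in
 * the same word cannot be lost, unlike a plain "flags |= flag".
 */
static void demo_set_flag(struct demo_rqst *rq, unsigned long flag)
{
	atomic_fetch_or(&rq->flags, flag);
}

static void demo_clear_flag(struct demo_rqst *rq, unsigned long flag)
{
	atomic_fetch_and(&rq->flags, ~flag);
}

static bool demo_test_flag(struct demo_rqst *rq, unsigned long flag)
{
	return (atomic_load(&rq->flags) & flag) != 0;
}
```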

NFSD: Use struct_size() helper in alloc_session() [+ + +]

Author: Xiu Jianfeng <xiujianfeng@huawei.com>
Date:   Fri Nov 11 17:18:35 2022 +0800

    NFSD: Use struct_size() helper in alloc_session()
    
    [ Upstream commit 85a0d0c9a58002ef7d1bf5e3ea630f4fbd42a4f0 ]
    
    Use struct_size() helper to simplify the code, no functional changes.
    
    Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
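
    A user-space sketch of the calculation such a helper performs: the
    size of a structure plus a trailing flexible array, saturating on
    overflow. demo_struct_size(), demo_session and demo_slot are
    hypothetical names; the kernel helper is a macro that takes the
    pointer, member and count instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a session with a trailing slot table. */
struct demo_slot {
	uint32_t seqid;
};

struct demo_session {
	uint32_t numslots;
	struct demo_slot *slots[];	/* flexible array member */
};

/*
 * Header size plus n trailing elements. Saturates at SIZE_MAX on
 * overflow so that the following allocation fails instead of being
 * silently undersized.
 */
static size_t demo_struct_size(size_t header, size_t elem, size_t n)
{
	if (elem != 0 && n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + n * elem;
}

static struct demo_session *demo_alloc_session(uint32_t numslots)
{
	size_t size = demo_struct_size(sizeof(struct demo_session),
				       sizeof(struct demo_slot *), numslots);
	struct demo_session *s = calloc(1, size);

	if (s)
		s->numslots = numslots;
	return s;
}
```

    Compared with an open-coded "sizeof(*s) + n * sizeof(ptr)", the helper
    form keeps the overflow check in one place.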

NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:23:02 2022 -0400

    NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks
    
    [ Upstream commit c3d2a04f05c590303c125a176e6e43df4a436fdb ]
    
    Replace the check for buffer over/underflow with a helper that is
    commonly used for this purpose. The helper also sets xdr->nwords
    correctly after successfully linearizing the symlink argument into
    the stream's scratch buffer.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
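
    The bounds check that an inline-decode helper centralizes can be
    sketched as follows. This is a simplified user-space model over one
    flat buffer with hypothetical demo_* names; it leaves out the scratch
    buffer handling and the nwords bookkeeping mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decode cursor over a single contiguous buffer. */
struct demo_stream {
	const uint8_t *p;	/* next unread byte */
	const uint8_t *end;	/* one past the last valid byte */
};

/* Round up to the 4-byte XDR unit. */
static size_t demo_align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Return a pointer to the next nbytes of data and advance past them
 * (including XDR padding), or return NULL without advancing if the
 * request would run past the end of the buffer.
 */
static const uint8_t *demo_inline_decode(struct demo_stream *xdr,
					 size_t nbytes)
{
	size_t padded = demo_align4(nbytes);
	const uint8_t *q = xdr->p;

	if (padded < nbytes || padded > (size_t)(xdr->end - xdr->p))
		return NULL;
	xdr->p += padded;
	return q;
}
```

    Callers then only test the result for NULL instead of repeating
    pointer arithmetic at each decode site.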

NFSD: Use xdr_pad_size() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 22 16:09:16 2022 -0400

    NFSD: Use xdr_pad_size()
    
    [ Upstream commit 5e64d85c7d0c59cfcd61d899720b8ccfe895d743 ]
    
    Clean up: Use a helper instead of open-coding the calculation of
    the XDR pad size.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
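
    For reference, the pad calculation is small enough to show in full.
    XDR items occupy multiples of four bytes, so the pad is the distance
    from a length to the next multiple of four. The demo_* names are
    hypothetical user-space equivalents of the kernel helpers.

```c
#include <stddef.h>

#define DEMO_XDR_UNIT 4u	/* XDR encodes data in 4-byte units */

/* Round n up to the next multiple of the XDR unit. */
static size_t demo_xdr_align_size(size_t n)
{
	return (n + DEMO_XDR_UNIT - 1) & ~(size_t)(DEMO_XDR_UNIT - 1);
}

/* Number of zero bytes that must follow n bytes of opaque data. */
static size_t demo_xdr_pad_size(size_t n)
{
	return demo_xdr_align_size(n) - n;
}
```

    For example, 5 bytes of opaque data are followed by 3 pad bytes, and
    8 bytes need none.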

NFSD: verify the opened dentry after setting a delegation [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Jul 26 16:45:30 2022 +1000

    NFSD: verify the opened dentry after setting a delegation
    
    [ Upstream commit 876c553cb41026cb6ad3cef970a35e5f69c42a25 ]
    
    Between opening a file and setting a delegation on it, someone could
    rename or unlink the dentry. If this happens, we do not want to grant a
    delegation on the open.
    
    On a CLAIM_NULL open, we're opening by filename, and we may (in the
    non-create case) or may not (in the create case) be holding i_rwsem
    when attempting to set a delegation.  The latter case allows a
    race.
    
    After getting a lease, redo the lookup of the file being opened and
    validate that the resulting dentry matches the one in the open file
    description.
    
    To properly redo the lookup we need an rqst pointer to pass to
    nfsd_lookup_dentry(), so make sure that is available.
    
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: WARN when freeing an item still linked via nf_lru [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:25:04 2022 -0400

    NFSD: WARN when freeing an item still linked via nf_lru
    
    [ Upstream commit 668ed92e651d3c25f9b6e8cb7ceca54d00daa96d ]
    
    Add a guardrail to prevent freeing memory that is still on a list.
    This includes either a dispose list or the LRU list.
    
    This is the sign of a bug, but this class of bugs can be detected
    so that they don't endanger system stability, especially while
    debugging.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
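
    A user-space sketch of this kind of guardrail, using a minimal
    circular list: the free routine refuses to release an item whose list
    linkage is not empty and warns instead. Leaking the item in that case
    is an assumption made by this sketch, and all demo_* names are
    hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

static void demo_list_init(struct demo_list_head *h)
{
	h->next = h->prev = h;
}

static bool demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

static void demo_list_add(struct demo_list_head *new,
			  struct demo_list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

static void demo_list_del_init(struct demo_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	demo_list_init(entry);
}

struct demo_file {
	struct demo_list_head lru;
};

/* Returns true if the item was freed, false if it was still linked. */
static bool demo_file_free(struct demo_file *f)
{
	if (!demo_list_empty(&f->lru)) {
		fprintf(stderr, "warning: freeing an item still on a list\n");
		return false;	/* leak rather than corrupt the list */
	}
	free(f);
	return true;
}
```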

NFSD: Write verifier might go backwards [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Dec 30 10:26:18 2021 -0500

    NFSD: Write verifier might go backwards
    
    [ Upstream commit cdc556600c0133575487cc69fb3128440b3c3e92 ]
    
    When vfs_iter_write() starts to fail because a file system is full,
    a bunch of writes can fail at once with ENOSPC. These writes
    repeatedly invoke nfsd_reset_boot_verifier() in quick succession.
    
    Ensure that the time it grabs doesn't go backwards due to an NTP
    adjustment going on at the same time.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD: Zero counters when the filecache is re-initialized [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jul 8 14:24:51 2022 -0400

    NFSD: Zero counters when the filecache is re-initialized
    
    [ Upstream commit 8b330f78040cbe16cf8029df70391b2a491f17e2 ]
    
    If nfsd_file_cache_init() is called after a shutdown, be sure the
    stat counters are reset.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSD:fix boolreturn.cocci warning [+ + +]

Author: Changcheng Deng <deng.changcheng@zte.com.cn>
Date:   Tue Oct 19 04:14:22 2021 +0000

    NFSD:fix boolreturn.cocci warning
    
    [ Upstream commit 291cd656da04163f4bba67953c1f2f823e0d1231 ]
    
    ./fs/nfsd/nfssvc.c: 1072: 8-9: :WARNING return of 0/1 in function
    'nfssvc_decode_voidarg' with return type bool
    
    Return statements in functions returning bool should use true/false
    instead of 1/0.
    
    Reported-by: Zeal Robot <zealci@zte.com.cn>
    Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd_splice_actor(): handle compound pages [+ + +]

Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat Sep 10 22:14:02 2022 +0100

    nfsd_splice_actor(): handle compound pages
    
    [ Upstream commit bfbfb6182ad1d7d184b16f25165faad879147f79 ]
    
    pipe_buffer might refer to a compound page (and contain more than a PAGE_SIZE
    worth of data).  Theoretically it had been possible since way back, but
    nfsd_splice_actor() hadn't run into that until the copy_page_to_iter() change.
    Fortunately, the only thing that changes for compound pages is that we
    need to stuff each relevant subpage in and convert the offset into offset
    in the first subpage.
    
    Acked-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Benjamin Coddington <bcodding@redhat.com>
    Fixes: f0f6b614f83d "copy_page_to_iter(): don't split high-order page in case of ITER_PIPE"
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    [ cel: "‘for’ loop initial declarations are only allowed in C99 or C11 mode" ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
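
    The subpage arithmetic described above can be sketched as follows.
    This is a stand-alone illustration, not the nfsd code: the struct, the
    function name and the 4096-byte page size are assumptions made for the
    example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for illustration */

struct subpage_span {
	size_t first;        /* index of the first subpage holding data */
	size_t last;         /* index of the last subpage holding data */
	size_t first_offset; /* offset of the data inside the first subpage */
};

/* Map (offset, len) within a compound page onto its subpages. */
struct subpage_span split_compound(size_t offset, size_t len)
{
	struct subpage_span s;

	assert(len > 0);
	s.first = offset / SKETCH_PAGE_SIZE;
	s.last = (offset + len - 1) / SKETCH_PAGE_SIZE;
	s.first_offset = offset % SKETCH_PAGE_SIZE;
	return s;
}
```

    A caller would then walk subpages first..last, using first_offset only
    for the first of them.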

NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code. [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Thu Apr 22 03:37:49 2021 -0400

    NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code.
    
    [ Upstream commit d9092b4bb2109502eb8972021a3f74febc931a63 ]
    
    The client SSC code should not depend on any of the CONFIG_NFSD config.
    This patch removes all CONFIG_NFSD from NFSv4.2 client SSC code and
    simplifies the config of CONFIG_NFS_V4_2_SSC_HELPER, NFSD_V4_2_INTER_SSC.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSv4_2: SSC helper should use its own config. [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Thu Jan 28 01:42:26 2021 -0500

    NFSv4_2: SSC helper should use its own config.
    
    [ Upstream commit 02591f9febd5f69bb4c266a4abf899c4cf21964f ]
    
    Currently NFSv4_2 SSC helper, nfs_ssc, incorrectly uses GRACE_PERIOD
    as its config. Fix by adding new config NFS_V4_2_SSC_HELPER which
    depends on NFS_V4_2 and is automatically selected when NFSD_V4 is
    enabled. Also removed the file name from a comment in nfs_ssc.c.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NLM: Defend against file_lock changes after vfs_test_lock() [+ + +]

Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Mon Jun 13 09:40:06 2022 -0400

    NLM: Defend against file_lock changes after vfs_test_lock()
    
    [ Upstream commit 184cefbe62627730c30282df12bcff9aae4816ea ]
    
    Instead of trusting that struct file_lock returns completely unchanged
    after vfs_test_lock() when there's no conflicting lock, stash away our
    nlm_lockowner reference so we can properly release it for all cases.
    
    This defends against another file_lock implementation overwriting fl_owner
    when the return type is F_UNLCK.
    
    Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
    Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
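
    The defensive pattern can be sketched in plain C.  Everything here is
    a simplified stand-in: struct owner, struct lock and
    test_lock_clobbering() are hypothetical and only model a callee that
    overwrites the owner field.

```c
#include <assert.h>
#include <stddef.h>

struct owner { int refs; };
struct lock { struct owner *fl_owner; };

void owner_put(struct owner *o)
{
	assert(o != NULL && o->refs > 0);
	o->refs--;
}

/* Stand-in for a lock test that clobbers fl_owner when nothing conflicts. */
void test_lock_clobbering(struct lock *fl)
{
	fl->fl_owner = NULL;
}

void do_test(struct lock *fl)
{
	struct owner *stashed = fl->fl_owner; /* remember our reference */

	test_lock_clobbering(fl);
	owner_put(stashed); /* release what we took, not what the callee left */
}
```

    Releasing through the stashed pointer keeps the reference count
    balanced no matter what the callee writes into the structure.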

NLM: Fix svcxdr_encode_owner() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 16 17:24:54 2021 -0400

    NLM: Fix svcxdr_encode_owner()
    
    [ Upstream commit 89c485c7a3ecbc2ebd568f9c9c2edf3a8cf7485b ]
    
    Dai Ngo reports that, since the XDR overhaul, the NLM server crashes
    when the TEST procedure wants to return NLM_DENIED. There is a bug
    in svcxdr_encode_owner() that none of our standard test cases found.
    
    Replace the open-coded function with a call to an appropriate
    pre-fabricated XDR helper.
    
    Reported-by: Dai Ngo <Dai.Ngo@oracle.com>
    Fixes: a6a63ca5652e ("lockd: Common NLM XDR helpers")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nlm: minor nlm_lookup_file argument change [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Aug 23 12:01:18 2021 -0400

    nlm: minor nlm_lookup_file argument change
    
    [ Upstream commit 2dc6f19e4f438d4c14987cb17aee38aaf7304e7f ]
    
    It'll come in handy to get the whole nlm_lock.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nlm: minor refactoring [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Aug 23 11:26:39 2021 -0400

    nlm: minor refactoring
    
    [ Upstream commit a81041b7d8f08c4e1014173c5483a0f18724a576 ]
    
    Make this lookup slightly more concise, and prepare for changing how we
    look this up in a following patch.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

proc/fd: In fdinfo seq_show don't use get_files_struct [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:34 2020 -0600

    proc/fd: In fdinfo seq_show don't use get_files_struct
    
    [ Upstream commit 775e0656b27210ae668e33af00bece858f44576f ]
    
    When discussing[1] exec and posix file locks it was realized that none
    of the callers of get_files_struct fundamentally needed to call
    get_files_struct, and that by switching them to helper functions
    instead it will both simplify their code and remove unnecessary
    increments of files_struct.count.  Those unnecessary increments can
    result in exec unnecessarily unsharing files_struct and breaking
    posix locks, and it can result in fget_light having to fall back to
    fget, reducing system performance.
    
    Instead hold task_lock for the duration that task->files needs to be
    stable in seq_show.  The task_lock was already taken in
    get_files_struct, and so skipping get_files_struct performs less work
    overall, and avoids the problems with the files_struct reference
    count.
    
    [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-12-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-17-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

proc/fd: In proc_fd_link use fget_task [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:23 2020 -0600

    proc/fd: In proc_fd_link use fget_task
    
    [ Upstream commit 439be32656035d3239fd56f9b83353ec06cb3b45 ]
    
    When discussing[1] exec and posix file locks it was realized that none
    of the callers of get_files_struct fundamentally needed to call
    get_files_struct, and that by switching them to helper functions
    instead it will both simplify their code and remove unnecessary
    increments of files_struct.count.  Those unnecessary increments can
    result in exec unnecessarily unsharing files_struct and breaking
    posix locks, and it can result in fget_light having to fall back to
    fget, reducing system performance.
    
    Simplifying proc_fd_link is a little bit tricky.  It is necessary to
    know that there is a reference to fd_file while path_get is running.
    This reference can be guaranteed to exist either by locking the
    fdtable as the code currently does or by taking a reference on the
    file in question.
    
    Use fget_task to remove the need for get_files_struct and
    to take a reference to the file in question.
    
    [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-8-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-6-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:32 2020 -0600

    proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
    
    [ Upstream commit 5b17b61870e2f4b0a4fdc5c6039fbdb4ffb796df ]
    
    When discussing[1] exec and posix file locks it was realized that none
    of the callers of get_files_struct fundamentally needed to call
    get_files_struct, and that by switching them to helper functions
    instead it will both simplify their code and remove unnecessary
    increments of files_struct.count.  Those unnecessary increments can
    result in exec unnecessarily unsharing files_struct and breaking
    posix locks, and it can result in fget_light having to fall back to
    fget, reducing system performance.
    
    Using task_lookup_next_fd_rcu simplifies proc_readfd_common, by moving
    the checking for the maximum file descriptor into the generic code, and
    by removing the need for capturing and releasing a reference on
    files_struct.
    
    As task_lookup_next_fd_rcu may update the fd, ctx->pos has been changed
    to be the fd + 2 after task_lookup_next_fd_rcu returns.
    
    [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Andy Lavr <andy.lavr@gmail.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-15-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
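
    The position bookkeeping can be modelled in user space.  This is a
    sketch under stated assumptions: struct fd_table, lookup_next_fd() and
    read_fds() are hypothetical stand-ins, and positions 0 and 1 are
    assumed to be taken by "." and "..", so that fd n lives at position
    n + 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fd table: is_open[i] is true if descriptor i is open. */
struct fd_table {
	const bool *is_open;
	unsigned int max;
};

/* Advance *fd to the next open descriptor at or after *fd; false if none. */
bool lookup_next_fd(const struct fd_table *t, unsigned int *fd)
{
	for (; *fd < t->max; (*fd)++)
		if (t->is_open[*fd])
			return true;
	return false;
}

/* Emit up to cap descriptors starting at *pos; returns how many were emitted. */
size_t read_fds(const struct fd_table *t, long *pos, unsigned int *out, size_t cap)
{
	size_t n = 0;
	unsigned int fd;

	assert(*pos >= 2);
	for (fd = (unsigned int)(*pos - 2);; fd++) {
		bool found = lookup_next_fd(t, &fd);

		*pos = (long)fd + 2; /* the helper may have moved fd forward */
		if (!found || n == cap)
			break; /* a full buffer leaves *pos on the unread entry */
		out[n++] = fd;
	}
	return n;
}
```

    Because the position is recomputed from the fd the helper settled on,
    a later call resumes at the first descriptor that was not emitted.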

proc/fd: In tid_fd_mode use task_lookup_fd_rcu [+ + +]

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Nov 20 17:14:29 2020 -0600

    proc/fd: In tid_fd_mode use task_lookup_fd_rcu
    
    [ Upstream commit 64eb661fda0269276b4c46965832938e3f268268 ]
    
    When discussing[1] exec and posix file locks it was realized that none
    of the callers of get_files_struct fundamentally needed to call
    get_files_struct, and that by switching them to helper functions
    instead it will both simplify their code and remove unnecessary
    increments of files_struct.count.  Those unnecessary increments can
    result in exec unnecessarily unsharing files_struct and breaking
    posix locks, and it can result in fget_light having to fall back to
    fget, reducing system performance.
    
    Instead of open-coding the lookup of a task's files struct and
    then calling files_lookup_fd_rcu, use the helper task_lookup_fd_rcu
    that combines those two steps, making the code simpler and removing
    the need to get a reference on a files_struct.
    
    [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
    v1: https://lkml.kernel.org/r/20200817220425.9389-7-ebiederm@xmission.com
    Link: https://lkml.kernel.org/r/20201120231441.29911-12-ebiederm@xmission.com
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "fanotify: limit number of event merge attempts" [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Mar 7 09:22:43 2024 -0500

    Revert "fanotify: limit number of event merge attempts"
    
    Temporarily revert commit ad3ea16746cc ("fanotify: limit number of
    event merge attempts") to enable subsequent upstream commits to
    apply and build cleanly.
    
    Stable-dep-of: 8988f11abb82 ("fanotify: reduce event objectid to 29-bit hash")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "fget: clarify and improve __fget_files() implementation" [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Feb 29 18:19:36 2024 -0500

    Revert "fget: clarify and improve __fget_files() implementation"
    
    Temporarily revert commit 0849f83e4782 ("fget: clarify and improve
    __fget_files() implementation") to enable subsequent upstream
    commits to apply and build cleanly.
    
    Stable-dep-of: bebf684bf330 ("file: Rename __fcheck_files to files_lookup_fd_raw")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "nfsd4: support change_attr_type attribute" [+ + +]

Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Mon Nov 30 17:46:18 2020 -0500

    Revert "nfsd4: support change_attr_type attribute"
    
    This reverts commit a85857633b04d57f4524cca0a2bfaf87b2543f9f.
    
    We're still factoring ctime into our change attribute even in the
    IS_I_VERSION case.  If someone sets the system time backwards, a client
    could see the change attribute go backwards.  Maybe we can just say
    "well, don't do that", but there's some question whether that's good
    enough, or whether we need a better guarantee.
    
    Also, the client still isn't actually using the attribute.
    
    While we're still figuring this out, let's just stop returning this
    attribute.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "nfsd: skip some unnecessary stats in the v4 case" [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Dec 24 14:22:28 2021 -0500

    Revert "nfsd: skip some unnecessary stats in the v4 case"
    
    [ Upstream commit 58f258f65267542959487dbe8b5641754411843d ]
    
    On the wire, I observed NFSv4 OPEN(CREATE) operations sometimes
    returning a reasonable-looking value in the cinfo.before field and
    zero in the cinfo.after field.
    
    RFC 8881 Section 10.8.1 says:
    > When a client is making changes to a given directory, it needs to
    > determine whether there have been changes made to the directory by
    > other clients.  It does this by using the change attribute as
    > reported before and after the directory operation in the associated
    > change_info4 value returned for the operation.
    
    and
    
    > ... The post-operation change
    > value needs to be saved as the basis for future change_info4
    > comparisons.
    
    A good quality client implementation therefore saves the zero
    cinfo.after value. During a subsequent OPEN operation, it will
    receive a different non-zero value in the cinfo.before field for
    that directory, and it will incorrectly believe the directory has
    changed, triggering an undesirable directory cache invalidation.
    
    There are filesystem types where fs_supports_change_attribute()
    returns false, tmpfs being one. On NFSv4 mounts, this means the
    fh_getattr() call site in fill_pre_wcc() and fill_post_wcc() is
    never invoked. Subsequently, nfsd4_change_attribute() is invoked
    with an uninitialized @stat argument.
    
    In fill_pre_wcc(), @stat contains stale stack garbage, which is
    then placed on the wire. In fill_post_wcc(), ->fh_post_wc is all
    zeroes, so zero is placed on the wire. Both of these values are
    meaningless.
    
    This fix can be applied immediately to stable kernels. Once there
    are more regression tests in this area, this optimization can be
    attempted again.
    
    Fixes: 428a23d2bf0c ("nfsd: skip some unnecessary stats in the v4 case")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
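
    The client-side effect can be illustrated with a toy model.  The
    structures and dir_cache_update() below are hypothetical and only
    mimic the comparison that the quoted RFC text describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct change_info { uint64_t before; uint64_t after; };

struct dir_cache {
	uint64_t change; /* post-operation value saved from the last reply */
	bool valid;
};

void dir_cache_update(struct dir_cache *c, const struct change_info *ci)
{
	assert(c != NULL && ci != NULL);
	/* A "before" that differs from the saved value means someone else changed the directory. */
	if (ci->before != c->change)
		c->valid = false;
	c->change = ci->after; /* basis for the next comparison */
}
```

    Feeding this model a reply whose after value is zero leaves zero as
    the saved basis, so the next reply with a sane before value
    invalidates the cache even though no other client touched the
    directory.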

Revert "SUNRPC: Use RMW bitops in single-threaded hot paths" [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jan 6 12:43:37 2023 -0500

    Revert "SUNRPC: Use RMW bitops in single-threaded hot paths"
    
    [ Upstream commit 7827c81f0248e3c2f40d438b020f3d222f002171 ]
    
    The premise that "Once an svc thread is scheduled and executing an
    RPC, no other processes will touch svc_rqst::rq_flags" is false.
    svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd
    threads when determining which thread to wake up next.
    
    Found via KCSAN.
    
    Fixes: 28df0988815f ("SUNRPC: Use RMW bitops in single-threaded hot paths")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
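
    In user-space terms, a flag word that other threads may examine or
    modify needs atomic read-modify-write operations; a plain
    "flags |= bit" can lose a concurrent update.  The C11 sketch below
    shows the atomic form.  The helper names and the SKETCH_RQ_BUSY bit
    number are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_RQ_BUSY 0 /* hypothetical bit number */

void flag_set(atomic_ulong *flags, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_or(flags, 1UL << bit);
}

void flag_clear(atomic_ulong *flags, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_and(flags, ~(1UL << bit));
}

bool flag_test(atomic_ulong *flags, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	return (atomic_load(flags) & (1UL << bit)) != 0;
}
```

    Each helper touches the whole word in a single atomic step, so a
    concurrent flag_test() on one bit never observes a half-finished
    update to another.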

SUNRPC/NFSD: clean up get/put functions. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC/NFSD: clean up get/put functions.
    
    [ Upstream commit 8c62d12740a1450d2e8456d5747f440e10db281a ]
    
    svc_destroy() is poorly named - it doesn't necessarily destroy the svc,
    it might just reduce the ref count.
    nfsd_destroy() is poorly named for the same reason.
    
    This patch:
     - removes the refcount functionality from svc_destroy(), moving it to
       a new svc_put().  Almost all previous callers of svc_destroy() now
       call svc_put().
     - renames nfsd_destroy() to nfsd_put() and improves the code, using
       the new svc_destroy() rather than svc_put()
     - removes a few comments that explain the importance of balanced
       get/put calls.  This should be obvious.
    
    The only non-trivial part of this is that svc_destroy() would call
    svc_sock_update() on a non-final decrement.  It can no longer do that,
    and svc_put() isn't really a good place for it.  This call is now made
    from svc_exit_thread() which seems like a good place.  This makes the
    call *before* sv_nrthreads is decremented rather than after.  This
    is not particularly important as the call just sets a flag which
    causes sv_nrthreads to be checked later.  A subsequent patch will
    improve the ordering.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
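
    The naming split can be sketched with a toy reference-counted object.
    struct serv and the serv_*() helpers are hypothetical; the point is
    only that "put" drops a reference while "destroy" tears down
    unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

struct serv {
	int refs;
	bool destroyed;
};

/* Unconditional teardown: no reference counting in here. */
void serv_destroy(struct serv *s)
{
	s->destroyed = true;
}

/* Drop one reference; tear down only on the final put. */
void serv_put(struct serv *s)
{
	assert(s->refs > 0);
	if (--s->refs == 0)
		serv_destroy(s);
}
```

    With two references held, the first serv_put() leaves the object
    alive and the second one destroys it.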

SUNRPC: Add svc_rqst::rq_auth_stat [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 15 15:52:06 2021 -0400

    SUNRPC: Add svc_rqst::rq_auth_stat
    
    [ Upstream commit 438623a06bacd69c40c4af633bb09a3bbb9dfc78 ]
    
    I'd like to take commit 4532608d71c8 ("SUNRPC: Clean up generic
    dispatcher code") even further by using only private local SVC
    dispatchers for all kernel RPC services. This change would enable
    the removal of the logic that switches between
    svc_generic_dispatch() and a service's private dispatcher, and
    simplify the invocation of the service's pc_release method
    so that humans can visually verify that it is always invoked
    properly.
    
    All that will come later.
    
    First, let's provide a better way to return authentication errors
    from SVC dispatcher functions. Instead of overloading the dispatch
    method's *statp argument, add a field to struct svc_rqst that can
    hold an error value.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Add svc_rqst_replace_page() API [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 1 10:03:10 2021 -0400

    SUNRPC: Add svc_rqst_replace_page() API
    
    [ Upstream commit 2f0f88f42f2eab0421ed37d7494de9124fdf0d34 ]
    
    Replacing a page in rq_pages[] requires a get_page(), which is a
    bus-locked operation, and a put_page(), which can be even more
    costly.
    
    To reduce the cost of replacing a page in rq_pages[], batch the
    put_page() operations by collecting "freed" pages in a pagevec,
    and then release those pages when the pagevec is full. This
    pagevec is also emptied when each RPC completes.
    
    [ cel: adjusted to apply without f6e70aab9dfe ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
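
    The batching idea can be sketched in user space, with malloc()/free()
    standing in for page allocation and release.  The names and
    BATCH_SIZE are assumptions for the example; the reference taken on
    the new page is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_SIZE 4 /* arbitrary capacity for the sketch */

struct free_batch {
	void *items[BATCH_SIZE];
	size_t count;
	size_t flushes; /* how many times the batch was drained */
};

/* Release everything collected so far. */
void batch_flush(struct free_batch *b)
{
	for (size_t i = 0; i < b->count; i++)
		free(b->items[i]);
	if (b->count)
		b->flushes++;
	b->count = 0;
}

/* Queue one buffer for release; drain the batch once it is full. */
void batch_add(struct free_batch *b, void *p)
{
	assert(b->count < BATCH_SIZE);
	b->items[b->count++] = p;
	if (b->count == BATCH_SIZE)
		batch_flush(b);
}

/* Replace *slot with new_buf, deferring the release of the old buffer. */
void slot_replace(struct free_batch *b, void **slot, void *new_buf)
{
	if (*slot)
		batch_add(b, *slot);
	*slot = new_buf;
}
```

    A caller would also invoke batch_flush() at the end of each request
    so that no buffer stays queued indefinitely.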

SUNRPC: Add xdr_set_scratch_page() and xdr_reset_scratch_buffer() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Nov 11 15:52:47 2020 -0500

    SUNRPC: Add xdr_set_scratch_page() and xdr_reset_scratch_buffer()
    
    [ Upstream commit 0ae4c3e8a64ace1b8d7de033b0751afe43024416 ]
    
    Clean up: De-duplicate some frequently-used code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: always treat sv_nrpools==1 as "not pooled" [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: always treat sv_nrpools==1 as "not pooled"
    
    [ Upstream commit 93aa619eb0b42eec2f3a9b4d9db41f5095390aec ]
    
    Currently 'pooled' services hold a reference on the pool_map, and
    'unpooled' services do not.
    svc_destroy() uses the presence of ->svo_function (via
    svc_serv_is_pooled()) to determine if the reference should be dropped.
    There is no direct correlation between being pooled and the use of
    svo_function, though in practice, lockd is the only non-pooled service,
    and the only one not to use svo_function.
    
    This is untidy and would cause problems if we changed lockd to use
    svc_set_num_threads(), which requires the use of ->svo_function.
    
    So change the test for "is the service pooled" to "is sv_nrpools > 1".
    
    This means that when svc_pool_map_get() returns 1, it must NOT take a
    reference to the pool.
    
    We discard svc_serv_is_pooled(), and test sv_nrpools directly.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Change return value type of .pc_decode [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 12 11:57:28 2021 -0400

    SUNRPC: Change return value type of .pc_decode
    
    [ Upstream commit c44b31c263798ec34614dd394c31ef1a2e7e716e ]
    
    Returning an undecorated integer is an age-old trope, but it's
    not clear (even to previous experts in this code) that the only
    valid return values are 1 and 0. These functions do not return
    a negative errno, rpc_stat value, or a positive length.
    
    Document there are only two valid return values by having
    .pc_decode return only true or false.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
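
    A small stand-alone sketch of the convention follows.  The decode_fn
    type and both functions are hypothetical; they only show a decoder
    whose signature makes "success or failure, nothing else" explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decoder signature: true on success, false on failure. */
typedef bool (*decode_fn)(const unsigned char *buf, size_t len, unsigned int *out);

bool decode_u8(const unsigned char *buf, size_t len, unsigned int *out)
{
	if (len < 1)
		return false; /* short buffer: not an errno, not a length */
	*out = buf[0];
	return true;
}

/* The dispatcher only needs a yes/no answer from the decoder. */
bool dispatch(decode_fn decode, const unsigned char *buf, size_t len, unsigned int *out)
{
	assert(decode != NULL);
	return decode(buf, len, out);
}
```

    With a bool return type, a caller cannot mistake the result for a
    negative errno or a byte count.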

SUNRPC: Change return value type of .pc_encode [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 13 10:41:13 2021 -0400

    SUNRPC: Change return value type of .pc_encode
    
    [ Upstream commit 130e2054d4a652a2bd79fb1557ddcd19c053cb37 ]
    
    Returning an undecorated integer is an age-old trope, but it's
    not clear (even to previous experts in this code) that the only
    valid return values are 1 and 0. These functions do not return
    a negative errno, rpc_stat value, or a positive length.
    
    Document there are only two valid return values by having
    .pc_encode return only true or false.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: change svc_get() to return the svc. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: change svc_get() to return the svc.
    
    [ Upstream commit df5e49c880ea0776806b8a9f8ab95e035272cf6f ]
    
    It is common for 'get' functions to return the object that was 'got',
    and there are a couple of places where users of svc_get() would be a
    little simpler if svc_get() did that.
    
    Make it so.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: discard svo_setup and rename svc_set_num_threads_sync() [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: discard svo_setup and rename svc_set_num_threads_sync()
    
    [ Upstream commit 3ebdbe5203a874614819700d3f470724cb803709 ]
    
    The ->svo_setup callback serves no purpose.  It is always called from
    within the same module that chooses which callback is needed.  So
    discard it and call the relevant function directly.
    
    Now that svc_set_num_threads() is no longer used remove it and rename
    svc_set_num_threads_sync() to remove the "_sync" suffix.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Display RPC procedure names instead of proc numbers [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Dec 3 10:22:09 2020 -0500

    SUNRPC: Display RPC procedure names instead of proc numbers
    
    [ Upstream commit 89ff87494c6e4b32ea7960d0c644efdbb2fe6ef5 ]
    
    Make the sunrpc trace subsystem trace events easier to use.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Eliminate the RQ_AUTHERR flag [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 15 15:52:19 2021 -0400

    SUNRPC: Eliminate the RQ_AUTHERR flag
    
    [ Upstream commit 9082e1d914f8b27114352b1940bbcc7522f682e7 ]
    
    Now that there is an alternate method for returning an auth_stat
    value, replace the RQ_AUTHERR flag with use of that new method.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Export svc_xprt_received() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Jan 29 13:04:04 2021 -0500

    SUNRPC: Export svc_xprt_received()
    
    [ Upstream commit 7dcfbd86adc45f6d6b37278efd22530cf80ab474 ]
    
    Prepare svc_xprt_received() to be called from transport code instead
    of from generic RPC server code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Fix xdr_encode_bool() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jul 19 09:18:35 2022 -0400

    SUNRPC: Fix xdr_encode_bool()
    
    [ Upstream commit c770f31d8f580ed4b965c64f924ec1cc50e41734 ]
    
    I discovered that xdr_encode_bool() was returning the same address
    that was passed in the @p parameter. The documenting comment states
    that the intent is to return the address of the next buffer
    location, just like the other "xdr_encode_*" helpers.
    
    The result was the encoded results of NFSv3 PATHCONF operations were
    not formed correctly.
    
    Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
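
    The contract described above, where each encoder returns the address
    of the next buffer location, can be sketched in user space.  The
    sketch_* names are hypothetical and htonl() stands in for the
    kernel's big-endian conversion.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a boolean as one 32-bit big-endian word and return the next slot. */
uint32_t *sketch_encode_bool(uint32_t *p, int value)
{
	assert(p != NULL);
	*p++ = htonl(value ? 1 : 0);
	return p;
}

/* Store a 32-bit value the same way, again returning the next slot. */
uint32_t *sketch_encode_u32(uint32_t *p, uint32_t value)
{
	assert(p != NULL);
	*p++ = htonl(value);
	return p;
}
```

    Chained calls such as p = sketch_encode_bool(p, 1) followed by
    p = sketch_encode_u32(p, 255) only lay out consecutive words if every
    helper advances the pointer it returns.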

SUNRPC: Make trace_svc_process() display the RPC procedure symbolically [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Sep 17 17:22:49 2020 -0400

    SUNRPC: Make trace_svc_process() display the RPC procedure symbolically
    
    [ Upstream commit 2289e87b5951f97783f07fc895e6c5e804b53668 ]
    
    The next few patches will employ these strings to help make server-
    side trace logs more human-readable. A similar technique is already
    in use in kernel RPC client code.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jan 25 17:57:23 2022 -0500

    SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt()
    
    [ Upstream commit c0219c499799c1e92bd570c15a47e6257a27bb15 ]
    
    Neil says:
    "These functions were separated in commit 0971374e2818 ("SUNRPC:
    Reduce contention in svc_xprt_enqueue()") so that the XPT_BUSY check
    happened before taking any spinlocks.
    
    We have since moved or removed the spinlocks so the extra test is
    fairly pointless."
    
    I've made this a separate patch in case the XPT_BUSY change has
    unexpected consequences and needs to be reverted.
    
    Suggested-by: Neil Brown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Move definition of XDR_UNIT [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Nov 27 17:37:02 2020 -0500

    SUNRPC: Move definition of XDR_UNIT
    
    [ Upstream commit 81d217474326b25d7f14274b02fe3da1e85ad934 ]
    
    Clean up: The unit of XDR alignment is defined by RFC 4506,
    not as part of the RPC message header. Thus it belongs in
    include/linux/sunrpc/xdr.h.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
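
    For illustration, rounding a length up to the XDR unit looks like
    this, assuming the four-byte unit from RFC 4506.  SKETCH_XDR_UNIT
    and xdr_padded_len() are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_XDR_UNIT 4u /* assumed: the RFC 4506 block size in bytes */

/* Round n up to the next multiple of the XDR unit. */
size_t xdr_padded_len(size_t n)
{
	assert(n <= SIZE_MAX - (SKETCH_XDR_UNIT - 1));
	return (n + SKETCH_XDR_UNIT - 1) & ~(size_t)(SKETCH_XDR_UNIT - 1);
}
```

    A five-byte opaque item, for example, occupies two units on the wire.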

SUNRPC: move the pool_map definitions (back) into svc.c [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: move the pool_map definitions (back) into svc.c
    
    [ Upstream commit cf0e124e0a489944d08fcc3c694d2b234d2cc658 ]
    
    These definitions are not used outside of svc.c, and there is no
    evidence that they ever have been.  So move them into svc.c
    and make the declarations 'static'.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Optimize xdr_reserve_space() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jun 7 16:47:58 2022 -0400

    SUNRPC: Optimize xdr_reserve_space()
    
    [ Upstream commit 62ed448cc53b654036f7d7f3c99f299d79ad14c3 ]
    
    Transitioning between encode buffers is quite infrequent. It happens
    about 1 time in 400 calls to xdr_reserve_space(), measured on NFSD
    with a typical build/test workload.
    
    Force the compiler to remove that code from xdr_reserve_space(),
    which is a hot path on both the server and the client. This change
    reduces the size of xdr_reserve_space() from 10 cache lines to 2
    when compiled with -Os.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
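
    One common way to keep a rarely taken branch out of a hot function is
    to move it into a separate function that the compiler may not inline.
    The sketch below assumes GCC or Clang extensions (the noinline and
    cold attributes and __builtin_expect); the structure and function
    names are hypothetical and do not mirror the kernel's xdr_stream.

```c
#include <assert.h>
#include <stddef.h>

struct enc_stream {
	unsigned char *p;     /* next free byte in the current buffer */
	unsigned char *end;   /* one past the end of the current buffer */
	unsigned char *spare; /* next buffer to switch to, or NULL */
	size_t spare_len;
};

/* Rare path: switch to the spare buffer. Kept out of line on purpose. */
__attribute__((noinline, cold))
unsigned char *reserve_slow(struct enc_stream *s, size_t n)
{
	unsigned char *ret;

	if (!s->spare || s->spare_len < n)
		return NULL;
	ret = s->spare;
	s->p = ret + n;
	s->end = ret + s->spare_len;
	s->spare = NULL;
	s->spare_len = 0;
	return ret;
}

/* Hot path: a bounds check and a pointer bump. */
unsigned char *reserve(struct enc_stream *s, size_t n)
{
	unsigned char *ret = s->p;

	assert(s->p <= s->end);
	if (__builtin_expect((size_t)(s->end - s->p) < n, 0))
		return reserve_slow(s, n);
	s->p += n;
	return ret;
}
```

    The buffer switch still works, but its instructions no longer sit in
    the body of the function that runs on every call.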

SUNRPC: Parametrize how much of argsize should be zeroed [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Sep 12 17:22:38 2022 -0400

    SUNRPC: Parametrize how much of argsize should be zeroed
    
    [ Upstream commit 103cc1fafee48adb91fca0e19deb869fd23e46ab ]
    
    Currently, SUNRPC clears the whole of .pc_argsize before processing
    each incoming RPC transaction. Add an extra parameter to struct
    svc_procedure to enable upper layers to reduce the amount of each
    operation's argument structure that is zeroed by SUNRPC.
    
    The size of struct nfsd4_compoundargs, in particular, is a lot to
    clear on each incoming RPC Call. A subsequent patch will cut this
    down to something closer to what NFSv2 and NFSv3 use.
    
    This patch should cause no behavior changes.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
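
As a rough user-space illustration of the idea in this entry, a per-procedure "bytes to zero" value can be kept next to the full argument size. The struct and function names below are hypothetical and only loosely modelled on struct svc_procedure; the field layout is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument structure: a small header plus a large tail. */
struct demo_args {
    unsigned int opcount;        /* must start out as zero */
    unsigned int flags;          /* must start out as zero */
    char         scratch[4096];  /* fully overwritten by the decoder */
};

/* Hypothetical per-procedure descriptor. */
struct demo_procedure {
    size_t argsize;  /* full size of the argument structure */
    size_t argzero;  /* leading bytes to clear before decoding */
};

/* Clear only the part of the argument buffer the procedure asks for. */
static void demo_prepare_args(const struct demo_procedure *proc, void *argp)
{
    assert(proc->argzero <= proc->argsize);
    memset(argp, 0, proc->argzero);
}
```

Setting argzero equal to argsize reproduces the old "clear everything" behavior, which is how a change of this kind can be introduced without altering behavior.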

SUNRPC: Prepare for xdr_stream-style decoding on the server-side [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Nov 5 11:19:42 2020 -0500

    SUNRPC: Prepare for xdr_stream-style decoding on the server-side
    
    [ Upstream commit 5191955d6fc65e6d4efe8f4f10a6028298f57281 ]
    
    A "permanent" struct xdr_stream is allocated in struct svc_rqst so
    that it is usable by all server-side decoders. A per-rqst scratch
    buffer is also allocated to handle decoding XDR data items that
    cross page boundaries.
    
    To demonstrate how it will be used, add the first call site for the
    new svcxdr_init_decode() API.
    
    As an additional part of the overall conversion, add symbolic
    constants for successful and failed XDR operations. Returning "0" is
    overloaded. Sometimes it means something failed, but sometimes it
    means success. To make it more clear when XDR decoding functions
    succeed or fail, introduce symbolic constants.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Remove svc_shutdown_net() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jan 26 11:30:55 2022 -0500

    SUNRPC: Remove svc_shutdown_net()
    
    [ Upstream commit c7d7ec8f043e53ad16e30f5ebb8b9df415ec0f2b ]
    
    Clean up: svc_shutdown_net() now does nothing but call
    svc_close_net(). Replace all external call sites.
    
    svc_close_net() is renamed to be the inverse of svc_xprt_create().
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Remove svo_shutdown method [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jan 25 13:49:29 2022 -0500

    SUNRPC: Remove svo_shutdown method
    
    [ Upstream commit 87cdd8641c8a1ec6afd2468265e20840a57fd888 ]
    
    Clean up. Neil observed that "any code that calls svc_shutdown_net()
    knows what the shutdown function should be, and so can call it
    directly."
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Remove the .svo_enqueue_xprt method [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Jan 25 10:17:59 2022 -0500

    SUNRPC: Remove the .svo_enqueue_xprt method
    
    [ Upstream commit a9ff2e99e9fa501ec965da03c18a5422b37a2f44 ]
    
    We have never been able to track down and address the underlying
    cause of the performance issues with workqueue-based service
    support. svo_enqueue_xprt is called multiple times per RPC, so
    it adds instruction path length, but always ends up at the same
    function: svc_xprt_do_enqueue(). We do not anticipate needing
    this flexibility for dynamic nfsd thread management support.
    
    As a micro-optimization, remove .svo_enqueue_xprt because
    Spectre/Meltdown makes virtual function calls more costly.
    
    This change essentially reverts commit b9e13cdfac70 ("nfsd/sunrpc:
    turn enqueueing a svc_xprt into a svc_serv operation").
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Rename svc_close_xprt() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Jan 31 13:34:29 2022 -0500

    SUNRPC: Rename svc_close_xprt()
    
    [ Upstream commit 4355d767a21b9445958fc11bce9a9701f76529d3 ]
    
    Clean up: Use the "svc_xprt_<task>" function naming convention as
    is used for other external APIs.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Rename svc_create_xprt() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jan 26 11:42:08 2022 -0500

    SUNRPC: Rename svc_create_xprt()
    
    [ Upstream commit 352ad31448fecc78a2e9b78da64eea5d63b8d0ce ]
    
    Clean up: Use the "svc_xprt_<task>" function naming convention as
    is used for other external APIs.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Rename svc_encode_read_payload() [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jun 10 10:36:42 2020 -0400

    SUNRPC: Rename svc_encode_read_payload()
    
    [ Upstream commit 03493bca084fdca48abc59b00e06ce733aa9eb7d ]
    
    Clean up: "result payload" is a less confusing name for these
    payloads. "READ payload" reflects only the NFS usage.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Replace the "__be32 *p" parameter to .pc_decode [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Tue Oct 12 11:57:22 2021 -0400

    SUNRPC: Replace the "__be32 *p" parameter to .pc_decode
    
    [ Upstream commit 16c663642c7ec03cd4cee5fec520bb69e97babe4 ]
    
    The passed-in value of the "__be32 *p" parameter is now unused in
    every server-side XDR decoder, and can be removed.
    
    Note also that there is a line in each decoder that sets up a local
    pointer to a struct xdr_stream. Passing that pointer from the
    dispatcher instead saves one line per decoder function.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Replace the "__be32 *p" parameter to .pc_encode [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Oct 13 10:41:06 2021 -0400

    SUNRPC: Replace the "__be32 *p" parameter to .pc_encode
    
    [ Upstream commit fda494411485aff91768842c532f90fb8eb54943 ]
    
    The passed-in value of the "__be32 *p" parameter is now unused in
    every server-side XDR encoder, and can be removed.
    
    Note also that there is a line in each encoder that sets up a local
    pointer to a struct xdr_stream. Passing that pointer from the
    dispatcher instead saves one line per encoder function.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Return true/false (not 1/0) from bool functions [+ + +]

Author: Haowen Bai <baihaowen@meizu.com>
Date:   Mon Mar 28 10:48:59 2022 +0800

    SUNRPC: Return true/false (not 1/0) from bool functions
    
    [ Upstream commit 5f7b839d47dbc74cf4a07beeab5191f93678673e ]
    
    Return boolean values ("true" or "false") instead of 1 or 0 from bool
    functions.  This fixes the following warnings from coccicheck:
    
    ./fs/nfsd/nfs2acl.c:289:9-10: WARNING: return of 0/1 in function
    'nfsaclsvc_encode_accessres' with return type bool
    ./fs/nfsd/nfs2acl.c:252:9-10: WARNING: return of 0/1 in function
    'nfsaclsvc_encode_getaclres' with return type bool
    
    Signed-off-by: Haowen Bai <baihaowen@meizu.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Set rq_auth_stat in the pg_authenticate() callout [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Thu Jul 15 15:52:12 2021 -0400

    SUNRPC: Set rq_auth_stat in the pg_authenticate() callout
    
    [ Upstream commit 5c2465dfd457f3015eebcc3ace50570e1d896aeb ]
    
    In a few moments, rq_auth_stat will need to be explicitly set to
    rpc_auth_ok before execution gets to the dispatcher.
    
    svc_authenticate() already sets it, but it often gets reset to
    rpc_autherr_badcred right after that call, even when authentication
    is successful. Let's ensure that the pg_authenticate callout and
    svc_set_client() set it properly in every case.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: stop using ->sv_nrthreads as a refcount [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: stop using ->sv_nrthreads as a refcount
    
    [ Upstream commit ec52361df99b490f6af412b046df9799b92c1050 ]
    
    The use of sv_nrthreads as a general refcount results in clumsy code, as
    is seen by various comments needed to explain the situation.
    
    This patch introduces a 'struct kref' and uses that for reference
    counting, leaving sv_nrthreads to be a pure count of threads.  The kref
    is managed particularly in svc_get() and svc_put(), and also nfsd_put().
    
    svc_destroy() now takes a pointer to the embedded kref, rather than to
    the serv.
    
    nfsd allows the svc_serv to exist with ->sv_nrthreads being zero.  This
    happens when a transport is created before the first thread is started.
    To support this, a 'keep_active' flag is introduced which holds a ref on
    the svc_serv.  This is set when any listening socket is successfully
    added (unless there are running threads), and cleared when the number of
    threads is set.  So when the last thread exits, the svc_serv will be
    destroyed.
    The use of 'keep_active' replaces previous code which checked if there
    were any permanent sockets.
    
    We no longer clear ->rq_server when nfsd() exits.  This was done
    to prevent svc_exit_thread() from calling svc_destroy().
    Instead we take an extra reference to the svc_serv to prevent
    svc_destroy() from being called.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
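
The separation between a lifetime reference count and a pure thread count can be sketched in user space as follows. Everything here is hypothetical (struct demo_serv and the demo_* helpers are not kernel APIs), the counters are not thread-safe, and the assumption that each running thread holds its own reference is made only for the sake of the example.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_serv {
    unsigned int refcount;     /* lifetime references, in the role of a kref */
    unsigned int nrthreads;    /* pure count of running threads */
    bool         keep_active;  /* true while the flag holds a reference */
    bool         destroyed;    /* set instead of freeing, for illustration */
};

static void demo_get(struct demo_serv *s)
{
    s->refcount++;
}

static void demo_put(struct demo_serv *s)
{
    assert(s->refcount > 0);
    if (--s->refcount == 0)
        s->destroyed = true;
}

/* A listener was added while no thread is running: pin the service. */
static void demo_add_listener(struct demo_serv *s)
{
    if (s->nrthreads == 0 && !s->keep_active) {
        s->keep_active = true;
        demo_get(s);
    }
}

/* Setting the thread count drops the pin; each thread holds its own ref. */
static void demo_set_threads(struct demo_serv *s, unsigned int n)
{
    while (s->nrthreads < n) {
        demo_get(s);
        s->nrthreads++;
    }
    while (s->nrthreads > n) {
        s->nrthreads--;
        demo_put(s);
    }
    if (s->keep_active) {
        s->keep_active = false;
        demo_put(s);
    }
}
```

With this split, nrthreads can be zero while the object is still alive, and destruction is driven only by the reference count.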

SUNRPC: Trace calls to .rpc_call_done [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Sat Oct 16 18:02:57 2021 -0400

    SUNRPC: Trace calls to .rpc_call_done
    
    [ Upstream commit b40887e10dcacc5e8ae3c1a99dcba20877c4831b ]
    
    Introduce a single tracepoint that can replace simple dprintk call
    sites in upper layer "rpc_call_done" callbacks. Example:
    
       kworker/u24:2-1254  [001]   771.026677: rpc_stats_latency:    task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555
       kworker/u24:2-1254  [001]   771.026677: rpc_task_call_done:   task:00000001@00000002 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpcb_getport_done
       kworker/u24:2-1254  [001]   771.026678: rpcb_setport:         task:00000001@00000002 status=0 port=20048
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Use RMW bitops in single-threaded hot paths [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Apr 29 10:06:21 2022 -0400

    SUNRPC: Use RMW bitops in single-threaded hot paths
    
    [ Upstream commit 28df0988815f63e2af5e6718193c9f68681ad7ff ]
    
    I noticed CPU pipeline stalls while using perf.
    
    Once an svc thread is scheduled and executing an RPC, no other
    processes will touch svc_rqst::rq_flags. Thus bus-locked atomics are
    not needed outside the svc thread scheduler.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
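
The distinction this entry relies on, atomic versus plain read-modify-write bit updates, can be shown with a user-space sketch built on C11 atomics. The demo_* names and flag numbers are hypothetical; in the kernel the corresponding helpers are set_bit() (atomic) and __set_bit() (non-atomic), which operate on the same unsigned long rather than on two different types as here.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define DEMO_FLAG_BUSY   0
#define DEMO_FLAG_SPLICE 1
#define DEMO_BITS        ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Atomic set: needed while other threads may update the same word. */
static void demo_set_bit_atomic(int nr, atomic_ulong *word)
{
    assert(nr >= 0 && nr < DEMO_BITS);
    atomic_fetch_or(word, 1UL << nr);
}

/* Plain read-modify-write: only safe when a single thread owns the word. */
static void demo_set_bit_plain(int nr, unsigned long *word)
{
    assert(nr >= 0 && nr < DEMO_BITS);
    *word |= 1UL << nr;
}

/* Plain clear, with the same single-owner requirement. */
static void demo_clear_bit_plain(int nr, unsigned long *word)
{
    assert(nr >= 0 && nr < DEMO_BITS);
    *word &= ~(1UL << nr);
}
```

The plain variants compile to ordinary load/or/store sequences, which is why they avoid the bus-locked instructions mentioned above.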

SUNRPC: use sv_lock to protect updates to sv_nrthreads. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 29 15:51:25 2021 +1100

    SUNRPC: use sv_lock to protect updates to sv_nrthreads.
    
    [ Upstream commit 2a36395fac3b72771f87c3ee4387e3a96d85a7cc ]
    
    Using sv_lock means we don't need to hold the service mutex over these
    updates.
    
    In particular, svc_exit_thread() no longer requires synchronisation, so
    threads can exit asynchronously.
    
    Note that we could use an atomic_t, but as there are many more read
    sites than writes, that would add unnecessary noise to the code.
    Some reads are already racy, and there is no need for them to not be.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

sysctl: introduce new proc handler proc_dobool [+ + +]

Author: Jia He <hejianet@gmail.com>
Date:   Tue Aug 3 12:59:36 2021 +0200

    sysctl: introduce new proc handler proc_dobool
    
    [ Upstream commit a2071573d6346819cc4e5787b4206f2184985160 ]
    
    This lets a bool variable be displayed correctly through sysctl
    procfs on both big- and little-endian systems. sizeof(bool) is arch
    dependent, so proc_dobool should work on all arches.
    
    Suggested-by: Pan Xinhui <xinhui@linux.vnet.ibm.com>
    Signed-off-by: Jia He <hejianet@gmail.com>
    [thuth: rebased the patch to the current kernel version]
    Signed-off-by: Thomas Huth <thuth@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
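
The problem being solved is that an int-based handler reads sizeof(int) bytes, which is wrong for a bool and endian-sensitive. A minimal user-space sketch of the general approach, widening the bool to an int before reusing integer formatting, might look like this. The demo_* functions are hypothetical and are not the kernel's proc_dobool().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical integer formatter, standing in for an int-based handler. */
static int demo_format_int(const int *val, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d\n", *val);
}

/* Read the bool as a bool, widen it to int, then reuse the int formatter. */
static int demo_format_bool(const bool *val, char *buf, size_t len)
{
    int tmp = *val;

    return demo_format_int(&tmp, buf, len);
}

/* Parse an integer from text and store it back as a bool. */
static int demo_parse_bool(const char *buf, bool *val)
{
    int tmp;

    if (sscanf(buf, "%d", &tmp) != 1)
        return -1;
    *val = (tmp != 0);
    return 0;
}
```

Because the conversion goes through a properly typed temporary, the result does not depend on sizeof(bool) or on byte order.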

UAPI: nfsfh.h: Replace one-element array with flexible-array member [+ + +]

Author: Gustavo A. R. Silva <gustavoars@kernel.org>
Date:   Tue Mar 23 17:48:58 2021 -0500

    UAPI: nfsfh.h: Replace one-element array with flexible-array member
    
    [ Upstream commit c0a744dcaa29e9537e8607ae9c965ad936124a4d ]
    
    There is a regular need in the kernel to provide a way to declare having
    a dynamically sized set of trailing elements in a structure. Kernel code
    should always use “flexible array members”[1] for these cases. The older
    style of one-element or zero-length arrays should no longer be used[2].
    
    Use an anonymous union with a couple of anonymous structs in order to
    keep userspace unchanged:
    
    $ pahole -C nfs_fhbase_new fs/nfsd/nfsfh.o
    struct nfs_fhbase_new {
            union {
                    struct {
                            __u8       fb_version_aux;       /*     0     1 */
                            __u8       fb_auth_type_aux;     /*     1     1 */
                            __u8       fb_fsid_type_aux;     /*     2     1 */
                            __u8       fb_fileid_type_aux;   /*     3     1 */
                            __u32      fb_auth[1];           /*     4     4 */
                    };                                       /*     0     8 */
                    struct {
                            __u8       fb_version;           /*     0     1 */
                            __u8       fb_auth_type;         /*     1     1 */
                            __u8       fb_fsid_type;         /*     2     1 */
                            __u8       fb_fileid_type;       /*     3     1 */
                            __u32      fb_auth_flex[0];      /*     4     0 */
                    };                                       /*     0     4 */
            };                                               /*     0     8 */
    
            /* size: 8, cachelines: 1, members: 1 */
            /* last cacheline: 8 bytes */
    };
    
    Also, this helps with the ongoing efforts to enable -Warray-bounds by
    fixing the following warnings:
    
    fs/nfsd/nfsfh.c: In function ‘nfsd_set_fh_dentry’:
    fs/nfsd/nfsfh.c:191:41: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
      191 |        ntohl((__force __be32)fh->fh_fsid[1])));
          |                              ~~~~~~~~~~~^~~
    ./include/linux/kdev_t.h:12:46: note: in definition of macro ‘MKDEV’
       12 | #define MKDEV(ma,mi) (((ma) << MINORBITS) | (mi))
          |                                              ^~
    ./include/uapi/linux/byteorder/little_endian.h:40:26: note: in expansion of macro ‘__swab32’
       40 | #define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x))
          |                          ^~~~~~~~
    ./include/linux/byteorder/generic.h:136:21: note: in expansion of macro ‘__be32_to_cpu’
      136 | #define ___ntohl(x) __be32_to_cpu(x)
          |                     ^~~~~~~~~~~~~
    ./include/linux/byteorder/generic.h:140:18: note: in expansion of macro ‘___ntohl’
      140 | #define ntohl(x) ___ntohl(x)
          |                  ^~~~~~~~
    fs/nfsd/nfsfh.c:191:8: note: in expansion of macro ‘ntohl’
      191 |        ntohl((__force __be32)fh->fh_fsid[1])));
          |        ^~~~~
    fs/nfsd/nfsfh.c:192:32: warning: array subscript 2 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
      192 |    fh->fh_fsid[1] = fh->fh_fsid[2];
          |                     ~~~~~~~~~~~^~~
    fs/nfsd/nfsfh.c:192:15: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
      192 |    fh->fh_fsid[1] = fh->fh_fsid[2];
          |    ~~~~~~~~~~~^~~
    
    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays
    
    Link: https://github.com/KSPP/linux/issues/79
    Link: https://github.com/KSPP/linux/issues/109
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
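
The layout claim in the pahole output above can be checked with a small stand-alone sketch. struct demo_fhbase is a hypothetical copy with shortened member names, not the UAPI structure. It assumes a compiler that accepts C11 anonymous structs/unions and the GNU C zero-length array extension (GCC and Clang do), and an ABI where uint32_t is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_fhbase {
    union {
        struct {
            uint8_t  version_aux;
            uint8_t  auth_type_aux;
            uint8_t  fsid_type_aux;
            uint8_t  fileid_type_aux;
            uint32_t auth[1];       /* legacy one-element array */
        };
        struct {
            uint8_t  version;
            uint8_t  auth_type;
            uint8_t  fsid_type;
            uint8_t  fileid_type;
            uint32_t auth_flex[0];  /* zero-length, as shown by pahole */
        };
    };
};

/* The legacy member keeps the overall size at 8 bytes for userspace. */
static_assert(sizeof(struct demo_fhbase) == 8,
              "size must stay unchanged");

/* Both array members start at the same offset, so they alias. */
static_assert(offsetof(struct demo_fhbase, auth) ==
              offsetof(struct demo_fhbase, auth_flex),
              "old and new array members must overlap");
```

Code that indexes past the first element would use the zero-length member, while existing users of the one-element member see the same size and offsets as before.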