Changelog in Linux kernel 6.6.72

 
ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Fri Dec 20 19:13:52 2024 +0100

    ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
    
    commit 66d337fede44dcbab4107d37684af8fcab3d648e upstream.
    
    Like the Vivobook X1704VAP, the X1504VAP has its keyboard IRQ (1) described
    as ActiveLow in the DSDT. The kernel overrides this to EdgeHigh, which
    breaks the keyboard.
    
    Add the X1504VAP to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20241220181352.25974-1-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Sat Dec 28 17:48:45 2024 +0100

    ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
    
    commit 7ed4e4a659d99499dc6968c61970d41b64feeac0 upstream.
    
    The TongFang GM5HG0A is a TongFang barebone design which is sold under
    various brand names.
    
    The ACPI IRQ override for the keyboard IRQ must be used on these AMD Zen
    laptops in order for the IRQ to work.
    
    At least on the SKIKK Vanaheim variant the DMI product- and board-name
    strings have been replaced by the OEM with "Vanaheim" so checking that
    board-name contains "GM5HG0A" as is usually done for TongFang barebones
    quirks does not work.
    
    The DMI OEM strings do contain "GM5HG0A". I have looked at the dmidecode
    for a few other TongFang devices and the TongFang code-name string being
    in the OEM strings seems to be something which is consistently true.
    
    Add a quirk checking that one of the DMI_OEM_STRING(s) is "GM5HG0A" in the
    hope that this will work for other OEM versions of the "GM5HG0A" too.
    
    Link: https://www.skikk.eu/en/laptops/vanaheim-15-rtx-4060
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219614
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://patch.msgid.link/20241228164845.42381-1-hdegoede@redhat.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
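
The matching problem described in this commit can be modelled in userspace as follows. This is a minimal sketch, not the kernel's quirk table: `has_oem_string` is a hypothetical helper, and the sample DMI values are illustrative apart from "GM5HG0A" and "Vanaheim", which come from the commit message.

```python
def has_oem_string(oem_strings, code_name):
    """Return True if any OEM string equals the barebone code name."""
    return any(s == code_name for s in oem_strings)

# Hypothetical DMI data for a rebranded variant: the OEM replaced the
# board name, but the TongFang code name survives in the OEM strings.
board_name = "Vanaheim"
oem_strings = ["GM5HG0A", "Vanaheim"]

# The usual check (board name contains the code name) misses the device.
board_name_matches = "GM5HG0A" in board_name

# Checking the OEM strings instead still finds it.
oem_string_matches = has_oem_string(oem_strings, "GM5HG0A")
```

Here the board-name check fails while the OEM-string check succeeds, which is the situation the quirk is meant to cover.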

 
afs: Fix the maximum cell name length
Author: David Howells <dhowells@redhat.com>
Date:   Mon Jan 6 16:21:00 2025 +0000

    afs: Fix the maximum cell name length
    
    [ Upstream commit 8fd56ad6e7c90ac2bddb0741c6b248c8c5d56ac8 ]
    
    The kafs filesystem limits the maximum length of a cell name to 256 bytes,
    but a problem occurs if someone actually uses a name that long: kafs tries
    to create a directory under /proc/net/afs/ with the name of the cell, but
    that fails with a warning:
    
            WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405
    
    because procfs limits the maximum filename length to 255.
    
    However, the DNS limits the maximum lookup length and, by extension, the
    maximum cell name, to 255 less two (length count and trailing NUL).
    
    Fix this by limiting the maximum acceptable cellname length to 253.  This
    also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too.
    
    Further, split the YFS VL record cell name maximum to be the 256 allowed by
    the protocol and ignore the record retrieved by YFSVL.GetCellName if it
    exceeds 253.
    
    Fixes: c3e9f888263b ("afs: Implement client support for the YFSVL.GetCellName RPC op")
    Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk
    Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
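
The length arithmetic in this commit can be checked with a small userspace sketch. It is an illustration only, not kernel code: the constant and function names are chosen for this example, and it assumes the same 255-byte limit applies to the "/afs/.<cell>/" path component.

```python
NAME_MAX = 255        # procfs filename limit quoted in the commit message
DNS_LOOKUP_MAX = 255  # DNS lookup length limit quoted in the commit message
DNS_OVERHEAD = 2      # length count + trailing NUL

# Maximum acceptable cell name length after the fix.
MAX_CELL_NAME = DNS_LOOKUP_MAX - DNS_OVERHEAD

# Maximum the YFS VL protocol allows in a GetCellName record.
YFS_VL_MAX_CELL_NAME = 256


def cell_name_acceptable(name: str) -> bool:
    """Accept only non-empty names that fit the post-fix limit."""
    length = len(name.encode())
    return 0 < length <= MAX_CELL_NAME


longest = "a" * MAX_CELL_NAME
# Both the procfs directory name and the dotted mountpoint name fit.
fits_procfs = len(longest) <= NAME_MAX
fits_dot_mount = len("." + longest) <= NAME_MAX
```

A name of 254 or more bytes is rejected, even though a YFS VL record may legitimately carry up to 256.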

 
arm64: dts: rockchip: add hevc power domain clock to rk3328
Author: Peter Geis <pgwipeout@gmail.com>
Date:   Sat Dec 14 22:43:39 2024 +0000

    arm64: dts: rockchip: add hevc power domain clock to rk3328
    
    [ Upstream commit 3699f2c43ea9984e00d70463f8c29baaf260ea97 ]
    
    There is a race condition at startup between disabling unused power
    domains and disabling unused clocks on the rk3328. When the clocks are
    disabled first, the hevc power domain fails to shut off, leading to a
    splat of failures. Add the hevc core clock to the rk3328 power domain
    node to prevent this condition.
    
    rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... }
    1087 jiffies s: 89 root: 0x8/.
    rcu: blocking rcu_node structures (internal RCU debug):
    Sending NMI from CPU 0 to CPUs 3:
    NMI backtrace for cpu 3
    CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53
    Hardware name: Firefly ROC-RK3328-CC (DT)
    Workqueue: pm genpd_power_off_work_fn
    pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : regmap_unlock_spinlock+0x18/0x30
    lr : regmap_read+0x60/0x88
    sp : ffff800081123c00
    x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000
    x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040
    x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74
    x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000
    x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d
    x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431
    x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0
    x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b
    x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40
    x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000
    Call trace:
    regmap_unlock_spinlock+0x18/0x30
    rockchip_pmu_set_idle_request+0xac/0x2c0
    rockchip_pd_power+0x144/0x5f8
    rockchip_pd_power_off+0x1c/0x30
    _genpd_power_off+0x9c/0x180
    genpd_power_off.part.0.isra.0+0x130/0x2a8
    genpd_power_off_work_fn+0x6c/0x98
    process_one_work+0x170/0x3f0
    worker_thread+0x290/0x4a8
    kthread+0xec/0xf8
    ret_from_fork+0x10/0x20
    rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220
    
    Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
    Signed-off-by: Peter Geis <pgwipeout@gmail.com>
    Reviewed-by: Dragan Simic <dsimic@manjaro.org>
    Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ARM: dts: imxrt1050: Fix clocks for mmc
Author: Jesse Taube <Mr.Bossman075@gmail.com>
Date:   Mon Nov 18 10:36:41 2024 -0500

    ARM: dts: imxrt1050: Fix clocks for mmc
    
    [ Upstream commit 5f122030061db3e5d2bddd9cf5c583deaa6c54ff ]
    
    One of the usdhc1 controller's clocks should be IMXRT1050_CLK_AHB_PODF not
    IMXRT1050_CLK_OSC.
    
    Fixes: 1c4f01be3490 ("ARM: dts: imx: Add i.MXRT1050-EVK support")
    Signed-off-by: Jesse Taube <Mr.Bossman075@gmail.com>
    Signed-off-by: Shawn Guo <shawnguo@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ASoC: mediatek: disable buffer pre-allocation
Author: Chen-Yu Tsai <wenst@chromium.org>
Date:   Thu Dec 19 18:53:02 2024 +0800

    ASoC: mediatek: disable buffer pre-allocation
    
    [ Upstream commit 32c9c06adb5b157ef259233775a063a43746d699 ]
    
    On Chromebooks based on MediaTek MT8195 or MT8188, the audio frontend
    (AFE) is limited to accessing a very small window (1 MiB) of memory,
    which is described as a reserved memory region in the device tree.
    
    On these two platforms, the maximum buffer size is given as 512 KiB.
    The MediaTek common code uses the same value for preallocations. This
    means that only the first two PCM substreams get preallocations, and
    then the whole space is exhausted, barring any other substreams from
    working. Since the substreams used are not always the first two, this
    means audio won't work correctly.
    
    This is observed on the MT8188 Geralt Chromebooks, on which the
    "mediatek,dai-link" property was dropped when it was upstreamed. That
    property causes the driver to only register the PCM substreams listed
    in the property, and in the order given.
    
    Instead of trying to compute an optimal value and figuring out which
    streams are used, simply disable preallocation. The PCM buffers are
    managed by the core and are allocated and released on the fly. There
    should be no impact to any of the other MediaTek platforms.
    
    Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Link: https://patch.msgid.link/20241219105303.548437-1-wenst@chromium.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
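
The exhaustion described in this commit follows directly from the two sizes it quotes. The sketch below only restates that arithmetic; the function name is hypothetical and nothing here reflects the driver's actual code.

```python
KIB = 1024
MIB = 1024 * KIB

reserved_window = 1 * MIB           # memory the AFE can access
prealloc_per_substream = 512 * KIB  # maximum buffer size, reused for preallocation


def substreams_with_prealloc(window: int, per_stream: int) -> int:
    """Number of substreams that can be preallocated before the window is full."""
    return window // per_stream


served = substreams_with_prealloc(reserved_window, prealloc_per_substream)
```

Only `served` substreams get a buffer up front, so any substream registered later has nothing left to allocate from.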

ASoC: rt722: add delay time to wait for the calibration procedure
Author: Shuming Fan <shumingf@realtek.com>
Date:   Wed Dec 18 17:13:07 2024 +0800

    ASoC: rt722: add delay time to wait for the calibration procedure
    
    [ Upstream commit c9e3ebdc52ebe028f238c9df5162ae92483bedd5 ]
    
    The calibration procedure needs some time to finish.
    This patch adds a delay to ensure that the calibration procedure
    completes correctly.
    
    Signed-off-by: Shuming Fan <shumingf@realtek.com>
    Link: https://patch.msgid.link/20241218091307.96656-1-shumingf@realtek.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Wed Jan 8 16:41:48 2025 +0800

    block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()
    
    [ Upstream commit fcede1f0a043ccefe9bc6ad57f12718e42f63f1d ]
    
    Our syzkaller reported the following UAF for v6.6:
    
    BUG: KASAN: slab-use-after-free in bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958
    Read of size 8 at addr ffff8881b57147d8 by task fsstress/232726
    
    CPU: 2 PID: 232726 Comm: fsstress Not tainted 6.6.0-g3629d1885222 #39
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0x91/0xf0 lib/dump_stack.c:106
     print_address_description.constprop.0+0x66/0x300 mm/kasan/report.c:364
     print_report+0x3e/0x70 mm/kasan/report.c:475
     kasan_report+0xb8/0xf0 mm/kasan/report.c:588
     hlist_add_head include/linux/list.h:1023 [inline]
     bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958
     bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
     bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
     blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
     blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
     __submit_bio+0xa0/0x6b0 block/blk-core.c:639
     __submit_bio_noacct_mq block/blk-core.c:718 [inline]
     submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
     submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
     __ext4_read_bh fs/ext4/super.c:205 [inline]
     ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230
     __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567
     ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947
     ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182
     ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660
     ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569
     iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91
     iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80
     ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051
     ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220
     do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811
     __do_sys_ioctl fs/ioctl.c:869 [inline]
     __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
     entry_SYSCALL_64_after_hwframe+0x78/0xe2
    
    Allocated by task 232719:
     kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
     kasan_set_track+0x25/0x30 mm/kasan/common.c:52
     __kasan_slab_alloc+0x87/0x90 mm/kasan/common.c:328
     kasan_slab_alloc include/linux/kasan.h:188 [inline]
     slab_post_alloc_hook mm/slab.h:768 [inline]
     slab_alloc_node mm/slub.c:3492 [inline]
     kmem_cache_alloc_node+0x1b8/0x6f0 mm/slub.c:3537
     bfq_get_queue+0x215/0x1f00 block/bfq-iosched.c:5869
     bfq_get_bfqq_handle_split+0x167/0x5f0 block/bfq-iosched.c:6776
     bfq_init_rq+0x13a4/0x17a0 block/bfq-iosched.c:6938
     bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
     bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
     blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
     blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
     __submit_bio+0xa0/0x6b0 block/blk-core.c:639
     __submit_bio_noacct_mq block/blk-core.c:718 [inline]
     submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
     submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
     __ext4_read_bh fs/ext4/super.c:205 [inline]
     ext4_read_bh_nowait+0x15a/0x240 fs/ext4/super.c:217
     ext4_read_bh_lock+0xac/0xd0 fs/ext4/super.c:242
     ext4_bread_batch+0x268/0x500 fs/ext4/inode.c:958
     __ext4_find_entry+0x448/0x10f0 fs/ext4/namei.c:1671
     ext4_lookup_entry fs/ext4/namei.c:1774 [inline]
     ext4_lookup.part.0+0x359/0x6f0 fs/ext4/namei.c:1842
     ext4_lookup+0x72/0x90 fs/ext4/namei.c:1839
     __lookup_slow+0x257/0x480 fs/namei.c:1696
     lookup_slow fs/namei.c:1713 [inline]
     walk_component+0x454/0x5c0 fs/namei.c:2004
     link_path_walk.part.0+0x773/0xda0 fs/namei.c:2331
     link_path_walk fs/namei.c:3826 [inline]
     path_openat+0x1b9/0x520 fs/namei.c:3826
     do_filp_open+0x1b7/0x400 fs/namei.c:3857
     do_sys_openat2+0x5dc/0x6e0 fs/open.c:1428
     do_sys_open fs/open.c:1443 [inline]
     __do_sys_openat fs/open.c:1459 [inline]
     __se_sys_openat fs/open.c:1454 [inline]
     __x64_sys_openat+0x148/0x200 fs/open.c:1454
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
     entry_SYSCALL_64_after_hwframe+0x78/0xe2
    
    Freed by task 232726:
     kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
     kasan_set_track+0x25/0x30 mm/kasan/common.c:52
     kasan_save_free_info+0x2b/0x50 mm/kasan/generic.c:522
     ____kasan_slab_free mm/kasan/common.c:236 [inline]
     __kasan_slab_free+0x12a/0x1b0 mm/kasan/common.c:244
     kasan_slab_free include/linux/kasan.h:164 [inline]
     slab_free_hook mm/slub.c:1827 [inline]
     slab_free_freelist_hook mm/slub.c:1853 [inline]
     slab_free mm/slub.c:3820 [inline]
     kmem_cache_free+0x110/0x760 mm/slub.c:3842
     bfq_put_queue+0x6a7/0xfb0 block/bfq-iosched.c:5428
     bfq_forget_entity block/bfq-wf2q.c:634 [inline]
     bfq_put_idle_entity+0x142/0x240 block/bfq-wf2q.c:645
     bfq_forget_idle+0x189/0x1e0 block/bfq-wf2q.c:671
     bfq_update_vtime block/bfq-wf2q.c:1280 [inline]
     __bfq_lookup_next_entity block/bfq-wf2q.c:1374 [inline]
     bfq_lookup_next_entity+0x350/0x480 block/bfq-wf2q.c:1433
     bfq_update_next_in_service+0x1c0/0x4f0 block/bfq-wf2q.c:128
     bfq_deactivate_entity+0x10a/0x240 block/bfq-wf2q.c:1188
     bfq_deactivate_bfqq block/bfq-wf2q.c:1592 [inline]
     bfq_del_bfqq_busy+0x2e8/0xad0 block/bfq-wf2q.c:1659
     bfq_release_process_ref+0x1cc/0x220 block/bfq-iosched.c:3139
     bfq_split_bfqq+0x481/0xdf0 block/bfq-iosched.c:6754
     bfq_init_rq+0xf29/0x17a0 block/bfq-iosched.c:6934
     bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
     bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
     blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
     blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
     __submit_bio+0xa0/0x6b0 block/blk-core.c:639
     __submit_bio_noacct_mq block/blk-core.c:718 [inline]
     submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
     submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
     __ext4_read_bh fs/ext4/super.c:205 [inline]
     ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230
     __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567
     ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947
     ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182
     ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660
     ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569
     iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91
     iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80
     ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051
     ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220
     do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811
     __do_sys_ioctl fs/ioctl.c:869 [inline]
     __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
     entry_SYSCALL_64_after_hwframe+0x78/0xe2
    
    Commit 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after
    splitting") fixed the problem that if waker_bfqq is in the merge chain,
    and current is the only process, waker_bfqq can be freed from
    bfq_split_bfqq(). However, the case where waker_bfqq is not in the merge
    chain was missed, and if the process reference of waker_bfqq is 0,
    waker_bfqq can be freed as well.
    
    Fix the problem by checking the process reference if waker_bfqq is not in
    the merge chain.
    
    Fixes: 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after splitting")
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20250108084148.1549973-1-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Bluetooth: btnxpuart: Fix driver sending truncated data
Author: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Date:   Fri Dec 20 18:32:52 2024 +0530

    Bluetooth: btnxpuart: Fix driver sending truncated data
    
    [ Upstream commit 8023dd2204254a70887f5ee58d914bf70a060b9d ]
    
    This fixes the apparent controller hang issue seen during stress tests,
    where the host sends a truncated payload, followed by HCI commands. The
    controller treats these HCI commands as part of the previously truncated
    payload, leading to command timeouts.
    
    Adding a serdev_device_wait_until_sent() call after
    serdev_device_write_buf() fixed the issue.
    
    Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets")
    Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_sync: Fix not setting Random Address when required
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Mon Nov 25 15:42:09 2024 -0500

    Bluetooth: hci_sync: Fix not setting Random Address when required
    
    [ Upstream commit c2994b008492db033d40bd767be1620229a3035e ]
    
    This fixes errors such as the following, which occur when Own address
    type is set to Random Address but the address has not been programmed
    yet because the controller is either advertising or connecting:
    
    < HCI Command: LE Set Exte.. (0x08|0x0041) plen 13
            Own address type: Random (0x03)
            Filter policy: Ignore not in accept list (0x01)
            PHYs: 0x05
            Entry 0: LE 1M
              Type: Passive (0x00)
              Interval: 60.000 msec (0x0060)
              Window: 30.000 msec (0x0030)
            Entry 1: LE Coded
              Type: Passive (0x00)
              Interval: 180.000 msec (0x0120)
              Window: 90.000 msec (0x0090)
    > HCI Event: Command Complete (0x0e) plen 4
          LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1
            Status: Success (0x00)
    < HCI Command: LE Set Exten.. (0x08|0x0042) plen 6
            Extended scan: Enabled (0x01)
            Filter duplicates: Enabled (0x01)
            Duration: 0 msec (0x0000)
            Period: 0.00 sec (0x0000)
    > HCI Event: Command Complete (0x0e) plen 4
          LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
            Status: Invalid HCI Command Parameters (0x12)
    
    Fixes: c45074d68a9b ("Bluetooth: Fix not generating RPA when required")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: MGMT: Fix Add Device to responding before completing
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Mon Nov 25 15:42:10 2024 -0500

    Bluetooth: MGMT: Fix Add Device to responding before completing
    
    [ Upstream commit a182d9c84f9c52fb5db895ecceeee8b3a1bf661e ]
    
    Add Device with LE type requires updating the resolving/accept list,
    which takes quite a number of commands to complete, and each of them may
    fail. Instead of pretending it would always work, check the return value
    of hci_update_passive_scan_sync(), which indicates whether everything
    worked as intended.
    
    Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bnxt_en: Fix possible memory leak when hwrm_req_replace fails
Author: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Date:   Fri Jan 3 20:38:47 2025 -0800

    bnxt_en: Fix possible memory leak when hwrm_req_replace fails
    
    [ Upstream commit c8dafb0e4398dacc362832098a04b97da3b0395b ]
    
    When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop()
    which could cause a memory leak.
    
    Fixes: bbf33d1d9805 ("bnxt_en: update all firmware calls to use the new APIs")
    Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
    Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
btrfs: avoid NULL pointer dereference if no valid extent tree
Author: Qu Wenruo <wqu@suse.com>
Date:   Thu Jan 2 14:44:16 2025 +1030

    btrfs: avoid NULL pointer dereference if no valid extent tree
    
    [ Upstream commit 6aecd91a5c5b68939cf4169e32bc49f3cd2dd329 ]
    
    [BUG]
    Syzbot reported a crash with the following call trace:
    
      BTRFS info (device loop0): scrub: started on devid 1
      BUG: kernel NULL pointer dereference, address: 0000000000000208
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0
      Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G           O       6.13.0-rc4-custom+ #206
      Tainted: [O]=OOT_MODULE
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
      RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs]
      Call Trace:
       <TASK>
       scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs]
       scrub_simple_mirror+0x175/0x260 [btrfs]
       scrub_stripe+0x5d4/0x6c0 [btrfs]
       scrub_chunk+0xbb/0x170 [btrfs]
       scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs]
       btrfs_scrub_dev+0x240/0x600 [btrfs]
       btrfs_ioctl+0x1dc8/0x2fa0 [btrfs]
       ? do_sys_openat2+0xa5/0xf0
       __x64_sys_ioctl+0x97/0xc0
       do_syscall_64+0x4f/0x120
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
       </TASK>
    
    [CAUSE]
    The reproducer uses a corrupted image where the extent tree root is
    corrupted, which forces the use of the "rescue=all,ro" mount option to
    mount the image.
    
    It then triggers a scrub, but since scrub relies on the extent tree to
    find where the data/metadata extents are, scrub_find_fill_first_stripe()
    relies on a non-empty extent root.
    
    Unfortunately, scrub_find_fill_first_stripe() doesn't expect a NULL
    pointer for the extent root; it uses extent_root to grab fs_info, which
    triggers a NULL pointer dereference.
    
    [FIX]
    Add an extra check for a valid extent root at the beginning of
    scrub_find_fill_first_stripe().
    
    The new error path is introduced by 42437a6386ff ("btrfs: introduce
    mount option rescue=ignorebadroots"), but that's pretty old, and later
    commit b979547513ff ("btrfs: scrub: introduce helper to find and fill
    sector info for a scrub_stripe") changed how we do scrub.
    
    So for kernels older than 6.6, the fix will need manual backport.
    
    Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google.com/
    Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots")
    Reviewed-by: Anand Jain <anand.jain@oracle.com>
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Sat Nov 16 00:32:39 2024 +0100

    cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
    
    [ Upstream commit 7e25044b804581b9c029d5a28d8800aebde18043 ]
    
    The 'np' device_node is initialized via of_cpu_device_node_get(), which
    requires explicit calls to of_node_put() when it is no longer required
    to avoid leaking the resource.
    
    Instead of adding the missing calls to of_node_put() in all execution
    paths, use the cleanup attribute for 'np' by means of the __free()
    macro, which automatically calls of_node_put() when the variable goes
    out of scope. Given that 'np' is only used within the
    for_each_possible_cpu() loop, reduce its scope to release the node after
    every iteration of the loop.
    
    Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver")
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
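
The idea of tying the reference to a per-iteration scope can be illustrated with a rough userspace analogue. This is not how the kernel's __free() macro is implemented; the `Node` class, `cpu_node()` and `init_all()` are hypothetical stand-ins, and a Python context manager plays the role of the scope-based cleanup.

```python
from contextlib import contextmanager


class Node:
    """Stand-in for a reference-counted device node."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.refs = 0


@contextmanager
def cpu_node(node):
    node.refs += 1      # models taking the reference
    try:
        yield node
    finally:
        node.refs -= 1  # models dropping it when the scope ends


def init_all(nodes, fail_on=None):
    """Visit every node; return -1 early if one of them fails."""
    for node in nodes:
        with cpu_node(node) as np:
            if np.cpu == fail_on:
                return -1  # early exit: the reference is still dropped
    return 0


nodes = [Node(cpu) for cpu in range(4)]
result = init_all(nodes, fail_on=2)
leaked = [n.cpu for n in nodes if n.refs != 0]
```

Because the reference lives only inside the loop body, neither the normal path nor the early return can leave a node held.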

 
cxgb4: Avoid removal of uninserted tid
Author: Anumula Murali Mohan Reddy <anumula@chelsio.com>
Date:   Fri Jan 3 14:53:27 2025 +0530

    cxgb4: Avoid removal of uninserted tid
    
    [ Upstream commit 4c1224501e9d6c5fd12d83752f1c1b444e0e3418 ]
    
    During ARP failure, the tid is not inserted, but _c4iw_free_ep() still
    attempts to remove it, which results in an error.
    This patch fixes the issue by avoiding removal of an uninserted tid.
    
    Fixes: 59437d78f088 ("cxgb4/chtls: fix ULD connection failures due to wrong TID base")
    Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com>
    Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
    Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
dm array: fix cursor index when skipping across block boundaries
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date:   Thu Dec 5 19:41:53 2024 +0800

    dm array: fix cursor index when skipping across block boundaries
    
    [ Upstream commit 0bb1968da2737ba68fd63857d1af2b301a18d3bf ]
    
    dm_array_cursor_skip() seeks to the target position by loading array
    blocks iteratively until the specified number of entries to skip is
    reached. When seeking across block boundaries, it uses
    dm_array_cursor_next() to step into the next block.
    dm_array_cursor_skip() must first move the cursor index to the end
    of the current block; otherwise, the cursor position could incorrectly
    remain in the same block, causing the actual number of skipped entries
    to be much smaller than expected.
    
    This bug affects cache resizing in v2 metadata and could lead to data
    loss if the fast device is shrunk during the first-time resume. For
    example:
    
    1. create cache metadata consisting of 32768 blocks, with a dirty block
       assigned to the second bitmap block. cache_restore v1.0 is required.
    
    cat <<EOF >> cmeta.xml
    <superblock uuid="" block_size="64" nr_cache_blocks="32768" \
    policy="smq" hint_width="4">
      <mappings>
        <mapping cache_block="32767" origin_block="0" dirty="true"/>
      </mappings>
    </superblock>
    EOF
    dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
    cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2
    
    2. bring up the cache while attempting to discard all the blocks belonging
       to the second bitmap block (block# 32576 to 32767). The last command
       is expected to fail, but it actually succeeds.
    
    dmsetup create cdata --table "0 2084864 linear /dev/sdc 8192"
    dmsetup create corig --table "0 65536 linear /dev/sdc 2105344"
    dmsetup create cache --table "0 65536 cache /dev/mapper/cmeta \
    /dev/mapper/cdata /dev/mapper/corig 64 2 metadata2 writeback smq \
    2 migration_threshold 0"
    
    In addition to the reproducer described above, this fix can be
    verified using the "array_cursor/skip" tests in dm-unit:
      dm-unit run /pdata/array_cursor/skip/ --kernel-dir <KERNEL_DIR>
    
    Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
    Fixes: 9b696229aa7d ("dm persistent data: add cursor skip functions to the cursor APIs")
    Reviewed-by: Joe Thornber <thornber@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
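
The off-by-many behaviour described in this commit can be reproduced with a simplified userspace model. This is a sketch under stated assumptions, not the kernel implementation: the `ArrayCursor` class, its fixed-size in-memory blocks and the `fixed` flag are invented here to contrast the two behaviours.

```python
class ArrayCursor:
    """Toy cursor over an array stored as consecutive blocks of entries."""

    def __init__(self, block_sizes):
        self.block_sizes = block_sizes  # number of entries in each block
        self.block = 0                  # current block
        self.index = 0                  # index within the current block

    def position(self):
        """Absolute entry number the cursor points at."""
        return sum(self.block_sizes[:self.block]) + self.index

    def next(self):
        """Advance one entry, stepping into the next block when needed."""
        self.index += 1
        if self.index >= self.block_sizes[self.block]:
            if self.block + 1 >= len(self.block_sizes):
                raise IndexError("end of array")
            self.block += 1
            self.index = 0

    def skip(self, count, fixed=True):
        """Skip count entries, crossing block boundaries as required."""
        while True:
            remaining = self.block_sizes[self.block] - self.index
            if count < remaining:
                self.index += count
                return
            count -= remaining
            if fixed:
                # Move to the last entry first so next() really leaves
                # the current block.
                self.index += remaining - 1
            self.next()


good = ArrayCursor([4, 4, 4])
good.skip(6)

bad = ArrayCursor([4, 4, 4])
bad.skip(6, fixed=False)
```

With the index moved to the end of the block first, skipping 6 entries lands on entry 6; without that step the cursor stays in the first block and ends up at entry 3.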

dm array: fix releasing a faulty array block twice in dm_array_cursor_end
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date:   Thu Dec 5 19:41:51 2024 +0800

    dm array: fix releasing a faulty array block twice in dm_array_cursor_end
    
    [ Upstream commit f2893c0804d86230ffb8f1c8703fdbb18648abc8 ]
    
    When dm_bm_read_lock() fails due to locking or checksum errors, it
    releases the faulty block implicitly while leaving an invalid output
    pointer behind. The caller of dm_bm_read_lock() should not operate on
    this invalid dm_block pointer, or it will lead to undefined results.
    For example, the dm_array_cursor incorrectly caches the invalid pointer
    on reading a faulty array block, causing a double release in
    dm_array_cursor_end(), then hitting the BUG_ON in dm-bufio cache_put().
    
    Reproduce steps:
    
    1. initialize a cache device
    
    dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
    dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
    dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
    dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1
    dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
    /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
    
    2. wipe the second array block offline
    
    dmsetup remove cache cmeta cdata corig
    mapping_root=$(dd if=/dev/sdc bs=1c count=8 skip=192 \
    2>/dev/null | hexdump -e '1/8 "%u\n"')
    ablock=$(dd if=/dev/sdc bs=1c count=8 skip=$((4096*mapping_root+2056)) \
    2>/dev/null | hexdump -e '1/8 "%u\n"')
    dd if=/dev/zero of=/dev/sdc bs=4k count=1 seek=$ablock
    
    3. try to reopen the cache device
    
    dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
    dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
    dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
    dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
    /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
    
    Kernel logs:
    
    (snip)
    device-mapper: array: array_block_check failed: blocknr 0 != wanted 10
    device-mapper: block manager: array validator check failed for block 10
    device-mapper: array: get_ablock failed
    device-mapper: cache metadata: dm_array_cursor_next for mapping failed
    ------------[ cut here ]------------
    kernel BUG at drivers/md/dm-bufio.c:638!
    
    Fix by setting the cached block pointer to NULL on errors.
    
    In addition to the reproducer described above, this fix can be
    verified using the "array_cursor/damaged" test in dm-unit:
      dm-unit run /pdata/array_cursor/damaged --kernel-dir <KERNEL_DIR>
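
    The failure mode can be modelled outside the kernel. The Python sketch
    below is a hypothetical stand-in, not the dm-array code: BlockManager,
    Cursor, cursor_load and cursor_end are invented names, and a hold count
    that would drop below zero plays the role of the BUG_ON in cache_put().
    The "fixed" flag is an assumption added only to contrast both behaviours.

```python
class BlockManager:
    """Toy block manager that tracks a hold count per block number."""

    def __init__(self, bad_blocks=()):
        self.bad_blocks = set(bad_blocks)
        self.holds = {}

    def read_lock(self, blocknr):
        # On failure nothing valid is handed back to the caller.
        if blocknr in self.bad_blocks:
            raise OSError(f"validator check failed for block {blocknr}")
        self.holds[blocknr] = self.holds.get(blocknr, 0) + 1
        return blocknr

    def unlock(self, blocknr):
        if self.holds.get(blocknr, 0) == 0:
            # Stand-in for the BUG_ON hit on a double release.
            raise RuntimeError(f"double release of block {blocknr}")
        self.holds[blocknr] -= 1


class Cursor:
    def __init__(self, bm):
        self.bm = bm
        self.block = None  # cached block, None when nothing is held


def cursor_load(cursor, blocknr, fixed=True):
    """Drop the previously cached block, then try to cache `blocknr`."""
    if cursor.block is not None:
        cursor.bm.unlock(cursor.block)
        if fixed:
            # Clear the cache so a failed read cannot leave a stale value.
            cursor.block = None
    cursor.block = cursor.bm.read_lock(blocknr)


def cursor_end(cursor):
    """Release whatever the cursor still holds."""
    if cursor.block is not None:
        cursor.bm.unlock(cursor.block)
        cursor.block = None
```

    With fixed=False, a failed cursor_load() leaves the already released
    block cached, and cursor_end() then releases it a second time.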
    
    Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
    Fixes: fdd1315aa5f0 ("dm array: introduce cursor api")
    Reviewed-by: Joe Thornber <thornber@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dm array: fix unreleased btree blocks on closing a faulty array cursor [+ + +]
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date:   Thu Dec 5 19:41:52 2024 +0800

    dm array: fix unreleased btree blocks on closing a faulty array cursor
    
    [ Upstream commit 626f128ee9c4133b1cfce4be2b34a1508949370e ]
    
    The cached block pointer in dm_array_cursor might be NULL if it reaches
    an unreadable array block, or the array is empty. Therefore,
    dm_array_cursor_end() should call dm_btree_cursor_end() unconditionally,
    to prevent leaving unreleased btree blocks.
    
    This fix can be verified using the "array_cursor/iterate/empty" test
    in dm-unit:
      dm-unit run /pdata/array_cursor/iterate/empty --kernel-dir <KERNEL_DIR>
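
    A minimal Python model of the cleanup order, under stated assumptions:
    BtreeCursor, ArrayCursor and the helper functions are hypothetical names,
    and "held_nodes" merely counts btree blocks that are still locked. The
    point is only that ending the btree cursor must not depend on whether an
    array block happens to be cached.

```python
class BtreeCursor:
    """Toy btree cursor that remembers how many nodes it keeps locked."""

    def __init__(self):
        self.held_nodes = 0

    def begin(self, depth):
        self.held_nodes = depth

    def end(self):
        self.held_nodes = 0


class ArrayCursor:
    def __init__(self):
        self.btree = BtreeCursor()
        self.block = None  # cached array block, None if empty or unreadable


def array_cursor_begin(cursor, array_blocks, btree_depth=2):
    """Start iterating; an empty array leaves no cached block."""
    cursor.btree.begin(btree_depth)
    cursor.block = array_blocks[0] if array_blocks else None


def array_cursor_end(cursor):
    """Release the cached block if any, then always end the btree cursor."""
    if cursor.block is not None:
        cursor.block = None
    cursor.btree.end()
```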
    
    Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
    Fixes: fdd1315aa5f0 ("dm array: introduce cursor api")
    Reviewed-by: Joe Thornber <thornber@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
dm thin: make get_first_thin use rcu-safe list first function [+ + +]
Author: Krister Johansen <kjlx@templeofstupid.com>
Date:   Tue Jan 7 15:24:58 2025 -0800

    dm thin: make get_first_thin use rcu-safe list first function
    
    commit 80f130bfad1dab93b95683fc39b87235682b8f72 upstream.
    
    The documentation in rculist.h explains the absence of list_empty_rcu()
    and cautions programmers against relying on a list_empty() ->
    list_first() sequence in RCU safe code.  This is because each of these
    functions performs its own READ_ONCE() of the list head.  This can lead
    to a situation where the list_empty() sees a valid list entry, but the
    subsequent list_first() sees a different view of list head state after a
    modification.
    
    In the case of dm-thin, this author had a production box crash from a GP
    fault in the process_deferred_bios path.  This function saw a valid list
    head in get_first_thin() but when it subsequently dereferenced that and
    turned it into a thin_c, it got the inside of the struct pool, since the
    list was now empty and referring to itself.  The kernel on which this
    occurred printed both a warning about a refcount_t being saturated, and
    a UBSAN error for an out-of-bounds cpuid access in the queued spinlock,
    prior to the fault itself.  When the resulting kdump was examined, it
    was possible to see another thread patiently waiting in thin_dtr's
    synchronize_rcu.
    
    The thin_dtr call managed to pull the thin_c out of the active thins
    list (and have it be the last entry in the active_thins list) at just
    the wrong moment which led to this crash.
    
    Fortunately, the fix here is straightforward.  Switch the get_first_thin()
    function to use list_first_or_null_rcu() which performs just a single
    READ_ONCE() and returns NULL if the list is already empty.
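
    The difference between the two access patterns can be replayed
    deterministically in user space. The Python sketch below is only a model:
    Node, list_add_tail, list_del and the two lookup functions are
    hypothetical, and the "between" callback stands in for a writer that
    removes the last entry at exactly the wrong moment.

```python
class Node:
    """Entry of a circular doubly linked list; a lone node points to itself."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def list_add_tail(node, head):
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def list_del(node):
    # Unlink the node but keep its own pointers, so a reader that already
    # holds a reference can still use it.
    node.prev.next = node.next
    node.next.prev = node.prev


def first_two_reads(head, between=None):
    """Emptiness check followed by a second, separate read of head.next."""
    if head.next is head:      # read 1
        return None
    if between:
        between()              # a concurrent writer runs here
    return head.next           # read 2: may now be the head itself


def first_single_read(head, between=None):
    """One read of head.next, then a comparison against the head."""
    first = head.next          # the only read
    if between:
        between()
    return first if first is not head else None
```

    In the two-read variant the caller gets the list head back and would
    treat it as an entry, which matches the crash described above. In the
    single-read variant the caller gets either the entry it saw or None.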
    
    This was run against the devicemapper test suite's thin-provisioning
    suites for delete and suspend and no regressions were observed.
    
    Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
    Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep")
    Cc: stable@vger.kernel.org
    Acked-by: Ming-Hung Tsai <mtsai@redhat.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY [+ + +]
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Tue Jan 7 17:47:01 2025 +0100

    dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY
    
    commit 47f33c27fc9565fb0bc7dfb76be08d445cd3d236 upstream.
    
    dm-ebs uses dm-bufio to process requests that are not aligned on logical
    sector size. dm-bufio doesn't support passing integrity data (and it is
    unclear how it should do so), so we shouldn't set the
    DM_TARGET_PASSES_INTEGRITY flag.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Cc: stable@vger.kernel.org
    Fixes: d3c7b35c20d6 ("dm: add emulated block size target")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2) [+ + +]
Author: Milan Broz <gmazyland@gmail.com>
Date:   Wed Dec 18 13:56:58 2024 +0100

    dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2)
    
    commit 6df90c02bae468a3a6110bafbc659884d0c4966c upstream.
    
    This patch fixes an issue that was fixed in the commit
      df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size")
    but later broken again in the commit
      8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
    
    If the Reed-Solomon roots setting spans multiple blocks, the code does not
    use proper parity bytes and randomly fails to repair even trivial errors.
    
    This bug cannot happen if the sector size is a multiple of the RS roots
    setting (Android case with roots 2).
    
    The previous solution was to find a dm-bufio block size that is a multiple
    of the device sector size and roots size. Unfortunately, the optimization
    in commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
    is incorrect and uses data block size for some roots (for example, it uses
    4096 block size for roots = 20).
    
    This patch uses a different approach:
    
     - It always uses a configured data block size for dm-bufio to avoid
     possible misaligned IOs.
    
     - and it caches the processed parity bytes, so it can join it
     if it spans two blocks.
    
    As the RS calculation is called only if an error is detected and
    the process is computationally intensive, copying a few more bytes
    should not introduce performance issues.
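
    The alignment problem is plain arithmetic. The Python sketch below is an
    illustration only: it assumes the parity bytes of consecutive RS
    codewords are stored back to back, "roots" bytes each, which simplifies
    the real FEC layout, and parity_location and read_parity are hypothetical
    helpers rather than dm-verity functions.

```python
def parity_location(rs_index, roots, block_size):
    """Return (block, offset, spans) for the parity of one RS codeword."""
    start = rs_index * roots
    block, offset = divmod(start, block_size)
    # The parity spills into the next block when it does not fit.
    spans = offset + roots > block_size
    return block, offset, spans


def read_parity(blocks, rs_index, roots, block_size):
    """Collect `roots` parity bytes, joining two blocks when needed."""
    block, offset, spans = parity_location(rs_index, roots, block_size)
    head = blocks[block][offset:offset + roots]
    if not spans:
        return head
    missing = roots - len(head)
    return head + blocks[block + 1][:missing]
```

    With roots = 20 and 4096-byte blocks, codeword 204 starts at offset 4080
    and needs 4 bytes from the following block. With roots = 2 no codeword
    ever crosses a block boundary, because 4096 is a multiple of 2.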
    
    The issue was reported to cryptsetup with trivial reproducer
      https://gitlab.com/cryptsetup/cryptsetup/-/issues/923
    
    Reproducer (with roots=20):
    
     # create verity device with RS FEC
     dd if=/dev/urandom of=data.img bs=4096 count=8 status=none
     veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 | \
     awk '/^Root hash/{ print $3 }' >roothash
    
     # create an erasure that should always be repairable with this roots setting
     dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none
    
     # try to read it through dm-verity
     veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash)
     dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer
    
     Even now the log says it cannot repair it:
       : verity-fec: 7:1: FEC 0: failed to correct: -74
       : device-mapper: verity: 7:1: data block 0 is corrupted
       ...
    
    With this fix, errors are properly repaired.
       : verity-fec: 7:1: FEC 0: corrected 4 errors
    
    Signed-off-by: Milan Broz <gmazyland@gmail.com>
    Fixes: 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amd/display: Add check for granularity in dml ceil/floor helpers [+ + +]
Author: Roman Li <Roman.Li@amd.com>
Date:   Fri Dec 13 13:51:07 2024 -0500

    drm/amd/display: Add check for granularity in dml ceil/floor helpers
    
    commit 0881fbc4fd62e00a2b8e102725f76d10351b2ea8 upstream.
    
    [Why]
    Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2()
    should check that granularity is non-zero to avoid an assert and
    a divide-by-zero error in the dcn_bw_ functions.
    
    [How]
    Add check for granularity 0.
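
    The guard amounts to an early return. A minimal Python sketch follows;
    floor_to and ceil_to are hypothetical names, and returning 0 for a zero
    granularity is an assumption of this sketch, since the commit message
    does not state the value that is returned.

```python
import math


def floor_to(value, granularity):
    """Round `value` down to a multiple of `granularity`."""
    if granularity == 0:
        # Assumed fallback: avoid dividing by zero.
        return 0.0
    return math.floor(value / granularity) * granularity


def ceil_to(value, granularity):
    """Round `value` up to a multiple of `granularity`."""
    if granularity == 0:
        # Assumed fallback: avoid dividing by zero.
        return 0.0
    return math.ceil(value / granularity) * granularity
```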
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
    Signed-off-by: Roman Li <Roman.Li@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit f6e09701c3eb2ccb8cb0518e0b67f1c69742a4ec)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: increase MAX_SURFACES to the value supported by hw [+ + +]
Author: Melissa Wen <mwen@igalia.com>
Date:   Tue Dec 17 17:45:04 2024 -0300

    drm/amd/display: increase MAX_SURFACES to the value supported by hw
    
    commit 21541bc6b44241e3f791f9e552352d8440b2b29e upstream.
    
    As the hw supports up to 4 surfaces, increase the maximum number of
    surfaces to prevent the DC error when trying to use more than three
    planes.
    
    [drm:dc_state_add_plane [amdgpu]] *ERROR* Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693
    Signed-off-by: Melissa Wen <mwen@igalia.com>
    Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit b8d6daffc871a42026c3c20bff7b8fa0302298c1)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amdkfd: fixed page fault when enable MES shader debugger [+ + +]
Author: Jesse.zhang@amd.com <Jesse.zhang@amd.com>
Date:   Wed Dec 18 18:23:52 2024 +0800

    drm/amdkfd: fixed page fault when enable MES shader debugger
    
    commit 9738609449c3e44d1afb73eecab4763362b57930 upstream.
    
    Initialize the process context address before setting the shader debugger.
    
    [  260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
    [  260.781236] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 10
    [  260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40
    [  260.781270] amdgpu 0000:03:00.0: amdgpu:      Faulty UTCL2 client ID: CPC (0x5)
    [  260.781284] amdgpu 0000:03:00.0: amdgpu:      MORE_FAULTS: 0x0
    [  260.781296] amdgpu 0000:03:00.0: amdgpu:      WALKER_ERROR: 0x0
    [  260.781308] amdgpu 0000:03:00.0: amdgpu:      PERMISSION_FAULTS: 0x4
    [  260.781320] amdgpu 0000:03:00.0: amdgpu:      MAPPING_ERROR: 0x0
    [  260.781332] amdgpu 0000:03:00.0: amdgpu:      RW: 0x1
    [  260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
    [  260.782039] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 10
    [  260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41
    [  260.782073] amdgpu 0000:03:00.0: amdgpu:      Faulty UTCL2 client ID: CPC (0x5)
    [  260.782087] amdgpu 0000:03:00.0: amdgpu:      MORE_FAULTS: 0x1
    [  260.782098] amdgpu 0000:03:00.0: amdgpu:      WALKER_ERROR: 0x0
    [  260.782110] amdgpu 0000:03:00.0: amdgpu:      PERMISSION_FAULTS: 0x4
    [  260.782122] amdgpu 0000:03:00.0: amdgpu:      MAPPING_ERROR: 0x0
    [  260.782137] amdgpu 0000:03:00.0: amdgpu:      RW: 0x1
    [  260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
    [  260.782166] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 10
    
    Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849
    Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 5b231f5bc9ff02ec5737f2ec95cdf15ac95088e9)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/mediatek: Add return value check when reading DPCD [+ + +]
Author: Liankun Yang <liankun.yang@mediatek.com>
Date:   Wed Dec 18 19:34:07 2024 +0800

    drm/mediatek: Add return value check when reading DPCD
    
    [ Upstream commit 522908140645865dc3e2fac70fd3b28834dfa7be ]
    
    Check the return value of drm_dp_dpcd_readb() to confirm that
    AUX communication is successful. To simplify the code, replace
    drm_dp_dpcd_readb() and DP_GET_SINK_COUNT() with drm_dp_read_sink_count().
    
    Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
    Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
    Reviewed-by: Guillaume Ranquet <granquet@baylibre.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218113448.2992-1-liankun.yang@mediatek.com/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mediatek: Fix mode valid issue for dp [+ + +]
Author: Liankun Yang <liankun.yang@mediatek.com>
Date:   Fri Oct 25 16:28:28 2024 +0800

    drm/mediatek: Fix mode valid issue for dp
    
    [ Upstream commit 0d68b55887cedc7487036ed34cb4c2097c4228f1 ]
    
    Fix dp mode valid issue to avoid abnormal display of limit state.
    
    Once DP has passed link training, the lane count of the current link
    is known to be good. Calculate the maximum bandwidth supported
    by DP using the current lane count.
    
    The color format will select the best one based on the bandwidth
    requirements of the current timing mode. If the current timing mode
    uses RGB and meets the DP link bandwidth requirements, RGB will be used.
    
    If the timing mode uses RGB but does not meet the DP link bandwidth
    requirements, it will continue to check whether YUV422 meets
    the DP link bandwidth.
    
    FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2.
    The down-spread amplitude shall either be disabled (0.0%) or up
    to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead.
    
    Because rate is already divided by 10,
    mode->clock does not need to be multiplied by 10.
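
    The check described above can be sketched as follows. This is not the
    driver code: the function names are hypothetical, the per-lane link rate
    is assumed to be given in the already-divided-by-10 unit mentioned above
    (for example 540000 for HBR2), and 24 and 16 bits per pixel assume 8 bits
    per component for RGB and YCbCr422.

```python
# Bits per pixel at 8 bits per component (an assumption of this sketch).
FORMATS = (("RGB", 24), ("YCbCr422", 16))


def usable_bandwidth(link_rate, lane_count):
    """Link budget after reserving roughly 3% for FEC and down-spread."""
    return link_rate * lane_count * 97 // 100


def pick_color_format(mode_clock_khz, link_rate, lane_count):
    """Return the first format that fits the link, or None if none fits."""
    budget = usable_bandwidth(link_rate, lane_count)
    for name, bpp in FORMATS:
        if mode_clock_khz * bpp // 8 <= budget:
            return name
    return None
```

    For a 594000 kHz mode, four HBR2 lanes leave room for RGB, while two
    lanes fit neither format and the mode would be rejected.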
    
    Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
    Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-3-liankun.yang@mediatek.com/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mediatek: Fix YCbCr422 color format issue for DP [+ + +]
Author: Liankun Yang <liankun.yang@mediatek.com>
Date:   Fri Oct 25 16:28:27 2024 +0800

    drm/mediatek: Fix YCbCr422 color format issue for DP
    
    [ Upstream commit ef24fbd8f12015ff827973fffefed3902ffd61cc ]
    
    Setting up misc0 for Pixel Encoding Format.
    
    According to the definition of YCbCr in spec 1.2a Table 2-96,
    0x1 << 1 should be written to the register.
    
    Use switch case to distinguish RGB, YCbCr422,
    and unsupported color formats.
    
    Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
    Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-2-liankun.yang@mediatek.com/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mediatek: Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported [+ + +]
Author: Daniel Golle <daniel@makrotopia.org>
Date:   Tue Dec 17 01:18:01 2024 +0000

    drm/mediatek: Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
    
    [ Upstream commit f8d9b91739e1fb436447c437a346a36deb676a36 ]
    
    Touching DISP_REG_OVL_PITCH_MSB leads to video overlay on MT2701, MT7623N
    and probably other older SoCs being broken.
    
    Move setting up AFBC layer configuration into a separate function only
    being called on hardware which actually supports AFBC which restores the
    behavior as it was before commit c410fa9b07c3 ("drm/mediatek: Add AFBC
    support to Mediatek DRM driver") on non-AFBC hardware.
    
    Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Daniel Golle <daniel@makrotopia.org>
    Reviewed-by: CK Hu <ck.hu@mediatek.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/c7fbd3c3e633c0b7dd6d1cd78ccbdded31e1ca0f.1734397800.git.daniel@makrotopia.org/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err [+ + +]
Author: Guoqing Jiang <guoqing.jiang@canonical.com>
Date:   Mon Dec 23 10:32:27 2024 +0800

    drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
    
    [ Upstream commit 36684e9d88a2e2401ae26715a2e217cb4295cea7 ]
    
    The pointer needs to be set to NULL, otherwise KASAN complains about a
    use-after-free, because in mtk_drm_bind the drm pointer of every private
    is set as follows.
    
    private->all_drm_private[i]->drm = drm;
    
    And drm will be released by drm_dev_put in case mtk_drm_kms_init returns
    failure. However, the shutdown path still accesses the previously allocated
    memory in drm_atomic_helper_shutdown.
    
    [   84.874820] watchdog: watchdog0: watchdog did not stop!
    [   86.512054] ==================================================================
    [   86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378
    [   86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1
    [   86.515213]
    [   86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty #55
    [   86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022
    [   86.517960] Call trace:
    [   86.518333]  show_stack+0x20/0x38 (C)
    [   86.518891]  dump_stack_lvl+0x90/0xd0
    [   86.519443]  print_report+0xf8/0x5b0
    [   86.519985]  kasan_report+0xb4/0x100
    [   86.520526]  __asan_report_load8_noabort+0x20/0x30
    [   86.521240]  drm_atomic_helper_shutdown+0x33c/0x378
    [   86.521966]  mtk_drm_shutdown+0x54/0x80
    [   86.522546]  platform_shutdown+0x64/0x90
    [   86.523137]  device_shutdown+0x260/0x5b8
    [   86.523728]  kernel_restart+0x78/0xf0
    [   86.524282]  __do_sys_reboot+0x258/0x2f0
    [   86.524871]  __arm64_sys_reboot+0x90/0xd8
    [   86.525473]  invoke_syscall+0x74/0x268
    [   86.526041]  el0_svc_common.constprop.0+0xb0/0x240
    [   86.526751]  do_el0_svc+0x4c/0x70
    [   86.527251]  el0_svc+0x4c/0xc0
    [   86.527719]  el0t_64_sync_handler+0x144/0x168
    [   86.528367]  el0t_64_sync+0x198/0x1a0
    [   86.528920]
    [   86.529157] The buggy address belongs to the physical page:
    [   86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc
    [   86.531319] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
    [   86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000
    [   86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000
    [   86.534511] page dumped because: kasan: bad access detected
    [   86.535323]
    [   86.535559] Memory state around the buggy address:
    [   86.536265]  ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    [   86.537314]  ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    [   86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    [   86.544733]                                                           ^
    [   86.551057]  ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    [   86.557510]  ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    [   86.563928] ==================================================================
    [   86.571093] Disabling lock debugging due to kernel taint
    [   86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b
    [   86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f]
    ...
    
    Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
    Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223023227.1258112-1-guoqing.jiang@canonical.com/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mediatek: stop selecting foreign drivers [+ + +]
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Dec 18 09:58:31 2024 +0100

    drm/mediatek: stop selecting foreign drivers
    
    [ Upstream commit 924d66011f2401a4145e2e814842c5c4572e439f ]
    
    The PHY portion of the mediatek hdmi driver was originally part of
    the driver itself and was later split out into drivers/phy, with a
    'select' to keep the prior behavior.
    
    However, this leads to build failures when the PHY driver cannot
    be built:
    
    WARNING: unmet direct dependencies detected for PHY_MTK_HDMI
      Depends on [n]: (ARCH_MEDIATEK || COMPILE_TEST [=y]) && COMMON_CLK [=y] && OF [=y] && REGULATOR [=n]
      Selected by [m]:
      - DRM_MEDIATEK_HDMI [=m] && HAS_IOMEM [=y] && DRM [=m] && DRM_MEDIATEK [=m]
    ERROR: modpost: "devm_regulator_register" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined!
    ERROR: modpost: "rdev_get_drvdata" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined!
    
    The best option here is to just not select the phy driver and leave that
    up to the defconfig. Do the same for the other PHY and memory drivers
    selected here as well for consistency.
    
    Fixes: a481bf2f0ca4 ("drm/mediatek: Separate mtk_hdmi_phy to an independent module")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Reviewed-by: CK Hu <ck.hu@mediatek.com>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218085837.2670434-1-arnd@kernel.org/
    Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
erofs: fix PSI memstall accounting [+ + +]
Author: Gao Xiang <xiang@kernel.org>
Date:   Wed Jan 8 23:15:20 2025 +0800

    erofs: fix PSI memstall accounting
    
    commit 1a2180f6859c73c674809f9f82e36c94084682ba upstream.
    
    Max Kellermann recently reported psi_group_cpu.tasks[NR_MEMSTALL] is
    incorrect in the 6.11.9 kernel.
    
    The root cause appears to be that, since the problematic commit, bio
    can be NULL, causing psi_memstall_leave() to be skipped in
    z_erofs_submit_queue().
    
    Reported-by: Max Kellermann <max.kellermann@ionos.com>
    Closes: https://lore.kernel.org/r/CAKPOu+8tvSowiJADW2RuKyofL_CSkm_SuyZA7ME5vMLWmL6pqw@mail.gmail.com
    Fixes: 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly")
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20241127085236.3538334-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

erofs: handle overlapped pclusters out of crafted images properly [+ + +]
Author: Gao Xiang <xiang@kernel.org>
Date:   Wed Jan 8 23:15:19 2025 +0800

    erofs: handle overlapped pclusters out of crafted images properly
    
    commit 9e2f9d34dd12e6e5b244ec488bcebd0c2d566c50 upstream.
    
    syzbot reported a task hang issue due to a deadlock case where it is
    waiting for the folio lock of a cached folio that will be used for
    cache I/Os.
    
    After looking into the crafted fuzzed image, I found it's formed with
    several overlapped big pclusters as below:
    
     Ext:   logical offset   |  length :     physical offset    |  length
       0:        0..   16384 |   16384 :     151552..    167936 |   16384
       1:    16384..   32768 |   16384 :     155648..    172032 |   16384
       2:    32768..   49152 |   16384 :  537223168.. 537239552 |   16384
    ...
    
    Here, extent 0/1 are physically overlapped although it's entirely
    _impossible_ for normal filesystem images generated by mkfs.
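
    The overlap can be checked directly from the extent table above. The
    Python sketch below is illustrative only; physically_overlap and
    overlapping_pairs are hypothetical helpers, and each extent is given as
    (physical offset, length) copied from the listing.

```python
from itertools import combinations


def physically_overlap(a, b):
    """True if two (physical_start, length) ranges share at least one byte."""
    (a_start, a_len), (b_start, b_len) = a, b
    return a_start < b_start + b_len and b_start < a_start + a_len


def overlapping_pairs(extents):
    """Return index pairs of extents whose physical ranges overlap."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(extents), 2)
        if physically_overlap(a, b)
    ]


# Physical offset and length of extents 0..2 from the table above.
EXTENTS = [(151552, 16384), (155648, 16384), (537223168, 16384)]
```

    Extent 1 starts at 155648, which is before extent 0 ends at 167936, so
    overlapping_pairs(EXTENTS) reports only the pair (0, 1).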
    
    First, managed folios containing compressed data will be marked as
    up-to-date and then unlocked immediately (unlike in-place folios) when
    compressed I/Os are complete.  If physical blocks are not submitted in
    the incremental order, there should be separate BIOs to avoid dependency
    issues.  However, the current code mis-arranges z_erofs_fill_bio_vec()
    and BIO submission which causes unexpected BIO waits.
    
    Second, managed folios will be connected to their own pclusters for
    efficient inter-queries.  However, this is somewhat hard to implement
    easily if overlapped big pclusters exist.  Again, these only appear in
    fuzzed images so let's simply fall back to temporary short-lived pages
    for correctness.
    
    Additionally, it justifies that referenced managed folios cannot be
    truncated for now and reverts part of commit 2080ca1ed3e4 ("erofs: tidy
    up `struct z_erofs_bvec`") for simplicity although it shouldn't make any
    difference.
    
    Reported-by: syzbot+4fc98ed414ae63d1ada2@syzkaller.appspotmail.com
    Reported-by: syzbot+de04e06b28cfecf2281c@syzkaller.appspotmail.com
    Reported-by: syzbot+c8c8238b394be4a1087d@syzkaller.appspotmail.com
    Tested-by: syzbot+4fc98ed414ae63d1ada2@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/r/0000000000002fda01061e334873@google.com
    Fixes: 8e6c8fa9f2e9 ("erofs: enable big pcluster feature")
    Link: https://lore.kernel.org/r/20240910070847.3356592-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
eth: gve: use appropriate helper to set xdp_features [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Jan 6 10:02:10 2025 -0800

    eth: gve: use appropriate helper to set xdp_features
    
    [ Upstream commit db78475ba0d3c66d430f7ded2388cc041078a542 ]
    
    Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine")
    added routines to inform the core about XDP flag changes.
    GVE support was added around the same time and missed using them.
    
    GVE only changes the flags on error recovery or resume.
    Presumably the flags may change during resume if the VM migrated.
    The user would not get the notification and upper devices would
    not get a chance to recalculate their flags.
    
    Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format")
    Reviewed-By: Jeroen de Borst <jeroendb@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
exfat: fix the infinite loop in __exfat_free_cluster() [+ + +]
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date:   Mon Dec 16 13:39:42 2024 +0800

    exfat: fix the infinite loop in __exfat_free_cluster()
    
    [ Upstream commit a5324b3a488d883aa2d42f72260054e87d0940a0 ]
    
    In __exfat_free_cluster(), the cluster chain is traversed until the
    EOF cluster. If the cluster chain includes a loop due to file system
    corruption, the EOF cluster cannot be traversed, resulting in an
    infinite loop.
    
    This commit uses the total number of clusters to prevent this infinite
    loop.
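
    Bounding the walk by the cluster count can be sketched in a few lines of
    Python. This is a model, not the exfat code: walk_chain is a hypothetical
    helper, the FAT is a plain dict, and raising ValueError is an assumption
    standing in for whatever error the kernel reports.

```python
EOF_CLUSTER = 0xFFFFFFFF  # end-of-chain marker used by this sketch


def walk_chain(fat, start, num_clusters):
    """Follow a cluster chain, giving up once it exceeds the volume size."""
    visited = []
    clu = start
    steps = 0
    while clu != EOF_CLUSTER:
        # A valid chain can never contain more clusters than the volume has.
        if steps >= num_clusters:
            raise ValueError("cluster chain is longer than the volume")
        visited.append(clu)
        clu = fat[clu]
        steps += 1
    return visited
```

    A chain such as 2 -> 3 -> 2 never reaches the EOF marker, so the counter
    is what terminates the walk.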
    
    Reported-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=1de5a37cb85a2d536330
    Tested-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com
    Fixes: 31023864e67a ("exfat: add fat entry operations")
    Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
    Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

exfat: fix the infinite loop in exfat_readdir() [+ + +]
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date:   Fri Dec 13 13:08:37 2024 +0800

    exfat: fix the infinite loop in exfat_readdir()
    
    [ Upstream commit fee873761bd978d077d8c55334b4966ac4cb7b59 ]
    
    If the file system is corrupted so that a cluster is linked to
    itself in the cluster chain, and there is an unused directory
    entry in the cluster, 'dentry' will not be incremented, so the
    condition 'dentry < max_dentries' cannot prevent an infinite
    loop.
    
    This infinite loop causes s_lock not to be released, and other
    tasks will hang, such as exfat_sync_fs().
    
    This commit stops traversing the cluster chain when there is an unused
    directory entry in the cluster to avoid this infinite loop.
    
    Reported-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=205c2644abdff9d3f9fc
    Tested-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com
    Fixes: ca06197382bd ("exfat: add directory operations")
    Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
    Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
f2fs: fix null-ptr-deref in f2fs_submit_page_bio() [+ + +]
Author: Ye Bin <yebin10@huawei.com>
Date:   Sat Oct 12 00:44:50 2024 +0800

    f2fs: fix null-ptr-deref in f2fs_submit_page_bio()
    
    commit b7d0a97b28083084ebdd8e5c6bccd12e6ec18faa upstream.
    
    There's an issue as follows when concurrently installing the f2fs.ko
    module and mounting the f2fs file system:
    KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
    RIP: 0010:__bio_alloc+0x2fb/0x6c0 [f2fs]
    Call Trace:
     <TASK>
     f2fs_submit_page_bio+0x126/0x8b0 [f2fs]
     __get_meta_page+0x1d4/0x920 [f2fs]
     get_checkpoint_version.constprop.0+0x2b/0x3c0 [f2fs]
     validate_checkpoint+0xac/0x290 [f2fs]
     f2fs_get_valid_checkpoint+0x207/0x950 [f2fs]
     f2fs_fill_super+0x1007/0x39b0 [f2fs]
     mount_bdev+0x183/0x250
     legacy_get_tree+0xf4/0x1e0
     vfs_get_tree+0x88/0x340
     do_new_mount+0x283/0x5e0
     path_mount+0x2b2/0x15b0
     __x64_sys_mount+0x1fe/0x270
     do_syscall_64+0x5f/0x170
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    The above issue happens because the bioset of the f2fs file system is
    not initialized before "f2fs_fs_type" is registered.
    To address it, register "f2fs_fs_type" last in init_f2fs_fs(), which
    ensures that all f2fs file system resources are initialized first.
    
    Fixes: f543805fcd60 ("f2fs: introduce private bioset")
    Signed-off-by: Ye Bin <yebin10@huawei.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Bin Lan <lanbincn@qq.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
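
    The ordering rule above can be illustrated with a small user-space
    model. All names here (init_bioset_sketch(), register_fs_type_sketch(),
    try_mount_sketch(), init_module_sketch()) are hypothetical stand-ins
    and do not correspond to f2fs functions.

```c
#include <stdbool.h>

static bool bioset_ready;
static bool fs_type_registered;

/* Stand-in for setting up a resource that mount depends on. */
static int init_bioset_sketch(void)
{
	bioset_ready = true;
	return 0;
}

/* Once this runs, a mount request may arrive at any time. */
static int register_fs_type_sketch(void)
{
	fs_type_registered = true;
	return 0;
}

/* Returns 0 on success, -1 if the type is not visible yet,
 * -2 if mount ran before its resources were ready (the bug). */
static int try_mount_sketch(void)
{
	if (!fs_type_registered)
		return -1;
	return bioset_ready ? 0 : -2;
}

/* Module init: make every resource ready, then register last. */
static int init_module_sketch(void)
{
	int err = init_bioset_sketch();

	if (err)
		return err;
	return register_fs_type_sketch();
}
```

    With registration as the final step, no mount can observe a
    registered file system type whose resources are still missing.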

 
fs/Kconfig: make hugetlbfs a menuconfig [+ + +]
Author: Peter Xu <peterx@redhat.com>
Date:   Fri Nov 24 10:19:02 2023 -0500

    fs/Kconfig: make hugetlbfs a menuconfig
    
    [ Upstream commit cddba0af0b7919e93134469f6fdf29a7d362768a ]
    
    Hugetlb vmemmap default option (HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON)
    is a sub-option to hugetlbfs, but it shows in the same level as hugetlbfs
    itself, under "Pseudo filesystems".
    
    Make the vmemmap option a sub-option to hugetlbfs, by changing hugetlbfs
    into a menuconfig.  When moving it, fix a typo 'v' spotted by Randy.
    
    Link: https://lkml.kernel.org/r/20231124151902.1075697-1-peterx@redhat.com
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Muchun Song <songmuchun@bytedance.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur [+ + +]
Author: Daniil Stas <daniil.stas@posteo.net>
Date:   Sun Jan 5 21:36:18 2025 +0000

    hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur
    
    [ Upstream commit 82163d63ae7a4c36142cd252388737205bb7e4b9 ]
    
    The scsi_execute_cmd() function can return both negative (Linux error
    codes) and positive (scsi_cmnd result field) error codes.
    
    Currently the driver just passes error codes of scsi_execute_cmd() to
    hwmon core, which is incorrect because hwmon only checks for negative
    error codes. This leads to hwmon reporting uninitialized data to
    userspace in case of SCSI errors (for example if the disk drive was
    disconnected).
    
    This patch checks scsi_execute_cmd() output and returns -EIO if its
    error code is positive.
    
    Fixes: 5b46903d8bf37 ("hwmon: Driver for disk and solid state drives with temperature sensors")
    Signed-off-by: Daniil Stas <daniil.stas@posteo.net>
    Cc: Guenter Roeck <linux@roeck-us.net>
    Cc: Chris Healy <cphealy@gmail.com>
    Cc: Linus Walleij <linus.walleij@linaro.org>
    Cc: Martin K. Petersen <martin.petersen@oracle.com>
    Cc: Bart Van Assche <bvanassche@acm.org>
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-scsi@vger.kernel.org
    Cc: linux-ide@vger.kernel.org
    Cc: linux-hwmon@vger.kernel.org
    Link: https://lore.kernel.org/r/20250105213618.531691-1-daniil.stas@posteo.net
    [groeck: Avoid inline variable declaration for portability]
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ice: fix incorrect PHY settings for 100 GB/s [+ + +]
Author: Przemyslaw Korba <przemyslaw.korba@intel.com>
Date:   Wed Dec 4 14:22:18 2024 +0100

    ice: fix incorrect PHY settings for 100 GB/s
    
    [ Upstream commit 6c5b989116083a98f45aada548ff54e7a83a9c2d ]
    
    The ptp4l application reports too high an offset when run on an E823
    device with a 100GB/s link. Those values cannot go under 100ns, as they
    do in a working case when using a 100 GB/s cable.
    
    This is due to incorrect frequency settings on the PHY clocks for
    100 GB/s speed. Changes are introduced to align with the internal
    hardware documentation, and correctly initialize frequency in PHY
    clocks with the frequency values that are in our HW spec.
    
    To reproduce the issue run ptp4l as a Time Receiver on E823 device,
    and observe the offset, which will never approach values seen
    in the PTP working case.
    
    Reproduction output:
    ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf
    ptp4l[5278.775]: master offset      12470 s2 freq  +41288 path delay -3002
    ptp4l[5278.837]: master offset      10525 s2 freq  +39202 path delay -3002
    ptp4l[5278.900]: master offset     -24840 s2 freq  -20130 path delay -3002
    ptp4l[5278.963]: master offset      10597 s2 freq  +37908 path delay -3002
    ptp4l[5279.025]: master offset       8883 s2 freq  +36031 path delay -3002
    ptp4l[5279.088]: master offset       7267 s2 freq  +34151 path delay -3002
    ptp4l[5279.150]: master offset       5771 s2 freq  +32316 path delay -3002
    ptp4l[5279.213]: master offset       4388 s2 freq  +30526 path delay -3002
    ptp4l[5279.275]: master offset     -30434 s2 freq  -28485 path delay -3002
    ptp4l[5279.338]: master offset     -28041 s2 freq  -27412 path delay -3002
    ptp4l[5279.400]: master offset       7870 s2 freq  +31118 path delay -3002
    
    Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support")
    Reviewed-by: Milena Olech <milena.olech@intel.com>
    Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com>
    Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe() [+ + +]
Author: Keisuke Nishimura <keisuke.nishimura@inria.fr>
Date:   Tue Oct 29 19:27:12 2024 +0100

    ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe()
    
    [ Upstream commit 2c87309ea741341c6722efdf1fb3f50dd427c823 ]
    
    ca8210_test_interface_init() returns the result of kfifo_alloc(),
    which can be non-zero in case of an error. The caller, ca8210_probe(),
    should check the return value and do error-handling if it fails.
    
    Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
    Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/20241029182712.318271-1-keisuke.nishimura@inria.fr
    Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
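
    The pattern described above, checking an init helper's return value in
    the probe path, can be sketched in user-space C. struct fake_priv,
    test_interface_init_sketch() and probe_sketch() are hypothetical, and
    the simulate_failure parameter exists only to drive the error path.

```c
#include <errno.h>
#include <stdlib.h>

struct fake_priv {
	void *fifo;
	int registered;
};

/* Stand-in for an init helper that returns non-zero on failure. */
static int test_interface_init_sketch(struct fake_priv *priv,
				      int simulate_failure)
{
	if (simulate_failure)
		return -ENOMEM;
	priv->fifo = malloc(64);
	return priv->fifo ? 0 : -ENOMEM;
}

static int probe_sketch(struct fake_priv *priv, int simulate_failure)
{
	int ret;

	priv->fifo = NULL;
	priv->registered = 0;

	ret = test_interface_init_sketch(priv, simulate_failure);
	if (ret)
		return ret;	/* propagate instead of continuing without a FIFO */

	priv->registered = 1;
	return 0;
}

static void remove_sketch(struct fake_priv *priv)
{
	free(priv->fifo);
	priv->fifo = NULL;
	priv->registered = 0;
}
```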

 
igc: field get conversion [+ + +]
Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date:   Tue Dec 5 17:01:09 2023 -0800

    igc: field get conversion
    
    [ Upstream commit a8e0c7a6800dc466ac815264c16971b9adf7ffbd ]
    
    Refactor the igc driver to use FIELD_GET() for mask and shift reads,
    which reduces lines of code and adds clarity of intent.
    
    This code was generated by the following coccinelle/spatch script and
    then manually repaired in a later patch.
    
    @get@
    constant shift,mask;
    type T;
    expression a;
    @@
    -((T)((a) & mask) >> shift)
    +FIELD_GET(mask, a)
    
    and applied via:
    spatch --sp-file field_prep.cocci --in-place --dir \
     drivers/net/ethernet/intel/
    
    Cc: Julia Lawall <Julia.Lawall@inria.fr>
    Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: bd2776e39c2a ("igc: return early when failing to read EECD register")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
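
    To illustrate the transformation performed by the spatch rule above,
    here is a user-space sketch comparing an open-coded mask-and-shift read
    with a FIELD_GET-style macro. SKETCH_FIELD_GET is a hypothetical
    stand-in for the kernel macro (it assumes a non-zero, contiguous mask
    and relies on the GCC/Clang builtin __builtin_ctz()); DEMO_MASK and
    DEMO_SHIFT are invented for the example.

```c
#include <stdint.h>

/* User-space stand-in: shift amount is derived from the mask itself. */
#define SKETCH_FIELD_GET(mask, reg) \
	(((uint32_t)(reg) & (uint32_t)(mask)) >> __builtin_ctz((uint32_t)(mask)))

#define DEMO_MASK  0x00000F00u	/* hypothetical 4-bit field, bits 11:8 */
#define DEMO_SHIFT 8

/* Before: mask and shift must be kept in sync by hand. */
static uint32_t get_open_coded(uint32_t reg)
{
	return (reg & DEMO_MASK) >> DEMO_SHIFT;
}

/* After: only the mask is needed. */
static uint32_t get_with_macro(uint32_t reg)
{
	return SKETCH_FIELD_GET(DEMO_MASK, reg);
}
```

    Both helpers return the same value for any input; the macro form
    removes the separate shift constant.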

igc: return early when failing to read EECD register [+ + +]
Author: En-Wei Wu <en-wei.wu@canonical.com>
Date:   Wed Dec 18 10:37:42 2024 +0800

    igc: return early when failing to read EECD register
    
    [ Upstream commit bd2776e39c2a82ef4681d02678bb77b3d41e79be ]
    
    When booting with a dock connected, the igc driver may get stuck for ~40
    seconds if PCIe link is lost during initialization.
    
    This happens because the driver accesses the device after EECD register reads
    return all F's, indicating failed reads. Consequently, hw->hw_addr is set
    to NULL, which impacts subsequent rd32() reads. This leads to the driver
    hanging in igc_get_hw_semaphore_i225(), as the invalid hw->hw_addr
    prevents retrieving the expected value.
    
    To address this, a validation check and a corresponding return value
    catch are added for the EECD register read result. If all F's are
    returned, indicating PCIe link loss, the driver will return -ENXIO
    immediately. This avoids the 40-second hang and significantly improves
    boot time when using a dock with an igc NIC.
    
    Log before the patch:
    [    0.911913] igc 0000:70:00.0: enabling device (0000 -> 0002)
    [    0.912386] igc 0000:70:00.0: PTM enabled, 4ns granularity
    [    1.571098] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
    [   43.449095] igc_get_hw_semaphore_i225: igc 0000:70:00.0 (unnamed net_device) (uninitialized): Driver can't access device - SMBI bit is set.
    [   43.449186] igc 0000:70:00.0: probe with driver igc failed with error -13
    [   46.345701] igc 0000:70:00.0: enabling device (0000 -> 0002)
    [   46.345777] igc 0000:70:00.0: PTM enabled, 4ns granularity
    
    Log after the patch:
    [    1.031000] igc 0000:70:00.0: enabling device (0000 -> 0002)
    [    1.032097] igc 0000:70:00.0: PTM enabled, 4ns granularity
    [    1.642291] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
    [    5.480490] igc 0000:70:00.0: enabling device (0000 -> 0002)
    [    5.480516] igc 0000:70:00.0: PTM enabled, 4ns granularity
    
    Fixes: ab4056126813 ("igc: Add NVM support")
    Cc: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
    Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
    Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
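
    The early-return check described above can be sketched as follows. This
    is a user-space model, not the igc code: reg_read_fn, nvm_init_sketch()
    and the two fake readers are hypothetical, and later_steps_ran only
    records whether initialization continued.

```c
#include <errno.h>
#include <stdint.h>

#define ALL_FS 0xFFFFFFFFu	/* value seen when the read itself failed */

/* Hypothetical register reader so the sketch runs without hardware. */
typedef uint32_t (*reg_read_fn)(void);

static int nvm_init_sketch(reg_read_fn read_eecd, int *later_steps_ran)
{
	uint32_t eecd = read_eecd();

	if (eecd == ALL_FS)
		return -ENXIO;	/* link lost: bail out before further access */

	*later_steps_ran = 1;	/* stands in for the rest of initialization */
	return 0;
}

static uint32_t read_link_lost(void) { return ALL_FS; }
static uint32_t read_ok(void) { return 0x00000100u; }
```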

 
iio: adc: ad7124: Disable all channels at probe time [+ + +]
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date:   Mon Nov 4 11:19:04 2024 +0100

    iio: adc: ad7124: Disable all channels at probe time
    
    commit 4be339af334c283a1a1af3cb28e7e448a0aa8a7c upstream.
    
    When two channels are enabled during a measurement, two measurements are
    done that are reported sequentially in the DATA register. As the code
    triggered by reading one of the sysfs properties expects that only one
    channel is enabled, it only reads the first data set, which might or
    might not belong to the intended channel.
    
    To prevent this situation, disable all channels during probe. This fixes
    a problem in practice because the reset default for channel 0 is
    enabled. So all measurements before the first measurement on channel 0
    (which disables channel 0 at the end) might report wrong values.
    
    Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels")
    Reviewed-by: Nuno Sa <nuno.sa@analog.com>
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
    Link: https://patch.msgid.link/20241104101905.845737-2-u.kleine-koenig@baylibre.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adc: at91: call input_free_device() on allocated iio_dev [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Sat Dec 7 13:30:45 2024 +0900

    iio: adc: at91: call input_free_device() on allocated iio_dev
    
    commit de6a73bad1743e9e81ea5a24c178c67429ff510b upstream.
    
    The current implementation of at91_ts_register() calls input_free_device()
    on st->ts_input, however, the err label can be reached before the
    allocated iio_dev is stored to st->ts_input. Thus call
    input_free_device() on input instead of st->ts_input.
    
    Fixes: 84882b060301 ("iio: adc: at91_adc: Add support for touchscreens without TSMR")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Link: https://patch.msgid.link/20241207043045.1255409-1-joe@pf.is.s.u-tokyo.ac.jp
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adc: rockchip_saradc: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:12 2024 +0100

    iio: adc: rockchip_saradc: fix information leak in triggered buffer
    
    commit 38724591364e1e3b278b4053f102b49ea06ee17c upstream.
    
    The 'data' local struct is used to push data to user space from a
    triggered buffer, but it does not set values for inactive channels, as
    it only uses iio_for_each_active_channel() to assign new values.
    
    Initialize the struct to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
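
    A user-space sketch of the zero-before-fill pattern described above.
    struct scan, NUM_CHANNELS and fill_scan() are hypothetical, and the
    loop over a bitmask stands in for iio_for_each_active_channel().

```c
#include <stdint.h>
#include <string.h>

#define NUM_CHANNELS 4

/* Hypothetical scan layout: one slot per channel plus a timestamp. */
struct scan {
	uint16_t values[NUM_CHANNELS];
	int64_t timestamp;
};

static void fill_scan(struct scan *data, unsigned int active_mask,
		      const uint16_t *samples, int64_t ts)
{
	int j = 0;

	/* Zero everything first so unused slots never carry stale bytes. */
	memset(data, 0, sizeof(*data));

	/* Only active channels are written, packed from slot 0 upward. */
	for (int ch = 0; ch < NUM_CHANNELS; ch++) {
		if (active_mask & (1u << ch))
			data->values[j++] = samples[ch];
	}
	data->timestamp = ts;
}
```

    Without the memset(), the slots after the last active channel would
    keep whatever was previously on the stack.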

iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep() [+ + +]
Author: Fabio Estevam <festevam@gmail.com>
Date:   Fri Nov 22 13:43:08 2024 -0300

    iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep()
    
    commit 2a8e34096ec70d73ebb6d9920688ea312700cbd9 upstream.
    
    Using gpiod_set_value() to control the reset GPIO causes some verbose
    warnings during boot when the reset GPIO is controlled by an I2C IO
    expander.
    
    As the caller can sleep, use the gpiod_set_value_cansleep() variant to
    fix the issue.
    
    Tested on a custom i.MX93 board with an ADS124S08 ADC.
    
    Cc: stable@kernel.org
    Fixes: e717f8c6dfec ("iio: adc: Add the TI ads124s08 ADC code")
    Signed-off-by: Fabio Estevam <festevam@gmail.com>
    Link: https://patch.msgid.link/20241122164308.390340-1-festevam@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adc: ti-ads8688: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:16 2024 +0100

    iio: adc: ti-ads8688: fix information leak in triggered buffer
    
    commit 2a7377ccfd940cd6e9201756aff1e7852c266e69 upstream.
    
    The 'buffer' local array is used to push data to user space from a
    triggered buffer, but it does not set values for inactive channels, as
    it only uses iio_for_each_active_channel() to assign new values.
    
    Initialize the array to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 61fa5dfa5f52 ("iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-8-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:17 2024 +0100

    iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
    
    commit 333be433ee908a53f283beb95585dfc14c8ffb46 upstream.
    
    The 'data' array is allocated via kmalloc() and it is used to push data
    to user space from a triggered buffer, but it does not set values for
    inactive channels, as it only uses iio_for_each_active_channel()
    to assign new values.
    
    Use kzalloc for the memory allocation to avoid pushing uninitialized
    information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 415f79244757 ("iio: Move IIO Dummy Driver out of staging")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-9-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
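
    For the heap-allocated case described above, the user-space analogue
    of switching from kmalloc() to kzalloc() is switching from malloc() to
    calloc(). The sketch below is illustrative only; build_scan() and its
    parameters are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a scan buffer of 'scan_len' slots and copy in the samples of
 * the active channels. calloc() returns zeroed memory, so slots that no
 * active channel writes are 0 and not leftover heap contents.
 */
static uint16_t *build_scan(size_t scan_len, const uint16_t *samples,
			    size_t n_channels, unsigned int active_mask)
{
	uint16_t *data = calloc(scan_len, sizeof(*data));
	size_t j = 0;

	if (!data)
		return NULL;

	for (size_t ch = 0; ch < n_channels && j < scan_len; ch++) {
		if (active_mask & (1u << ch))
			data[j++] = samples[ch];
	}
	return data;	/* caller frees */
}
```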

iio: gyro: fxas21002c: Fix missing data update in trigger handler [+ + +]
Author: Carlos Song <carlos.song@nxp.com>
Date:   Sat Nov 16 10:29:45 2024 -0500

    iio: gyro: fxas21002c: Fix missing data update in trigger handler
    
    commit fa13ac6cdf9b6c358e7d77c29fb60145c7a87965 upstream.
    
    The fxas21002c_trigger_handler() may fail to acquire sample data because
    the runtime PM enters the autosuspend state and the sensor cannot return
    sample data in standby mode.
    
    Resume the sensor before reading the sample data into the buffer within the
    trigger handler. After the data is read, place the sensor back into the
    autosuspend state.
    
    Fixes: a0701b6263ae ("iio: gyro: add core driver for fxas21002c")
    Signed-off-by: Carlos Song <carlos.song@nxp.com>
    Signed-off-by: Frank Li <Frank.Li@nxp.com>
    Link: https://patch.msgid.link/20241116152945.4006374-1-Frank.Li@nxp.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on [+ + +]
Author: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Date:   Wed Nov 13 21:25:45 2024 +0100

    iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on
    
    commit 65a60a590142c54a3f3be11ff162db2d5b0e1e06 upstream.
    
    Currently, suspending while sensors are on results in timestamping
    continuing without a gap at resume. This can work with the monotonic
    clock but not with other clocks. Fix that by resetting timestamping.
    
    Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
    Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-v1-1-dfc77c394173@tdk.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: imu: kmx61: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:13 2024 +0100

    iio: imu: kmx61: fix information leak in triggered buffer
    
    commit 6ae053113f6a226a2303caa4936a4c37f3bfff7b upstream.
    
    The 'buffer' local array is used to push data to user space from a
    triggered buffer, but it does not set values for inactive channels, as
    it only uses iio_for_each_active_channel() to assign new values.
    
    Initialize the array to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: c3a23ecc0901 ("iio: imu: kmx61: Add support for data ready triggers")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-5-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: inkern: call iio_device_put() only on mapped devices [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Wed Dec 4 20:13:42 2024 +0900

    iio: inkern: call iio_device_put() only on mapped devices
    
    commit 64f43895b4457532a3cc524ab250b7a30739a1b1 upstream.
    
    In the error path of iio_channel_get_all(), iio_device_put() is called
    on all IIO devices, which can cause a refcount imbalance. Fix this error
    by calling iio_device_put() only on IIO devices whose refcounts were
    previously incremented by iio_device_get().
    
    Fixes: 314be14bb893 ("iio: Rename _st_ functions to loose the bit that meant the staging version.")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Link: https://patch.msgid.link/20241204111342.1246706-1-joe@pf.is.s.u-tokyo.ac.jp
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
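
    The balanced unwind described above can be sketched in user-space C.
    struct fake_dev, dev_get(), dev_put() and get_all() are hypothetical
    stand-ins, and fail_at only simulates a lookup failure at a given
    index (a negative value means no failure).

```c
#include <errno.h>

struct fake_dev {
	int refcount;
};

static void dev_get(struct fake_dev *d) { d->refcount++; }
static void dev_put(struct fake_dev *d) { d->refcount--; }

/*
 * Take a reference on each of 'n' devices. On failure, drop only the
 * references taken so far, tracked by 'mapped', so that untouched
 * devices keep their original count.
 */
static int get_all(struct fake_dev *devs, int n, int fail_at)
{
	int mapped = 0;
	int ret;

	for (int i = 0; i < n; i++) {
		if (i == fail_at) {
			ret = -ENODEV;
			goto error;
		}
		dev_get(&devs[i]);
		mapped++;
	}
	return 0;

error:
	for (int i = 0; i < mapped; i++)
		dev_put(&devs[i]);
	return ret;
}
```

    Looping the error path over all 'n' devices instead of 'mapped' would
    decrement counts that were never incremented.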

iio: light: vcnl4035: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:14 2024 +0100

    iio: light: vcnl4035: fix information leak in triggered buffer
    
    commit 47b43e53c0a0edf5578d5d12f5fc71c019649279 upstream.
    
    The 'buffer' local array is used to push data to userspace from a
    triggered buffer, but it does not set an initial value for the single
    data element, which is a u16 aligned to 8 bytes. That leaves at least
    4 bytes uninitialized even after writing an integer value with
    regmap_read().
    
    Initialize the array to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: ec90b52c07c0 ("iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-6-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: pressure: zpa2326: fix information leak in triggered buffer [+ + +]
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Mon Nov 25 22:16:11 2024 +0100

    iio: pressure: zpa2326: fix information leak in triggered buffer
    
    commit 6007d10c5262f6f71479627c1216899ea7f09073 upstream.
    
    The 'sample' local struct is used to push data to user space from a
    triggered buffer, but it has a hole between the temperature and the
    timestamp (u32 pressure, u16 temperature, GAP, u64 timestamp).
    This hole is never initialized.
    
    Initialize the struct to zero before using it to avoid pushing
    uninitialized information to userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 03b262f2bbf4 ("iio:pressure: initial zpa2326 barometer support")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-3-0cb6e98d895c@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
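
    The hole described above (u32 pressure, u16 temperature, GAP, u64
    timestamp) can be made visible with offsetof(). The sketch below is
    user-space C with hypothetical names (struct sample, fill_sample(),
    gap_is_zero()); the exact gap size depends on the ABI's alignment of
    uint64_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sample {
	uint32_t pressure;
	uint16_t temperature;
	/* compiler-inserted padding sits here */
	uint64_t timestamp;
};

static void fill_sample(struct sample *s, uint32_t pressure,
			uint16_t temperature, uint64_t ts)
{
	/* Clear the whole object, padding included, before assigning. */
	memset(s, 0, sizeof(*s));
	s->pressure = pressure;
	s->temperature = temperature;
	s->timestamp = ts;
}

/* Returns 1 if every byte between temperature and timestamp is zero. */
static int gap_is_zero(const struct sample *s)
{
	const unsigned char *bytes = (const unsigned char *)s;
	size_t start = offsetof(struct sample, temperature) +
		       sizeof(s->temperature);
	size_t end = offsetof(struct sample, timestamp);

	for (size_t i = start; i < end; i++) {
		if (bytes[i] != 0)
			return 0;
	}
	return 1;
}
```

    Assigning the three members alone never touches the padding bytes,
    which is why the struct is cleared as a whole first.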

 
io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period [+ + +]
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Jan 8 11:16:13 2025 -0700

    io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period
    
    commit c9a40292a44e78f71258b8522655bffaf5753bdb upstream.
    
    io_eventfd_do_signal() is invoked from an RCU callback, but when
    dropping the reference to the io_ev_fd, it calls io_eventfd_free()
    directly if the refcount drops to zero. This isn't correct, as any
    potential freeing of the io_ev_fd should be deferred another RCU grace
    period.
    
    Just call io_eventfd_put() rather than open-code the dec-and-test and
    free, which will correctly defer it another RCU grace period.
    
    Fixes: 21a091b970cd ("io_uring: signal registered eventfd to process deferred task work")
    Reported-by: Jann Horn <jannh@google.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
io_uring/timeout: fix multishot updates [+ + +]
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Sat Jan 4 18:29:02 2025 +0000

    io_uring/timeout: fix multishot updates
    
    commit c83c846231db8b153bfcb44d552d373c34f78245 upstream.
    
    After an update, only the first shot of a multishot timeout request
    adheres to the new timeout value, while all subsequent retries continue
    to use the old value. Don't forget to update the timeout stored in
    struct io_timeout_data.
    
    Cc: stable@vger.kernel.org
    Fixes: ea97f6c8558e8 ("io_uring: add support for multishot timeouts")
    Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/e6516c3304eb654ec234cfa65c88a9579861e597.1736015288.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
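
    The behaviour described above can be modelled loosely in user-space C.
    This is not the io_uring implementation: struct timeout_req and its
    fields are hypothetical, with stored_ns standing in for the value kept
    in struct io_timeout_data and armed_ns for the currently armed shot.

```c
#include <stdint.h>

struct timeout_req {
	uint64_t armed_ns;	/* value used by the currently armed shot */
	uint64_t stored_ns;	/* value reused each time the timer re-arms */
	unsigned int shots_left;
};

/* Update must change both values, or later shots use the old one. */
static void timeout_update(struct timeout_req *req, uint64_t new_ns)
{
	req->armed_ns = new_ns;
	req->stored_ns = new_ns;	/* the step the fix is about */
}

/* Fire one shot and re-arm from the stored value. Returns 1 if a shot
 * was consumed, 0 if none were left. */
static int timeout_fire(struct timeout_req *req)
{
	if (req->shots_left == 0)
		return 0;
	req->shots_left--;
	req->armed_ns = req->stored_ns;
	return 1;
}
```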

 
ipvlan: Fix use-after-free in ipvlan_get_iflink(). [+ + +]
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Jan 6 16:19:11 2025 +0900

    ipvlan: Fix use-after-free in ipvlan_get_iflink().
    
    [ Upstream commit cb358ff94154774d031159b018adf45e17673941 ]
    
    syzbot presented a use-after-free report [0] regarding ipvlan and
    linkwatch.
    
    ipvlan does not hold a refcnt of the lower device unlike vlan and
    macvlan.
    
    If the linkwatch work is triggered for the ipvlan dev, the lower dev
    might have already been freed, resulting in UAF of ipvlan->phy_dev in
    ipvlan_get_iflink().
    
    We can delay the lower dev unregistration like vlan and macvlan by
    holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and
    releasing it in dev->priv_destructor().
    
    Jakub pointed out calling .ndo_XXX after unregister_netdevice() has
    returned is error prone and suggested [1] addressing this UAF in the
    core by taking commit 750e51603395 ("net: avoid potential UAF in
    default_operstate()") further.
    
    Let's assume devices being unregistered are DOWN and use RCU protection
    in default_operstate() so as not to race with the device unregistration.
    
    [0]:
    BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353
    Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944
    
    CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 #12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47
    Hardware name: linux,dummy-virt (DT)
    Workqueue: events_unbound linkwatch_event
    Call trace:
     show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C)
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:378 [inline]
     print_report+0x16c/0x6f0 mm/kasan/report.c:489
     kasan_report+0xc0/0x120 mm/kasan/report.c:602
     __asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380
     ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353
     dev_get_iflink+0x7c/0xd8 net/core/dev.c:674
     default_operstate net/core/link_watch.c:45 [inline]
     rfc2863_policy+0x144/0x360 net/core/link_watch.c:72
     linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175
     __linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239
     linkwatch_event+0x64/0xa8 net/core/link_watch.c:282
     process_one_work+0x700/0x1398 kernel/workqueue.c:3229
     process_scheduled_works kernel/workqueue.c:3310 [inline]
     worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391
     kthread+0x2b0/0x360 kernel/kthread.c:389
     ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
    
    Allocated by task 9303:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x30/0x68 mm/kasan/common.c:68
     kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568
     poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
     __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394
     kasan_kmalloc include/linux/kasan.h:260 [inline]
     __do_kmalloc_node mm/slub.c:4283 [inline]
     __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289
     __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650
     alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209
     rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595
     rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771
     __rtnl_newlink net/core/rtnetlink.c:3896 [inline]
     rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011
     rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901
     netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542
     rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928
     netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
     netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347
     netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891
     sock_sendmsg_nosec net/socket.c:711 [inline]
     __sock_sendmsg net/socket.c:726 [inline]
     __sys_sendto+0x2ec/0x438 net/socket.c:2197
     __do_sys_sendto net/socket.c:2204 [inline]
     __se_sys_sendto net/socket.c:2200 [inline]
     __arm64_sys_sendto+0xe4/0x110 net/socket.c:2200
     __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
     invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
     el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
     do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
     el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
     el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
     el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
    
    Freed by task 10200:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x30/0x68 mm/kasan/common.c:68
     kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582
     poison_slab_object mm/kasan/common.c:247 [inline]
     __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264
     kasan_slab_free include/linux/kasan.h:233 [inline]
     slab_free_hook mm/slub.c:2338 [inline]
     slab_free mm/slub.c:4598 [inline]
     kfree+0x140/0x420 mm/slub.c:4746
     kvfree+0x4c/0x68 mm/util.c:693
     netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034
     device_release+0x98/0x1c0
     kobject_cleanup lib/kobject.c:689 [inline]
     kobject_release lib/kobject.c:720 [inline]
     kref_put include/linux/kref.h:65 [inline]
     kobject_put+0x2b0/0x438 lib/kobject.c:737
     netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924
     rtnl_unlock net/core/rtnetlink.c:152 [inline]
     rtnl_net_unlock net/core/rtnetlink.c:209 [inline]
     rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526
     rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901
     netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542
     rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928
     netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
     netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347
     netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891
     sock_sendmsg_nosec net/socket.c:711 [inline]
     __sock_sendmsg net/socket.c:726 [inline]
     ____sys_sendmsg+0x410/0x708 net/socket.c:2583
     ___sys_sendmsg+0x178/0x1d8 net/socket.c:2637
     __sys_sendmsg net/socket.c:2669 [inline]
     __do_sys_sendmsg net/socket.c:2674 [inline]
     __se_sys_sendmsg net/socket.c:2672 [inline]
     __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672
     __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
     invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
     el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
     do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
     el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
     el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
     el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
    
    The buggy address belongs to the object at ffff0000d768c000
     which belongs to the cache kmalloc-cg-4k of size 4096
    The buggy address is located 224 bytes inside of
     freed 4096-byte region [ffff0000d768c000, ffff0000d768d000)
    
    The buggy address belongs to the physical page:
    page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688
    head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    memcg:ffff0000c77ef981
    flags: 0xbfffe0000000040(head|node=0|zone=2|lastcpupid=0x1ffff)
    page_type: f5(slab)
    raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122
    raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981
    head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122
    head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981
    head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000
    head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                           ^
     ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    
    Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1]
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
jbd2: flush filesystem device before updating tail sequence [+ + +]
Author: Zhang Yi <yi.zhang@huawei.com>
Date:   Tue Dec 3 09:44:07 2024 +0800

    jbd2: flush filesystem device before updating tail sequence
    
    [ Upstream commit a0851ea9cd555c333795b85ddd908898b937c4e1 ]
    
    When committing a transaction in jbd2_journal_commit_transaction(), the
    disk caches for the filesystem device should be flushed before updating
    the journal tail sequence. However, this step is missed if the journal
    is not located on the filesystem device. As a result, the filesystem may
    become inconsistent following a power failure or system crash. Fix it by
    ensuring that the filesystem device is flushed appropriately.
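
    The ordering rule can be illustrated with a small user-space sketch.
    The struct and function names are hypothetical and only model the
    decision, not the actual jbd2 code. It assumes that a journal sharing
    the filesystem device is already covered by the journal's own flush.

```c
#include <stdbool.h>

/* Hypothetical model of a journal; not the real journal_t. */
struct journal_model {
    int journal_dev;   /* device holding the journal */
    int fs_dev;        /* device holding the filesystem */
    bool barriers;     /* cache flushes enabled */
};

/*
 * Return true when the filesystem device needs its own cache flush
 * before the journal tail sequence is updated. Assumption: when both
 * live on one device, the journal's flush already covers the data.
 */
bool fs_dev_flush_needed(const struct journal_model *j, bool update_tail)
{
    return update_tail && j->barriers && j->fs_dev != j->journal_dev;
}
```

    With an external journal (journal_dev != fs_dev) and a pending tail
    update, the sketch reports that an explicit flush is required.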
    
    Fixes: 3339578f0578 ("jbd2: cleanup journal tail after transaction commit")
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Link: https://lore.kernel.org/r/20241203014407.805916-3-yi.zhang@huaweicloud.com
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

jbd2: increase IO priority for writing revoke records [+ + +]
Author: Zhang Yi <yi.zhang@huawei.com>
Date:   Tue Dec 3 09:44:06 2024 +0800

    jbd2: increase IO priority for writing revoke records
    
    [ Upstream commit ac1e21bd8c883aeac2f1835fc93b39c1e6838b35 ]
    
    Commit '6a3afb6ac6df ("jbd2: increase the journal IO's priority")'
    increases the priority of journal I/O by marking I/O with the
    JBD2_JOURNAL_REQ_FLAGS. However, that commit missed the revoke buffers,
    so address that kind of I/O as well.
    
    Fixes: 6a3afb6ac6df ("jbd2: increase the journal IO's priority")
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Link: https://lore.kernel.org/r/20241203014407.805916-2-yi.zhang@huaweicloud.com
    Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ksmbd: fix a missing return value check bug [+ + +]
Author: Wentao Liang <liangwentao@iscas.ac.cn>
Date:   Mon Dec 23 23:30:50 2024 +0800

    ksmbd: fix a missing return value check bug
    
    [ Upstream commit 4c16e1cadcbcaf3c82d5fc310fbd34d0f5d0db7c ]
    
    In the smb2_send_interim_resp(), if ksmbd_alloc_work_struct()
    fails to allocate a node, it returns a NULL pointer to the
    in_work pointer. This can lead to an illegal memory write of
    in_work->response_buf when allocate_interim_rsp_buf() attempts
    to perform a kzalloc() on it.
    
    To address this issue, incorporating a check for the return
    value of ksmbd_alloc_work_struct() ensures that the function
    returns immediately upon allocation failure, thereby preventing
    the aforementioned illegal memory access.
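
    A minimal user-space sketch of the pattern follows. The names are
    hypothetical stand-ins, and returning an error code is a choice of
    the sketch rather than a statement about the real function's
    signature.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ksmbd work item. */
struct work_model {
    void *response_buf;
};

/* Simulated allocator: returns NULL when asked to fail. */
static struct work_model *alloc_work_model(int fail)
{
    return fail ? NULL : calloc(1, sizeof(struct work_model));
}

/*
 * Build an interim response. Returns 0 on success or -ENOMEM when an
 * allocation fails; in_work is never dereferenced while it is NULL.
 */
int send_interim_model(int fail_alloc)
{
    struct work_model *in_work = alloc_work_model(fail_alloc);

    if (!in_work)
        return -ENOMEM;   /* bail out before touching response_buf */

    in_work->response_buf = calloc(1, 64);
    if (!in_work->response_buf) {
        free(in_work);
        return -ENOMEM;
    }

    free(in_work->response_buf);
    free(in_work);
    return 0;
}
```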
    
    Fixes: 041bba4414cd ("ksmbd: fix wrong interim response on compound")
    Signed-off-by: Wentao Liang <liangwentao@iscas.ac.cn>
    Acked-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked [+ + +]
Author: He Wang <xw897002528@gmail.com>
Date:   Mon Jan 6 03:39:54 2025 +0000

    ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked
    
    [ Upstream commit 2ac538e40278a2c0c051cca81bcaafc547d61372 ]
    
    When `ksmbd_vfs_kern_path_locked` hits an error on an entry that is not
    the last one, it exits without restoring the changed path buffer. This
    buffer may later be used as the filename for creation.
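
    The general pattern, restoring a temporarily modified buffer on every
    exit path, can be sketched in user space as below. All names are
    hypothetical and the walk is far simpler than the real lookup.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical existence check supplied by the caller. */
typedef int (*exists_fn)(const char *prefix);

/* Example callback for the sketch: only the entry "a" exists. */
int only_a_exists(const char *prefix)
{
    return strcmp(prefix, "a") == 0;
}

/*
 * Walk a relative path one component at a time, temporarily cutting
 * it at each '/' to test the prefix. The separator is put back before
 * any return, so the caller's buffer is unchanged even on failure.
 */
int walk_path_model(char *path, exists_fn exists)
{
    char *sep = path;

    while ((sep = strchr(sep, '/')) != NULL) {
        int ok;

        *sep = '\0';          /* temporarily terminate the prefix */
        ok = exists(path);
        *sep = '/';           /* restore before any return */
        if (!ok)
            return -ENOENT;
        sep++;
    }
    return exists(path) ? 0 : -ENOENT;
}
```

    After a failed walk of "a/b/c" with only_a_exists(), the buffer still
    holds "a/b/c" and can safely be reused as a creation name.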
    
    Fixes: c5a709f08d40 ("ksmbd: handle caseless file creation")
    Signed-off-by: He Wang <xw897002528@gmail.com>
    Acked-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ksmbd: Implement new SMB3 POSIX type [+ + +]
Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 7 17:41:21 2025 +0900

    ksmbd: Implement new SMB3 POSIX type
    
    commit e8580b4c600e085b3c8e6404392de2f822d4c132 upstream.
    
    As required by the SMB3 POSIX extensions specification, include the
    POSIX file type in the POSIX mode.
    
    https://www.samba.org/~slow/SMB3_POSIX/fscc_posix_extensions.html#posix-file-type-definition
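
    A user-space sketch of such an encoding is shown below. The numeric
    type values and the 12-bit shift are assumptions made for
    illustration and should be checked against the specification linked
    above; the function name is hypothetical.

```c
#include <stdint.h>

/* Linux st_mode file-type bits (octal). */
#define M_IFMT   0170000
#define M_IFSOCK 0140000
#define M_IFLNK  0120000
#define M_IFBLK  0060000
#define M_IFDIR  0040000
#define M_IFCHR  0020000
#define M_IFIFO  0010000

/* Assumed wire values; verify against the linked specification. */
enum {
    POSIX_TYPE_FILE    = 0,
    POSIX_TYPE_DIR     = 1,
    POSIX_TYPE_SYMLINK = 2,
    POSIX_TYPE_CHARDEV = 3,
    POSIX_TYPE_BLKDEV  = 4,
    POSIX_TYPE_FIFO    = 5,
    POSIX_TYPE_SOCKET  = 6,
};
#define POSIX_FILETYPE_SHIFT 12   /* assumed position of the type */

/* Keep the permission bits and place the file type above them. */
uint32_t smb3_posix_mode_model(uint32_t mode)
{
    uint32_t type;

    switch (mode & M_IFMT) {
    case M_IFDIR:  type = POSIX_TYPE_DIR;     break;
    case M_IFLNK:  type = POSIX_TYPE_SYMLINK; break;
    case M_IFCHR:  type = POSIX_TYPE_CHARDEV; break;
    case M_IFBLK:  type = POSIX_TYPE_BLKDEV;  break;
    case M_IFIFO:  type = POSIX_TYPE_FIFO;    break;
    case M_IFSOCK: type = POSIX_TYPE_SOCKET;  break;
    default:       type = POSIX_TYPE_FILE;    break;
    }
    return (mode & 07777) | (type << POSIX_FILETYPE_SHIFT);
}
```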
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Linux: Linux 6.6.72 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Fri Jan 17 13:36:27 2025 +0100

    Linux 6.6.72
    
    Link: https://lore.kernel.org/r/20250115103554.357917208@linuxfoundation.org
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Hardik Garg <hargar@linux.microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
memblock tests: fix implicit declaration of function 'numa_valid_node' [+ + +]
Author: Wei Yang <richard.weiyang@gmail.com>
Date:   Mon Jun 24 01:54:32 2024 +0000

    memblock tests: fix implicit declaration of function 'numa_valid_node'
    
    commit 9364a7e40d54e6858479f0a96e1a04aa1204be16 upstream.
    
    commit 8043832e2a12 ("memblock: use numa_valid_node() helper to check
    for invalid node ID") introduced a new helper, numa_valid_node(), which
    is not defined in memblock tests.
    
    Let's add it in the corresponding header file.
    
    Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
    CC: Mike Rapoport (IBM) <rppt@kernel.org>
    Link: https://lore.kernel.org/r/20240624015432.31134-1-richard.weiyang@gmail.com
    Signed-off-by: Mike Rapoport <rppt@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
memblock: make memblock_set_node() also warn about use of MAX_NUMNODES [+ + +]
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed May 29 09:39:10 2024 +0200

    memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
    
    [ Upstream commit e0eec24e2e199873f43df99ec39773ad3af2bff7 ]
    
    On an (old) x86 system with SRAT just covering space above 4GB:
    
        ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug
    
    the commit referenced below leads to this NUMA configuration no longer
    being refused by a CONFIG_NUMA=y kernel (previously
    
        NUMA: nodes only cover 6144MB of your 8185MB e820 RAM. Not used.
        No NUMA configuration found
        Faking a node at [mem 0x0000000000000000-0x000000027fffffff]
    
    was seen in the log directly after the message quoted above), because of
    memblock_validate_numa_coverage() checking for NUMA_NO_NODE (only). This
    in turn led to memblock_alloc_range_nid()'s warning about MAX_NUMNODES
    triggering, followed by a NULL deref in memmap_init() when trying to
    access node 64's (NODE_SHIFT=6) node data.
    
    To compensate said change, make memblock_set_node() warn on and adjust
    a passed in value of MAX_NUMNODES, just like various other functions
    already do.
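
    The "warn on and adjust" behaviour can be sketched in user space as
    follows. The function name is hypothetical, the warning is a plain
    stderr message rather than a kernel WARN, and MAX_NUMNODES is set to
    64 to match the NODE_SHIFT=6 example above.

```c
#include <stdio.h>

#define NUMA_NO_NODE (-1)
#define MAX_NUMNODES 64   /* assumed: NODE_SHIFT = 6 */

/*
 * Map the deprecated "no node" spelling MAX_NUMNODES to NUMA_NO_NODE
 * and complain about it; every other value is passed through.
 */
int sanitize_nid_model(int nid)
{
    if (nid == MAX_NUMNODES) {
        fprintf(stderr, "MAX_NUMNODES used as node ID, "
                        "use NUMA_NO_NODE instead\n");
        return NUMA_NO_NODE;
    }
    return nid;
}
```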
    
    Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware")
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/1c8a058c-5365-4f27-a9f1-3aeb7fb3e7b2@suse.com
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

memblock: use numa_valid_node() helper to check for invalid node ID [+ + +]
Author: Mike Rapoport (IBM) <rppt@kernel.org>
Date:   Fri Jun 14 11:05:43 2024 +0300

    memblock: use numa_valid_node() helper to check for invalid node ID
    
    commit 8043832e2a123fd9372007a29192f2f3ba328cd6 upstream.
    
    Introduce numa_valid_node(nid) that verifies that nid is a valid node ID
    and use that instead of comparing nid parameter with either NUMA_NO_NODE
    or MAX_NUMNODES.
    
    This makes the checks for valid node IDs consistent and more robust and
    allows to get rid of multiple WARNings.
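
    A user-space sketch of such a range check is given below. The value
    of MAX_NUMNODES is an assumption for illustration; in the kernel it
    depends on the configuration.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)
#define MAX_NUMNODES 64   /* assumed for the sketch */

/* A node ID is valid only inside [0, MAX_NUMNODES). */
static inline bool numa_valid_node(int nid)
{
    return nid >= 0 && nid < MAX_NUMNODES;
}
```

    Both NUMA_NO_NODE and MAX_NUMNODES fall outside that range, so a
    single call replaces the two separate comparisons.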
    
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling [+ + +]
Author: Rengarajan S <rengarajan.s@microchip.com>
Date:   Thu Dec 5 19:06:25 2024 +0530

    misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling
    
    commit 194f9f94a5169547d682e9bbcc5ae6d18a564735 upstream.
    
    Resolve kernel panic caused by improper handling of IRQs while
    accessing GPIO values. This is done by replacing generic_handle_irq with
    handle_nested_irq.
    
    Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Rengarajan S <rengarajan.s@microchip.com>
    Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config [+ + +]
Author: Rengarajan S <rengarajan.s@microchip.com>
Date:   Thu Dec 5 19:06:26 2024 +0530

    misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config
    
    commit c7a5378a0f707686de3ddb489f1653c523bb7dcc upstream.
    
    The driver returns -EOPNOTSUPP for unsupported parameters in set
    config, but the upper level driver checks for -ENOTSUPP. Because of the
    return code mismatch, ioctls from userspace fail. Resolve the issue by
    returning -ENOTSUPP in the unsupported case.
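
    The effect of the mismatch can be reproduced with a small user-space
    model. Both functions are hypothetical, and the assumption that the
    upper layer tolerates "not supported" only when it is reported as
    -ENOTSUPP is made for illustration. The errno values are spelled out
    because ENOTSUPP (524) is kernel-internal and absent from user-space
    headers.

```c
#define MODEL_EOPNOTSUPP 95    /* Linux EOPNOTSUPP */
#define MODEL_ENOTSUPP   524   /* kernel-internal ENOTSUPP */

/* Hypothetical driver callback: only parameter 1 is supported. */
int driver_set_config_model(int param, int unsupported_code)
{
    return param == 1 ? 0 : -unsupported_code;
}

/*
 * Hypothetical upper layer for an optional setting: a missing feature
 * is not an error, but only if the driver says so with -ENOTSUPP.
 */
int set_optional_config_model(int param, int unsupported_code)
{
    int ret = driver_set_config_model(param, unsupported_code);

    if (ret == -MODEL_ENOTSUPP)
        return 0;   /* optional feature missing: carry on */
    return ret;     /* anything else is propagated to the caller */
}
```

    With the old code the caller sees -95 and fails; with -ENOTSUPP the
    same request is treated as a harmless "not supported".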
    
    Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Rengarajan S <rengarajan.s@microchip.com>
    Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks [+ + +]
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Jul 26 17:07:27 2024 +0200

    mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks
    
    [ Upstream commit 188cac58a8bcdf82c7f63275b68f7a46871e45d6 ]
    
    Sharing page tables between processes but falling back to per-MM page
    table locks cannot possibly work.
    
    So, let's make sure that we do have split PMD locks by adding a new
    Kconfig option and letting that depend on CONFIG_SPLIT_PMD_PTLOCKS.
    
    Link: https://lkml.kernel.org/r/20240726150728.3159964-3-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Juergen Gross <jgross@suse.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm: hugetlb: independent PMD page table shared count [+ + +]
Author: Liu Shixin <liushixin2@huawei.com>
Date:   Mon Dec 16 15:11:47 2024 +0800

    mm: hugetlb: independent PMD page table shared count
    
    [ Upstream commit 59d9094df3d79443937add8700b2ef1a866b1081 ]
    
    The folio refcount may be increased unexpectedly through try_get_folio()
    by a caller such as split_huge_pages.  In huge_pmd_unshare(), we use
    refcount to check whether a pmd page table is shared.  The check is
    incorrect if the refcount is increased by the above caller, and this can
    cause the page table to be leaked:
    
     BUG: Bad page state in process sh  pfn:109324
     page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324
     flags: 0x17ffff800000000(node=0|zone=2|lastcpupid=0xfffff)
     page_type: f2(table)
     raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000
     raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000
     page dumped because: nonzero mapcount
     ...
     CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G    B              6.13.0-rc2master+ #7
     Tainted: [B]=BAD_PAGE
     Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
     Call trace:
      show_stack+0x20/0x38 (C)
      dump_stack_lvl+0x80/0xf8
      dump_stack+0x18/0x28
      bad_page+0x8c/0x130
      free_page_is_bad_report+0xa4/0xb0
      free_unref_page+0x3cc/0x620
      __folio_put+0xf4/0x158
      split_huge_pages_all+0x1e0/0x3e8
      split_huge_pages_write+0x25c/0x2d8
      full_proxy_write+0x64/0xd8
      vfs_write+0xcc/0x280
      ksys_write+0x70/0x110
      __arm64_sys_write+0x24/0x38
      invoke_syscall+0x50/0x120
      el0_svc_common.constprop.0+0xc8/0xf0
      do_el0_svc+0x24/0x38
      el0_svc+0x34/0x128
      el0t_64_sync_handler+0xc8/0xd0
      el0t_64_sync+0x190/0x198
    
    The issue may be triggered by damon, offline_page, page_idle, etc, which
    will increase the refcount of page table.
    
    1. The page table itself will be discarded after reporting the
       "nonzero mapcount".
    
    2. The HugeTLB page mapped by the page table is never freed, since we
       treat the page table as shared and a shared page table will not be
       unmapped.
    
    Fix it by introducing an independent PMD page table shared count.  As
    described by the comment, pt_index/pt_mm/pt_frag_refcount are used for
    s390 gmap, x86 pgds and powerpc, while pt_share_count is used for
    x86/arm64/riscv pmds, so we can reuse the field as pt_share_count.
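
    Why a dedicated counter helps can be seen in a small user-space
    model. The struct and helpers are hypothetical; only the field name
    pt_share_count is taken from the description above.

```c
#include <stdbool.h>

/* Hypothetical model of a PMD page table page. */
struct pt_page_model {
    int refcount;         /* generic references; anyone may take one */
    int pt_share_count;   /* number of additional sharers of the table */
};

/* Old-style check: confuses a transient reference with sharing. */
bool shared_by_refcount(const struct pt_page_model *pt)
{
    return pt->refcount > 1;
}

/* New-style check: only real sharers are counted. */
bool shared_by_share_count(const struct pt_page_model *pt)
{
    return pt->pt_share_count > 0;
}
```

    If an unrelated caller bumps refcount from 1 to 2 on an unshared
    table, the first check reports "shared" while the second does not.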
    
    Link: https://lkml.kernel.org/r/20241216071147.3984217-1-liushixin2@huawei.com
    Fixes: 39dde65c9940 ("[PATCH] shared page table for hugetlb page")
    Signed-off-by: Liu Shixin <liushixin2@huawei.com>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Ken Chen <kenneth.w.chen@intel.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mptcp: sysctl: sched: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:30 2025 +0100

    mptcp: sysctl: sched: avoid using current->nsproxy
    
    commit d38e26e36206ae3d544d496513212ae931d1da0a upstream.
    
    Using the 'net' structure via 'current' is not recommended for different
    reasons.
    
    First, if the goal is to use it to read or write per-netns data, this is
    inconsistent with how the "generic" sysctl entries do it: directly,
    using only the pointers set in the table entry, e.g. table->data. Linked
    to that, the per-netns data should always be obtained from the table
    linked to the netns it had been created for, which may not coincide with
    the reader's or writer's netns.
    
    Another reason is that access to current->nsproxy->netns can oops if
    attempted when current->nsproxy had been dropped when the current task
    is exiting. This is what syzbot found, when using acct(2):
    
      Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
      CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
      RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125
      Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00
      RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
    
      RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620
      RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028
      RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040
      R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000
      R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000
      FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       proc_sys_call_handler+0x403/0x5d0 fs/proc/proc_sysctl.c:601
       __kernel_write_iter+0x318/0xa80 fs/read_write.c:612
       __kernel_write+0xf6/0x140 fs/read_write.c:632
       do_acct_process+0xcb0/0x14a0 kernel/acct.c:539
       acct_pin_kill+0x2d/0x100 kernel/acct.c:192
       pin_kill+0x194/0x7c0 fs/fs_pin.c:44
       mnt_pin_kill+0x61/0x1e0 fs/fs_pin.c:81
       cleanup_mnt+0x3ac/0x450 fs/namespace.c:1366
       task_work_run+0x14e/0x250 kernel/task_work.c:239
       exit_task_work include/linux/task_work.h:43 [inline]
       do_exit+0xad8/0x2d70 kernel/exit.c:938
       do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
       get_signal+0x2576/0x2610 kernel/signal.c:3017
       arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337
       exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
       exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
       __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
       syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
       do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7fee3cb87a6a
      Code: Unable to access opcode bytes at 0x7fee3cb87a40.
      RSP: 002b:00007fffcccac688 EFLAGS: 00000202 ORIG_RAX: 0000000000000037
      RAX: 0000000000000000 RBX: 00007fffcccac710 RCX: 00007fee3cb87a6a
      RDX: 0000000000000041 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 0000000000000003 R08: 00007fffcccac6ac R09: 00007fffcccacac7
      R10: 00007fffcccac710 R11: 0000000000000202 R12: 00007fee3cd49500
      R13: 00007fffcccac6ac R14: 0000000000000000 R15: 00007fee3cd4b000
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125
      Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00
      RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620
      RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028
      RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040
      R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000
      R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000
      FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      ----------------
      Code disassembly (best guess), 1 bytes skipped:
         0: 42 80 3c 38 00          cmpb   $0x0,(%rax,%r15,1)
         5: 0f 85 fe 02 00 00       jne    0x309
         b: 4d 8b a4 24 08 09 00    mov    0x908(%r12),%r12
        12: 00
        13: 48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
        1a: fc ff df
        1d: 49 8d 7c 24 28          lea    0x28(%r12),%rdi
        22: 48 89 fa                mov    %rdi,%rdx
        25: 48 c1 ea 03             shr    $0x3,%rdx
      * 29: 80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
        2d: 0f 85 cc 02 00 00       jne    0x2ff
        33: 4d 8b 7c 24 28          mov    0x28(%r12),%r15
        38: 48                      rex.W
        39: 8d                      .byte 0x8d
        3a: 84 24 c8                test   %ah,(%rax,%rcx,8)
    
    Here with 'net.mptcp.scheduler', the 'net' structure is not really
    needed, because the table->data already has a pointer to the current
    scheduler, the only thing needed from the per-netns data.
    Simply use 'data', instead of getting (most of the time) the same thing,
    but from a longer and indirect way.
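
    The difference between the two access paths can be modelled in user
    space as below. All types and function names are hypothetical
    stand-ins; the NULL check in the first helper marks the spot where
    the unchecked kernel code would fault.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures involved. */
struct net_model       { char scheduler[16]; };
struct nsproxy_model   { struct net_model *net_ns; };
struct task_model      { struct nsproxy_model *nsproxy; };
struct ctl_table_model { void *data; };

/* Fragile path: goes through the task, whose nsproxy is gone on exit. */
const char *sched_via_task(const struct task_model *task)
{
    if (!task->nsproxy)
        return NULL;   /* an unchecked dereference would oops here */
    return task->nsproxy->net_ns->scheduler;
}

/* Robust path: the table entry already points at the per-netns data. */
const char *sched_via_table(const struct ctl_table_model *table)
{
    return table->data;
}
```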
    
    Fixes: 6963c508fd7a ("mptcp: only allow set existing scheduler for net.mptcp.scheduler")
    Cc: stable@vger.kernel.org
    Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/mlx5: Fix variable not being completed when function returns [+ + +]
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
Date:   Wed Jan 8 11:00:09 2025 +0800

    net/mlx5: Fix variable not being completed when function returns
    
    [ Upstream commit 0e2909c6bec9048f49d0c8e16887c63b50b14647 ]
    
    When cmd_alloc_index() fails, cmd_work_handler() needs
    to complete ent->slotted before returning early.
    Otherwise the task which issued the command may hang:
    
       mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
       INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
             Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       kworker/13:2    D    0 4055883      2 0x00000228
       Workqueue: events mlx5e_tx_dim_work [mlx5_core]
       Call trace:
          __switch_to+0xe8/0x150
          __schedule+0x2a8/0x9b8
          schedule+0x2c/0x88
          schedule_timeout+0x204/0x478
          wait_for_common+0x154/0x250
          wait_for_completion+0x28/0x38
          cmd_exec+0x7a0/0xa00 [mlx5_core]
          mlx5_cmd_exec+0x54/0x80 [mlx5_core]
          mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
          mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
          mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
          process_one_work+0x1b0/0x448
          worker_thread+0x54/0x468
          kthread+0x134/0x138
          ret_from_fork+0x10/0x18
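
    The hang comes from a waiter that is never woken. A minimal
    user-space model of the rule "signal the completion on every exit
    path" is sketched below; the struct, the boolean standing in for the
    completion, and the error code are all assumptions of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command entry; the flag models the 'slotted' completion. */
struct cmd_ent_model {
    bool slotted_done;
    int  ret;
};

/* Work handler model: the waiter is released on success and failure. */
void cmd_work_handler_model(struct cmd_ent_model *ent, bool alloc_ok)
{
    if (!alloc_ok) {
        ent->ret = -EAGAIN;
        ent->slotted_done = true;   /* complete before the early return */
        return;
    }
    ent->ret = 0;
    ent->slotted_done = true;
}
```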
    
    Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
    Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Acked-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: 802: LLC+SNAP OID:PID lookup on start of skb data [+ + +]
Author: Antonio Pastor <antonio.pastor@gmail.com>
Date:   Thu Jan 2 20:23:00 2025 -0500

    net: 802: LLC+SNAP OID:PID lookup on start of skb data
    
    [ Upstream commit 1e9b0e1c550c42c13c111d1a31e822057232abc4 ]
    
    802.2+LLC+SNAP frames received by napi_complete_done() with GRO and DSA
    have skb->transport_header set two bytes short, or pointing 2 bytes
    before network_header & skb->data. This was an issue as snap_rcv()
    expected offset to point to SNAP header (OID:PID), causing packet to
    be dropped.
    
    A fix at llc_fixup_skb() (a024e377efed) resets transport_header for any
    LLC consumers that may care about it, and stops SNAP packets from being
    dropped, but doesn't fix the problem which is that LLC and SNAP should
    not use transport_header offset.
    
    This patch eliminates the use of transport_header offset for SNAP lookup
    of OID:PID so that SNAP does not rely on the offset at all.
    The offset is reset after pull for any SNAP packet consumers that may
    (but shouldn't) use it.
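
    The idea of matching on the start of the data instead of a stored
    offset can be sketched in user space as follows. The types and the
    lookup function are hypothetical; the 5-byte identifier length
    corresponds to a 3-byte organization code plus a 2-byte protocol ID.

```c
#include <stddef.h>
#include <string.h>

#define SNAP_ID_LEN 5   /* 3-byte organization code + 2-byte PID */

/* Hypothetical packet model after the LLC header has been pulled. */
struct skb_model {
    const unsigned char *data;   /* points at the SNAP identifier */
    ptrdiff_t transport_off;     /* may be stale; deliberately unused */
};

struct snap_proto_model {
    unsigned char id[SNAP_ID_LEN];
};

/* Match the identifier at the start of data, ignoring transport_off. */
const struct snap_proto_model *
snap_lookup_model(const struct skb_model *skb,
                  const struct snap_proto_model *protos, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(skb->data, protos[i].id, SNAP_ID_LEN) == 0)
            return &protos[i];
    return NULL;
}
```

    A transport offset that is two bytes short no longer matters,
    because the lookup never reads it.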
    
    Fixes: fda55eca5a33 ("net: introduce skb_transport_header_was_set()")
    Signed-off-by: Antonio Pastor <antonio.pastor@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20250103012303.746521-1-antonio.pastor@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: libwx: fix firmware mailbox abnormal return [+ + +]
Author: Jiawen Wu <jiawenwu@trustnetic.com>
Date:   Fri Jan 3 16:10:13 2025 +0800

    net: libwx: fix firmware mailbox abnormal return
    
    [ Upstream commit 8ce4f287524c74a118b0af1eebd4b24a8efca57a ]
    
    The existing SW-FW interaction flow in the driver is wrong. Following this
    wrong flow, the driver would never return an error for an unknown command,
    because firmware writes back both 'firmware ready' and 'unknown command'
    in the mailbox message when the driver sends an unknown command. Reading
    'firmware ready' therefore does not time out, and the driver mistakenly
    believes that the interaction has completed successfully.
    
    It tends to happen with the use of custom firmware. Move the check for
    'unknown command' out of the poll timeout for 'firmware ready', and adjust
    the debug log so that mailbox messages are always printed when commands
    time out.
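
    The corrected control flow can be sketched in user space as below.
    The status bits, the callback type and the error codes are
    hypothetical; the point is only that the 'unknown command' check
    runs after the poll instead of inside its timeout branch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MBX_FW_READY    0x1u   /* hypothetical status bit */
#define MBX_UNKNOWN_CMD 0x2u   /* hypothetical status bit */

typedef uint32_t (*read_status_fn)(void *ctx);

/* Example callback for the sketch: returns a fixed status word. */
uint32_t fixed_status_model(void *ctx)
{
    return *(const uint32_t *)ctx;
}

int mbx_wait_model(read_status_fn read_status, void *ctx, int max_polls)
{
    uint32_t status = 0;
    bool ready = false;

    for (int i = 0; i < max_polls && !ready; i++) {
        status = read_status(ctx);
        ready = (status & MBX_FW_READY) != 0;
    }
    if (!ready)
        return -ETIMEDOUT;

    /* Checked after the poll, so it is seen even when ready is set. */
    if (status & MBX_UNKNOWN_CMD)
        return -EINVAL;
    return 0;
}
```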
    
    Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware")
    Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
    Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac-tegra: Read iommu stream id from device tree [+ + +]
Author: Parker Newman <pnewman@connecttech.com>
Date:   Tue Jan 7 16:24:59 2025 -0500

    net: stmmac: dwmac-tegra: Read iommu stream id from device tree
    
    [ Upstream commit 426046e2d62dd19533808661e912b8e8a9eaec16 ]
    
    Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be
    written to the MGBE_WRAP_AXI_ASID0_CTRL register.
    
    The current driver is hard-coded to use MGBE0's SID for all controllers.
    This causes softirq timeouts and kernel panics when using controllers
    other than MGBE0.
    
    Example dmesg errors when an ethernet cable is connected to MGBE1:
    
    [  116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
    [  121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms
    [  121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter.
    [  121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
    [  121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171)
    [  121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features
    [  121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
    [  121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock
    [  121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode
    [  125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
    [  181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
    [  181.921404] rcu:     7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337
    [  181.921684] rcu:     (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8)
    [  181.921878] Sending NMI from CPU 4 to CPUs 7:
    [  181.921886] NMI backtrace for cpu 7
    [  181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
    [  181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
    [  181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  181.922847] pc : handle_softirqs+0x98/0x368
    [  181.922978] lr : __do_softirq+0x18/0x20
    [  181.923095] sp : ffff80008003bf50
    [  181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
    [  181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
    [  181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
    [  181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
    [  181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
    [  181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
    [  181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
    [  181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74
    [  181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1
    [  181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
    [  181.967591] Call trace:
    [  181.970043]  handle_softirqs+0x98/0x368 (P)
    [  181.974240]  __do_softirq+0x18/0x20
    [  181.977743]  ____do_softirq+0x14/0x28
    [  181.981415]  call_on_irq_stack+0x24/0x30
    [  181.985180]  do_softirq_own_stack+0x20/0x30
    [  181.989379]  __irq_exit_rcu+0x114/0x140
    [  181.993142]  irq_exit_rcu+0x14/0x28
    [  181.996816]  el1_interrupt+0x44/0xb8
    [  182.000316]  el1h_64_irq_handler+0x14/0x20
    [  182.004343]  el1h_64_irq+0x80/0x88
    [  182.007755]  cpuidle_enter_state+0xc4/0x4a8 (P)
    [  182.012305]  cpuidle_enter+0x3c/0x58
    [  182.015980]  cpuidle_idle_call+0x128/0x1c0
    [  182.020005]  do_idle+0xe0/0xf0
    [  182.023155]  cpu_startup_entry+0x3c/0x48
    [  182.026917]  secondary_start_kernel+0xdc/0x120
    [  182.031379]  __secondary_switched+0x74/0x78
    [  212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/.
    [  212.985935] rcu: blocking rcu_node structures (internal RCU debug):
    [  212.992758] Sending NMI from CPU 0 to CPUs 7:
    [  212.998539] NMI backtrace for cpu 7
    [  213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
    [  213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
    [  213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  213.040528] pc : handle_softirqs+0x98/0x368
    [  213.046563] lr : __do_softirq+0x18/0x20
    [  213.051293] sp : ffff80008003bf50
    [  213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
    [  213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
    [  213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
    [  213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
    [  213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
    [  213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
    [  213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
    [  213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608
    [  213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04
    [  213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
    [  213.162063] Call trace:
    [  213.165494]  handle_softirqs+0x98/0x368 (P)
    [  213.171256]  __do_softirq+0x18/0x20
    [  213.177291]  ____do_softirq+0x14/0x28
    [  213.182017]  call_on_irq_stack+0x24/0x30
    [  213.186565]  do_softirq_own_stack+0x20/0x30
    [  213.191815]  __irq_exit_rcu+0x114/0x140
    [  213.196891]  irq_exit_rcu+0x14/0x28
    [  213.202401]  el1_interrupt+0x44/0xb8
    [  213.207741]  el1h_64_irq_handler+0x14/0x20
    [  213.213519]  el1h_64_irq+0x80/0x88
    [  213.217541]  cpuidle_enter_state+0xc4/0x4a8 (P)
    [  213.224364]  cpuidle_enter+0x3c/0x58
    [  213.228653]  cpuidle_idle_call+0x128/0x1c0
    [  213.233993]  do_idle+0xe0/0xf0
    [  213.237928]  cpu_startup_entry+0x3c/0x48
    [  213.243791]  secondary_start_kernel+0xdc/0x120
    [  213.249830]  __secondary_switched+0x74/0x78
    
    This bug has existed since the dwmac-tegra driver was added in Dec 2022
    (See Fixes tag below for commit hash).
    
    The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit
    only uses MGBE0 which is why the bug was not found previously. Connect Tech
    has many products that use 2 (or more) MGBE controllers.
    
    The solution is to read the controller's SID from the existing "iommus"
    device tree property. The 2nd field of the "iommus" device tree property
    is the controller's SID.
    
    Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
    
    smmu_niso0: iommu@12000000 {
            compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500";
    ...
    }
    
    /* MGBE1 */
    ethernet@6900000 {
            compatible = "nvidia,tegra234-mgbe";
    ...
            iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>;
    ...
    }
    
    Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in
    the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the
    SID using the tegra_dev_iommu_get_stream_id() helper function found in
    linux/iommu.h.
    
    Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus"
    property is removed from the device tree or the IOMMU is disabled.
    
    While the Tegra234 SOC technically supports bypassing the IOMMU, doing so
    is not supported by the current firmware, has not been tested, and is not
    recommended. A more detailed discussion with Thierry Reding from Nvidia is
    linked below.
    
    Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
    Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com
    Signed-off-by: Parker Newman <pnewman@connecttech.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Acked-by: Thierry Reding <treding@nvidia.com>
    Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360.git.pnewman@connecttech.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Jan 3 10:45:46 2025 +0000

    net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute
    
    [ Upstream commit a039e54397c6a75b713b9ce7894a62e06956aa92 ]
    
    syzbot found that the TCA_FLOW_RSHIFT attribute was not validated.
    Right shifting a 32-bit integer is undefined for large shift values.
    
    UBSAN: shift-out-of-bounds in net/sched/cls_flow.c:329:23
    shift exponent 9445 is too large for 32-bit type 'u32' (aka 'unsigned int')
    CPU: 1 UID: 0 PID: 54 Comm: kworker/u8:3 Not tainted 6.13.0-rc3-syzkaller-00180-g4f619d518db9 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    Workqueue: ipv6_addrconf addrconf_dad_work
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:94 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
      ubsan_epilogue lib/ubsan.c:231 [inline]
      __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468
      flow_classify+0x24d5/0x25b0 net/sched/cls_flow.c:329
      tc_classify include/net/tc_wrapper.h:197 [inline]
      __tcf_classify net/sched/cls_api.c:1771 [inline]
      tcf_classify+0x420/0x1160 net/sched/cls_api.c:1867
      sfb_classify net/sched/sch_sfb.c:260 [inline]
      sfb_enqueue+0x3ad/0x18b0 net/sched/sch_sfb.c:318
      dev_qdisc_enqueue+0x4b/0x290 net/core/dev.c:3793
      __dev_xmit_skb net/core/dev.c:3889 [inline]
      __dev_queue_xmit+0xf0e/0x3f50 net/core/dev.c:4400
      dev_queue_xmit include/linux/netdevice.h:3168 [inline]
      neigh_hh_output include/net/neighbour.h:523 [inline]
      neigh_output include/net/neighbour.h:537 [inline]
      ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236
      iptunnel_xmit+0x55d/0x9b0 net/ipv4/ip_tunnel_core.c:82
      udp_tunnel_xmit_skb+0x262/0x3b0 net/ipv4/udp_tunnel_core.c:173
      geneve_xmit_skb drivers/net/geneve.c:916 [inline]
      geneve_xmit+0x21dc/0x2d00 drivers/net/geneve.c:1039
      __netdev_start_xmit include/linux/netdevice.h:5002 [inline]
      netdev_start_xmit include/linux/netdevice.h:5011 [inline]
      xmit_one net/core/dev.c:3590 [inline]
      dev_hard_start_xmit+0x27a/0x7d0 net/core/dev.c:3606
      __dev_queue_xmit+0x1b73/0x3f50 net/core/dev.c:4434
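
    A minimal userspace sketch of the kind of range check such validation
    implies. It is not the upstream patch: the helper names and the
    RSHIFT_MAX macro are hypothetical and exist only for illustration. The
    point is that a shift count for a 32-bit operand has to be in the range
    0..31 before it reaches the '>>' operator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest shift count that is well defined for a 32-bit operand.
 * Hypothetical name, for illustration only. */
#define RSHIFT_MAX 31u

/* Return true when 'rshift' may safely be used to shift a uint32_t. */
bool rshift_is_valid(uint32_t rshift)
{
	return rshift <= RSHIFT_MAX;
}

/*
 * Shift 'value' right by 'rshift' bits. Out-of-range counts are rejected
 * up front and '*result' is left untouched, so the undefined shift is
 * never evaluated.
 */
bool checked_rshift(uint32_t value, uint32_t rshift, uint32_t *result)
{
	if (!rshift_is_valid(rshift))
		return false;

	*result = value >> rshift;
	return true;
}
```

    With this check in place, the shift exponent 9445 from the report above
    would be rejected when the attribute is parsed rather than being used
    later on the classification path.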
    
    Fixes: e5dfb815181f ("[NET_SCHED]: Add flow classifier")
    Reported-by: syzbot+1dbb57d994e54aaa04d2@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/6777bf49.050a0220.178762.0040.GAE@google.com/T/#u
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20250103104546.3714168-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netfilter: conntrack: clamp maximum hashtable size to INT_MAX [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Wed Jan 8 22:56:33 2025 +0100

    netfilter: conntrack: clamp maximum hashtable size to INT_MAX
    
    [ Upstream commit b541ba7d1f5a5b7b3e2e22dc9e40e18a7d6dbc13 ]
    
    Use INT_MAX as the maximum size for the conntrack hashtable. Otherwise, it
    is possible to hit WARN_ON_ONCE in __kvmalloc_node_noprof() when
    resizing the hashtable because __GFP_NOWARN is unset. See:
    
      0708a0afe291 ("mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls")
    
    Note: hashtable resize is only possible from init_netns.
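
    The clamping idea can be sketched in plain C as follows. This is an
    illustration only, not the netfilter code: clamp_hashsize() is a
    hypothetical helper and the real resize path is not reproduced here.

```c
#include <limits.h>

/*
 * Hypothetical helper: limit a requested bucket count to INT_MAX so that
 * the size handed to the allocator can never exceed that bound.
 */
unsigned int clamp_hashsize(unsigned long long requested)
{
	if (requested > (unsigned long long)INT_MAX)
		return (unsigned int)INT_MAX;

	return (unsigned int)requested;
}
```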
    
    Fixes: 9cc1c73ad666 ("netfilter: conntrack: avoid integer overflow when resizing")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: imbalance in flowtable binding [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Jan 2 13:01:13 2025 +0100

    netfilter: nf_tables: imbalance in flowtable binding
    
    [ Upstream commit 13210fc63f353fe78584048079343413a3cdf819 ]
    
    All these cases cause imbalance between BIND and UNBIND calls:
    
    - Delete an interface from a flowtable with multiple interfaces
    
    - Add a (device to a) flowtable with --check flag
    
    - Delete a netns containing a flowtable
    
    - In an interactive nft session, create a table with owner flag and
      flowtable inside, then quit.
    
    Fix it by calling FLOW_BLOCK_UNBIND when unregistering hooks, then
    remove late FLOW_BLOCK_UNBIND call when destroying flowtable.
    
    Fixes: ff4bf2f42a40 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()")
    Reported-by: Phil Sutter <phil@nwl.cc>
    Tested-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ovl: do not encode lower fh with upper sb_writers held [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Wed Aug 16 16:47:59 2023 +0300

    ovl: do not encode lower fh with upper sb_writers held
    
    [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
    
    When lower fs is a nested overlayfs, calling encode_fh() on a lower
    directory dentry may trigger copy up and take sb_writers on the upper fs
    of the lower nested overlayfs.
    
    The lower nested overlayfs may have the same upper fs as this overlayfs,
    so taking a nested sb_writers lock is illegal.
    
    Move all the callers that encode lower fh to before ovl_want_write().
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ovl: pass realinode to ovl_encode_real_fh() instead of realdentry [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Sun Jan 5 17:24:03 2025 +0100

    ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
    
    [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
    
    We want to be able to encode an fid from an inode with no alias.
    
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ovl: support encoding fid from inode with no alias [+ + +]
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Sun Jan 5 17:24:04 2025 +0100

    ovl: support encoding fid from inode with no alias
    
    [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
    
    Dmitry Safonov reported that a WARN_ON() assertion can be triggered by
    userspace when calling inotify_show_fdinfo() for an overlayfs watched
    inode, whose dentry aliases were discarded with drop_caches.
    
    The WARN_ON() assertion in inotify_show_fdinfo() was removed, because
    it is possible for encoding a file handle to fail for other reasons, but
    the impact of failing to encode an overlayfs file handle goes beyond
    this assertion.
    
    As shown in the LTP test case mentioned in the link below, failure to
    encode an overlayfs file handle from a non-aliased inode also leads to
    failure to report an fid with FAN_DELETE_SELF fanotify events.
    
    As Dmitry notes in his analysis of the problem, ovl_encode_fh() fails
    if it cannot find an alias for the inode, but this failure can be fixed.
    ovl_encode_fh() seldom uses the alias and in the case of non-decodable
    file handles, as is often the case with fanotify fid info,
    ovl_encode_fh() never needs to use the alias to encode a file handle.
    
    Defer finding an alias until it is actually needed so ovl_encode_fh()
    will not fail in the common case of FAN_DELETE_SELF fanotify events.
    
    Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles")
    Reported-by: Dmitry Safonov <dima@arista.com>
    Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pds_core: limit loop over fw name list [+ + +]
Author: Shannon Nelson <shannon.nelson@amd.com>
Date:   Fri Jan 3 11:51:47 2025 -0800

    pds_core: limit loop over fw name list
    
    [ Upstream commit 8c817eb26230dc0ae553cee16ff43a4a895f6756 ]
    
    Add an array size limit to the for-loop to be sure we don't try
    to reference a fw_version string off the end of the fw info names
    array.  We know that our firmware only has a limited number
    of firmware slot names, but we shouldn't leave this unchecked.
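
    A small sketch of the bounded loop described above, assuming a
    fixed-size name table. The slot labels and the collect_slot_names()
    helper are hypothetical; the driver's real table and field names are
    not reproduced here.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical slot labels standing in for the driver's name table. */
static const char *const slot_names[] = { "slot-a", "slot-b", "slot-c" };

/*
 * Copy name pointers for the firmware entries a device reports.
 * 'reported' comes from the device and is not trusted: the loop also
 * stops at the end of the name table and at the end of 'out'.
 * Returns the number of entries written.
 */
size_t collect_slot_names(size_t reported, const char **out, size_t out_len)
{
	size_t i;

	for (i = 0; i < reported && i < ARRAY_SIZE(slot_names) && i < out_len; i++)
		out[i] = slot_names[i];

	return i;
}
```

    Even if the device claims far more entries than the table holds, the
    loop never indexes past the last name.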
    
    Fixes: 45d76f492938 ("pds_core: set up device and adminq")
    Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Brett Creeley <brett.creeley@amd.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pgtable: fix s390 ptdesc field comments [+ + +]
Author: Alexander Gordeev <agordeev@linux.ibm.com>
Date:   Tue Nov 21 20:43:49 2023 +0100

    pgtable: fix s390 ptdesc field comments
    
    [ Upstream commit 38ca8a185389716e9f7566bce4bb0085f71da61d ]
    
    Patch series "minor ptdesc updates", v3.
    
    This patch (of 2):
    
    Since commit d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE") there
    is no fragmented page tracking on s390.  Fix the corresponding comments.
    
    Link: https://lkml.kernel.org/r/cover.1700594815.git.agordeev@linux.ibm.com
    Link: https://lkml.kernel.org/r/2eead241f3a45bed26c7911cf66bded1e35670b8.1700594815.git.agordeev@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Suggested-by: Heiko Carstens <hca@linux.ibm.com>
    Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it [+ + +]
Author: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Date:   Mon Jan 6 18:40:34 2025 +0100

    platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it
    
    [ Upstream commit dd410d784402c5775f66faf8b624e85e41c38aaf ]
    
    Wakeup for IRQ1 should be disabled only in cases where i8042 had
    actually enabled it, otherwise "wake_depth" for this IRQ will try to
    drop below zero and there will be an unpleasant WARN() logged:
    
    kernel: atkbd serio0: Disabling IRQ1 wakeup source to avoid platform firmware bug
    kernel: ------------[ cut here ]------------
    kernel: Unbalanced IRQ 1 wake disable
    kernel: WARNING: CPU: 10 PID: 6431 at kernel/irq/manage.c:920 irq_set_irq_wake+0x147/0x1a0
    
    The PMC driver uses DEFINE_SIMPLE_DEV_PM_OPS() to define its dev_pm_ops
    which sets amd_pmc_suspend_handler() to the .suspend, .freeze, and
    .poweroff handlers. i8042_pm_suspend(), however, is only set as
    the .suspend handler.
    
    Fix the issue by calling the PMC suspend handler only from the same set
    of dev_pm_ops handlers as i8042_pm_suspend(), which currently means just
    the .suspend handler.
    
    To reproduce this issue try hibernating (S4) the machine after a fresh boot
    without putting it into s2idle first.
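
    The imbalance can be modelled with a toy reference counter. This is a
    simplified userspace model, not kernel code: struct wake_state, the
    pm_event values and all function names are hypothetical. It only
    illustrates why a disable on the freeze path has no matching enable.

```c
#include <stdbool.h>

/* Toy model of one IRQ's wake reference count. */
struct wake_state {
	unsigned int depth;  /* outstanding wake enables */
	bool unbalanced;     /* set when a disable had no matching enable */
};

enum pm_event { PM_EV_SUSPEND, PM_EV_FREEZE, PM_EV_POWEROFF };

void wake_enable(struct wake_state *w)
{
	w->depth++;
}

void wake_disable(struct wake_state *w)
{
	if (w->depth == 0) {
		w->unbalanced = true;  /* where the real kernel would WARN */
		return;
	}
	w->depth--;
}

/* Models the keyboard driver: wakeup is enabled on the suspend path only. */
void keyboard_pm(struct wake_state *w, enum pm_event ev)
{
	if (ev == PM_EV_SUSPEND)
		wake_enable(w);
}

/* Models the old behaviour: disable wakeup on every PM event. */
void pmc_pm_old(struct wake_state *w, enum pm_event ev)
{
	(void)ev;
	wake_disable(w);
}

/* Models the fixed behaviour: disable only where the enable happened. */
void pmc_pm_fixed(struct wake_state *w, enum pm_event ev)
{
	if (ev == PM_EV_SUSPEND)
		wake_disable(w);
}
```

    Running keyboard_pm() followed by pmc_pm_old() with PM_EV_FREEZE sets
    the unbalanced flag, while the same sequence with pmc_pm_fixed() leaves
    the counter at zero and the flag clear.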
    
    Fixes: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN")
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
    Link: https://lore.kernel.org/r/c8f28c002ca3c66fbeeb850904a1f43118e17200.1736184606.git.mail@maciej.szmigiero.name
    [ij: edited the commit message.]
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Sun Dec 15 12:01:59 2024 +0900

    pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()
    
    [ Upstream commit 469c0682e03d67d8dc970ecaa70c2d753057c7c0 ]
    
    imx_gpcv2_probe() leaks an OF node reference obtained by
    of_get_child_by_name(). Fix it by declaring the device node with the
    __free(device_node) cleanup construct.
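
    The scope-based cleanup pattern can be sketched in userspace with the
    GCC/Clang cleanup attribute. This is an illustration under that
    assumption, not the driver code: struct node, node_get(), node_put()
    and AUTO_PUT are hypothetical stand-ins for a device node and its
    reference helpers.

```c
/* Toy reference-counted node; stands in for a device tree node. */
struct node {
	int refcount;
};

struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Cleanup callback: receives a pointer to the annotated variable. */
void node_put_auto(struct node **np)
{
	node_put(*np);
}

/* Requires GCC or Clang: runs node_put_auto() when the variable leaves scope. */
#define AUTO_PUT __attribute__((cleanup(node_put_auto)))

/*
 * Returns 0 on success and -1 on the early-error path. The reference
 * taken at the top is dropped on both paths without an explicit put.
 */
int probe_sketch(struct node *parent, int fail_early)
{
	struct node *child AUTO_PUT = node_get(parent);

	(void)child;
	if (fail_early)
		return -1;

	return 0;
}
```

    Because the release is tied to the variable's scope, an early return
    cannot skip it, which is the class of leak described above.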
    
    This bug was found by an experimental static analysis tool that I am
    developing.
    
    Fixes: 03aa12629fc4 ("soc: imx: Add GPCv2 power gating driver")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Cc: stable@vger.kernel.org
    Message-ID: <20241215030159.1526624-1-joe@pf.is.s.u-tokyo.ac.jp>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pmdomain: imx: gpcv2: Simplify with scoped for each OF child loop [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Fri Aug 23 14:51:08 2024 +0200

    pmdomain: imx: gpcv2: Simplify with scoped for each OF child loop
    
    [ Upstream commit 13bd778c900537f3fff7cfb671ff2eb0e92feee6 ]
    
    Use scoped for_each_child_of_node_scoped() when iterating over device
    nodes to make code a bit simpler.
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-4-8320722eaf39@linaro.org
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Stable-dep-of: 469c0682e03d ("pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
riscv: Fix early ftrace nop patching [+ + +]
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Thu May 23 13:51:34 2024 +0200

    riscv: Fix early ftrace nop patching
    
    commit 6ca445d8af0ed5950ebf899415fd6bfcd7d9d7a3 upstream.
    
    Commit c97bf629963e ("riscv: Fix text patching when IPI are used")
    converted ftrace_make_nop() to use patch_insn_write() which does not
    emit any icache flush relying entirely on __ftrace_modify_code() to do
    that.
    
    But we missed that ftrace_make_nop() was called very early directly when
    converting mcount calls into nops (actually on riscv it converts 2B nops
    emitted by the compiler into 4B nops).
    
    This caused crashes on multiple HW as reported by Conor and Björn since
    the booting core could have half-patched instructions in its icache
    which would trigger an illegal instruction trap: fix this by emitting a
    local flush icache when early patching nops.
    
    Fixes: c97bf629963e ("riscv: Fix text patching when IPI are used")
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reported-by: Conor Dooley <conor.dooley@microchip.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
    Tested-by: Björn Töpel <bjorn@rivosinc.com>
    Link: https://lore.kernel.org/r/20240523115134.70380-1-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: Fix sleeping in invalid context in die() [+ + +]
Author: Nam Cao <namcao@linutronix.de>
Date:   Mon Nov 18 10:13:33 2024 +0100

    riscv: Fix sleeping in invalid context in die()
    
    commit 6a97f4118ac07cfdc316433f385dbdc12af5025e upstream.
    
    die() can be called in an exception handler, and therefore cannot sleep.
    However, die() takes a spinlock_t, which can sleep with PREEMPT_RT enabled.
    That causes the following warning:
    
    BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
    in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
    preempt_count: 110001, expected: 0
    RCU nest depth: 0, expected: 0
    CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
    Hardware name: riscv-virtio,qemu (DT)
    Call Trace:
        dump_backtrace+0x1c/0x24
        show_stack+0x2c/0x38
        dump_stack_lvl+0x5a/0x72
        dump_stack+0x14/0x1c
        __might_resched+0x130/0x13a
        rt_spin_lock+0x2a/0x5c
        die+0x24/0x112
        do_trap_insn_illegal+0xa0/0xea
        _new_vmalloc_restore_context_a0+0xcc/0xd8
    Oops - illegal instruction [#1]
    
    Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
    enabled.
    
    Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
    Signed-off-by: Nam Cao <namcao@linutronix.de>
    Cc: stable@vger.kernel.org
    Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: Fix text patching when IPI are used [+ + +]
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Thu Feb 29 13:10:56 2024 +0100

    riscv: Fix text patching when IPI are used
    
    [ Upstream commit c97bf629963e52b205ed5fbaf151e5bd342f9c63 ]
    
    For now, we use stop_machine() to patch the text and when we use IPIs for
    remote icache flushes (which are emitted in patch_text_nosync()), the
    system hangs.
    
    So instead, make sure every CPU executes the stop_machine() patching
    function and emit a local icache flush there.
    
    Co-developed-by: Björn Töpel <bjorn@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
    Link: https://lore.kernel.org/r/20240229121056.203419-3-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Stable-dep-of: 13134cc94914 ("riscv: kprobes: Fix incorrect address calculation")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: kprobes: Fix incorrect address calculation [+ + +]
Author: Nam Cao <namcao@linutronix.de>
Date:   Tue Nov 19 12:10:56 2024 +0100

    riscv: kprobes: Fix incorrect address calculation
    
    commit 13134cc949148e1dfa540a0fe5dc73569bc62155 upstream.
    
    p->ainsn.api.insn is a pointer to u32, therefore offsets added to it are
    scaled by four. This is clearly undesirable in this case.
    
    Cast it to (void *) first before any calculation.
    
    Below is a sample before/after. The dumped memory is two kprobe slots, the
    first slot has
    
      - c.addiw a0, 0x1c (0x7125)
      - ebreak           (0x00100073)
    
    and the second slot has:
    
      - c.addiw a0, -4   (0x7135)
      - ebreak           (0x00100073)
    
    Before this patch:
    
    (gdb) x/16xh 0xff20000000135000
    0xff20000000135000:     0x7125  0x0000  0x0000  0x0000  0x7135  0x0010  0x0000  0x0000
    0xff20000000135010:     0x0073  0x0010  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
    
    After this patch:
    
    (gdb) x/16xh 0xff20000000125000
    0xff20000000125000:     0x7125  0x0073  0x0010  0x0000  0x7135  0x0073  0x0010  0x0000
    0xff20000000125010:     0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
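
    The scaling effect can be reproduced in a few lines of portable C. This
    is a sketch, not the kprobes code: the function names are hypothetical,
    and it casts to (unsigned char *) where the patch uses (void *), since
    arithmetic on void pointers is a GNU extension.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte distance covered by 'base + len' when 'base' is a uint32_t
 * pointer. 'len' must not exceed 8 so the result stays within the array.
 */
size_t offset_via_u32_ptr(size_t len)
{
	uint32_t slot[8] = { 0 };
	uint32_t *base = slot;

	return (size_t)((unsigned char *)(base + len) - (unsigned char *)base);
}

/* The same calculation after casting to a byte pointer first. */
size_t offset_via_byte_ptr(size_t len)
{
	uint32_t slot[8] = { 0 };
	unsigned char *base = (unsigned char *)slot;

	return (size_t)((base + len) - base);
}
```

    For a 2-byte compressed instruction, the first variant lands 8 bytes
    into the slot and the second lands 2 bytes in, which matches where the
    ebreak encoding appears in the two dumps above.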
    
    Fixes: b1756750a397 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
    Signed-off-by: Nam Cao <namcao@linutronix.de>
    Cc: stable@vger.kernel.org
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    [rebase to v6.6]
    Signed-off-by: Nam Cao <namcao@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: mm: Fix the out of bound issue of vmemmap address [+ + +]
Author: Xu Lu <luxu.kernel@bytedance.com>
Date:   Mon Dec 9 20:26:17 2024 +0800

    riscv: mm: Fix the out of bound issue of vmemmap address
    
    [ Upstream commit f754f27e98f88428aaf6be6e00f5cbce97f62d4b ]
    
    In the sparse vmemmap model, the virtual address of vmemmap is calculated as:
    ((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)).
    And the struct page's va can be calculated with an offset:
    (vmemmap + (pfn)).
    
    However, when initializing struct pages, the kernel actually starts from
    the first page of the section that phys_ram_base belongs to. If the
    first page's pfn is not (phys_ram_base >> PAGE_SHIFT), then we get a va
    below VMEMMAP_START when calculating the va for its struct page.
    
    For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the
    first page in the same section is actually pfn 0x80000. During
    init_unavailable_range(), we will initialize struct page for pfn 0x80000
    with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is
    below VMEMMAP_START as well as PCI_IO_END.
    
    This commit fixes this bug by introducing a new variable
    'vmemmap_start_pfn' which is aligned with memory section size and using
    it to calculate vmemmap address instead of phys_ram_base.
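
    The arithmetic in the example above can be checked with a short sketch.
    The helper names are hypothetical, and the section size of 0x8000 pages
    used in the note below is an assumption (128 MiB sections with 4 KiB
    pages) chosen so that pfn 0x82000 falls in a section starting at pfn
    0x80000.

```c
#include <stdint.h>

/*
 * Round a page frame number down to the start of its memory section.
 * 'pages_per_section' is assumed to be a power of two.
 */
uint64_t section_start_pfn(uint64_t pfn, uint64_t pages_per_section)
{
	return pfn & ~(pages_per_section - 1);
}

/*
 * Index of 'pfn' in the struct page array when the array's first entry
 * corresponds to 'base_pfn'. A negative result means the entry would lie
 * below the start of the array.
 */
int64_t vmemmap_index(uint64_t pfn, uint64_t base_pfn)
{
	return (int64_t)pfn - (int64_t)base_pfn;
}
```

    With base pfn 0x82000, pfn 0x80000 gets index -0x2000, that is, below
    the start of the array. Using section_start_pfn(0x82000, 0x8000) as the
    base instead gives index 0.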
    
    Fixes: a11dd49dcb93 ("riscv: Sparse-Memory/vmemmap out-of-bounds fix")
    Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Tested-by: Björn Töpel <bjorn@rivosinc.com>
    Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
    Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
sched: sch_cake: add bounds checks to host bulk flow fairness counts [+ + +]
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date:   Tue Jan 7 13:01:05 2025 +0100

    sched: sch_cake: add bounds checks to host bulk flow fairness counts
    
    [ Upstream commit 737d4d91d35b5f7fa5bb442651472277318b0bfd ]
    
    Even though we fixed a logic error in the commit cited below, syzbot
    still managed to trigger an underflow of the per-host bulk flow
    counters, leading to an out of bounds memory access.
    
    To avoid any such logic errors causing out of bounds memory accesses,
    this commit factors out all accesses to the per-host bulk flow counters
    to a series of helpers that perform bounds-checking before any
    increments and decrements. This also has the benefit of improving
    readability by moving the conditional checks for the flow mode into
    these helpers, instead of having them spread out throughout the
    code (which was the cause of the original logic error).
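
    A rough sketch of what such bounds-checked helpers could look like.
    This is not the sch_cake code: the flow_mode values, struct host_entry,
    BULK_FLOWS_MAX and the helper names are all hypothetical. It only shows
    the pattern of folding the mode check and the bounds check into a
    single accessor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow modes: only the host-fairness mode tracks per-host counts. */
enum flow_mode { MODE_FLOWS_ONLY, MODE_HOST_FAIRNESS };

/* Hypothetical upper bound on bulk flows attributed to one host. */
#define BULK_FLOWS_MAX 1024u

struct host_entry {
	uint16_t bulk_flow_count;
};

/* Increment the counter; refuse if the mode is wrong or the bound is hit. */
bool host_bulk_inc(struct host_entry *h, enum flow_mode mode)
{
	if (mode != MODE_HOST_FAIRNESS)
		return false;
	if (h->bulk_flow_count >= BULK_FLOWS_MAX)
		return false;

	h->bulk_flow_count++;
	return true;
}

/* Decrement the counter; refuse instead of wrapping below zero. */
bool host_bulk_dec(struct host_entry *h, enum flow_mode mode)
{
	if (mode != MODE_HOST_FAIRNESS)
		return false;
	if (h->bulk_flow_count == 0)
		return false;

	h->bulk_flow_count--;
	return true;
}
```

    A stray decrement then becomes a refused operation instead of an
    unsigned wrap-around that is later used as an index.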
    
    As part of this change, the flow quantum calculation is consolidated
    into a helper function, which means that the dithering applied to the
    host load scaling is now applied both in the DRR rotation and when a
    sparse flow's quantum is first initiated. The only user-visible effect
    of this is that the maximum packet size that can be sent while a flow
    stays sparse will now vary with +/- one byte in some cases. This should
    not make a noticeable difference in practice, and thus it's not worth
    complicating the code to preserve the old behaviour.
    
    Fixes: 546ea84d07e3 ("sched: sch_cake: fix bulk flow accounting logic for host fairness")
    Reported-by: syzbot+f63600d288bfb7057424@syzkaller.appspotmail.com
    Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Acked-by: Dave Taht <dave.taht@gmail.com>
    Link: https://patch.msgid.link/20250107120105.70685-1-toke@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence() [+ + +]
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date:   Thu Dec 19 22:20:41 2024 +0530

    scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
    
    commit 7bac65687510038390a0a54cbe14fba08d037e46 upstream.
    
    PHY might already be powered on during ufs_qcom_power_up_sequence() in a
    couple of cases:
    
     1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk
    
     2. Resuming from spm_lvl = 5 suspend
    
    In those cases, it is necessary to call phy_power_off() and phy_exit() in
    ufs_qcom_power_up_sequence() function to power off the PHY before calling
    phy_init() and phy_power_on().
    
    Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is
    not handled. So to satisfy both cases, call phy_power_off() and phy_exit()
    if the phy_count is non-zero. And with this change, the reinit_notify()
    callback is no longer needed.
    
    This fixes the below UFS resume failure with spm_lvl = 5:
    
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
    ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
    ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5
    ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5
    ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5
    
    Cc: stable@vger.kernel.org # 6.3
    Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device")
    Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com>
    Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
sctp: sysctl: auth_enable: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:34 2025 +0100

    sctp: sysctl: auth_enable: avoid using current->nsproxy
    
    commit 15649fd5415eda664ef35780c2013adeb5d9c695 upstream.
    
    As mentioned in a previous commit of this series, using the 'net'
    structure via 'current' is not recommended for different reasons:
    
    - Inconsistency: getting info from the reader's/writer's netns vs only
      from the opener's netns.
    
    - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
      (null-ptr-deref), e.g. when the current task is exiting, as spotted by
      syzbot [1] using acct(2).
    
    The 'net' structure can be obtained from the table->data using
    container_of().
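    
    The container_of() step can be sketched in plain C as follows. The
    structures are hypothetical, heavily reduced stand-ins (mock_net,
    mock_netns_sctp) and the macro omits the type checking of the kernel
    helper; the point is only that a pointer to one field is enough to
    recover the enclosing per-namespace structure without touching
    'current'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct mock_netns_sctp {
    int auth_enable;
    int udp_port;
};

struct mock_net {
    int id;
    struct mock_netns_sctp sctp;
};

/* Same idea as the kernel's container_of(), without its type checks. */
#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * A handler that only receives a pointer to one field (as table->data
 * would provide) can still find the structure that contains it.
 */
struct mock_net *mock_net_from_data(int *data)
{
    assert(data != NULL);
    return mock_container_of(data, struct mock_net, sctp.auth_enable);
}
```

    Because the result depends only on the pointer stored with the table
    entry, it stays the same regardless of which task reads or writes the
    sysctl.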
    
    Note that table->data could also be used directly, but that would
    increase the size of this fix, while 'sctp.ctl_sock' still needs to be
    retrieved from 'net' structure.
    
    Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-6-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:32 2025 +0100

    sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
    
    commit ea62dd1383913b5999f3d16ae99d411f41b528d4 upstream.
    
    As mentioned in a previous commit of this series, using the 'net'
    structure via 'current' is not recommended for different reasons:
    
    - Inconsistency: getting info from the reader's/writer's netns vs only
      from the opener's netns.
    
    - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
      (null-ptr-deref), e.g. when the current task is exiting, as spotted by
      syzbot [1] using acct(2).
    
    The 'net' structure can be obtained from the table->data using
    container_of().
    
    Note that table->data could also be used directly, as this is the only
    member needed from the 'net' structure, but that would increase the size
    of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is
    used.
    
    Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-4-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:36 2025 +0100

    sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
    
    commit 6259d2484d0ceff42245d1f09cc8cb6ee72d847a upstream.
    
    As mentioned in a previous commit of this series, using the 'net'
    structure via 'current' is not recommended for different reasons:
    
    - Inconsistency: getting info from the reader's/writer's netns vs only
      from the opener's netns.
    
    - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
      (null-ptr-deref), e.g. when the current task is exiting, as spotted by
      syzbot [1] using acct(2).
    
    The 'net' structure can be obtained from the table->data using
    container_of().
    
    Note that table->data could also be used directly, as this is the only
    member needed from the 'net' structure, but that would increase the size
    of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is
    used.
    
    Fixes: d1e462a7a5f3 ("sctp: add probe_interval in sysctl and sock/asoc/transport")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: sysctl: rto_min/max: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:33 2025 +0100

    sctp: sysctl: rto_min/max: avoid using current->nsproxy
    
    commit 9fc17b76fc70763780aa78b38fcf4742384044a5 upstream.
    
    As mentioned in a previous commit of this series, using the 'net'
    structure via 'current' is not recommended for different reasons:
    
    - Inconsistency: getting info from the reader's/writer's netns vs only
      from the opener's netns.
    
    - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
      (null-ptr-deref), e.g. when the current task is exiting, as spotted by
      syzbot [1] using acct(2).
    
    The 'net' structure can be obtained from the table->data using
    container_of().
    
    Note that table->data could also be used directly, as this is the only
    member needed from the 'net' structure, but that would increase the size
    of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used.
    
    Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-5-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: sysctl: udp_port: avoid using current->nsproxy [+ + +]
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Jan 8 16:34:35 2025 +0100

    sctp: sysctl: udp_port: avoid using current->nsproxy
    
    commit c10377bbc1972d858eaf0ab366a311b39f8ef1b6 upstream.
    
    As mentioned in a previous commit of this series, using the 'net'
    structure via 'current' is not recommended for different reasons:
    
    - Inconsistency: getting info from the reader's/writer's netns vs only
      from the opener's netns.
    
    - current->nsproxy can be NULL in some cases, resulting in an 'Oops'
      (null-ptr-deref), e.g. when the current task is exiting, as spotted by
      syzbot [1] using acct(2).
    
    The 'net' structure can be obtained from the table->data using
    container_of().
    
    Note that table->data could also be used directly, but that would
    increase the size of this fix, while 'sctp.ctl_sock' still needs to be
    retrieved from 'net' structure.
    
    Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083e8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
selftests/alsa: Fix circular dependency involving global-timer [+ + +]
Author: Li Zhijian <lizhijian@fujitsu.com>
Date:   Wed Dec 18 10:59:31 2024 +0800

    selftests/alsa: Fix circular dependency involving global-timer
    
    [ Upstream commit 55853cb829dc707427c3519f6b8686682a204368 ]
    
    The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular
    dependency on the global-timer target due to its inclusion in
    $(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency
    warning during the build process.
    
    To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been
    replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change
    ensures that libatest.so is built before any other targets that require it,
    without creating a circular dependency.
    
    This fix addresses the following warning:
    
    make[4]: Entering directory 'tools/testing/selftests/alsa'
    make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped.
    make[4]: Nothing to be done for 'all'.
    make[4]: Leaving directory 'tools/testing/selftests/alsa'
    
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Jaroslav Kysela <perex@perex.cz>
    Cc: Takashi Iwai <tiwai@suse.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
    Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
smb: client: sync the root session and superblock context passwords before automounting [+ + +]
Author: Meetakshi Setiya <msetiya@microsoft.com>
Date:   Wed Jan 8 05:10:34 2025 -0500

    smb: client: sync the root session and superblock context passwords before automounting
    
    commit 20b1aa912316ffb7fbb5f407f17c330f2a22ddff upstream.
    
    In some cases, when password2 becomes the working password, the
    client swaps the two password fields in the root session struct, but
    not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit
    fs context from their parent mounts. Therefore, they might end up
    getting the passwords in the stale order.
    The automount should succeed, because the mount function will end up
    retrying with the actual password anyway. But to reduce these
    unnecessary session setup retries for automounts, we can sync the
    parent context's passwords with the root session's passwords before
    duplicating it to the child's fs context.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com>
    Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
    Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
staging: iio: ad9832: Correct phase range check [+ + +]
Author: Zicheng Qu <quzicheng@huawei.com>
Date:   Thu Nov 7 01:10:15 2024 +0000

    staging: iio: ad9832: Correct phase range check
    
    commit 4636e859ebe0011f41e35fa79bab585b8004e9a3 upstream.
    
    User Perspective:
    When a user sets the phase value, the ad9832_write_phase() is called.
    The phase register has a 12-bit resolution, so the valid range is 0 to
    4095. If the phase offset value of 4096 is input, it effectively exactly
    equals 0 in the lower 12 bits, meaning no offset.
    
    Reasons for the Change:
    1) Original Condition (phase > BIT(AD9832_PHASE_BITS)):
    This condition allows a phase value equal to 2^12, which is 4096.
    However, this value exceeds the valid 12-bit range, as the maximum valid
    phase value should be 4095.
    2) Modified Condition (phase >= BIT(AD9832_PHASE_BITS)):
    Ensures that the phase value is within the valid range, preventing
    invalid data from being written.
    
    Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase):
    If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr
    is AD9832_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in
    1101 0000 0000 0000, occupying DB12. According to the section of WRITING
    TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be
    DB11. The original condition leads to incorrect DB12 usage, which
    contradicts the datasheet and could pose potential issues for future
    updates if DB12 is used in such related cases.
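    
    The off-by-one can be checked with a small stand-alone C sketch. The
    MOCK_* names and helper functions are hypothetical; the 0xC000 value
    is simply the bit pattern quoted above for the PHASE0 address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PHASE_BITS 12
#define MOCK_BIT(n)     (1UL << (n))
#define MOCK_REG_PHASE0 0xC000u  /* 1100 0000 0000 0000, as quoted above */

/* Original check: rejects only values above 4096, so 4096 slips through. */
bool mock_phase_rejected_old(unsigned long phase)
{
    return phase > MOCK_BIT(MOCK_PHASE_BITS);
}

/* Corrected check: rejects everything outside 0..4095. */
bool mock_phase_rejected_new(unsigned long phase)
{
    return phase >= MOCK_BIT(MOCK_PHASE_BITS);
}

/* Command word before byte-swapping: address bits ORed with the phase. */
uint16_t mock_phase_word(unsigned long phase)
{
    assert(phase <= 0xFFFF);  /* must fit into the 16-bit word */
    return (uint16_t)(MOCK_REG_PHASE0 | phase);
}
```

    With these helpers, a phase of 4096 passes the old check and yields
    0xD000 (DB12 set), while the new check rejects it; 4095 is accepted
    by both and yields 0xCFFF.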
    
    Fixes: ea707584bac1 ("Staging: IIO: DDS: AD9832 / AD9835 driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
    Link: https://patch.msgid.link/20241107011015.2472600-3-quzicheng@huawei.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: iio: ad9834: Correct phase range check [+ + +]
Author: Zicheng Qu <quzicheng@huawei.com>
Date:   Thu Nov 7 01:10:14 2024 +0000

    staging: iio: ad9834: Correct phase range check
    
    commit c0599762f0c7e260b99c6b7bceb8eae69b804c94 upstream.
    
    User Perspective:
    When a user sets the phase value, the ad9834_write_phase() is called.
    The phase register has a 12-bit resolution, so the valid range is 0 to
    4095. If the phase offset value of 4096 is input, it effectively exactly
    equals 0 in the lower 12 bits, meaning no offset.
    
    Reasons for the Change:
    1) Original Condition (phase > BIT(AD9834_PHASE_BITS)):
    This condition allows a phase value equal to 2^12, which is 4096.
    However, this value exceeds the valid 12-bit range, as the maximum valid
    phase value should be 4095.
    2) Modified Condition (phase >= BIT(AD9834_PHASE_BITS)):
    Ensures that the phase value is within the valid range, preventing
    invalid data from being written.
    
    Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase):
    If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr
    is AD9834_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in
    1101 0000 0000 0000, occupying DB12. According to the section of WRITING
    TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be
    DB11. The original condition leads to incorrect DB12 usage, which
    contradicts the datasheet and could pose potential issues for future
    updates if DB12 is used in such related cases.
    
    Fixes: 12b9d5bf76bf ("Staging: IIO: DDS: AD9833 / AD9834 driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://patch.msgid.link/20241107011015.2472600-2-quzicheng@huawei.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
tcp/dccp: allow a connection when sk_max_ack_backlog is zero [+ + +]
Author: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
Date:   Thu Jan 2 17:14:26 2025 +0000

    tcp/dccp: allow a connection when sk_max_ack_backlog is zero
    
    [ Upstream commit 3479c7549fb1dfa7a1db4efb7347c7b8ef50de4b ]
    
    If the backlog of listen() is set to zero, sk_acceptq_is_full() allows
    one connection to be made, but inet_csk_reqsk_queue_is_full() does not.
    When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full()
    will cause an immediate drop before the sk_acceptq_is_full() check in
    tcp_conn_request(), so that no connection can be made.
    
    This patch tries to keep consistent with 64a146513f8f ("[NET]: Revert
    incorrect accept queue backlog changes.").
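    
    The mismatch between the two checks can be modelled in a few lines of
    C. This is a sketch under assumptions: the structure and function
    names are hypothetical, and the comparison operators ('>' for the
    accept queue, '>=' versus '>' for the request queue) are assumed in
    order to reproduce the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced socket: only the counters the checks look at. */
struct mock_sock {
    unsigned int ack_backlog;      /* connections waiting for accept() */
    unsigned int reqsk_queue_len;  /* pending connection requests */
    unsigned int max_ack_backlog;  /* backlog argument of listen() */
};

/* Accept-queue check: full only once the backlog is exceeded. */
bool mock_acceptq_is_full(const struct mock_sock *sk)
{
    assert(sk != NULL);
    return sk->ack_backlog > sk->max_ack_backlog;
}

/* Request-queue check, assumed old form: true already for 0 >= 0. */
bool mock_reqsk_queue_is_full_old(const struct mock_sock *sk)
{
    assert(sk != NULL);
    return sk->reqsk_queue_len >= sk->max_ack_backlog;
}

/* Request-queue check made consistent with the accept-queue check. */
bool mock_reqsk_queue_is_full_new(const struct mock_sock *sk)
{
    assert(sk != NULL);
    return sk->reqsk_queue_len > sk->max_ack_backlog;
}
```

    For an idle listener with a backlog of zero, the accept-queue check
    and the new request-queue check both report "not full", while the old
    request-queue check reports "full" before any request has arrived.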
    
    Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/
    Fixes: ef547f2ac16b ("tcp: remove max_qlen_log")
    Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog [+ + +]
Author: Jason Xing <kernelxing@tencent.com>
Date:   Sun Mar 31 17:05:21 2024 +0800

    tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog
    
    [ Upstream commit 9a79c65f00e2b036e17af3a3a607d7d732b7affb ]
    
    Since commit 099ecf59f05b ("net: annotate lockless accesses to
    sk->sk_max_ack_backlog") decided to handle the sk_max_ack_backlog
    locklessly, there is one more function mostly called in TCP/DCCP
    cases. So this patch completes it:)
    
    Signed-off-by: Jason Xing <kernelxing@tencent.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240331090521.71965-1-kerneljasonxing@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 3479c7549fb1 ("tcp/dccp: allow a connection when sk_max_ack_backlog is zero")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_reset [+ + +]
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Tue Jan 7 11:14:39 2025 +0100

    tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_reset
    
    [ Upstream commit 80fb40baba19e25a1b6f3ecff6fc5c0171806bde ]
    
    This is a follow-up to 3c5b4d69c358 ("net: annotate data-races around
    sk->sk_mark"). sk->sk_mark can be read and written without holding
    the socket lock. IPv6 equivalent is already covered with READ_ONCE()
    annotation in tcp_v6_send_response().
    
    Fixes: 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/f459d1fc44f205e13f6d8bdca2c8bfb9902ffac9.1736244569.git.daniel@iogearbox.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
thermal: of: fix OF node leak in of_thermal_zone_find() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Tue Dec 24 12:18:09 2024 +0900

    thermal: of: fix OF node leak in of_thermal_zone_find()
    
    [ Upstream commit 9164e0912af206a72ddac4915f7784e470a04ace ]
    
    of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not
    release the OF node reference obtained by it.
    
    Add an of_node_put() call when the call is successful.
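    
    The reference-counting rule behind this fix can be sketched with a
    mock node type. All names below (mock_node, mock_parse_phandle(),
    mock_zone_matches()) are hypothetical; the sketch only shows that a
    successful lookup hands the caller a reference that must be dropped
    again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted node; not struct device_node. */
struct mock_node {
    int refcount;
};

struct mock_node *mock_node_get(struct mock_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

void mock_node_put(struct mock_node *n)
{
    if (n) {
        assert(n->refcount > 0);
        n->refcount--;
    }
}

/*
 * Stand-in for a phandle lookup: on success it stores a node with an
 * extra reference owned by the caller and returns 0; on failure it
 * returns -1 and acquires nothing.
 */
int mock_parse_phandle(struct mock_node *target, struct mock_node **out)
{
    if (!target)
        return -1;
    *out = mock_node_get(target);
    return 0;
}

/* Returns 1 if the lookup resolves to 'wanted', 0 otherwise. */
int mock_zone_matches(struct mock_node *target, struct mock_node *wanted)
{
    struct mock_node *found;
    int match;

    if (mock_parse_phandle(target, &found))
        return 0;            /* nothing acquired, nothing to put */

    match = (found == wanted);
    mock_node_put(found);    /* drop the reference on the success path */
    return match;
}
```

    Without the mock_node_put() call, every successful lookup in this
    model would leave the count one higher than before.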
    
    Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp
    [ rjw: Changelog edit ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
tls: Fix tls_sw_sendmsg error handling [+ + +]
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Sat Jan 4 10:29:45 2025 -0500

    tls: Fix tls_sw_sendmsg error handling
    
    [ Upstream commit b341ca51d2679829d26a3f6a4aa9aee9abd94f92 ]
    
    We've noticed that NFS can hang when using RPC over TLS on an unstable
    connection, and investigation shows that the RPC layer is stuck in a tight
    loop attempting to transmit, but forever getting -EBADMSG back from the
    underlying network.  The loop begins when tcp_sendmsg_locked() returns
    -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
    calling the socket's error reporting handler.
    
    Instead of converting errors from tcp_sendmsg_locked(), let's pass them
    along in this path.  The RPC layer handles -EPIPE by reconnecting the
    transport, which prevents the endless attempts to transmit on a broken
    connection.
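    
    The effect of masking the error can be shown with a toy model of the
    three layers. This is a sketch only: mock_xprt, mock_tls_send() and
    mock_rpc_transmit() are hypothetical, and the retry cap exists just so
    that the masked case terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical transport: 'broken' models a connection that was reset. */
struct mock_xprt {
    bool broken;
    int reconnects;
};

/* Lower layer: a broken connection reports -EPIPE. */
int mock_tcp_send(const struct mock_xprt *x)
{
    return x->broken ? -EPIPE : 0;
}

/* Middle layer; 'convert' selects the old, error-masking behaviour. */
int mock_tls_send(const struct mock_xprt *x, bool convert)
{
    int rc = mock_tcp_send(x);

    if (rc < 0 && convert)
        return -EBADMSG;   /* the original cause is lost */
    return rc;             /* pass the error along unchanged */
}

/*
 * Upper layer: reconnects on -EPIPE and simply retries otherwise.
 * Returns the number of attempts used, or -1 if 'max_tries' ran out.
 */
int mock_rpc_transmit(struct mock_xprt *x, bool convert, int max_tries)
{
    assert(max_tries > 0);

    for (int i = 1; i <= max_tries; i++) {
        int rc = mock_tls_send(x, convert);

        if (rc == 0)
            return i;
        if (rc == -EPIPE) {
            x->broken = false;   /* reconnecting repairs the transport */
            x->reconnects++;
        }
    }
    return -1;
}
```

    With conversion enabled the model never reconnects and exhausts its
    retries; with the error passed through it reconnects once and succeeds
    on the second attempt.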
    
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
    Link: https://patch.msgid.link/9594185559881679d81f071b181a10eb07cd079f.1736004079.git.bcodding@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
topology: Keep the cpumask unchanged when printing cpumap [+ + +]
Author: Li Huafei <lihuafei1@huawei.com>
Date:   Thu Nov 14 19:01:41 2024 +0800

    topology: Keep the cpumask unchanged when printing cpumap
    
    commit cbd399f78e23ad4492c174fc5e6b3676dba74a52 upstream.
    
    During fuzz testing, the following warning was discovered:
    
     different return values (15 and 11) from vsnprintf("%*pbl\n", ...)
    
     test:keyward is WARNING in kvasprintf
     WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130
     Call Trace:
      kvasprintf+0x121/0x130
      kasprintf+0xa6/0xe0
      bitmap_print_to_buf+0x89/0x100
      core_siblings_list_read+0x7e/0xb0
      kernfs_file_read_iter+0x15b/0x270
      new_sync_read+0x153/0x260
      vfs_read+0x215/0x290
      ksys_read+0xb9/0x160
      do_syscall_64+0x56/0x100
      entry_SYSCALL_64_after_hwframe+0x78/0xe2
    
    The call trace shows that kvasprintf() reported this warning during the
    printing of core_siblings_list. kvasprintf() has several steps:
    
     (1) First, calculate the length of the resulting formatted string.
    
     (2) Allocate a buffer based on the returned length.
    
     (3) Then, perform the actual string formatting.
    
     (4) Check whether the lengths of the formatted strings returned in
         steps (1) and (3) are consistent.
    
    If the core_cpumask is modified between steps (1) and (3), the lengths
    obtained in these two steps may not match. Indeed our test includes cpu
    hotplugging, which should modify core_cpumask while printing.
    
    To fix this issue, cache the cpumask into a temporary variable before
    calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged
    during the printing process.
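    
    A user-space sketch of the same pattern, using snprintf() in place of
    the kernel helpers: the formatter reads its input twice, so the caller
    formats from a private copy. The function names are hypothetical, and
    in this single-threaded sketch the length mismatch cannot actually
    occur; the check is kept only to mirror step (4).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Two-pass formatting: measure, allocate, format, compare lengths.
 * '*mask' is read twice, so it must not change in between.
 * Returns a malloc()ed string, or NULL on failure or mismatch.
 */
char *mock_format_mask(const unsigned long *mask)
{
    int first, second;
    char *buf;

    assert(mask != NULL);

    first = snprintf(NULL, 0, "%lx\n", *mask);            /* step (1) */
    if (first < 0)
        return NULL;
    buf = malloc((size_t)first + 1);                       /* step (2) */
    if (!buf)
        return NULL;
    second = snprintf(buf, (size_t)first + 1, "%lx\n", *mask); /* (3) */
    if (second != first) {                                 /* step (4) */
        free(buf);
        return NULL;
    }
    return buf;
}

/* The fix in miniature: copy the shared mask once, format the copy. */
char *mock_print_mask(const volatile unsigned long *shared)
{
    unsigned long snapshot = *shared;  /* private, stable copy */

    return mock_format_mask(&snapshot);
}
```

    The caller owns the returned buffer and releases it with free().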
    
    Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Li Huafei <lihuafei1@huawei.com>
    Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Link: https://lore.kernel.org/r/20241114110141.94725-1-lihuafei1@huawei.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
tty: serial: 8250: Fix another runtime PM usage counter underflow [+ + +]
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date:   Tue Dec 10 19:01:20 2024 +0200

    tty: serial: 8250: Fix another runtime PM usage counter underflow
    
    commit ed2761958ad77e54791802b07095786150eab844 upstream.
    
    The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console
    handover") fixed one runtime PM usage counter balance problem that
    occurs because .dev is not set during univ8250 setup preventing call to
    pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the
    runtime PM usage counter underflow as .dev is already set at that time.
    
    Call pm_runtime_get_sync() to balance the RPM usage counter also in
    serial8250_register_8250_port() before trying to add the port.
    
    Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
    Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console")
    Cc: stable <stable@kernel.org>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usb-storage: Add max sectors quirk for Nokia 208 [+ + +]
Author: Lubomir Rintel <lrintel@redhat.com>
Date:   Wed Jan 1 22:22:06 2025 +0100

    usb-storage: Add max sectors quirk for Nokia 208
    
    commit cdef30e0774802df2f87024d68a9d86c3b99ca2a upstream.
    
    This fixes data corruption when accessing the internal SD card in mass
    storage mode.
    
    I am actually not too sure why. I didn't figure out a straightforward way
    to reproduce the issue, but I seem to get garbage when a lot (over 50) of
    large reads (over 120 sectors) are done in quick succession. That is,
    time seems to matter here -- larger reads are fine if they are done with
    some delay between them.
    
    But I'm not great at understanding this sort of thing, so I'll assume
    the issue other, smarter, folks were seeing with similar phones is the
    same problem and I'll just put my quirk next to theirs.
    
    The "Software details" screen on the phone is as follows:
    
      V 04.06
      07-08-13
      RM-849
      (c) Nokia
    
    TL;DR version of the device descriptor:
    
      idVendor           0x0421 Nokia Mobile Phones
      idProduct          0x06c2
      bcdDevice            4.06
      iManufacturer           1 Nokia
      iProduct                2 Nokia 208
    
    The patch assumes older firmwares are broken too (I'm unable to test, but
    no biggie if they aren't I guess), and I have no idea if newer firmware
    exists.
    
    Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
    Cc: stable <stable@kernel.org>
    Acked-by: Alan Stern <stern@rowland.harvard.edu>
    Link: https://lore.kernel.org/r/20250101212206.2386207-1-lkundrak@v3.sk
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe() [+ + +]
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date:   Mon Dec 16 10:55:39 2024 +0900

    usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe()
    
    commit 74adad500346fb07d69af2c79acbff4adb061134 upstream.
    
    Current implementation of ci_hdrc_imx_driver does not decrement the
    refcount of the device obtained in usbmisc_get_init_data(). Add a
    put_device() call in .remove() and in .probe() before returning an
    error.
    
    This bug was found by an experimental static analysis tool that I am
    developing.
    
    Cc: stable <stable@kernel.org>
    Fixes: f40017e0f332 ("chipidea: usbmisc_imx: Add USB support for VF610 SoCs")
    Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
    Acked-by: Peter Chen <peter.chen@kernel.org>
    Link: https://lore.kernel.org/r/20241216015539.352579-1-joe@pf.is.s.u-tokyo.ac.jp
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
USB: core: Disable LPM only for non-suspended ports [+ + +]
Author: Kai-Heng Feng <kaihengf@nvidia.com>
Date:   Fri Dec 6 15:48:17 2024 +0800

    USB: core: Disable LPM only for non-suspended ports
    
    commit 59bfeaf5454b7e764288d84802577f4a99bf0819 upstream.
    
    There's a USB error when a Tegra board is shutting down:
    [  180.919315] usb 2-3: Failed to set U1 timeout to 0x0,error code -113
    [  180.919995] usb 2-3: Failed to set U1 timeout to 0xa,error code -113
    [  180.920512] usb 2-3: Failed to set U2 timeout to 0x4,error code -113
    [  186.157172] tegra-xusb 3610000.usb: xHCI host controller not responding, assume dead
    [  186.157858] tegra-xusb 3610000.usb: HC died; cleaning up
    [  186.317280] tegra-xusb 3610000.usb: Timeout while waiting for evaluate context command
    
    The issue is caused by disabling LPM on already suspended ports.
    
    For USB2 LPM, the LPM is already disabled during port suspend. For USB3
    LPM, the port won't transition to U1/U2 when it's already suspended in
    U3, hence disabling LPM is only needed for ports that are not suspended.
    
    Cc: Wayne Chang <waynec@nvidia.com>
    Cc: stable <stable@kernel.org>
    Fixes: d920a2ed8620 ("usb: Disable USB3 LPM at shutdown")
    Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
    Acked-by: Alan Stern <stern@rowland.harvard.edu>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Link: https://lore.kernel.org/r/20241206074817.89189-1-kaihengf@nvidia.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usb: dwc3-am62: Disable autosuspend during remove [+ + +]
Author: Prashanth K <quic_prashk@quicinc.com>
Date:   Mon Dec 9 16:27:28 2024 +0530

    usb: dwc3-am62: Disable autosuspend during remove
    
    commit 625e70ccb7bbbb2cc912e23c63390946170c085c upstream.
    
    The runtime PM documentation (Section 5) mentions that, during remove()
    callbacks, drivers should undo the runtime PM changes done in
    probe(). Usually this means calling pm_runtime_disable(),
    pm_runtime_dont_use_autosuspend() etc. Hence add the missing
    call to disable autosuspend on dwc3-am62 driver unbind.
    
    Fixes: e8784c0aec03 ("drivers: usb: dwc3: Add AM62 USB wrapper driver")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
    Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
    Link: https://lore.kernel.org/r/20241209105728.3216872-1-quic_prashk@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: dwc3: gadget: fix writing NYET threshold [+ + +]
Author: André Draszik <andre.draszik@linaro.org>
Date:   Mon Dec 9 11:49:53 2024 +0000

    usb: dwc3: gadget: fix writing NYET threshold
    
    commit 01ea6bf5cb58b20cc1bd159f0cf74a76cf04bb69 upstream.
    
    Before writing a new value to the register, the old value needs to be
    masked out for the new value to be programmed as intended, because at
    least in some cases the reset value of that field is 0xf (max value).
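    
    The read-modify-write issue can be illustrated with a 4-bit field in
    a 32-bit register value. The MOCK_* macros and the bit position are
    assumptions made for this sketch, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-bit field; the bit position is for illustration. */
#define MOCK_THRES_SHIFT 20
#define MOCK_THRES_MASK  (UINT32_C(0xf) << MOCK_THRES_SHIFT)
#define MOCK_THRES(n)    (((uint32_t)(n) & 0xf) << MOCK_THRES_SHIFT)

/* Buggy variant: ORs the new value on top of whatever the field held. */
uint32_t mock_set_thres_or_only(uint32_t reg, unsigned int thres)
{
    assert(thres <= 0xf);
    return reg | MOCK_THRES(thres);
}

/* Fixed variant: clear the field first, then set the new value. */
uint32_t mock_set_thres_masked(uint32_t reg, unsigned int thres)
{
    assert(thres <= 0xf);
    reg &= ~MOCK_THRES_MASK;
    return reg | MOCK_THRES(thres);
}
```

    Starting from a field value of 0xf, the OR-only variant cannot lower
    the threshold at all (0xf | 0x4 is still 0xf), whereas the masked
    variant stores 0x4 and leaves the other bits untouched.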
    
    At the moment, the dwc3 core initialises the threshold to the maximum
    value (0xf), with the option to override it via a DT. No upstream DTs
    seem to override it, therefore this commit doesn't change behaviour for
    any upstream platform. Nevertheless, the code should be fixed to have
    the desired outcome.
    
    Do so.
    
    Fixes: 80caf7d21adc ("usb: dwc3: add lpm erratum support")
    Cc: stable@vger.kernel.org # 5.10+ (needs adjustment for 5.4)
    Signed-off-by: André Draszik <andre.draszik@linaro.org>
    Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
    Link: https://lore.kernel.org/r/20241209-dwc3-nyet-fix-v2-1-02755683345b@linaro.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: fix reference leak in usb_new_device() [+ + +]
Author: Ma Ke <make_ruc2021@163.com>
Date:   Wed Dec 18 15:13:46 2024 +0800

    usb: fix reference leak in usb_new_device()
    
    commit 0df11fa8cee5a9cf8753d4e2672bb3667138c652 upstream.
    
    When device_add(&udev->dev) succeeds and a later call fails,
    usb_new_device() does not call device_del(). As the comment on
    device_add() says, 'if device_add() succeeds, you should call
    device_del() when you want to get rid of it. If device_add() has not
    succeeded, use only put_device() to drop the reference count'.
    
    Found by code review.
    
    Cc: stable <stable@kernel.org>
    Fixes: 9f8b17e643fe ("USB: make usbdevices export their device nodes instead of using a separate class")
    Signed-off-by: Ma Ke <make_ruc2021@163.com>
    Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
    Link: https://lore.kernel.org/r/20241218071346.2973980-1-make_ruc2021@163.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: configfs: Ignore trailing LF for user strings to cdev [+ + +]
Author: Ingo Rohloff <ingo.rohloff@lauterbach.com>
Date:   Thu Dec 12 16:41:14 2024 +0100

    usb: gadget: configfs: Ignore trailing LF for user strings to cdev
    
    commit 9466545720e231fc02acd69b5f4e9138e09a26f6 upstream.
    
    Since commit c033563220e0f7a8
    ("usb: gadget: configfs: Attach arbitrary strings to cdev")
    a user can provide extra string descriptors to a USB gadget via configfs.
    
    For "manufacturer", "product", "serialnumber", setting the string via
    configfs ignores a trailing LF.
    
    For the arbitrary strings the LF was not ignored.
    
    This patch ignores a trailing LF to make this consistent with the existing
    behavior for "manufacturer", ...  string descriptors.
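
    As a rough illustration of the behavior described above, the
    following userspace sketch stores a string while ignoring a single
    trailing LF. The helper name and its return convention are
    hypothetical; this is not the code used by the configfs gadget driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, dropping a single trailing LF if present.
 * Returns the number of characters stored (excluding the NUL),
 * or -1 if the result does not fit. Hypothetical helper.
 */
static int copy_ignoring_trailing_lf(char *dst, size_t dst_size, const char *src)
{
	size_t len;

	assert(dst != NULL && src != NULL);

	len = strlen(src);
	if (len > 0 && src[len - 1] == '\n')
		len--;
	if (len >= dst_size)
		return -1;

	memcpy(dst, src, len);
	dst[len] = '\0';
	return (int)len;
}
```

    Writing "ACME\n" and "ACME" through this helper yields the same
    stored string, which is the consistency the patch is after.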
    
    Fixes: c033563220e0 ("usb: gadget: configfs: Attach arbitrary strings to cdev")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
    Link: https://lore.kernel.org/r/20241212154114.29295-1-ingo.rohloff@lauterbach.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: f_fs: Remove WARN_ON in functionfs_bind [+ + +]
Author: Akash M <akash.m5@samsung.com>
Date:   Thu Dec 19 18:22:19 2024 +0530

    usb: gadget: f_fs: Remove WARN_ON in functionfs_bind
    
    commit dfc51e48bca475bbee984e90f33fdc537ce09699 upstream.
    
    This commit addresses the kernel panic shown below, which occurs when
    panic_on_warn is enabled. It is caused by the unnecessary use of WARN_ON
    in functionfs_bind, which is easily triggered by the following scenario.
    
    1.adb_write in adbd               2. UDC write via configfs
      =================                  =====================
    
    ->usb_ffs_open_thread()           ->UDC write
     ->open_functionfs()               ->configfs_write_iter()
      ->adb_open()                      ->gadget_dev_desc_UDC_store()
       ->adb_write()                     ->usb_gadget_register_driver_owner
                                          ->driver_register()
    ->StartMonitor()                       ->bus_add_driver()
     ->adb_read()                           ->gadget_bind_driver()
    <times-out without BIND event>           ->configfs_composite_bind()
                                              ->usb_add_function()
    ->open_functionfs()                        ->ffs_func_bind()
     ->adb_open()                               ->functionfs_bind()
                                           <ffs->state !=FFS_ACTIVE>
    
    The adb_open, adb_read, and adb_write operations are invoked from the
    daemon, but trying to bind the function is a process that is invoked by
    UDC write through configfs, which opens up the possibility of a race
    condition between the two paths. In this race scenario, the kernel panic
    occurs due to the WARN_ON from functionfs_bind when panic_on_warn is
    enabled. This commit fixes the kernel panic by removing the unnecessary
    WARN_ON.
    
    Kernel panic - not syncing: kernel: panic_on_warn set ...
    [   14.542395] Call trace:
    [   14.542464]  ffs_func_bind+0x1c8/0x14a8
    [   14.542468]  usb_add_function+0xcc/0x1f0
    [   14.542473]  configfs_composite_bind+0x468/0x588
    [   14.542478]  gadget_bind_driver+0x108/0x27c
    [   14.542483]  really_probe+0x190/0x374
    [   14.542488]  __driver_probe_device+0xa0/0x12c
    [   14.542492]  driver_probe_device+0x3c/0x220
    [   14.542498]  __driver_attach+0x11c/0x1fc
    [   14.542502]  bus_for_each_dev+0x104/0x160
    [   14.542506]  driver_attach+0x24/0x34
    [   14.542510]  bus_add_driver+0x154/0x270
    [   14.542514]  driver_register+0x68/0x104
    [   14.542518]  usb_gadget_register_driver_owner+0x48/0xf4
    [   14.542523]  gadget_dev_desc_UDC_store+0xf8/0x144
    [   14.542526]  configfs_write_iter+0xf0/0x138
    
    Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Akash M <akash.m5@samsung.com>
    Link: https://lore.kernel.org/r/20241219125221.1679-1-akash.m5@samsung.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints [+ + +]
Author: Prashanth K <quic_prashk@quicinc.com>
Date:   Wed Dec 11 17:29:15 2024 +0530

    usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints
    
    commit 057bd54dfcf68b1f67e6dfc32a47a72e12198495 upstream.
    
    Currently afunc_bind sets std_ac_if_desc.bNumEndpoints to 1 if
    controls (mute/volume) are enabled. During next afunc_bind call,
    bNumEndpoints would be unchanged and incorrectly set to 1 even
    if the controls aren't enabled.
    
    Fix this by resetting the value of bNumEndpoints to 0 on every
    afunc_bind call.
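
    The stale-state problem can be modelled with a descriptor that
    outlives a single bind call. The struct and both function names in
    this sketch are hypothetical stand-ins, not the f_uac2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a descriptor whose storage persists across bind calls. */
struct ac_if_desc_sketch {
	uint8_t bNumEndpoints;
};

/* Pre-fix pattern: the field is only ever raised, never lowered. */
static void bind_without_reset(struct ac_if_desc_sketch *desc, bool controls_enabled)
{
	assert(desc != NULL);
	if (controls_enabled)
		desc->bNumEndpoints = 1;
}

/* Post-fix pattern: start from 0 on every bind. */
static void bind_with_reset(struct ac_if_desc_sketch *desc, bool controls_enabled)
{
	assert(desc != NULL);
	desc->bNumEndpoints = 0;
	if (controls_enabled)
		desc->bNumEndpoints = 1;
}
```

    Binding once with controls enabled and once without leaves the
    first variant at 1 and the second at 0.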
    
    Fixes: eaf6cbe09920 ("usb: gadget: f_uac2: add volume and mute support")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
    Link: https://lore.kernel.org/r/20241211115915.159864-1-quic_prashk@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: midi2: Reverse-select at the right place [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Jan 1 14:11:19 2025 +0100

    usb: gadget: midi2: Reverse-select at the right place
    
    commit 6f660ffce7c938f2a5d8473c0e0b45e4fb25ef7f upstream.
    
    We should do reverse selection of other components from
    CONFIG_USB_F_MIDI2 which is tristate, instead of
    CONFIG_USB_CONFIGFS_F_MIDI2 which is bool, for satisfying subtle
    module dependencies.
    
    Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Link: https://lore.kernel.org/r/20250101131124.27599-1-tiwai@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null [+ + +]
Author: Lianqin Hu <hulianqin@vivo.com>
Date:   Tue Dec 17 07:58:44 2024 +0000

    usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null
    
    commit 13014969cbf07f18d62ceea40bd8ca8ec9d36cec upstream.
    
    In some extreme cases during unbind, gserial_disconnect has already
    cleared gser->ioport when a gadget reconfiguration is triggered, which
    then calls gs_read_complete and results in access to a null pointer.
    Therefore, disable the endpoints before gserial_disconnect sets the
    port to null to prevent this from happening.
    
    Call trace:
     gs_read_complete+0x58/0x240
     usb_gadget_giveback_request+0x40/0x160
     dwc3_remove_requests+0x170/0x484
     dwc3_ep0_out_start+0xb0/0x1d4
     __dwc3_gadget_start+0x25c/0x720
     kretprobe_trampoline.cfi_jt+0x0/0x8
     kretprobe_trampoline.cfi_jt+0x0/0x8
     udc_bind_to_driver+0x1d8/0x300
     usb_gadget_probe_driver+0xa8/0x1dc
     gadget_dev_desc_UDC_store+0x13c/0x188
     configfs_write_iter+0x160/0x1f4
     vfs_write+0x2d0/0x40c
     ksys_write+0x7c/0xf0
     __arm64_sys_write+0x20/0x30
     invoke_syscall+0x60/0x150
     el0_svc_common+0x8c/0xf8
     do_el0_svc+0x28/0xa0
     el0_svc+0x24/0x84
    
    Fixes: c1dca562be8a ("usb gadget: split out serial core")
    Cc: stable <stable@kernel.org>
    Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
    Link: https://lore.kernel.org/r/TYUPR06MB621733B5AC690DBDF80A0DCCD2042@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
USB: serial: cp210x: add Phoenix Contact UPS Device [+ + +]
Author: Johan Hovold <johan@kernel.org>
Date:   Wed Jan 8 11:24:36 2025 +0100

    USB: serial: cp210x: add Phoenix Contact UPS Device
    
    commit 854eee93bd6e3dca619d47087af4d65b2045828e upstream.
    
    Phoenix Contact sells UPS Quint devices [1] with a custom datacable [2]
    that embeds a Silicon Labs converter:
    
    Bus 001 Device 003: ID 1b93:1013 Silicon Labs Phoenix Contact UPS Device
    Device Descriptor:
      bLength                18
      bDescriptorType         1
      bcdUSB               2.00
      bDeviceClass            0
      bDeviceSubClass         0
      bDeviceProtocol         0
      bMaxPacketSize0        64
      idVendor           0x1b93
      idProduct          0x1013
      bcdDevice            1.00
      iManufacturer           1 Silicon Labs
      iProduct                2 Phoenix Contact UPS Device
      iSerial                 3 <redacted>
      bNumConfigurations     1
      Configuration Descriptor:
        bLength                 9
        bDescriptorType         2
        wTotalLength       0x0020
        bNumInterfaces          1
        bConfigurationValue     1
        iConfiguration          0
        bmAttributes         0x80
          (Bus Powered)
        MaxPower              100mA
        Interface Descriptor:
          bLength                 9
          bDescriptorType         4
          bInterfaceNumber        0
          bAlternateSetting       0
          bNumEndpoints           2
          bInterfaceClass       255 Vendor Specific Class
          bInterfaceSubClass      0
          bInterfaceProtocol      0
          iInterface              2 Phoenix Contact UPS Device
          Endpoint Descriptor:
            bLength                 7
            bDescriptorType         5
            bEndpointAddress     0x01  EP 1 OUT
            bmAttributes            2
              Transfer Type            Bulk
              Synch Type               None
              Usage Type               Data
            wMaxPacketSize     0x0040  1x 64 bytes
            bInterval               0
          Endpoint Descriptor:
            bLength                 7
            bDescriptorType         5
            bEndpointAddress     0x82  EP 2 IN
            bmAttributes            2
              Transfer Type            Bulk
              Synch Type               None
              Usage Type               Data
            wMaxPacketSize     0x0040  1x 64 bytes
            bInterval               0
    
    [1] https://www.phoenixcontact.com/en-pc/products/power-supply-unit-quint-ps-1ac-24dc-10-2866763
    [2] https://www.phoenixcontact.com/en-il/products/data-cable-preassembled-ifs-usb-datacable-2320500
    
    Reported-by: Giuseppe Corbelli <giuseppe.corbelli@antaresvision.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: serial: option: add MeiG Smart SRM815 [+ + +]
Author: Chukun Pan <amadeus@jmu.edu.cn>
Date:   Sun Dec 15 18:00:27 2024 +0800

    USB: serial: option: add MeiG Smart SRM815
    
    commit c1947d244f807b1f95605b75a4059e7b37b5dcc3 upstream.
    
    It looks like SRM815 shares ID with SRM825L.
    
    T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=2dee ProdID=4d22 Rev= 4.14
    S:  Manufacturer=MEIG
    S:  Product=LTE-A Module
    S:  SerialNumber=123456
    C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
    I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
    Link: https://lore.kernel.org/lkml/20241215100027.1970930-1-amadeus@jmu.edu.cn/
    Link: https://lore.kernel.org/all/4333b4d0-281f-439d-9944-5570cbc4971d@gmail.com/
    Cc: stable@vger.kernel.org
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: serial: option: add Neoway N723-EA support [+ + +]
Author: Michal Hrusecky <michal.hrusecky@turris.com>
Date:   Tue Jan 7 17:08:29 2025 +0100

    USB: serial: option: add Neoway N723-EA support
    
    commit f5b435be70cb126866fa92ffc6f89cda9e112c75 upstream.
    
    Update the USB serial option driver to support Neoway N723-EA.
    
    ID 2949:8700 Marvell Mobile Composite Device Bus
    
    T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=2949 ProdID=8700 Rev= 1.00
    S:  Manufacturer=Marvell
    S:  Product=Mobile Composite Device Bus
    S:  SerialNumber=200806006809080000
    C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA
    A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03
    I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host
    E:  Ad=87(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=89(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Tested successfully connecting to the Internet via rndis interface after
    dialing via AT commands on If#=4 or If#=6.
    
    Not sure of the purpose of the other serial interface.
    
    Signed-off-by: Michal Hrusecky <michal.hrusecky@turris.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm() [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Fri Dec 6 16:09:18 2024 +0300

    usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm()
    
    commit b9711ff7cde0cfbcdd44cb1fac55b6eec496e690 upstream.
    
    If max_contaminant_read_adc_mv() fails, then return the error code.  Don't
    return zero.
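
    A minimal sketch of the error-propagation pattern, in userspace C.
    The function-pointer type, the two stub readers and the conversion
    factor are all made up for illustration; only the handling of a
    negative return value reflects what this commit describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ADC reader type: returns millivolts, or a negative errno. */
typedef int (*read_adc_mv_fn)(void);

static int adc_ok(void)     { return 900; }
static int adc_failed(void) { return -EIO; }

/* Hypothetical resistance helper; the conversion is a placeholder. */
static int read_resistance_kohm_sketch(read_adc_mv_fn read_adc_mv)
{
	int mv;

	assert(read_adc_mv != NULL);

	mv = read_adc_mv();
	if (mv < 0)
		return mv;	/* propagate the failure instead of returning 0 */

	return mv / 10;		/* placeholder conversion, not the real formula */
}
```

    Returning zero on failure would be indistinguishable from a valid
    reading of 0 kOhm, which is why the caller needs the error code.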
    
    Fixes: 02b332a06397 ("usb: typec: maxim_contaminant: Implement check_contaminant callback")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: André Draszik <andre.draszik@linaro.org>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/f1bf3768-419e-40dd-989c-f7f455d6c824@stanley.mountain
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
USB: usblp: return error when setting unsupported protocol [+ + +]
Author: Jun Yan <jerrysteve1101@gmail.com>
Date:   Thu Dec 12 22:38:52 2024 +0800

    USB: usblp: return error when setting unsupported protocol
    
    commit 7a3d76a0b60b3f6fc3375e4de2174bab43f64545 upstream.
    
    Fix the regression introduced by commit d8c6edfa3f4e ("USB:
    usblp: don't call usb_set_interface if there's a single alt"),
    which allows unsupported protocols to be set via ioctl when the
    num_altsetting of the device is 1.
    
    Move the check for protocol support to the earlier stage.
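
    The reordering can be sketched as follows. The struct, the constants
    and the counter that stands in for usb_set_interface() are
    hypothetical simplifications, not the usblp driver's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FIRST_PROTOCOL 1
#define LAST_PROTOCOL  3

/* Hypothetical, trimmed-down device state. */
struct printer_sketch {
	int alt_setting[LAST_PROTOCOL + 1];	/* -1 means "protocol not supported" */
	int num_altsetting;
	int current_protocol;
	int set_interface_calls;		/* counts simulated interface switches */
};

static int set_protocol_sketch(struct printer_sketch *p, int protocol)
{
	assert(p != NULL);

	if (protocol < FIRST_PROTOCOL || protocol > LAST_PROTOCOL)
		return -EINVAL;

	/* Check support up front, before the single-alt shortcut. */
	if (p->alt_setting[protocol] < 0)
		return -EINVAL;

	/* Only switch the interface when there is more than one alt setting. */
	if (p->num_altsetting > 1)
		p->set_interface_calls++;

	p->current_protocol = protocol;
	return 0;
}
```

    With the support check placed inside the num_altsetting > 1 branch
    instead, a single-alt device would accept any protocol in range.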
    
    Fixes: d8c6edfa3f4e ("USB: usblp: don't call usb_set_interface if there's a single alt")
    Cc: stable <stable@kernel.org>
    Signed-off-by: Jun Yan <jerrysteve1101@gmail.com>
    Link: https://lore.kernel.org/r/20241212143852.671889-1-jerrysteve1101@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
workqueue: Add rcu lock check at the end of work item execution [+ + +]
Author: Xuewen Yan <xuewen.yan@unisoc.com>
Date:   Wed Jan 10 11:27:24 2024 +0800

    workqueue: Add rcu lock check at the end of work item execution
    
    [ Upstream commit 1a65a6d17cbc58e1aeffb2be962acce49efbef9c ]
    
    Currently the workqueue just checks the atomic and locking states after work
    execution ends. However, sometimes, a work item may not unlock rcu after
    acquiring rcu_read_lock(). And as a result, it would cause rcu stall, but
    the rcu stall warning can not dump the work func, because the work has
    finished.
    
    In order to quickly discover those works that do not call rcu_read_unlock()
    after rcu_read_lock(), add the rcu lock check.
    
    Use rcu_preempt_depth() to check the work's rcu status. Normally, this value
    is 0. If this value is bigger than 0, it means the work is still holding
    the rcu lock. If so, print an error message and the work func.
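
    The idea can be modelled in userspace with a plain counter in place
    of rcu_preempt_depth(). Every name ending in _sketch, the two sample
    work functions and the depth reset after reporting are assumptions
    of this model, not the workqueue implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the RCU read-side nesting depth. */
static int rcu_depth_sketch;

static void rcu_read_lock_sketch(void)   { rcu_depth_sketch++; }
static void rcu_read_unlock_sketch(void) { rcu_depth_sketch--; }

typedef void (*work_fn)(void);

static void balanced_work(void)
{
	rcu_read_lock_sketch();
	rcu_read_unlock_sketch();
}

static void leaky_work(void)
{
	rcu_read_lock_sketch();		/* missing unlock */
}

/* Run one work item; return true if it finished with the lock still held. */
static bool run_work_sketch(work_fn fn)
{
	assert(fn != NULL);

	fn();
	if (rcu_depth_sketch > 0) {
		fprintf(stderr, "BUG: work item leaked rcu read lock, depth %d\n",
			rcu_depth_sketch);
		rcu_depth_sketch = 0;	/* model-only reset so later items start clean */
		return true;
	}
	return false;
}
```

    Because the check runs right after the work function returns, the
    offending function is still known and can be named in the message.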
    
    tj: Reworded the description for clarity. Minor formatting tweak.
    
    Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
    Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
    Reviewed-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Stable-dep-of: de35994ecd2d ("workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker [+ + +]
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Thu Dec 19 09:30:30 2024 +0000

    workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
    
    [ Upstream commit de35994ecd2dd6148ab5a6c5050a1670a04dec77 ]
    
    After commit
    746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
    amdgpu started seeing the following warning:
    
     [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
    ...
     [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
    ...
     [ ] Call Trace:
     [ ]  <TASK>
    ...
     [ ]  ? check_flush_dependency+0xf5/0x110
    ...
     [ ]  cancel_delayed_work_sync+0x6e/0x80
     [ ]  amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
     [ ]  amdgpu_ring_alloc+0x40/0x50 [amdgpu]
     [ ]  amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
     [ ]  ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
     [ ]  amdgpu_job_run+0xaa/0x1f0 [amdgpu]
     [ ]  drm_sched_run_job_work+0x257/0x430 [gpu_sched]
     [ ]  process_one_work+0x217/0x720
    ...
     [ ]  </TASK>
    
    The intent of the verification done in check_flush_dependency is to ensure
    forward progress during memory reclaim, by flagging cases when either a
    memory reclaim process, or a memory reclaim work item is flushed from a
    context not marked as memory reclaim safe.
    
    This is correct when flushing, but when called from the
    cancel(_delayed)_work_sync() paths it is a false positive because work is
    either already running, or will not be running at all. Therefore
    cancelling it is safe and we can relax the warning criteria by letting the
    helper know of the calling context.
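
    A simplified model of passing the calling context to the helper is
    sketched below. The struct, the flag value and the function name are
    hypothetical, and the model only covers the worker-context case, not
    every condition the real helper evaluates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WQ_MEM_RECLAIM_SKETCH 0x1u	/* hypothetical flag value */

struct wq_sketch {
	unsigned int flags;
};

/*
 * Returns true when the dependency check would warn.
 * current_wq is the queue whose worker performs the flush or cancel.
 */
static bool flush_dependency_warns(const struct wq_sketch *current_wq,
				   const struct wq_sketch *target_wq,
				   bool from_cancel)
{
	assert(current_wq != NULL && target_wq != NULL);

	/*
	 * On the cancel path the target is either already running or will
	 * not run at all, so the check is skipped.
	 */
	if (from_cancel || (target_wq->flags & WQ_MEM_RECLAIM_SKETCH))
		return false;

	return (current_wq->flags & WQ_MEM_RECLAIM_SKETCH) != 0;
}
```

    In this model a reclaim-safe worker flushing a plain queue still
    warns, while the same worker cancelling work on that queue does not.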
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: fca839c00a12 ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue")
    References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Matthew Brost <matthew.brost@intel.com>
    Cc: <stable@vger.kernel.org> # v4.5+
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

workqueue: Update lock debugging code [+ + +]
Author: Tejun Heo <tj@kernel.org>
Date:   Sun Feb 4 11:28:06 2024 -1000

    workqueue: Update lock debugging code
    
    [ Upstream commit c35aea39d1e106f61fd2130f0d32a3bac8bd4570 ]
    
    These changes are in preparation of BH workqueue which will execute work
    items from BH context.
    
    - Update lock and RCU depth checks in process_one_work() so that it
      remembers and checks against the starting depths and prints out the depth
      changes.
    
    - Factor out lockdep annotations in the flush paths into
      touch_{wq|work}_lockdep_map(). The work->lockdep_map touching is moved
      from __flush_work() to its callee - start_flush_work(). This brings it
      closer to the wq counterpart and will allow testing the associated wq's
      flags which will be needed to support BH workqueues. This is not expected
      to cause any functional changes.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Tested-by: Allen Pais <allen.lkml@gmail.com>
    Stable-dep-of: de35994ecd2d ("workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/fpu: Ensure shadow stack is active before "getting" registers [+ + +]
Author: Rick Edgecombe <rick.p.edgecombe@intel.com>
Date:   Tue Jan 7 15:30:56 2025 -0800

    x86/fpu: Ensure shadow stack is active before "getting" registers
    
    commit a9d9c33132d49329ada647e4514d210d15e31d81 upstream.
    
    The x86 shadow stack support has its own set of registers. Those registers
    are XSAVE-managed, but they are "supervisor state components" which means
    that userspace can not touch them with XSAVE/XRSTOR.  It also means that
    they are not accessible from the existing ptrace ABI for XSAVE state.
    Thus, there is a new ptrace get/set interface for it.
    
    The regset code that ptrace uses provides an ->active() handler in
    addition to the get/set ones. For shadow stack this ->active() handler
    verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the
    thread struct. The ->active() handler is checked from some call sites of
    the regset get/set handlers, but not the ptrace ones. This was not
    understood when shadow stack support was put in place.
    
    As a result, both the set/get handlers can be called with
    XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to
    return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an
    ssp_active() check to avoid surprising the kernel with shadow stack
    behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That
    check just happened to avoid the warning.
    
    But the ->get() side wasn't so lucky. It can be called with shadow stacks
    disabled, triggering the warning in practice, as reported by Christina
    Schimpe:
    
    WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0
    [...]
    Call Trace:
    <TASK>
    ? show_regs+0x6e/0x80
    ? ssp_get+0x89/0xa0
    ? __warn+0x91/0x150
    ? ssp_get+0x89/0xa0
    ? report_bug+0x19d/0x1b0
    ? handle_bug+0x46/0x80
    ? exc_invalid_op+0x1d/0x80
    ? asm_exc_invalid_op+0x1f/0x30
    ? __pfx_ssp_get+0x10/0x10
    ? ssp_get+0x89/0xa0
    ? ssp_get+0x52/0xa0
    __regset_get+0xad/0xf0
    copy_regset_to_user+0x52/0xc0
    ptrace_regset+0x119/0x140
    ptrace_request+0x13c/0x850
    ? wait_task_inactive+0x142/0x1d0
    ? do_syscall_64+0x6d/0x90
    arch_ptrace+0x102/0x300
    [...]
    
    Ensure that shadow stacks are active in a thread before looking them up
    in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are
    set at the same time, the active check ensures that there will be
    something to find in the XSAVE buffer.
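
    The ordering of the two checks can be sketched in userspace C. The
    struct, its fields and the function name are hypothetical, and the
    -ENODEV return value is an assumption of this sketch rather than
    something stated in this changelog.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified thread state. */
struct thread_sketch {
	bool shstk_enabled;		/* stands in for the ARCH_SHSTK_SHSTK bit */
	const uint64_t *cet_user_ssp;	/* NULL while the component is in its init state */
};

static int ssp_get_sketch(const struct thread_sketch *target, uint64_t *out)
{
	assert(target != NULL && out != NULL);

	/* Check the feature bit first, so the lookup is skipped when inactive. */
	if (!target->shstk_enabled)
		return -ENODEV;

	/* The kernel would warn on a NULL lookup here; the sketch just fails. */
	if (target->cet_user_ssp == NULL)
		return -ENODEV;

	*out = *target->cet_user_ssp;
	return 0;
}
```

    With the active check in front, the lookup that can return NULL is
    only reached for threads that have shadow stack enabled.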
    
    [ dhansen: changelog/subject tweaks ]
    
    Fixes: 2fab02b25ae7 ("x86: Add PTRACE interface for shadow stack")
    Reported-by: Christina Schimpe <christina.schimpe@intel.com>
    Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Tested-by: Christina Schimpe <christina.schimpe@intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node() [+ + +]
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed May 29 09:42:05 2024 +0200

    x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node()
    
    commit 3ac36aa7307363b7247ccb6f6a804e11496b2b36 upstream.
    
    memblock_set_node() warns about using MAX_NUMNODES, see
    
      e0eec24e2e19 ("memblock: make memblock_set_node() also warn about use of MAX_NUMNODES")
    
    for details.
    
    Reported-by: Narasimhan V <Narasimhan.V@amd.com>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Cc: stable@vger.kernel.org
    [bp: commit message]
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20240603141005.23261-1-bp@kernel.org
    Link: https://lore.kernel.org/r/abadb736-a239-49e4-ab42-ace7acdd4278@suse.com
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>