commit a2943d2d9a00ae7c5c1fde2b2e7e9cdb47e7db05 Author: Greg Kroah-Hartman Date: Wed Aug 30 16:11:13 2023 +0200 Linux 6.1.50 Link: https://lore.kernel.org/r/20230828101156.480754469@linuxfoundation.org Tested-by: Conor Dooley Tested-by: Linux Kernel Functional Testing Tested-by: Salvatore Bonaccorso Tested-by: Joel Fernandes (Google) Tested-by: SeongJae Park Tested-by: Bagas Sanjaya Tested-by: Sudip Mukherjee Tested-by: Takeshi Ogasawara Tested-by: Shuah Khan Tested-by: Florian Fainelli Tested-by: Guenter Roeck Tested-by: Jon Hunter Tested-by: Pavel Machek (CIP) Signed-off-by: Greg Kroah-Hartman commit 19641b979b24b31c6da4b5ec29f6c57057499657 Author: Arnd Bergmann Date: Mon Jun 5 10:58:29 2023 +0200 ASoC: amd: vangogh: select CONFIG_SND_AMD_ACP_CONFIG commit fd0a7ec379dbf21b7bfd81914381ae5281706ef5 upstream. The vangogh driver just gained a link time dependency that now causes randconfig builds to fail: x86_64-linux-ld: sound/soc/amd/vangogh/pci-acp5x.o: in function `snd_acp5x_probe': pci-acp5x.c:(.text+0xbb): undefined reference to `snd_amd_acp_find_config' Fixes: e89f45edb747e ("ASoC: amd: vangogh: Add check for acp config flags in vangogh platform") Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20230605085839.2157268-1-arnd@kernel.org Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 9d5a3b4aee11301acce8c1752cb272478d082357 Author: Liam R. Howlett Date: Fri Aug 18 20:43:55 2023 -0400 maple_tree: disable mas_wr_append() when other readers are possible [ Upstream commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b ] The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution. During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value Alternatively: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action. Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett Cc: Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin commit 936cf79649e0b9cbe6b638e6cc12712a9c6b26de Author: Mario Limonciello Date: Wed Aug 23 20:11:49 2023 -0500 ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJ [ Upstream commit c008323fe361bd62a43d9fb29737dacd5c067fb7 ] Lenovo 82SJ doesn't have DMIC connected like 82V2 does. Narrow the match down to only cover 82V2. Reported-by: prosenfeld@Yuhsbstudents.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217063 Fixes: 2232b2dd8cd4 ("ASoC: amd: yc: Add Lenovo Yoga Slim 7 Pro X to quirks table") Signed-off-by: Mario Limonciello commit d10ab996bd5cb63e854924dd3d6ffb036d4d432a Author: Bartosz Golaszewski Date: Tue Aug 22 21:29:43 2023 +0200 gpio: sim: pass the GPIO device's software node to irq domain [ Upstream commit 6e39c1ac688161b4db3617aabbca589b395242bc ] Associate the swnode of the GPIO device's (which is the interrupt controller here) with the irq domain. Otherwise the interrupt-controller device attribute is a no-op. Fixes: cb8c474e79be ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski Reviewed-by: Andy Shevchenko Signed-off-by: Sasha Levin commit 3c839f8332dfca0e7480b6c03785cc85da2c5415 Author: Bartosz Golaszewski Date: Tue Aug 22 21:29:42 2023 +0200 gpio: sim: dispose of irq mappings before destroying the irq_sim domain [ Upstream commit ab4109f91b328ff5cb5e1279f64d443241add2d1 ] If a GPIO simulator device is unbound with interrupts still requested, we will hit a use-after-free issue in __irq_domain_deactivate_irq(). The owner of the irq domain must dispose of all mappings before destroying the domain object. Fixes: cb8c474e79be ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski Reviewed-by: Andy Shevchenko Signed-off-by: Sasha Levin commit 3282e79a85c1096b1f59925d7a04a4303d605759 Author: Rob Clark Date: Fri Aug 18 07:59:38 2023 -0700 dma-buf/sw_sync: Avoid recursive lock during fence signal [ Upstream commit e531fdb5cd5ee2564b7fe10c8a9219e2b2fac61e ] If a signal callback releases the sw_sync fence, that will trigger a deadlock as the timeline_fence_release recurses onto the fence->lock (used both for signaling and the the timeline tree). To avoid that, temporarily hold an extra reference to the signalled fences until after we drop the lock. (This is an alternative implementation of https://patchwork.kernel.org/patch/11664717/ which avoids some potential UAF issues with the original patch.) v2: Remove now obsolete comment, use list_move_tail() and list_del_init() Reported-by: Bas Nieuwenhuizen Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free") Signed-off-by: Rob Clark Link: https://patchwork.freedesktop.org/patch/msgid/20230818145939.39697-1-robdclark@gmail.com Reviewed-by: Christian König Signed-off-by: Christian König Signed-off-by: Sasha Levin commit 6ed06b94f68363f4bc53608623059170d3689f4e Author: Biju Das Date: Tue Aug 15 14:15:58 2023 +0100 pinctrl: renesas: rza2: Add lock around pinctrl_generic{{add,remove}_group,{add,remove}_function} [ Upstream commit 8fcc1c40b747069644db6102c1d84c942c9d4d86 ] The pinctrl group and function creation/remove calls expect caller to take care of locking. Add lock around these functions. Fixes: b59d0e782706 ("pinctrl: Add RZ/A2 pin and gpio controller") Signed-off-by: Biju Das Reviewed-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20230815131558.33787-4-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven Signed-off-by: Sasha Levin commit 3fb1b959af17a723491fb8776562274d229c9bfb Author: Biju Das Date: Tue Aug 15 14:15:57 2023 +0100 pinctrl: renesas: rzv2m: Fix NULL pointer dereference in rzv2m_dt_subnode_to_map() [ Upstream commit f982b9d57e7f834138fc908804fe66f646f2b108 ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzv2m_dt_subnode_to_map() with mutex lock. Crash logs: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzv2m_dt_subnode_to_map+0x2e4/0x418 rzv2m_dt_node_to_map+0x15c/0x18c pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add a comment for lock. Fixes: 92a9b8252576 ("pinctrl: renesas: Add RZ/V2M pin and gpio controller driver") Signed-off-by: Biju Das Reviewed-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20230815131558.33787-3-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven Signed-off-by: Sasha Levin commit 4a75bf3f6f4f276a58c4647433ec8676af59ac7c Author: Biju Das Date: Tue Aug 15 14:15:56 2023 +0100 pinctrl: renesas: rzg2l: Fix NULL pointer dereference in rzg2l_dt_subnode_to_map() [ Upstream commit 661efa2284bbc2338da0424e219603f034072c74 ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzg2l_dt_subnode_to_map() with mutex lock. Crash log: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzg2l_dt_subnode_to_map+0x314/0x44c rzg2l_dt_node_to_map+0x164/0x194 pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add comments for bitmap_lock and lock. Fixes: c4c4637eb57f ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver") Tested-by: Chris Paterson Signed-off-by: Biju Das Reviewed-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20230815131558.33787-2-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven Signed-off-by: Sasha Levin commit 0ba9a242a6b37be06e8fc5be4ecbbc254df37ff0 Author: Biju Das Date: Tue Jul 25 18:51:40 2023 +0100 clk: Fix undefined reference to `clk_rate_exclusive_{get,put}' [ Upstream commit 2746f13f6f1df7999001d6595b16f789ecc28ad1 ] The COMMON_CLK config is not enabled in some of the architectures. This causes below build issues: pwm-rz-mtu3.c:(.text+0x114): undefined reference to `clk_rate_exclusive_put' pwm-rz-mtu3.c:(.text+0x32c): undefined reference to `clk_rate_exclusive_get' Fix these issues by moving clk_rate_exclusive_{get,put} inside COMMON_CLK code block, as clk.c is enabled by COMMON_CLK. Fixes: 55e9b8b7b806 ("clk: add clk_rate_exclusive api") Reported-by: kernel test robot Closes: https://lore.kernel.org/all/202307251752.vLfmmhYm-lkp@intel.com/ Signed-off-by: Biju Das Link: https://lore.kernel.org/r/20230725175140.361479-1-biju.das.jz@bp.renesas.com Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit 70461151d0eb765976faf79424385dfdb3e1e468 Author: Zhu Wang Date: Tue Aug 22 01:52:54 2023 +0000 scsi: core: raid_class: Remove raid_component_add() commit 60c5fd2e8f3c42a5abc565ba9876ead1da5ad2b7 upstream. The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com Reviewed-by: Bart Van Assche Fixes: 04b5b5cb0136 ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 774cb3de7ac93b1e3cdbcf29cc0e908b12796473 Author: Zhu Wang Date: Sat Aug 19 08:39:41 2023 +0000 scsi: snic: Fix double free in snic_tgt_create() commit 1bd3a76880b2bce017987cf53780b372cf59528e upstream. Commit 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") fixed the memory leak caused by dev_set_name() when device_add() failed. However, it did not consider that 'tgt' has already been released when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path to avoid double free of 'tgt' and move put_device(&tgt->dev) after the removed kfree(tgt) to avoid a use-after-free. Fixes: 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") Signed-off-by: Zhu Wang Link: https://lore.kernel.org/r/20230819083941.164365-1-wangzhu9@huawei.com Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit bd20e20c4d64e131498ec91f1b066cd4708088b3 Author: Yin Fengwei Date: Tue Aug 8 10:09:17 2023 +0800 madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check commit 0e0e9bd5f7b9d40fd03b70092367247d52da1db0 upstream. Commit 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio") replaced the page_mapcount() with folio_mapcount() to check whether the folio is shared by other mapping. It's not correct for large folios. folio_mapcount() returns the total mapcount of large folio which is not suitable to detect whether the folio is shared. Use folio_estimated_sharers() which returns a estimated number of shares. That means it's not 100% correct. It should be OK for madvise case here. User-visible effects is that the THP is skipped when user call madvise. But the correct behavior is THP should be split and processed then. NOTE: this change is a temporary fix to reduce the user-visible effects before the long term fix from David is ready. Link: https://lkml.kernel.org/r/20230808020917.2230692-4-fengwei.yin@intel.com Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio") Signed-off-by: Yin Fengwei Reviewed-by: Yu Zhao Reviewed-by: Ryan Roberts Cc: David Hildenbrand Cc: Kefeng Wang Cc: Matthew Wilcox Cc: Minchan Kim Cc: Vishal Moola (Oracle) Cc: Yang Shi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit f67e3a725b4975e3be3fda61ea6e4978213dcc2f Author: Oliver Hartkopp Date: Mon Aug 21 16:45:47 2023 +0200 can: raw: add missing refcount for memory leak fix commit c275a176e4b69868576e543409927ae75e3a3288 upstream. Commit ee8b94c8510c ("can: raw: fix receiver memory leak") introduced a new reference to the CAN netdevice that has assigned CAN filters. But this new ro->dev reference did not maintain its own refcount which lead to another KASAN use-after-free splat found by Eric Dumazet. This patch ensures a proper refcount for the CAN nedevice. Fixes: ee8b94c8510c ("can: raw: fix receiver memory leak") Reported-by: Eric Dumazet Cc: Ziyang Xuan Signed-off-by: Oliver Hartkopp Link: https://lore.kernel.org/r/20230821144547.6658-3-socketcan@hartkopp.net Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b7803afc77bee77e7df6662e1959df0038fbaac3 Author: Ming Lei Date: Mon Feb 20 12:14:13 2023 +0800 ublk: remove check IO_URING_F_SQE128 in ublk_ch_uring_cmd commit 9c7c4bc986932218fd0df9d2a100509772028fb1 upstream. sizeof(struct ublksrv_io_cmd) is 16bytes, which can be held in 64byte SQE, so not necessary to check IO_URING_F_SQE128. With this change, we get chance to save half SQ ring memory. Fixed: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20230220041413.1524335-1-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f016326d31d010433b2a1a08a4856c214ae829eb Author: Sanjay R Mehta Date: Wed Aug 2 06:11:49 2023 -0500 thunderbolt: Fix Thunderbolt 3 display flickering issue on 2nd hot plug onwards commit 583893a66d731f5da010a3fa38a0460e05f0149b upstream. Previously, on unplug events, the TMU mode was disabled first followed by the Time Synchronization Handshake, irrespective of whether the tb_switch_tmu_rate_write() API was successful or not. However, this caused a problem with Thunderbolt 3 (TBT3) devices, as the TSPacketInterval bits were always enabled by default, leading the host router to assume that the device router's TMU was already enabled and preventing it from initiating the Time Synchronization Handshake. As a result, TBT3 monitors experienced display flickering from the second hot plug onwards. To address this issue, we have modified the code to only disable the Time Synchronization Handshake during TMU disable if the tb_switch_tmu_rate_write() function is successful. This ensures that the TBT3 devices function correctly and eliminates the display flickering issue. Co-developed-by: Sanath S Signed-off-by: Sanath S Signed-off-by: Sanjay R Mehta Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg [ USB4v2 introduced support for uni-directional TMU mode as part of d49b4f043d63 ("thunderbolt: Add support for enhanced uni-directional TMU mode") This is not a stable candidate commit, so adjust the code for backport. ] Signed-off-by: Mario Limonciello Signed-off-by: Greg Kroah-Hartman commit d3ff67076bed6b09be43faec7339cc51dfba451d Author: Dietmar Eggemann Date: Sun Aug 20 16:24:17 2023 +0100 cgroup/cpuset: Free DL BW in case can_attach() fails commit 2ef269ef1ac006acf974793d975539244d77b28f upstream. cpuset_can_attach() can fail. Postpone DL BW allocation until all tasks have been checked. DL BW is not allocated per-task but as a sum over all DL tasks migrating. If multiple controllers are attached to the cgroup next to the cpuset controller a non-cpuset can_attach() can fail. In this case free DL BW in cpuset_cancel_attach(). Finally, update cpuset DL task count (nr_deadline_tasks) only in cpuset_attach(). Suggested-by: Waiman Long Signed-off-by: Dietmar Eggemann Signed-off-by: Juri Lelli Reviewed-by: Waiman Long Signed-off-by: Tejun Heo Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit f0135131bb0e5b02313898da809822f4edca7e4f Author: Dietmar Eggemann Date: Sun Aug 20 16:24:16 2023 +0100 sched/deadline: Create DL BW alloc, free & check overflow interface commit 85989106feb734437e2d598b639991b9185a43a6 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by: Dietmar Eggemann Signed-off-by: Juri Lelli Signed-off-by: Tejun Heo Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit 064b960dbe942114a397788a57474c40cea04185 Author: Juri Lelli Date: Sun Aug 20 16:24:15 2023 +0100 cgroup/cpuset: Iterate only if DEADLINE tasks are present commit c0f78fd5edcf29b2822ac165f9248a6c165e8554 upstream. update_tasks_root_domain currently iterates over all tasks even if no DEADLINE task is present on the cpuset/root domain for which bandwidth accounting is being rebuilt. This has been reported to introduce 10+ ms delays on suspend-resume operations. Skip the costly iteration for cpusets that don't contain DEADLINE tasks. Reported-by: Qais Yousef (Google) Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli Reviewed-by: Waiman Long Signed-off-by: Tejun Heo Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit d1b4262b78cc7638642833252ae92fc586854ffb Author: Juri Lelli Date: Sun Aug 20 16:24:14 2023 +0100 sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets commit 6c24849f5515e4966d94fa5279bdff4acf2e9489 upstream. Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by: Qais Yousef (Google) Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli Reviewed-by: Waiman Long Signed-off-by: Tejun Heo Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit 9bcfe1527882deac71d367ea1ff59d38c0fe9486 Author: Juri Lelli Date: Sun Aug 20 16:24:13 2023 +0100 sched/cpuset: Bring back cpuset_mutex commit 111cd11bbc54850f24191c52ff217da88a5e639b upstream. Turns out percpu_cpuset_rwsem - commit 1243dc518c9d ("cgroup/cpuset: Convert cpuset_mutex to percpu_rwsem") - wasn't such a brilliant idea, as it has been reported to cause slowdowns in workloads that need to change cpuset configuration frequently and it is also not implementing priority inheritance (which causes troubles with realtime workloads). Convert percpu_cpuset_rwsem back to regular cpuset_mutex. Also grab it only for SCHED_DEADLINE tasks (other policies don't care about stable cpusets anyway). Signed-off-by: Juri Lelli Reviewed-by: Waiman Long Signed-off-by: Tejun Heo [ Conflict in kernel/cgroup/cpuset.c due to pulling new code/comments. Reject all new code. Remove BUG_ON() about rwsem that doesn't exist on mainline. ] Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit 7030fbf75f260924b2ab98195e6f9fe7b8c80884 Author: Juri Lelli Date: Sun Aug 20 16:24:12 2023 +0100 cgroup/cpuset: Rename functions dealing with DEADLINE accounting commit ad3a557daf6915296a43ef97a3e9c48e076c9dd8 upstream. rebuild_root_domains() and update_tasks_root_domain() have neutral names, but actually deal with DEADLINE bandwidth accounting. Rename them to use 'dl_' prefix so that intent is more clear. No functional change. Suggested-by: Qais Yousef (Google) Signed-off-by: Juri Lelli Reviewed-by: Waiman Long Signed-off-by: Tejun Heo Signed-off-by: Qais Yousef (Google) Signed-off-by: Greg Kroah-Hartman commit ce59b7c1b027bb2da6a69a81bcca63783561ae9b Author: Christian Brauner Date: Tue May 2 15:36:02 2023 +0200 nfsd: use vfs setgid helper commit 2d8ae8c417db284f598dffb178cc01e7db0f1821 upstream. We've aligned setgid behavior over multiple kernel releases. The details can be found in commit cf619f891971 ("Merge tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping") and commit 426b4ca2d6a5 ("Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux"). Consistent setgid stripping behavior is now encapsulated in the setattr_should_drop_sgid() helper which is used by all filesystems that strip setgid bits outside of vfs proper. Usually ATTR_KILL_SGID is raised in e.g., chown_common() and is subject to the setattr_should_drop_sgid() check to determine whether the setgid bit can be retained. Since nfsd is raising ATTR_KILL_SGID unconditionally it will cause notify_change() to strip it even if the caller had the necessary privileges to retain it. Ensure that nfsd only raises ATR_KILL_SGID if the caller lacks the necessary privileges to retain the setgid bit. Without this patch the setgid stripping tests in LTP will fail: > As you can see, the problem is S_ISGID (0002000) was dropped on a > non-group-executable file while chown was invoked by super-user, while [...] > fchown02.c:66: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 [...] > chown02.c:57: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 With this patch all tests pass. Reported-by: Sherry Yang Signed-off-by: Christian Brauner Reviewed-by: Jeff Layton Cc: Signed-off-by: Chuck Lever [Harshit: backport to 6.1.y: Use init_user_ns instead of nop_mnt_idmap as we don't have commit abf08576afe3 ("fs: port vfs_*() helpers to struct mnt_idmap")] Signed-off-by: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit 362ed5d9311466c0322ab18fd271373ae883081f Author: Christian Brauner Date: Tue Mar 14 12:51:10 2023 +0100 nfs: use vfs setgid helper commit 4f704d9a8352f5c0a8fcdb6213b934630342bd44 upstream. We've aligned setgid behavior over multiple kernel releases. The details can be found in the following two merge messages: cf619f891971 ("Merge tag 'fs.ovl.setgid.v6.2') 426b4ca2d6a5 ("Merge tag 'fs.setgid.v6.0') Consistent setgid stripping behavior is now encapsulated in the setattr_should_drop_sgid() helper which is used by all filesystems that strip setgid bits outside of vfs proper. Switch nfs to rely on this helper as well. Without this patch the setgid stripping tests in xfstests will fail. Signed-off-by: Christian Brauner (Microsoft) Reviewed-by: Christoph Hellwig Message-Id: <20230313-fs-nfs-setgid-v2-1-9a59f436cfc0@kernel.org> Signed-off-by: Christian Brauner [ Harshit: backport to 6.1.y: fs/internal.h -- minor conflict due to code change differences. include/linux/fs.h -- Used struct user_namespace *mnt_userns instead of struct mnt_idmap *idmap fs/nfs/inode.c -- Used init_user_ns instead of nop_mnt_idmap ] Signed-off-by: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit a0ec52f36ce98796aa3076885b6b96dc7e093384 Author: Hangbin Liu Date: Wed Jan 18 10:09:27 2023 +0800 selftests/net: mv bpf/nat6to4.c to net folder commit 3c107f36db061603bee7564fbd6388b1f1879fd3 upstream. There are some issues with the bpf/nat6to4.c building. 1. It use TEST_CUSTOM_PROGS, which will add the nat6to4.o to kselftest-list file and run by common run_tests. 2. When building the test via `make -C tools/testing/selftests/ TARGETS="net"`, the nat6to4.o will be build in selftests/net/bpf/ folder. But in test udpgro_frglist.sh it refers to ../bpf/nat6to4.o. The correct path should be ./bpf/nat6to4.o. 3. If building the test via `make -C tools/testing/selftests/ TARGETS="net" install`. The nat6to4.o will be installed to kselftest_install/net/ folder. Then the udpgro_frglist.sh should refer to ./nat6to4.o. To fix the confusing test path, let's just move the nat6to4.c to net folder and build it as TEST_GEN_FILES. Fixes: edae34a3ed92 ("selftests net: add UDP GRO fraglist + bpf self-tests") Tested-by: Björn Töpel Signed-off-by: Hangbin Liu Link: https://lore.kernel.org/r/20230118020927.3971864-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni Signed-off-by: Hardik Garg Signed-off-by: Greg Kroah-Hartman commit f1fa6e6f85cb5cbf3f1e78c42c25f3b974584543 Author: Aleksa Savic Date: Mon Aug 7 19:20:03 2023 +0200 hwmon: (aquacomputer_d5next) Add selective 200ms delay after sending ctrl report commit 56b930dcd88c2adc261410501c402c790980bdb5 upstream. Add a 200ms delay after sending a ctrl report to Quadro, Octo, D5 Next and Aquaero to give them enough time to process the request and save the data to memory. Otherwise, under heavier userspace loads where multiple sysfs entries are usually set in quick succession, a new ctrl report could be requested from the device while it's still processing the previous one and fail with -EPIPE. The delay is only applied if two ctrl report operations are near each other in time. Reported by a user on Github [1] and tested by both of us. [1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82 Fixes: 752b927951ea ("hwmon: (aquacomputer_d5next) Add support for Aquacomputer Octo") Signed-off-by: Aleksa Savic Link: https://lore.kernel.org/r/20230807172004.456968-1-savicaleksa83@gmail.com Signed-off-by: Guenter Roeck [ removed Aquaero support as it's not in 6.1 ] Signed-off-by: Aleksa Savic Signed-off-by: Greg Kroah-Hartman commit d8f9a9cfdcd31290cb8b720746458cb110301c68 Author: Feng Tang Date: Wed Aug 23 14:57:47 2023 +0800 x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4 commit 2c66ca3949dc701da7f4c9407f2140ae425683a5 upstream. 0-Day found a 34.6% regression in stress-ng's 'af-alg' test case, and bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()"), which optimizes the FPU init order, and moves the CR4_OSXSAVE enabling into a later place: arch_cpu_finalize_init identify_boot_cpu identify_cpu generic_identify get_cpu_cap --> setup cpu capability ... fpu__init_cpu fpu__init_cpu_xstate cr4_set_bits(X86_CR4_OSXSAVE); As the FPU is not yet initialized the CPU capability setup fails to set X86_FEATURE_OSXSAVE. Many security module like 'camellia_aesni_avx_x86_64' depend on this feature and therefore fail to load, causing the regression. Cure this by setting X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling. [ tglx: Moved it into the actual BSP FPU initialization code and added a comment ] Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()") Reported-by: kernel test robot Signed-off-by: Feng Tang Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com Link: https://lore.kernel.org/lkml/20230823065747.92257-1-feng.tang@intel.com Signed-off-by: Greg Kroah-Hartman commit 6bcb9c7d043578a71a2409a7c4bdb2febe471cce Author: Rick Edgecombe Date: Fri Aug 18 10:03:05 2023 -0700 x86/fpu: Invalidate FPU state correctly on exec() commit 1f69383b203e28cf8a4ca9570e572da1699f76cd upstream. The thread flag TIF_NEED_FPU_LOAD indicates that the FPU saved state is valid and should be reloaded when returning to userspace. However, the kernel will skip doing this if the FPU registers are already valid as determined by fpregs_state_valid(). The logic embedded there considers the state valid if two cases are both true: 1: fpu_fpregs_owner_ctx points to the current tasks FPU state 2: the last CPU the registers were live in was the current CPU. This is usually correct logic. A CPU’s fpu_fpregs_owner_ctx is set to the current FPU during the fpregs_restore_userregs() operation, so it indicates that the registers have been restored on this CPU. But this alone doesn’t preclude that the task hasn’t been rescheduled to a different CPU, where the registers were modified, and then back to the current CPU. To verify that this was not the case the logic relies on the second condition. So the assumption is that if the registers have been restored, AND they haven’t had the chance to be modified (by being loaded on another CPU), then they MUST be valid on the current CPU. Besides the lazy FPU optimizations, the other cases where the FPU registers might not be valid are when the kernel modifies the FPU register state or the FPU saved buffer. In this case the operation modifying the FPU state needs to let the kernel know the correspondence has been broken. The comment in “arch/x86/kernel/fpu/context.h” has: /* ... * If the FPU register state is valid, the kernel can skip restoring the * FPU state from memory. * * Any code that clobbers the FPU registers or updates the in-memory * FPU state for a task MUST let the rest of the kernel know that the * FPU registers are no longer valid for this task. * * Either one of these invalidation functions is enough. Invalidate * a resource you control: CPU if using the CPU for something else * (with preemption disabled), FPU for the current task, or a task that * is prevented from running by the current task. */ However, this is not completely true. When the kernel modifies the registers or saved FPU state, it can only rely on __fpu_invalidate_fpregs_state(), which wipes the FPU’s last_cpu tracking. The exec path instead relies on fpregs_deactivate(), which sets the CPU’s FPU context to NULL. This was observed to fail to restore the reset FPU state to the registers when returning to userspace in the following scenario: 1. A task is executing in userspace on CPU0 - CPU0’s FPU context points to tasks - fpu->last_cpu=CPU0 2. The task exec()’s 3. While in the kernel the task is preempted - CPU0 gets a thread executing in the kernel (such that no other FPU context is activated) - Scheduler sets task’s fpu->last_cpu=CPU0 when scheduling out 4. Task is migrated to CPU1 5. Continuing the exec(), the task gets to fpu_flush_thread()->fpu_reset_fpregs() - Sets CPU1’s fpu context to NULL - Copies the init state to the task’s FPU buffer - Sets TIF_NEED_FPU_LOAD on the task 6. The task reschedules back to CPU0 before completing the exec() and returning to userspace - During the reschedule, scheduler finds TIF_NEED_FPU_LOAD is set - Skips saving the registers and updating task’s fpu→last_cpu, because TIF_NEED_FPU_LOAD is the canonical source. 7. Now CPU0’s FPU context is still pointing to the task’s, and fpu->last_cpu is still CPU0. So fpregs_state_valid() returns true even though the reset FPU state has not been restored. So the root cause is that exec() is doing the wrong kind of invalidate. It should reset fpu->last_cpu via __fpu_invalidate_fpregs_state(). Further, fpu__drop() doesn't really seem appropriate as the task (and FPU) are not going away, they are just getting reset as part of an exec. So switch to __fpu_invalidate_fpregs_state(). Also, delete the misleading comment that says that either kind of invalidate will be enough, because it’s not always the case. Fixes: 33344368cb08 ("x86/fpu: Clean up the fpu__clear() variants") Reported-by: Lei Wang Signed-off-by: Rick Edgecombe Signed-off-by: Thomas Gleixner Tested-by: Lijun Pan Reviewed-by: Sohil Mehta Acked-by: Lijun Pan Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230818170305.502891-1-rick.p.edgecombe@intel.com Signed-off-by: Greg Kroah-Hartman commit 3bc9b0364a8c64d1bb1757b620ea3b9104e8054b Author: Ankit Nautiyal Date: Fri Aug 18 10:14:36 2023 +0530 drm/display/dp: Fix the DP DSC Receiver cap size commit 5ad1ab30ac0809d2963ddcf39ac34317a24a2f17 upstream. DP DSC Receiver Capabilities are exposed via DPCD 60h-6Fh. Fix the DSC RECEIVER CAP SIZE accordingly. Fixes: ffddc4363c28 ("drm/dp: Add DP DSC DPCD receiver capability size define and missing SHIFT") Cc: Anusha Srivatsa Cc: Manasi Navare Cc: # v5.0+ Signed-off-by: Ankit Nautiyal Reviewed-by: Stanislav Lisovskiy Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20230818044436.177806-1-ankit.k.nautiyal@intel.com Signed-off-by: Greg Kroah-Hartman commit 3abffee6091c5a2716963c229e192a36a9590a88 Author: Anshuman Gupta Date: Wed Aug 16 18:22:16 2023 +0530 drm/i915/dgfx: Enable d3cold at s2idle commit 2872144aec04baa7e43ecd2a60f7f0be3aa843fd upstream. System wide suspend already has support for lmem save/restore during suspend therefore enabling d3cold for s2idle and keepng it disable for runtime PM.(Refer below commit for d3cold runtime PM disable justification) 'commit 66eb93e71a7a ("drm/i915/dgfx: Keep PCI autosuspend control 'on' by default on all dGPU")' It will reduce the DG2 Card power consumption to ~0 Watt for s2idle power KPI. v2: - Added "Cc: stable@vger.kernel.org". Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8755 Cc: stable@vger.kernel.org Cc: Rodrigo Vivi Signed-off-by: Anshuman Gupta Reviewed-by: Rodrigo Vivi Tested-by: Aaron Ma Tested-by: Jianshui Yu Link: https://patchwork.freedesktop.org/patch/msgid/20230816125216.1722002-1-anshuman.gupta@intel.com (cherry picked from commit 2643e6d1f2a5e51877be24042d53cf956589be10) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 115f2ccd3a998fe7247f59f8fb5feffc878bcbb7 Author: Zack Rusin Date: Fri Jun 16 15:09:34 2023 -0400 drm/vmwgfx: Fix shader stage validation commit 14abdfae508228a7307f7491b5c4215ae70c6542 upstream. For multiple commands the driver was not correctly validating the shader stages resulting in possible kernel oopses. The validation code was only. if ever, checking the upper bound on the shader stages but never a lower bound (valid shader stages start at 1 not 0). Fixes kernel oopses ending up in vmw_binding_add, e.g.: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 2443 Comm: testcase Not tainted 6.3.0-rc4-vmwgfx #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 RIP: 0010:vmw_binding_add+0x4c/0x140 [vmwgfx] Code: 7e 30 49 83 ff 0e 0f 87 ea 00 00 00 4b 8d 04 7f 89 d2 89 cb 48 c1 e0 03 4c 8b b0 40 3d 93 c0 48 8b 80 48 3d 93 c0 49 0f af de <48> 03 1c d0 4c 01 e3 49 8> RSP: 0018:ffffb8014416b968 EFLAGS: 00010206 RAX: ffffffffc0933ec0 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 00000000ffffffff RSI: ffffb8014416b9c0 RDI: ffffb8014316f000 RBP: ffffb8014416b998 R08: 0000000000000003 R09: 746f6c735f726564 R10: ffffffffaaf2bda0 R11: 732e676e69646e69 R12: ffffb8014316f000 R13: ffffb8014416b9c0 R14: 0000000000000040 R15: 0000000000000006 FS: 00007fba8c0af740(0000) GS:ffff8a1277c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000007c0933eb8 CR3: 0000000118244001 CR4: 00000000003706e0 Call Trace: vmw_view_bindings_add+0xf5/0x1b0 [vmwgfx] ? ___drm_dbg+0x8a/0xb0 [drm] vmw_cmd_dx_set_shader_res+0x8f/0xc0 [vmwgfx] vmw_execbuf_process+0x590/0x1360 [vmwgfx] vmw_execbuf_ioctl+0x173/0x370 [vmwgfx] ? __drm_dev_dbg+0xb4/0xe0 [drm] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx] drm_ioctl_kernel+0xbc/0x160 [drm] drm_ioctl+0x2d2/0x580 [drm] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx] ? do_fault+0x1a6/0x420 vmw_generic_ioctl+0xbd/0x180 [vmwgfx] vmw_unlocked_ioctl+0x19/0x20 [vmwgfx] __x64_sys_ioctl+0x96/0xd0 do_syscall_64+0x5d/0x90 ? handle_mm_fault+0xe4/0x2f0 ? debug_smp_processor_id+0x1b/0x30 ? fpregs_assert_state_consistent+0x2e/0x50 ? exit_to_user_mode_prepare+0x40/0x180 ? irqentry_exit_to_user_mode+0xd/0x20 ? irqentry_exit+0x3f/0x50 ? exc_page_fault+0x8b/0x180 entry_SYSCALL_64_after_hwframe+0x72/0xdc Signed-off-by: Zack Rusin Cc: security@openanolis.org Reported-by: Ziming Zhang Testcase-found-by: Niels De Graef Fixes: d80efd5cb3de ("drm/vmwgfx: Initial DX support") Cc: # v4.3+ Reviewed-by: Maaz Mombasawala Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20230616190934.54828-1-zack@kde.org Signed-off-by: Greg Kroah-Hartman commit 1900e193b5ddd32f55fef7369152531a1886879e Author: Igor Mammedov Date: Wed Jul 26 14:35:18 2023 +0200 PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus commit cc22522fd55e257c86d340ae9aedc122e705a435 upstream. 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") changed acpiphp hotplug to use pci_assign_unassigned_bridge_resources() which depends on bridge being available, however enable_slot() can be called without bridge associated: 1. Legitimate case of hotplug on root bus (widely used in virt world) 2. A (misbehaving) firmware, that sends ACPI Bus Check notifications to non existing root ports (Dell Inspiron 7352/0W6WV0), which end up at enable_slot(..., bridge = 0) where bus has no bridge assigned to it. acpihp doesn't know that it's a bridge, and bus specific 'PCI subsystem' can't augment ACPI context with bridge information since the PCI device to get this data from is/was not available. Issue is easy to reproduce with QEMU's 'pc' machine, which supports PCI hotplug on hostbridge slots. To reproduce, boot kernel at commit 40613da52b13 in VM started with following CLI (assuming guest root fs is installed on sda1 partition): # qemu-system-x86_64 -M pc -m 1G -enable-kvm -cpu host \ -monitor stdio -serial file:serial.log \ -kernel arch/x86/boot/bzImage \ -append "root=/dev/sda1 console=ttyS0" \ guest_disk.img Once guest OS is fully booted at qemu prompt: (qemu) device_add e1000 (check serial.log) it will cause NULL pointer dereference at: void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge) { struct pci_bus *parent = bridge->subordinate; BUG: kernel NULL pointer dereference, address: 0000000000000018 ? pci_assign_unassigned_bridge_resources+0x1f/0x260 enable_slot+0x21f/0x3e0 acpiphp_hotplug_notify+0x13d/0x260 acpi_device_hotplug+0xbc/0x540 acpi_hotplug_work_fn+0x15/0x20 process_one_work+0x1f7/0x370 worker_thread+0x45/0x3b0 The issue was discovered on Dell Inspiron 7352/0W6WV0 laptop with following sequence: 1. Suspend to RAM 2. Wake up with the same backtrace being observed: 3. 2nd suspend to RAM attempt makes laptop freeze Fix it by using __pci_bus_assign_resources() instead of pci_assign_unassigned_bridge_resources() as we used to do, but only in case when bus doesn't have a bridge associated (to cover for the case of ACPI event on hostbridge or non existing root port). That lets us keep hotplug on root bus working like it used to and at the same time keeps resource reassignment usable on root ports (and other 1st level bridges) that was fixed by 40613da52b13. Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") Link: https://lore.kernel.org/r/20230726123518.2361181-2-imammedo@redhat.com Reported-by: Woody Suwalski Tested-by: Woody Suwalski Tested-by: Michal Koutný Link: https://lore.kernel.org/r/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com Signed-off-by: Igor Mammedov Signed-off-by: Bjorn Helgaas Acked-by: Rafael J. Wysocki Acked-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit fe04122b932118e968ea4ba88bdc62aa806b88d1 Author: Wei Chen Date: Thu Aug 10 08:23:33 2023 +0000 media: vcodec: Fix potential array out-of-bounds in encoder queue_setup commit e7f2e65699e2290fd547ec12a17008764e5d9620 upstream. variable *nplanes is provided by user via system call argument. The possible value of q_data->fmt->num_planes is 1-3, while the value of *nplanes can be 1-8. The array access by index i can cause array out-of-bounds. Fix this bug by checking *nplanes against the array size. Fixes: 4e855a6efa54 ("[media] vcodec: mediatek: Add Mediatek V4L2 Video Encoder Driver") Signed-off-by: Wei Chen Cc: stable@vger.kernel.org Reviewed-by: Chen-Yu Tsai Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 4919043ab93bb662961ffe34c55557b0aafd3bcd Author: Mario Limonciello Date: Fri Aug 18 09:48:50 2023 -0500 pinctrl: amd: Mask wake bits on probe again commit 6bc3462a0f5ecaa376a0b3d76dafc55796799e17 upstream. Shubhra reports that their laptop is heating up over s2idle. Even though it's getting into the deepest state, it appears to be having spurious wakeup events. While debugging a tangential issue with the RTC Carsten reports that recent 6.1.y based kernel face a similar problem. Looking at acpidump and GPIO register comparisons these spurious wakeup events are from the GPIO associated with the I2C touchpad on both laptops and occur even when the touchpad is not marked as a wake source by the kernel. This means that the boot firmware has programmed these bits and because Linux didn't touch them lead to spurious wakeup events from that GPIO. To fix this issue, restore most of the code that previously would clear all the bits associated with wakeup sources. This will allow the kernel to only program the wake up sources that are necessary. This is similar to what was done previously; but only the wake bits are cleared by default instead of interrupts and wake bits. If any other problems are reported then it may make sense to clear interrupts again too. Cc: Sachi King Cc: stable@vger.kernel.org Cc: Thorsten Leemhuis Fixes: 65f6c7c91cb2 ("pinctrl: amd: Revert "pinctrl: amd: disable and mask interrupts on probe"") Reported-by: Shubhra Prakash Nandi Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217754 Reported-by: Carsten Hatger Link: https://bugzilla.kernel.org/show_bug.cgi?id=217626#c28 Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20230818144850.1439-1-mario.limonciello@amd.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit c6b7d8902025a96e61c731dcb75dd8fb91c813e5 Author: Rob Herring Date: Fri Aug 18 15:40:57 2023 -0500 of: dynamic: Refactor action prints to not use "%pOF" inside devtree_lock commit 914d9d831e6126a6e7a92e27fcfaa250671be42c upstream. While originally it was fine to format strings using "%pOF" while holding devtree_lock, this now causes a deadlock. Lockdep reports: of_get_parent from of_fwnode_get_parent+0x18/0x24 ^^^^^^^^^^^^^ of_fwnode_get_parent from fwnode_count_parents+0xc/0x28 fwnode_count_parents from fwnode_full_name_string+0x18/0xac fwnode_full_name_string from device_node_string+0x1a0/0x404 device_node_string from pointer+0x3c0/0x534 pointer from vsnprintf+0x248/0x36c vsnprintf from vprintk_store+0x130/0x3b4 Fix this by moving the printing in __of_changeset_entry_apply() outside the lock. As the only difference in the multiple prints is the action name, use the existing "action_names" to refactor the prints into a single print. Fixes: a92eb7621b9fb2c2 ("lib/vsprintf: Make use of fwnode API to obtain node names and separators") Cc: stable@vger.kernel.org Reported-by: Geert Uytterhoeven Reviewed-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20230801-dt-changeset-fixes-v3-2-5f0410e007dd@kernel.org Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 2d00ca90b81e2ea5f2b604947546be08bbe14094 Author: Rob Herring Date: Fri Aug 18 15:40:56 2023 -0500 of: unittest: Fix EXPECT for parse_phandle_with_args_map() test commit 0aeae3788e28f64ccb95405d4dc8cd80637ffaea upstream. Commit 12e17243d8a1 ("of: base: improve error msg in of_phandle_iterator_next()") added printing of the phandle value on error, but failed to update the unittest. Fixes: 12e17243d8a1 ("of: base: improve error msg in of_phandle_iterator_next()") Cc: stable@vger.kernel.org Reviewed-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20230801-dt-changeset-fixes-v3-1-5f0410e007dd@kernel.org Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit e75de82b378617afd20805551e2e3596fbb447a1 Author: Arnd Bergmann Date: Fri Aug 11 15:10:13 2023 +0200 radix tree: remove unused variable commit d59070d1076ec5114edb67c87658aeb1d691d381 upstream. Recent versions of clang warn about an unused variable, though older versions saw the 'slot++' as a use and did not warn: radix-tree.c:1136:50: error: parameter 'slot' set but not used [-Werror,-Wunused-but-set-parameter] It's clearly not needed any more, so just remove it. Link: https://lkml.kernel.org/r/20230811131023.2226509-1-arnd@kernel.org Fixes: 3a08cd52c37c7 ("radix tree: Remove multiorder support") Signed-off-by: Arnd Bergmann Cc: Matthew Wilcox Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Peng Zhang Cc: Rong Tao Cc: Tom Rix Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit aa096bc3c8c09fbd836130292ff08cfe0df40d79 Author: Mingzheng Xing Date: Fri Aug 25 03:08:52 2023 +0800 riscv: Fix build errors using binutils2.37 toolchains commit ef21fa7c198e04f3d3053b1c5b5f2b4b225c3350 upstream. When building the kernel with binutils 2.37 and GCC-11.1.0/GCC-11.2.0, the following error occurs: Assembler messages: Error: cannot find default versions of the ISA extension `zicsr' Error: cannot find default versions of the ISA extension `zifencei' The above error originated from this commit of binutils[0], which has been resolved and backported by GCC-12.1.0[1] and GCC-11.3.0[2]. So fix this by change the GCC version in CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC to GCC-11.3.0. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f0bae2552db1dd4f1995608fbf6648fcee4e9e0c [0] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ca2bbb88f999f4d3cc40e89bc1aba712505dd598 [1] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d29f5d6ab513c52fd872f532c492e35ae9fd6671 [2] Fixes: ca09f772ccca ("riscv: Handle zicsr/zifencei issue between gcc and binutils") Reported-by: Conor Dooley Cc: Signed-off-by: Mingzheng Xing Link: https://lore.kernel.org/r/20230824190852.45470-1-xingmingzheng@iscas.ac.cn Closes: https://lore.kernel.org/all/20230823-captive-abdomen-befd942a4a73@wendy/ Reviewed-by: Conor Dooley Tested-by: Conor Dooley Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 33835975740e65c42ad147799bcd6aa430a7eef0 Author: Mingzheng Xing Date: Thu Aug 10 00:56:48 2023 +0800 riscv: Handle zicsr/zifencei issue between gcc and binutils commit ca09f772cccaeec4cd05a21528c37a260aa2dd2c upstream. Binutils-2.38 and GCC-12.1.0 bumped[0][1] the default ISA spec to the newer 20191213 version which moves some instructions from the I extension to the Zicsr and Zifencei extensions. So if one of the binutils and GCC exceeds that version, we should explicitly specifying Zicsr and Zifencei via -march to cope with the new changes. but this only occurs when binutils >= 2.36 and GCC >= 11.1.0. It's a different story when binutils < 2.36. binutils-2.36 supports the Zifencei extension[2] and splits Zifencei and Zicsr from I[3]. GCC-11.1.0 is particular[4] because it add support Zicsr and Zifencei extension for -march. binutils-2.35 does not support the Zifencei extension, and does not need to specify Zicsr and Zifencei when working with GCC >= 12.1.0. To make our lives easier, let's relax the check to binutils >= 2.36 in CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI. For the other two cases, where clang < 17 or GCC < 11.1.0, we will deal with them in CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC. For more information, please refer to: commit 6df2a016c0c8 ("riscv: fix build with binutils 2.38") commit e89c2e815e76 ("riscv: Handle zicsr/zifencei issues between clang and binutils") Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc [0] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=98416dbb0a62579d4a7a4a76bab51b5b52fec2cd [1] Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=5a1b31e1e1cee6e9f1c92abff59cdcfff0dddf30 [2] Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=729a53530e86972d1143553a415db34e6e01d5d2 [3] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b03be74bad08c382da47e048007a78fa3fb4ef49 [4] Link: https://lore.kernel.org/all/20230308220842.1231003-1-conor@kernel.org Link: https://lore.kernel.org/all/20230223220546.52879-1-conor@kernel.org Reviewed-by: Conor Dooley Acked-by: Guo Ren Cc: Signed-off-by: Mingzheng Xing Link: https://lore.kernel.org/r/20230809165648.21071-1-xingmingzheng@iscas.ac.cn Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 30ffd5890a0349187b820cc09b182f41f267d7b9 Author: Helge Deller Date: Fri Aug 25 21:50:33 2023 +0200 lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels commit 382d4cd1847517ffcb1800fd462b625db7b2ebea upstream. The gcc compiler translates on some architectures the 64-bit __builtin_clzll() function to a call to the libgcc function __clzdi2(), which should take a 64-bit parameter on 32- and 64-bit platforms. But in the current kernel code, the built-in __clzdi2() function is defined to operate (wrongly) on 32-bit parameters if BITS_PER_LONG == 32, thus the return values on 32-bit kernels are in the range from [0..31] instead of the expected [0..63] range. This patch fixes the in-kernel functions __clzdi2() and __ctzdi2() to take a 64-bit parameter on 32-bit kernels as well, thus it makes the functions identical for 32- and 64-bit kernels. This bug went unnoticed since kernel 3.11 for over 10 years, and here are some possible reasons for that: a) Some architectures have assembly instructions to count the bits and which are used instead of calling __clzdi2(), e.g. on x86 the bsr instruction and on ppc cntlz is used. On such architectures the wrong __clzdi2() implementation isn't used and as such the bug has no effect and won't be noticed. b) Some architectures link to libgcc.a, and the in-kernel weak functions get replaced by the correct 64-bit variants from libgcc.a. c) __builtin_clzll() and __clzdi2() doesn't seem to be used in many places in the kernel, and most likely only in uncritical functions, e.g. when printing hex values via seq_put_hex_ll(). The wrong return value will still print the correct number, but just in a wrong formatting (e.g. with too many leading zeroes). d) 32-bit kernels aren't used that much any longer, so they are less tested. A trivial testcase to verify if the currently running 32-bit kernel is affected by the bug is to look at the output of /proc/self/maps: Here the kernel uses a correct implementation of __clzdi2(): root@debian:~# cat /proc/self/maps 00010000-00019000 r-xp 00000000 08:05 787324 /usr/bin/cat 00019000-0001a000 rwxp 00009000 08:05 787324 /usr/bin/cat 0001a000-0003b000 rwxp 00000000 00:00 0 [heap] f7551000-f770d000 r-xp 00000000 08:05 794765 /usr/lib/hppa-linux-gnu/libc.so.6 ... and this kernel uses the broken implementation of __clzdi2(): root@debian:~# cat /proc/self/maps 0000000010000-0000000019000 r-xp 00000000 000000008:000000005 787324 /usr/bin/cat 0000000019000-000000001a000 rwxp 000000009000 000000008:000000005 787324 /usr/bin/cat 000000001a000-000000003b000 rwxp 00000000 00:00 0 [heap] 00000000f73d1000-00000000f758d000 r-xp 00000000 000000008:000000005 794765 /usr/lib/hppa-linux-gnu/libc.so.6 ... Signed-off-by: Helge Deller Fixes: 4df87bb7b6a22 ("lib: add weak clz/ctz functions") Cc: Chanho Min Cc: Geert Uytterhoeven Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 82bb5f8aba00cb38759957346119c6fc37f09518 Author: Sven Eckelmann Date: Mon Aug 21 21:48:48 2023 +0200 batman-adv: Hold rtnl lock during MTU update via netlink commit 987aae75fc1041072941ffb622b45ce2359a99b9 upstream. The automatic recalculation of the maximum allowed MTU is usually triggered by code sections which are already rtnl lock protected by callers outside of batman-adv. But when the fragmentation setting is changed via batman-adv's own batadv genl family, then the rtnl lock is not yet taken. But dev_set_mtu requires that the caller holds the rtnl lock because it uses netdevice notifiers. And this code will then fail the check for this lock: RTNL: assertion failed at net/core/dev.c (1953) Cc: stable@vger.kernel.org Reported-by: syzbot+f8812454d9b3ac00d282@syzkaller.appspotmail.com Fixes: c6a953cce8d0 ("batman-adv: Trigger events for auto adjusted MTU") Signed-off-by: Sven Eckelmann Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230821-batadv-missing-mtu-rtnl-lock-v1-1-1c5a7bfe861e@narfation.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit cb1f73e691bb6c46eda7ed7e60c6ce6672094169 Author: Remi Pommarel Date: Wed Aug 9 17:29:13 2023 +0200 batman-adv: Fix batadv_v_ogm_aggr_send memory leak commit 421d467dc2d483175bad4fb76a31b9e5a3d744cf upstream. When batadv_v_ogm_aggr_send is called for an inactive interface, the skb is silently dropped by batadv_v_ogm_send_to_if() but never freed causing the following memory leak: unreferenced object 0xffff00000c164800 (size 512): comm "kworker/u8:1", pid 2648, jiffies 4295122303 (age 97.656s) hex dump (first 32 bytes): 00 80 af 09 00 00 ff ff e1 09 00 00 75 01 60 83 ............u.`. 1f 00 00 00 b8 00 00 00 15 00 05 00 da e3 d3 64 ...............d backtrace: [<0000000007ad20f6>] __kmalloc_track_caller+0x1a8/0x310 [<00000000d1029e55>] kmalloc_reserve.constprop.0+0x70/0x13c [<000000008b9d4183>] __alloc_skb+0xec/0x1fc [<00000000c7af5051>] __netdev_alloc_skb+0x48/0x23c [<00000000642ee5f5>] batadv_v_ogm_aggr_send+0x50/0x36c [<0000000088660bd7>] batadv_v_ogm_aggr_work+0x24/0x40 [<0000000042fc2606>] process_one_work+0x3b0/0x610 [<000000002f2a0b1c>] worker_thread+0xa0/0x690 [<0000000059fae5d4>] kthread+0x1fc/0x210 [<000000000c587d3a>] ret_from_fork+0x10/0x20 Free the skb in that case to fix this leak. Cc: stable@vger.kernel.org Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure") Signed-off-by: Remi Pommarel Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit f1bead97f0ad91bb3c1b90449400654fd5814bf6 Author: Remi Pommarel Date: Fri Aug 4 11:39:36 2023 +0200 batman-adv: Fix TT global entry leak when client roamed back commit d25ddb7e788d34cf27ff1738d11a87cb4b67d446 upstream. When a client roamed back to a node before it got time to destroy the pending local entry (i.e. within the same originator interval) the old global one is directly removed from hash table and left as such. But because this entry had an extra reference taken at lookup (i.e using batadv_tt_global_hash_find) there is no way its memory will be reclaimed at any time causing the following memory leak: unreferenced object 0xffff0000073c8000 (size 18560): comm "softirq", pid 0, jiffies 4294907738 (age 228.644s) hex dump (first 32 bytes): 06 31 ac 12 c7 7a 05 00 01 00 00 00 00 00 00 00 .1...z.......... 2c ad be 08 00 80 ff ff 6c b6 be 08 00 80 ff ff ,.......l....... backtrace: [<00000000ee6e0ffa>] kmem_cache_alloc+0x1b4/0x300 [<000000000ff2fdbc>] batadv_tt_global_add+0x700/0xe20 [<00000000443897c7>] _batadv_tt_update_changes+0x21c/0x790 [<000000005dd90463>] batadv_tt_update_changes+0x3c/0x110 [<00000000a2d7fc57>] batadv_tt_tvlv_unicast_handler_v1+0xafc/0xe10 [<0000000011793f2a>] batadv_tvlv_containers_process+0x168/0x2b0 [<00000000b7cbe2ef>] batadv_recv_unicast_tvlv+0xec/0x1f4 [<0000000042aef1d8>] batadv_batman_skb_recv+0x25c/0x3a0 [<00000000bbd8b0a2>] __netif_receive_skb_core.isra.0+0x7a8/0xe90 [<000000004033d428>] __netif_receive_skb_one_core+0x64/0x74 [<000000000f39a009>] __netif_receive_skb+0x48/0xe0 [<00000000f2cd8888>] process_backlog+0x174/0x344 [<00000000507d6564>] __napi_poll+0x58/0x1f4 [<00000000b64ef9eb>] net_rx_action+0x504/0x590 [<00000000056fa5e4>] _stext+0x1b8/0x418 [<00000000878879d6>] run_ksoftirqd+0x74/0xa4 unreferenced object 0xffff00000bae1a80 (size 56): comm "softirq", pid 0, jiffies 4294910888 (age 216.092s) hex dump (first 32 bytes): 00 78 b1 0b 00 00 ff ff 0d 50 00 00 00 00 00 00 .x.......P...... 00 00 00 00 00 00 00 00 50 c8 3c 07 00 00 ff ff ........P.<..... backtrace: [<00000000ee6e0ffa>] kmem_cache_alloc+0x1b4/0x300 [<00000000d9aaa49e>] batadv_tt_global_add+0x53c/0xe20 [<00000000443897c7>] _batadv_tt_update_changes+0x21c/0x790 [<000000005dd90463>] batadv_tt_update_changes+0x3c/0x110 [<00000000a2d7fc57>] batadv_tt_tvlv_unicast_handler_v1+0xafc/0xe10 [<0000000011793f2a>] batadv_tvlv_containers_process+0x168/0x2b0 [<00000000b7cbe2ef>] batadv_recv_unicast_tvlv+0xec/0x1f4 [<0000000042aef1d8>] batadv_batman_skb_recv+0x25c/0x3a0 [<00000000bbd8b0a2>] __netif_receive_skb_core.isra.0+0x7a8/0xe90 [<000000004033d428>] __netif_receive_skb_one_core+0x64/0x74 [<000000000f39a009>] __netif_receive_skb+0x48/0xe0 [<00000000f2cd8888>] process_backlog+0x174/0x344 [<00000000507d6564>] __napi_poll+0x58/0x1f4 [<00000000b64ef9eb>] net_rx_action+0x504/0x590 [<00000000056fa5e4>] _stext+0x1b8/0x418 [<00000000878879d6>] run_ksoftirqd+0x74/0xa4 Releasing the extra reference from batadv_tt_global_hash_find even at roam back when batadv_tt_global_free is called fixes this memory leak. Cc: stable@vger.kernel.org Fixes: 068ee6e204e1 ("batman-adv: roaming handling mechanism redesign") Signed-off-by: Remi Pommarel Signed-off-by; Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit fc9b87d8b741a955d462fc14da4fcd788d00748a Author: Remi Pommarel Date: Fri Jul 28 15:38:50 2023 +0200 batman-adv: Do not get eth header before batadv_check_management_packet commit eac27a41ab641de074655d2932fc7f8cdb446881 upstream. If received skb in batadv_v_elp_packet_recv or batadv_v_ogm_packet_recv is either cloned or non linearized then its data buffer will be reallocated by batadv_check_management_packet when skb_cow or skb_linearize get called. Thus geting ethernet header address inside skb data buffer before batadv_check_management_packet had any chance to reallocate it could lead to the following kernel panic: Unable to handle kernel paging request at virtual address ffffff8020ab069a Mem abort info: ESR = 0x96000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000040f45000 [ffffff8020ab069a] pgd=180000007fffa003, p4d=180000007fffa003, pud=180000007fffa003, pmd=180000007fefe003, pte=0068000020ab0706 Internal error: Oops: 96000007 [#1] SMP Modules linked in: ahci_mvebu libahci_platform libahci dvb_usb_af9035 dvb_usb_dib0700 dib0070 dib7000m dibx000_common ath11k_pci ath10k_pci ath10k_core mwl8k_new nf_nat_sip nf_conntrack_sip xhci_plat_hcd xhci_hcd nf_nat_pptp nf_conntrack_pptp at24 sbsa_gwdt CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.15.42-00066-g3242268d425c-dirty #550 Hardware name: A8k (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : batadv_is_my_mac+0x60/0xc0 lr : batadv_v_ogm_packet_recv+0x98/0x5d0 sp : ffffff8000183820 x29: ffffff8000183820 x28: 0000000000000001 x27: ffffff8014f9af00 x26: 0000000000000000 x25: 0000000000000543 x24: 0000000000000003 x23: ffffff8020ab0580 x22: 0000000000000110 x21: ffffff80168ae880 x20: 0000000000000000 x19: ffffff800b561000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 00dc098924ae0032 x14: 0f0405433e0054b0 x13: ffffffff00000080 x12: 0000004000000001 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : ffffffc076dae000 x6 : ffffff8000183700 x5 : ffffffc00955e698 x4 : ffffff80168ae000 x3 : ffffff80059cf000 x2 : ffffff800b561000 x1 : ffffff8020ab0696 x0 : ffffff80168ae880 Call trace: batadv_is_my_mac+0x60/0xc0 batadv_v_ogm_packet_recv+0x98/0x5d0 batadv_batman_skb_recv+0x1b8/0x244 __netif_receive_skb_core.isra.0+0x440/0xc74 __netif_receive_skb_one_core+0x14/0x20 netif_receive_skb+0x68/0x140 br_pass_frame_up+0x70/0x80 br_handle_frame_finish+0x108/0x284 br_handle_frame+0x190/0x250 __netif_receive_skb_core.isra.0+0x240/0xc74 __netif_receive_skb_list_core+0x6c/0x90 netif_receive_skb_list_internal+0x1f4/0x310 napi_complete_done+0x64/0x1d0 gro_cell_poll+0x7c/0xa0 __napi_poll+0x34/0x174 net_rx_action+0xf8/0x2a0 _stext+0x12c/0x2ac run_ksoftirqd+0x4c/0x7c smpboot_thread_fn+0x120/0x210 kthread+0x140/0x150 ret_from_fork+0x10/0x20 Code: f9403844 eb03009f 54fffee1 f94 Thus ethernet header address should only be fetched after batadv_check_management_packet has been called. Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure") Cc: stable@vger.kernel.org Signed-off-by: Remi Pommarel Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit ed1eb19806ae645c20890b70b091c751dfd08f53 Author: Sven Eckelmann Date: Wed Jul 19 10:01:15 2023 +0200 batman-adv: Don't increase MTU when set by user commit d8e42a2b0addf238be8b3b37dcd9795a5c1be459 upstream. If the user set an MTU value, it usually means that there are special requirements for the MTU. But if an interface gots activated, the MTU was always recalculated and then the user set value was overwritten. The only reason why this user set value has to be overwritten, is when the MTU has to be decreased because batman-adv is not able to transfer packets with the user specified size. Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Cc: stable@vger.kernel.org Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit efef746c5a387a5a9553d405735884f7ce7852b6 Author: Sven Eckelmann Date: Wed Jul 19 09:29:29 2023 +0200 batman-adv: Trigger events for auto adjusted MTU commit c6a953cce8d0438391e6da48c8d0793d3fbfcfa6 upstream. If an interface changes the MTU, it is expected that an NETDEV_PRECHANGEMTU and NETDEV_CHANGEMTU notification events is triggered. This worked fine for .ndo_change_mtu based changes because core networking code took care of it. But for auto-adjustments after hard-interfaces changes, these events were simply missing. Due to this problem, non-batman-adv components weren't aware of MTU changes and thus couldn't perform their own tasks correctly. Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Cc: stable@vger.kernel.org Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit d6b64d710e9bcaa22fa99fdc1cb32e19a0eb5b38 Author: Christian Göttsche Date: Fri Aug 18 17:33:58 2023 +0200 selinux: set next pointer before attaching to list commit 70d91dc9b2ac91327d0eefd86163abc3548effa6 upstream. Set the next pointer in filename_trans_read_helper() before attaching the new node under construction to the list, otherwise garbage would be dereferenced on subsequent failure during cleanup in the out goto label. Cc: Fixes: 430059024389 ("selinux: implement new format of filename transitions") Signed-off-by: Christian Göttsche Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 36c5aecc789d4f881d18e6a8f4539636e11ab85e Author: Benjamin Coddington Date: Fri Aug 4 10:52:20 2023 -0400 nfsd: Fix race to FREE_STATEID and cl_revoked commit 3b816601e279756e781e6c4d9b3f3bd21a72ac67 upstream. We have some reports of linux NFS clients that cannot satisfy a linux knfsd server that always sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED even though those clients repeatedly walk all their known state using TEST_STATEID and receive NFS4_OK for all. Its possible for revoke_delegation() to set NFS4_REVOKED_DELEG_STID, then nfsd4_free_stateid() finds the delegation and returns NFS4_OK to FREE_STATEID. Afterward, revoke_delegation() moves the same delegation to cl_revoked. This would produce the observed client/server effect. Fix this by ensuring that the setting of sc_type to NFS4_REVOKED_DELEG_STID and move to cl_revoked happens within the same cl_lock. This will allow nfsd4_free_stateid() to properly remove the delegation from cl_revoked. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2217103 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2176575 Signed-off-by: Benjamin Coddington Cc: stable@vger.kernel.org # v4.17+ Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Greg Kroah-Hartman commit 96fb46ef8281c749abe114ed9385cec39bae00e4 Author: Trond Myklebust Date: Tue Aug 8 21:17:11 2023 -0400 NFS: Fix a use after free in nfs_direct_join_group() commit be2fd1560eb57b7298aa3c258ddcca0d53ecdea3 upstream. Be more careful when tearing down the subrequests of an O_DIRECT write as part of a retransmission. Reported-by: Chris Mason Fixes: ed5d588fe47f ("NFS: Try to join page groups before an O_DIRECT retransmission") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit bdc544a87d43a4102aa08114fc939dbddf175f98 Author: Miaohe Lin Date: Tue Jun 27 19:28:08 2023 +0800 mm: memory-failure: fix unexpected return value in soft_offline_page() commit e2c1ab070fdc81010ec44634838d24fce9ff9e53 upstream. When page_handle_poison() fails to handle the hugepage or free page in retry path, soft_offline_page() will return 0 while -EBUSY is expected in this case. Consequently the user will think soft_offline_page succeeds while it in fact failed. So the user will not try again later in this case. Link: https://lkml.kernel.org/r/20230627112808.1275241-1-linmiaohe@huawei.com Fixes: b94e02822deb ("mm,hwpoison: try to narrow window race for free pages") Signed-off-by: Miaohe Lin Acked-by: Naoya Horiguchi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 07fad410aa6e90131cd1ba8d12bd1f9488f85af5 Author: Alexandre Ghiti Date: Wed Aug 9 18:46:33 2023 +0200 mm: add a call to flush_cache_vmap() in vmap_pfn() commit a50420c79731fc5cf27ad43719c1091e842a2606 upstream. flush_cache_vmap() must be called after new vmalloc mappings are installed in the page table in order to allow architectures to make sure the new mapping is visible. It could lead to a panic since on some architectures (like powerpc), the page table walker could see the wrong pte value and trigger a spurious page fault that can not be resolved (see commit f1cb8f9beba8 ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags")). But actually the patch is aiming at riscv: the riscv specification allows the caching of invalid entries in the TLB, and since we recently removed the vmalloc page fault handling, we now need to emit a tlb shootdown whenever a new vmalloc mapping is emitted (https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/). That's a temporary solution, there are ways to avoid that :) Link: https://lkml.kernel.org/r/20230809164633.1556126-1-alexghiti@rivosinc.com Fixes: 3e9a9e256b1e ("mm: add a vmap_pfn function") Reported-by: Dylan Jhong Closes: https://lore.kernel.org/linux-riscv/ZMytNY2J8iyjbPPy@atctrx.andestech.com/ Signed-off-by: Alexandre Ghiti Reviewed-by: Christoph Hellwig Reviewed-by: Palmer Dabbelt Acked-by: Palmer Dabbelt Reviewed-by: Dylan Jhong Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit a8a60bc8027e099b8dd9e13b49b393db2d940004 Author: David Hildenbrand Date: Sat Aug 5 12:12:56 2023 +0200 mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast commit 5805192c7b7257d290474cb1a3897d0567281bbc upstream. In contrast to most other GUP code, GUP-fast common page table walking code like gup_pte_range() also handles hugetlb pages. But in contrast to other hugetlb page table walking code, it does not look at the hugetlb PTE abstraction whereby we have only a single logical hugetlb PTE per hugetlb page, even when using multiple cont-PTEs underneath -- which is for example what huge_ptep_get() abstracts. So when we have a hugetlb page that is mapped via cont-PTEs, GUP-fast might stumble over a PTE that does not map the head page of a hugetlb page -- not the first "head" PTE of such a cont mapping. Logically, the whole hugetlb page is mapped (entire_mapcount == 1), but we might end up calling gup_must_unshare() with a tail page of a hugetlb page. We only maintain a single PageAnonExclusive flag per hugetlb page (as hugetlb pages cannot get partially COW-shared), stored for the head page. That flag is clear for all tail pages. So when gup_must_unshare() ends up calling PageAnonExclusive() with a tail page of a hugetlb page: 1) With CONFIG_DEBUG_VM_PGFLAGS Stumbles over the: VM_BUG_ON_PGFLAGS(PageHuge(page) && !PageHead(page), page); For example, when executing the COW selftests with 64k hugetlb pages on arm64: [ 61.082187] page:00000000829819ff refcount:3 mapcount:1 mapping:0000000000000000 index:0x1 pfn:0x11ee11 [ 61.082842] head:0000000080f79bf7 order:4 entire_mapcount:1 nr_pages_mapped:0 pincount:2 [ 61.083384] anon flags: 0x17ffff80003000e(referenced|uptodate|dirty|head|mappedtodisk|node=0|zone=2|lastcpupid=0xfffff) [ 61.084101] page_type: 0xffffffff() [ 61.084332] raw: 017ffff800000000 fffffc00037b8401 0000000000000402 0000000200000000 [ 61.084840] raw: 0000000000000010 0000000000000000 00000000ffffffff 0000000000000000 [ 61.085359] head: 017ffff80003000e ffffd9e95b09b788 ffffd9e95b09b788 ffff0007ff63cf71 [ 61.085885] head: 0000000000000000 0000000000000002 00000003ffffffff 0000000000000000 [ 61.086415] page dumped because: VM_BUG_ON_PAGE(PageHuge(page) && !PageHead(page)) [ 61.086914] ------------[ cut here ]------------ [ 61.087220] kernel BUG at include/linux/page-flags.h:990! [ 61.087591] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 61.087999] Modules linked in: ... [ 61.089404] CPU: 0 PID: 4612 Comm: cow Kdump: loaded Not tainted 6.5.0-rc4+ #3 [ 61.089917] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 61.090409] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 61.090897] pc : gup_must_unshare.part.0+0x64/0x98 [ 61.091242] lr : gup_must_unshare.part.0+0x64/0x98 [ 61.091592] sp : ffff8000825eb940 [ 61.091826] x29: ffff8000825eb940 x28: 0000000000000000 x27: fffffc00037b8440 [ 61.092329] x26: 0400000000000001 x25: 0000000000080101 x24: 0000000000080000 [ 61.092835] x23: 0000000000080100 x22: ffff0000cffb9588 x21: ffff0000c8ec6b58 [ 61.093341] x20: 0000ffffad6b1000 x19: fffffc00037b8440 x18: ffffffffffffffff [ 61.093850] x17: 2864616548656761 x16: 5021202626202965 x15: 6761702865677548 [ 61.094358] x14: 6567615028454741 x13: 2929656761702864 x12: 6165486567615021 [ 61.094858] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffd9e958b7a1c0 [ 61.095359] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 00000000002bffa8 [ 61.095873] x5 : ffff0008bb19e708 x4 : 0000000000000000 x3 : 0000000000000000 [ 61.096380] x2 : 0000000000000000 x1 : ffff0000cf6636c0 x0 : 0000000000000046 [ 61.096894] Call trace: [ 61.097080] gup_must_unshare.part.0+0x64/0x98 [ 61.097392] gup_pte_range+0x3a8/0x3f0 [ 61.097662] gup_pgd_range+0x1ec/0x280 [ 61.097942] lockless_pages_from_mm+0x64/0x1a0 [ 61.098258] internal_get_user_pages_fast+0xe4/0x1d0 [ 61.098612] pin_user_pages_fast+0x58/0x78 [ 61.098917] pin_longterm_test_start+0xf4/0x2b8 [ 61.099243] gup_test_ioctl+0x170/0x3b0 [ 61.099528] __arm64_sys_ioctl+0xa8/0xf0 [ 61.099822] invoke_syscall.constprop.0+0x7c/0xd0 [ 61.100160] el0_svc_common.constprop.0+0xe8/0x100 [ 61.100500] do_el0_svc+0x38/0xa0 [ 61.100736] el0_svc+0x3c/0x198 [ 61.100971] el0t_64_sync_handler+0x134/0x150 [ 61.101280] el0t_64_sync+0x17c/0x180 [ 61.101543] Code: aa1303e0 f00074c1 912b0021 97fffeb2 (d4210000) 2) Without CONFIG_DEBUG_VM_PGFLAGS Always detects "not exclusive" for passed tail pages and refuses to PIN the tail pages R/O, as gup_must_unshare() == true. GUP-fast will fallback to ordinary GUP. As ordinary GUP properly considers the logical hugetlb PTE abstraction in hugetlb_follow_page_mask(), pinning the page will succeed when looking at the PageAnonExclusive on the head page only. So the only real effect of this is that with cont-PTE hugetlb pages, we'll always fallback from GUP-fast to ordinary GUP when not working on the head page, which ends up checking the head page and do the right thing. Consequently, the cow selftests pass with cont-PTE hugetlb pages as well without CONFIG_DEBUG_VM_PGFLAGS. Note that this only applies to anon hugetlb pages that are mapped using cont-PTEs: for example 64k hugetlb pages on a 4k arm64 kernel. ... and only when R/O-pinning (FOLL_PIN) such pages that are mapped into the page table R/O using GUP-fast. On production kernels (and even most debug kernels, that don't set CONFIG_DEBUG_VM_PGFLAGS) this patch should theoretically not be required to be backported. But of course, it does not hurt. Link: https://lkml.kernel.org/r/20230805101256.87306-1-david@redhat.com Fixes: a7f226604170 ("mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page") Signed-off-by: David Hildenbrand Reported-by: Ryan Roberts Reviewed-by: Ryan Roberts Tested-by: Ryan Roberts Cc: Vlastimil Babka Cc: John Hubbard Cc: Jason Gunthorpe Cc: Peter Xu Cc: Mike Kravetz Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit d4e11b85a2690c31b3d45b1dfa7e61d637a8f2fa Author: Takashi Iwai Date: Wed Aug 23 18:16:25 2023 +0200 ALSA: ymfpci: Fix the missing snd_card_free() call at probe error commit 1d0eb6143c1e85d3f9a3f5a616ee7e5dc351d33b upstream. Like a few other drivers, YMFPCI driver needs to clean up with snd_card_free() call at an error path of the probe; otherwise the other devres resources are released before the card and it results in the UAF. This patch uses the helper for handling the probe error gracefully. Fixes: f33fc1576757 ("ALSA: ymfpci: Create card with device-managed snd_devm_card_new()") Cc: Reported-and-tested-by: Takashi Yano Closes: https://lore.kernel.org/r/20230823135846.1812-1-takashi.yano@nifty.ne.jp Link: https://lore.kernel.org/r/20230823161625.5807-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d13f3a63d236d311fceece5f34133636fd334bde Author: Hugh Dickins Date: Tue Aug 22 22:14:47 2023 -0700 shmem: fix smaps BUG sleeping while atomic commit e5548f85b4527c4c803b7eae7887c10bf8f90c97 upstream. smaps_pte_hole_lookup() is calling shmem_partial_swap_usage() with page table lock held: but shmem_partial_swap_usage() does cond_resched_rcu() if need_resched(): "BUG: sleeping function called from invalid context". Since shmem_partial_swap_usage() is designed to count across a range, but smaps_pte_hole_lookup() only calls it for a single page slot, just break out of the loop on the last or only page, before checking need_resched(). Link: https://lkml.kernel.org/r/6fe3b3ec-abdf-332f-5c23-6a3b3a3b11a9@google.com Fixes: 230100321518 ("mm/smaps: simplify shmem handling of pte holes") Signed-off-by: Hugh Dickins Acked-by: Peter Xu Cc: [5.16+] Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 091591f6e7c35fc2f0a74c9d044b0638a37b0f3f Author: Rik van Riel Date: Thu Aug 17 13:57:59 2023 -0400 mm,ima,kexec,of: use memblock_free_late from ima_free_kexec_buffer commit f0362a253606e2031f8d61c74195d4d6556e12a4 upstream. The code calling ima_free_kexec_buffer runs long after the memblock allocator has already been torn down, potentially resulting in a use after free in memblock_isolate_range. With KASAN or KFENCE, this use after free will result in a BUG from the idle task, and a subsequent kernel panic. Switch ima_free_kexec_buffer over to memblock_free_late to avoid that issue. Fixes: fee3ff99bc67 ("powerpc: Move arch independent ima kexec functions to drivers/of/kexec.c") Cc: stable@kernel.org Signed-off-by: Rik van Riel Suggested-by: Mike Rappoport Link: https://lore.kernel.org/r/20230817135759.0888e5ef@imladris.surriel.com Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit a7d172252bfad47d06a5dcf6fbb6d4ea85f4cedd Author: Andrey Skvortsov Date: Sat Aug 5 11:48:47 2023 +0300 clk: Fix slab-out-of-bounds error in devm_clk_release() commit 66fbfb35da47f391bdadf9fa7ceb88af4faa9022 upstream. Problem can be reproduced by unloading snd_soc_simple_card, because in devm_get_clk_from_child() devres data is allocated as `struct clk`, but devm_clk_release() expects devres data to be `struct devm_clk_state`. KASAN report: ================================================================== BUG: KASAN: slab-out-of-bounds in devm_clk_release+0x20/0x54 Read of size 8 at addr ffffff800ee09688 by task (udev-worker)/287 Call trace: dump_backtrace+0xe8/0x11c show_stack+0x1c/0x30 dump_stack_lvl+0x60/0x78 print_report+0x150/0x450 kasan_report+0xa8/0xf0 __asan_load8+0x78/0xa0 devm_clk_release+0x20/0x54 release_nodes+0x84/0x120 devres_release_all+0x144/0x210 device_unbind_cleanup+0x1c/0xac really_probe+0x2f0/0x5b0 __driver_probe_device+0xc0/0x1f0 driver_probe_device+0x68/0x120 __driver_attach+0x140/0x294 bus_for_each_dev+0xec/0x160 driver_attach+0x38/0x44 bus_add_driver+0x24c/0x300 driver_register+0xf0/0x210 __platform_driver_register+0x48/0x54 asoc_simple_card_init+0x24/0x1000 [snd_soc_simple_card] do_one_initcall+0xac/0x340 do_init_module+0xd0/0x300 load_module+0x2ba4/0x3100 __do_sys_init_module+0x2c8/0x300 __arm64_sys_init_module+0x48/0x5c invoke_syscall+0x64/0x190 el0_svc_common.constprop.0+0x124/0x154 do_el0_svc+0x44/0xdc el0_svc+0x14/0x50 el0t_64_sync_handler+0xec/0x11c el0t_64_sync+0x14c/0x150 Allocated by task 287: kasan_save_stack+0x38/0x60 kasan_set_track+0x28/0x40 kasan_save_alloc_info+0x20/0x30 __kasan_kmalloc+0xac/0xb0 __kmalloc_node_track_caller+0x6c/0x1c4 __devres_alloc_node+0x44/0xb4 devm_get_clk_from_child+0x44/0xa0 asoc_simple_parse_clk+0x1b8/0x1dc [snd_soc_simple_card_utils] simple_parse_node.isra.0+0x1ec/0x230 [snd_soc_simple_card] simple_dai_link_of+0x1bc/0x334 [snd_soc_simple_card] __simple_for_each_link+0x2ec/0x320 [snd_soc_simple_card] asoc_simple_probe+0x468/0x4dc [snd_soc_simple_card] platform_probe+0x90/0xf0 really_probe+0x118/0x5b0 __driver_probe_device+0xc0/0x1f0 driver_probe_device+0x68/0x120 __driver_attach+0x140/0x294 bus_for_each_dev+0xec/0x160 driver_attach+0x38/0x44 bus_add_driver+0x24c/0x300 driver_register+0xf0/0x210 __platform_driver_register+0x48/0x54 asoc_simple_card_init+0x24/0x1000 [snd_soc_simple_card] do_one_initcall+0xac/0x340 do_init_module+0xd0/0x300 load_module+0x2ba4/0x3100 __do_sys_init_module+0x2c8/0x300 __arm64_sys_init_module+0x48/0x5c invoke_syscall+0x64/0x190 el0_svc_common.constprop.0+0x124/0x154 do_el0_svc+0x44/0xdc el0_svc+0x14/0x50 el0t_64_sync_handler+0xec/0x11c el0t_64_sync+0x14c/0x150 The buggy address belongs to the object at ffffff800ee09600 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 136 bytes inside of 256-byte region [ffffff800ee09600, ffffff800ee09700) The buggy address belongs to the physical page: page:000000002d97303b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4ee08 head:000000002d97303b order:1 compound_mapcount:0 compound_pincount:0 flags: 0x10200(slab|head|zone=0) raw: 0000000000010200 0000000000000000 dead000000000122 ffffff8002c02480 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffff800ee09580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff800ee09600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffff800ee09680: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffffff800ee09700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff800ee09780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Fixes: abae8e57e49a ("clk: generalize devm_clk_get() a bit") Signed-off-by: Andrey Skvortsov Link: https://lore.kernel.org/r/20230805084847.3110586-1-andrej.skvortzov@gmail.com Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 14904f4d8bf8c7e12794739ee4f5ff3a346d7bb2 Author: Benjamin Coddington Date: Fri Jun 30 09:18:13 2023 -0400 NFSv4: Fix dropped lock for racing OPEN and delegation return commit 1cbc11aaa01f80577b67ae02c73ee781112125fd upstream. Commmit f5ea16137a3f ("NFSv4: Retry LOCK on OLD_STATEID during delegation return") attempted to solve this problem by using nfs4's generic async error handling, but introduced a regression where v4.0 lock recovery would hang. The additional complexity introduced by overloading that error handling is not necessary for this case. This patch expects that commit to be reverted. The problem as originally explained in the above commit is: There's a small window where a LOCK sent during a delegation return can race with another OPEN on client, but the open stateid has not yet been updated. In this case, the client doesn't handle the OLD_STATEID error from the server and will lose this lock, emitting: "NFS: nfs4_handle_delegation_recall_error: unhandled error -10024". Fix this by using the old_stateid refresh helpers if the server replies with OLD_STATEID. Suggested-by: Trond Myklebust Signed-off-by: Benjamin Coddington Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit ac467d7405fe69b07b5a3fe233fd62fad7b2faec Author: André Apitzsch Date: Sat Aug 19 09:12:15 2023 +0200 platform/x86: ideapad-laptop: Add support for new hotkeys found on ThinkBook 14s Yoga ITL commit a260f7d726fde52c0278bd3fa085a758639bcee2 upstream. The Lenovo Thinkbook 14s Yoga ITL has 4 new symbols/shortcuts on their F9-F11 and PrtSc keys: F9: Has a symbol of a head with a headset, the manual says "Service key" F10: Has a symbol of a telephone horn which has been picked up from the receiver, the manual says: "Answer incoming calls" F11: Has a symbol of a telephone horn which is resting on the receiver, the manual says: "Reject incoming calls" PrtSc: Has a symbol of a siccor and a dashed ellipse, the manual says: "Open the Windows 'Snipping' Tool app" This commit adds support for these 4 new hkey events. Signed-off-by: André Apitzsch Link: https://lore.kernel.org/r/20230819-lenovo_keys-v1-1-9d34eac88e0a@apitzsch.eu Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Greg Kroah-Hartman commit e6a60eccd0c8c015a2493ceea832d2c0eaf83dba Author: Ping-Ke Shih Date: Fri Aug 18 09:40:04 2023 +0800 wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning commit b98c16107cc1647242abbd11f234c05a3a5864f6 upstream. The commit 06470f7468c8 ("mac80211: add API to allow filtering frames in BA sessions") added reorder_buf_filtered to mark frames filtered by firmware, and it can only work correctly if hw.max_rx_aggregation_subframes <= 64 since it stores the bitmap in a u64 variable. However, new HE or EHT devices can support BlockAck number up to 256 or 1024, and then using a higher subframe index leads UBSAN warning: UBSAN: shift-out-of-bounds in net/mac80211/rx.c:1129:39 shift exponent 215 is too large for 64-bit type 'long long unsigned int' Call Trace: dump_stack_lvl+0x48/0x70 dump_stack+0x10/0x20 __ubsan_handle_shift_out_of_bounds+0x1ac/0x360 ieee80211_release_reorder_frame.constprop.0.cold+0x64/0x69 [mac80211] ieee80211_sta_reorder_release+0x9c/0x400 [mac80211] ieee80211_prepare_and_rx_handle+0x1234/0x1420 [mac80211] ieee80211_rx_list+0xaef/0xf60 [mac80211] ieee80211_rx_napi+0x53/0xd0 [mac80211] Since only old hardware that supports <=64 BlockAck uses ieee80211_mark_rx_ba_filtered_frames(), limit the use as it is, so add a WARN_ONCE() and comment to note to avoid using this function if hardware capability is not suitable. Signed-off-by: Ping-Ke Shih Link: https://lore.kernel.org/r/20230818014004.16177-1-pkshih@realtek.com [edit commit message] Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit b8b7243aafecc1046d90b6d636355e2a1fe99835 Author: Michael Ellerman Date: Wed Aug 23 14:51:39 2023 +1000 ibmveth: Use dcbf rather than dcbfl commit bfedba3b2c7793ce127680bc8f70711e05ec7a17 upstream. When building for power4, newer binutils don't recognise the "dcbfl" extended mnemonic. dcbfl RA, RB is equivalent to dcbf RA, RB, 1. Switch to "dcbf" to avoid the build error. Signed-off-by: Michael Ellerman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 85607ef399d9763c9a5d122eb6bce91d27d85cb9 Author: Charles Keepax Date: Wed Aug 23 09:53:08 2023 +0100 ASoC: cs35l41: Correct amp_gain_tlv values commit 1613781d7e8a93618ff3a6b37f81f06769b53717 upstream. The current analog gain TLV seems to have completely incorrect values in it. The gain starts at 0.5dB, proceeds in 1dB steps, and has no mute value, correct the control to match. Signed-off-by: Charles Keepax Link: https://lore.kernel.org/r/20230823085308.753572-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 014fec5540108047a92ce9527e0ec285b0d59691 Author: BrenoRCBrito Date: Fri Aug 18 18:14:16 2023 -0300 ASoC: amd: yc: Add VivoBook Pro 15 to quirks list for acp6x commit 3b1f08833c45d0167741e4097b0150e7cf086102 upstream. VivoBook Pro 15 Ryzen Edition uses Ryzen 6800H processor, and adding to quirks list for acp6x will enable internal mic. Signed-off-by: BrenoRCBrito Link: https://lore.kernel.org/r/20230818211417.32167-1-brenorcbrito@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 22a406b3629a10979916ea7cace47858410117b5 Author: Jens Axboe Date: Tue Aug 22 18:00:02 2023 -0600 io_uring/msg_ring: fix missing lock on overflow for IOPOLL Commit e12d7a46f65ae4b7d58a5e0c1cbfa825cf8d830d upstream. If the target ring is configured with IOPOLL, then we always need to hold the target ring uring_lock before posting CQEs. We could just grab it unconditionally, but since we don't expect many target rings to be of this type, make grabbing the uring_lock conditional on the ring type. Link: https://lore.kernel.org/io-uring/Y8krlYa52%2F0YGqkg@ip-172-31-85-199.ec2.internal/ Reported-by: Xingyuan Mo Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 816c7cecf6a0cf04b5b543690e38a1b15bdf8e88 Author: Jens Axboe Date: Thu Jan 19 09:01:27 2023 -0700 io_uring/msg_ring: move double lock/unlock helpers higher up Commit 423d5081d0451faa59a707e57373801da5b40141 upstream. In preparation for needing them somewhere else, move them and get rid of the unused 'issue_flags' for the unlock side. No functional changes in this patch. Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 4f59375285188baa5a22100af24f0fb3e2bc0e3d Author: Pavel Begunkov Date: Wed Dec 7 03:53:35 2022 +0000 io_uring: extract a io_msg_install_complete helper Commit 172113101641cf1f9628c528ec790cb809f2b704 upstream. Extract a helper called io_msg_install_complete() from io_msg_send_fd(), will be used later. Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/1500ca1054cc4286a3ee1c60aacead57fcdfa02a.1670384893.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 0d617fb6d5132dc1ffd12ec0c90af71fd89c63a0 Author: Pavel Begunkov Date: Wed Dec 7 03:53:34 2022 +0000 io_uring: get rid of double locking Commit 11373026f2960390d5e330df4e92735c4265c440 upstream. We don't need to take both uring_locks at once, msg_ring can be split in two parts, first getting a file from the filetable of the first ring and then installing it into the second one. Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/a80ecc2bc99c3b3f2cf20015d618b7c51419a797.1670384893.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 82d811ff566594de3676f35808e8a9e19c5c864c Author: Sean Christopherson Date: Wed Aug 23 18:01:04 2023 -0700 KVM: x86/mmu: Fix an sign-extension bug with mmu_seq that hangs vCPUs Upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot. Take the vCPU's mmu_seq snapshot as an "unsigned long" instead of an "int" when checking to see if a page fault is stale, as the sequence count is stored as an "unsigned long" everywhere else in KVM. This fixes a bug where KVM will effectively hang vCPUs due to always thinking page faults are stale, which results in KVM refusing to "fix" faults. mmu_invalidate_seq (née mmu_notifier_seq) is a sequence counter used when KVM is handling page faults to detect if userspace mappings relevant to the guest were invalidated between snapshotting the counter and acquiring mmu_lock, i.e. to ensure that the userspace mapping KVM is using to resolve the page fault is fresh. If KVM sees that the counter has changed, KVM simply resumes the guest without fixing the fault. What _should_ happen is that the source of the mmu_notifier invalidations eventually goes away, mmu_invalidate_seq becomes stable, and KVM can once again fix guest page fault(s). But for a long-lived VM and/or a VM that the host just doesn't particularly like, it's possible for a VM to be on the receiving end of 2 billion (with a B) mmu_notifier invalidations. When that happens, bit 31 will be set in mmu_invalidate_seq. This causes the value to be turned into a 32-bit negative value when implicitly cast to an "int" by is_page_fault_stale(), and then sign-extended into a 64-bit unsigned when the signed "int" is implicitly cast back to an "unsigned long" on the call to mmu_invalidate_retry_hva(). As a result of the casting and sign-extension, given a sequence counter of e.g. 0x8002dc25, mmu_invalidate_retry_hva() ends up doing if (0x8002dc25 != 0xffffffff8002dc25) and signals that the page fault is stale and needs to be retried even though the sequence counter is stable, and KVM effectively hangs any vCPU that takes a page fault (EPT violation or #NPF when TDP is enabled). Reported-by: Brian Rak Reported-by: Amaan Cheval Reported-by: Eric Wheeler Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers.com Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson Signed-off-by: Greg Kroah-Hartman commit 2800385fda534f1bfed8389d634c393a07570bba Author: Sean Christopherson Date: Wed Apr 26 15:03:23 2023 -0700 KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream. Preserve TDP MMU roots until they are explicitly invalidated by gifting the TDP MMU itself a reference to a root when it is allocated. Keeping a reference in the TDP MMU fixes a flaw where the TDP MMU exhibits terrible performance, and can potentially even soft-hang a vCPU, if a vCPU frequently unloads its roots, e.g. when KVM is emulating SMI+RSM. When KVM emulates something that invalidates _all_ TLB entries, e.g. SMI and RSM, KVM unloads all of the vCPUs roots (KVM keeps a small per-vCPU cache of previous roots). Unloading roots is a simple way to ensure KVM flushes and synchronizes all roots for the vCPU, as KVM flushes and syncs when allocating a "new" root (from the vCPU's perspective). In the shadow MMU, KVM keeps track of all shadow pages, roots included, in a per-VM hash table. Unloading a shadow MMU root just wipes it from the per-vCPU cache; the root is still tracked in the per-VM hash table. When KVM loads a "new" root for the vCPU, KVM will find the old, unloaded root in the per-VM hash table. Unlike the shadow MMU, the TDP MMU doesn't track "inactive" roots in a per-VM structure, where "active" in this case means a root is either in-use or cached as a previous root by at least one vCPU. When a TDP MMU root becomes inactive, i.e. the last vCPU reference to the root is put, KVM immediately frees the root (asterisk on "immediately" as the actual freeing may be done by a worker, but for all intents and purposes the root is gone). The TDP MMU behavior is especially problematic for 1-vCPU setups, as unloading all roots effectively frees all roots. The issue is mitigated to some degree in multi-vCPU setups as a different vCPU usually holds a reference to an unloaded root and thus keeps the root alive, allowing the vCPU to reuse its old root after unloading (with a flush+sync). The TDP MMU flaw has been known for some time, as until very recently, KVM's handling of CR0.WP also triggered unloading of all roots. The CR0.WP toggling scenario was eventually addressed by not unloading roots when _only_ CR0.WP is toggled, but such an approach doesn't Just Work for emulating SMM as KVM must emulate a full TLB flush on entry and exit to/from SMM. Given that the shadow MMU plays nice with unloading roots at will, teaching the TDP MMU to do the same is far less complex than modifying KVM to track which roots need to be flushed before reuse. Note, preserving all possible TDP MMU roots is not a concern with respect to memory consumption. Now that the role for direct MMUs doesn't include information about the guest, e.g. CR0.PG, CR0.WP, CR4.SMEP, etc., there are _at most_ six possible roots (where "guest_mode" here means L2): 1. 4-level !SMM !guest_mode 2. 4-level SMM !guest_mode 3. 5-level !SMM !guest_mode 4. 5-level SMM !guest_mode 5. 4-level !SMM guest_mode 6. 5-level !SMM guest_mode And because each vCPU can track 4 valid roots, a VM can already have all 6 root combinations live at any given time. Not to mention that, in practice, no sane VMM will advertise different guest.MAXPHYADDR values across vCPUs, i.e. KVM won't ever use both 4-level and 5-level roots for a single VM. Furthermore, the vast majority of modern hypervisors will utilize EPT/NPT when available, thus the guest_mode=%true cases are also unlikely to be utilized. Reported-by: Jeremi Piotrowski Link: https://lore.kernel.org/all/959c5bce-beb5-b463-7158-33fc4a4f910c@linux.microsoft.com Link: https://lkml.kernel.org/r/20220209170020.1775368-1-pbonzini%40redhat.com Link: https://lore.kernel.org/all/20230322013731.102955-1-minipli@grsecurity.net Link: https://lore.kernel.org/all/000000000000a0bc2b05f9dd7fab@google.com Link: https://lore.kernel.org/all/000000000000eca0b905fa0f7756@google.com Cc: Ben Gardon Cc: David Matlack Cc: stable@vger.kernel.org Tested-by: Jeremi Piotrowski Link: https://lore.kernel.org/r/20230426220323.3079789-1-seanjc@google.com Signed-off-by: Sean Christopherson Signed-off-by: Greg Kroah-Hartman commit a0559fd0e14eb144b8151e3104785138786274a8 Author: Hangbin Liu Date: Wed Aug 23 15:19:04 2023 +0800 bonding: fix macvlan over alb bond support [ Upstream commit e74216b8def3803e98ae536de78733e9d7f3b109 ] The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") aims to enable the use of macvlans on top of rlb bond mode. However, the current rlb bond mode only handles ARP packets to update remote neighbor entries. This causes an issue when a macvlan is on top of the bond, and remote devices send packets to the macvlan using the bond's MAC address as the destination. After delivering the packets to the macvlan, the macvlan will rejects them as the MAC address is incorrect. Consequently, this commit makes macvlan over bond non-functional. To address this problem, one potential solution is to check for the presence of a macvlan port on the bond device using netif_is_macvlan_port(bond->dev) and return NULL in the rlb_arp_xmit() function. However, this approach doesn't fully resolve the situation when a VLAN exists between the bond and macvlan. So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit(). As the comment said, Don't modify or load balance ARPs that do not originate locally. Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") Reported-by: susan.zheng@veritas.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816 Signed-off-by: Hangbin Liu Acked-by: Jay Vosburgh Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit b15dea3de413b80c6e51acb26c0d09354080af65 Author: Ido Schimmel Date: Wed Aug 23 09:43:48 2023 +0300 rtnetlink: Reject negative ifindexes in RTM_NEWLINK [ Upstream commit 30188bd7838c16a98a520db1fe9df01ffc6ed368 ] Negative ifindexes are illegal, but the kernel does not validate the ifindex in the ancillary header of RTM_NEWLINK messages, resulting in the kernel generating a warning [1] when such an ifindex is specified. Fix by rejecting negative ifindexes. [1] WARNING: CPU: 0 PID: 5031 at net/core/dev.c:9593 dev_index_reserve+0x1a2/0x1c0 net/core/dev.c:9593 [...] Call Trace: register_netdevice+0x69a/0x1490 net/core/dev.c:10081 br_dev_newlink+0x27/0x110 net/bridge/br_netlink.c:1552 rtnl_newlink_create net/core/rtnetlink.c:3471 [inline] __rtnl_newlink+0x115e/0x18c0 net/core/rtnetlink.c:3688 rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3701 rtnetlink_rcv_msg+0x439/0xd30 net/core/rtnetlink.c:6427 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0x536/0x810 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:728 [inline] sock_sendmsg+0xd9/0x180 net/socket.c:751 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2538 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2592 __sys_sendmsg+0x117/0x1e0 net/socket.c:2621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 38f7b870d4a6 ("[RTNETLINK]: Link creation API") Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel Reviewed-by: Jiri Pirko Reviewed-by: Jakub Kicinski Link: https://lore.kernel.org/r/20230823064348.2252280-1-idosch@nvidia.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit ed3fe5f9020c90125dfb40c1ae808d915ede68d8 Author: Florian Westphal Date: Tue Aug 22 19:49:52 2023 +0200 netfilter: nf_tables: fix out of memory error handling [ Upstream commit 5e1be4cdc98c989d5387ce94ff15b5ad06a5b681 ] Several instances of pipapo_resize() don't propagate allocation failures, this causes a crash when fault injection is enabled for gfp_kernel slabs. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Florian Westphal Reviewed-by: Stefano Brivio Signed-off-by: Sasha Levin commit 41841b585e53babdfb0fa6fdfa54f6d7c28c1206 Author: Pablo Neira Ayuso Date: Fri Aug 18 01:13:31 2023 +0200 netfilter: nf_tables: flush pending destroy work before netlink notifier [ Upstream commit 2c9f0293280e258606e54ed2b96fa71498432eae ] Destroy work waits for the RCU grace period then it releases the objects with no mutex held. All releases objects follow this path for transactions, therefore, order is guaranteed and references to top-level objects in the hierarchy remain valid. However, netlink notifier might interfer with pending destroy work. rcu_barrier() is not correct because objects are not release via RCU callback. Flush destroy work before releasing objects from netlink notifier path. Fixes: d4bc8271db21 ("netfilter: nf_tables: netlink notifier might race to release objects") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Florian Westphal Signed-off-by: Sasha Levin commit 136861956ad64429afbe31cfa90234114f7eab2e Author: Andrii Staikov Date: Tue Aug 22 15:16:53 2023 -0700 i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters() [ Upstream commit 9525a3c38accd2e186f52443e35e633e296cc7f5 ] Add check for pf->vf not being NULL before dereferencing pf->vf[vsi->vf_id] in updating VSI filter sync. Add a similar check before dereferencing !pf->vf[vsi->vf_id].trusted in the condition for clearing promisc mode bit. Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning") Signed-off-by: Andrii Staikov Signed-off-by: Aleksandr Loktionov Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 581668893e3126176b08349b7e35a3fd6b578d6a Author: Jamal Hadi Salim Date: Tue Aug 22 06:12:31 2023 -0400 net/sched: fix a qdisc modification with ambiguous command request [ Upstream commit da71714e359b64bd7aab3bd56ec53f307f058133 ] When replacing an existing root qdisc, with one that is of the same kind, the request boils down to essentially a parameterization change i.e not one that requires allocation and grafting of a new qdisc. syzbot was able to create a scenario which resulted in a taprio qdisc replacing an existing taprio qdisc with a combination of NLM_F_CREATE, NLM_F_REPLACE and NLM_F_EXCL leading to create and graft scenario. The fix ensures that only when the qdisc kinds are different that we should allow a create and graft, otherwise it goes into the "change" codepath. While at it, fix the code and comments to improve readability. While syzbot was able to create the issue, it did not zone on the root cause. Analysis from Vladimir Oltean helped narrow it down. v1->V2 changes: - remove "inline" function definition (Vladmir) - remove extrenous braces in branches (Vladmir) - change inline function names (Pedro) - Run tdc tests (Victor) v2->v3 changes: - dont break else/if (Simon) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+a3618a167af2021433cd@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/20230816225759.g25x76kmgzya2gei@skbuf/T/ Tested-by: Vladimir Oltean Tested-by: Victor Nogueira Reviewed-by: Pedro Tammela Reviewed-by: Victor Nogueira Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f94f30e2abfa17f1b8173ce1ed6fdfd0861f3d55 Author: Sasha Neftin Date: Mon Aug 21 10:17:21 2023 -0700 igc: Fix the typo in the PTM Control macro [ Upstream commit de43975721b97283d5f17eea4228faddf08f2681 ] The IGC_PTM_CTRL_SHRT_CYC defines the time between two consecutive PTM requests. The bit resolution of this field is six bits. That bit five was missing in the mask. This patch comes to correct the typo in the IGC_PTM_CTRL_SHRT_CYC macro. Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()") Signed-off-by: Sasha Neftin Tested-by: Naama Meir Signed-off-by: Tony Nguyen Reviewed-by: Simon Horman Reviewed-by: Kalesh AP Link: https://lore.kernel.org/r/20230821171721.2203572-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9b7fd6beec371a9fceb6bca0068272a4a31ba825 Author: Alessio Igor Bogani Date: Mon Aug 21 10:19:27 2023 -0700 igb: Avoid starting unnecessary workqueues [ Upstream commit b888c510f7b3d64ca75fc0f43b4a4bd1a611312f ] If ptp_clock_register() fails or CONFIG_PTP isn't enabled, avoid starting PTP related workqueues. In this way we can fix this: BUG: unable to handle page fault for address: ffffc9000440b6f8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 100000067 P4D 100000067 PUD 1001e0067 PMD 107dc5067 PTE 0 Oops: 0000 [#1] PREEMPT SMP [...] Workqueue: events igb_ptp_overflow_check RIP: 0010:igb_rd32+0x1f/0x60 [...] Call Trace: igb_ptp_read_82580+0x20/0x50 timecounter_read+0x15/0x60 igb_ptp_overflow_check+0x1a/0x50 process_one_work+0x1cb/0x3c0 worker_thread+0x53/0x3f0 ? rescuer_thread+0x370/0x370 kthread+0x142/0x160 ? kthread_associate_blkcg+0xc0/0xc0 ret_from_fork+0x1f/0x30 Fixes: 1f6e8178d685 ("igb: Prevent dropped Tx timestamps via work items and interrupts.") Fixes: d339b1331616 ("igb: add PTP Hardware Clock code") Signed-off-by: Alessio Igor Bogani Tested-by: Arpana Arland (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230821171927.2203644-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 39d43b9cdfe8b02d9dc28994ddd578b9e5f34a7e Author: Oliver Hartkopp Date: Mon Aug 21 16:45:46 2023 +0200 can: isotp: fix support for transmission of SF without flow control [ Upstream commit 0bfe71159230bab79ee230225ae12ffecbb69f3e ] The original implementation had a very simple handling for single frame transmissions as it just sent the single frame without a timeout handling. With the new echo frame handling the echo frame was also introduced for single frames but the former exception ('simple without timers') has been maintained by accident. This leads to a 1 second timeout when closing the socket and to an -ECOMM error when CAN_ISOTP_WAIT_TX_DONE is selected. As the echo handling is always active (also for single frames) remove the wrong extra condition for single frames. Fixes: 9f39d36530e5 ("can: isotp: add support for transmission without flow control") Signed-off-by: Oliver Hartkopp Link: https://lore.kernel.org/r/20230821144547.6658-2-socketcan@hartkopp.net Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit f41781b9d8a4af2b1ede0e1045b62359cf6b66b4 Author: Hangbin Liu Date: Thu Aug 17 16:24:59 2023 +0800 selftests: bonding: do not set port down before adding to bond [ Upstream commit be809424659c2844a2d7ab653aacca4898538023 ] Before adding a port to bond, it need to be set down first. In the lacpdu test the author set the port down specifically. But commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing it up") changed the operation order, the kernel will set the port down _after_ adding to bond. So all the ports will be down at last and the test failed. In fact, the veth interfaces are already inactive when added. This means there's no need to set them down again before adding to the bond. Let's just remove the link down operation. Fixes: a4abfa627c38 ("net: rtnetlink: Enslave device before bringing it up") Reported-by: Zhengchao Shao Closes: https://lore.kernel.org/netdev/a0ef07c7-91b0-94bd-240d-944a330fcabd@huawei.com/ Signed-off-by: Hangbin Liu Link: https://lore.kernel.org/r/20230817082459.1685972-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 850e2322ae59149475500bf1347cf60a0f7f051c Author: Petr Oros Date: Fri Aug 11 10:07:02 2023 +0200 ice: Fix NULL pointer deref during VF reset [ Upstream commit 67f6317dfa609846a227a706532439a22828c24b ] During stress test with attaching and detaching VF from KVM and simultaneously changing VFs spoofcheck and trust there was a NULL pointer dereference in ice_reset_vf that VF's VSI is null. More than one instance of ice_reset_vf() can be running at a given time. When we rebuild the VSI in ice_reset_vf, another reset can be triaged from ice_service_task. In this case we can access the currently uninitialized VSI and cause panic. The window for this racing condition has been around for a long time but it's much worse after commit 227bf4500aaa ("ice: move VSI delete outside deconfig") because the reset runs faster. ice_reset_vf() using vf->cfg_lock and when we move this lock before accessing to the VF VSI, we can fix BUG for all cases. Panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes in ice_vsi_stop_all_rx_rings() With our reproducer, we can hit BUG: ~8h before commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). ~20m after commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). After this fix we are not able to reproduce it after ~48h There was commit cf90b74341ee ("ice: Fix call trace with null VSI during VF reset") which also tried to fix this issue, but it was only partially resolved and the bug still exists. [ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 6420.665382] #PF: supervisor read access in kernel mode [ 6420.670521] #PF: error_code(0x0000) - not-present page [ 6420.675659] PGD 0 [ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1 [ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022 [ 6420.698729] Workqueue: ice ice_service_task [ice] [ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice] [ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted [ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83 [ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized [ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246 [ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000 [ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828 [ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000 [ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0 [ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff [ 6420.783742] FS: 0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000 [ 6420.791829] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0 [ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002) [ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6420.811841] PKRU: 55555554 [ 6420.811842] Call Trace: [ 6420.811843] [ 6420.811844] ice_reset_vf+0x9a/0x450 [ice] [ 6420.811876] ice_process_vflr_event+0x8f/0xc0 [ice] [ 6420.841343] ice_service_task+0x23b/0x600 [ice] [ 6420.845884] ? __schedule+0x212/0x550 [ 6420.849550] process_one_work+0x1e2/0x3b0 [ 6420.853563] ? rescuer_thread+0x390/0x390 [ 6420.857577] worker_thread+0x50/0x3a0 [ 6420.861242] ? rescuer_thread+0x390/0x390 [ 6420.865253] kthread+0xdd/0x100 [ 6420.868400] ? kthread_complete_and_exit+0x20/0x20 [ 6420.873194] ret_from_fork+0x1f/0x30 [ 6420.876774] [ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk r Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Fixes: f23df5220d2b ("ice: Fix spurious interrupt during removal of trusted VF") Signed-off-by: Petr Oros Reviewed-by: Simon Horman Reviewed-by: Przemek Kitszel Reviewed-by: Jacob Keller Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 7cddaed2a3f639c06ed033c92cfb552346c3ff83 Author: Petr Oros Date: Fri Aug 11 10:07:01 2023 +0200 Revert "ice: Fix ice VF reset during iavf initialization" [ Upstream commit 0ecff05e6c59dd82dbcb9706db911f7fd9f40fb8 ] This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc. After this commit we are not able to attach VF to VM: virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52 error: Failed to attach interface error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable ice_check_vf_ready_for_cfg() already contain waiting for reset. New condition in ice_check_vf_ready_for_reset() causing only problems. Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization") Signed-off-by: Petr Oros Reviewed-by: Simon Horman Reviewed-by: Przemek Kitszel Reviewed-by: Jacob Keller Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 1188e9dd7af97adb7af7ea2e9e63c771eaaf278c Author: Jesse Brandeburg Date: Thu Aug 10 16:51:10 2023 -0700 ice: fix receive buffer size miscalculation [ Upstream commit 10083aef784031fa9f06c19a1b182e6fad5338d9 ] The driver is misconfiguring the hardware for some values of MTU such that it could use multiple descriptors to receive a packet when it could have simply used one. Change the driver to use a round-up instead of the result of a shift, as the shift can truncate the lower bits of the size, and result in the problem noted above. It also aligns this driver with similar code in i40e. The insidiousness of this problem is that everything works with the wrong size, it's just not working as well as it could, as some MTU sizes end up using two or more descriptors, and there is no way to tell that is happening without looking at ice_trace or a bus analyzer. Fixes: efc2214b6047 ("ice: Add support for XDP") Reviewed-by: Przemek Kitszel Signed-off-by: Jesse Brandeburg Reviewed-by: Leon Romanovsky Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 417e7ec0d61e2bcb4e264375badbd2e3c879467c Author: Eric Dumazet Date: Sat Aug 19 03:17:07 2023 +0000 ipv4: fix data-races around inet->inet_id [ Upstream commit f866fbc842de5976e41ba874b76ce31710b634b5 ] UDP sendmsg() is lockless, so ip_select_ident_segs() can very well be run from multiple cpus [1] Convert inet->inet_id to an atomic_t, but implement a dedicated path for TCP, avoiding cost of a locked instruction (atomic_add_return()) Note that this patch will cause a trivial merge conflict because we added inet->flags in net-next tree. v2: added missing change in drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c (David Ahern) [1] BUG: KCSAN: data-race in __ip_make_skb / __ip_make_skb read-write to 0xffff888145af952a of 2 bytes by task 7803 on cpu 1: ip_select_ident_segs include/net/ip.h:542 [inline] ip_select_ident include/net/ip.h:556 [inline] __ip_make_skb+0x844/0xc70 net/ipv4/ip_output.c:1446 ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560 udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260 inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830 sock_sendmsg_nosec net/socket.c:725 [inline] sock_sendmsg net/socket.c:748 [inline] ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494 ___sys_sendmsg net/socket.c:2548 [inline] __sys_sendmmsg+0x269/0x500 net/socket.c:2634 __do_sys_sendmmsg net/socket.c:2663 [inline] __se_sys_sendmmsg net/socket.c:2660 [inline] __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff888145af952a of 2 bytes by task 7804 on cpu 0: ip_select_ident_segs include/net/ip.h:541 [inline] ip_select_ident include/net/ip.h:556 [inline] __ip_make_skb+0x817/0xc70 net/ipv4/ip_output.c:1446 ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560 udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260 inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830 sock_sendmsg_nosec net/socket.c:725 [inline] sock_sendmsg net/socket.c:748 [inline] ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494 ___sys_sendmsg net/socket.c:2548 [inline] __sys_sendmmsg+0x269/0x500 net/socket.c:2634 __do_sys_sendmmsg net/socket.c:2663 [inline] __se_sys_sendmmsg net/socket.c:2660 [inline] __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x184d -> 0x184e Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7804 Comm: syz-executor.1 Not tainted 6.5.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 ================================================================== Fixes: 23f57406b82d ("ipv4: avoid using shared IP generator for connected sockets") Reported-by: syzbot Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4af1fe642f3724f1567e7c2017eeb857b08399d6 Author: Jakub Kicinski Date: Fri Aug 18 18:26:02 2023 -0700 net: validate veth and vxcan peer ifindexes [ Upstream commit f534f6581ec084fe94d6759f7672bd009794b07e ] veth and vxcan need to make sure the ifindexes of the peer are not negative, core does not validate this. Using iproute2 with user-space-level checking removed: Before: # ./ip link add index 10 type veth peer index -1 # ip link show 1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: enp1s0: mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:74:b2:03 brd ff:ff:ff:ff:ff:ff 10: veth1@veth0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 8a:90:ff:57:6d:5d brd ff:ff:ff:ff:ff:ff -1: veth0@veth1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:ed:18:e6:fa:7f brd ff:ff:ff:ff:ff:ff Now: $ ./ip link add index 10 type veth peer index -1 Error: ifindex can't be negative. This problem surfaced in net-next because an explicit WARN() was added, the root cause is older. Fixes: e6f8f1a739b6 ("veth: Allow to create peer link with given ifindex") Fixes: a8f820a380a2 ("can: add Virtual CAN Tunnel driver (vxcan)") Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit afc9d3d217939842ba4e2795356b8dd53f7a2ed6 Author: Ruan Jinjie Date: Fri Aug 18 13:12:21 2023 +0800 net: bcmgenet: Fix return value check for fixed_phy_register() [ Upstream commit 32bbe64a1386065ab2aef8ce8cae7c689d0add6e ] The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: b0ba512e25d7 ("net: bcmgenet: enable driver to work without a device tree") Signed-off-by: Ruan Jinjie Reviewed-by: Leon Romanovsky Acked-by: Doug Berger Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 029e491b8c11859525a1d6a307622bbc3a4ae559 Author: Ruan Jinjie Date: Fri Aug 18 13:12:20 2023 +0800 net: bgmac: Fix return value check for fixed_phy_register() [ Upstream commit 23a14488ea5882dea5851b65c9fce2127ee8fcad ] The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: c25b23b8a387 ("bgmac: register fixed PHY for ARM BCM470X / BCM5301X chipsets") Signed-off-by: Ruan Jinjie Reviewed-by: Andrew Lunn Reviewed-by: Leon Romanovsky Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ac259251487a174ece97dd745727e39712f18ebf Author: Arınç ÜNAL Date: Sun Aug 13 13:59:17 2023 +0300 net: dsa: mt7530: fix handling of 802.1X PAE frames [ Upstream commit e94b590abfff2cdbf0bdaa7d9904364c8d480af5 ] 802.1X PAE frames are link-local frames, therefore they must be trapped to the CPU port. Currently, the MT753X switches treat 802.1X PAE frames as regular multicast frames, therefore flooding them to user ports. To fix this, set 802.1X PAE frames to be trapped to the CPU port(s). Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL Reviewed-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c663607202f58e0fe17d2db1f9967d641c0195f5 Author: Ido Schimmel Date: Thu Aug 17 15:58:25 2023 +0200 selftests: mlxsw: Fix test failure on Spectrum-4 [ Upstream commit f520489e99a35b0a5257667274fbe9afd2d8c50b ] Remove assumptions about shared buffer cell size and instead query the cell size from devlink. Adjust the test to send small packets that fit inside a single cell. Tested on Spectrum-{1,2,3,4}. Fixes: 4735402173e6 ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/f7dfbf3c4d1cb23838d9eb99bab09afaa320c4ca.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1288f9907514b8e0cc1e17645252429ee0917182 Author: Amit Cohen Date: Thu Aug 17 15:58:24 2023 +0200 mlxsw: Fix the size of 'VIRT_ROUTER_MSB' [ Upstream commit 348c976be0a599918b88729def198a843701c9fe ] The field 'virtual router' was extended to 12 bits in Spectrum-4. Therefore, the element 'MLXSW_AFK_ELEMENT_VIRT_ROUTER_MSB' needs 3 bits for Spectrum < 4 and 4 bits for Spectrum >= 4. The elements are stored in an internal storage scratchpad. Currently, the MSB is defined there as 3 bits. It means that for Spectrum-4, only 2K VRFs can be used for multicast routing, as the highest bit is not really used by the driver. Fix the definition of 'VIRT_ROUTER_MSB' to use 4 bits. Adjust the definitions of 'virtual router' field in the blocks accordingly - use '_avoid_size_check' for Spectrum-2 instead of for Spectrum-4. Fix the mask in parse function to use 4 bits. Fixes: 6d5d8ebb881c ("mlxsw: Rename virtual router flex key element") Signed-off-by: Amit Cohen Reviewed-by: Ido Schimmel Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/79bed2b70f6b9ed58d4df02e9798a23da648015b.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 7134565a8207fc6fafe188c95bdd5f09744ccfec Author: Ido Schimmel Date: Thu Aug 17 15:58:23 2023 +0200 mlxsw: reg: Fix SSPR register layout [ Upstream commit 0dc63b9cfd4c5666ced52c829fdd65dcaeb9f0f1 ] The two most significant bits of the "local_port" field in the SSPR register are always cleared since they are overwritten by the deprecated and overlapping "sub_port" field. On systems with more than 255 local ports (e.g., Spectrum-4), this results in the firmware maintaining invalid mappings between system port and local port. Specifically, two different systems ports (0x1 and 0x101) point to the same local port (0x1), which eventually leads to firmware errors. Fix by removing the deprecated "sub_port" field. Fixes: fd24b29a1b74 ("mlxsw: reg: Align existing registers to use extended local_port field") Signed-off-by: Ido Schimmel Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/9b909a3033c8d3d6f67f237306bef4411c5e6ae4.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 22f9b5468df567ec3b3f493e16c65ff5df5fdff0 Author: Danielle Ratson Date: Thu Aug 17 15:58:22 2023 +0200 mlxsw: pci: Set time stamp fields also when its type is MIRROR_UTC [ Upstream commit bc2de151ab6ad0762a04563527ec42e54dde572a ] Currently, in Spectrum-2 and above, time stamps are extracted from the CQE into the time stamp fields in 'struct mlxsw_skb_cb', only when the CQE time stamp type is UTC. The time stamps are read directly from the CQE and software can get the time stamp in UTC format using CQEv2. From Spectrum-4, the time stamps that are read from the CQE are allowed to be also from MIRROR_UTC type. Therefore, we get a warning [1] from the driver that the time stamp fields were not set, when LLDP control packet is sent. Allow the time stamp type to be MIRROR_UTC and set the time stamp in this case as well. [1] WARNING: CPU: 11 PID: 0 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1409 mlxsw_sp2_ptp_hwtstamp_fill+0x1f/0x70 [mlxsw_spectrum] [...] Call Trace: mlxsw_sp2_ptp_receive+0x3c/0x80 [mlxsw_spectrum] mlxsw_core_skb_receive+0x119/0x190 [mlxsw_core] mlxsw_pci_cq_tasklet+0x3c9/0x780 [mlxsw_pci] tasklet_action_common.constprop.0+0x9f/0x110 __do_softirq+0xbb/0x296 irq_exit_rcu+0x79/0xa0 common_interrupt+0x86/0xa0 Fixes: 4735402173e6 ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC") Signed-off-by: Danielle Ratson Reviewed-by: Ido Schimmel Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/bcef4d044ef608a4e258d33a7ec0ecd91f480db5.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4496f6ccf599054b9a8d2a9d41e6410221269aa7 Author: Lu Wei Date: Thu Aug 17 22:54:49 2023 +0800 ipvlan: Fix a reference count leak warning in ipvlan_ns_exit() [ Upstream commit 043d5f68d0ccdda91029b4b6dce7eeffdcfad281 ] There are two network devices(veth1 and veth3) in ns1, and ipvlan1 with L3S mode and ipvlan2 with L2 mode are created based on them as figure (1). In this case, ipvlan_register_nf_hook() will be called to register nf hook which is needed by ipvlans in L3S mode in ns1 and value of ipvl_nf_hook_refcnt is set to 1. (1) ns1 ns2 ------------ ------------ veth1--ipvlan1 (L3S) veth3--ipvlan2 (L2) (2) ns1 ns2 ------------ ------------ veth1--ipvlan1 (L3S) ipvlan2 (L2) veth3 | | |------->-------->--------->-------- migrate When veth3 migrates from ns1 to ns2 as figure (2), veth3 will register in ns2 and calls call_netdevice_notifiers with NETDEV_REGISTER event: dev_change_net_namespace call_netdevice_notifiers ipvlan_device_event ipvlan_migrate_l3s_hook ipvlan_register_nf_hook(newnet) (I) ipvlan_unregister_nf_hook(oldnet) (II) In function ipvlan_migrate_l3s_hook(), ipvl_nf_hook_refcnt in ns1 is not 0 since veth1 with ipvlan1 still in ns1, (I) and (II) will be called to register nf_hook in ns2 and unregister nf_hook in ns1. As a result, ipvl_nf_hook_refcnt in ns1 is decreased incorrectly and this in ns2 is increased incorrectly. When the second net namespace is removed, a reference count leak warning in ipvlan_ns_exit() will be triggered. This patch add a check before ipvlan_migrate_l3s_hook() is called. The warning can be triggered as follows: $ ip netns add ns1 $ ip netns add ns2 $ ip netns exec ns1 ip link add veth1 type veth peer name veth2 $ ip netns exec ns1 ip link add veth3 type veth peer name veth4 $ ip netns exec ns1 ip link add ipv1 link veth1 type ipvlan mode l3s $ ip netns exec ns1 ip link add ipv2 link veth3 type ipvlan mode l2 $ ip netns exec ns1 ip link set veth3 netns ns2 $ ip net del ns2 Fixes: 3133822f5ac1 ("ipvlan: use pernet operations and restrict l3s hooks to master netns") Signed-off-by: Lu Wei Reviewed-by: Florian Westphal Link: https://lore.kernel.org/r/20230817145449.141827-1-luwei32@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 265ed382e0f4c7b28742e542d4b14799e131a57e Author: Eric Dumazet Date: Fri Aug 18 01:58:20 2023 +0000 dccp: annotate data-races in dccp_poll() [ Upstream commit cba3f1786916063261e3e5ccbb803abc325b24ef ] We changed tcp_poll() over time, bug never updated dccp. Note that we also could remove dccp instead of maintaining it. Fixes: 7c657876b63c ("[DCCP]: Initial implementation") Signed-off-by: Eric Dumazet Link: https://lore.kernel.org/r/20230818015820.2701595-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b516a24f4c07426028baaa55595f50730df512c6 Author: Eric Dumazet Date: Fri Aug 18 01:51:32 2023 +0000 sock: annotate data-races around prot->memory_pressure [ Upstream commit 76f33296d2e09f63118db78125c95ef56df438e9 ] *prot->memory_pressure is read/writen locklessly, we need to add proper annotations. A recent commit added a new race, it is time to audit all accesses. Fixes: 2d0c88e84e48 ("sock: Fix misuse of sk_under_memory_pressure()") Fixes: 4d93df0abd50 ("[SCTP]: Rewrite of sctp buffer management code") Signed-off-by: Eric Dumazet Cc: Abel Wu Reviewed-by: Shakeel Butt Link: https://lore.kernel.org/r/20230818015132.2699348-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit cfee17993d1065d7d6553e57d276cb36c2a768d4 Author: Vladimir Oltean Date: Thu Aug 17 15:01:11 2023 +0300 net: dsa: felix: fix oversize frame dropping for always closed tc-taprio gates [ Upstream commit d44036cad31170da0cb9c728e80743f84267da6e ] The blamed commit resolved a bug where frames would still get stuck at egress, even though they're smaller than the maxSDU[tc], because the driver did not take into account the extra 33 ns that the queue system needs for scheduling the frame. It now takes that into account, but the arithmetic that we perform in vsc9959_tas_remaining_gate_len_ps() is buggy, because we operate on 64-bit unsigned integers, so gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS may become a very large integer if gate_len_ns < 33 ns. In practice, this means that we've introduced a regression where all traffic class gates which are permanently closed will not get detected by the driver, and we won't enable oversize frame dropping for them. Before: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 2 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 3 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 4 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 5 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 6 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS After: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS Fixes: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Vladimir Oltean Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230817120111.3522827-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b701b8d191daae3442942a1c68bb6e0582b566b2 Author: Jiri Pirko Date: Thu Aug 17 14:52:40 2023 +0200 devlink: add missing unregister linecard notification [ Upstream commit 2ebbc9752d06bb1d01201fe632cb6da033b0248d ] Cited fixes commit introduced linecard notifications for register, however it didn't add them for unregister. Fix that by adding them. Fixes: c246f9b5fd61 ("devlink: add support to create line card and expose to user") Signed-off-by: Jiri Pirko Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230817125240.2144794-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1375d2061204785d592c05f5dec47eb487ec3515 Author: Jakub Kicinski Date: Wed Jan 4 20:05:17 2023 -0800 devlink: move code to a dedicated directory [ Upstream commit f05bd8ebeb69c803efd6d8a76d96b7fcd7011094 ] The devlink code is hard to navigate with 13kLoC in one file. I really like the way Michal split the ethtool into per-command files and core. It'd probably be too much to split it all up, but we can at least separate the core parts out of the per-cmd implementations and put it in a directory so that new commands can be separate files. Move the code, subsequent commit will do a partial split. Reviewed-by: Jacob Keller Reviewed-by: Jiri Pirko Signed-off-by: Jakub Kicinski Stable-dep-of: 2ebbc9752d06 ("devlink: add missing unregister linecard notification") Signed-off-by: Sasha Levin commit eaeef5c865ab9dc5c153b552b52ed0d90eea614e Author: Hariprasad Kelam Date: Thu Aug 17 12:00:06 2023 +0530 octeontx2-af: SDP: fix receive link config [ Upstream commit 05f3d5bc23524bed6f043dfe6b44da687584f9fb ] On SDP interfaces, frame oversize and undersize errors are observed as driver is not considering packet sizes of all subscribers of the link before updating the link config. This patch fixes the same. Fixes: 9b7dd87ac071 ("octeontx2-af: Support to modify min/max allowed packet lengths") Signed-off-by: Hariprasad Kelam Signed-off-by: Sunil Goutham Reviewed-by: Leon Romanovsky Link: https://lore.kernel.org/r/20230817063006.10366-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 2cb0c037c927db4ec928cc927488e52aa359786e Author: Zheng Yejian Date: Thu Aug 17 20:55:39 2023 +0800 tracing: Fix memleak due to race between current_tracer and trace [ Upstream commit eecb91b9f98d6427d4af5fdb8f108f52572a39e7 ] Kmemleak report a leak in graph_trace_open(): unreferenced object 0xffff0040b95f4a00 (size 128): comm "cat", pid 204981, jiffies 4301155872 (age 99771.964s) hex dump (first 32 bytes): e0 05 e7 b4 ab 7d 00 00 0b 00 01 00 00 00 00 00 .....}.......... f4 00 01 10 00 a0 ff ff 00 00 00 00 65 00 10 00 ............e... backtrace: [<000000005db27c8b>] kmem_cache_alloc_trace+0x348/0x5f0 [<000000007df90faa>] graph_trace_open+0xb0/0x344 [<00000000737524cd>] __tracing_open+0x450/0xb10 [<0000000098043327>] tracing_open+0x1a0/0x2a0 [<00000000291c3876>] do_dentry_open+0x3c0/0xdc0 [<000000004015bcd6>] vfs_open+0x98/0xd0 [<000000002b5f60c9>] do_open+0x520/0x8d0 [<00000000376c7820>] path_openat+0x1c0/0x3e0 [<00000000336a54b5>] do_filp_open+0x14c/0x324 [<000000002802df13>] do_sys_openat2+0x2c4/0x530 [<0000000094eea458>] __arm64_sys_openat+0x130/0x1c4 [<00000000a71d7881>] el0_svc_common.constprop.0+0xfc/0x394 [<00000000313647bf>] do_el0_svc+0xac/0xec [<000000002ef1c651>] el0_svc+0x20/0x30 [<000000002fd4692a>] el0_sync_handler+0xb0/0xb4 [<000000000c309c35>] el0_sync+0x160/0x180 The root cause is descripted as follows: __tracing_open() { // 1. File 'trace' is being opened; ... *iter->trace = *tr->current_trace; // 2. Tracer 'function_graph' is // currently set; ... iter->trace->open(iter); // 3. Call graph_trace_open() here, // and memory are allocated in it; ... } s_start() { // 4. The opened file is being read; ... *iter->trace = *tr->current_trace; // 5. If tracer is switched to // 'nop' or others, then memory // in step 3 are leaked!!! ... } To fix it, in s_start(), close tracer before switching then reopen the new tracer after switching. And some tracers like 'wakeup' may not update 'iter->private' in some cases when reopen, then it should be cleared to avoid being mistakenly closed again. Link: https://lore.kernel.org/linux-trace-kernel/20230817125539.1646321-1-zhengyejian1@huawei.com Fixes: d7350c3f4569 ("tracing/core: make the read callbacks reentrants") Signed-off-by: Zheng Yejian Signed-off-by: Steven Rostedt (Google) Signed-off-by: Sasha Levin commit 7d0c2b0de2dbf086373049ac9bf45b9ae7f421bb Author: Zheng Yejian Date: Sat Aug 5 11:38:15 2023 +0800 tracing: Fix cpu buffers unavailable due to 'record_disabled' missed [ Upstream commit b71645d6af10196c46cbe3732de2ea7d36b3ff6d ] Trace ring buffer can no longer record anything after executing following commands at the shell prompt: # cd /sys/kernel/tracing # cat tracing_cpumask fff # echo 0 > tracing_cpumask # echo 1 > snapshot # echo fff > tracing_cpumask # echo 1 > tracing_on # echo "hello world" > trace_marker -bash: echo: write error: Bad file descriptor The root cause is that: 1. After `echo 0 > tracing_cpumask`, 'record_disabled' of cpu buffers in 'tr->array_buffer.buffer' became 1 (see tracing_set_cpumask()); 2. After `echo 1 > snapshot`, 'tr->array_buffer.buffer' is swapped with 'tr->max_buffer.buffer', then the 'record_disabled' became 0 (see update_max_tr()); 3. After `echo fff > tracing_cpumask`, the 'record_disabled' become -1; Then array_buffer and max_buffer are both unavailable due to value of 'record_disabled' is not 0. To fix it, enable or disable both array_buffer and max_buffer at the same time in tracing_set_cpumask(). Link: https://lkml.kernel.org/r/20230805033816.3284594-2-zhengyejian1@huawei.com Cc: Cc: Cc: Fixes: 71babb2705e2 ("tracing: change CPU ring buffer state from tracing_cpumask") Signed-off-by: Zheng Yejian Signed-off-by: Steven Rostedt (Google) Signed-off-by: Sasha Levin commit 7e862cce34916458bf6af954d198cce103c1e13f Author: Andi Shyti Date: Tue Jul 25 02:19:50 2023 +0200 drm/i915/gt: Support aux invalidation on all engines [ Upstream commit 6a35f22d222528e1b157c6978c9424d2f8cbe0a1 ] Perform some refactoring with the purpose of keeping in one single place all the operations around the aux table invalidation. With this refactoring add more engines where the invalidation should be performed. Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines") Signed-off-by: Andi Shyti Cc: Jonathan Cavitt Cc: Matt Roper Cc: # v5.8+ Reviewed-by: Andrzej Hajda Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-8-andi.shyti@linux.intel.com (cherry picked from commit 76ff7789d6e63d1a10b3b58f5c70b2e640c7a880) Signed-off-by: Tvrtko Ursulin Signed-off-by: Sasha Levin commit 8e3f138b96f64fde58d74f886acbfd4baca907fc Author: Jonathan Cavitt Date: Tue Jul 25 02:19:49 2023 +0200 drm/i915/gt: Poll aux invalidation register bit on invalidation [ Upstream commit 0fde2f23516a00fd90dfb980b66b4665fcbfa659 ] For platforms that use Aux CCS, wait for aux invalidation to complete by checking the aux invalidation register bit is cleared. Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines") Signed-off-by: Jonathan Cavitt Signed-off-by: Andi Shyti Cc: # v5.8+ Reviewed-by: Nirmoy Das Reviewed-by: Andrzej Hajda Reviewed-by: Matt Roper Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-7-andi.shyti@linux.intel.com (cherry picked from commit d459c86f00aa98028d155a012c65dc42f7c37e76) Signed-off-by: Tvrtko Ursulin Signed-off-by: Sasha Levin commit 017d4404312ab94a61be218c0221cd0048a37896 Author: Jonathan Cavitt Date: Tue Jul 25 02:19:46 2023 +0200 drm/i915/gt: Ensure memory quiesced before invalidation [ Upstream commit 78a6ccd65fa3a7cc697810db079cc4b84dff03d5 ] All memory traffic must be quiesced before requesting an aux invalidation on platforms that use Aux CCS. Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines") Requires: a2a4aa0eef3b ("drm/i915: Add the gen12_needs_ccs_aux_inv helper") Signed-off-by: Jonathan Cavitt Signed-off-by: Andi Shyti Cc: # v5.8+ Reviewed-by: Nirmoy Das Reviewed-by: Andrzej Hajda Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-4-andi.shyti@linux.intel.com (cherry picked from commit ad8ebf12217e451cd19804b1c3e97ad56491c74a) Signed-off-by: Tvrtko Ursulin Signed-off-by: Sasha Levin commit c23126f2c76a17b97520d306542cee32bb26fad8 Author: Andi Shyti Date: Tue Jul 25 02:19:45 2023 +0200 drm/i915: Add the gen12_needs_ccs_aux_inv helper [ Upstream commit b2f59e9026038a5bbcbc0019fa58f963138211ee ] We always assumed that a device might either have AUX or FLAT CCS, but this is an approximation that is not always true, e.g. PVC represents an exception. Set the basis for future finer selection by implementing a boolean gen12_needs_ccs_aux_inv() function that tells whether aux invalidation is needed or not. Currently PVC is the only exception to the above mentioned rule. Requires: 059ae7ae2a1c ("drm/i915/gt: Cleanup aux invalidation registers") Signed-off-by: Andi Shyti Cc: Matt Roper Cc: Jonathan Cavitt Cc: # v5.8+ Reviewed-by: Matt Roper Reviewed-by: Andrzej Hajda Reviewed-by: Nirmoy Das Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-3-andi.shyti@linux.intel.com (cherry picked from commit c827655b87ad201ebe36f2e28d16b5491c8f7801) Signed-off-by: Tvrtko Ursulin Signed-off-by: Sasha Levin commit d4f5dcf68c0531b8a7191b736e121c6537e1b6cf Author: Harald Freudenberger Date: Mon Jul 17 16:55:29 2023 +0200 s390/zcrypt: fix reply buffer calculations for CCA replies [ Upstream commit 4cfca532ddc3474b3fc42592d0e4237544344b1a ] The length information for available buffer space for CCA replies is covered with two fields in the T6 header prepended on each CCA reply: fromcardlen1 and fromcardlen2. The sum of these both values must not exceed the AP bus limit for this card (24KB for CEX8, 12KB CEX7 and older) minus the always present headers. The current code adjusted the fromcardlen2 value in case of exceeding the AP bus limit when there was a non-zero value given from userspace. Some tests now showed that this was the wrong assumption. Instead the userspace value given for this field should always be trusted and if the sum of the two fields exceeds the AP bus limit for this card the first field fromcardlen1 should be adjusted instead. So now the calculation is done with this new insight in mind. Also some additional checks for overflow have been introduced and some comments to provide some documentation for future maintainers of this complicated calculation code. Furthermore the 128 bytes of fix overhead which is used in the current code is not correct. Investigations showed that for a reply always the same two header structs are prepended before a possible payload. So this is also fixed with this patch. Signed-off-by: Harald Freudenberger Reviewed-by: Holger Dengler Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit 246d763b79a597fcc84ff10d96b620974f47bb56 Author: Yu Zhe Date: Fri Mar 3 13:21:55 2023 +0800 s390/zcrypt: remove unnecessary (void *) conversions [ Upstream commit 72c2112ce9d72e6c40dd893f32187a3d34453113 ] Pointer variables of void * type do not require type cast. Signed-off-by: Yu Zhe Reviewed-by: Muhammad Usama Anjum Link: https://lore.kernel.org/r/20230303052155.21072-1-yuzhe@nfschina.com Signed-off-by: Heiko Carstens Signed-off-by: Vasily Gorbik Stable-dep-of: 4cfca532ddc3 ("s390/zcrypt: fix reply buffer calculations for CCA replies") Signed-off-by: Sasha Levin commit 40dafcab9da92179d610ea5a5f0c32f5ff2fc365 Author: Eric Dumazet Date: Thu Jul 20 11:44:38 2023 +0000 can: raw: fix lockdep issue in raw_release() [ Upstream commit 11c9027c983e9e4b408ee5613b6504d24ebd85be ] syzbot complained about a lockdep issue [1] Since raw_bind() and raw_setsockopt() first get RTNL before locking the socket, we must adopt the same order in raw_release() [1] WARNING: possible circular locking dependency detected 6.5.0-rc1-syzkaller-00192-g78adb4bcf99e #0 Not tainted ------------------------------------------------------ syz-executor.0/14110 is trying to acquire lock: ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1708 [inline] ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: raw_bind+0xb1/0xab0 net/can/raw.c:435 but task is already holding lock: ffffffff8e3df368 (rtnl_mutex){+.+.}-{3:3}, at: raw_bind+0xa7/0xab0 net/can/raw.c:434 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:603 [inline] __mutex_lock+0x181/0x1340 kernel/locking/mutex.c:747 raw_release+0x1c6/0x9b0 net/can/raw.c:391 __sock_release+0xcd/0x290 net/socket.c:654 sock_close+0x1c/0x20 net/socket.c:1386 __fput+0x3fd/0xac0 fs/file_table.c:384 task_work_run+0x14d/0x240 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297 do_syscall_64+0x44/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (sk_lock-AF_CAN){+.+.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492 lock_sock include/net/sock.h:1708 [inline] raw_bind+0xb1/0xab0 net/can/raw.c:435 __sys_bind+0x1ec/0x220 net/socket.c:1792 __do_sys_bind net/socket.c:1803 [inline] __se_sys_bind net/socket.c:1801 [inline] __x64_sys_bind+0x72/0xb0 net/socket.c:1801 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(sk_lock-AF_CAN); lock(rtnl_mutex); lock(sk_lock-AF_CAN); *** DEADLOCK *** 1 lock held by syz-executor.0/14110: stack backtrace: CPU: 0 PID: 14110 Comm: syz-executor.0 Not tainted 6.5.0-rc1-syzkaller-00192-g78adb4bcf99e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2195 check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492 lock_sock include/net/sock.h:1708 [inline] raw_bind+0xb1/0xab0 net/can/raw.c:435 __sys_bind+0x1ec/0x220 net/socket.c:1792 __do_sys_bind net/socket.c:1803 [inline] __se_sys_bind net/socket.c:1801 [inline] __x64_sys_bind+0x72/0xb0 net/socket.c:1801 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fd89007cb29 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd890d2a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 RAX: ffffffffffffffda RBX: 00007fd89019bf80 RCX: 00007fd89007cb29 RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000003 RBP: 00007fd8900c847a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fd89019bf80 R15: 00007ffebf8124f8 Fixes: ee8b94c8510c ("can: raw: fix receiver memory leak") Reported-by: syzbot Signed-off-by: Eric Dumazet Cc: Ziyang Xuan Cc: Oliver Hartkopp Cc: stable@vger.kernel.org Cc: Marc Kleine-Budde Link: https://lore.kernel.org/all/20230720114438.172434-1-edumazet@google.com Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 335987e21237fff0902262ca02b29463e84e3272 Author: Ziyang Xuan Date: Tue Jul 11 09:17:37 2023 +0800 can: raw: fix receiver memory leak [ Upstream commit ee8b94c8510ce64afe0b87ef548d23e00915fb10 ] Got kmemleak errors with the following ltp can_filter testcase: for ((i=1; i<=100; i++)) do ./can_filter & sleep 0.1 done ============================================================== [<00000000db4a4943>] can_rx_register+0x147/0x360 [can] [<00000000a289549d>] raw_setsockopt+0x5ef/0x853 [can_raw] [<000000006d3d9ebd>] __sys_setsockopt+0x173/0x2c0 [<00000000407dbfec>] __x64_sys_setsockopt+0x61/0x70 [<00000000fd468496>] do_syscall_64+0x33/0x40 [<00000000b7e47d51>] entry_SYSCALL_64_after_hwframe+0x61/0xc6 It's a bug in the concurrent scenario of unregister_netdevice_many() and raw_release() as following: cpu0 cpu1 unregister_netdevice_many(can_dev) unlist_netdevice(can_dev) // dev_get_by_index() return NULL after this net_set_todo(can_dev) raw_release(can_socket) dev = dev_get_by_index(, ro->ifindex); // dev == NULL if (dev) { // receivers in dev_rcv_lists not free because dev is NULL raw_disable_allfilters(, dev, ); dev_put(dev); } ... ro->bound = 0; ... call_netdevice_notifiers(NETDEV_UNREGISTER, ) raw_notify(, NETDEV_UNREGISTER, ) if (ro->bound) // invalid because ro->bound has been set 0 raw_disable_allfilters(, dev, ); // receivers in dev_rcv_lists will never be freed Add a net_device pointer member in struct raw_sock to record bound can_dev, and use rtnl_lock to serialize raw_socket members between raw_bind(), raw_release(), raw_setsockopt() and raw_notify(). Use ro->dev to decide whether to free receivers in dev_rcv_lists. Fixes: 8d0caedb7596 ("can: bcm/raw/isotp: use per module netdevice notifier") Reviewed-by: Oliver Hartkopp Acked-by: Oliver Hartkopp Signed-off-by: Ziyang Xuan Link: https://lore.kernel.org/all/20230711011737.1969582-1-william.xuanziyang@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit e5c768d809a85e9efd0274b2efe69d4970cc0014 Author: Zhang Yi Date: Tue Jun 6 21:59:27 2023 +0800 jbd2: fix a race when checking checkpoint buffer busy [ Upstream commit 46f881b5b1758dc4a35fba4a643c10717d0cf427 ] Before removing checkpoint buffer from the t_checkpoint_list, we have to check both BH_Dirty and BH_Lock bits together to distinguish buffers have not been or were being written back. But __cp_buffer_busy() checks them separately, it first check lock state and then check dirty, the window between these two checks could be raced by writing back procedure, which locks buffer and clears buffer dirty before I/O completes. So it cannot guarantee checkpointing buffers been written back to disk if some error happens later. Finally, it may clean checkpoint transactions and lead to inconsistent filesystem. jbd2_journal_forget() and __journal_try_to_free_buffer() also have the same problem (journal_unmap_buffer() escape from this issue since it's running under the buffer lock), so fix them through introducing a new helper to try holding the buffer lock and remove really clean buffer. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490 Cc: stable@vger.kernel.org Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20230606135928.434610-6-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 5fda50e262e65bd553ff777c4b280afd1495a18b Author: Zhang Yi Date: Tue Jun 6 21:59:25 2023 +0800 jbd2: remove journal_clean_one_cp_list() [ Upstream commit b98dba273a0e47dbfade89c9af73c5b012a4eabb ] journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, always scan and try to free the whole checkpoint list. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20230606135928.434610-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin commit 8168c96c24ecfe0265c295aa4d969b4b650f00d3 Author: Zhang Yi Date: Tue Jun 6 21:59:24 2023 +0800 jbd2: remove t_checkpoint_io_list [ Upstream commit be22255360f80d3af789daad00025171a65424a5 ] Since t_checkpoint_io_list was stop using in jbd2_log_do_checkpoint() now, it's time to remove the whole t_checkpoint_io_list logic. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20230606135928.434610-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin commit 1fa68a78109840ccb3d1b69ea0ba3cd9b30ed17f Author: Jiaxun Yang Date: Wed Jun 7 13:51:22 2023 +0800 MIPS: cpu-features: Use boot_cpu_type for CPU type based features [ Upstream commit 5487a7b60695a92cf998350e4beac17144c91fcd ] Some CPU feature macros were using current_cpu_type to mark feature availability. However current_cpu_type will use smp_processor_id, which is prohibited under preemptable context. Since those features are all uniform on all CPUs in a SMP system, use boot_cpu_type instead of current_cpu_type to fix preemptable kernel. Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang Signed-off-by: Thomas Bogendoerfer Signed-off-by: Sasha Levin commit 92c568c82ee74cb8343620d14190d9c1169e9f71 Author: Jiaxun Yang Date: Tue Apr 4 10:33:44 2023 +0100 MIPS: cpu-features: Enable octeon_cache by cpu_type [ Upstream commit f641519409a73403ee6612b8648b95a688ab85c2 ] cpu_has_octeon_cache was tied to 0 for generic cpu-features, whith this generic kernel built for octeon CPU won't boot. Just enable this flag by cpu_type. It won't hurt orther platforms because compiler will eliminate the code path on other processors. Signed-off-by: Jiaxun Yang Signed-off-by: Thomas Bogendoerfer Stable-dep-of: 5487a7b60695 ("MIPS: cpu-features: Use boot_cpu_type for CPU type based features") Signed-off-by: Sasha Levin commit 3e4d038da33e75765584fa54d6cc1a327535af65 Author: Igor Mammedov Date: Mon Apr 24 21:15:57 2023 +0200 PCI: acpiphp: Reassign resources on bridge if necessary [ Upstream commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 ] When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough. Reproducer: $ qemu-kvm -monitor stdio -M q35 -m 4G \ -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \ -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \ disk_image wait till linux guest boots, then hotplug device: (qemu) device_add qxl,bus=rp1 hotplug on guest side fails with: pci 0000:01:00.0: [1b36:0100] type 00 class 0x038000 pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x03ffffff] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00001fff] pci 0000:01:00.0: reg 0x1c: [io 0x0000-0x001f] pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000] pci 0000:01:00.0: BAR 1: no space for [mem size 0x04000000] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x04000000] pci 0000:01:00.0: BAR 2: assigned [mem 0xfe800000-0xfe801fff] pci 0000:01:00.0: BAR 3: assigned [io 0x1000-0x101f] qxl 0000:01:00.0: enabling device (0000 -> 0003) Unable to create vram_mapping qxl: probe of 0000:01:00.0 failed with error -12 However when using native PCIe hotplug '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off' it works fine, since kernel attempts to reassign unused resources. Use the same machinery as native PCIe hotplug to (re)assign resources. Link: https://lore.kernel.org/r/20230424191557.2464760-1-imammedo@redhat.com Signed-off-by: Igor Mammedov Signed-off-by: Bjorn Helgaas Acked-by: Michael S. Tsirkin Acked-by: Rafael J. Wysocki Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 28916927b7626ed3c44d902eafbf9d00691266e1 Author: Daniel Vetter Date: Thu Apr 6 15:21:05 2023 +0200 video/aperture: Move vga handling to pci function [ Upstream commit f1d599d315fb7b7343cddaf365e671aaa8453aca ] A few reasons for this: - It's really the only one where this matters. I tried looking around, and I didn't find any non-pci vga-compatible controllers for x86 (since that's the only platform where we had this until a few patches ago), where a driver participating in the aperture claim dance would interfere. - I also don't expect that any future bus anytime soon will not just look like pci towards the OS, that's been the case for like 25+ years by now for practically everything (even non non-x86). - Also it's a bit funny if we have one part of the vga removal in the pci function, and the other in the generic one. v2: Rebase. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter Signed-off-by: Thomas Zimmermann Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: Helge Deller Cc: linux-fbdev@vger.kernel.org Reviewed-by: Javier Martinez Canillas Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-6-tzimmermann@suse.de Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit 4aad3b82b9de7c4ffccb02f87d0c6903569333c9 Author: Daniel Vetter Date: Thu Apr 6 15:21:04 2023 +0200 video/aperture: Only kick vgacon when the pdev is decoding vga [ Upstream commit 7450cd235b45d43ee6f3c235f89e92623458175d ] Otherwise it's a bit silly, and we might throw out the driver for the screen the user is actually looking at. I haven't found a bug report for this case yet, but we did get bug reports for the analog case where we're throwing out the efifb driver. v2: Flip the check around to make it clear it's a special case for kicking out the vgacon driver only (Thomas) v4: - fixes to commit message - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Link: https://bugzilla.kernel.org/show_bug.cgi?id=216303 Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter Signed-off-by: Thomas Zimmermann Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: Helge Deller Cc: linux-fbdev@vger.kernel.org Reviewed-by: Javier Martinez Canillas Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-5-tzimmermann@suse.de Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit 437e99f2a1e933348c4cedb2c7ce6f0ad81b935e Author: Daniel Vetter Date: Thu Apr 6 15:21:03 2023 +0200 drm/aperture: Remove primary argument [ Upstream commit 62aeaeaa1b267c5149abee6b45967a5df3feed58 ] Only really pci devices have a business setting this - it's for figuring out whether the legacy vga stuff should be nuked too. And with the preceding two patches those are all using the pci version of this. Which means for all other callers primary == false and we can remove it now. v2: - Reorder to avoid compile fail (Thomas) - Include gma500, which retained it's called to the non-pci version. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter Signed-off-by: Thomas Zimmermann Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Deepak Rawat Cc: Neil Armstrong Cc: Kevin Hilman Cc: Jerome Brunet Cc: Martin Blumenstingl Cc: Thierry Reding Cc: Jonathan Hunter Cc: Emma Anholt Cc: Helge Deller Cc: David Airlie Cc: Daniel Vetter Cc: linux-hyperv@vger.kernel.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-tegra@vger.kernel.org Cc: linux-fbdev@vger.kernel.org Acked-by: Martin Blumenstingl Acked-by: Thierry Reding Reviewed-by: Javier Martinez Canillas Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-4-tzimmermann@suse.de Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit cccfcbb9e51af026529618e3bdc09b6beacac919 Author: Daniel Vetter Date: Thu Apr 6 15:21:01 2023 +0200 drm/gma500: Use drm_aperture_remove_conflicting_pci_framebuffers [ Upstream commit 80e993988b97fe794f3ec2be6db05fe30f9353c3 ] This one nukes all framebuffers, which is a bit much. In reality gma500 is igpu and never shipped with anything discrete, so there should not be any difference. v2: Unfortunately the framebuffer sits outside of the pci bars for gma500, and so only using the pci helpers won't be enough. Otoh if we only use non-pci helper, then we don't get the vga handling, and subsequent refactoring to untangle these special cases won't work. It's not pretty, but the simplest fix (since gma500 really is the only quirky pci driver like this we have) is to just have both calls. v4: - fix Daniel's S-o-b address v5: - add back an S-o-b tag with Daniel's Intel address Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter Signed-off-by: Thomas Zimmermann Cc: Patrik Jakobsson Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Reviewed-by: Javier Martinez Canillas Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-2-tzimmermann@suse.de Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit 6db53af15444e7022640d7b8d5e7531d94e27a43 Author: Daniel Vetter Date: Wed Jan 11 16:41:08 2023 +0100 fbdev/radeon: use pci aperture helpers [ Upstream commit 9b539c4d1b921bc9c8c578d4d50f0a7e7874d384 ] It's not exactly the same since the open coded version doesn't set primary correctly. But that's a bugfix, so shouldn't hurt really. Signed-off-by: Daniel Vetter Cc: Benjamin Herrenschmidt Cc: linux-fbdev@vger.kernel.org Reviewed-by: Thomas Zimmermann Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20230111154112.90575-7-daniel.vetter@ffwll.ch Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit cd1f889c99eee5a6fae671962a63bc89e68d7837 Author: Daniel Vetter Date: Wed Jan 11 16:41:02 2023 +0100 drm/ast: Use drm_aperture_remove_conflicting_pci_framebuffers [ Upstream commit c1ebead36099deb85384f6fb262fe619a04cee73 ] It's just open coded and matches. Note that Thomas said that his version apparently failed for some reason, but hey maybe we should try again. Signed-off-by: Daniel Vetter Cc: Dave Airlie Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: Helge Deller Cc: linux-fbdev@vger.kernel.org Tested-by: Thomas Zimmmermann Reviewed-by: Thomas Zimmermann Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20230111154112.90575-1-daniel.vetter@ffwll.ch Stable-dep-of: 5ae3716cfdcd ("video/aperture: Only remove sysfb on the default vga pci device") Signed-off-by: Sasha Levin commit 26ea8668b8aae5f57068c2f442a28a8e420a759b Author: Chuck Lever Date: Mon Jul 3 14:18:29 2023 -0400 xprtrdma: Remap Receive buffers after a reconnect [ Upstream commit 895cedc1791916e8a98864f12b656702fad0bb67 ] On server-initiated disconnect, rpcrdma_xprt_disconnect() was DMA- unmapping the Receive buffers, but rpcrdma_post_recvs() neglected to remap them after a new connection had been established. The result was immediate failure of the new connection with the Receives flushing with LOCAL_PROT_ERR. Fixes: 671c450b6fe0 ("xprtrdma: Fix oops in Receive handler after device removal") Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit d9aac9cdd6e2b4cf21dc91ad0803a7df2b177bac Author: Fedor Pchelkin Date: Tue Jul 25 14:59:30 2023 +0300 NFSv4: fix out path in __nfs4_get_acl_uncached [ Upstream commit f4e89f1a6dab4c063fc1e823cc9dddc408ff40cf ] Another highly rare error case when a page allocating loop (inside __nfs4_get_acl_uncached, this time) is not properly unwound on error. Since pages array is allocated being uninitialized, need to free only lower array indices. NULL checks were useful before commit 62a1573fcf84 ("NFSv4 fix acl retrieval over krb5i/krb5p mounts") when the array had been initialized to zero on stack. Found by Linux Verification Center (linuxtesting.org). Fixes: 62a1573fcf84 ("NFSv4 fix acl retrieval over krb5i/krb5p mounts") Signed-off-by: Fedor Pchelkin Reviewed-by: Benjamin Coddington Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 4a289d123f62f7fdf33e7ce02c4c4c0d3b708a2b Author: Fedor Pchelkin Date: Tue Jul 25 14:58:58 2023 +0300 NFSv4.2: fix error handling in nfs42_proc_getxattr [ Upstream commit 4e3733fd2b0f677faae21cf838a43faf317986d3 ] There is a slight issue with error handling code inside nfs42_proc_getxattr(). If page allocating loop fails then we free the failing page array element which is NULL but __free_page() can't deal with NULL args. Found by Linux Verification Center (linuxtesting.org). Fixes: a1f26739ccdc ("NFSv4.2: improve page handling for GETXATTR") Signed-off-by: Fedor Pchelkin Reviewed-by: Benjamin Coddington Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin