commit c9a51ebb4bac69ed3fee9c0ebe0c2b5149e80845 Author: Greg Kroah-Hartman Date: Fri Jan 5 15:19:45 2024 +0100 Linux 6.6.10 Link: https://lore.kernel.org/r/20240103164834.970234661@linuxfoundation.org Tested-by: Nam Cao Tested-by: Ronald Warsow Tested-by: SeongJae Park Tested-by: Salvatore Bonaccorso Tested-by: Florian Fainelli Tested-by: Kelsey Steele Tested-by: Shuah Khan Tested-by: Takeshi Ogasawara Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Harshit Mogalapalli Tested-by: Jon Hunter Tested-by: Allen Pais Tested-by: Guenter Roeck Tested-by: Namjae Jeon Signed-off-by: Greg Kroah-Hartman commit 9b603077e29c84836a44325593e959da818274c7 Author: Shin'ichiro Kawasaki Date: Thu Jan 4 20:40:50 2024 +0900 Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe" commit b20712e853305cbd04673f02b7e52ba5b12c11a9 upstream. This reverts commit b28ff7a7c3245d7f62acc20f15b4361292fe4117. The commit introduced P2SB device scan and resource cache during the boot process to avoid deadlock. But it caused detection failure of IDE controllers on old systems [1]. The IDE controllers on old systems and P2SB devices on newer systems have same PCI DEVFN. It is suspected the confusion between those two is the failure cause. Revert the change at this moment until the proper solution gets ready. Link: https://lore.kernel.org/platform-driver-x86/CABq1_vjfyp_B-f4LAL6pg394bP6nDFyvg110TOLHHb0x4aCPeg@mail.gmail.com/T/#m07b30468d9676fc5e3bb2122371121e4559bb383 [1] Signed-off-by: Shin'ichiro Kawasaki Link: https://lore.kernel.org/r/20240104114050.3142690-1-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Greg Kroah-Hartman commit b7f1c01b55ad2a5da12f08e5ec3c76dabb99882a Author: Pablo Neira Ayuso Date: Tue Dec 19 19:44:49 2023 +0100 netfilter: nf_tables: skip set commit for deleted/destroyed sets commit 7315dc1e122c85ffdfc8defffbb8f8b616c2eb1a upstream. NFT_MSG_DELSET deactivates all elements in the set, skip set->ops->commit() to avoid the unnecessary clone (for the pipapo case) as well as the sync GC cycle, which could deactivate again expired elements in such set. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Reported-by: Kevin Rich Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit e904e81fd3c2cbe322779b36d12c039a9c7b19af Author: Léo Lam Date: Sat Dec 16 05:47:17 2023 +0000 wifi: nl80211: fix deadlock in nl80211_set_cqm_rssi (6.6.x) Commit 008afb9f3d57 ("wifi: cfg80211: fix CQM for non-range use" backported to 6.6.x) causes nl80211_set_cqm_rssi not to release the wdev lock in some of the error paths. Of course, the ensuing deadlock causes userland network managers to break pretty badly, and on typical systems this also causes lockups on on suspend, poweroff and reboot. See [1], [2], [3] for example reports. The upstream commit 7e7efdda6adb ("wifi: cfg80211: fix CQM for non-range use"), committed in November 2023, is completely fine because there was another commit in August 2023 that removed the wdev lock: see commit 076fc8775daf ("wifi: cfg80211: remove wdev mutex"). The reason things broke in 6.6.5 is that commit 4338058f6009 was applied without also applying 076fc8775daf. Commit 076fc8775daf ("wifi: cfg80211: remove wdev mutex") is a rather large commit; adjusting the error handling (which is what this commit does) yields a much simpler patch and was tested to work properly. Fix the deadlock by releasing the lock before returning. [1] https://bugzilla.kernel.org/show_bug.cgi?id=218247 [2] https://bbs.archlinux.org/viewtopic.php?id=290976 [3] https://lore.kernel.org/all/87sf4belmm.fsf@turtle.gmx.de/ Link: https://lore.kernel.org/stable/e374bb16-5b13-44cc-b11a-2f4eefb1ecf5@manjaro.org/ Fixes: 008afb9f3d57 ("wifi: cfg80211: fix CQM for non-range use") Tested-by: "Léo Lam" Tested-by: Philip Müller Cc: stable@vger.kernel.org Cc: Johannes Berg Signed-off-by: "Léo Lam" Signed-off-by: Greg Kroah-Hartman commit d673099085ddf1c28a6c0c1a245480919b5fe9e5 Author: Johannes Berg Date: Sat Dec 16 05:47:15 2023 +0000 wifi: cfg80211: fix CQM for non-range use commit 7e7efdda6adb385fbdfd6f819d76bc68c923c394 upstream. [note: this is commit 4a7e92551618f3737b305f62451353ee05662f57 reapplied; that commit had been reverted in 6.6.6 because it caused regressions, see https://lore.kernel.org/stable/2023121450-habitual-transpose-68a1@gregkh/ for details] My prior race fix here broke CQM when ranges aren't used, as the reporting worker now requires the cqm_config to be set in the wdev, but isn't set when there's no range configured. Rather than continuing to special-case the range version, set the cqm_config always and configure accordingly, also tracking if range was used or not to be able to clear the configuration appropriately with the same API, which was actually not right if both were implemented by a driver for some reason, as is the case with mac80211 (though there the implementations are equivalent so it doesn't matter.) Also, the original multiple-RSSI commit lost checking for the callback, so might have potentially crashed if a driver had neither implementation, and userspace tried to use it despite not being advertised as supported. Cc: stable@vger.kernel.org Fixes: 4a4b8169501b ("cfg80211: Accept multiple RSSI thresholds for CQM") Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race") Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman Signed-off-by: "Léo Lam" Signed-off-by: Greg Kroah-Hartman commit ccd48707d51170a7518b77074f4053815ec58186 Author: Steven Rostedt (Google) Date: Thu Dec 28 09:51:49 2023 -0500 tracing: Fix blocked reader of snapshot buffer commit 39a7dc23a1ed0fe81141792a09449d124c5953bd upstream. If an application blocks on the snapshot or snapshot_raw files, expecting to be woken up when a snapshot occurs, it will not happen. Or it may happen with an unexpected result. That result is that the application will be reading the main buffer instead of the snapshot buffer. That is because when the snapshot occurs, the main and snapshot buffers are swapped. But the reader has a descriptor still pointing to the buffer that it originally connected to. This is fine for the main buffer readers, as they may be blocked waiting for a watermark to be hit, and when a snapshot occurs, the data that the main readers want is now on the snapshot buffer. But for waiters of the snapshot buffer, they are waiting for an event to occur that will trigger the snapshot and they can then consume it quickly to save the snapshot before the next snapshot occurs. But to do this, they need to read the new snapshot buffer, not the old one that is now receiving new data. Also, it does not make sense to have a watermark "buffer_percent" on the snapshot buffer, as the snapshot buffer is static and does not receive new data except all at once. Link: https://lore.kernel.org/linux-trace-kernel/20231228095149.77f5b45d@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers Cc: Mark Rutland Acked-by: Masami Hiramatsu (Google) Fixes: debdd57f5145f ("tracing: Make a snapshot feature available from userspace") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit a12754a8f5ac23f7792029afeec06f00f85546ef Author: Steven Rostedt (Google) Date: Fri Dec 29 11:51:34 2023 -0500 ftrace: Fix modification of direct_function hash while in use commit d05cb470663a2a1879277e544f69e660208f08f2 upstream. Masami Hiramatsu reported a memory leak in register_ftrace_direct() where if the number of new entries are added is large enough to cause two allocations in the loop: for (i = 0; i < size; i++) { hlist_for_each_entry(entry, &hash->buckets[i], hlist) { new = ftrace_add_rec_direct(entry->ip, addr, &free_hash); if (!new) goto out_remove; entry->direct = addr; } } Where ftrace_add_rec_direct() has: if (ftrace_hash_empty(direct_functions) || direct_functions->count > 2 * (1 << direct_functions->size_bits)) { struct ftrace_hash *new_hash; int size = ftrace_hash_empty(direct_functions) ? 0 : direct_functions->count + 1; if (size < 32) size = 32; new_hash = dup_hash(direct_functions, size); if (!new_hash) return NULL; *free_hash = direct_functions; direct_functions = new_hash; } The "*free_hash = direct_functions;" can happen twice, losing the previous allocation of direct_functions. But this also exposed a more serious bug. The modification of direct_functions above is not safe. As direct_functions can be referenced at any time to find what direct caller it should call, the time between: new_hash = dup_hash(direct_functions, size); and direct_functions = new_hash; can have a race with another CPU (or even this one if it gets interrupted), and the entries being moved to the new hash are not referenced. That's because the "dup_hash()" is really misnamed and is really a "move_hash()". It moves the entries from the old hash to the new one. Now even if that was changed, this code is not proper as direct_functions should not be updated until the end. That is the best way to handle function reference changes, and is the way other parts of ftrace handles this. The following is done: 1. Change add_hash_entry() to return the entry it created and inserted into the hash, and not just return success or not. 2. Replace ftrace_add_rec_direct() with add_hash_entry(), and remove the former. 3. Allocate a "new_hash" at the start that is made for holding both the new hash entries as well as the existing entries in direct_functions. 4. Copy (not move) the direct_function entries over to the new_hash. 5. Copy the entries of the added hash to the new_hash. 6. If everything succeeds, then use rcu_pointer_assign() to update the direct_functions with the new_hash. This simplifies the code and fixes both the memory leak as well as the race condition mentioned above. Link: https://lore.kernel.org/all/170368070504.42064.8960569647118388081.stgit@devnote2/ Link: https://lore.kernel.org/linux-trace-kernel/20231229115134.08dd5174@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland Cc: Mathieu Desnoyers Cc: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Acked-by: Masami Hiramatsu (Google) Fixes: 763e34e74bb7d ("ftrace: Add register_ftrace_direct()") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit baa88944038bbecc6f712e0e7c602ac5cfa6686a Author: Steven Rostedt (Google) Date: Tue Dec 26 12:59:02 2023 -0500 ring-buffer: Fix wake ups when buffer_percent is set to 100 commit 623b1f896fa8a669a277ee5a258307a16c7377a3 upstream. The tracefs file "buffer_percent" is to allow user space to set a water-mark on how much of the tracing ring buffer needs to be filled in order to wake up a blocked reader. 0 - is to wait until any data is in the buffer 1 - is to wait for 1% of the sub buffers to be filled 50 - would be half of the sub buffers are filled with data 100 - is not to wake the waiter until the ring buffer is completely full Unfortunately the test for being full was: dirty = ring_buffer_nr_dirty_pages(buffer, cpu); return (dirty * 100) > (full * nr_pages); Where "full" is the value for "buffer_percent". There is two issues with the above when full == 100. 1. dirty * 100 > 100 * nr_pages will never be true That is, the above is basically saying that if the user sets buffer_percent to 100, more pages need to be dirty than exist in the ring buffer! 2. The page that the writer is on is never considered dirty, as dirty pages are only those that are full. When the writer goes to a new sub-buffer, it clears the contents of that sub-buffer. That is, even if the check was ">=" it would still not be equal as the most pages that can be considered "dirty" is nr_pages - 1. To fix this, add one to dirty and use ">=" in the compare. Link: https://lore.kernel.org/linux-trace-kernel/20231226125902.4a057f1d@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland Cc: Mathieu Desnoyers Acked-by: Masami Hiramatsu (Google) Fixes: 03329f9939781 ("tracing: Add tracefs file buffer_percentage") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit c62b9a2daf2866622cc9e8d0451bf2fc97b541c9 Author: Keith Busch Date: Mon Dec 18 08:19:39 2023 -0800 Revert "nvme-fc: fix race between error recovery and creating association" commit d3e8b1858734bf46cda495be4165787b9a3981a6 upstream. The commit was identified to might sleep in invalid context and is blocking regression testing. This reverts commit ee6fdc5055e916b1dd497f11260d4901c4c1e55e. Link: https://lore.kernel.org/linux-nvme/hkhl56n665uvc6t5d6h3wtx7utkcorw4xlwi7d2t2bnonavhe6@xaan6pu43ap6/ Link: https://lists.infradead.org/pipermail/linux-nvme/2023-December/043756.html Reported-by: Daniel Wagner Reported-by: Maurizio Lombardi Cc: Michael Liang Tested-by: Daniel Wagner Reviewed-by: Daniel Wagner Reviewed-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Signed-off-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit d16c5d215b53b395df51668fc8387dd618867814 Author: Matthew Wilcox (Oracle) Date: Mon Dec 18 13:58:36 2023 +0000 mm/memory-failure: check the mapcount of the precise page commit c79c5a0a00a9457718056b588f312baadf44e471 upstream. A process may map only some of the pages in a folio, and might be missed if it maps the poisoned page but not the head page. Or it might be unnecessarily hit if it maps the head page, but not the poisoned page. Link: https://lkml.kernel.org/r/20231218135837.3310403-3-willy@infradead.org Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage") Signed-off-by: Matthew Wilcox (Oracle) Cc: Dan Williams Cc: Naoya Horiguchi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 8c7da70d9ae4c1abdf62d91d0aa28feee85c7f1b Author: Matthew Wilcox (Oracle) Date: Mon Dec 18 13:58:37 2023 +0000 mm/memory-failure: cast index to loff_t before shifting it commit 39ebd6dce62d8cfe3864e16148927a139f11bc9a upstream. On 32-bit systems, we'll lose the top bits of index because arithmetic will be performed in unsigned long instead of unsigned long long. This affects files over 4GB in size. Link: https://lkml.kernel.org/r/20231218135837.3310403-4-willy@infradead.org Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages") Signed-off-by: Matthew Wilcox (Oracle) Cc: Dan Williams Cc: Naoya Horiguchi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 07550b1461d4d0499165e7d6f7718cfd0e440427 Author: Charan Teja Kalla Date: Thu Dec 14 04:58:41 2023 +0000 mm: migrate high-order folios in swap cache correctly commit fc346d0a70a13d52fe1c4bc49516d83a42cd7c4c upstream. Large folios occupy N consecutive entries in the swap cache instead of using multi-index entries like the page cache. However, if a large folio is re-added to the LRU list, it can be migrated. The migration code was not aware of the difference between the swap cache and the page cache and assumed that a single xas_store() would be sufficient. This leaves potentially many stale pointers to the now-migrated folio in the swap cache, which can lead to almost arbitrary data corruption in the future. This can also manifest as infinite loops with the RCU read lock held. [willy@infradead.org: modifications to the changelog & tweaked the fix] Fixes: 3417013e0d18 ("mm/migrate: Add folio_migrate_mapping()") Link: https://lkml.kernel.org/r/20231214045841.961776-1-willy@infradead.org Signed-off-by: Charan Teja Kalla Signed-off-by: Matthew Wilcox (Oracle) Reported-by: Charan Teja Kalla Closes: https://lkml.kernel.org/r/1700569840-17327-1-git-send-email-quic_charante@quicinc.com Cc: David Hildenbrand Cc: Johannes Weiner Cc: Kirill A. Shutemov Cc: Naoya Horiguchi Cc: Shakeel Butt Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit d16eb52c176ccfd9ba29d0e5830a271ecded5290 Author: Baokun Li Date: Wed Dec 13 14:23:24 2023 +0800 mm/filemap: avoid buffered read/write race to read inconsistent data commit e2c27b803bb664748e090d99042ac128b3f88d92 upstream. The following concurrency may cause the data read to be inconsistent with the data on disk: cpu1 cpu2 ------------------------------|------------------------------ // Buffered write 2048 from 0 ext4_buffered_write_iter generic_perform_write copy_page_from_iter_atomic ext4_da_write_end ext4_da_do_write_end block_write_end __block_commit_write folio_mark_uptodate // Buffered read 4096 from 0 smp_wmb() ext4_file_read_iter set_bit(PG_uptodate, folio_flags) generic_file_read_iter i_size_write // 2048 filemap_read unlock_page(page) filemap_get_pages filemap_get_read_batch folio_test_uptodate(folio) ret = test_bit(PG_uptodate, folio_flags) if (ret) smp_rmb(); // Ensure that the data in page 0-2048 is up-to-date. // New buffered write 2048 from 2048 ext4_buffered_write_iter generic_perform_write copy_page_from_iter_atomic ext4_da_write_end ext4_da_do_write_end block_write_end __block_commit_write folio_mark_uptodate smp_wmb() set_bit(PG_uptodate, folio_flags) i_size_write // 4096 unlock_page(page) isize = i_size_read(inode) // 4096 // Read the latest isize 4096, but without smp_rmb(), there may be // Load-Load disorder resulting in the data in the 2048-4096 range // in the page is not up-to-date. copy_page_to_iter // copyout 4096 In the concurrency above, we read the updated i_size, but there is no read barrier to ensure that the data in the page is the same as the i_size at this point, so we may copy the unsynchronized page out. Hence adding the missing read memory barrier to fix this. This is a Load-Load reordering issue, which only occurs on some weak mem-ordering architectures (e.g. ARM64, ALPHA), but not on strong mem-ordering architectures (e.g. X86). And theoretically the problem doesn't only happen on ext4, filesystems that call filemap_read() but don't hold inode lock (e.g. btrfs, f2fs, ubifs ...) will have this problem, while filesystems with inode lock (e.g. xfs, nfs) won't have this problem. Link: https://lkml.kernel.org/r/20231213062324.739009-1-libaokun1@huawei.com Signed-off-by: Baokun Li Reviewed-by: Jan Kara Cc: Andreas Dilger Cc: Christoph Hellwig Cc: Dave Chinner Cc: Matthew Wilcox (Oracle) Cc: Ritesh Harjani (IBM) Cc: Theodore Ts'o Cc: yangerkun Cc: Yu Kuai Cc: Zhang Yi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 09141f08fdf69a3c4ec58fc53904e5f8f7c680b0 Author: Muhammad Usama Anjum Date: Thu Dec 14 15:19:30 2023 +0500 selftests: secretmem: floor the memory size to the multiple of page_size commit 0aac13add26d546ac74c89d2883b3a5f0fbea039 upstream. The "locked-in-memory size" limit per process can be non-multiple of page_size. The mmap() fails if we try to allocate locked-in-memory with same size as the allowed limit if it isn't multiple of the page_size because mmap() rounds off the memory size to be allocated to next multiple of page_size. Fix this by flooring the length to be allocated with mmap() to the previous multiple of the page_size. This was getting triggered on KernelCI regularly because of different ulimit settings which wasn't multiple of the page_size. Find logs here: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/ The bug in was present from the time test was first added. Link: https://lkml.kernel.org/r/20231214101931.1155586-1-usama.anjum@collabora.com Fixes: 76fe17ef588a ("secretmem: test: add basic selftest for memfd_secret(2)") Signed-off-by: Muhammad Usama Anjum Reported-by: "kernelci.org bot" Closes: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/ Cc: "James E.J. Bottomley" Cc: Mike Rapoport (IBM) Cc: Shuah Khan Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 2c30b8b105d690b5bb69aab31f198e2385be9b03 Author: Sidhartha Kumar Date: Wed Dec 13 12:50:57 2023 -0800 maple_tree: do not preallocate nodes for slot stores commit 4249f13c11be8b8b7bf93204185e150c3bdc968d upstream. mas_preallocate() defaults to requesting 1 node for preallocation and then ,depending on the type of store, will update the request variable. There isn't a check for a slot store type, so slot stores are preallocating the default 1 node. Slot stores do not require any additional nodes, so add a check for the slot store case that will bypass node_count_gfp(). Update the tests to reflect that slot stores do not require allocations. User visible effects of this bug include increased memory usage from the unneeded node that was allocated. Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing") Signed-off-by: Sidhartha Kumar Cc: Liam R. Howlett Cc: Matthew Wilcox (Oracle) Cc: Peng Zhang Cc: [6.6+] Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 11d41d01c088ff1bd2d3de4762117385b657bb44 Author: Shin'ichiro Kawasaki Date: Fri Dec 29 15:39:11 2023 +0900 platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe commit b28ff7a7c3245d7f62acc20f15b4361292fe4117 upstream. p2sb_bar() unhides P2SB device to get resources from the device. It guards the operation by locking pci_rescan_remove_lock so that parallel rescans do not find the P2SB device. However, this lock causes deadlock when PCI bus rescan is triggered by /sys/bus/pci/rescan. The rescan locks pci_rescan_remove_lock and probes PCI devices. When PCI devices call p2sb_bar() during probe, it locks pci_rescan_remove_lock again. Hence the deadlock. To avoid the deadlock, do not lock pci_rescan_remove_lock in p2sb_bar(). Instead, do the lock at fs_initcall. Introduce p2sb_cache_resources() for fs_initcall which gets and caches the P2SB resources. At p2sb_bar(), refer the cache and return to the caller. Suggested-by: Andy Shevchenko Fixes: 9745fb07474f ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support") Cc: stable@vger.kernel.org Signed-off-by: Shin'ichiro Kawasaki Reviewed-by: Andy Shevchenko Reviewed-by: Ilpo Järvinen Link: https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/ Link: https://lore.kernel.org/r/20231229063912.2517922-2-shinichiro.kawasaki@wdc.com Signed-off-by: Ilpo Järvinen Signed-off-by: Greg Kroah-Hartman commit 7d5f219f1ef69f27eb8cbfb794d634fc9c4d24ac Author: Namjae Jeon Date: Wed Dec 20 15:52:11 2023 +0900 ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() commit d10c77873ba1e9e6b91905018e29e196fd5f863d upstream. If ->NameOffset/Length is bigger than ->CreateContextsOffset/Length, ksmbd_check_message doesn't validate request buffer it correctly. So slab-out-of-bounds warning from calling smb_strndup_from_utf16() in smb2_open() could happen. If ->NameLength is non-zero, Set the larger of the two sums (Name and CreateContext size) as the offset and length of the data area. Reported-by: Yang Chaoming Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 33fd5fb1258b32976f2d83f46a59832dde784831 Author: David E. Box Date: Fri Dec 22 19:25:45 2023 -0800 platform/x86/intel/pmc: Move GBE LTR ignore to suspend callback [ Upstream commit 70681aa0746ae61d7668b9f651221fad5e30c71e ] Commit 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()") caused a network performance regression due to the GBE LTR ignore that it added at probe. This was needed in order to allow the SoC to enter the deepest Package C state. To fix the regression and at least support PC10 during suspend, move the LTR ignore from probe to the suspend callback, and enable it again on resume. This solution will allow PC10 during suspend but restrict Package C entry at runtime to no deeper than PC8/9 while a network cable it attach to the PCH LAN. Fixes: 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()") Signed-off-by: "David E. Box" Link: https://lore.kernel.org/r/20231223032548.1680738-6-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit 91dcd5ee1e11c70ea11b1c879c157ee9ff35f38d Author: David E. Box Date: Fri Dec 22 19:25:44 2023 -0800 platform/x86/intel/pmc: Allow reenabling LTRs [ Upstream commit 6f9cc5c1f94daa98846b2073733d03ced709704b ] Commit 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()") caused a network performance regression due to the GBE LTR ignore that it added during probe. The fix will move the ignore to occur at suspend-time (so as to not affect suspend power). This will require the ability to enable the LTR again on resume. Modify pmc_core_send_ltr_ignore() to allow enabling an LTR. Fixes: 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()") Signed-off-by: "David E. Box" Link: https://lore.kernel.org/r/20231223032548.1680738-5-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit 8663b99c38a6f76b7552151e648840af6d137e91 Author: David E. Box Date: Fri Dec 22 19:25:43 2023 -0800 platform/x86/intel/pmc: Add suspend callback [ Upstream commit 7c13f365aee68b01e7e68ee293a71fdc7571c111 ] Add a suspend callback to struct pmc for performing platform specific tasks before device suspend. This is needed in order to perform GBE LTR ignore on certain platforms at suspend-time instead of at probe-time and replace the GBE LTR ignore removal that was done in order to fix a bug introduced by commit 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()"). Fixes: 804951203aa5 ("platform/x86:intel/pmc: Combine core_init() and core_configure()") Signed-off-by: "David E. Box" Link: https://lore.kernel.org/r/20231223032548.1680738-4-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit b5f63f5e8a6820a6093b6caf35bb0f7c946c7d72 Author: Christoph Hellwig Date: Tue Dec 26 08:15:24 2023 +0000 block: renumber QUEUE_FLAG_HW_WC [ Upstream commit 02d374f3418df577c850f0cd45c3da9245ead547 ] For the QUEUE_FLAG_HW_WC to actually work, it needs to have a separate number from QUEUE_FLAG_FUA, doh. Fixes: 43c9835b144c ("block: don't allow enabling a cache on devices that don't support it") Signed-off-by: Christoph Hellwig Link: https://lore.kernel.org/r/20231226081524.180289-1-hch@lst.de Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit cf742d0955850ca2e33410dda6afd360d6cd65bc Author: Paolo Abeni Date: Fri Dec 15 17:04:25 2023 +0100 mptcp: fix inconsistent state on fastopen race [ Upstream commit 4fd19a30701659af5839b7bd19d1f05f05933ebe ] The netlink PM can race with fastopen self-connect attempts, shutting down the first subflow via: MPTCP_PM_CMD_DEL_ADDR -> mptcp_nl_remove_id_zero_address -> mptcp_pm_nl_rm_subflow_received -> mptcp_close_ssk and transitioning such subflow to FIN_WAIT1 status before the syn-ack packet is processed. The MPTCP code does not react to such state change, leaving the connection in not-fallback status and the subflow handshake uncompleted, triggering the following splat: WARNING: CPU: 0 PID: 10630 at net/mptcp/subflow.c:1405 subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405 Modules linked in: CPU: 0 PID: 10630 Comm: kworker/u4:11 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Workqueue: bat_events batadv_nc_worker RIP: 0010:subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405 Code: 18 89 ee e8 e3 d2 21 f7 40 84 ed 75 1f e8 a9 d7 21 f7 44 89 fe bf 07 00 00 00 e8 0c d3 21 f7 41 83 ff 07 74 07 e8 91 d7 21 f7 <0f> 0b e8 8a d7 21 f7 48 89 df e8 d2 b2 ff ff 31 ff 89 c5 89 c6 e8 RSP: 0018:ffffc90000007448 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888031efc700 RCX: ffffffff8a65baf4 RDX: ffff888043222140 RSI: ffffffff8a65baff RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 000000000000000b R11: 0000000000000000 R12: 1ffff92000000e89 R13: ffff88807a534d80 R14: ffff888021c11a00 R15: 000000000000000b FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa19a0ffc81 CR3: 000000007a2db000 CR4: 00000000003506f0 DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: tcp_data_ready+0x14c/0x5b0 net/ipv4/tcp_input.c:5128 tcp_data_queue+0x19c3/0x5190 net/ipv4/tcp_input.c:5208 tcp_rcv_state_process+0x11ef/0x4e10 net/ipv4/tcp_input.c:6844 tcp_v4_do_rcv+0x369/0xa10 net/ipv4/tcp_ipv4.c:1929 tcp_v4_rcv+0x3888/0x3b30 net/ipv4/tcp_ipv4.c:2329 ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2e4/0x510 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip_local_deliver+0x1b6/0x550 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:461 [inline] ip_rcv_finish+0x1c4/0x2e0 net/ipv4/ip_input.c:449 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip_rcv+0xce/0x440 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527 __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641 process_backlog+0x101/0x6b0 net/core/dev.c:5969 __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6531 napi_poll net/core/dev.c:6600 [inline] net_rx_action+0x956/0xe90 net/core/dev.c:6733 __do_softirq+0x21a/0x968 kernel/softirq.c:553 do_softirq kernel/softirq.c:454 [inline] do_softirq+0xaa/0xe0 kernel/softirq.c:441 __local_bh_enable_ip+0xf8/0x120 kernel/softirq.c:381 spin_unlock_bh include/linux/spinlock.h:396 [inline] batadv_nc_purge_paths+0x1ce/0x3c0 net/batman-adv/network-coding.c:471 batadv_nc_worker+0x9b1/0x10e0 net/batman-adv/network-coding.c:722 process_one_work+0x884/0x15c0 kernel/workqueue.c:2630 process_scheduled_works kernel/workqueue.c:2703 [inline] worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784 kthread+0x33c/0x440 kernel/kthread.c:388 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 To address the issue, catch the racing subflow state change and use it to cause the MPTCP fallback. Such fallback is also used to cause the first subflow state propagation to the msk socket via mptcp_set_connected(). After this change, the first subflow can additionally propagate the TCP_FIN_WAIT1 state, so rename the helper accordingly. Finally, if the state propagation is delayed to the msk release callback, the first subflow can change to a different state in between. Cache the relevant target state in a new msk-level field and use such value to update the msk state at release time. Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") Cc: stable@vger.kernel.org Reported-by: Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458 Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 44ee4764c60a784c6da27680968b9ffe0604196a Author: Paolo Abeni Date: Tue Nov 14 00:16:14 2023 +0100 mptcp: fix possible NULL pointer dereference on close [ Upstream commit d109a7767273d1706b541c22b83a0323823dfde4 ] After the blamed commit below, the MPTCP release callback can dereference the first subflow pointer via __mptcp_set_connected() and send buffer auto-tuning. Such pointer is always expected to be valid, except at socket destruction time, when the first subflow is deleted and the pointer zeroed. If the connect event is handled by the release callback while the msk socket is finally released, MPTCP hits the following splat: general protection fault, probably for non-canonical address 0xdffffc00000000f2: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000790-0x0000000000000797] CPU: 1 PID: 26719 Comm: syz-executor.2 Not tainted 6.6.0-syzkaller-10102-gff269e2cd5ad #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 RIP: 0010:mptcp_subflow_ctx net/mptcp/protocol.h:542 [inline] RIP: 0010:__mptcp_propagate_sndbuf net/mptcp/protocol.h:813 [inline] RIP: 0010:__mptcp_set_connected+0x57/0x3e0 net/mptcp/subflow.c:424 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8a62323c RDX: 00000000000000f2 RSI: ffffffff8a630116 RDI: 0000000000000790 RBP: ffff88803334b100 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000034 R12: ffff88803334b198 R13: ffff888054f0b018 R14: 0000000000000000 R15: ffff88803334b100 FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbcb4f75198 CR3: 000000006afb5000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mptcp_release_cb+0xa2c/0xc40 net/mptcp/protocol.c:3405 release_sock+0xba/0x1f0 net/core/sock.c:3537 mptcp_close+0x32/0xf0 net/mptcp/protocol.c:3084 inet_release+0x132/0x270 net/ipv4/af_inet.c:433 inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:485 __sock_release+0xae/0x260 net/socket.c:659 sock_close+0x1c/0x20 net/socket.c:1419 __fput+0x270/0xbb0 fs/file_table.c:394 task_work_run+0x14d/0x240 kernel/task_work.c:180 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa92/0x2a20 kernel/exit.c:876 do_group_exit+0xd4/0x2a0 kernel/exit.c:1026 get_signal+0x23ba/0x2790 kernel/signal.c:2900 arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:309 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x60 kernel/entry/common.c:296 do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:88 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fb515e7cae9 Code: Unable to access opcode bytes at 0x7fb515e7cabf. RSP: 002b:00007fb516c560c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: 000000000000003c RBX: 00007fb515f9c120 RCX: 00007fb515e7cae9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000006 RBP: 00007fb515ec847a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000006e R14: 00007fb515f9c120 R15: 00007ffc631eb968 To avoid sparkling unneeded conditionals, address the issue explicitly checking msk->first only in the critical place. Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning") Cc: stable@vger.kernel.org Reported-by: Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/454 Reported-by: Eric Dumazet Closes: https://lore.kernel.org/netdev/CANn89iLZUA6S2a=K8GObnS62KK6Jt4B7PsAs7meMFooM8xaTgw@mail.gmail.com/ Signed-off-by: Paolo Abeni Reviewed-by: Eric Dumazet Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-2-7b9cd6a7b7f4@kernel.org Signed-off-by: Jakub Kicinski Stable-dep-of: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") Signed-off-by: Sasha Levin commit 34c7757aa56109b6533ad746d2168df185dabfab Author: Paolo Abeni Date: Mon Oct 23 13:44:42 2023 -0700 mptcp: refactor sndbuf auto-tuning [ Upstream commit 8005184fd1ca6aeb3fea36f4eb9463fc1b90c114 ] The MPTCP protocol account for the data enqueued on all the subflows to the main socket send buffer, while the send buffer auto-tuning algorithm set the main socket send buffer size as the max size among the subflows. That causes bad performances when at least one subflow is sndbuf limited, e.g. due to very high latency, as the MPTCP scheduler can't even fill such buffer. Change the send-buffer auto-tuning algorithm to compute the main socket send buffer size as the sum of all the subflows buffer size. Reviewed-by: Mat Martineau Signed-off-by: Paolo Abeni Signed-off-by: Mat Martineau Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-9-9dc60939d371@kernel.org Signed-off-by: Jakub Kicinski Stable-dep-of: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") Signed-off-by: Sasha Levin commit 183c8972b6a6f85de96c8975689eed92a0af0847 Author: Helge Deller Date: Thu Dec 28 11:36:03 2023 +0100 linux/export: Ensure natural alignment of kcrctab array [ Upstream commit 753547de0daecbdbd1af3618987ddade325d9aaa ] The ___kcrctab section holds an array of 32-bit CRC values. Add a .balign 4 to tell the linker the correct memory alignment. Fixes: f3304ecd7f06 ("linux/export: use inline assembler to populate symbol CRCs") Signed-off-by: Helge Deller Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin commit 466e9af1550724e3c47600e26904d740a4cf2a92 Author: Helge Deller Date: Wed Nov 22 23:18:11 2023 +0100 linux/export: Fix alignment for 64-bit ksymtab entries [ Upstream commit f6847807c22f6944c71c981b630b9fff30801e73 ] An alignment of 4 bytes is wrong for 64-bit platforms which don't define CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers). Fix their alignment to 8 bytes. Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") Signed-off-by: Helge Deller Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin commit 7844d7d8d8af0eed290004f5f139bc050e2d98bc Author: Arnd Bergmann Date: Mon Oct 23 13:01:55 2023 +0200 kexec: select CRYPTO from KEXEC_FILE instead of depending on it [ Upstream commit e63bde3d9417f8318d6dd0d0fafa35ebf307aabd ] All other users of crypto code use 'select' instead of 'depends on', so do the same thing with KEXEC_FILE for consistency. In practice this makes very little difference as kernels with kexec support are very likely to also include some other feature that already selects both crypto and crypto_sha256, but being consistent here helps for usability as well as to avoid potential circular dependencies. This reverts the dependency back to what it was originally before commit 74ca317c26a3f ("kexec: create a new config option CONFIG_KEXEC_FILE for new syscall"), which changed changed it with the comment "This should be safer as "select" is not recursive", but that appears to have been done in error, as "select" is indeed recursive, and there are no other dependencies that prevent CRYPTO_SHA256 from being selected here. Link: https://lkml.kernel.org/r/20231023110308.1202042-2-arnd@kernel.org Fixes: 74ca317c26a3f ("kexec: create a new config option CONFIG_KEXEC_FILE for new syscall") Signed-off-by: Arnd Bergmann Reviewed-by: Eric DeVolder Tested-by: Eric DeVolder Acked-by: Baoquan He Cc: Herbert Xu Cc: "David S. Miller" Cc: Albert Ou Cc: Alexander Gordeev Cc: Ard Biesheuvel Cc: Borislav Petkov Cc: Christian Borntraeger Cc: Christophe Leroy Cc: Conor Dooley Cc: Dave Hansen Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Peter Zijlstra Cc: Sven Schnelle Cc: Thomas Gleixner Cc: Vasily Gorbik Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin commit 78422b744ad90f0f99b8081c10a1da9182964efd Author: Arnd Bergmann Date: Mon Oct 23 13:01:54 2023 +0200 kexec: fix KEXEC_FILE dependencies [ Upstream commit c1ad12ee0efc07244be37f69311e6f7c4ac98e62 ] The cleanup for the CONFIG_KEXEC Kconfig logic accidentally changed the 'depends on CRYPTO=y' dependency to a plain 'depends on CRYPTO', which causes a link failure when all the crypto support is in a loadable module and kexec_file support is built-in: x86_64-linux-ld: vmlinux.o: in function `__x64_sys_kexec_file_load': (.text+0x32e30a): undefined reference to `crypto_alloc_shash' x86_64-linux-ld: (.text+0x32e58e): undefined reference to `crypto_shash_update' x86_64-linux-ld: (.text+0x32e6ee): undefined reference to `crypto_shash_final' Both s390 and x86 have this problem, while ppc64 and riscv have the correct dependency already. On riscv, the dependency is only used for the purgatory, not for the kexec_file code itself, which may be a bit surprising as it means that with CONFIG_CRYPTO=m, it is possible to enable KEXEC_FILE but then the purgatory code is silently left out. Move this into the common Kconfig.kexec file in a way that is correct everywhere, using the dependency on CRYPTO_SHA256=y only when the purgatory code is available. This requires reversing the dependency between ARCH_SUPPORTS_KEXEC_PURGATORY and KEXEC_FILE, but the effect remains the same, other than making riscv behave like the other ones. On s390, there is an additional dependency on CRYPTO_SHA256_S390, which should technically not be required but gives better performance. Remove this dependency here, noting that it was not present in the initial Kconfig code but was brought in without an explanation in commit 71406883fd357 ("s390/kexec_file: Add kexec_file_load system call"). [arnd@arndb.de: fix riscv build] Link: https://lkml.kernel.org/r/67ddd260-d424-4229-a815-e3fcfb864a77@app.fastmail.com Link: https://lkml.kernel.org/r/20231023110308.1202042-1-arnd@kernel.org Fixes: 6af5138083005 ("x86/kexec: refactor for kernel/Kconfig.kexec") Signed-off-by: Arnd Bergmann Reviewed-by: Eric DeVolder Tested-by: Eric DeVolder Cc: Albert Ou Cc: Alexander Gordeev Cc: Ard Biesheuvel Cc: Borislav Petkov Cc: Christian Borntraeger Cc: Christophe Leroy Cc: Conor Dooley Cc: Dave Hansen Cc: David S. Miller Cc: Heiko Carstens Cc: Herbert Xu Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Peter Zijlstra Cc: Sven Schnelle Cc: Thomas Gleixner Cc: Vasily Gorbik Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin commit 28d6cde17f219133a68d530b575e8725fe17a90f Author: Xuan Zhuo Date: Fri Dec 1 11:33:03 2023 +0800 virtio_ring: fix syncs DMA memory with different direction [ Upstream commit 1f475cd572ea77ae6474a17e693a96bca927efe9 ] Now the APIs virtqueue_dma_sync_single_range_for_{cpu,device} ignore the parameter 'dir', that is a mistake. [ 6.101666] ------------[ cut here ]------------ [ 6.102079] DMA-API: virtio-pci 0000:00:04.0: device driver syncs DMA memory with different direction [device address=0x00000000ae010000] [size=32752 bytes] [mapped with DMA_FROM_DEVICE] [synced with DMA_BIDIRECTIONAL] [ 6.103630] WARNING: CPU: 6 PID: 0 at kernel/dma/debug.c:1125 check_sync+0x53e/0x6c0 [ 6.107420] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G E 6.6.0+ #290 [ 6.108030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 6.108936] RIP: 0010:check_sync+0x53e/0x6c0 [ 6.109289] Code: 24 10 e8 f5 d9 74 00 4c 8b 4c 24 10 4c 8b 44 24 18 48 8b 4c 24 20 48 89 c6 41 56 4c 89 ea 48 c7 c7 b0 f1 50 82 e8 32 fc f3 ff <0f> 0b 48 c7 c7 48 4b 4a 82 e8 74 d9 fc ff 8b 73 4c 48 8d 7b 50 31 [ 6.110750] RSP: 0018:ffffc90000180cd8 EFLAGS: 00010092 [ 6.111178] RAX: 00000000000000ce RBX: ffff888100aa5900 RCX: 0000000000000000 [ 6.111744] RDX: 0000000000000104 RSI: ffffffff824c3208 RDI: 00000000ffffffff [ 6.112316] RBP: ffffc90000180d40 R08: 0000000000000000 R09: 00000000fffeffff [ 6.112893] R10: ffffc90000180b98 R11: ffffffff82f63308 R12: ffffffff83d5af00 [ 6.113460] R13: ffff888100998200 R14: ffffffff824a4b5f R15: 0000000000000286 [ 6.114027] FS: 0000000000000000(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000 [ 6.114665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.115128] CR2: 00007f10f1e03030 CR3: 0000000108272004 CR4: 0000000000770ee0 [ 6.115701] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6.116272] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6.116842] PKRU: 55555554 [ 6.117069] Call Trace: [ 6.117275] [ 6.117452] ? __warn+0x84/0x140 [ 6.117727] ? check_sync+0x53e/0x6c0 [ 6.118034] ? __report_bug+0xea/0x100 [ 6.118353] ? check_sync+0x53e/0x6c0 [ 6.118653] ? report_bug+0x41/0xc0 [ 6.118944] ? handle_bug+0x3c/0x70 [ 6.119237] ? exc_invalid_op+0x18/0x70 [ 6.119551] ? asm_exc_invalid_op+0x1a/0x20 [ 6.119900] ? check_sync+0x53e/0x6c0 [ 6.120199] ? check_sync+0x53e/0x6c0 [ 6.120499] debug_dma_sync_single_for_cpu+0x5c/0x70 [ 6.120906] ? dma_sync_single_for_cpu+0xb7/0x100 [ 6.121291] virtnet_rq_unmap+0x158/0x170 [virtio_net] [ 6.121716] virtnet_receive+0x196/0x220 [virtio_net] [ 6.122135] virtnet_poll+0x48/0x1b0 [virtio_net] [ 6.122524] __napi_poll+0x29/0x1b0 [ 6.123083] net_rx_action+0x282/0x360 [ 6.123612] __do_softirq+0xf3/0x2fb [ 6.124138] __irq_exit_rcu+0x8e/0xf0 [ 6.124663] common_interrupt+0xbc/0xe0 [ 6.125202] We need to enable CONFIG_DMA_API_DEBUG and work with need sync mode(such as swiotlb) to reproduce this warn. Fixes: 8bd2f71054bd ("virtio_ring: introduce dma sync api for virtqueue") Reported-by: "Ning, Hongyu" Closes: https://lore.kernel.org/all/f37cb55a-6fc8-4e21-8789-46d468325eea@linux.intel.com/ Suggested-by: Jason Wang Signed-off-by: Xuan Zhuo Message-Id: <20231201033303.25141-1-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Parav Pandit Acked-by: Jason Wang Tested-by: Hongyu Ning Signed-off-by: Sasha Levin commit 9a49874443307ca97d5bc199fa4c5ae1591ff5a0 Author: Zizhi Wo Date: Wed Dec 13 10:23:53 2023 +0800 fs: cifs: Fix atime update check [ Upstream commit 01fe654f78fd1ea4df046ef76b07ba92a35f8dbe ] Commit 9b9c5bea0b96 ("cifs: do not return atime less than mtime") indicates that in cifs, if atime is less than mtime, some apps will break. Therefore, it introduce a function to compare this two variables in two places where atime is updated. If atime is less than mtime, update it to mtime. However, the patch was handled incorrectly, resulting in atime and mtime being exactly equal. A previous commit 69738cfdfa70 ("fs: cifs: Fix atime update check vs mtime") fixed one place and forgot to fix another. Fix it. Fixes: 9b9c5bea0b96 ("cifs: do not return atime less than mtime") Cc: stable@vger.kernel.org Signed-off-by: Zizhi Wo Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 23171df51f601c92177e4c810d4c683a198de1a0 Author: Jeff Layton Date: Wed Oct 4 14:52:53 2023 -0400 client: convert to new timestamp accessors [ Upstream commit 8f22ce7088835444418f0775efb455d10b825596 ] Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton Link: https://lore.kernel.org/r/20231004185347.80880-66-jlayton@kernel.org Signed-off-by: Christian Brauner Stable-dep-of: 01fe654f78fd ("fs: cifs: Fix atime update check") Signed-off-by: Sasha Levin commit 5b5599a7eee5e6101c6ae738682331bc6299a54d Author: Jeff Layton Date: Wed Oct 4 14:52:37 2023 -0400 fs: new accessor methods for atime and mtime [ Upstream commit 077c212f0344ae4198b2b51af128a94b614ccdf4 ] Recently, we converted the ctime accesses in the kernel to use new accessor functions. Linus recently pointed out though that if we add accessors for the atime and mtime, then that would allow us to seamlessly change how these timestamps are stored in the inode. Add new accessor functions for the atime and mtime that mirror the accessors for the ctime. Signed-off-by: Jeff Layton Link: https://lore.kernel.org/r/20231004185239.80830-1-jlayton@kernel.org Signed-off-by: Christian Brauner Stable-dep-of: 01fe654f78fd ("fs: cifs: Fix atime update check") Signed-off-by: Sasha Levin commit 861eaba7ca6c6f96537cd14d16503426b995dc1d Author: Namjae Jeon Date: Sun Dec 31 16:19:19 2023 +0900 ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() [ Upstream commit 658609d9a618d8881bf549b5893c0ba8fcff4526 ] opinfo_put() could be called twice on error of smb21_lease_break_ack(). It will cause UAF issue if opinfo is referenced on other places. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit ab5a0a1c40befa1fdfe9e018c2fab52c5fc3d44b Author: Namjae Jeon Date: Sun Dec 31 16:19:18 2023 +0900 ksmbd: lazy v2 lease break on smb2_write() [ Upstream commit c2a721eead71202a0d8ddd9b56ec8dce652c71d1 ] Don't immediately send directory lease break notification on smb2_write(). Instead, It postpones it until smb2_close(). Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 3c1e602a34e1ecfdede19fbf1b96d31bc8c21f89 Author: Namjae Jeon Date: Sun Dec 31 16:19:17 2023 +0900 ksmbd: send v2 lease break notification for directory [ Upstream commit d47d9886aeef79feba7adac701a510d65f3682b5 ] If client send different parent key, different client guid, or there is no parent lease key flags in create context v2 lease, ksmbd send lease break to client. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 572388ff429a3d39216e736a3b6d052a9f5c025b Author: Namjae Jeon Date: Sun Dec 31 16:19:16 2023 +0900 ksmbd: downgrade RWH lease caching state to RH for directory [ Upstream commit eb547407f3572d2110cb1194ecd8865b3371a7a4 ] RWH(Read + Write + Handle) caching state is not supported for directory. ksmbd downgrade it to RH for directory if client send RWH caching lease state. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit d7af4e499c308a38b07e585096057045ec303591 Author: Namjae Jeon Date: Sun Dec 31 16:19:15 2023 +0900 ksmbd: set v2 lease capability [ Upstream commit 18dd1c367c31d0a060f737d48345747662369b64 ] Set SMB2_GLOBAL_CAP_DIRECTORY_LEASING to ->capabilities to inform server support directory lease to client. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit bc025d49c507b7830c63382eeaddce972dccc143 Author: Namjae Jeon Date: Sun Dec 31 16:19:14 2023 +0900 ksmbd: set epoch in create context v2 lease [ Upstream commit d045850b628aaf931fc776c90feaf824dca5a1cf ] To support v2 lease(directory lease), ksmbd set epoch in create context v2 lease response. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 3da84670973ba926ce421077e57cb769f7595ccb Author: Namjae Jeon Date: Sun Dec 31 16:19:13 2023 +0900 ksmbd: don't update ->op_state as OPLOCK_STATE_NONE on error [ Upstream commit cd80ce7e68f1624ac29cd0a6b057789d1236641e ] ksmbd set ->op_state as OPLOCK_STATE_NONE on lease break ack error. op_state of lease should not be updated because client can send lease break ack again. This patch fix smb2.lease.breaking2 test failure. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit b06c9637317952c29d8a2ffbdc604c4328aee6c9 Author: Namjae Jeon Date: Sun Dec 31 16:19:12 2023 +0900 ksmbd: move setting SMB2_FLAGS_ASYNC_COMMAND and AsyncId [ Upstream commit 9ac45ac7cf65b0623ceeab9b28b307a08efa22dc ] Directly set SMB2_FLAGS_ASYNC_COMMAND flags and AsyncId in smb2 header of interim response instead of current response header. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit fa86141f357f91abe3847a01d0e183a686d3519c Author: Namjae Jeon Date: Sun Dec 31 16:19:11 2023 +0900 ksmbd: release interim response after sending status pending response [ Upstream commit 2a3f7857ec742e212d6cee7fbbf7b0e2ae7f5161 ] Add missing release async id and delete interim response entry after sending status pending response. This only cause when smb2 lease is enable. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit e4ae1953755803378b7e0200bddf26e1f61927f4 Author: Namjae Jeon Date: Sun Dec 31 16:19:10 2023 +0900 ksmbd: move oplock handling after unlock parent dir [ Upstream commit 2e450920d58b4991a436c8cecf3484bcacd8e535 ] ksmbd should process secound parallel smb2 create request during waiting oplock break ack. parent lock range that is too large in smb2_open() causes smb2_open() to be serialized. Move the oplock handling to the bottom of smb2_open() and make it called after parent unlock. This fixes the failure of smb2.lease.breaking1 testcase. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit f263652dc6c95d973dcd97ec4bcc9b2c66471a3e Author: Namjae Jeon Date: Sun Dec 31 16:19:09 2023 +0900 ksmbd: separately allocate ci per dentry [ Upstream commit 4274a9dc6aeb9fea66bffba15697a35ae8983b6a ] xfstests generic/002 test fail when enabling smb2 leases feature. This test create hard link file, but removeal failed. ci has a file open count to count file open through the smb client, but in the case of hard link files, The allocation of ci per inode cause incorrectly open count for file deletion. This patch allocate ci per dentry to counts open counts for hard link. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 8d69547b94e079a61e43d017f92ecc86256b1eea Author: Zongmin Zhou Date: Sun Dec 31 16:19:08 2023 +0900 ksmbd: prevent memory leak on error return [ Upstream commit 90044481e7cca6cb3125b3906544954a25f1309f ] When allocated memory for 'new' failed,just return will cause memory leak of 'ar'. Fixes: 1819a9042999 ("ksmbd: reorganize ksmbd_iov_pin_rsp()") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202311031837.H3yo7JVl-lkp@intel.com/ Signed-off-by: Zongmin Zhou Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit cdb93ef9cfccc1338f49c6780af3d8e0bc34024a Author: Namjae Jeon Date: Sun Dec 31 16:19:07 2023 +0900 ksmbd: fix kernel-doc comment of ksmbd_vfs_kern_path_locked() [ Upstream commit f6049712e520287ad695e9d4f1572ab76807fa0c ] Fix argument list that the kdoc format and script verified in ksmbd_vfs_kern_path_locked(). fs/smb/server/vfs.c:1207: warning: Function parameter or member 'parent_path' not described in 'ksmbd_vfs_kern_path_locked' Reported-by: kernel test robot Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit b48bb8c2ecdb7dd0bd00a642819a71a228570d9c Author: Namjae Jeon Date: Sun Dec 31 16:19:06 2023 +0900 ksmbd: no need to wait for binded connection termination at logoff [ Upstream commit 67797da8a4b82446d42c52b6ee1419a3100d78ff ] The connection could be binded to the existing session for Multichannel. session will be destroyed when binded connections are released. So no need to wait for that's connection at logoff. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 0bd595cb8e8bc6719262c4ac0949eeeb5f8fb385 Author: Namjae Jeon Date: Sun Dec 31 16:19:05 2023 +0900 ksmbd: add support for surrogate pair conversion [ Upstream commit 0c180317c654a494fe429adbf7bc9b0793caf9e2 ] ksmbd is missing supporting to convert filename included surrogate pair characters. It triggers a "file or folder does not exist" error in Windows client. [Steps to Reproduce for bug] 1. Create surrogate pair file touch $(echo -e '\xf0\x9d\x9f\xa3') touch $(echo -e '\xf0\x9d\x9f\xa4') 2. Try to open these files in ksmbd share through Windows client. This patch update unicode functions not to consider about surrogate pair (and IVS). Reviewed-by: Marios Makassikis Tested-by: Marios Makassikis Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit dca63bad39501aee8bc5069c69680400dcc2fc86 Author: Kangjing Huang Date: Sun Dec 31 16:19:04 2023 +0900 ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev() [ Upstream commit ecce70cf17d91c3dd87a0c4ea00b2d1387729701 ] Physical ib_device does not have an underlying net_device, thus its association with IPoIB net_device cannot be retrieved via ops.get_netdev() or ib_device_get_by_netdev(). ksmbd reads physical ib_device port GUID from the lower 16 bytes of the hardware addresses on IPoIB net_device and match its underlying ib_device using ib_find_gid() Signed-off-by: Kangjing Huang Acked-by: Namjae Jeon Reviewed-by: Tom Talpey Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 31c453b3743fbc04935f3f5d2dd4c34ef7851e4b Author: Namjae Jeon Date: Sun Dec 31 16:19:03 2023 +0900 ksmbd: fix kernel-doc comment of ksmbd_vfs_setxattr() [ Upstream commit 3354db668808d5b6d7c5e0cb19ff4c9da4bb5e58 ] Fix argument list that the kdoc format and script verified in ksmbd_vfs_setxattr(). fs/smb/server/vfs.c:929: warning: Function parameter or member 'path' not described in 'ksmbd_vfs_setxattr' Reported-by: kernel test robot Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit d73737884ea4d51c81a382259c64bf1c5804f961 Author: Namjae Jeon Date: Sun Dec 31 16:19:02 2023 +0900 ksmbd: reorganize ksmbd_iov_pin_rsp() [ Upstream commit 1819a904299942b309f687cc0f08b123500aa178 ] If ksmbd_iov_pin_rsp fail, io vertor should be rollback. This patch moves memory allocations to before setting the io vector to avoid rollbacks. Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound") Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 3ba08c420d05bbefcb9421a740be6788ee33f7df Author: Cheng-Han Wu Date: Sun Dec 31 16:19:01 2023 +0900 ksmbd: Remove unused field in ksmbd_user struct [ Upstream commit eacc655e18d1dec9b50660d16a1ddeeb4d6c48f2 ] fs/smb/server/mgmt/user_config.h:21: Remove the unused field 'failed_login_count' from the ksmbd_user struct. Signed-off-by: Cheng-Han Wu Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin