In the Linux kernel, the following vulnerability has been resolved:
tun: Fix memory leak for detached NAPI queue.
syzkaller reported [0] memory leaks of sk and skb related to the TUN
device with no repro, but we can reproduce it easily with:
struct ifreq ifr = {}
int fd_tun, fd_tmp;
char buf[4] = {};
fd_tun = openat(AT_FDCWD, "/dev/net/tun", O_WRONLY, 0);
ifr.ifr_flags = IFF_TUN | IFF_NAPI | IFF_MULTI_QUEUE;
ioctl(fd_tun, TUNSETIFF, &ifr);
ifr.ifr_flags = IFF_DETACH_QUEUE;
ioctl(fd_tun, TUNSETQUEUE, &ifr);
fd_tmp = socket(AF_PACKET, SOCK_PACKET, 0);
ifr.ifr_flags = IFF_UP;
ioctl(fd_tmp, SIOCSIFFLAGS, &ifr);
write(fd_tun, buf, sizeof(buf));
close(fd_tun);
If we enable NAPI and multi-queue on a TUN device, we can put skb into
tfile->sk.sk_write_queue after the queue is detached. We should prevent
it by checking tfile->detached before queuing skb.
Note this must be done under tfile->sk.sk_write_queue.lock because write()
and ioctl(IFF_DETACH_QUEUE) can run concurrently. Otherwise, there would
be a small race window:
write() ioctl(IFF_DETACH_QUEUE)
`- tun_get_user `- __tun_detach
|- if (tfile->detached) |- tun_disable_queue
| `-> false | `- tfile->detached = tun
| `- tun_queue_purge
|- spin_lock_bh(&queue->lock)
`- __skb_queue_tail(queue, skb)
Another solution is to call tun_queue_purge() when closing and
reattaching the detached queue, but it could paper over another
problems. Also, we do the same kind of test for IFF_NAPI_FRAGS.
[0]:
unreferenced object 0xffff88801edbc800 (size 2048):
comm "syz-executor.1", pid 33269, jiffies 4295743834 (age 18.756s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
backtrace:
[<000000008c16ea3d>] __do_kmalloc_node mm/slab_common.c:965 [inline]
[<000000008c16ea3d>] __kmalloc+0x4a/0x130 mm/slab_common.c:979
[<000000003addde56>] kmalloc include/linux/slab.h:563 [inline]
[<000000003addde56>] sk_prot_alloc+0xef/0x1b0 net/core/sock.c:2035
[<000000003e20621f>] sk_alloc+0x36/0x2f0 net/core/sock.c:2088
[<0000000028e43843>] tun_chr_open+0x3d/0x190 drivers/net/tun.c:3438
[<000000001b0f1f28>] misc_open+0x1a6/0x1f0 drivers/char/misc.c:165
[<000000004376f706>] chrdev_open+0x111/0x300 fs/char_dev.c:414
[<00000000614d379f>] do_dentry_open+0x2f9/0x750 fs/open.c:920
[<000000008eb24774>] do_open fs/namei.c:3636 [inline]
[<000000008eb24774>] path_openat+0x143f/0x1a30 fs/namei.c:3791
[<00000000955077b5>] do_filp_open+0xce/0x1c0 fs/namei.c:3818
[<00000000b78973b0>] do_sys_openat2+0xf0/0x260 fs/open.c:1356
[<00000000057be699>] do_sys_open fs/open.c:1372 [inline]
[<00000000057be699>] __do_sys_openat fs/open.c:1388 [inline]
[<00000000057be699>] __se_sys_openat fs/open.c:1383 [inline]
[<00000000057be699>] __x64_sys_openat+0x83/0xf0 fs/open.c:1383
[<00000000a7d2182d>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<00000000a7d2182d>] do_syscall_64+0x3c/0x90 arch/x86/entry/common.c:80
[<000000004cc4e8c4>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
unreferenced object 0xffff88802f671700 (size 240):
comm "syz-executor.1", pid 33269, jiffies 4295743854 (age 18.736s)
hex dump (first 32 bytes):
68 c9 db 1e 80 88 ff ff 68 c9 db 1e 80 88 ff ff h.......h.......
00 c0 7b 2f 80 88 ff ff 00 c8 db 1e 80 88 ff ff ..{/............
backtrace:
[<00000000e9d9fdb6>] __alloc_skb+0x223/0x250 net/core/skbuff.c:644
[<000000002c3e4e0b>] alloc_skb include/linux/skbuff.h:1288 [inline]
[<000000002c3e4e0b>] alloc_skb_with_frags+0x6f/0x350 net/core/skbuff.c:6378
[<00000000825f98d7>] sock_alloc_send_pskb+0x3ac/0x3e0 net/core/sock.c:2729
[<00000000e9eb3df3>] tun_alloc_skb drivers/net/tun.c:1529 [inline]
[<
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
xfrm: Zero padding when dumping algos and encap
When copying data to user-space we should ensure that only valid
data is copied over. Padding in structures may be filled with
random (possibly sensitve) data and should never be given directly
to user-space.
This patch fixes the copying of xfrm algorithms and the encap
template in xfrm_user so that padding is zeroed.
In the Linux kernel, the following vulnerability has been resolved:
fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode()
syzbot is hitting WARN_ON() in hfsplus_cat_{read,write}_inode(), for
crafted filesystem image can contain bogus length. There conditions are
not kernel bugs that can justify kernel to panic.
In the Linux kernel, the following vulnerability has been resolved:
hwmon: (xgene) Fix ioremap and memremap leak
Smatch reports:
drivers/hwmon/xgene-hwmon.c:757 xgene_hwmon_probe() warn:
'ctx->pcc_comm_addr' from ioremap() not released on line: 757.
This is because in drivers/hwmon/xgene-hwmon.c:701 xgene_hwmon_probe(),
ioremap and memremap is not released, which may cause a leak.
To fix this, ioremap and memremap is modified to devm_ioremap and
devm_memremap.
[groeck: Fixed formatting and subject]
In the Linux kernel, the following vulnerability has been resolved:
bcache: Fix __bch_btree_node_alloc to make the failure behavior consistent
In some specific situations, the return value of __bch_btree_node_alloc
may be NULL. This may lead to a potential NULL pointer dereference in
caller function like a calling chain :
btree_split->bch_btree_node_alloc->__bch_btree_node_alloc.
Fix it by initializing the return value in __bch_btree_node_alloc.
In the Linux kernel, the following vulnerability has been resolved:
NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
OPDESC() simply indexes into nfsd4_ops[] by the op's operation
number, without range checking that value. It assumes callers are
careful to avoid calling it with an out-of-bounds opnum value.
nfsd4_decode_compound() is not so careful, and can invoke OPDESC()
with opnum set to OP_ILLEGAL, which is 10044 -- well beyond the end
of nfsd4_ops[].
In the Linux kernel, the following vulnerability has been resolved:
drm/i915: Fix memory leaks in i915 selftests
This patch fixes memory leaks on error escapes in function fake_get_pages
(cherry picked from commit 8bfbdadce85c4c51689da10f39c805a7106d4567)
In the Linux kernel, the following vulnerability has been resolved:
scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show()
The function lio_target_nacl_info_show() uses sprintf() in a loop to print
details for every iSCSI connection in a session without checking for the
buffer length. With enough iSCSI connections it's possible to overflow the
buffer provided by configfs and corrupt the memory.
This patch replaces sprintf() with sysfs_emit_at() that checks for buffer
boundries.
In the Linux kernel, the following vulnerability has been resolved:
scsi: ses: Fix possible desc_ptr out-of-bounds accesses
Sanitize possible desc_ptr out-of-bounds accesses in
ses_enclosure_data_process().
In the Linux kernel, the following vulnerability has been resolved:
clk: Fix memory leak in devm_clk_notifier_register()
devm_clk_notifier_register() allocates a devres resource for clk
notifier but didn't register that to the device, so the notifier didn't
get unregistered on device detach and the allocated resource was leaked.
Fix the issue by registering the resource through devres_add().
This issue was found with kmemleak on a Chromebook.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: output extra debug info if we failed to find an inline backref
[BUG]
Syzbot reported several warning triggered inside
lookup_inline_extent_backref().
[CAUSE]
As usual, the reproducer doesn't reliably trigger locally here, but at
least we know the WARN_ON() is triggered when an inline backref can not
be found, and it can only be triggered when @insert is true. (I.e.
inserting a new inline backref, which means the backref should already
exist)
[ENHANCEMENT]
After the WARN_ON(), dump all the parameters and the extent tree
leaf to help debug.
In the Linux kernel, the following vulnerability has been resolved:
tcp: fix skb_copy_ubufs() vs BIG TCP
David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy
using hugepages, and skb length bigger than ~68 KB.
skb_copy_ubufs() assumed it could copy all payload using up to
MAX_SKB_FRAGS order-0 pages.
This assumption broke when BIG TCP was able to put up to 512 KB per skb.
We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45
and limit gso_max_size to 180000.
A solution is to use higher order pages if needed.
v2: add missing __GFP_COMP, or we leak memory.
In the Linux kernel, the following vulnerability has been resolved:
ring-buffer: Fix deadloop issue on reading trace_pipe
Soft lockup occurs when reading file 'trace_pipe':
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
[...]
RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
[...]
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__find_next_entry+0x1a8/0x4b0
? peek_next_entry+0x250/0x250
? down_write+0xa5/0x120
? down_write_killable+0x130/0x130
trace_find_next_entry_inc+0x3b/0x1d0
tracing_read_pipe+0x423/0xae0
? tracing_splice_read_pipe+0xcb0/0xcb0
vfs_read+0x16b/0x490
ksys_read+0x105/0x210
? __ia32_sys_pwrite64+0x200/0x200
? switch_fpu_return+0x108/0x220
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Through the vmcore, I found it's because in tracing_read_pipe(),
ring_buffer_empty_cpu() found some buffer is not empty but then it
cannot read anything due to "rb_num_of_entries() == 0" always true,
Then it infinitely loop the procedure due to user buffer not been
filled, see following code path:
tracing_read_pipe() {
... ...
waitagain:
tracing_wait_pipe() // 1. find non-empty buffer here
trace_find_next_entry_inc() // 2. loop here try to find an entry
__find_next_entry()
ring_buffer_empty_cpu(); // 3. find non-empty buffer
peek_next_entry() // 4. but peek always return NULL
ring_buffer_peek()
rb_buffer_peek()
rb_get_reader_page()
// 5. because rb_num_of_entries() == 0 always true here
// then return NULL
// 6. user buffer not been filled so goto 'waitgain'
// and eventually leads to an deadloop in kernel!!!
}
By some analyzing, I found that when resetting ringbuffer, the 'entries'
of its pages are not all cleared (see rb_reset_cpu()). Then when reducing
the ringbuffer, and if some reduced pages exist dirty 'entries' data, they
will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which
cause wrong 'overrun' count and eventually cause the deadloop issue.
To fix it, we need to clear every pages in rb_reset_cpu().
In the Linux kernel, the following vulnerability has been resolved:
net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize
Currently in cdc_ncm_check_tx_max(), if dwNtbOutMaxSize is lower than
the calculated "min" value, but greater than zero, the logic sets
tx_max to dwNtbOutMaxSize. This is then used to allocate a new SKB in
cdc_ncm_fill_tx_frame() where all the data is handled.
For small values of dwNtbOutMaxSize the memory allocated during
alloc_skb(dwNtbOutMaxSize, GFP_ATOMIC) will have the same size, due to
how size is aligned at alloc time:
size = SKB_DATA_ALIGN(size);
size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
Thus we hit the same bug that we tried to squash with
commit 2be6d4d16a084 ("net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero")
Low values of dwNtbOutMaxSize do not cause an issue presently because at
alloc_skb() time more memory (512b) is allocated than required for the
SKB headers alone (320b), leaving some space (512b - 320b = 192b)
for CDC data (172b).
However, if more elements (for example 3 x u64 = [24b]) were added to
one of the SKB header structs, say 'struct skb_shared_info',
increasing its original size (320b [320b aligned]) to something larger
(344b [384b aligned]), then suddenly the CDC data (172b) no longer
fits in the spare SKB data area (512b - 384b = 128b).
Consequently the SKB bounds checking semantics fails and panics:
skbuff: skb_over_panic: text:ffffffff831f755b len:184 put:172 head:ffff88811f1c6c00 data:ffff88811f1c6c00 tail:0xb8 end:0x80 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:113!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 57 Comm: kworker/0:2 Not tainted 5.15.106-syzkaller-00249-g19c0ed55a470 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
Workqueue: mld mld_ifc_work
RIP: 0010:skb_panic net/core/skbuff.c:113 [inline]
RIP: 0010:skb_over_panic+0x14c/0x150 net/core/skbuff.c:118
[snip]
Call Trace:
<TASK>
skb_put+0x151/0x210 net/core/skbuff.c:2047
skb_put_zero include/linux/skbuff.h:2422 [inline]
cdc_ncm_ndp16 drivers/net/usb/cdc_ncm.c:1131 [inline]
cdc_ncm_fill_tx_frame+0x11ab/0x3da0 drivers/net/usb/cdc_ncm.c:1308
cdc_ncm_tx_fixup+0xa3/0x100
Deal with too low values of dwNtbOutMaxSize, clamp it in the range
[USB_CDC_NCM_NTB_MIN_OUT_SIZE, CDC_NCM_NTB_MAX_SIZE_TX]. We ensure
enough data space is allocated to handle CDC data by making sure
dwNtbOutMaxSize is not smaller than USB_CDC_NCM_NTB_MIN_OUT_SIZE.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: codecs: wcd938x: fix missing mbhc init error handling
MBHC initialisation can fail so add the missing error handling to avoid
dereferencing an error pointer when later configuring the jack:
Unable to handle kernel paging request at virtual address fffffffffffffff8
pc : wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc]
lr : wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x]
Call trace:
wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc]
wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x]
snd_soc_component_set_jack+0x28/0x8c [snd_soc_core]
qcom_snd_wcd_jack_setup+0x7c/0x19c [snd_soc_qcom_common]
sc8280xp_snd_init+0x20/0x2c [snd_soc_sc8280xp]
snd_soc_link_init+0x28/0x90 [snd_soc_core]
snd_soc_bind_card+0x628/0xbfc [snd_soc_core]
snd_soc_register_card+0xec/0x104 [snd_soc_core]
devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
sc8280xp_platform_probe+0xf0/0x108 [snd_soc_sc8280xp]
In the Linux kernel, the following vulnerability has been resolved:
md: don't dereference mddev after export_rdev()
Except for initial reference, mddev->kobject is referenced by
rdev->kobject, and if the last rdev is freed, there is no guarantee that
mddev is still valid. Hence mddev should not be used anymore after
export_rdev().
This problem can be triggered by following test for mdadm at very
low rate:
New file: mdadm/tests/23rdev-lifetime
devname=${dev0##*/}
devt=`cat /sys/block/$devname/dev`
pid=""
runtime=2
clean_up_test() {
pill -9 $pid
echo clear > /sys/block/md0/md/array_state
}
trap 'clean_up_test' EXIT
add_by_sysfs() {
while true; do
echo $devt > /sys/block/md0/md/new_dev
done
}
remove_by_sysfs(){
while true; do
echo remove > /sys/block/md0/md/dev-${devname}/state
done
}
echo md0 > /sys/module/md_mod/parameters/new_array || die "create md0 failed"
add_by_sysfs &
pid="$pid $!"
remove_by_sysfs &
pid="$pid $!"
sleep $runtime
exit 0
Test cmd:
./test --save-logs --logdir=/tmp/ --keep-going --dev=loop --tests=23rdev-lifetime
Test result:
general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bcb: 0000 [#4] PREEMPT SMP
CPU: 0 PID: 1292 Comm: test Tainted: G D W 6.5.0-rc2-00121-g01e55c376936 #562
RIP: 0010:md_wakeup_thread+0x9e/0x320 [md_mod]
Call Trace:
<TASK>
mddev_unlock+0x1b6/0x310 [md_mod]
rdev_attr_store+0xec/0x190 [md_mod]
sysfs_kf_write+0x52/0x70
kernfs_fop_write_iter+0x19a/0x2a0
vfs_write+0x3b5/0x770
ksys_write+0x74/0x150
__x64_sys_write+0x22/0x30
do_syscall_64+0x40/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fix this problem by don't dereference mddev after export_rdev().
In the Linux kernel, the following vulnerability has been resolved:
OPP: Fix potential null ptr dereference in dev_pm_opp_get_required_pstate()
"opp" pointer is dereferenced before the IS_ERR_OR_NULL() check. Fix it by
removing the dereference to cache opp_table and dereference it directly
where opp_table is used.
This fixes the following smatch warning:
drivers/opp/core.c:232 dev_pm_opp_get_required_pstate() warn: variable
dereferenced before IS_ERR check 'opp' (see line 230)
In the Linux kernel, the following vulnerability has been resolved:
KVM: nSVM: Check instead of asserting on nested TSC scaling support
Check for nested TSC scaling support on nested SVM VMRUN instead of
asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO
has diverged from KVM's default. Userspace can trigger the WARN at will
by writing the MSR and then updating guest CPUID to hide the feature
(modifying guest CPUID is allowed anytime before KVM_RUN). E.g. hacking
KVM's state_test selftest to do
vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);
after restoring state in a new VM+vCPU yields an endless supply of:
------------[ cut here ]------------
WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699
nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd]
Call Trace:
<TASK>
enter_svm_guest_mode+0x114/0x560 [kvm_amd]
nested_svm_vmrun+0x260/0x330 [kvm_amd]
vmrun_interception+0x29/0x30 [kvm_amd]
svm_invoke_exit_handler+0x35/0x100 [kvm_amd]
svm_handle_exit+0xe7/0x180 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
__se_sys_ioctl+0x7a/0xc0
__x64_sys_ioctl+0x21/0x30
do_syscall_64+0x41/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x45ca1b
Note, the nested #VMEXIT path has the same flaw, but needs a different
fix and will be handled separately.
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}
If the filename casefolding fails, we'll be leaking memory from the
fscrypt_name struct, namely from the 'crypto_buf.name' member.
Make sure we free it in the error path on both ext4_fname_setup_filename()
and ext4_fname_prepare_lookup() functions.
In the Linux kernel, the following vulnerability has been resolved:
bnxt: avoid overflow in bnxt_get_nvram_directory()
The value of an arithmetic expression is subject
of possible overflow due to a failure to cast operands to a larger data
type before performing arithmetic. Used macro for multiplication instead
operator for avoiding overflow.
Found by Security Code and Linux Verification
Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
bpf, cpumap: Handle skb as well when clean up ptr_ring
The following warning was reported when running xdp_redirect_cpu with
both skb-mode and stress-mode enabled:
------------[ cut here ]------------
Incorrect XDP memory type (-2128176192) usage
WARNING: CPU: 7 PID: 1442 at net/core/xdp.c:405
Modules linked in:
CPU: 7 PID: 1442 Comm: kworker/7:0 Tainted: G 6.5.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Workqueue: events __cpu_map_entry_free
RIP: 0010:__xdp_return+0x1e4/0x4a0
......
Call Trace:
<TASK>
? show_regs+0x65/0x70
? __warn+0xa5/0x240
? __xdp_return+0x1e4/0x4a0
......
xdp_return_frame+0x4d/0x150
__cpu_map_entry_free+0xf9/0x230
process_one_work+0x6b0/0xb80
worker_thread+0x96/0x720
kthread+0x1a5/0x1f0
ret_from_fork+0x3a/0x70
ret_from_fork_asm+0x1b/0x30
</TASK>
The reason for the warning is twofold. One is due to the kthread
cpu_map_kthread_run() is stopped prematurely. Another one is
__cpu_map_ring_cleanup() doesn't handle skb mode and treats skbs in
ptr_ring as XDP frames.
Prematurely-stopped kthread will be fixed by the preceding patch and
ptr_ring will be empty when __cpu_map_ring_cleanup() is called. But
as the comments in __cpu_map_ring_cleanup() said, handling and freeing
skbs in ptr_ring as well to "catch any broken behaviour gracefully".
In the Linux kernel, the following vulnerability has been resolved:
spi: bcm-qspi: return error if neither hif_mspi nor mspi is available
If neither a "hif_mspi" nor "mspi" resource is present, the driver will
just early exit in probe but still return success. Apart from not doing
anything meaningful, this would then also lead to a null pointer access
on removal, as platform_get_drvdata() would return NULL, which it would
then try to dereference when trying to unregister the spi master.
Fix this by unconditionally calling devm_ioremap_resource(), as it can
handle a NULL res and will then return a viable ERR_PTR() if we get one.
The "return 0;" was previously a "goto qspi_resource_err;" where then
ret was returned, but since ret was still initialized to 0 at this place
this was a valid conversion in 63c5395bb7a9 ("spi: bcm-qspi: Fix
use-after-free on unbind"). The issue was not introduced by this commit,
only made more obvious.
In the Linux kernel, the following vulnerability has been resolved:
ice: Don't tx before switchdev is fully configured
There is possibility that ice_eswitch_port_start_xmit might be
called while some resources are still not allocated which might
cause NULL pointer dereference. Fix this by checking if switchdev
configuration was finished.
In the Linux kernel, the following vulnerability has been resolved:
drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
The driver needs to migrate the perf context if the current using CPU going
to teardown. By the time calling the cpuhp::teardown() callback the
cpu_online_mask() hasn't updated yet and still includes the CPU going to
teardown. In current driver's implementation we may migrate the context
to the teardown CPU and leads to the below calltrace:
...
[ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008
[ 368.113699][ T932] Call trace:
[ 368.116834][ T932] __switch_to+0x7c/0xbc
[ 368.120924][ T932] __schedule+0x338/0x6f0
[ 368.125098][ T932] schedule+0x50/0xe0
[ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24
[ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc
[ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30
[ 368.144573][ T932] mutex_lock+0x50/0x60
[ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0
[ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
[ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650
[ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190
[ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0
[ 368.175099][ T932] kthread+0x108/0x13c
[ 368.179012][ T932] ret_from_fork+0x10/0x18
...
Use function cpumask_any_but() to find one correct active cpu to fixes
this issue.
In the Linux kernel, the following vulnerability has been resolved:
octeontx2-af: Add validation before accessing cgx and lmac
with the addition of new MAC blocks like CN10K RPM and CN10KB
RPM_USX, LMACs are noncontiguous and CGX blocks are also
noncontiguous. But during RVU driver initialization, the driver
is assuming they are contiguous and trying to access
cgx or lmac with their id which is resulting in kernel panic.
This patch fixes the issue by adding proper checks.
[ 23.219150] pc : cgx_lmac_read+0x38/0x70
[ 23.219154] lr : rvu_program_channels+0x3f0/0x498
[ 23.223852] sp : ffff000100d6fc80
[ 23.227158] x29: ffff000100d6fc80 x28: ffff00010009f880 x27:
000000000000005a
[ 23.234288] x26: ffff000102586768 x25: 0000000000002500 x24:
fffffffffff0f000
In the Linux kernel, the following vulnerability has been resolved:
media: amphion: fix REVERSE_INULL issues reported by coverity
null-checking of a pointor is suggested before dereferencing it
In the Linux kernel, the following vulnerability has been resolved:
vdpa: Add features attr to vdpa_nl_policy for nlattr length check
The vdpa_nl_policy structure is used to validate the nlattr when parsing
the incoming nlmsg. It will ensure the attribute being described produces
a valid nlattr pointer in info->attrs before entering into each handler
in vdpa_nl_ops.
That is to say, the missing part in vdpa_nl_policy may lead to illegal
nlattr after parsing, which could lead to OOB read just like CVE-2023-3773.
This patch adds the missing nla_policy for vdpa features attr to avoid
such bugs.
In the Linux kernel, the following vulnerability has been resolved:
Input: exc3000 - properly stop timer on shutdown
We need to stop the timer on driver unbind or probe failures, otherwise
we get UAF/Oops.
In the Linux kernel, the following vulnerability has been resolved:
fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
If 'mipid_detect()' fails, we must free 'md' to avoid a memory leak.
In the Linux kernel, the following vulnerability has been resolved:
perf trace: Really free the evsel->priv area
In 3cb4d5e00e037c70 ("perf trace: Free syscall tp fields in
evsel->priv") it only was freeing if strcmp(evsel->tp_format->system,
"syscalls") returned zero, while the corresponding initialization of
evsel->priv was being performed if it was _not_ zero, i.e. if the tp
system wasn't 'syscalls'.
Just stop looking for that and free it if evsel->priv was set, which
should be equivalent.
Also use the pre-existing evsel_trace__delete() function.
This resolves these leaks, detected with:
$ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin
=================================================================
==481565==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
#1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
#2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
#3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
#4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
#5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
#6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
#7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
#8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
#9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
#10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
#11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
#1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
#2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
#3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
#4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
#5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
#6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
#7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
#8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
#9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
#10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
#11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
[root@quaco ~]#
With this we plug all leaks with "perf trace sleep 1".
In the Linux kernel, the following vulnerability has been resolved:
ALSA: ac97: Fix possible NULL dereference in snd_ac97_mixer
smatch error:
sound/pci/ac97/ac97_codec.c:2354 snd_ac97_mixer() error:
we previously assumed 'rac97' could be null (see line 2072)
remove redundant assignment, return error if rac97 is NULL.
In the Linux kernel, the following vulnerability has been resolved:
Drivers: hv: vmbus: Don't dereference ACPI root object handle
Since the commit referenced in the Fixes: tag below the VMBus client driver
is walking the ACPI namespace up from the VMBus ACPI device to the ACPI
namespace root object trying to find Hyper-V MMIO ranges.
However, if it is not able to find them it ends trying to walk resources of
the ACPI namespace root object itself.
This object has all-ones handle, which causes a NULL pointer dereference
in the ACPI code (from dereferencing this pointer with an offset).
This in turn causes an oops on boot with VMBus host implementations that do
not provide Hyper-V MMIO ranges in their VMBus ACPI device or its
ancestors.
The QEMU VMBus implementation is an example of such implementation.
I guess providing these ranges is optional, since all tested Windows
versions seem to be able to use VMBus devices without them.
Fix this by explicitly terminating the lookup at the ACPI namespace root
object.
Note that Linux guests under KVM/QEMU do not use the Hyper-V PV interface
by default - they only do so if the KVM PV interface is missing or
disabled.
Example stack trace of such oops:
[ 3.710827] ? __die+0x1f/0x60
[ 3.715030] ? page_fault_oops+0x159/0x460
[ 3.716008] ? exc_page_fault+0x73/0x170
[ 3.716959] ? asm_exc_page_fault+0x22/0x30
[ 3.717957] ? acpi_ns_lookup+0x7a/0x4b0
[ 3.718898] ? acpi_ns_internalize_name+0x79/0xc0
[ 3.720018] acpi_ns_get_node_unlocked+0xb5/0xe0
[ 3.721120] ? acpi_ns_check_object_type+0xfe/0x200
[ 3.722285] ? acpi_rs_convert_aml_to_resource+0x37/0x6e0
[ 3.723559] ? down_timeout+0x3a/0x60
[ 3.724455] ? acpi_ns_get_node+0x3a/0x60
[ 3.725412] acpi_ns_get_node+0x3a/0x60
[ 3.726335] acpi_ns_evaluate+0x1c3/0x2c0
[ 3.727295] acpi_ut_evaluate_object+0x64/0x1b0
[ 3.728400] acpi_rs_get_method_data+0x2b/0x70
[ 3.729476] ? vmbus_platform_driver_probe+0x1d0/0x1d0 [hv_vmbus]
[ 3.730940] ? vmbus_platform_driver_probe+0x1d0/0x1d0 [hv_vmbus]
[ 3.732411] acpi_walk_resources+0x78/0xd0
[ 3.733398] vmbus_platform_driver_probe+0x9f/0x1d0 [hv_vmbus]
[ 3.734802] platform_probe+0x3d/0x90
[ 3.735684] really_probe+0x19b/0x400
[ 3.736570] ? __device_attach_driver+0x100/0x100
[ 3.737697] __driver_probe_device+0x78/0x160
[ 3.738746] driver_probe_device+0x1f/0x90
[ 3.739743] __driver_attach+0xc2/0x1b0
[ 3.740671] bus_for_each_dev+0x70/0xc0
[ 3.741601] bus_add_driver+0x10e/0x210
[ 3.742527] driver_register+0x55/0xf0
[ 3.744412] ? 0xffffffffc039a000
[ 3.745207] hv_acpi_init+0x3c/0x1000 [hv_vmbus]
In the Linux kernel, the following vulnerability has been resolved:
drm/i915/perf: add sentinel to xehp_oa_b_counters
Arrays passed to reg_in_range_table should end with empty record.
The patch solves KASAN detected bug with signature:
BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
Read of size 4 at addr ffffffffa1555d90 by task perf/1518
CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1
Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023
Call Trace:
<TASK>
...
xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
(cherry picked from commit 2f42c5afb34b5696cf5fe79e744f99be9b218798)
In the Linux kernel, the following vulnerability has been resolved:
bpf: Make bpf_refcount_acquire fallible for non-owning refs
This patch fixes an incorrect assumption made in the original
bpf_refcount series [0], specifically that the BPF program calling
bpf_refcount_acquire on some node can always guarantee that the node is
alive. In that series, the patch adding failure behavior to rbtree_add
and list_push_{front, back} breaks this assumption for non-owning
references.
Consider the following program:
n = bpf_kptr_xchg(&mapval, NULL);
/* skip error checking */
bpf_spin_lock(&l);
if(bpf_rbtree_add(&t, &n->rb, less)) {
bpf_refcount_acquire(n);
/* Failed to add, do something else with the node */
}
bpf_spin_unlock(&l);
It's incorrect to assume that bpf_refcount_acquire will always succeed in this
scenario. bpf_refcount_acquire is being called in a critical section
here, but the lock being held is associated with rbtree t, which isn't
necessarily the lock associated with the tree that the node is already
in. So after bpf_rbtree_add fails to add the node and calls bpf_obj_drop
in it, the program has no ownership of the node's lifetime. Therefore
the node's refcount can be decr'd to 0 at any time after the failing
rbtree_add. If this happens before the refcount_acquire above, the node
might be free'd, and regardless refcount_acquire will be incrementing a
0 refcount.
Later patches in the series exercise this scenario, resulting in the
expected complaint from the kernel (without this patch's changes):
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 207 at lib/refcount.c:25 refcount_warn_saturate+0xbc/0x110
Modules linked in: bpf_testmod(O)
CPU: 1 PID: 207 Comm: test_progs Tainted: G O 6.3.0-rc7-02231-g723de1a718a2-dirty #371
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0xbc/0x110
Code: 6f 64 f6 02 01 e8 84 a3 5c ff 0f 0b eb 9d 80 3d 5e 64 f6 02 00 75 94 48 c7 c7 e0 13 d2 82 c6 05 4e 64 f6 02 01 e8 64 a3 5c ff <0f> 0b e9 7a ff ff ff 80 3d 38 64 f6 02 00 0f 85 6d ff ff ff 48 c7
RSP: 0018:ffff88810b9179b0 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: 0000000000000202 RSI: 0000000000000008 RDI: ffffffff857c3680
RBP: ffff88810027d3c0 R08: ffffffff8125f2a4 R09: ffff88810b9176e7
R10: ffffed1021722edc R11: 746e756f63666572 R12: ffff88810027d388
R13: ffff88810027d3c0 R14: ffffc900005fe030 R15: ffffc900005fe048
FS: 00007fee0584a700(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005634a96f6c58 CR3: 0000000108ce9002 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
bpf_refcount_acquire_impl+0xb5/0xc0
(rest of output snipped)
The patch addresses this by changing bpf_refcount_acquire_impl to use
refcount_inc_not_zero instead of refcount_inc and marking
bpf_refcount_acquire KF_RET_NULL.
For owning references, though, we know the above scenario is not possible
and thus that bpf_refcount_acquire will always succeed. Some verifier
bookkeeping is added to track "is input owning ref?" for bpf_refcount_acquire
calls and return false from is_kfunc_ret_null for bpf_refcount_acquire on
owning refs despite it being marked KF_RET_NULL.
Existing selftests using bpf_refcount_acquire are modified where
necessary to NULL-check its return value.
[0]: https://lore.kernel.org/bpf/20230415201811.343116-1-davemarchevsky@fb.com/
In the Linux kernel, the following vulnerability has been resolved:
nvme-tcp: don't access released socket during error recovery
While the error recovery work is temporarily failing reconnect attempts,
running the 'nvme list' command causes a kernel NULL pointer dereference
by calling getsockname() with a released socket.
During error recovery work, the nvme tcp socket is released and a new one
created, so it is not safe to access the socket without proper check.
In the Linux kernel, the following vulnerability has been resolved:
x86: fix clear_user_rep_good() exception handling annotation
This code no longer exists in mainline, because it was removed in
commit d2c95f9d6802 ("x86: don't use REP_GOOD or ERMS for user memory
clearing") upstream.
However, rather than backport the full range of x86 memory clearing and
copying cleanups, fix the exception table annotation placement for the
final 'rep movsb' in clear_user_rep_good(): rather than pointing at the
actual instruction that did the user space access, it pointed to the
register move just before it.
That made sense from a code flow standpoint, but not from an actual
usage standpoint: it means that if user access takes an exception, the
exception handler won't actually find the instruction in the exception
tables.
As a result, rather than fixing it up and returning -EFAULT, it would
then turn it into a kernel oops report instead, something like:
BUG: unable to handle page fault for address: 0000000020081000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
...
RIP: 0010:clear_user_rep_good+0x1c/0x30 arch/x86/lib/clear_page_64.S:147
...
Call Trace:
__clear_user arch/x86/include/asm/uaccess_64.h:103 [inline]
clear_user arch/x86/include/asm/uaccess_64.h:124 [inline]
iov_iter_zero+0x709/0x1290 lib/iov_iter.c:800
iomap_dio_hole_iter fs/iomap/direct-io.c:389 [inline]
iomap_dio_iter fs/iomap/direct-io.c:440 [inline]
__iomap_dio_rw+0xe3d/0x1cd0 fs/iomap/direct-io.c:601
iomap_dio_rw+0x40/0xa0 fs/iomap/direct-io.c:689
ext4_dio_read_iter fs/ext4/file.c:94 [inline]
ext4_file_read_iter+0x4be/0x690 fs/ext4/file.c:145
call_read_iter include/linux/fs.h:2183 [inline]
do_iter_readv_writev+0x2e0/0x3b0 fs/read_write.c:733
do_iter_read+0x2f2/0x750 fs/read_write.c:796
vfs_readv+0xe5/0x150 fs/read_write.c:916
do_preadv+0x1b6/0x270 fs/read_write.c:1008
__do_sys_preadv2 fs/read_write.c:1070 [inline]
__se_sys_preadv2 fs/read_write.c:1061 [inline]
__x64_sys_preadv2+0xef/0x150 fs/read_write.c:1061
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
which then looks like a filesystem bug rather than the incorrect
exception annotation that it is.
[ The alternative to this one-liner fix is to take the upstream series
that cleans this all up:
68674f94ffc9 ("x86: don't use REP_GOOD or ERMS for small memory copies")
20f3337d350c ("x86: don't use REP_GOOD or ERMS for small memory clearing")
adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies")
* d2c95f9d6802 ("x86: don't use REP_GOOD or ERMS for user memory clearing")
3639a535587d ("x86: move stac/clac from user copy routines into callers")
577e6a7fd50d ("x86: inline the 'rep movs' in user copies for the FSRM case")
8c9b6a88b7e2 ("x86: improve on the non-rep 'clear_user' function")
427fda2c8a49 ("x86: improve on the non-rep 'copy_user' function")
* e046fe5a36a9 ("x86: set FSRS automatically on AMD CPUs that have FSRM")
e1f2750edc4a ("x86: remove 'zerorest' argument from __copy_user_nocache()")
034ff37d3407 ("x86: rewrite '__copy_user_nocache' function")
with either the whole series or at a minimum the two marked commits
being needed to fix this issue ]
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath9k: hif_usb: fix memory leak of remain_skbs
hif_dev->remain_skb is allocated and used exclusively in
ath9k_hif_usb_rx_stream(). It is implied that an allocated remain_skb is
processed and subsequently freed (in error paths) only during the next
call of ath9k_hif_usb_rx_stream().
So, if the urbs are deallocated between those two calls due to the device
deinitialization or suspend, it is possible that ath9k_hif_usb_rx_stream()
is not called next time and the allocated remain_skb is leaked. Our local
Syzkaller instance was able to trigger that.
remain_skb makes sense when receiving two consecutive urbs which are
logically linked together, i.e. a specific data field from the first skb
indicates a cached skb to be allocated, memcpy'd with some data and
subsequently processed in the next call to ath9k_hif_usb_rx_stream(). Urbs
deallocation supposedly makes that link irrelevant so we need to free the
cached skb in those cases.
Fix the leak by introducing a function to explicitly free remain_skb (if
it is not NULL) when the rx urbs have been deallocated. remain_skb is NULL
when it has not been allocated at all (hif_dev struct is kzalloced) or
when it has been processed in next call to ath9k_hif_usb_rx_stream().
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: lpass: Fix for KASAN use_after_free out of bounds
When we run syzkaller we get below Out of Bounds error.
"KASAN: slab-out-of-bounds Read in regcache_flat_read"
Below is the backtrace of the issue:
BUG: KASAN: slab-out-of-bounds in regcache_flat_read+0x10c/0x110
Read of size 4 at addr ffffff8088fbf714 by task syz-executor.4/14144
CPU: 6 PID: 14144 Comm: syz-executor.4 Tainted: G W
Hardware name: Qualcomm Technologies, Inc. sc7280 CRD platform (rev5+) (DT)
Call trace:
dump_backtrace+0x0/0x4ec
show_stack+0x34/0x50
dump_stack_lvl+0xdc/0x11c
print_address_description+0x30/0x2d8
kasan_report+0x178/0x1e4
__asan_report_load4_noabort+0x44/0x50
regcache_flat_read+0x10c/0x110
regcache_read+0xf8/0x5a0
_regmap_read+0x45c/0x86c
_regmap_update_bits+0x128/0x290
regmap_update_bits_base+0xc0/0x15c
snd_soc_component_update_bits+0xa8/0x22c
snd_soc_component_write_field+0x68/0xd4
tx_macro_put_dec_enum+0x1d0/0x268
snd_ctl_elem_write+0x288/0x474
By Error checking and checking valid values issue gets rectifies.
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath6kl: reduce WARN to dev_dbg() in callback
The warn is triggered on a known race condition, documented in the code above
the test, that is correctly handled. Using WARN() hinders automated testing.
Reducing severity.
In the Linux kernel, the following vulnerability has been resolved:
octeon_ep: cancel queued works in probe error path
If it fails to get the devices's MAC address, octep_probe exits while
leaving the delayed work intr_poll_task queued. When the work later
runs, it's a use after free.
Move the cancelation of intr_poll_task from octep_remove into
octep_device_cleanup. This does not change anything in the octep_remove
flow, but octep_device_cleanup is called also in the octep_probe error
path, where the cancelation is needed.
Note that the cancelation of ctrl_mbox_task has to follow
intr_poll_task's, because the ctrl_mbox_task may be queued by
intr_poll_task.