In the Linux kernel, the following vulnerability has been resolved:
soc: qcom: aoss: Fix refcount leak in qmp_cooling_devices_register
Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node.
When breaking early from a for_each_available_child_of_node() loop,
we need to explicitly call of_node_put() on the child node.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
erofs: wake up all waiters after z_erofs_lzma_head ready
When the user mounts the erofs second times, the decompression thread
may hung. The problem happens due to a sequence of steps like the
following:
1) Task A called z_erofs_load_lzma_config which obtain all of the node
from the z_erofs_lzma_head.
2) At this time, task B called the z_erofs_lzma_decompress and wanted to
get a node. But the z_erofs_lzma_head was empty, the Task B had to
sleep.
3) Task A release nodes and push nodes into the z_erofs_lzma_head. But
task B was still sleeping.
One example report when the hung happens:
task:kworker/u3:1 state:D stack:14384 pid: 86 ppid: 2 flags:0x00004000
Workqueue: erofs_unzipd z_erofs_decompressqueue_work
Call Trace:
<TASK>
__schedule+0x281/0x760
schedule+0x49/0xb0
z_erofs_lzma_decompress+0x4bc/0x580
? cpu_core_flags+0x10/0x10
z_erofs_decompress_pcluster+0x49b/0xba0
? __update_load_avg_se+0x2b0/0x330
? __update_load_avg_se+0x2b0/0x330
? update_load_avg+0x5f/0x690
? update_load_avg+0x5f/0x690
? set_next_entity+0xbd/0x110
? _raw_spin_unlock+0xd/0x20
z_erofs_decompress_queue.isra.0+0x2e/0x50
z_erofs_decompressqueue_work+0x30/0x60
process_one_work+0x1d3/0x3a0
worker_thread+0x45/0x3a0
? process_one_work+0x3a0/0x3a0
kthread+0xe2/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
In the Linux kernel, the following vulnerability has been resolved:
regulator: of: Fix refcount leak bug in of_get_regulation_constraints()
We should call the of_node_put() for the reference returned by
of_get_child_by_name() which has increased the refcount.
In the Linux kernel, the following vulnerability has been resolved:
drm/meson: Fix refcount leak in meson_encoder_hdmi_init
of_find_device_by_node() takes reference, we should use put_device()
to release it when not need anymore.
Add missing put_device() in error path to avoid refcount
leak.
In the Linux kernel, the following vulnerability has been resolved:
ath11k: fix netdev open race
Make sure to allocate resources needed before registering the device.
This specifically avoids having a racing open() trigger a BUG_ON() in
mod_timer() when ath11k_mac_op_start() is called before the
mon_reap_timer as been set up.
I did not see this issue with next-20220310, but I hit it on every probe
with next-20220511. Perhaps some timing changed in between.
Here's the backtrace:
[ 51.346947] kernel BUG at kernel/time/timer.c:990!
[ 51.346958] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
...
[ 51.578225] Call trace:
[ 51.583293] __mod_timer+0x298/0x390
[ 51.589518] mod_timer+0x14/0x20
[ 51.595368] ath11k_mac_op_start+0x41c/0x4a0 [ath11k]
[ 51.603165] drv_start+0x38/0x60 [mac80211]
[ 51.610110] ieee80211_do_open+0x29c/0x7d0 [mac80211]
[ 51.617945] ieee80211_open+0x60/0xb0 [mac80211]
[ 51.625311] __dev_open+0x100/0x1c0
[ 51.631420] __dev_change_flags+0x194/0x210
[ 51.638214] dev_change_flags+0x24/0x70
[ 51.644646] do_setlink+0x228/0xdb0
[ 51.650723] __rtnl_newlink+0x460/0x830
[ 51.657162] rtnl_newlink+0x4c/0x80
[ 51.663229] rtnetlink_rcv_msg+0x124/0x390
[ 51.669917] netlink_rcv_skb+0x58/0x130
[ 51.676314] rtnetlink_rcv+0x18/0x30
[ 51.682460] netlink_unicast+0x250/0x310
[ 51.688960] netlink_sendmsg+0x19c/0x3e0
[ 51.695458] ____sys_sendmsg+0x220/0x290
[ 51.701938] ___sys_sendmsg+0x7c/0xc0
[ 51.708148] __sys_sendmsg+0x68/0xd0
[ 51.714254] __arm64_sys_sendmsg+0x28/0x40
[ 51.720900] invoke_syscall+0x48/0x120
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
In the Linux kernel, the following vulnerability has been resolved:
ath11k: fix missing skb drop on htc_tx_completion error
On htc_tx_completion error the skb is not dropped. This is wrong since
the completion_handler logic expect the skb to be consumed anyway even
when an error is triggered. Not freeing the skb on error is a memory
leak since the skb won't be freed anywere else. Correctly free the
packet on eid >= ATH11K_HTC_EP_COUNT before returning.
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01208-QCAHKSWPL_SILICONZ-1
In the Linux kernel, the following vulnerability has been resolved:
drm/meson: encoder_hdmi: Fix refcount leak in meson_encoder_hdmi_init
of_graph_get_remote_node() returns remote device nodepointer with
refcount incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
drm/meson: encoder_cvbs: Fix refcount leak in meson_encoder_cvbs_init
of_graph_get_remote_node() returns remote device nodepointer with
refcount incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
virtio-gpu: fix a missing check to avoid NULL dereference
'cache_ent' could be set NULL inside virtio_gpu_cmd_get_capset()
and it will lead to a NULL dereference by a lately use of it
(i.e., ptr = cache_ent->caps_cache). Fix it with a NULL check.
[ kraxel: minor codestyle fixup ]
In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw89: 8852a: rfk: fix div 0 exception
The DPK is a kind of RF calibration whose algorithm is to fine tune
parameters and calibrate, and check the result. If the result isn't good
enough, it could adjust parameters and try again.
This issue is to read and show the result, but it could be a negative
calibration result that causes divisor 0 and core dump. So, fix it by
phy_div() that does division only if divisor isn't zero; otherwise,
zero is adopted.
divide error: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 728 Comm: wpa_supplicant Not tainted 5.10.114-16019-g462a1661811a #1 <HASH:d024 28>
RIP: 0010:rtw8852a_dpk+0x14ae/0x288f [rtw89_core]
RSP: 0018:ffffa9bb412a7520 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 00000000000180fc RDI: ffffa141d01023c0
RBP: ffffa9bb412a76a0 R08: 0000000000001319 R09: 00000000ffffff92
R10: ffffffffc0292de3 R11: ffffffffc00d2f51 R12: 0000000000000000
R13: ffffa141d01023c0 R14: ffffffffc0290250 R15: ffffa141d0102638
FS: 00007fa99f5c2740(0000) GS:ffffa142e5e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000013e8e010 CR3: 0000000110d2c000 CR4: 0000000000750ee0
PKRU: 55555554
Call Trace:
rtw89_core_sta_add+0x95/0x9c [rtw89_core <HASH:d239 29>]
rtw89_ops_sta_state+0x5d/0x108 [rtw89_core <HASH:d239 29>]
drv_sta_state+0x115/0x66f [mac80211 <HASH:81fe 30>]
sta_info_insert_rcu+0x45c/0x713 [mac80211 <HASH:81fe 30>]
sta_info_insert+0xf/0x1b [mac80211 <HASH:81fe 30>]
ieee80211_prep_connection+0x9d6/0xb0c [mac80211 <HASH:81fe 30>]
ieee80211_mgd_auth+0x2aa/0x352 [mac80211 <HASH:81fe 30>]
cfg80211_mlme_auth+0x160/0x1f6 [cfg80211 <HASH:00cd 31>]
nl80211_authenticate+0x2e5/0x306 [cfg80211 <HASH:00cd 31>]
genl_rcv_msg+0x371/0x3a1
? nl80211_stop_sched_scan+0xe5/0xe5 [cfg80211 <HASH:00cd 31>]
? genl_rcv+0x36/0x36
netlink_rcv_skb+0x8a/0xf9
genl_rcv+0x28/0x36
netlink_unicast+0x27b/0x3a0
netlink_sendmsg+0x2aa/0x469
sock_sendmsg_nosec+0x49/0x4d
____sys_sendmsg+0xe5/0x213
__sys_sendmsg+0xec/0x157
? syscall_enter_from_user_mode+0xd7/0x116
do_syscall_64+0x43/0x55
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa99f6e689b
In the Linux kernel, the following vulnerability has been resolved:
rcutorture: Fix ksoftirqd boosting timing and iteration
The RCU priority boosting can fail in two situations:
1) If (nr_cpus= > maxcpus=), which means if the total number of CPUs
is higher than those brought online at boot, then torture_onoff() may
later bring up CPUs that weren't online on boot. Now since rcutorture
initialization only boosts the ksoftirqds of the CPUs that have been
set online on boot, the CPUs later set online by torture_onoff won't
benefit from the boost, making RCU priority boosting fail.
2) The ksoftirqd kthreads are boosted after the creation of
rcu_torture_boost() kthreads, which opens a window large enough for these
rcu_torture_boost() kthreads to wait (despite running at FIFO priority)
for ksoftirqds that are still running at SCHED_NORMAL priority.
The issues can trigger for example with:
./kvm.sh --configs TREE01 --kconfig "CONFIG_RCU_BOOST=y"
[ 34.968561] rcu-torture: !!!
[ 34.968627] ------------[ cut here ]------------
[ 35.014054] WARNING: CPU: 4 PID: 114 at kernel/rcu/rcutorture.c:1979 rcu_torture_stats_print+0x5ad/0x610
[ 35.052043] Modules linked in:
[ 35.069138] CPU: 4 PID: 114 Comm: rcu_torture_sta Not tainted 5.18.0-rc1 #1
[ 35.096424] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[ 35.154570] RIP: 0010:rcu_torture_stats_print+0x5ad/0x610
[ 35.198527] Code: 63 1b 02 00 74 02 0f 0b 48 83 3d 35 63 1b 02 00 74 02 0f 0b 48 83 3d 21 63 1b 02 00 74 02 0f 0b 48 83 3d 0d 63 1b 02 00 74 02 <0f> 0b 83 eb 01 0f 8e ba fc ff ff 0f 0b e9 b3 fc ff f82
[ 37.251049] RSP: 0000:ffffa92a0050bdf8 EFLAGS: 00010202
[ 37.277320] rcu: De-offloading 8
[ 37.290367] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[ 37.290387] RDX: 0000000000000000 RSI: 00000000ffffbfff RDI: 00000000ffffffff
[ 37.290398] RBP: 000000000000007b R08: 0000000000000000 R09: c0000000ffffbfff
[ 37.290407] R10: 000000000000002a R11: ffffa92a0050bc18 R12: ffffa92a0050be20
[ 37.290417] R13: ffffa92a0050be78 R14: 0000000000000000 R15: 000000000001bea0
[ 37.290427] FS: 0000000000000000(0000) GS:ffff96045eb00000(0000) knlGS:0000000000000000
[ 37.290448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.290460] CR2: 0000000000000000 CR3: 000000001dc0c000 CR4: 00000000000006e0
[ 37.290470] Call Trace:
[ 37.295049] <TASK>
[ 37.295065] ? preempt_count_add+0x63/0x90
[ 37.295095] ? _raw_spin_lock_irqsave+0x12/0x40
[ 37.295125] ? rcu_torture_stats_print+0x610/0x610
[ 37.295143] rcu_torture_stats+0x29/0x70
[ 37.295160] kthread+0xe3/0x110
[ 37.295176] ? kthread_complete_and_exit+0x20/0x20
[ 37.295193] ret_from_fork+0x22/0x30
[ 37.295218] </TASK>
Fix this with boosting the ksoftirqds kthreads from the boosting
hotplug callback itself and before the boosting kthreads are created.
In the Linux kernel, the following vulnerability has been resolved:
drm/mcde: Fix refcount leak in mcde_dsi_bind
Every iteration of for_each_available_child_of_node() decrements
the reference counter of the previous node. There is no decrement
when break out from the loop and results in refcount leak.
Add missing of_node_put() to fix this.
In the Linux kernel, the following vulnerability has been resolved:
media: tw686x: Fix memory leak in tw686x_video_init
video_device_alloc() allocates memory for vdev,
when video_register_device() fails, it doesn't release the memory and
leads to memory leak, call video_device_release() to fix this.
In the Linux kernel, the following vulnerability has been resolved:
net: hinic: avoid kernel hung in hinic_get_stats64()
When using hinic device as a bond slave device, and reading device stats
of master bond device, the kernel may hung.
The kernel panic calltrace as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the calltrace of task that actually caused kernel hungs as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from bond, kernel will call bond_get_stats().
It first holds the spinlock bond->stats_lock, and then call
hinic_get_stats64() to collect hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, which may schedule current task out.
And if system is under high pressure, the task cannot be woken up
immediately, which eventually triggers kernel hung panic.
Since previous patch has replaced hinic_dev.tx_stats/rx_stats with local
variable in hinic_get_stats64(), there is nothing need to be protected
by lock, so just removing down()/up() is ok.
In the Linux kernel, the following vulnerability has been resolved:
mt76: mt76x02u: fix possible memory leak in __mt76x02u_mcu_send_msg
Free the skb if mt76u_bulk_msg fails in __mt76x02u_mcu_send_msg routine.
In the Linux kernel, the following vulnerability has been resolved:
crypto: hisilicon/sec - don't sleep when in softirq
When kunpeng920 encryption driver is used to deencrypt and decrypt
packets during the softirq, it is not allowed to use mutex lock. The
kernel will report the following error:
BUG: scheduling while atomic: swapper/57/0/0x00000300
Call trace:
dump_backtrace+0x0/0x1e4
show_stack+0x20/0x2c
dump_stack+0xd8/0x140
__schedule_bug+0x68/0x80
__schedule+0x728/0x840
schedule+0x50/0xe0
schedule_preempt_disabled+0x18/0x24
__mutex_lock.constprop.0+0x594/0x5dc
__mutex_lock_slowpath+0x1c/0x30
mutex_lock+0x50/0x60
sec_request_init+0x8c/0x1a0 [hisi_sec2]
sec_process+0x28/0x1ac [hisi_sec2]
sec_skcipher_crypto+0xf4/0x1d4 [hisi_sec2]
sec_skcipher_encrypt+0x1c/0x30 [hisi_sec2]
crypto_skcipher_encrypt+0x2c/0x40
crypto_authenc_encrypt+0xc8/0xfc [authenc]
crypto_aead_encrypt+0x2c/0x40
echainiv_encrypt+0x144/0x1a0 [echainiv]
crypto_aead_encrypt+0x2c/0x40
esp_output_tail+0x348/0x5c0 [esp4]
esp_output+0x120/0x19c [esp4]
xfrm_output_one+0x25c/0x4d4
xfrm_output_resume+0x6c/0x1fc
xfrm_output+0xac/0x3c0
xfrm4_output+0x64/0x130
ip_build_and_send_pkt+0x158/0x20c
tcp_v4_send_synack+0xdc/0x1f0
tcp_conn_request+0x7d0/0x994
tcp_v4_conn_request+0x58/0x6c
tcp_v6_conn_request+0xf0/0x100
tcp_rcv_state_process+0x1cc/0xd60
tcp_v4_do_rcv+0x10c/0x250
tcp_v4_rcv+0xfc4/0x10a4
ip_protocol_deliver_rcu+0xf4/0x200
ip_local_deliver_finish+0x58/0x70
ip_local_deliver+0x68/0x120
ip_sublist_rcv_finish+0x70/0x94
ip_list_rcv_finish.constprop.0+0x17c/0x1d0
ip_sublist_rcv+0x40/0xb0
ip_list_rcv+0x140/0x1dc
__netif_receive_skb_list_core+0x154/0x28c
__netif_receive_skb_list+0x120/0x1a0
netif_receive_skb_list_internal+0xe4/0x1f0
napi_complete_done+0x70/0x1f0
gro_cell_poll+0x9c/0xb0
napi_poll+0xcc/0x264
net_rx_action+0xd4/0x21c
__do_softirq+0x130/0x358
irq_exit+0x11c/0x13c
__handle_domain_irq+0x88/0xf0
gic_handle_irq+0x78/0x2c0
el1_irq+0xb8/0x140
arch_cpu_idle+0x18/0x40
default_idle_call+0x5c/0x1c0
cpuidle_idle_call+0x174/0x1b0
do_idle+0xc8/0x160
cpu_startup_entry+0x30/0x11c
secondary_start_kernel+0x158/0x1e4
softirq: huh, entered softirq 3 NET_RX 0000000093774ee4 with
preempt_count 00000100, exited with fffffe00?
In the Linux kernel, the following vulnerability has been resolved:
kunit: executor: Fix a memory leak on failure in kunit_filter_tests
It's possible that memory allocation for 'filtered' will fail, but for the
copy of the suite to succeed. In this case, the copy could be leaked.
Properly free 'copy' in the error case for the allocation of 'filtered'
failing.
Note that there may also have been a similar issue in
kunit_filter_subsuites, before it was removed in "kunit: flatten
kunit_suite*** to kunit_suite** in .kunit_test_suites".
This was reported by clang-analyzer via the kernel test robot, here:
https://lore.kernel.org/all/c8073b8e-7b9e-0830-4177-87c12f16349c@intel.com/
And by smatch via Dan Carpenter and the kernel test robot:
https://lore.kernel.org/all/202207101328.ASjx88yj-lkp@intel.com/
In the Linux kernel, the following vulnerability has been resolved:
bpf: fix potential 32-bit overflow when accessing ARRAY map element
If BPF array map is bigger than 4GB, element pointer calculation can
overflow because both index and elem_size are u32. Fix this everywhere
by forcing 64-bit multiplication. Extract this formula into separate
small helper and use it consistently in various places.
Speculative-preventing formula utilizing index_mask trick is left as is,
but explicit u64 casts are added in both places.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: When HCI work queue is drained, only queue chained work
The HCI command, event, and data packet processing workqueue is drained
to avoid deadlock in commit
76727c02c1e1 ("Bluetooth: Call drain_workqueue() before resetting state").
There is another delayed work, which will queue command to this drained
workqueue. Which results in the following error report:
Bluetooth: hci2: command 0x040f tx timeout
WARNING: CPU: 1 PID: 18374 at kernel/workqueue.c:1438 __queue_work+0xdad/0x1140
Workqueue: events hci_cmd_timeout
RIP: 0010:__queue_work+0xdad/0x1140
RSP: 0000:ffffc90002cffc60 EFLAGS: 00010093
RAX: 0000000000000000 RBX: ffff8880b9d3ec00 RCX: 0000000000000000
RDX: ffff888024ba0000 RSI: ffffffff814e048d RDI: ffff8880b9d3ec08
RBP: 0000000000000008 R08: 0000000000000000 R09: 00000000b9d39700
R10: ffffffff814f73c6 R11: 0000000000000000 R12: ffff88807cce4c60
R13: 0000000000000000 R14: ffff8880796d8800 R15: ffff8880796d8800
FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c0174b4000 CR3: 000000007cae9000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? queue_work_on+0xcb/0x110
? lockdep_hardirqs_off+0x90/0xd0
queue_work_on+0xee/0x110
process_one_work+0x996/0x1610
? pwq_dec_nr_in_flight+0x2a0/0x2a0
? rwlock_bug.part.0+0x90/0x90
? _raw_spin_lock_irq+0x41/0x50
worker_thread+0x665/0x1080
? process_one_work+0x1610/0x1610
kthread+0x2e9/0x3a0
? kthread_complete_and_exit+0x40/0x40
ret_from_fork+0x1f/0x30
</TASK>
To fix this, we can add a new HCI_DRAIN_WQ flag, and don't queue the
timeout workqueue while command workqueue is draining.
In the Linux kernel, the following vulnerability has been resolved:
wifi: wil6210: debugfs: fix uninitialized variable use in `wil_write_file_wmi()`
Commit 7a4836560a61 changes simple_write_to_buffer() with memdup_user()
but it forgets to change the value to be returned that came from
simple_write_to_buffer() call. It results in the following warning:
warning: variable 'rc' is uninitialized when used here [-Wuninitialized]
return rc;
^~
Remove rc variable and just return the passed in length if the
memdup_user() succeeds.
In the Linux kernel, the following vulnerability has been resolved:
wifi: libertas: Fix possible refcount leak in if_usb_probe()
usb_get_dev will be called before lbs_get_firmware_async which means that
usb_put_dev need to be called when lbs_get_firmware_async fails.
In the Linux kernel, the following vulnerability has been resolved:
mtd: maps: Fix refcount leak in of_flash_probe_versatile
of_find_matching_node_and_match() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
mtd: maps: Fix refcount leak in ap_flash_init
of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
of: check previous kernel's ima-kexec-buffer against memory bounds
Presently ima_get_kexec_buffer() doesn't check if the previous kernel's
ima-kexec-buffer lies outside the addressable memory range. This can result
in a kernel panic if the new kernel is booted with 'mem=X' arg and the
ima-kexec-buffer was allocated beyond that range by the previous kernel.
The panic is usually of the form below:
$ sudo kexec --initrd initrd vmlinux --append='mem=16G'
<snip>
BUG: Unable to handle kernel data access on read at 0xc000c01fff7f0000
Faulting instruction address: 0xc000000000837974
Oops: Kernel access of bad area, sig: 11 [#1]
<snip>
NIP [c000000000837974] ima_restore_measurement_list+0x94/0x6c0
LR [c00000000083b55c] ima_load_kexec_buffer+0xac/0x160
Call Trace:
[c00000000371fa80] [c00000000083b55c] ima_load_kexec_buffer+0xac/0x160
[c00000000371fb00] [c0000000020512c4] ima_init+0x80/0x108
[c00000000371fb70] [c0000000020514dc] init_ima+0x4c/0x120
[c00000000371fbf0] [c000000000012240] do_one_initcall+0x60/0x2c0
[c00000000371fcc0] [c000000002004ad0] kernel_init_freeable+0x344/0x3ec
[c00000000371fda0] [c0000000000128a4] kernel_init+0x34/0x1b0
[c00000000371fe10] [c00000000000ce64] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
f92100b8 f92100c0 90e10090 910100a0 4182050c 282a0017 3bc00000 40810330
7c0802a6 fb610198 7c9b2378 f80101d0 <a1240000> 2c090001 40820614 e9240010
---[ end trace 0000000000000000 ]---
Fix this issue by checking returned PFN range of previous kernel's
ima-kexec-buffer with page_is_ram() to ensure correct memory bounds.
In the Linux kernel, the following vulnerability has been resolved:
mtd: partitions: Fix refcount leak in parse_redboot_of
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
PCI: microchip: Fix refcount leak in mc_pcie_init_irq_domains()
of_get_next_child() returns a node pointer with refcount incremented, so we
should use of_node_put() on it when we don't need it anymore.
mc_pcie_init_irq_domains() only calls of_node_put() in the normal path,
missing it in some error paths. Add missing of_node_put() to avoid
refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
mtd: parsers: ofpart: Fix refcount leak in bcm4908_partitions_fw_offset
of_find_node_by_path() returns a node pointer with refcount incremented,
we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
PCI: mediatek-gen3: Fix refcount leak in mtk_pcie_init_irq_domains()
of_get_child_by_name() returns a node pointer with refcount incremented, so
we should use of_node_put() on it when we don't need it anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
usb: host: Fix refcount leak in ehci_hcd_ppc_of_probe
of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
usb: ohci-nxp: Fix refcount leak in ohci_hcd_nxp_probe
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
driver core: fix potential deadlock in __driver_attach
In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").
stack like commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").
list below:
In __driver_attach function, The lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
device_lock(dev) // get lock dev
async_schedule_dev(__driver_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__driver_attach_async_helper will get lock dev as
will, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As above show, when it is allowed to do async probes, because of
out of memory or work limit, async work is not be allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__driver_attach_async_helper getting lock dev.
Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid:
0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374] <TASK>
[ 370.789841] __schedule+0x482/0x1050
[ 370.790613] schedule+0x92/0x1a0
[ 370.791290] schedule_preempt_disabled+0x2c/0x50
[ 370.792256] __mutex_lock.isra.0+0x757/0xec0
[ 370.793158] __mutex_lock_slowpath+0x1f/0x30
[ 370.794079] mutex_lock+0x50/0x60
[ 370.794795] __device_driver_lock+0x2f/0x70
[ 370.795677] ? driver_probe_device+0xd0/0xd0
[ 370.796576] __driver_attach_async_helper+0x1d/0xd0
[ 370.797318] ? driver_probe_device+0xd0/0xd0
[ 370.797957] async_schedule_node_domain+0xa5/0xc0
[ 370.798652] async_schedule_node+0x19/0x30
[ 370.799243] __driver_attach+0x246/0x290
[ 370.799828] ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548] bus_for_each_dev+0x9d/0x130
[ 370.801132] driver_attach+0x22/0x30
[ 370.801666] bus_add_driver+0x290/0x340
[ 370.802246] driver_register+0x88/0x140
[ 370.802817] ? virtio_scsi_init+0x116/0x116
[ 370.803425] scsi_register_driver+0x1a/0x30
[ 370.804057] init_sd+0x184/0x226
[ 370.804533] do_one_initcall+0x71/0x3a0
[ 370.805107] kernel_init_freeable+0x39a/0x43a
[ 370.805759] ? rest_init+0x150/0x150
[ 370.806283] kernel_init+0x26/0x230
[ 370.806799] ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
In the Linux kernel, the following vulnerability has been resolved:
kernfs: fix potential NULL dereference in __kernfs_remove
When lockdep is enabled, lockdep_assert_held_write would
cause potential NULL pointer dereference.
Fix the following smatch warnings:
fs/kernfs/dir.c:1353 __kernfs_remove() warn: variable dereferenced before check 'kn' (see line 1346)
In the Linux kernel, the following vulnerability has been resolved:
PCI: dwc: Deallocate EPC memory on dw_pcie_ep_init() errors
If dw_pcie_ep_init() fails to perform any action after the EPC memory is
initialized and the MSI memory region is allocated, the latter parts won't
be undone thus causing a memory leak. Add a cleanup-on-error path to fix
these leaks.
[bhelgaas: commit log]
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: sf-pdma: Add multithread support for a DMA channel
When we get a DMA channel and try to use it in multiple threads it
will cause oops and hanging the system.
% echo 64 > /sys/module/dmatest/parameters/threads_per_chan
% echo 10000 > /sys/module/dmatest/parameters/iterations
% echo 1 > /sys/module/dmatest/parameters/run
[ 89.480664] Unable to handle kernel NULL pointer dereference at virtual
address 00000000000000a0
[ 89.488725] Oops [#1]
[ 89.494708] CPU: 2 PID: 1008 Comm: dma0chan0-copy0 Not tainted
5.17.0-rc5
[ 89.509385] epc : vchan_find_desc+0x32/0x46
[ 89.513553] ra : sf_pdma_tx_status+0xca/0xd6
This happens because of data race. Each thread rewrite channels's
descriptor as soon as device_prep_dma_memcpy() is called. It leads to the
situation when the driver thinks that it uses right descriptor that
actually is freed or substituted for other one.
With current fixes a descriptor changes its value only when it has
been used. A new descriptor is acquired from vc->desc_issued queue that
is already filled with descriptors that are ready to be sent. Threads
have no direct access to DMA channel descriptor. Now it is just possible
to queue a descriptor for further processing.
In the Linux kernel, the following vulnerability has been resolved:
soundwire: revisit driver bind/unbind and callbacks
In the SoundWire probe, we store a pointer from the driver ops into
the 'slave' structure. This can lead to kernel oopses when unbinding
codec drivers, e.g. with the following sequence to remove machine
driver and codec driver.
/sbin/modprobe -r snd_soc_sof_sdw
/sbin/modprobe -r snd_soc_rt711
The full details can be found in the BugLink below, for reference the
two following examples show different cases of driver ops/callbacks
being invoked after the driver .remove().
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000150
kernel: Workqueue: events cdns_update_slave_status_work [soundwire_cadence]
kernel: RIP: 0010:mutex_lock+0x19/0x30
kernel: Call Trace:
kernel: ? sdw_handle_slave_status+0x426/0xe00 [soundwire_bus 94ff184bf398570c3f8ff7efe9e32529f532e4ae]
kernel: ? newidle_balance+0x26a/0x400
kernel: ? cdns_update_slave_status_work+0x1e9/0x200 [soundwire_cadence 1bcf98eebe5ba9833cd433323769ac923c9c6f82]
kernel: BUG: unable to handle page fault for address: ffffffffc07654c8
kernel: Workqueue: pm pm_runtime_work
kernel: RIP: 0010:sdw_bus_prep_clk_stop+0x6f/0x160 [soundwire_bus]
kernel: Call Trace:
kernel: <TASK>
kernel: sdw_cdns_clock_stop+0xb5/0x1b0 [soundwire_cadence 1bcf98eebe5ba9833cd433323769ac923c9c6f82]
kernel: intel_suspend_runtime+0x5f/0x120 [soundwire_intel aca858f7c87048d3152a4a41bb68abb9b663a1dd]
kernel: ? dpm_sysfs_remove+0x60/0x60
This was not detected earlier in Intel tests since the tests first
remove the parent PCI device and shut down the bus. The sequence
above is a corner case which keeps the bus operational but without a
driver bound.
While trying to solve this kernel oopses, it became clear that the
existing SoundWire bus does not deal well with the unbind case.
Commit 528be501b7d4a ("soundwire: sdw_slave: add probe_complete structure and new fields")
added a 'probed' status variable and a 'probe_complete'
struct completion. This status is however not reset on remove and
likewise the 'probe complete' is not re-initialized, so the
bind/unbind/bind test cases would fail. The timeout used before the
'update_status' callback was also a bad idea in hindsight, there
should really be no timing assumption as to if and when a driver is
bound to a device.
An initial draft was based on device_lock() and device_unlock() was
tested. This proved too complicated, with deadlocks created during the
suspend-resume sequences, which also use the same device_lock/unlock()
as the bind/unbind sequences. On a CometLake device, a bad DSDT/BIOS
caused spurious resumes and the use of device_lock() caused hangs
during suspend. After multiple weeks or testing and painful
reverse-engineering of deadlocks on different devices, we looked for
alternatives that did not interfere with the device core.
A bus notifier was used successfully to keep track of DRIVER_BOUND and
DRIVER_UNBIND events. This solved the bind-unbind-bind case in tests,
but it can still be defeated with a theoretical corner case where the
memory is freed by a .remove while the callback is in use. The
notifier only helps make sure the driver callbacks are valid, but not
that the memory allocated in probe remains valid while the callbacks
are invoked.
This patch suggests the introduction of a new 'sdw_dev_lock' mutex
protecting probe/remove and all driver callbacks. Since this mutex is
'local' to SoundWire only, it does not interfere with existing locks
and does not create deadlocks. In addition, this patch removes the
'probe_complete' completion, instead we directly invoke the
'update_status' from the probe routine. That removes any sort of
timing dependency and a much better support for the device/driver
model, the driver could be bound before the bus started, or eons after
the bus started and the hardware would be properly initialized in all
cases.
BugLink: https://github.com/thesofproject/linux/is
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
intel_th: Fix a resource leak in an error handling path
If an error occurs after calling 'pci_alloc_irq_vectors()',
'pci_free_irq_vectors()' must be called as already done in the remove
function.
In the Linux kernel, the following vulnerability has been resolved:
mmc: sdhci-of-esdhc: Fix refcount leak in esdhc_signal_voltage_switch
of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
of_node_put() checks null pointer.
In the Linux kernel, the following vulnerability has been resolved:
memstick/ms_block: Fix a memory leak
'erased_blocks_bitmap' is never freed. As it is allocated at the same time
as 'used_blocks_bitmap', it is likely that it should be freed also at the
same time.
Add the corresponding bitmap_free() in msb_data_clear().
In the Linux kernel, the following vulnerability has been resolved:
usb: aspeed-vhub: Fix refcount leak bug in ast_vhub_init_desc()
We should call of_node_put() for the reference returned by
of_get_child_by_name() which has increased the refcount.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/qedr: Fix potential memory leak in __qedr_alloc_mr()
__qedr_alloc_mr() allocates a memory chunk for "mr->info.pbl_table" with
init_mr_info(). When rdma_alloc_tid() and rdma_register_tid() fail, "mr"
is released while "mr->info.pbl_table" is not released, which will lead
to a memory leak.
We should release the "mr->info.pbl_table" with qedr_free_pbl() when error
occurs to fix the memory leak.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/siw: Fix duplicated reported IW_CM_EVENT_CONNECT_REPLY event
If siw_recv_mpa_rr returns -EAGAIN, it means that the MPA reply hasn't
been received completely, and should not report IW_CM_EVENT_CONNECT_REPLY
in this case. This may trigger a call trace in iw_cm. A simple way to
trigger this:
server: ib_send_lat
client: ib_send_lat -R <server_ip>
The call trace looks like this:
kernel BUG at drivers/infiniband/core/iwcm.c:894!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<...>
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
Call Trace:
<TASK>
cm_work_handler+0x1dd/0x370 [iw_cm]
process_one_work+0x1e2/0x3b0
worker_thread+0x49/0x2e0
? rescuer_thread+0x370/0x370
kthread+0xe5/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup
The function rxe_create_qp calls rxe_qp_from_init. If some error
occurs, the error handler of function rxe_qp_from_init will set
both scq and rcq to NULL.
Then rxe_create_qp calls rxe_put to handle qp. In the end,
rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
accesses scq and rcq before checking them. This will cause
null-ptr-deref error.
The call graph is as below:
rxe_create_qp {
...
rxe_qp_from_init {
...
err1:
...
qp->rcq = NULL; <---rcq is set to NULL
qp->scq = NULL; <---scq is set to NULL
...
}
qp_init:
rxe_put{
...
rxe_qp_do_cleanup {
...
atomic_dec(&qp->scq->num_wq); <--- scq is accessed
...
atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
}
}
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hfi1: fix potential memory leak in setup_base_ctxt()
setup_base_ctxt() allocates a memory chunk for uctxt->groups with
hfi1_alloc_ctxt_rcv_groups(). When init_user_ctxt() fails, uctxt->groups
is not released, which will lead to a memory leak.
We should release the uctxt->groups with hfi1_free_ctxt_rcv_groups()
when init_user_ctxt() fails.
In the Linux kernel, the following vulnerability has been resolved:
usb: cdns3: change place of 'priv_ep' assignment in cdns3_gadget_ep_dequeue(), cdns3_gadget_ep_enable()
If 'ep' is NULL, result of ep_to_cdns3_ep(ep) is invalid pointer
and its dereference with priv_ep->cdns3_dev may cause panic.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
staging: fbtft: core: set smem_len before fb_deferred_io_init call
The fbtft_framebuffer_alloc() calls fb_deferred_io_init() before
initializing info->fix.smem_len. It is set to zero by the
framebuffer_alloc() function. It will trigger a WARN_ON() at the
start of fb_deferred_io_init() and the function will not do anything.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix error unwind in rxe_create_qp()
In the function rxe_create_qp(), rxe_qp_from_init() is called to
initialize qp, internally things like the spin locks are not setup until
rxe_qp_init_req().
If an error occures before this point then the unwind will call
rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
which will oops when trying to access the uninitialized spinlock.
Move the spinlock initializations earlier before any failures.
In the Linux kernel, the following vulnerability has been resolved:
jbd2: fix assertion 'jh->b_frozen_data == NULL' failure when journal aborted
Following process will fail assertion 'jh->b_frozen_data == NULL' in
jbd2_journal_dirty_metadata():
jbd2_journal_commit_transaction
unlink(dir/a)
jh->b_transaction = trans1
jh->b_jlist = BJ_Metadata
journal->j_running_transaction = NULL
trans1->t_state = T_COMMIT
unlink(dir/b)
handle->h_trans = trans2
do_get_write_access
jh->b_modified = 0
jh->b_frozen_data = frozen_buffer
jh->b_next_transaction = trans2
jbd2_journal_dirty_metadata
is_handle_aborted
is_journal_aborted // return false
--> jbd2 abort <--
while (commit_transaction->t_buffers)
if (is_journal_aborted)
jbd2_journal_refile_buffer
__jbd2_journal_refile_buffer
WRITE_ONCE(jh->b_transaction,
jh->b_next_transaction)
WRITE_ONCE(jh->b_next_transaction, NULL)
__jbd2_journal_file_buffer(jh, BJ_Reserved)
J_ASSERT_JH(jh, jh->b_frozen_data == NULL) // assertion failure !
The reproducer (See detail in [Link]) reports:
------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:1629!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 584 Comm: unlink Tainted: G W
5.19.0-rc6-00115-g4a57a8400075-dirty #697
RIP: 0010:jbd2_journal_dirty_metadata+0x3c5/0x470
RSP: 0018:ffffc90000be7ce0 EFLAGS: 00010202
Call Trace:
<TASK>
__ext4_handle_dirty_metadata+0xa0/0x290
ext4_handle_dirty_dirblock+0x10c/0x1d0
ext4_delete_entry+0x104/0x200
__ext4_unlink+0x22b/0x360
ext4_unlink+0x275/0x390
vfs_unlink+0x20b/0x4c0
do_unlinkat+0x42f/0x4c0
__x64_sys_unlink+0x37/0x50
do_syscall_64+0x35/0x80
After journal aborting, __jbd2_journal_refile_buffer() is executed with
holding @jh->b_state_lock, we can fix it by moving 'is_handle_aborted()'
into the area protected by @jh->b_state_lock.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: cros_ec_codec: Fix refcount leak in cros_ec_codec_platform_probe
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.