In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix combination of jit blinding and pointers to bpf subprogs.
The combination of jit blinding and pointers to bpf subprogs causes:
[ 36.989548] BUG: unable to handle page fault for address: 0000000100000001
[ 36.990342] #PF: supervisor instruction fetch in kernel mode
[ 36.990968] #PF: error_code(0x0010) - not-present page
[ 36.994859] RIP: 0010:0x100000001
[ 36.995209] Code: Unable to access opcode bytes at RIP 0xffffffd7.
[ 37.004091] Call Trace:
[ 37.004351] <TASK>
[ 37.004576] ? bpf_loop+0x4d/0x70
[ 37.004932] ? bpf_prog_3899083f75e4c5de_F+0xe3/0x13b
The jit blinding logic didn't recognize that ld_imm64 with an address
of bpf subprogram is a special instruction and proceeded to randomize it.
By itself it wouldn't have been an issue, but jit_subprogs() logic
relies on two step process to JIT all subprogs and then JIT them
again when addresses of all subprogs are known.
Blinding process in the first JIT phase caused second JIT to miss
adjustment of special ld_imm64.
Fix this issue by ignoring special ld_imm64 instructions that don't have
user controlled constants and shouldn't be blinded.
In the Linux kernel, the following vulnerability has been resolved:
usb: isp1760: Fix out-of-bounds array access
Running the driver through kasan gives an interesting splat:
BUG: KASAN: global-out-of-bounds in isp1760_register+0x180/0x70c
Read of size 20 at addr f1db2e64 by task swapper/0/1
(...)
isp1760_register from isp1760_plat_probe+0x1d8/0x220
(...)
This happens because the loop reading the regmap fields for the
different ISP1760 variants look like this:
for (i = 0; i < HC_FIELD_MAX; i++) { ... }
Meaning it expects the arrays to be at least HC_FIELD_MAX - 1 long.
However the arrays isp1760_hc_reg_fields[], isp1763_hc_reg_fields[],
isp1763_hc_volatile_ranges[] and isp1763_dc_volatile_ranges[] are
dynamically sized during compilation.
Fix this by putting an empty assignment to the [HC_FIELD_MAX]
and [DC_FIELD_MAX] array member at the end of each array.
This will make the array one member longer than it needs to be,
but avoids the risk of overwriting whatever is inside
[HC_FIELD_MAX - 1] and is simple and intuitive to read. Also
add comments explaining what is going on.
In the Linux kernel, the following vulnerability has been resolved:
fs/ntfs3: provide block_invalidate_folio to fix memory leak
The ntfs3 filesystem lacks the 'invalidate_folio' method and it causes
memory leak. If you write to the filesystem and then unmount it, the
cached written data are not freed and they are permanently leaked.
In the Linux kernel, the following vulnerability has been resolved:
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
In mce_threshold_create_device(), if threshold_create_bank() fails, the
previously allocated threshold banks array @bp will be leaked because
the call to mce_threshold_remove_device() will not free it.
This happens because mce_threshold_remove_device() fetches the pointer
through the threshold_banks per-CPU variable but bp is written there
only after the bank creation is successful, and not before, when
threshold_create_bank() fails.
Add a helper which unwinds all the bank creation work previously done
and pass into it the previously allocated threshold banks array for
freeing.
[ bp: Massage. ]
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix potential array overflow in bpf_trampoline_get_progs()
The cnt value in the 'cnt >= BPF_MAX_TRAMP_PROGS' check does not
include BPF_TRAMP_MODIFY_RETURN bpf programs, so the number of
the attached BPF_TRAMP_MODIFY_RETURN bpf programs in a trampoline
can exceed BPF_MAX_TRAMP_PROGS.
When this happens, the assignment '*progs++ = aux->prog' in
bpf_trampoline_get_progs() will cause progs array overflow as the
progs field in the bpf_tramp_progs struct can only hold at most
BPF_MAX_TRAMP_PROGS bpf programs.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock between concurrent dio writes when low on free data space
When reserving data space for a direct IO write we can end up deadlocking
if we have multiple tasks attempting a write to the same file range, there
are multiple extents covered by that file range, we are low on available
space for data and the writes don't expand the inode's i_size.
The deadlock can happen like this:
1) We have a file with an i_size of 1M, at offset 0 it has an extent with
a size of 128K and at offset 128K it has another extent also with a
size of 128K;
2) Task A does a direct IO write against file range [0, 256K), and because
the write is within the i_size boundary, it takes the inode's lock (VFS
level) in shared mode;
3) Task A locks the file range [0, 256K) at btrfs_dio_iomap_begin(), and
then gets the extent map for the extent covering the range [0, 128K).
At btrfs_get_blocks_direct_write(), it creates an ordered extent for
that file range ([0, 128K));
4) Before returning from btrfs_dio_iomap_begin(), it unlocks the file
range [0, 256K);
5) Task A executes btrfs_dio_iomap_begin() again, this time for the file
range [128K, 256K), and locks the file range [128K, 256K);
6) Task B starts a direct IO write against file range [0, 256K) as well.
It also locks the inode in shared mode, as it's within the i_size limit,
and then tries to lock file range [0, 256K). It is able to lock the
subrange [0, 128K) but then blocks waiting for the range [128K, 256K),
as it is currently locked by task A;
7) Task A enters btrfs_get_blocks_direct_write() and tries to reserve data
space. Because we are low on available free space, it triggers the
async data reclaim task, and waits for it to reserve data space;
8) The async reclaim task decides to wait for all existing ordered extents
to complete (through btrfs_wait_ordered_roots()).
It finds the ordered extent previously created by task A for the file
range [0, 128K) and waits for it to complete;
9) The ordered extent for the file range [0, 128K) can not complete
because it blocks at btrfs_finish_ordered_io() when trying to lock the
file range [0, 128K).
This results in a deadlock, because:
- task B is holding the file range [0, 128K) locked, waiting for the
range [128K, 256K) to be unlocked by task A;
- task A is holding the file range [128K, 256K) locked and it's waiting
for the async data reclaim task to satisfy its space reservation
request;
- the async data reclaim task is waiting for ordered extent [0, 128K)
to complete, but the ordered extent can not complete because the
file range [0, 128K) is currently locked by task B, which is waiting
on task A to unlock file range [128K, 256K) and task A waiting
on the async data reclaim task.
This results in a deadlock between 4 task: task A, task B, the async
data reclaim task and the task doing ordered extent completion (a work
queue task).
This type of deadlock can sporadically be triggered by the test case
generic/300 from fstests, and results in a stack trace like the following:
[12084.033689] INFO: task kworker/u16:7:123749 blocked for more than 241 seconds.
[12084.034877] Not tainted 5.18.0-rc2-btrfs-next-115 #1
[12084.035562] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12084.036548] task:kworker/u16:7 state:D stack: 0 pid:123749 ppid: 2 flags:0x00004000
[12084.036554] Workqueue: btrfs-flush_delalloc btrfs_work_helper [btrfs]
[12084.036599] Call Trace:
[12084.036601] <TASK>
[12084.036606] __schedule+0x3cb/0xed0
[12084.036616] schedule+0x4e/0xb0
[12084.036620] btrfs_start_ordered_extent+0x109/0x1c0 [btrfs]
[12084.036651] ? prepare_to_wait_exclusive+0xc0/0xc0
[12084.036659] btrfs_run_ordered_extent_work+0x1a/0x30 [btrfs]
[12084.036688] btrfs_work_helper+0xf8/0x400 [btrfs]
[12084.0367
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
x86/kexec: fix memory leak of elf header buffer
This is reported by kmemleak detector:
unreferenced object 0xffffc900002a9000 (size 4096):
comm "kexec", pid 14950, jiffies 4295110793 (age 373.951s)
hex dump (first 32 bytes):
7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............
04 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 ..>.............
backtrace:
[<0000000016a8ef9f>] __vmalloc_node_range+0x101/0x170
[<000000002b66b6c0>] __vmalloc_node+0xb4/0x160
[<00000000ad40107d>] crash_prepare_elf64_headers+0x8e/0xcd0
[<0000000019afff23>] crash_load_segments+0x260/0x470
[<0000000019ebe95c>] bzImage64_load+0x814/0xad0
[<0000000093e16b05>] arch_kexec_kernel_image_load+0x1be/0x2a0
[<000000009ef2fc88>] kimage_file_alloc_init+0x2ec/0x5a0
[<0000000038f5a97a>] __do_sys_kexec_file_load+0x28d/0x530
[<0000000087c19992>] do_syscall_64+0x3b/0x90
[<0000000066e063a4>] entry_SYSCALL_64_after_hwframe+0x44/0xae
In crash_prepare_elf64_headers(), a buffer is allocated via vmalloc() to
store elf headers. While it's not freed back to system correctly when
kdump kernel is reloaded or unloaded. Then memory leak is caused. Fix it
by introducing x86 specific function arch_kimage_file_post_load_cleanup(),
and freeing the buffer there.
And also remove the incorrect elf header buffer freeing code. Before
calling arch specific kexec_file loading function, the image instance has
been initialized. So 'image->elf_headers' must be NULL. It doesn't make
sense to free the elf header buffer in the place.
Three different people have reported three bugs about the memory leak on
x86_64 inside Redhat.
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Cancel pending work at closing a MIDI substream
At closing a USB MIDI output substream, there might be still a pending
work, which would eventually access the rawmidi runtime object that is
being released. For fixing the race, make sure to cancel the pending
work at closing.
In the Linux kernel, the following vulnerability has been resolved:
ipw2x00: Fix potential NULL dereference in libipw_xmit()
crypt and crypt->ops could be null, so we need to checking null
before dereference
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.
Call Trace:
_raw_spin_lock_irqsave+0x32/0x40
lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
lpfc_do_work+0x1485/0x1d70 [lpfc]
kthread+0x112/0x130
ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP
The same CPU tries to claim the phba->port_list_lock twice.
Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().
In the Linux kernel, the following vulnerability has been resolved:
cifs: fix potential double free during failed mount
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2088799
In the Linux kernel, the following vulnerability has been resolved:
rcu-tasks: Fix race in schedule and flush work
While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
online cpumask stable. The transient online mask results in below
calltrace.
[ 0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
[ 0.346652] Detected PIPT I-cache on CPU2
[ 0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
[ 0.377255] Detected PIPT I-cache on CPU3
[ 0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
[ 0.379040] ------------[ cut here ]------------
[ 0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
[ 0.384850] Modules linked in:
[ 0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
[ 0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.388308] pc : __flush_work+0x12c/0x138
[ 0.388970] lr : __flush_work+0x80/0x138
[ 0.389620] sp : ffffffc00aaf3c60
[ 0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
[ 0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
[ 0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
[ 0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
[ 0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
[ 0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
[ 0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
[ 0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
[ 0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
[ 0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
[ 0.401886] Call trace:
[ 0.402309] __flush_work+0x12c/0x138
[ 0.402941] schedule_on_each_cpu+0x228/0x278
[ 0.403693] rcu_tasks_rude_wait_gp+0x130/0x144
[ 0.404502] rcu_tasks_kthread+0x220/0x254
[ 0.405264] kthread+0x174/0x1ac
[ 0.405837] ret_from_fork+0x10/0x20
[ 0.406456] irq event stamp: 102
[ 0.406966] hardirqs last enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
[ 0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
[ 0.409410] softirqs last enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
[ 0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
[ 0.411890] ---[ end trace 0000000000000000 ]---
[ 0.413000] smp: Brought up 1 node, 4 CPUs
[ 0.413762] SMP: Total of 4 processors activated.
[ 0.414566] CPU features: detected: 32-bit EL0 Support
[ 0.415414] CPU features: detected: 32-bit EL1 Support
[ 0.416278] CPU features: detected: CRC32 instructions
[ 0.447021] Callback from call_rcu_tasks_rude() invoked.
[ 0.506693] Callback from call_rcu_tasks() invoked.
This commit therefore fixes this issue by applying a single-CPU
optimization to the RCU Tasks Rude grace-period process. The key point
here is that the purpose of this RCU flavor is to force a schedule on
each online CPU since some past event. But the rcu_tasks_rude_wait_gp()
function runs in the context of the RCU Tasks Rude's grace-period kthread,
so there must already have been a context switch on the current CPU since
the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
So if there is only a single CPU online, RCU Tasks Rude's grace-period
kthread does not need to anything at all.
It turns out that the rcu_tasks_rude_wait_gp() function's call to
schedule_on_each_cpu() causes problems during early boot. During that
time, there is only one online CPU, namely the boot CPU. Therefore,
applying this single-CPU optimization fixes early-boot instances of
this problem.
In the Linux kernel, the following vulnerability has been resolved:
rtw89: ser: fix CAM leaks occurring in L2 reset
The CAM, meaning address CAM and bssid CAM here, will get leaks during
SER (system error recover) L2 reset process and ieee80211_restart_hw()
which is called by L2 reset process eventually.
The normal flow would be like
-> add interface (acquire 1)
-> enter ips (release 1)
-> leave ips (acquire 1)
-> connection (occupy 1) <(A) 1 leak after L2 reset if non-sec connection>
The ieee80211_restart_hw() flow (under connection)
-> ieee80211 reconfig
-> add interface (acquire 1)
-> leave ips (acquire 1)
-> connection (occupy (A) + 2) <(B) 1 more leak>
Originally, CAM is released before HW restart only if connection is under
security. Now, release CAM whatever connection it is to fix leak in (A).
OTOH, check if CAM is already valid to avoid acquiring multiple times to
fix (B).
Besides, if AP mode, release address CAM of all stations before HW restart.
In the Linux kernel, the following vulnerability has been resolved:
ALSA: jack: Access input_dev under mutex
It is possible when using ASoC that input_dev is unregistered while
calling snd_jack_report, which causes NULL pointer dereference.
In order to prevent this serialize access to input_dev using mutex lock.
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix call trace observed during I/O with CMF enabled
The following was seen with CMF enabled:
BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softirqentry_text_start+0xe4
irq_exit+0xf7
do_IRQ+0x7f
CPU B:
native_queued_spin_lock_slowpath+0x5b
_raw_spin_lock+0x1c
lpfc_abort_handler+0x13e
scmd_eh_abort_handler+0x85
process_one_work+0x1a7
worker_thread+0x30
kthread+0x112
ret_from_fork+0x1f
Diagram of lockup:
CPUA CPUB
---- ----
lpfc_cmd->buf_lock
phba->hbalock
lpfc_cmd->buf_lock
phba->hbalock
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI
If lpfc_issue_els_flogi() fails and returns non-zero status, the node
reference count is decremented to trigger the release of the nodelist
structure. However, if there is a prior registration or dev-loss-evt work
pending, the node may be released prematurely. When dev-loss-evt
completes, the released node is referenced causing a use-after-free null
pointer dereference.
Similarly, when processing non-zero ELS PLOGI completion status in
lpfc_cmpl_els_plogi(), the ndlp flags are checked for a transport
registration before triggering node removal. If dev-loss-evt work is
pending, the node may be released prematurely and a subsequent call to
lpfc_dev_loss_tmo_handler() results in a use after free ndlp dereference.
Add test for pending dev-loss before decrementing the node reference count
for FLOGI, PLOGI, PRLI, and ADISC handling.
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT
There is a potential memory leak in lpfc_ignore_els_cmpl() and
lpfc_els_rsp_reject() that was allocated from NPIV PLOGI_RJT
(lpfc_rcv_plogi()'s login_mbox).
Check if cmdiocb->context_un.mbox was allocated in lpfc_ignore_els_cmpl(),
and then free it back to phba->mbox_mem_pool along with mbox->ctx_buf for
service parameters.
For lpfc_els_rsp_reject() failure, free both the ctx_buf for service
parameters and the login_mbox.
In the Linux kernel, the following vulnerability has been resolved:
ath11k: Change max no of active probe SSID and BSSID to fw capability
The maximum number of SSIDs in a for active probe requests is currently
reported as 16 (WLAN_SCAN_PARAMS_MAX_SSID) when registering the driver.
The scan_req_params structure only has the capacity to hold 10 SSIDs.
This leads to a buffer overflow which can be triggered from
wpa_supplicant in userspace. When copying the SSIDs into the
scan_req_params structure in the ath11k_mac_op_hw_scan route, it can
overwrite the extraie pointer.
Firmware supports 16 ssid * 4 bssid, for each ssid 4 bssid combo probe
request will be sent, so totally 64 probe requests supported. So
set both max ssid and bssid to 16 and 4 respectively. Remove the
redundant macros of ssid and bssid.
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.7.0.1-01300-QCAHKSWPL_SILICONZ-1
In the Linux kernel, the following vulnerability has been resolved:
loop: implement ->free_disk
Ensure that the lo_device which is stored in the gendisk private
data is valid until the gendisk is freed. Currently the loop driver
uses a lot of effort to make sure a device is not freed when it is
still in use, but to to fix a potential deadlock this will be relaxed
a bit soon.
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/pm: fix double free in si_parse_power_table()
In function si_parse_power_table(), array adev->pm.dpm.ps and its member
is allocated. If the allocation of each member fails, the array itself
is freed and returned with an error code. However, the array is later
freed again in si_dpm_fini() function which is called when the function
returns an error.
This leads to potential double free of the array adev->pm.dpm.ps, as
well as leak of its array members, since the members are not freed in
the allocation function and the array is not nulled when freed.
In addition adev->pm.dpm.num_ps, which keeps track of the allocated
array member, is not updated until the member allocation is
successfully finished, this could also lead to either use after free,
or uninitialized variable access in si_dpm_fini().
Fix this by postponing the free of the array until si_dpm_fini() and
increment adev->pm.dpm.num_ps everytime the array member is allocated.
In the Linux kernel, the following vulnerability has been resolved:
media: i2c: dw9714: Disable the regulator when the driver fails to probe
When the driver fails to probe, we will get the following splat:
[ 59.305988] ------------[ cut here ]------------
[ 59.306417] WARNING: CPU: 2 PID: 395 at drivers/regulator/core.c:2257 _regulator_put+0x3ec/0x4e0
[ 59.310345] RIP: 0010:_regulator_put+0x3ec/0x4e0
[ 59.318362] Call Trace:
[ 59.318582] <TASK>
[ 59.318765] regulator_put+0x1f/0x30
[ 59.319058] devres_release_group+0x319/0x3d0
[ 59.319420] i2c_device_probe+0x766/0x940
Fix this by disabling the regulator in error handling.
In the Linux kernel, the following vulnerability has been resolved:
media: venus: hfi: avoid null dereference in deinit
If venus_probe fails at pm_runtime_put_sync the error handling first
calls hfi_destroy and afterwards hfi_core_deinit. As hfi_destroy sets
core->ops to NULL, hfi_core_deinit cannot call the core_deinit function
anymore.
Avoid this null pointer derefence by skipping the call when necessary.
In the Linux kernel, the following vulnerability has been resolved:
media: cx25821: Fix the warning when removing the module
When removing the module, we will get the following warning:
[ 14.746697] remove_proc_entry: removing non-empty directory 'irq/21', leaking at least 'cx25821[1]'
[ 14.747449] WARNING: CPU: 4 PID: 368 at fs/proc/generic.c:717 remove_proc_entry+0x389/0x3f0
[ 14.751611] RIP: 0010:remove_proc_entry+0x389/0x3f0
[ 14.759589] Call Trace:
[ 14.759792] <TASK>
[ 14.759975] unregister_irq_proc+0x14c/0x170
[ 14.760340] irq_free_descs+0x94/0xe0
[ 14.760640] mp_unmap_irq+0xb6/0x100
[ 14.760937] acpi_unregister_gsi_ioapic+0x27/0x40
[ 14.761334] acpi_pci_irq_disable+0x1d3/0x320
[ 14.761688] pci_disable_device+0x1ad/0x380
[ 14.762027] ? _raw_spin_unlock_irqrestore+0x2d/0x60
[ 14.762442] ? cx25821_shutdown+0x20/0x9f0 [cx25821]
[ 14.762848] cx25821_finidev+0x48/0xc0 [cx25821]
[ 14.763242] pci_device_remove+0x92/0x240
Fix this by freeing the irq before call pci_disable_device().
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp()
If no handler is found in lpfc_complete_unsol_iocb() to match the rctl of a
received frame, the frame is dropped and resources are leaked.
Fix by returning resources when discarding an unhandled frame type. Update
lpfc_fc_frame_check() handling of NOP basic link service.
In the Linux kernel, the following vulnerability has been resolved:
arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
If a compat process tries to execute an unknown system call above the
__ARM_NR_COMPAT_END number, the kernel sends a SIGILL signal to the
offending process. Information about the error is printed to dmesg in
compat_arm_syscall() -> arm64_notify_die() -> arm64_force_sig_fault() ->
arm64_show_signal().
arm64_show_signal() interprets a non-zero value for
current->thread.fault_code as an exception syndrome and displays the
message associated with the ESR_ELx.EC field (bits 31:26).
current->thread.fault_code is set in compat_arm_syscall() ->
arm64_notify_die() with the bad syscall number instead of a valid ESR_ELx
value. This means that the ESR_ELx.EC field has the value that the user set
for the syscall number and the kernel can end up printing bogus exception
messages*. For example, for the syscall number 0x68000000, which evaluates
to ESR_ELx.EC value of 0x1A (ESR_ELx_EC_FPAC) the kernel prints this error:
[ 18.349161] syscall[300]: unhandled exception: ERET/ERETAA/ERETAB, ESR 0x68000000, Oops - bad compat syscall(2) in syscall[10000+50000]
[ 18.350639] CPU: 2 PID: 300 Comm: syscall Not tainted 5.18.0-rc1 #79
[ 18.351249] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which is misleading, as the bad compat syscall has nothing to do with
pointer authentication.
Stop arm64_show_signal() from printing exception syndrome information by
having compat_arm_syscall() set the ESR_ELx value to 0, as it has no
meaning for an invalid system call number. The example above now becomes:
[ 19.935275] syscall[301]: unhandled exception: Oops - bad compat syscall(2) in syscall[10000+50000]
[ 19.936124] CPU: 1 PID: 301 Comm: syscall Not tainted 5.18.0-rc1-00005-g7e08006d4102 #80
[ 19.936894] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which although shows less information because the syscall number,
wrongfully advertised as the ESR value, is missing, it is better than
showing plainly wrong information. The syscall number can be easily
obtained with strace.
*A 32-bit value above or equal to 0x8000_0000 is interpreted as a negative
integer in compat_arm_syscal() and the condition scno < __ARM_NR_COMPAT_END
evaluates to true; the syscall will exit to userspace in this case with the
ENOSYS error code instead of arm64_notify_die() being called.
In the Linux kernel, the following vulnerability has been resolved:
ath10k: skip ath10k_halt during suspend for driver state RESTARTING
Double free crash is observed when FW recovery(caused by wmi
timeout/crash) is followed by immediate suspend event. The FW recovery
is triggered by ath10k_core_restart() which calls driver clean up via
ath10k_halt(). When the suspend event occurs between the FW recovery,
the restart worker thread is put into frozen state until suspend completes.
The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
thread because of its frozen state), causing the crash.
To fix this, during the suspend flow, skip call to ath10k_halt() in
ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
Also, for driver state ATH10K_STATE_RESTARTING, call
ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
ath10k_wait_for_suspend() is skipped later in
[ath10k_halt() > ath10k_core_stop()] for the driver state
ATH10K_STATE_RESTARTING.
The frozen restart worker thread will be cancelled during resume when the
device comes out of suspend.
Below is the crash stack for reference:
[ 428.469167] ------------[ cut here ]------------
[ 428.469180] kernel BUG at mm/slub.c:4150!
[ 428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 428.469219] Workqueue: events_unbound async_run_entry_fn
[ 428.469230] RIP: 0010:kfree+0x319/0x31b
[ 428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
[ 428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
[ 428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
[ 428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
[ 428.469276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 428.469285] Call Trace:
[ 428.469295] ? dma_free_attrs+0x5f/0x7d
[ 428.469320] ath10k_core_stop+0x5b/0x6f
[ 428.469336] ath10k_halt+0x126/0x177
[ 428.469352] ath10k_stop+0x41/0x7e
[ 428.469387] drv_stop+0x88/0x10e
[ 428.469410] __ieee80211_suspend+0x297/0x411
[ 428.469441] rdev_suspend+0x6e/0xd0
[ 428.469462] wiphy_suspend+0xb1/0x105
[ 428.469483] ? name_show+0x2d/0x2d
[ 428.469490] dpm_run_callback+0x8c/0x126
[ 428.469511] ? name_show+0x2d/0x2d
[ 428.469517] __device_suspend+0x2e7/0x41b
[ 428.469523] async_suspend+0x1f/0x93
[ 428.469529] async_run_entry_fn+0x3d/0xd1
[ 428.469535] process_one_work+0x1b1/0x329
[ 428.469541] worker_thread+0x213/0x372
[ 428.469547] kthread+0x150/0x15f
[ 428.469552] ? pr_cont_work+0x58/0x58
[ 428.469558] ? kthread_blkcg+0x31/0x31
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1
In the Linux kernel, the following vulnerability has been resolved:
ASoC: SOF: ipc3-topology: Correct get_control_data for non bytes payload
It is possible to craft a topology where sof_get_control_data() would do
out of bounds access because it expects that it is only called when the
payload is bytes type.
Confusingly it also handles other types of controls, but the payload
parsing implementation is only valid for bytes.
Fix the code to count the non bytes controls and instead of storing a
pointer to sof_abi_hdr in sof_widget_data (which is only valid for bytes),
store the pointer to the data itself and add a new member to save the size
of the data.
In case of non bytes controls we store the pointer to the chanv itself,
which is just an array of values at the end.
In case of bytes control, drop the wrong cdata->data (wdata[i].pdata) check
against NULL since it is incorrect and invalid in this context.
The data is pointing to the end of cdata struct, so it should never be
null.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: mediatek: Fix missing of_node_put in mt2701_wm8960_machine_probe
This node pointer is returned by of_parse_phandle() with
refcount incremented in this function.
Calling of_node_put() to avoid the refcount leak.
In the Linux kernel, the following vulnerability has been resolved:
ice: always check VF VSI pointer values
The ice_get_vf_vsi function can return NULL in some cases, such as if
handling messages during a reset where the VSI is being removed and
recreated.
Several places throughout the driver do not bother to check whether this
VSI pointer is valid. Static analysis tools maybe report issues because
they detect paths where a potentially NULL pointer could be dereferenced.
Fix this by checking the return value of ice_get_vf_vsi everywhere.
In the Linux kernel, the following vulnerability has been resolved:
ASoC: cs35l41: Fix an out-of-bounds access in otp_packed_element_t
The CS35L41_NUM_OTP_ELEM is 100, but only 99 entries are defined in
the array otp_map_1/2[CS35L41_NUM_OTP_ELEM], this will trigger UBSAN
to report a shift-out-of-bounds warning in the cs35l41_otp_unpack()
since the last entry in the array will result in GENMASK(-1, 0).
UBSAN reports this problem:
UBSAN: shift-out-of-bounds in /home/hwang4/build/jammy/jammy/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic #23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
<TASK>
show_stack+0x52/0x58
dump_stack_lvl+0x4a/0x5f
dump_stack+0x10/0x12
ubsan_epilogue+0x9/0x45
__ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
? regmap_unlock_mutex+0xe/0x10
cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]
? cs35l41_hda_i2c_remove+0x20/0x20 [snd_hda_scodec_cs35l41_i2c]
i2c_device_probe+0x252/0x2b0
In the Linux kernel, the following vulnerability has been resolved:
ASoC: mediatek: Fix error handling in mt8173_max98090_dev_probe
Call of_node_put(platform_node) to avoid refcount leak in
the error path.
In the Linux kernel, the following vulnerability has been resolved:
cpufreq: governor: Use kobject release() method to free dbs_data
The struct dbs_data embeds a struct gov_attr_set and
the struct gov_attr_set embeds a kobject. Since every kobject must have
a release() method and we can't use kfree() to free it directly,
so introduce cpufreq_dbs_data_release() to release the dbs_data via
the kobject::release() method. This fixes the calltrace like below:
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x34
WARNING: CPU: 12 PID: 810 at lib/debugobjects.c:505 debug_print_object+0xb8/0x100
Modules linked in:
CPU: 12 PID: 810 Comm: sh Not tainted 5.16.0-next-20220120-yocto-standard+ #536
Hardware name: Marvell OcteonTX CN96XX board (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : debug_print_object+0xb8/0x100
lr : debug_print_object+0xb8/0x100
sp : ffff80001dfcf9a0
x29: ffff80001dfcf9a0 x28: 0000000000000001 x27: ffff0001464f0000
x26: 0000000000000000 x25: ffff8000090e3f00 x24: ffff80000af60210
x23: ffff8000094dfb78 x22: ffff8000090e3f00 x21: ffff0001080b7118
x20: ffff80000aeb2430 x19: ffff800009e8f5e0 x18: 0000000000000000
x17: 0000000000000002 x16: 00004d62e58be040 x15: 013590470523aff8
x14: ffff8000090e1828 x13: 0000000001359047 x12: 00000000f5257d14
x11: 0000000000040591 x10: 0000000066c1ffea x9 : ffff8000080d15e0
x8 : ffff80000a1765a8 x7 : 0000000000000000 x6 : 0000000000000001
x5 : ffff800009e8c000 x4 : ffff800009e8c760 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0001474ed040
Call trace:
debug_print_object+0xb8/0x100
__debug_check_no_obj_freed+0x1d0/0x25c
debug_check_no_obj_freed+0x24/0xa0
kfree+0x11c/0x440
cpufreq_dbs_governor_exit+0xa8/0xac
cpufreq_exit_governor+0x44/0x90
cpufreq_set_policy+0x29c/0x570
store_scaling_governor+0x110/0x154
store+0xb0/0xe0
sysfs_kf_write+0x58/0x84
kernfs_fop_write_iter+0x12c/0x1c0
new_sync_write+0xf0/0x18c
vfs_write+0x1cc/0x220
ksys_write+0x74/0x100
__arm64_sys_write+0x28/0x3c
invoke_syscall.constprop.0+0x58/0xf0
do_el0_svc+0x70/0x170
el0_svc+0x54/0x190
el0t_64_sync_handler+0xa4/0x130
el0t_64_sync+0x1a0/0x1a4
irq event stamp: 189006
hardirqs last enabled at (189005): [<ffff8000080849d0>] finish_task_switch.isra.0+0xe0/0x2c0
hardirqs last disabled at (189006): [<ffff8000090667a4>] el1_dbg+0x24/0xa0
softirqs last enabled at (188966): [<ffff8000080106d0>] __do_softirq+0x4b0/0x6a0
softirqs last disabled at (188957): [<ffff80000804a618>] __irq_exit_rcu+0x108/0x1a4
[ rjw: Because can be freed by the gov_attr_set_put() in
cpufreq_dbs_governor_exit() now, it is also necessary to put the
invocation of the governor ->exit() callback into the new
cpufreq_dbs_data_release() function. ]
In the Linux kernel, the following vulnerability has been resolved:
mtd: rawnand: denali: Use managed device resources
All of the resources used by this driver has managed interfaces, so use
them. Otherwise we will get the following splat:
[ 4.472703] denali-nand-pci 0000:00:05.0: timeout while waiting for irq 0x1000
[ 4.474071] denali-nand-pci: probe of 0000:00:05.0 failed with error -5
[ 4.473538] nand: No NAND device found
[ 4.474068] BUG: unable to handle page fault for address: ffffc90005000410
[ 4.475169] #PF: supervisor write access in kernel mode
[ 4.475579] #PF: error_code(0x0002) - not-present page
[ 4.478362] RIP: 0010:iowrite32+0x9/0x50
[ 4.486068] Call Trace:
[ 4.486269] <IRQ>
[ 4.486443] denali_isr+0x15b/0x300 [denali]
[ 4.486788] ? denali_direct_write+0x50/0x50 [denali]
[ 4.487189] __handle_irq_event_percpu+0x161/0x3b0
[ 4.487571] handle_irq_event+0x7d/0x1b0
[ 4.487884] handle_fasteoi_irq+0x2b0/0x770
[ 4.488219] __common_interrupt+0xc8/0x1b0
[ 4.488549] common_interrupt+0x9a/0xc0
In the Linux kernel, the following vulnerability has been resolved:
fbdev: defio: fix the pagelist corruption
Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (ffffffffc0ceb090), but
was ffffec604507edc8. (prev=ffffec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
<TASK>
fb_deferred_io_mkwrite+0xea/0x150
do_page_mkwrite+0x57/0xc0
do_wp_page+0x278/0x2f0
__handle_mm_fault+0xdc2/0x1590
handle_mm_fault+0xdd/0x2c0
do_user_addr_fault+0x1d3/0x650
exc_page_fault+0x77/0x180
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==
Figure out the race happens when one process is adding &page->lru into
the pagelist tail in fb_deferred_io_mkwrite(), another process is
re-initializing the same &page->lru in fb_deferred_io_fault(), which is
not protected by the lock.
This fix is to init all the page lists one time during initialization,
it not only fixes the list corruption, but also avoids INIT_LIST_HEAD()
redundantly.
V2: change "int i" to "unsigned int i" (Geert Uytterhoeven)
In the Linux kernel, the following vulnerability has been resolved:
drm/omap: fix NULL but dereferenced coccicheck error
Fix the following coccicheck warning:
./drivers/gpu/drm/omapdrm/omap_overlay.c:89:22-25: ERROR: r_ovl is NULL
but dereferenced.
Here should be ovl->idx rather than r_ovl->idx.
In the Linux kernel, the following vulnerability has been resolved:
HID: elan: Fix potential double free in elan_input_configured
'input' is a managed resource allocated with devm_input_allocate_device(),
so there is no need to call input_free_device() explicitly or
there will be a double free.
According to the doc of devm_input_allocate_device():
* Managed input devices do not need to be explicitly unregistered or
* freed as it will be done automatically when owner device unbinds from
* its driver (or binding fails).
In the Linux kernel, the following vulnerability has been resolved:
regulator: da9121: Fix uninit-value in da9121_assign_chip_model()
KASAN report slab-out-of-bounds in __regmap_init as follows:
BUG: KASAN: slab-out-of-bounds in __regmap_init drivers/base/regmap/regmap.c:841
Read of size 1 at addr ffff88803678cdf1 by task xrun/9137
CPU: 0 PID: 9137 Comm: xrun Tainted: G W 5.18.0-rc2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x15a lib/dump_stack.c:88
print_report.cold+0xcd/0x69b mm/kasan/report.c:313
kasan_report+0x8e/0xc0 mm/kasan/report.c:491
__regmap_init+0x4540/0x4ba0 drivers/base/regmap/regmap.c:841
__devm_regmap_init+0x7a/0x100 drivers/base/regmap/regmap.c:1266
__devm_regmap_init_i2c+0x65/0x80 drivers/base/regmap/regmap-i2c.c:394
da9121_i2c_probe+0x386/0x6d1 drivers/regulator/da9121-regulator.c:1039
i2c_device_probe+0x959/0xac0 drivers/i2c/i2c-core-base.c:563
This happend when da9121 device is probe by da9121_i2c_id, but with
invalid dts. Thus, chip->subvariant_id is set to -EINVAL, and later
da9121_assign_chip_model() will access 'regmap' without init it.
Fix it by return -EINVAL from da9121_assign_chip_model() if
'chip->subvariant_id' is invalid.
In the Linux kernel, the following vulnerability has been resolved:
drm/mediatek: Add vblank register/unregister callback functions
We encountered a kernel panic issue that callback data will be NULL when
it's using in ovl irq handler. There is a timing issue between
mtk_disp_ovl_irq_handler() and mtk_ovl_disable_vblank().
To resolve this issue, we use the flow to register/unregister vblank cb:
- Register callback function and callback data when crtc creates.
- Unregister callback function and callback data when crtc destroies.
With this solution, we can assure callback data will not be NULL when
vblank is disable.
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Inhibit aborts if external loopback plug is inserted
After running a short external loopback test, when the external loopback is
removed and a normal cable inserted that is directly connected to a target
device, the system oops in the llpfc_set_rrq_active() routine.
When the loopback was inserted an FLOGI was transmit. As we're looped back,
we receive the FLOGI request. The FLOGI is ABTS'd as we recognize the same
wppn thus understand it's a loopback. However, as the ABTS sends address
information the port is not set to (fffffe), the ABTS is dropped on the
wire. A short 1 frame loopback test is run and completes before the ABTS
times out. The looback is unplugged and the new cable plugged in, and the
an FLOGI to the new device occurs and completes. Due to a mixup in ref
counting the completion of the new FLOGI releases the fabric ndlp. Then the
original ABTS completes and references the released ndlp generating the
oops.
Correct by no-op'ing the ABTS when in loopback mode (it will be dropped
anyway). Added a flag to track the mode to recognize when it should be
no-op'd.
In the Linux kernel, the following vulnerability has been resolved:
ath9k_htc: fix potential out of bounds access with invalid rxstatus->rs_keyix
The "rxstatus->rs_keyix" eventually gets passed to test_bit() so we need to
ensure that it is within the bitmap.
drivers/net/wireless/ath/ath9k/common.c:46 ath9k_cmn_rx_accept()
error: passing untrusted data 'rx_stats->rs_keyix' to 'test_bit()'
In the Linux kernel, the following vulnerability has been resolved:
media: rga: fix possible memory leak in rga_probe
rga->m2m_dev needs to be freed when rga_probe fails.
In the Linux kernel, the following vulnerability has been resolved:
wl1251: dynamically allocate memory used for DMA
With introduction of vmap'ed stacks, stack parameters can no
longer be used for DMA and now leads to kernel panic.
It happens at several places for the wl1251 (e.g. when
accessed through SDIO) making it unuseable on e.g. the
OpenPandora.
We solve this by allocating temporary buffers or use wl1251_read32().
Tested on v5.18-rc5 with OpenPandora.
In the Linux kernel, the following vulnerability has been resolved:
drm/msm: Fix null pointer dereferences without iommu
Check if 'aspace' is set before using it as it will stay null without
IOMMU, such as on msm8974.