The vulnerable system is not bound to the network stack and the attacker’s path is via read/write/execute capabilities. Either: the attacker exploits the vulnerability by accessing the target system locally (e.g., keyboard, console), or through terminal emulation (e.g., SSH); or the attacker relies on User Interaction by another person to perform actions required to exploit the vulnerability (e.g., using social engineering techniques to trick a legitimate user into opening a malicious document).
Attack Complexity
Low
AC
The attacker must take no measurable action to exploit the vulnerability. The attack requires no target-specific circumvention to exploit the vulnerability. An attacker can expect repeatable success against the vulnerable system.
Privileges Required
Low
PR
The attacker requires privileges that provide basic capabilities that are typically limited to settings and resources owned by a single low-privileged user. Alternatively, an attacker with Low privileges has the ability to access only non-sensitive resources.
User Interaction
None
UI
The vulnerable system can be exploited without interaction from any human user, other than the attacker. Examples include: a remote attacker is able to send packets to a target system; a locally authenticated attacker executes code to elevate privileges.
Scope
Unchanged
S
An exploited vulnerability can only affect resources managed by the same security authority. In the case of a vulnerability in a virtualized environment, an exploited vulnerability in one guest instance would not affect neighboring guest instances.
Confidentiality
High
C
There is total information disclosure, resulting in all data on the system being revealed to the attacker, or there is a possibility of the attacker gaining control over confidential data.
Integrity
High
I
There is a total compromise of system integrity. There is a complete loss of system protection, resulting in the attacker being able to modify any file on the target system.
Availability
High
A
There is a total shutdown of the affected resource. The attacker can deny access to the system or data, potentially causing significant loss to the organization.
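As a worked example, the metric values listed above can be combined into a numeric base score. The sketch below assumes the metrics are read as a CVSS v3.1 base vector (the page does not state the version or the vector string) and that Attack Vector is Local, as the first description suggests. The weights and the rounding rule are taken from the CVSS v3.1 specification; the function names are illustrative only, and only the Scope: Unchanged formula is covered.

```c
#include <stdio.h>

/* CVSS v3.1 "Roundup": smallest number with one decimal place that is
 * greater than or equal to x. Integer arithmetic is used to sidestep
 * floating-point representation issues. Assumes x >= 0. */
double cvss_roundup(double x)
{
    long scaled = (long)(x * 100000.0 + 0.5); /* round to nearest integer */

    if (scaled % 10000 == 0)
        return (double)scaled / 100000.0;
    return (double)(scaled / 10000 + 1) / 10.0;
}

/* Base score for vectors with Scope: Unchanged only.
 * Arguments are the numeric weights of each metric value. */
double cvss31_base_unchanged(double av, double ac, double pr, double ui,
                             double c, double i, double a)
{
    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double exploitability = 8.22 * av * ac * pr * ui;
    double sum;

    if (impact <= 0.0)
        return 0.0;
    sum = impact + exploitability;
    if (sum > 10.0)
        sum = 10.0;
    return cvss_roundup(sum);
}

/* Weights for AV:L / AC:L / PR:L (Scope Unchanged) / UI:N / C:H / I:H / A:H. */
double example_score(void)
{
    return cvss31_base_unchanged(0.55, 0.77, 0.62, 0.85, 0.56, 0.56, 0.56);
}

void print_example_score(void)
{
    printf("base score: %.1f\n", example_score());
}
```

Under those assumptions the impact sub-score is about 5.87 and the exploitability sub-score about 1.83, so `example_score()` returns 7.8.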
Below is a copy: Linux unmap_mapping_range() Race Condition
Linux: unmap_mapping_range() race with munmap() on VM_PFNMAP mappings leads to stale TLB entry
For VM_PFNMAP VMAs, there is a race between unmap_mapping_range() and
munmap() that can lead to a page being freed by a device driver while
the page still has stale TLB entries.
There are drivers (in particular GPU drivers) that create
VM_PFNMAP VMAs containing PTEs that point to normal pages
from the page allocator. VM_PFNMAP means that the core kernel
won't track this using the page mapcounts; instead, the driver
is responsible for holding references to the page as long as
it is mapped into userspace.
Some of these drivers have codepaths that can remove userspace
mappings of such pages using unmap_mapping_range(), then give these
pages back to the page allocator.
For example, i915 has a shrinker callback i915_gem_shrink() that does
this.
To make this driver behavior correct, it is necessary that by the time
unmap_mapping_range() returns, all the PTEs in the specified range have
been removed and the corresponding TLB flushes have been executed.
However, munmap() ends up in unmap_region(), which does this:

    struct mmu_gather tlb;

    lru_add_drain();
    tlb_gather_mmu(&tlb, mm);
    update_hiwater_rss(mm);
    unmap_vmas(&tlb, vma, start, end);
    free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS,
                  next ? next->vm_start : USER_PGTABLES_CEILING);
    tlb_finish_mmu(&tlb);

unmap_vmas() removes all PTEs in the range, but does not necessarily
perform a TLB flush yet.
free_pgtables() then removes the VMA from the mapping's rbtree
(unlink_file_vma()) before tearing down page tables in the range:

    void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
                       unsigned long floor, unsigned long ceiling)
    {
        while (vma) {
            struct vm_area_struct *next = vma->vm_next;
            unsigned long addr = vma->vm_start;

            /*
             * Hide vma from rmap and truncate_pagecache before freeing
             * pgtables
             */
            unlink_anon_vmas(vma);
            unlink_file_vma(vma);

            if (is_vm_hugetlb_page(vma)) {
                [...]
            } else {
                [... irrelevant optimization ...]
                free_pgd_range(tlb, addr, vma->vm_end,
                               floor, next ? next->vm_start : ceiling);
            }
            vma = next;
        }
    }

The TLB flush corresponding to the PTEs that were removed in
unmap_vmas() might only happen afterwards, in tlb_finish_mmu().
This is bad because starting at unlink_file_vma(), the VMA won't
be visible to unmap_mapping_range() anymore. If the driver calls
unmap_mapping_range() directly after munmap() called
unlink_file_vma(), unmap_mapping_range() won't notice the
existence of this VMA, it might return while there are still
stale TLB entries pointing to this page, and the driver could
then free the page while userspace can still read/write it
through the stale TLB entry.
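The interleaving described above can be replayed deterministically in a small userspace model. This is only a sketch: every `toy_` name is hypothetical, each kernel step is reduced to flipping a boolean, and nothing here touches real page tables or TLBs. It just shows why the ordering of `unlink_file_vma()` and the deferred flush matters.

```c
#include <stdbool.h>

/* Toy model of one PFNMAP page mapped into one process. */
struct toy_state {
    bool vma_in_rmap;  /* VMA still reachable through the mapping's rbtree */
    bool pte_present;  /* PTE for the page still installed */
    bool tlb_entry;    /* a CPU may still hold a cached translation */
    bool page_freed;   /* driver returned the page to the allocator */
};

/* munmap() side, in the order unmap_region() performs the steps. */
void toy_unmap_vmas(struct toy_state *s)     { s->pte_present = false; } /* flush is deferred */
void toy_free_pgtables(struct toy_state *s)  { s->vma_in_rmap = false; } /* unlink_file_vma() */
void toy_tlb_finish_mmu(struct toy_state *s) { s->tlb_entry = false; }   /* deferred flush */

/* Driver side: can only zap and flush VMAs it can still find. */
void toy_unmap_mapping_range(struct toy_state *s)
{
    if (!s->vma_in_rmap)
        return;            /* VMA already hidden: nothing is flushed */
    s->pte_present = false;
    s->tlb_entry = false;
}

void toy_driver_free_page(struct toy_state *s) { s->page_freed = true; }

/* Racy order: the driver runs between unlink_file_vma() and the flush.
 * Returns true if the page was freed while a TLB entry still existed. */
bool toy_racy_interleaving(void)
{
    struct toy_state s = { true, true, true, false };
    bool stale_access_possible;

    toy_unmap_vmas(&s);
    toy_free_pgtables(&s);
    toy_unmap_mapping_range(&s);  /* sees no VMA, returns immediately */
    toy_driver_free_page(&s);
    stale_access_possible = s.page_freed && s.tlb_entry;
    toy_tlb_finish_mmu(&s);       /* too late for the freed page */
    return stale_access_possible;
}

/* Benign order: the driver runs while the VMA is still visible. */
bool toy_benign_interleaving(void)
{
    struct toy_state s = { true, true, true, false };

    toy_unmap_mapping_range(&s);  /* zaps the PTE and flushes */
    toy_driver_free_page(&s);
    return s.page_freed && s.tlb_entry;
}
```

In this model `toy_racy_interleaving()` returns true and `toy_benign_interleaving()` returns false. The real race additionally depends on timing between CPUs, which the model deliberately ignores.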
It would be a pain to actually hit this bug through the i915
driver though, since the only time it ever uses
unmap_mapping_range() like this is in the i915_gem_shrink()
shrinker callback. Instead, I wrote a reproducer against some
out-of-tree GPU driver where the unmap_mapping_range() path
can be triggered directly from userspace, and on a system
with CONFIG_PAGE_POISONING, I managed to read PAGE_POISON
(0xaa) out of the stale PTE from userspace after a few
iterations. So sadly I don't have a nice reproducer for this
issue that works upstream.
I guess if we want to avoid having extra TLB flushes for
non-PFNMAP/MIXEDMAP VMAs, a possible fix would be to add
a new bit in struct mmu_gather to track the existence of
PTEs without struct page, and then conditionally flush
before free_pgtables() if either that bit is set or
mm_tlb_flush_nested() is true?
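The suggested fix can be sketched in the same simplified, userspace style. This is not kernel code and not the fix that was eventually merged. The field `cleared_pfn_ptes` is a hypothetical name for the proposed new bit, `flush_nested` stands in for the result of mm_tlb_flush_nested(), and all `sketch_` names are invented for illustration.

```c
#include <stdbool.h>

/* Stand-in for struct mmu_gather with the proposed extra bit. */
struct sketch_gather {
    bool cleared_pfn_ptes; /* hypothetical: PTEs without struct page were removed */
    bool flush_nested;     /* stands in for mm_tlb_flush_nested() */
};

struct sketch_mm {
    bool vma_in_rmap;      /* VMA visible to unmap_mapping_range() */
    bool tlb_entry;        /* translation may still be cached */
};

void sketch_flush_tlb(struct sketch_mm *mm)
{
    mm->tlb_entry = false;
}

/* Removing PTEs from a PFNMAP/MIXEDMAP VMA records that fact in the gather. */
void sketch_unmap_vmas(struct sketch_gather *tlb, bool vma_is_pfnmap)
{
    if (vma_is_pfnmap)
        tlb->cleared_pfn_ptes = true;
}

/* Models unmap_region(). Returns true if a TLB entry is still live at the
 * moment the VMA becomes invisible to unmap_mapping_range(), which is the
 * window the report describes. */
bool sketch_unmap_region(bool vma_is_pfnmap, bool flush_nested, bool apply_fix)
{
    struct sketch_gather tlb = { false, flush_nested };
    struct sketch_mm mm = { true, true };
    bool stale_while_hidden;

    sketch_unmap_vmas(&tlb, vma_is_pfnmap);

    /* Proposed conditional flush before free_pgtables(). */
    if (apply_fix && (tlb.cleared_pfn_ptes || tlb.flush_nested))
        sketch_flush_tlb(&mm);

    mm.vma_in_rmap = false;            /* free_pgtables() -> unlink_file_vma() */
    stale_while_hidden = mm.tlb_entry;

    sketch_flush_tlb(&mm);             /* tlb_finish_mmu() */
    return stale_while_hidden;
}
```

With `apply_fix` set, a PFNMAP VMA is flushed before it is hidden, while an ordinary VMA keeps the deferred flush and pays no extra cost, which is the trade-off the proposal aims for.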
This bug is subject to a 90-day disclosure deadline. If a fix for this
issue is made available to users before the end of the 90-day deadline,
this bug report will become public 30 days after the fix was made
available. Otherwise, this bug report will become public at the deadline.
The scheduled deadline is 2022-10-04.
Found by: [email protected]