mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes

Since the 2.6 kernel, the oom killer has been slightly biased away from
CAP_SYS_ADMIN processes by discounting 3% of their memory usage in
comparison to other processes.

This bias has always been implicit, and nothing is known to rely on it.

Gaurav noticed that __task_cred(), reached here through
has_capability_noaudit(), can dereference a potentially freed pointer if
the task under consideration is exiting, because no reference to the
task_struct is held.

Remove the CAP_SYS_ADMIN bias so that all processes are treated equally.

If any CAP_SYS_ADMIN process would like the oom killer to be biased away
from it, it is always allowed to adjust /proc/pid/oom_score_adj.
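
As a rough illustration of the adjustment mentioned above, the sketch below
writes a value to an oom_score_adj file. The helper name write_oom_score_adj()
and its path parameter are hypothetical and not part of this patch. For a real
task the path would be /proc/<pid>/oom_score_adj, the accepted range is -1000
to 1000, and lowering the value generally requires CAP_SYS_RESOURCE.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of this patch): write an oom_score_adj
 * value to the given file. For a real task, path would be
 * /proc/<pid>/oom_score_adj or /proc/self/oom_score_adj.
 * Returns 0 on success, -1 on failure.
 */
static int write_oom_score_adj(const char *path, int adj)
{
	FILE *f;
	int ok;

	/* The kernel only accepts values in [-1000, 1000]. */
	if (adj < -1000 || adj > 1000)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", adj) > 0;

	/* The buffered write is flushed here, so a rejected write shows up now. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A process wanting a lower badness score could call something like
write_oom_score_adj("/proc/self/oom_score_adj", -30); the value -30 is only an
example and is not equivalent to the removed 3% discount.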

Change-Id: Ib5aabf6e1669301e9367b2495d26f21924ae7209
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803071548510.6996@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Git-commit: a62ca4dbf28fc5caad697f2603bcd5acbadce330
Git-repo: http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
David Rientjes
2018-03-13 23:05:16 +00:00
committed by Gerrit - the friendly Code Review server
parent 11a18c0dc1
commit 46270f4b1d

@@ -202,13 +202,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 		atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
 	task_unlock(p);
 
-	/*
-	 * Root processes get 3% bonus, just like the __vm_enough_memory()
-	 * implementation used by LSMs.
-	 */
-	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		points -= (points * 3) / 100;
-
 	/* Normalize to oom_score_adj units */
 	adj *= totalpages / 1000;
 	points += adj;
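
To make the effect of the removed lines concrete, here is a minimal userspace
model of the arithmetic in this hunk. It is a sketch, not kernel code: the
function badness_model() and all of its parameters are hypothetical, and it
covers only the statements visible above (the 3% discount and the
oom_score_adj normalization), not the rest of oom_badness().

```c
#include <stdbool.h>

/*
 * Hypothetical model of the arithmetic in this hunk.
 *   usage_pages      - stands in for the rss + swap + page table sum
 *   oom_score_adj    - the task's oom_score_adj, in [-1000, 1000]
 *   totalpages       - total pages considered by the oom killer
 *   cap_sys_admin    - whether the task holds CAP_SYS_ADMIN
 *   apply_root_bonus - true models the code before this patch,
 *                      false models the code after it
 */
static long badness_model(unsigned long usage_pages, long oom_score_adj,
			  unsigned long totalpages, bool cap_sys_admin,
			  bool apply_root_bonus)
{
	long points = (long)usage_pages;
	long adj = oom_score_adj;

	/* Pre-patch behaviour: CAP_SYS_ADMIN tasks were discounted by 3%. */
	if (apply_root_bonus && cap_sys_admin)
		points -= (points * 3) / 100;

	/* Normalize to oom_score_adj units */
	adj *= (long)(totalpages / 1000);
	points += adj;

	return points;
}
```

Under this model, a CAP_SYS_ADMIN task using 100000 pages on a system with
1000000 total pages and oom_score_adj of 0 scores 97000 before the patch and
100000 after it, the same as a task without the capability.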