523 Commits

Author SHA1 Message Date
Blagovest Kolenichev
f4b8243182 Merge android-4.9.93 (05baf14) into msm-4.9
* refs/heads/tmp-05baf14:
  Linux 4.9.93
  spi: davinci: fix up dma_mapping_error() incorrect patch
  Revert "ip6_vti: adjust vti mtu according to mtu of lower device"
  Revert "mtip32xx: use runtime tag to initialize command header"
  Revert "spi: bcm-qspi: shut up warning about cfi header inclusion"
  Revert "ARM: dts: omap3-n900: Fix the audio CODEC's reset pin"
  Revert "ARM: dts: am335x-pepper: Fix the audio CODEC's reset pin"
  Fix slab name "biovec-(1<<(21-12))"
  net: hns: Fix ethtool private flags
  md/raid10: reset the 'first' at the end of loop
  ARM: dts: am57xx-idk-common: Add overide powerhold property
  ARM: dts: am57xx-beagle-x15-common: Add overide powerhold property
  ARM: dts: dra7: Add power hold and power controller properties to palmas
  Documentation: pinctrl: palmas: Add ti,palmas-powerhold-override property definition
  vt: change SGR 21 to follow the standards
  Input: i8042 - enable MUX on Sony VAIO VGN-CS series to fix touchpad
  Input: i8042 - add Lenovo ThinkPad L460 to i8042 reset list
  Input: ALPS - fix TrackStick detection on Thinkpad L570 and Latitude 7370
  staging: comedi: ni_mio_common: ack ai fifo error interrupts.
  crypto: x86/cast5-avx - fix ECB encryption when long sg follows short one
  crypto: ahash - Fix early termination in hash walk
  parport_pc: Add support for WCH CH382L PCI-E single parallel port card.
  media: usbtv: prevent double free in error case
  mei: remove dev_err message on an unsupported ioctl
  USB: serial: cp210x: add ELDAT Easywave RX09 id
  USB: serial: ftdi_sio: add support for Harman FirmwareHubEmulator
  USB: serial: ftdi_sio: add RT Systems VX-8 cable
  arm64: idmap: Use "awx" flags for .idmap.text .pushsection directives
  arm64: entry: Reword comment about post_ttbr_update_workaround
  arm64: Force KPTI to be disabled on Cavium ThunderX
  arm64: kpti: Add ->enable callback to remap swapper using nG mappings
  arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()
  arm64: Turn on KPTI only on CPUs that need it
  arm64: cputype: Add MIDR values for Cavium ThunderX2 CPUs
  arm64: capabilities: Handle duplicate entries for a capability
  arm64: Allow checking of a CPU-local erratum
  arm64: Take into account ID_AA64PFR0_EL1.CSV3
  arm64: Kconfig: Reword UNMAP_KERNEL_AT_EL0 kconfig entry
  arm64: Kconfig: Add CONFIG_UNMAP_KERNEL_AT_EL0
  arm64: use RET instruction for exiting the trampoline
  arm64: kaslr: Put kernel vectors address in separate data page
  arm64: entry: Add fake CPU feature for unmapping the kernel at EL0
  arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native tasks
  arm64: entry: Hook up entry trampoline to exception vectors
  arm64: entry: Explicitly pass exception level to kernel_ventry macro
  arm64: mm: Map entry trampoline into trampoline and kernel page tables
  arm64: entry: Add exception trampoline page for exceptions from EL0
  module: extend 'rodata=off' boot cmdline parameter to module mappings
  arm64: factor out entry stack manipulation
  arm64: mm: Invalidate both kernel and user ASIDs when performing TLBI
  arm64: mm: Add arm64_kernel_unmapped_at_el0 helper
  arm64: mm: Allocate ASIDs in pairs
  arm64: mm: Move ASID from TTBR0 to TTBR1
  arm64: mm: Use non-global mappings for kernel space
  usb: dwc2: Improve gadget state disconnection handling
  scsi: virtio_scsi: always read VPD pages for multiqueue too
  llist: clang: introduce member_address_is_nonnull()
  Bluetooth: Fix missing encryption refresh on Security Request
  netfilter: x_tables: add and use xt_check_proc_name
  netfilter: bridge: ebt_among: add more missing match size checks
  xfrm: Refuse to insert 32 bit userspace socket policies on 64 bit systems
  net: xfrm: use preempt-safe this_cpu_read() in ipcomp_alloc_tfms()
  RDMA/ucma: Introduce safer rdma_addr_size() variants
  RDMA/ucma: Check that device exists prior to accessing it
  RDMA/ucma: Check that device is connected prior to access it
  RDMA/ucma: Ensure that CM_ID exists prior to access it
  RDMA/ucma: Fix use-after-free access in ucma_close
  RDMA/ucma: Check AF family prior resolving address
  xfrm_user: uncoditionally validate esn replay attribute struct
  mm/vmscan.c: fix unsequenced modification and access warning
  selinux: Remove redundant check for unknown labeling behavior
  arm64: avoid overflow in VA_START and PAGE_OFFSET
  btrfs: Remove extra parentheses from condition in copy_items()
  mac80211: ibss: Fix channel type enum in ieee80211_sta_join_ibss()
  mac80211: Fix clang warning about constant operand in logical operation
  netfilter: ctnetlink: Make some parameters integer to avoid enum mismatch
  HID: sony: Use LED_CORE_SUSPENDRESUME
  cfg80211: Fix array-bounds warning in fragment copy
  nl80211: Fix enum type of variable in nl80211_put_sta_rate()
  xgene_enet: remove bogus forward declarations
  usb: gadget: remove redundant self assignment
  frv: declare jiffies to be located in the .data section
  jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  selinux: Remove unnecessary check of array base in selinux_set_mapping()
  cpumask: Add helper cpumask_available()
  genirq: Use cpumask_available() for check of cpumask variable
  netfilter: nf_nat_h323: fix logical-not-parentheses warning
  Input: mousedev - fix implicit conversion warning
  dm ioctl: remove double parentheses
  PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant
  kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
  partitions/msdos: Unable to mount UFS 44bsd partitions
  powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
  powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
  ipc/shm.c: add split function to shm_vm_ops
  ceph: only dirty ITER_IOVEC pages for direct read
  perf/hwbp: Simplify the perf-hwbp code, fix documentation
  ALSA: pcm: potential uninitialized return values
  ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent()
  ALSA: usb-audio: Add native DSD support for TEAC UD-301
  mtd: jedec_probe: Fix crash in jedec_read_mfr()
  ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
  ANDROID: fuse: Add null terminator to path in canonical path to avoid issue
  ANDROID: sdcardfs: Fix sdcardfs to stop creating cases-sensitive duplicate entries.
  ANDROID: cpufreq: times: skip printing invalid frequencies
  ANDROID: cpufreq: Add time_in_state to /proc/uid directories
  ANDROID: proc: Add /proc/uid directory
  ANDROID: cpufreq: times: track per-uid time in state
  ANDROID: cpufreq: track per-task time in state
  arm64: fix show_data fallout from KERN_CONT changes
  arm: fix show_data fallout from KERN_CONT changes

Conflicts:
	arch/arm64/include/asm/assembler.h
	arch/arm64/include/asm/cputype.h
	arch/arm64/include/asm/sysreg.h
	arch/arm64/kernel/cpufeature.c
	kernel/sched/core.c

Change-Id: If39e1c5577a1c9345b1b2739f4a5368422cef135
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-05-11 03:06:00 -07:00
Blagovest Kolenichev
aa71c72742 Merge android-4.9.88 (bb52bba) into msm-4.9
* refs/heads/tmp-bb52bba:
  Linux 4.9.88
  PCI: dwc: Fix enumeration end when reaching root subordinate
  earlycon: add reg-offset to physical address before mapping
  serial: core: mark port as initialized in autoconfig
  serial: 8250_pci: Add Brainboxes UC-260 4 port serial device
  usb: gadget: f_fs: Fix use-after-free in ffs_fs_kill_sb()
  usb: usbmon: Read text within supplied buffer size
  usb: quirks: add control message delay for 1b1c:1b20
  usbip: vudc: fix null pointer dereference on udc->lock
  USB: storage: Add JMicron bridge 152d:2567 to unusual_devs.h
  staging: android: ashmem: Fix lockdep issue during llseek
  staging: comedi: fix comedi_nsamples_left.
  uas: fix comparison for error code
  tty/serial: atmel: add new version check for usart
  serial: sh-sci: prevent lockup on full TTY buffers
  ASoC: rt5651: Fix regcache sync errors on resume
  ASoC: sgtl5000: Fix suspend/resume
  x86: Treat R_X86_64_PLT32 as R_X86_64_PC32
  x86/module: Detect and skip invalid relocations
  NFS: Fix unstable write completion
  NFS: Fix an incorrect type in struct nfs_direct_req
  scsi: qla2xxx: Replace fcport alloc with qla2x00_alloc_fcport
  ubi: Fix race condition between ubi volume creation and udev
  ext4: inplace xattr block update fails to deduplicate blocks
  netfilter: x_tables: pack percpu counter allocations
  netfilter: x_tables: pass xt_counters struct to counter allocator
  netfilter: x_tables: pass xt_counters struct instead of packet counter
  netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt
  netfilter: bridge: ebt_among: add missing match size checks
  netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets
  netfilter: IDLETIMER: be syzkaller friendly
  netfilter: nat: cope with negative port range
  netfilter: x_tables: fix missing timer initialization in xt_LED
  netfilter: add back stackpointer size checks
  tc358743: fix register i2c_rd/wr function fix
  Input: tca8418_keypad - remove double read of key event register
  ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds
  watchdog: hpwdt: Remove legacy NMI sourcing.
  watchdog: hpwdt: fix unused variable warning
  watchdog: hpwdt: Check source of NMI
  watchdog: hpwdt: SMBIOS check
  x86/paravirt, objtool: Annotate indirect calls
  x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
  x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
  x86/speculation, objtool: Annotate indirect calls/jumps for objtool
  x86/retpoline: Support retpoline builds with Clang
  x86/speculation: Use IBRS if available before calling into firmware
  Revert "x86/retpoline: Simplify vmexit_fill_RSB()"
  nospec: Include <asm/barrier.h> dependency
  nospec: Kill array_index_nospec_mask_check()
  ALSA: hda: add dock and led support for HP ProBook 640 G2
  ALSA: hda: add dock and led support for HP EliteBook 820 G3
  ALSA: seq: More protection for concurrent write and ioctl races
  ALSA: seq: Don't allow resizing pool in use
  ALSA: hda/realtek - Make dock sound work on ThinkPad L570
  ALSA: hda/realtek - Fix dock line-out volume on Dell Precision 7520
  ALSA: hda/realtek: Limit mic boost on T480
  x86/spectre_v2: Don't check microcode versions when running under hypervisors
  perf tools: Fix trigger class trigger_on()
  x86/MCE: Serialize sysfs changes
  bcache: don't attach backing with duplicate UUID
  bcache: fix crashes in duplicate cache device register
  IB/mlx5: Fix incorrect size of klms in the memory region
  kbuild: Handle builtin dtb file names containing hyphens
  KVM: s390: fix memory overwrites when not using SCA entries
  virtio_ring: fix num_free handling in error case
  loop: Fix lost writes caused by missing flag
  Input: matrix_keypad - fix race when disabling interrupts
  MIPS: OCTEON: irq: Check for null return on kzalloc allocation
  MIPS: ath25: Check for kzalloc allocation failure
  MIPS: BMIPS: Do not mask IPIs during suspend
  drm/amdgpu:Always save uvd vcpu_bo in VM Mode
  drm/amdgpu:Correct max uvd handles
  drm/amdgpu: fix KV harvesting
  drm/radeon: fix KV harvesting
  drm/amdgpu: Notify sbios device ready before send request
  drm/amdgpu: Fix deadlock on runtime suspend
  drm/radeon: Fix deadlock on runtime suspend
  drm/nouveau: Fix deadlock on runtime suspend
  drm: Allow determining if current task is output poll worker
  workqueue: Allow retrieval of current task's work struct
  drm/i915: Always call to intel_display_set_init_power() in resume_early.
  scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
  drm/i915: Try EDID bitbanging on HDMI after failed read
  RDMA/mlx5: Fix integer overflow while resizing CQ
  RDMA/ucma: Check that user doesn't overflow QP state
  RDMA/ucma: Limit possible option size
  ANDROID: sdcardfs: fix lock issue on 32 bit/SMP architectures
  UPSTREAM: kasan: add functions for unpoisoning stack variables
  UPSTREAM: kasan: add tests for alloca poisoning
  UPSTREAM: kasan: support alloca() poisoning
  UPSTREAM: kasan/Makefile: support LLVM style asan parameters
  BACKPORT: kasan: add compiler support for clang
  kbuild: fix --gc-sections
  BACKPORT: fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'"
  UPSTREAM: netfilter: xt_bpf: add overflow checks
  UPSTREAM: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
  UPSTREAM: netfilter: xt_bpf: support ebpf
  FROMLIST: f2fs: don't put dentry page in pagecache into highmem

Change-Id: I7f13fedc725fe5333e18e4e5b6639eee27ea1120
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-04-17 10:43:29 -07:00
Greg Hackmann
05baf14727 Merge tag 'v4.9.93' into android-4.9
This is the 4.9.93 stable release

Change-Id: I4293d83f45982c6fd479bddbf9b0f811248ddc30
Signed-off-by: Greg Hackmann <ghackmann@google.com>
2018-04-09 11:39:17 -07:00
Florian Westphal
c6ab7c6ceb netfilter: x_tables: add and use xt_check_proc_name
commit b1d0a5d0cba4597c0394997b2d5fced3e3841b4e upstream.

recent and hashlimit both create /proc files, but only check that
name is 0 terminated.

This can trigger WARN() from procfs when name is "" or "/".
Add helper for this and then use it for both.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08 12:12:50 +02:00
Greg Kroah-Hartman
bb52bba67e Merge 4.9.88 into android-4.9
Changes in 4.9.88
	RDMA/ucma: Limit possible option size
	RDMA/ucma: Check that user doesn't overflow QP state
	RDMA/mlx5: Fix integer overflow while resizing CQ
	drm/i915: Try EDID bitbanging on HDMI after failed read
	scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
	drm/i915: Always call to intel_display_set_init_power() in resume_early.
	workqueue: Allow retrieval of current task's work struct
	drm: Allow determining if current task is output poll worker
	drm/nouveau: Fix deadlock on runtime suspend
	drm/radeon: Fix deadlock on runtime suspend
	drm/amdgpu: Fix deadlock on runtime suspend
	drm/amdgpu: Notify sbios device ready before send request
	drm/radeon: fix KV harvesting
	drm/amdgpu: fix KV harvesting
	drm/amdgpu:Correct max uvd handles
	drm/amdgpu:Always save uvd vcpu_bo in VM Mode
	MIPS: BMIPS: Do not mask IPIs during suspend
	MIPS: ath25: Check for kzalloc allocation failure
	MIPS: OCTEON: irq: Check for null return on kzalloc allocation
	Input: matrix_keypad - fix race when disabling interrupts
	loop: Fix lost writes caused by missing flag
	virtio_ring: fix num_free handling in error case
	KVM: s390: fix memory overwrites when not using SCA entries
	kbuild: Handle builtin dtb file names containing hyphens
	IB/mlx5: Fix incorrect size of klms in the memory region
	bcache: fix crashes in duplicate cache device register
	bcache: don't attach backing with duplicate UUID
	x86/MCE: Serialize sysfs changes
	perf tools: Fix trigger class trigger_on()
	x86/spectre_v2: Don't check microcode versions when running under hypervisors
	ALSA: hda/realtek: Limit mic boost on T480
	ALSA: hda/realtek - Fix dock line-out volume on Dell Precision 7520
	ALSA: hda/realtek - Make dock sound work on ThinkPad L570
	ALSA: seq: Don't allow resizing pool in use
	ALSA: seq: More protection for concurrent write and ioctl races
	ALSA: hda: add dock and led support for HP EliteBook 820 G3
	ALSA: hda: add dock and led support for HP ProBook 640 G2
	nospec: Kill array_index_nospec_mask_check()
	nospec: Include <asm/barrier.h> dependency
	Revert "x86/retpoline: Simplify vmexit_fill_RSB()"
	x86/speculation: Use IBRS if available before calling into firmware
	x86/retpoline: Support retpoline builds with Clang
	x86/speculation, objtool: Annotate indirect calls/jumps for objtool
	x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
	x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
	x86/paravirt, objtool: Annotate indirect calls
	watchdog: hpwdt: SMBIOS check
	watchdog: hpwdt: Check source of NMI
	watchdog: hpwdt: fix unused variable warning
	watchdog: hpwdt: Remove legacy NMI sourcing.
	ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds
	Input: tca8418_keypad - remove double read of key event register
	tc358743: fix register i2c_rd/wr function fix
	netfilter: add back stackpointer size checks
	netfilter: x_tables: fix missing timer initialization in xt_LED
	netfilter: nat: cope with negative port range
	netfilter: IDLETIMER: be syzkaller friendly
	netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets
	netfilter: bridge: ebt_among: add missing match size checks
	netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt
	netfilter: x_tables: pass xt_counters struct instead of packet counter
	netfilter: x_tables: pass xt_counters struct to counter allocator
	netfilter: x_tables: pack percpu counter allocations
	ext4: inplace xattr block update fails to deduplicate blocks
	ubi: Fix race condition between ubi volume creation and udev
	scsi: qla2xxx: Replace fcport alloc with qla2x00_alloc_fcport
	NFS: Fix an incorrect type in struct nfs_direct_req
	NFS: Fix unstable write completion
	x86/module: Detect and skip invalid relocations
	x86: Treat R_X86_64_PLT32 as R_X86_64_PC32
	ASoC: sgtl5000: Fix suspend/resume
	ASoC: rt5651: Fix regcache sync errors on resume
	serial: sh-sci: prevent lockup on full TTY buffers
	tty/serial: atmel: add new version check for usart
	uas: fix comparison for error code
	staging: comedi: fix comedi_nsamples_left.
	staging: android: ashmem: Fix lockdep issue during llseek
	USB: storage: Add JMicron bridge 152d:2567 to unusual_devs.h
	usbip: vudc: fix null pointer dereference on udc->lock
	usb: quirks: add control message delay for 1b1c:1b20
	usb: usbmon: Read text within supplied buffer size
	usb: gadget: f_fs: Fix use-after-free in ffs_fs_kill_sb()
	serial: 8250_pci: Add Brainboxes UC-260 4 port serial device
	serial: core: mark port as initialized in autoconfig
	earlycon: add reg-offset to physical address before mapping
	PCI: dwc: Fix enumeration end when reaching root subordinate
	Linux 4.9.88

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-18 12:07:50 +01:00
Florian Westphal
db6a0cbeb9 netfilter: x_tables: pack percpu counter allocations
commit ae0ac0ed6fcf5af3be0f63eb935f483f44a402d2 upstream.

instead of allocating each xt_counter individually, allocate 4k chunks
and then use these for counter allocation requests.

This should speed up rule evaluation by increasing data locality,
also speeds up ruleset loading because we reduce calls to the percpu
allocator.

As Eric points out we can't use PAGE_SIZE, page_allocator would fail on
arches with 64k page size.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-18 11:18:54 +01:00
Florian Westphal
dac4448faf netfilter: x_tables: pass xt_counters struct to counter allocator
commit f28e15bacedd444608e25421c72eb2cf4527c9ca upstream.

Keeps some noise away from a followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-18 11:18:54 +01:00
Florian Westphal
61346e20c0 netfilter: x_tables: pass xt_counters struct instead of packet counter
commit 4d31eef5176df06f218201bc9c0ce40babb41660 upstream.

On SMP we overload the packet counter (unsigned long) to contain
percpu offset.  Hide this from callers and pass xt_counters address
instead.

Preparation patch to allocate the percpu counters in page-sized batch
chunks.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-18 11:18:54 +01:00
Mohammed Javid
f4530efea1 netfilter: Changes to handle segmentation in SIP ALG
Linux Kernel SIP ALG did not handle Segmented TCP Packets
because of which SIP communication could not be established
for some clients. This change fixes that issue.

Also,This change handle porting of below changes
1)Additional fixes for SIP Segmentation Support
I49fa88eebadb24c7ce03c4d189c71da0fe034e1d
2)Changes to handle MT call issue in SIP ALG
Ia4c83fb18c0e92e8402742f50c0ef05a344e51a4
3)Enable/Disable SIP Segmentation Support
I0e2fdc7dfa43f5d1cc00daa5c3103231c979ac43

Change-Id: I8c77322f69cf4d9ad4c7b4971da924ffd585dea0
Acked-by: Suraj Jaiswal <c_surajj@qti.qualcomm.com>
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
Signed-off-by: Tyler Wear <twear@codeaurora.org>
Signed-off-by: Mohammed Javid <mjavid@codeaurora.org>
2017-11-24 00:21:59 -08:00
Chenbo Feng
875e52672d ANDROID: Add untag hacks to inet_release function
To prevent protential risk of memory leak caused by closing socket with
out untag it from qtaguid module, the qtaguid module now do not hold any
socket file reference count. Instead, it will increase the sk_refcnt of
the sk struct to prevent a reuse of the socket pointer.  And when a socket
is released. It will delete the tag if the socket is previously tagged so
no more resources is held by xt_qtaguid moudle. A flag is added to the untag
process to prevent possible kernel crash caused by fail to delete
corresponding socket_tag_entry list.
Bug: 36374484
Test: compile and run test under system/extra/test/iptables,
      run cts -m CtsNetTestCases -t android.net.cts.SocketRefCntTest

Signed-off-by: Chenbo Feng <fengc@google.com>
Change-Id: Iea7c3bf0c59b9774a5114af905b2405f6bc9ee52
2017-05-09 03:03:11 +00:00
JP Abgrall
b702927cdd ANDROID: netfilter: adding the original quota2 from xtables-addons
The original xt_quota in the kernel is plain broken:
  - counts quota at a per CPU level
    (was written back when ubiquitous SMP was just a dream)
  - provides no way to count across IPV4/IPV6.

This patch is the original unaltered code from:
  http://sourceforge.net/projects/xtables-addons

  at commit e84391ce665cef046967f796dd91026851d6bbf3

Change-Id: I19d49858840effee9ecf6cff03c23b45a97efdeb
Signed-off-by: JP Abgrall <jpa@google.com>
2017-01-19 13:32:10 -08:00
JP Abgrall
baf0db430a ANDROID: netfilter: add xt_qtaguid matching module
This module allows tracking stats at the socket level for given UIDs.
It replaces xt_owner.
If the --uid-owner is not specified, it will just count stats based on
who the skb belongs to. This will even happen on incoming skbs as it
looks into the skb via xt_socket magic to see who owns it.
If an skb is lost, it will be assigned to uid=0.

To control what sockets of what UIDs are tagged by what, one uses:
  echo t $sock_fd $accounting_tag $the_billed_uid \
     > /proc/net/xt_qtaguid/ctrl
 So whenever an skb belongs to a sock_fd, it will be accounted against
   $the_billed_uid
  and matching stats will show up under the uid with the given
   $accounting_tag.

Because the number of allocations for the stats structs is not that big:
  ~500 apps * 32 per app
we'll just do it atomic. This avoids walking lists many times, and
the fancy worker thread handling. Slabs will grow when needed later.

It use netdevice and inetaddr notifications instead of hooks in the core dev
code to track when a device comes and goes. This removes the need for
exposed iface_stat.h.

Put procfs dirs in /proc/net/xt_qtaguid/
  ctrl
  stats
  iface_stat/<iface>/...
The uid stats are obtainable in ./stats.

Change-Id: I01af4fd91c8de651668d3decb76d9bdc1e343919
Signed-off-by: JP Abgrall <jpa@google.com>
2017-01-19 13:32:09 -08:00
Florian Westphal
8e8118f893 netfilter: conntrack: remove packet hotpath stats
These counters sit in hot path and do show up in perf, this is especially
true for 'found' and 'searched' which get incremented for every packet
processed.

Information like

searched=212030105
new=623431
found=333613
delete=623327

does not seem too helpful nowadays:

- on busy systems found and searched will overflow every few hours
(these are 32bit integers), other more busy ones every few days.

- for debugging there are better methods, such as iptables' trace target,
the conntrack log sysctls.  Nowadays we also have perf tool.

This removes packet path stat counters except those that
are expected to be 0 (or close to 0) on a normal system, e.g.
'insert_failed' (race happened) or 'invalid' (proto tracker rejects).

The insert stat is retained for the ctnetlink case.
The found stat is retained for the tuple-is-taken check when NAT has to
determine if it needs to pick a different source address.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-12 19:59:39 +02:00
Gao Feng
c579a9e7d5 netfilter: gre: Use consistent GRE and PTTP header structure instead of the ones defined by netfilter
There are two existing strutures which defines the GRE and PPTP header.
So use these two structures instead of the ones defined by netfilter to
keep consitent with other codes.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07 10:36:52 +02:00
Gao Feng
ecc6569f35 netfilter: gre: Use consistent GRE_* macros instead of ones defined by netfilter.
There are already some GRE_* macros in kernel, so it is unnecessary
to define these macros. And remove some useless macros

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07 10:36:48 +02:00
Liping Zhang
aca300183e netfilter: nfnetlink_acct: report overquota to the right netns
We should report the over quota message to the right net namespace
instead of the init netns.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-18 00:38:23 +02:00
Florian Westphal
f4dc77713f netfilter: x_tables: speed up jump target validation
The dummy ruleset I used to test the original validation change was broken,
most rules were unreachable and were not tested by mark_source_chains().

In some cases rulesets that used to load in a few seconds now require
several minutes.

sample ruleset that shows the behaviour:

echo "*filter"
for i in $(seq 0 100000);do
        printf ":chain_%06x - [0:0]\n" $i
done
for i in $(seq 0 100000);do
   printf -- "-A INPUT -j chain_%06x\n" $i
   printf -- "-A INPUT -j chain_%06x\n" $i
   printf -- "-A INPUT -j chain_%06x\n" $i
done
echo COMMIT

[ pipe result into iptables-restore ]

This ruleset will be about 74mbyte in size, with ~500k searches
though all 500k[1] rule entries. iptables-restore will take forever
(gave up after 10 minutes)

Instead of always searching the entire blob for a match, fill an
array with the start offsets of every single ipt_entry struct,
then do a binary search to check if the jump target is present or not.

After this change ruleset restore times get again close to what one
gets when reverting 3647234101 (~3 seconds on my workstation).

[1] every user-defined rule gets an implicit RETURN, so we get
300k jumps + 100k userchains + 100k returns -> 500k rule entries

Fixes: 3647234101 ("netfilter: x_tables: validate targets of jumps")
Reported-by: Jeff Wu <wujiafu@gmail.com>
Tested-by: Jeff Wu <wujiafu@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-18 21:35:23 +02:00
Joe Perches
c37a2dfa67 netfilter: Convert FWINV<[foo]> macros and uses to NF_INVF
netfilter uses multiple FWINV #defines with identical form that hide a
specific structure variable and dereference it with a invflags member.

$ git grep "#define FWINV"
include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg))
net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg))
net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg)))
net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg)))
net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg)))
net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg)))

Consolidate these macros into a single NF_INVF macro.

Miscellanea:

o Neaten the alignment around these uses
o A few lines are > 80 columns for intelligibility

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-03 10:55:07 +02:00
Pablo Neira Ayuso
92b4423e3a netfilter: fix IS_ERR_VALUE usage
This is a forward-port of the original patch from Andrzej Hajda,
he said:

"IS_ERR_VALUE should be used only with unsigned long type.
Otherwise it can work incorrectly. To achieve this function
xt_percpu_counter_alloc is modified to return unsigned long,
and its result is assigned to temporary variable to perform
error checking, before assigning to .pcnt field.

The patch follows conclusion from discussion on LKML [1][2].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2120927
[2]: http://permalink.gmane.org/gmane.linux.kernel/2150581"

Original patch from Andrzej is here:

http://patchwork.ozlabs.org/patch/582970/

This patch has clashed with input validation fixes for x_tables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-29 11:02:33 +02:00
David S. Miller
11afbff861 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, mostly from Florian Westphal to sort out the lack of sufficient
validation in x_tables and connlabel preparation patches to add
nf_tables support. They are:

1) Ensure we don't go over the ruleset blob boundaries in
   mark_source_chains().

2) Validate that target jumps land on an existing xt_entry. This extra
   sanitization comes with a performance penalty when loading the ruleset.

3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables.

4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables.

5) Make sure the minimal possible target size in x_tables.

6) Similar to #3, add xt_compat_check_entry_offsets() for compat code.

7) Check that standard target size is valid.

8) More sanitization to ensure that the target_offset field is correct.

9) Add xt_check_entry_match() to validate that matches are well-formed.

10-12) Three patch to reduce the number of parameters in
    translate_compat_table() for {arp,ip,ip6}tables by using a container
    structure.

13) No need to return value from xt_compat_match_from_user(), so make
    it void.

14) Consolidate translate_table() so it can be used by compat code too.

15) Remove obsolete check for compat code, so we keep consistent with
    what was already removed in the native layout code (back in 2007).

16) Get rid of target jump validation from mark_source_chains(),
    obsoleted by #2.

17) Introduce xt_copy_counters_from_user() to consolidate counter
    copying, and use it from {arp,ip,ip6}tables.

18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump
    functions.

19) Move nf_connlabel_match() to xt_connlabel.

20) Skip event notification if connlabel did not change.

21) Update of nf_connlabels_get() to make the upcoming nft connlabel
    support easier.

23) Remove spinlock to read protocol state field in conntrack.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 00:12:08 -04:00
Nicolas Dichtel
e9bbe898cb libnl: nla_put_net64(): align on a 64-bit area
nla_data() is now aligned on a 64-bit area.

The temporary function nla_put_be64_32bit() is removed in this patch.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23 20:13:24 -04:00
Florian Westphal
d7591f0c41 netfilter: x_tables: introduce and use xt_copy_counters_from_user
The three variants use same copy&pasted code, condense this into a
helper and use that.

Make sure info.name is 0-terminated.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14 00:30:41 +02:00
Florian Westphal
0188346f21 netfilter: x_tables: xt_compat_match_from_user doesn't need a retval
Always returned 0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14 00:30:40 +02:00
Florian Westphal
ce683e5f9d netfilter: x_tables: check for bogus target offset
We're currently asserting that targetoff + targetsize <= nextoff.

Extend it to also check that targetoff is >= sizeof(xt_entry).
Since this is generic code, add an argument pointing to the start of the
match/target, we can then derive the base structure size from the delta.

We also need the e->elems pointer in a followup change to validate matches.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14 00:30:37 +02:00
Florian Westphal
fc1221b3a1 netfilter: x_tables: add compat version of xt_check_entry_offsets
32bit rulesets have different layout and alignment requirements, so once
more integrity checks get added to xt_check_entry_offsets it will reject
well-formed 32bit rulesets.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14 00:30:36 +02:00
Florian Westphal
7d35812c32 netfilter: x_tables: add and use xt_check_entry_offsets
Currently arp/ip and ip6tables each implement a short helper to check that
the target offset is large enough to hold one xt_entry_target struct and
that t->u.target_size fits within the current rule.

Unfortunately these checks are not sufficient.

To avoid adding new tests to all of ip/ip6/arptables move the current
checks into a helper, then extend this helper in followup patches.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-14 00:30:35 +02:00
Vishwanath Pai
596cf3fe58 netfilter: ipset: fix race condition in ipset save, swap and delete
This fix adds a new reference counter (ref_netlink) for the struct ip_set.
The other reference counter (ref) can be swapped out by ip_set_swap and we
need a separate counter to keep track of references for netlink events
like dump. Using the same ref counter for dump causes a race condition
which can be demonstrated by the following script:

ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
counters
ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
counters
ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
counters

ipset save &

ipset swap hash_ip3 hash_ip2
ipset destroy hash_ip3 /* will crash the machine */

Swap will exchange the values of ref so destroy will see ref = 0 instead of
ref = 1. With this fix in place swap will not succeed because ipset save
still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).

Both delete and swap will error out if ref_netlink != 0 on the set.

Note: The changes to *_head functions is because previously we would
increment ref whenever we called these functions, we don't do that
anymore.

Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:57:45 +02:00
Florian Westphal
b9e69e1273 netfilter: xtables: don't hook tables by default
delay hook registration until the table is being requested inside a
namespace.

Historically, a particular table (iptables mangle, ip6tables filter, etc)
was registered on module load.

When netns support was added to iptables only the ip/ip6tables ruleset was
made namespace aware, not the actual hook points.

This means f.e. that when ipt_filter table/module is loaded on a system,
then each namespace on that system has an (empty) iptables filter ruleset.

In other words, if a namespace sends a packet, such skb is 'caught' by
netfilter machinery and fed to hooking points for that table (i.e. INPUT,
FORWARD, etc).

Thanks to Eric Biederman, hooks are no longer global, but per namespace.

This means that we can avoid allocation of empty ruleset in a namespace and
defer hook registration until we need the functionality.

We register a tables hook entry points ONLY in the initial namespace.
When an iptables get/setockopt is issued inside a given namespace, we check
if the table is found in the per-namespace list.

If not, we attempt to find it in the initial namespace, and, if found,
create an empty default table in the requesting namespace and register the
needed hooks.

Hook points are destroyed only once namespace is deleted, there is no
'usage count' (it makes no sense since there is no 'remove table' operation
in xtables api).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02 20:05:24 +01:00
Florian Westphal
905f0a739a nfnetlink: remove nfnetlink_alloc_skb
Following mmapped netlink removal this code can be simplified by
removing the alloc wrapper.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18 11:42:19 -05:00
Pablo Neira Ayuso
5913beaf0d netfilter: nfnetlink: pass down netns pointer to commit() and abort() callbacks
Adapt callsites to avoid recurrent lookup of the netns pointer.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-28 18:43:15 +01:00
Pablo Neira Ayuso
7b8002a151 netfilter: nfnetlink: pass down netns pointer to call() and call_rcu()
Adapt callsites to avoid recurrent lookup of the netns pointer.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-28 18:41:41 +01:00
David S. Miller
59ce9670ce Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains the first batch of Netfilter updates for
the upcoming 4.5 kernel. This batch contains userspace netfilter header
compilation fixes, support for packet mangling in nf_tables, the new
tracing infrastructure for nf_tables and cgroup2 support for iptables.
More specifically, they are:

1) Two patches to include dependencies in our netfilter userspace
   headers to resolve compilation problems, from Mikko Rapeli.

2) Four comestic cleanup patches for the ebtables codebase, from Ian Morris.

3) Remove duplicate include in the netfilter reject infrastructure,
   from Stephen Hemminger.

4) Two patches to simplify the netfilter defragmentation code for IPv6,
   patch from Florian Westphal.

5) Fix root ownership of /proc/net netfilter for unpriviledged net
   namespaces, from Philip Whineray.

6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal.

7) Add mangling support to our nf_tables payload expression, from
   Patrick McHardy.

8) Introduce a new netlink-based tracing infrastructure for nf_tables,
   from Florian Westphal.

9) Change setter functions in nfnetlink_log to be void, from
    Rami Rosen.

10) Add netns support to the cttimeout infrastructure.

11) Add cgroup2 support to iptables, from Tejun Heo.

12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian.

13) Add support for mangling pkttype in the nf_tables meta expression,
    also from Florian.

BTW, I need that you pull net into net-next, I have another batch that
requires changes that I don't yet see in net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-18 15:37:42 -05:00
Pablo Neira Ayuso
633c9a840d netfilter: nfnetlink: avoid recurrent netns lookups in call_batch
Pass the net pointer to the call_batch callback functions so we can skip
recurrent lookups.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-12-10 13:49:24 +01:00
Marcelo Ricardo Leitner
f7ccdb96fa netfilter: nf_ct_sctp: move ip_ct_sctp away from UAPI
ip_ct_sctp is an internal structure, embedded by the union
nf_conntrack_proto to store sctp-specific information at conntrack
entries. It has no business with UAPI.

This patch moves it from UAPI to a saner place, together with similar
structs for other protocols.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-11-23 17:54:42 +01:00
Jozsef Kadlecsik
95ad1f4a93 netfilter: ipset: Fix extension alignment
The data extensions in ipset lacked the proper memory alignment and
thus could lead to kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.

Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-11-07 11:21:47 +01:00
Yaowei Bai
875e082949 net/nfnetlink: lockdep_nfnl_is_held can be boolean
This patch makes lockdep_nfnl_is_held return bool to improve
readability due to this particular function only using either
one or zero as its return value.

No functional change.

Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:49:00 -07:00
Eric W. Biederman
156c196f60 netfilter: x_tables: Pass struct net in xt_action_param
As xt_action_param lives on the stack this does not bloat any
persistent data structures.

This is a first step in making netfilter code that needs to know
which network namespace it is executing in simpler.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18 21:58:14 +02:00
Daniel Borkmann
62da98656b netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
Fengguang reported, that some randconfig generated the following linker
issue with nf_ct_zone_dflt object involved:

  [...]
  CC      init/version.o
  LD      init/built-in.o
  net/built-in.o: In function `ipv4_conntrack_defrag':
  nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
  net/built-in.o: In function `ipv6_defrag':
  nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
  make: *** [vmlinux] Error 1

Given that configurations exist where we have a built-in part, which is
accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
area when netfilter is configured in general.

Therefore, split the more generic parts into a common header under
include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
section that already holds parts related to CONFIG_NF_CONNTRACK in the
netfilter core. This fixes the issue on my side.

Fixes: 308ac9143e ("netfilter: nf_conntrack: push zone object into functions")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-02 16:32:56 -07:00
Andreas Schultz
3499abb249 netfilter: nfacct: per network namespace support
- Move the nfnl_acct_list into the network namespace, initialize
  and destroy it per namespace
- Keep track of refcnt on nfacct objects, the old logic does not
  longer work with a per namespace list
- Adjust xt_nfacct to pass the namespace when registring objects

Signed-off-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-08-07 11:50:56 +02:00
Florian Westphal
dcebd3153e netfilter: add and use jump label for xt_tee
Don't bother testing if we need to switch to alternate stack
unless TEE target is used.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-15 18:18:06 +02:00
Florian Westphal
7814b6ec6d netfilter: xtables: don't save/restore jumpstack offset
In most cases there is no reentrancy into ip/ip6tables.

For skbs sent by REJECT or SYNPROXY targets, there is one level
of reentrancy, but its not relevant as those targets issue an absolute
verdict, i.e. the jumpstack can be clobbered since its not used
after the target issues absolute verdict (ACCEPT, DROP, STOLEN, etc).

So the only special case where it is relevant is the TEE target, which
returns XT_CONTINUE.

This patch changes ip(6)_do_table to always use the jump stack starting
from 0.

When we detect we're operating on an skb sent via TEE (percpu
nf_skb_duplicated is 1) we switch to an alternate stack to leave
the original one alone.

Since there is no TEE support for arptables, it doesn't need to
test if tee is active.

The jump stack overflow tests are no longer needed as well --
since ->stacksize is the largest call depth we cannot exceed it.

A much better alternative to the external jumpstack would be to just
declare a jumps[32] stack on the local stack frame, but that would mean
we'd have to reject iptables rulesets that used to work before.

Another alternative would be to start rejecting rulesets with a larger
call depth, e.g. 1000 -- in this case it would be feasible to allocate the
entire stack in the percpu area which would avoid one dereference.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-15 18:18:06 +02:00
Florian Westphal
dcb8f5c813 netfilter: xtables: fix warnings on 32bit platforms
On 32bit archs gcc complains due to cast from void* to u64.
Add intermediate casts to long to silence these warnings.

include/linux/netfilter/x_tables.h:376:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
include/linux/netfilter/x_tables.h:384:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
include/linux/netfilter/x_tables.h:391:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
include/linux/netfilter/x_tables.h:400:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

Fixes: 71ae0dff02 ("netfilter: xtables: use percpu rule counters")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18 21:14:33 +02:00
Eric Dumazet
a1a56aaa07 netfilter: x_tables: align per cpu xt_counter
Let's force a 16 bytes alignment on xt_counter percpu allocations,
so that bytes and packets sit in same cache line.

xt_counter being exported to user space, we cannot add __align(16) on
the structure itself.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18 13:05:45 +02:00
Eric Dumazet
711bdde6a8 netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference.
After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore :
Only one copy of table is kept, instead of one copy per cpu.

We also can avoid a dereference if we put table data right after
xt_table_info. It reduces register pressure and helps compiler.

Then, we attempt a kmalloc() if total size is under order-3 allocation,
to reduce TLB pressure, as in many cases, rules fit in 32 KB.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 20:19:20 +02:00
Jozsef Kadlecsik
ca0f6a5cd9 netfilter: ipset: Fix coding styles reported by checkpatch.pl
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:18 +02:00
Jozsef Kadlecsik
b57b2d1fa5 netfilter: ipset: Prepare the ipset core to use RCU at set level
Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:16 +02:00
Jozsef Kadlecsik
c4c997839c netfilter: ipset: Fix parallel resizing and listing of the same set
When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:15 +02:00
Jozsef Kadlecsik
f690cbaed9 netfilter: ipset: Fix cidr handling for hash:*net* types
Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:14 +02:00
Jozsef Kadlecsik
aaeb6e24f5 netfilter: ipset: Use MSEC_PER_SEC consistently
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:12 +02:00
Florian Westphal
482cfc3185 netfilter: xtables: avoid percpu ruleset duplication
We store the rule blob per (possible) cpu.  Unfortunately this means we can
waste lot of memory on big smp machines. ipt_entry structure ('rule head')
is 112 byte, so e.g. with maxcpu=64 one single rule eats
close to 8k RAM.

Since previous patch made counters percpu it appears there is nothing
left in the rule blob that needs to be percpu.

On my test system (144 possible cpus, 400k dummy rules) this
change saves close to 9 Gigabyte of RAM.

Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:27:10 +02:00