Commit Graph

343 Commits

Author SHA1 Message Date
Kyle Yan
ff97938fbf Merge remote-tracking branch '4.9/tmp-8dd0f52' into msm-4.9
* 4.9/tmp-8dd0f52:
  Linux 4.9.72
  sparc32: Export vac_cache_size to fix build error
  bpf: fix incorrect sign extension in check_alu_op()
  bpf: reject out-of-bounds stack pointer calculation
  bpf: fix branch pruning logic
  bpf: adjust insn_aux_data when patching insns
  Revert "Bluetooth: btusb: driver to enable the usb-wakeup feature"
  platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
  MIPS: math-emu: Fix final emulation phase for certain instructions
  thermal/drivers/hisi: Fix multiple alarm interrupts firing
  thermal/drivers/hisi: Simplify the temperature/step computation
  thermal/drivers/hisi: Fix kernel panic on alarm interrupt
  thermal/drivers/hisi: Fix missing interrupt enablement
  thermal: hisilicon: Handle return value of clk_prepare_enable
  cpuidle: fix broadcast control when broadcast can not be entered
  rtc: set the alarm to the next expiring timer
  tcp: fix under-evaluated ssthresh in TCP Vegas
  clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision
  staging: greybus: light: Release memory obtained by kasprintf
  net: ipv6: send NS for DAD when link operationally up
  fm10k: ensure we process SM mbx when processing VF mbx
  vfio/pci: Virtualize Maximum Payload Size
  scsi: lpfc: PLOGI failures during NPIV testing
  scsi: lpfc: Fix secure firmware updates
  fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw
  ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback
  tracing: Exclude 'generic fields' from histograms
  PCI/AER: Report non-fatal errors only to the affected endpoint
  IB/rxe: check for allocation failure on elem
  ixgbe: fix use of uninitialized padding
  igb: check memory allocation failure
  PM / OPP: Move error message to debug level
  PCI: Create SR-IOV virtfn/physfn links before attaching driver
  scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive
  scsi: cxgb4i: fix Tx skb leak
  PCI: Avoid bus reset if bridge itself is broken
  net: phy: at803x: Change error to EINVAL for invalid MAC
  kvm, mm: account kvm related kmem slabs to kmemcg
  rtc: pl031: make interrupt optional
  crypto: crypto4xx - increase context and scatter ring buffer elements
  backlight: pwm_bl: Fix overflow condition
  bnxt_en: Fix NULL pointer dereference in reopen failure path
  cpuidle: powernv: Pass correct drv->cpumask for registration
  ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory
  Btrfs: fix an integer overflow check
  netfilter: nfnetlink_queue: fix secctx memory leak
  xhci: plat: Register shutdown for xhci_plat
  net: moxa: fix TX overrun memory leak
  isdn: kcapi: avoid uninitialized data
  virtio_balloon: prevent uninitialized variable use
  virtio-balloon: use actual number of stats for stats queue buffers
  KVM: pci-assign: do not map smm memory slot pages in vt-d page tables
  net: ipconfig: fix ic_close_devs() use-after-free
  cpufreq: Fix creation of symbolic links to policy directories
  ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
  netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to register
  netfilter: nfnl_cthelper: fix a race when walk the nf_ct_helper_hash table
  irda: vlsi_ir: fix check for DMA mapping errors
  RDMA/iser: Fix possible mr leak on device removal event
  i40e: Do not enable NAPI on q_vectors that have no rings
  IB/rxe: increment msn only when completing a request
  IB/rxe: double free on error
  net: Do not allow negative values for busy_read and busy_poll sysctl interfaces
  nbd: set queue timeout properly
  infiniband: Fix alignment of mmap cookies to support VIPT caching
  IB/core: Protect against self-requeue of a cq work item
  i40iw: Receive netdev events post INET_NOTIFIER state
  bna: avoid writing uninitialized data into hw registers
  s390/qeth: no ETH header for outbound AF_IUCV
  s390/qeth: size calculation outbound buffers
  r8152: prevent the driver from transmitting packets with carrier off
  ASoC: STI: Fix reader substream pointer set
  HID: xinmo: fix for out of range for THT 2P arcade controller.
  hwmon: (asus_atk0110) fix uninitialized data access
  ARM: dts: ti: fix PCI bus dtc warnings
  KVM: VMX: Fix enable VPID conditions
  KVM: x86: correct async page present tracepoint
  kvm: vmx: Flush TLB when the APIC-access address changes
  scsi: lpfc: Fix PT2PT PRLI reject
  pinctrl: st: add irq_request/release_resources callbacks
  inet: frag: release spinlock before calling icmp_send()
  tipc: fix nametbl deadlock at tipc_nametbl_unsubscribe
  r8152: fix the rx early size of RTL8153
  iommu/exynos: Workaround FLPD cache flush issues for SYSMMU v5
  netfilter: nfnl_cthelper: Fix memory leak
  netfilter: nfnl_cthelper: fix runtime expectation policy updates
  usb: gadget: udc: remove pointer dereference after free
  usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed
  hwmon: (max31790) Set correct PWM value
  net: qmi_wwan: Add USB IDs for MDM6600 modem on Motorola Droid 4
  sctp: out_qlen should be updated when pruning unsent queue
  bna: integer overflow bug in debugfs
  sch_dsmark: fix invalid skb_cow() usage
  vsock: cancel packets when failing to connect
  vhost-vsock: add pkt cancel capability
  vsock: track pkt owner vsock
  crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex
  r8152: fix the list rx_done may be used without initialization
  cpuidle: Validate cpu_dev in cpuidle_add_sysfs()
  nvme-loop: handle cpu unplug when re-establishing the controller
  arm: kprobes: Align stack to 8-bytes in test code
  arm: kprobes: Fix the return address of multiple kretprobes
  HID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair
  HID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB
  kvm: fix usage of uninit spinlock in avic_vm_destroy()
  ALSA: hda - add support for docking station for HP 840 G3
  ALSA: hda - add support for docking station for HP 820 G2
  arm64: Initialise high_memory global variable earlier
  cxl: Check if vphb exists before iterating over AFU devices
  Linux 4.9.71
  ath9k: fix tx99 potential info leak
  icmp: don't fail on fragment reassembly time exceeded
  IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
  RDMA/cma: Avoid triggering undefined behavior
  macvlan: Only deliver one copy of the frame to the macvlan interface
  udf: Avoid overflow when session starts at large offset
  scsi: bfa: integer overflow in debugfs
  scsi: sd: change allow_restart to bool in sysfs interface
  scsi: sd: change manage_start_stop to bool in sysfs interface
  rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd
  rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd
  vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend
  IB/core: Fix calculation of maximum RoCE MTU
  scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry
  raid5: Set R5_Expanded on parity devices as well as data.
  pinctrl: adi2: Fix Kconfig build problem
  usb: musb: da8xx: fix babble condition handling
  tty fix oops when rmmod 8250
  soc: mediatek: pwrap: fix compiler errors
  powerpc/perf/hv-24x7: Fix incorrect comparison in memord
  scsi: hpsa: destroy sas transport properties before scsi_host
  scsi: hpsa: cleanup sas_phy structures in sysfs when unloading
  PCI: Detach driver before procfs & sysfs teardown on device remove
  RDMA/cxgb4: Declare stag as __be32
  xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
  xfs: fix log block underflow during recovery cycle verification
  l2tp: cleanup l2tp_tunnel_delete calls
  nvme: use kref_get_unless_zero in nvme_find_get_ns
  platform/x86: hp_accel: Add quirk for HP ProBook 440 G4
  btrfs: tests: Fix a memory leak in error handling path in 'run_test()'
  arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27
  Ib/hfi1: Return actual operational VLs in port info query
  bcache: fix wrong cache_misses statistics
  bcache: explicitly destroy mutex while exiting
  GFS2: Take inode off order_write list when setting jdata flag
  scsi: scsi_debug: write_same: fix error report
  thermal/drivers/step_wise: Fix temperature regulation misbehavior
  ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod
  ppp: Destroy the mutex when cleanup
  clk: tegra: Fix cclk_lp divisor register
  clk: hi6220: mark clock cs_atb_syspll as critical
  clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU
  clk: mediatek: add the option for determining PLL source clock
  mm: Handle 0 flags in _calc_vm_trans() macro
  crypto: tcrypt - fix buffer lengths in test_aead_speed()
  arm-ccn: perf: Prevent module unload while PMU is in use
  xfs: truncate pagecache before writeback in xfs_setattr_size()
  iommu/amd: Limit the IOVA page range to the specified addresses
  badblocks: fix wrong return value in badblocks_set if badblocks are disabled
  target/file: Do not return error for UNMAP if length is zero
  target:fix condition return in core_pr_dump_initiator_port()
  iscsi-target: fix memory leak in lio_target_tiqn_addtpg()
  target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
  platform/x86: intel_punit_ipc: Fix resource ioremap warning
  powerpc/ipic: Fix status get and status clear
  powerpc/opal: Fix EBUSY bug in acquiring tokens
  netfilter: ipvs: Fix inappropriate output of procfs
  iommu/mediatek: Fix driver name
  PCI: Do not allocate more buses than available in parent
  powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo
  PCI/PME: Handle invalid data when reading Root Status
  dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
  ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case
  rtc: pcf8563: fix output clock rate
  video: fbdev: au1200fb: Return an error code if a memory allocation fails
  video: fbdev: au1200fb: Release some resources if a memory allocation fails
  video: udlfb: Fix read EDID timeout
  fbdev: controlfb: Add missing modes to fix out of bounds access
  sfc: don't warn on successful change of MAC
  HID: cp2112: fix broken gpio_direction_input callback
  Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
  target: fix race during implicit transition work flushes
  target: fix ALUA transition timeout handling
  target: Use system workqueue for ALUA transitions
  btrfs: add missing memset while reading compressed inline extents
  NFSv4.1 respect server's max size in CREATE_SESSION
  efi/esrt: Cleanup bad memory map log messages
  perf symbols: Fix symbols__fixup_end heuristic for corner cases
  tty: fix data race in tty_ldisc_ref_wait()
  tty: don't panic on OOM in tty_set_ldisc()
  rxrpc: Ignore BUSY packets on old calls
  net: mpls: Fix nexthop alive tracking on down events
  net/mlx4_core: Avoid delays during VF driver device shutdown
  nvmet-rdma: Fix a possible uninitialized variable dereference
  nvmet: confirm sq percpu has scheduled and switched to atomic
  nvme-loop: fix a possible use-after-free when destroying the admin queue
  afs: Fix abort on signal while waiting for call completion
  afs: Fix afs_kill_pages()
  afs: Fix page leak in afs_write_begin()
  afs: Populate and use client modification time
  afs: Better abort and net error handling
  afs: Invalid op ID should abort with RXGEN_OPCODE
  afs: Fix the maths in afs_fs_store_data()
  afs: Prevent callback expiry timer overflow
  afs: Migrate vlocation fields to 64-bit
  afs: Flush outstanding writes when an fd is closed
  afs: Deal with an empty callback array
  afs: Adjust mode bits processing
  afs: Populate group ID from vnode status
  afs: Fix missing put_page()
  drm/radeon: reinstate oland workaround for sclk
  mmc: mediatek: Fixed bug where clock frequency could be set wrong
  sched/deadline: Use deadline instead of period when calculating overflow
  sched/deadline: Throttle a constrained deadline task activated after the deadline
  sched/deadline: Make sure the replenishment timer fires in the next period
  sched/deadline: Add missing update_rq_clock() in dl_task_timer()
  iwlwifi: mvm: cleanup pending frames in DQA mode
  Drivers: hv: util: move waiting for release to hv_utils_transport itself
  drm/radeon/si: add dpm quirk for Oland
  fjes: Fix wrong netdevice feature flags
  scsi: hpsa: do not timeout reset operations
  scsi: hpsa: limit outstanding rescans
  scsi: hpsa: update check for logical volume status
  ASoC: rcar: clear DE bit only in PDMACHCR when it stops
  openrisc: fix issue handling 8 byte get_user calls
  intel_th: pci: Add Gemini Lake support
  drm: amd: remove broken include path
  qed: Fix interrupt flags on Rx LL2
  qed: Fix mapping leak on LL2 rx flow
  qed: Align CIDs according to DORQ requirement
  mlxsw: reg: Fix SPVMLR max record count
  mlxsw: reg: Fix SPVM max record count
  net: Resend IGMP memberships upon peer notification.
  irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN
  dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
  net: wimax/i2400m: fix NULL-deref at probe
  writeback: fix memory leak in wb_queue_work()
  blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
  ASoC: rsnd: fix sound route path when using SRC6/SRC9
  netfilter: bridge: honor frag_max_size when refragmenting
  drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
  Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
  NFSD: fix nfsd_reset_versions for NFSv4.
  NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
  drm/amdgpu: fix parser init error path to avoid crash in parser fini
  iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
  net/mlx5: Don't save PCI state when PCI error is detected
  net/mlx5: Fix create autogroup prev initializer
  rxrpc: Wake up the transmitter if Rx window size increases on the peer
  net: bcmgenet: Power up the internal PHY before probing the MII
  net: bcmgenet: synchronize irq0 status between the isr and task
  net: bcmgenet: power down internal phy if open or resume fails
  net: bcmgenet: reserved phy revisions must be checked first
  net: bcmgenet: correct MIB access of UniMAC RUNT counters
  net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values
  bnxt_en: Ignore 0 value in autoneg supported speed from firmware.
  net: initialize msg.msg_flags in recvfrom
  userfaultfd: selftest: vm: allow to build in vm/ directory
  userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
  md-cluster: free md_cluster_info if node leave cluster
  usb: xhci-mtk: check hcc_params after adding primary hcd
  KVM: nVMX: do not warn when MSR bitmap address is not backed
  usb: phy: isp1301: Add OF device ID table
  mac80211: Fix addition of mesh configuration element
  ext4: fix crash when a directory's i_size is too small
  ext4: fix fdatasync(2) after fallocate(2) operation
  dmaengine: dmatest: move callback wait queue to thread context
  eeprom: at24: change nvmem stride to 1
  sched/rt: Do not pull from current CPU if only one CPU to pull
  nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
  xhci: Don't add a virt_dev to the devs array before it's fully allocated
  Bluetooth: btusb: driver to enable the usb-wakeup feature
  usb: xhci: fix TDS for MTK xHCI1.1
  ceph: drop negative child dentries before try pruning inode's alias
  usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
  usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
  usb: add helper to extract bits 12:11 of wMaxPacketSize
  usbip: fix stub_rx: get_pipe() to validate endpoint number
  USB: core: prevent malicious bNumInterfaces overflow
  USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
  tracing: Allocate mask_str buffer dynamically
  autofs: fix careless error in recent commit
  crypto: salsa20 - fix blkcipher_walk API usage
  crypto: hmac - require that the underlying hash algorithm is unkeyed
  crypto: rsa - fix buffer overread when stripping leading zeroes
  mfd: fsl-imx25: Clean up irq settings during removal
  Linux 4.9.70
  RDMA/cxgb4: Annotate r2 and stag as __be32
  md: free unused memory after bitmap resize
  audit: ensure that 'audit=1' actually enables audit for PID 1
  ipvlan: fix ipv6 outbound device
  kbuild: do not call cc-option before KBUILD_CFLAGS initialization
  powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold
  KVM: arm/arm64: vgic-its: Preserve the revious read from the pending table
  fix kcm_clone()
  usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping
  s390: always save and restore all registers on context switch
  ipmi: Stop timers before cleaning up the module
  Fix handling of verdicts after NF_QUEUE
  tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv()
  s390/qeth: fix thinko in IPv4 multicast address tracking
  s390/qeth: fix GSO throughput regression
  s390/qeth: build max size GSO skbs on L2 devices
  tcp/dccp: block bh before arming time_wait timer
  stmmac: reset last TSO segment size after device open
  net: remove hlist_nulls_add_tail_rcu()
  usbnet: fix alignment for frames with no ethernet header
  net/packet: fix a race in packet_bind() and packet_notifier()
  packet: fix crash in fanout_demux_rollover()
  sit: update frag_off info
  rds: Fix NULL pointer dereference in __rds_rdma_map
  tipc: fix memory leak in tipc_accept_from_sock()
  s390/qeth: fix early exit from error path
  net: qmi_wwan: add Quectel BG96 2c7c:0296
  ANDROID: dma-buf/sw_sync: Rename active_list to link
  FROMLIST: android: binder: Fix null ptr dereference in debug msg
  FROMLIST: android: binder: Move buffer out of area shared with user space
  FROMLIST: android: binder: Add allocator selftest
  FROMLIST: android: binder: Refactor prev and next buffer into a helper function
  Linux 4.9.69
  afs: Connect up the CB.ProbeUuid
  IB/mlx5: Assign send CQ and recv CQ of UMR QP
  IB/mlx4: Increase maximal message size under UD QP
  xfrm: Copy policy family in clone_policy
  jump_label: Invoke jump_label_test() via early_initcall()
  atm: horizon: Fix irq release error
  clk: uniphier: fix DAPLL2 clock rate of Pro5
  bpf: fix lockdep splat
  sctp: use the right sk after waking up from wait_buf sleep
  sctp: do not free asoc when it is already dead in sctp_sendmsg
  zsmalloc: calling zs_map_object() from irq is a bug
  sparc64/mm: set fields in deferred pages
  block: wake up all tasks blocked in get_request()
  dt-bindings: usb: fix reg-property port-number range
  xfs: fix forgotten rcu read unlock when skipping inode reclaim
  sunrpc: Fix rpc_task_begin trace point
  NFS: Fix a typo in nfs_rename()
  dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0
  lib/genalloc.c: make the avail variable an atomic_long_t
  drivers/rapidio/devices/rio_mport_cdev.c: fix resource leak in error handling path in 'rio_dma_transfer()'
  route: update fnhe_expires for redirect when the fnhe exists
  route: also update fnhe_genid when updating a route cache
  gre6: use log_ecn_error module parameter in ip6_tnl_rcv()
  mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl()
  x86/mpx/selftests: Fix up weird arrays
  coccinelle: fix parallel build with CHECK=scripts/coccicheck
  kbuild: pkg: use --transform option to prefix paths in tar
  EDAC, i5000, i5400: Fix definition of NRECMEMB register
  EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro
  powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
  drm/amd/amdgpu: fix console deadlock if late init failed
  axonram: Fix gendisk handling
  netfilter: don't track fragmented packets
  zram: set physical queue limits to avoid array out of bounds accesses
  blk-mq: initialize mq kobjects in blk_mq_init_allocated_queue()
  i2c: riic: fix restart condition
  crypto: s5p-sss - Fix completing crypto request in IRQ handler
  ipv6: reorder icmpv6_init() and ip6_mr_init()
  ibmvnic: Allocate number of rx/tx buffers agreed on by firmware
  ibmvnic: Fix overflowing firmware/hardware TX queue
  rds: tcp: Sequence teardown of listen and acceptor sockets to avoid races
  bnx2x: do not rollback VF MAC/VLAN filters we did not configure
  bnx2x: fix detection of VLAN filtering feature for VF
  bnx2x: fix possible overrun of VFPF multicast addresses array
  bnx2x: prevent crash when accessing PTP with interface down
  spi_ks8995: regs_size incorrect for some devices
  spi_ks8995: fix "BUG: key accdaa28 not in .data!"
  KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
  arm64: KVM: Survive unknown traps from guests
  arm: KVM: Survive unknown traps from guests
  KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
  irqchip/crossbar: Fix incorrect type of register size
  scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters
  scsi: qla2xxx: Fix ql_dump_buffer
  workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq
  libata: drop WARN from protocol error in ata_sff_qc_issue()
  kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
  usb: gadget: udc: net2280: Fix tmp reusage in net2280 driver
  usb: gadget: pxa27x: Test for a valid argument pointer
  usb: dwc3: gadget: Fix system suspend/resume on TI platforms
  USB: gadgetfs: Fix a potential memory leak in 'dev_config()'
  usb: gadget: configs: plug memory leak
  HID: chicony: Add support for another ASUS Zen AiO keyboard
  gpio: altera: Use handle_level_irq when configured as a level_high
  ASoC: rcar: avoid SSI_MODEx settings for SSI8
  ARM: OMAP2+: Release device node after it is no longer needed.
  ARM: OMAP2+: Fix device node reference counts
  powerpc/64: Fix checksum folding in csum_add()
  module: set __jump_table alignment to 8
  lirc: fix dead lock between open and wakeup_filter
  powerpc: Fix compiling a BE kernel with a powerpc64le toolchain
  selftest/powerpc: Fix false failures for skipped tests
  powerpc/64: Invalidate process table caching after setting process table
  x86/hpet: Prevent might sleep splat on resume
  sched/fair: Make select_idle_cpu() more aggressive
  x86/platform/uv/BAU: Fix HUB errors by remove initial write to sw-ack register
  x86/selftests: Add clobbers for int80 on x86_64
  ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure
  vti6: Don't report path MTU below IPV6_MIN_MTU.
  ARM: 8657/1: uaccess: consistently check object sizes
  Revert "spi: SPI_FSL_DSPI should depend on HAS_DMA"
  Revert "drm/armada: Fix compile fail"
  mm: drop unused pmdp_huge_get_and_clear_notify()
  thp: fix MADV_DONTNEED vs. numa balancing race
  thp: reduce indentation level in change_huge_pmd()
  ARM: avoid faulting on qemu
  ARM: BUG if jumping to usermode address in kernel mode
  usb: f_fs: Force Reserved1=1 in OS_DESC_EXT_COMPAT
  crypto: talitos - fix ctr-aes-talitos
  crypto: talitos - fix use of sg_link_tbl_len
  crypto: talitos - fix AEAD for sha224 on non sha224 capable chips
  crypto: talitos - fix setkey to check key weakness
  crypto: talitos - fix memory corruption on SEC2
  crypto: talitos - fix AEAD test failures
  bus: arm-ccn: fix module unloading Error: Removing state 147 which has instances left.
  bus: arm-ccn: Fix use of smp_processor_id() in preemptible context
  bus: arm-ccn: Check memory allocation failure
  bus: arm-cci: Fix use of smp_processor_id() in preemptible context
  arm64: fpsimd: Prevent registers leaking from dead tasks
  KVM: arm/arm64: vgic-its: Check result of allocation before use
  KVM: arm/arm64: vgic-irqfd: Fix MSI entry allocation
  KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion
  KVM: VMX: remove I/O port 0x80 bypass on Intel hosts
  arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one
  arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one
  media: dvb: i2c transfers over usb cannot be done from stack
  drm/exynos: gem: Drop NONCONTIG flag for buffers allocated without IOMMU
  kdb: Fix handling of kallsyms_symbol_next() return value
  brcmfmac: change driver unbind order of the sdio function devices
  powerpc/64s: Initialize ISAv3 MMU registers before setting partition table
  KVM: s390: Fix skey emulation permission check
  s390: fix compat system call table
  smp/hotplug: Move step CPUHP_AP_SMPCFD_DYING to the correct place
  iommu/vt-d: Fix scatterlist offset handling
  ALSA: usb-audio: Add check return value for usb_string()
  ALSA: usb-audio: Fix out-of-bound error
  ALSA: seq: Remove spurious WARN_ON() at timer check
  ALSA: pcm: prevent UAF in snd_pcm_info
  btrfs: fix missing error return in btrfs_drop_snapshot
  KVM: x86: fix APIC page invalidation
  x86/PCI: Make broadcom_postcore_init() check acpi_disabled
  X.509: fix comparisons of ->pkey_algo
  X.509: reject invalid BIT STRING for subjectPublicKey
  KEYS: add missing permission check for request_key() destination
  ASN.1: check for error from ASN1_OP_END__ACT actions
  ASN.1: fix out-of-bounds read when parsing indefinite length item
  efi/esrt: Use memunmap() instead of kfree() to free the remapping
  efi: Move some sysfs files to be read-only by root
  scsi: libsas: align sata_device's rps_resp on a cacheline
  scsi: use dma_get_cache_alignment() as minimum DMA alignment
  scsi: dma-mapping: always provide dma_get_cache_alignment
  isa: Prevent NULL dereference in isa_bus driver callbacks
  hv: kvp: Avoid reading past allocated blocks from KVP file
  virtio: release virtio index when fail to device_register
  can: usb_8dev: cancel urb on -EPIPE and -EPROTO
  can: esd_usb2: cancel urb on -EPIPE and -EPROTO
  can: ems_usb: cancel urb on -EPIPE and -EPROTO
  can: kvaser_usb: cancel urb on -EPIPE and -EPROTO
  can: kvaser_usb: ratelimit errors if incomplete messages are received
  can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback()
  can: kvaser_usb: free buf in error paths
  can: ti_hecc: Fix napi poll return value for repoll
  usb: gadget: udc: renesas_usb3: fix number of the pipes
  ANDROID: Revert "arm64: move ELF_ET_DYN_BASE to 4GB / 4MB"
  ANDROID: Revert "arm: move ELF_ET_DYN_BASE to 4MB"
  Linux 4.9.68
  xen-netfront: avoid crashing on resume after a failure in talk_to_netback()
  usb: host: fix incorrect updating of offset
  USB: usbfs: Filter flags passed in from user space
  USB: devio: Prevent integer overflow in proc_do_submiturb()
  USB: Increase usbfs transfer limit
  USB: core: Add type-specific length check of BOS descriptors
  usb: xhci: fix panic in xhci_free_virt_devices_depth_first
  usb: hub: Cycle HUB power when initialization fails
  dma-buf: Update kerneldoc for sync_file_create
  dma-buf/sync_file: hold reference to fence when creating sync_file
  dma-buf/sw_sync: force signal all unsignaled fences on dying timeline
  dma-fence: Introduce drm_fence_set_error() helper
  dma-fence: Wrap querying the fence->status
  dma-fence: Clear fence->status during dma_fence_init()
  dma-buf/sw_sync: clean up list before signaling the fence
  dma-buf/sw_sync: move timeline_fence_ops around
  dma-buf/sw-sync: Use an rbtree to sort fences in the timeline
  dma-buf/sw-sync: Fix locking around sync_timeline lists
  dma-buf/sw-sync: sync_pt is private and of fixed size
  dma-buf/sw-sync: Reduce irqsave/irqrestore from known context
  dma-buf/sw-sync: Prevent user overflow on timeline advance
  dma-buf/sw-sync: Fix the is-signaled test to handle u32 wraparound
  dma-buf/dma-fence: Extract __dma_fence_is_later()
  net: fec: fix multicast filtering hardware setup
  xen-netback: vif counters from int/long to u64
  cec: initiator should be the same as the destination for, poll
  xen-netfront: Improve error handling during initialization
  mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  vfio/spapr: Fix missing mutex unlock when creating a window
  be2net: fix initial MAC setting
  net: thunderx: avoid dereferencing xcv when NULL
  net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
  gtp: fix cross netns recv on gtp socket
  gtp: clear DF bit on GTP packet tx
  nvmet: cancel fatal error and flush async work before free controller
  i2c: i2c-cadence: Initialize configuration before probing devices
  tcp: correct memory barrier usage in tcp_check_space()
  dmaengine: pl330: fix double lock
  tipc: fix cleanup at module unload
  tipc: fix nametbl_lock soft lockup at module exit
  RDMA/qedr: Fix RDMA CM loopback
  RDMA/qedr: Return success when not changing QP state
  mac80211: don't try to sleep in rate_control_rate_init()
  drm/amdgpu: fix unload driver issue for virtual display
  x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
  net: sctp: fix array overrun read on sctp_timer_tbl
  drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
  drm/amdgpu: fix bug set incorrect value to vce register
  qla2xxx: Fix wrong IOCB type assumption
  powerpc/mm: Fix memory hotplug BUG() on radix
  perf/x86/intel: Account interrupts for PEBS errors
  NFSv4: Fix client recovery when server reboots multiple times
  mac80211: prevent skb/txq mismatch
  KVM: arm/arm64: Fix occasional warning from the timer work function
  drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
  drm/exynos/decon5433: update shadow registers iff there are active windows
  nfs: Don't take a reference on fl->fl_file for LOCK operation
  ravb: Remove Rx overflow log messages
  mac80211: calculate min channel width correctly
  mm: fix remote numa hits statistics
  net: qrtr: Mark 'buf' as little endian
  libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount
  net/appletalk: Fix kernel memory disclosure
  be2net: fix unicast list filling
  be2net: fix accesses to unicast list
  vti6: fix device register to report IFLA_INFO_KIND
  ARM: OMAP1: DMA: Correct the number of logical channels
  ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
  net: systemport: Pad packet before inserting TSB
  net: systemport: Utilize skb_put_padto()
  libcxgb: fix error check for ip6_route_output()
  usb: gadget: f_fs: Fix ExtCompat descriptor validation
  dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status
  dmaengine: stm32-dma: Set correct args number for DMA request from DT
  l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups
  net/mlx4_en: Fix type mismatch for 32-bit systems
  dax: Avoid page invalidation races and unnecessary radix tree traversals
  iio: adc: ti-ads1015: add 10% to conversion wait time
  tools include: Do not use poison with C++
  kprobes/x86: Disable preemption in ftrace-based jprobes
  perf test attr: Fix ignored test case result
  usbip: tools: Install all headers needed for libusbip development
  sysrq : fix Show Regs call trace on ARM
  EDAC, sb_edac: Fix missing break in switch
  x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt()
  serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X
  usb: phy: tahvo: fix error handling in tahvo_usb_probe()
  mmc: sdhci-msm: fix issue with power irq
  spi: spi-axi: fix potential use-after-free after deregistration
  spi: sh-msiof: Fix DMA transfer size check
  staging: rtl8188eu: avoid a null dereference on pmlmepriv
  serial: 8250_fintek: Fix rs485 disablement on invalid ioctl()
  m68k: fix ColdFire node shift size calculation
  staging: greybus: loopback: Fix iteration count on async path
  selftests/x86/ldt_get: Add a few additional tests for limits
  s390/pci: do not require AIS facility
  ima: fix hash algorithm initialization
  USB: serial: option: add Quectel BG96 id
  s390/runtime instrumentation: simplify task exit handling
  serial: 8250_pci: Add Amazon PCI serial device ID
  usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub
  uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices
  mm, oom_reaper: gather each vma to prevent leaking TLB entry
  Revert "crypto: caam - get rid of tasklet"
  drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume()
  drm/fsl-dcu: avoid disabling pixel clock twice on suspend
  bcache: recover data from backing when data is clean
  bcache: only permit to recovery read error when cache device is clean
  Linux 4.9.67
  drm/i915: Prevent zero length "index" write
  drm/i915: Don't try indexed reads to alternate slave addresses
  NFS: revalidate "." etc correctly on "open".
  Revert "x86/entry/64: Add missing irqflags tracing to native_load_gs_index()"
  drm/amd/pp: fix typecast error in powerplay.
  drm/ttm: once more fix ttm_buffer_object_transfer
  drm/hisilicon: Ensure LDI regs are properly configured.
  drm/panel: simple: Add missing panel_simple_unprepare() calls
  drm/radeon: fix atombios on big endian
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  Revert "drm/radeon: dont switch vt on suspend"
  nvme-pci: add quirk for delay before CHK RDY for WDC SN200
  hwmon: (jc42) optionally try to disable the SMBUS timeout
  bcache: Fix building error on MIPS
  i2c: i801: Fix Failed to allocate irq -2147483648 error
  eeprom: at24: check at24_read/write arguments
  eeprom: at24: correctly set the size for at24mac402
  eeprom: at24: fix reading from 24MAC402/24MAC602
  mmc: core: prepend 0x to OCR entry in sysfs
  mmc: core: Do not leave the block driver in a suspended state
  KVM: lapic: Fixup LDR on load in x2apic
  KVM: lapic: Split out x2apic ldr calculation
  KVM: x86: inject exceptions produced by x86_decode_insn
  KVM: x86: Exit to user-mode on #UD intercept when emulator requires
  KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
  ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
  mfd: twl4030-power: Fix pmic for boards that need vmmc1 on reboot
  nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
  nfsd: Fix another OPEN stateid race
  nfsd: Fix stateid races between OPEN and CLOSE
  btrfs: clear space cache inode generation always
  mm/madvise.c: fix madvise() infinite loop under special circumstances
  mm, hugetlbfs: introduce ->split() to vm_operations_struct
  mm/cma: fix alloc_contig_range ret code/potential leak
  mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()
  ARM: dts: omap3: logicpd-torpedo-37xx-devkit: Fix MMC1 cd-gpio
  ARM: dts: LogicPD Torpedo: Fix camera pin mux
  Linux 4.9.66
  xen: xenbus driver must not accept invalid transaction ids
  nvmet: fix KATO offset in Set Features
  cec: update log_addr[] before finishing configuration
  cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
  cec: when canceling a message, don't overwrite old status info
  s390/kbuild: enable modversions for symbols exported from asm
  ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data
  btrfs: return the actual error value from from btrfs_uuid_tree_iterate
  crypto: marvell - Copy IVDIG before launching partial DMA ahash requests
  ASoC: rsnd: don't double free kctrl
  netfilter: nf_tables: fix oob access
  netfilter: nft_queue: use raw_smp_processor_id()
  spi: SPI_FSL_DSPI should depend on HAS_DMA
  staging: iio: cdc: fix improper return value
  iio: light: fix improper return value
  adm80211: add checks for dma mapping errors
  mac80211: Suppress NEW_PEER_CANDIDATE event if no room
  mac80211: Remove invalid flag operations in mesh TSF synchronization
  drm/mediatek: don't use drm_put_dev
  clk: qcom: ipq4019: Add all the frequencies for apss cpu
  drm: Apply range restriction after color adjustment when allocation
  gpio: mockup: dynamically allocate memory for chip name
  ALSA: hda - Apply ALC269_FIXUP_NO_SHUTUP on HDA_FIXUP_ACT_PROBE
  ath10k: set CTS protection VDEV param only if VDEV is up
  bnxt_en: Set default completion ring for async events.
  pinctrl: sirf: atlas7: Add missing 'of_node_put()'
  ath10k: fix potential memory leak in ath10k_wmi_tlv_op_pull_fw_stats()
  ath10k: ignore configuring the incorrect board_id
  ath10k: fix incorrect txpower set by P2P_DEVICE interface
  mwifiex: sdio: fix use after free issue for save_adapter
  adm80211: return an error if adm8211_alloc_rings() fails
  rt2800: set minimum MPDU and PSDU lengths to sane values
  drm/armada: Fix compile fail
  net: 3com: typhoon: typhoon_init_one: fix incorrect return values
  net: 3com: typhoon: typhoon_init_one: make return values more specific
  net: Allow IP_MULTICAST_IF to set index to L3 slave
  fscrypt: use ENOTDIR when setting encryption policy on nondirectory
  fscrypt: use ENOKEY when file cannot be created w/o key
  dmaengine: zx: set DMA_CYCLIC cap_mask bit
  clk: sunxi-ng: fix PLL_CPUX adjusting on A33
  clk: sunxi-ng: A31: Fix spdif clock register
  drm/sun4i: Fix a return value in case of error
  PCI: Apply _HPX settings only to relevant devices
  RDS: RDMA: fix the ib_map_mr_sg_zbva() argument
  RDS: RDMA: return appropriate error on rdma map failures
  RDS: make message size limit compliant with spec
  e1000e: Avoid receiver overrun interrupt bursts
  e1000e: Separate signaling for link check/link up
  e1000e: Fix return value test
  e1000e: Fix error path in link detection
  Revert "drm/i915: Do not rely on wm preservation for ILK watermarks"
  PM / OPP: Add missing of_node_put(np)
  net/9p: Switch to wait_event_killable()
  fscrypt: lock mutex before checking for bounce page pool
  sched/rt: Simplify the IPI based RT balancing logic
  media: v4l2-ctrl: Fix flags field on Control events
  cx231xx-cards: fix NULL-deref on missing association descriptor
  media: rc: check for integer overflow
  media: Don't do DMA on stack for firmware upload in the AS102 driver
  powerpc/signal: Properly handle return value from uprobe_deny_signal()
  parisc: Fix validity check of pointer size argument in new CAS implementation
  ixgbe: Fix skb list corruption on Power systems
  fm10k: Use smp_rmb rather than read_barrier_depends
  i40evf: Use smp_rmb rather than read_barrier_depends
  ixgbevf: Use smp_rmb rather than read_barrier_depends
  igbvf: Use smp_rmb rather than read_barrier_depends
  igb: Use smp_rmb rather than read_barrier_depends
  i40e: Use smp_rmb rather than read_barrier_depends
  NFC: fix device-allocation error return
  IB/srp: Avoid that a cable pull can trigger a kernel crash
  IB/srpt: Do not accept invalid initiator port names
  libnvdimm, namespace: make 'resource' attribute only readable by root
  libnvdimm, namespace: fix label initialization to use valid seq numbers
  libnvdimm, pfn: make 'resource' attribute only readable by root
  clk: ti: dra7-atl-clock: fix child-node lookups
  SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
  KVM: SVM: obey guest PAT
  KVM: nVMX: set IDTR and GDTR limits when loading L1 host state
  lockd: double unregister of inetaddr notifiers
  irqchip/gic-v3: Fix ppi-partitions lookup
  block: Fix a race between blk_cleanup_queue() and timeout handling
  p54: don't unregister leds when they are not initialized
  mtd: nand: mtk: fix infinite ECC decode IRQ issue
  mtd: nand: Fix writing mtdoops to nand flash.
  mtd: nand: omap2: Fix subpage write
  target: Fix QUEUE_FULL + SCSI task attribute handling
  iscsi-target: Fix non-immediate TMR reference leak
  fs/9p: Compare qid.path in v9fs_test_inode
  fix a page leak in vhost_scsi_iov_to_sgl() error recovery
  ALSA: hda/realtek - Fix ALC700 family no sound issue
  ALSA: hda: Fix too short HDMI/DP chmap reporting
  ALSA: timer: Remove kernel warning at compat ioctl error paths
  ALSA: usb-audio: Add sanity checks in v2 clock parsers
  ALSA: usb-audio: Fix potential out-of-bound access at parsing SU
  ALSA: usb-audio: Add sanity checks to FE parser
  ALSA: pcm: update tstamp only if audio_tstamp changed
  ext4: fix interaction between i_size, fallocate, and delalloc after a crash
  ata: fixes kernel crash while tracing ata_eh_link_autopsy event
  rtlwifi: fix uninitialized rtlhal->last_suspend_sec time
  rtlwifi: rtl8192ee: Fix memory leak when loading firmware
  nfsd: deal with revoked delegations appropriately
  NFS: Avoid RCU usage in tracepoints
  nfs: Fix ugly referral attributes
  NFS: Fix typo in nomigration mount option
  isofs: fix timestamps beyond 2027
  bcache: check ca->alloc_thread initialized before wake up it
  libceph: don't WARN() if user tries to add invalid key
  eCryptfs: use after free in ecryptfs_release_messaging()
  nilfs2: fix race condition that causes file system corruption
  autofs: don't fail mount for transient error
  rt2x00usb: mark device removed when get ENOENT usb error
  MIPS: BCM47XX: Fix LED inversion for WRT54GSv1
  MIPS: Fix an n32 core file generation regset support regression
  MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry
  MIPS: Fix odd fp register warnings with MIPS64r2
  dm: fix race between dm_get_from_kobject() and __dm_destroy()
  MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver
  dm: allocate struct mapped_device with kvzalloc
  dm bufio: fix integer overflow when limiting maximum cache size
  ALSA: hda: Add Raven PCI ID
  PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF
  MIPS: ralink: Fix typo in mt7628 pinmux function
  MIPS: ralink: Fix MT7628 pinmux
  ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
  ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
  arm64: Implement arch-specific pte_access_permitted()
  x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
  x86/decoder: Add new TEST instruction pattern
  lib/mpi: call cond_resched() from mpi_powm() loop
  sched: Make resched_cpu() unconditional
  vsock: use new wait API for vsock_stream_sendmsg()
  ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
  x86/mm: fix use-after-free of vma during userfaultfd fault
  ACPI / EC: Fix regression related to triggering source of EC event handling
  s390/disassembler: increase show_code buffer size
  s390/disassembler: add missing end marker for e7 table
  s390/runtime instrumention: fix possible memory corruption
  s390: fix transactional execution control register handling

Conflicts:
	drivers/android/binder_alloc.c
	drivers/android/binder_alloc.h
	drivers/android/binder_alloc_selftest.c
	drivers/mmc/core/bus.c
	drivers/mmc/host/sdhci-msm.c
	drivers/thermal/step_wise.c
	kernel/cpu.c
	mm/oom_kill.c
	sound/usb/mixer.c

Change-Id: Id01eb66cafc5970b460321e44ec8ffcfa76971a6
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
2018-01-02 10:37:28 -08:00
Greg Kroah-Hartman
eb4ef2dea9 Merge 4.9.68 into android-4.9-o
Changes in 4.9.68
	bcache: only permit to recovery read error when cache device is clean
	bcache: recover data from backing when data is clean
	drm/fsl-dcu: avoid disabling pixel clock twice on suspend
	drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume()
	Revert "crypto: caam - get rid of tasklet"
	mm, oom_reaper: gather each vma to prevent leaking TLB entry
	uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices
	usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub
	serial: 8250_pci: Add Amazon PCI serial device ID
	s390/runtime instrumentation: simplify task exit handling
	USB: serial: option: add Quectel BG96 id
	ima: fix hash algorithm initialization
	s390/pci: do not require AIS facility
	selftests/x86/ldt_get: Add a few additional tests for limits
	staging: greybus: loopback: Fix iteration count on async path
	m68k: fix ColdFire node shift size calculation
	serial: 8250_fintek: Fix rs485 disablement on invalid ioctl()
	staging: rtl8188eu: avoid a null dereference on pmlmepriv
	spi: sh-msiof: Fix DMA transfer size check
	spi: spi-axi: fix potential use-after-free after deregistration
	mmc: sdhci-msm: fix issue with power irq
	usb: phy: tahvo: fix error handling in tahvo_usb_probe()
	serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X
	x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt()
	EDAC, sb_edac: Fix missing break in switch
	sysrq : fix Show Regs call trace on ARM
	usbip: tools: Install all headers needed for libusbip development
	perf test attr: Fix ignored test case result
	kprobes/x86: Disable preemption in ftrace-based jprobes
	tools include: Do not use poison with C++
	iio: adc: ti-ads1015: add 10% to conversion wait time
	dax: Avoid page invalidation races and unnecessary radix tree traversals
	net/mlx4_en: Fix type mismatch for 32-bit systems
	l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups
	dmaengine: stm32-dma: Set correct args number for DMA request from DT
	dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status
	usb: gadget: f_fs: Fix ExtCompat descriptor validation
	libcxgb: fix error check for ip6_route_output()
	net: systemport: Utilize skb_put_padto()
	net: systemport: Pad packet before inserting TSB
	ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
	ARM: OMAP1: DMA: Correct the number of logical channels
	vti6: fix device register to report IFLA_INFO_KIND
	be2net: fix accesses to unicast list
	be2net: fix unicast list filling
	net/appletalk: Fix kernel memory disclosure
	libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount
	net: qrtr: Mark 'buf' as little endian
	mm: fix remote numa hits statistics
	mac80211: calculate min channel width correctly
	ravb: Remove Rx overflow log messages
	nfs: Don't take a reference on fl->fl_file for LOCK operation
	drm/exynos/decon5433: update shadow registers iff there are active windows
	drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
	KVM: arm/arm64: Fix occasional warning from the timer work function
	mac80211: prevent skb/txq mismatch
	NFSv4: Fix client recovery when server reboots multiple times
	perf/x86/intel: Account interrupts for PEBS errors
	powerpc/mm: Fix memory hotplug BUG() on radix
	qla2xxx: Fix wrong IOCB type assumption
	drm/amdgpu: fix bug set incorrect value to vce register
	drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
	net: sctp: fix array overrun read on sctp_timer_tbl
	x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
	drm/amdgpu: fix unload driver issue for virtual display
	mac80211: don't try to sleep in rate_control_rate_init()
	RDMA/qedr: Return success when not changing QP state
	RDMA/qedr: Fix RDMA CM loopback
	tipc: fix nametbl_lock soft lockup at module exit
	tipc: fix cleanup at module unload
	dmaengine: pl330: fix double lock
	tcp: correct memory barrier usage in tcp_check_space()
	i2c: i2c-cadence: Initialize configuration before probing devices
	nvmet: cancel fatal error and flush async work before free controller
	gtp: clear DF bit on GTP packet tx
	gtp: fix cross netns recv on gtp socket
	net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
	net: thunderx: avoid dereferencing xcv when NULL
	be2net: fix initial MAC setting
	vfio/spapr: Fix missing mutex unlock when creating a window
	mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
	xen-netfront: Improve error handling during initialization
	cec: initiator should be the same as the destination for, poll
	xen-netback: vif counters from int/long to u64
	net: fec: fix multicast filtering hardware setup
	dma-buf/dma-fence: Extract __dma_fence_is_later()
	dma-buf/sw-sync: Fix the is-signaled test to handle u32 wraparound
	dma-buf/sw-sync: Prevent user overflow on timeline advance
	dma-buf/sw-sync: Reduce irqsave/irqrestore from known context
	dma-buf/sw-sync: sync_pt is private and of fixed size
	dma-buf/sw-sync: Fix locking around sync_timeline lists
	dma-buf/sw-sync: Use an rbtree to sort fences in the timeline
	dma-buf/sw_sync: move timeline_fence_ops around
	dma-buf/sw_sync: clean up list before signaling the fence
	dma-fence: Clear fence->status during dma_fence_init()
	dma-fence: Wrap querying the fence->status
	dma-fence: Introduce drm_fence_set_error() helper
	dma-buf/sw_sync: force signal all unsignaled fences on dying timeline
	dma-buf/sync_file: hold reference to fence when creating sync_file
	dma-buf: Update kerneldoc for sync_file_create
	usb: hub: Cycle HUB power when initialization fails
	usb: xhci: fix panic in xhci_free_virt_devices_depth_first
	USB: core: Add type-specific length check of BOS descriptors
	USB: Increase usbfs transfer limit
	USB: devio: Prevent integer overflow in proc_do_submiturb()
	USB: usbfs: Filter flags passed in from user space
	usb: host: fix incorrect updating of offset
	xen-netfront: avoid crashing on resume after a failure in talk_to_netback()
	Linux 4.9.68

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-12-10 17:14:07 +01:00
Jiri Olsa
a88ff235e8 perf/x86/intel: Account interrupts for PEBS errors
[ Upstream commit 475113d937adfd150eb82b5e2c5507125a68e7af ]

It's possible to set up PEBS events to get only errors and not
any data, like on SNB-X (model 45) and IVB-EP (model 62)
via 2 perf commands running simultaneously:

    taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10

This leads to a soft lock up, because the error path of the
intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
for error PEBS interrupts, so in case you're getting ONLY
errors you don't have a way to stop the event when it's over
the max_samples_per_tick limit:

  NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
  ...
  RIP: 0010:[<ffffffff81159232>]  [<ffffffff81159232>] smp_call_function_single+0xe2/0x140
  ...
  Call Trace:
   ? trace_hardirqs_on_caller+0xf5/0x1b0
   ? perf_cgroup_attach+0x70/0x70
   perf_install_in_context+0x199/0x1b0
   ? ctx_resched+0x90/0x90
   SYSC_perf_event_open+0x641/0xf90
   SyS_perf_event_open+0x9/0x10
   do_syscall_64+0x6c/0x1f0
   entry_SYSCALL64_slow_path+0x25/0x25

Add perf_event_account_interrupt() which does the interrupt
and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
error path.

We keep the pending_kill and pending_wakeup logic only in the
__perf_event_overflow() path, because they make sense only if
there's any data to deliver.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09 22:01:52 +01:00
Raghavendra Rao Ananta
b5fd2edf6f perf: Fix event cleanup across CPU hotplugs
Perf hardware events are generally tied to a particular CPU.
During the cleanup process perf_remove_from_context()
tries to execute the PMU related cleanup on the CPU that the
event is tied to. But if the CPU is offline, the cleanup is
not completed successfully at the PMU level, but the event
is freed (at the core level).

This fix is to collect the events that are approaching the release
when it's CPU is down. These zombie events are cleaned up when the
CPU is hotplugged on again.

Change-Id: I9ddf8f32cfcec4e0cb1c0910fb68b6db984d7553
Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
2017-12-01 21:14:53 -08:00
Raghavendra Rao Ananta
459740ae0d perf: Enable user and kernel event sharing
The kernel and the user-space are now able to share the similar events.
The kernel and user's attr->type (hardware/raw) and attr->config should
be same for them to share the same counter.

Change-Id: I4a4b35bde6beaf8f2aef74e683a9804e31807013
Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
2017-11-17 11:23:04 -08:00
John Dias
daf4560d30 perf: protect group_leader from races that cause ctx double-free
When moving a group_leader perf event from a software-context
to a hardware-context, there's a race in checking and
updating that context. The existing locking solution
doesn't work; note that it tries to grab a lock inside
the group_leader's context object, which you can only
get at by going through a pointer that should be protected
from these races. To avoid that problem, and to produce
a simple solution, we can just use a lock per group_leader
to protect all checks on the group_leader's context.
The new lock is grabbed and released when no context locks
are held.

Bug: 30955111
Bug: 31095224
Change-Id: If37124c100ca6f4aa962559fba3bd5dbbec8e052
Git-repo: https://android.googlesource.com/kernel/msm
Git-commit: 5b87e00be9ca28ea32cab49b92c0386e4a91f730
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
2017-08-29 18:19:19 -07:00
Patrick Fay
b02d7648fb perf: enable perf to continue across hotplug
Currently perf hardware, software and tracepoint events are
deleted when a cpu is hotplugged out. This change restarts
the events after hotplug. In arm_pmu.c most of the code
for handline power collapse is reused for hotplug.
This change supercedes commit 1f0f95c5fe9e ("perf: add hotplug
support so that perf continues after hotplug") and uses the
new hotplug notification method.

Change-Id: Ia9de1c150558125e2da5f2c4354c268ce4b1154a
Signed-off-by: Patrick Fay <pfay@codeaurora.org>
2017-04-03 19:20:57 -07:00
Jeff Vander Stoep
d28f856d2c FROMLIST: security,perf: Allow further restriction of perf_event_open
When kernel.perf_event_open is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.

This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN).  This version doesn't include making
the variable read-only.  It also allows enabling further restriction
at run-time regardless of whether the default is changed.

https://lkml.org/lkml/2016/1/11/587

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

Bug: 29054680
Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8
2017-01-27 13:55:09 -08:00
Channagoud Kadabi
e6d4a923be Merge remote-tracking branch 'origin/tmp-a909d3e' into 1105
* origin/tmp-a909d3e:
  Linux 4.9-rc3
  x86/smpboot: Init apic mapping before usage
  ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region()
  ACPICA: Dispatcher: Fix an unbalanced lock exit path in acpi_ds_auto_serialize_method()
  ACPICA: Dispatcher: Fix order issue of method termination
  ARC: module: print pretty section names
  ARC: module: elide loop to save reference to .eh_frame
  ARC: mm: retire ARC_DBG_TLB_MISS_COUNT...
  ARC: build: retire old toggles
  ARC: boot log: refactor cpu name/release printing
  ARC: boot log: remove awkward space comma from MMU line
  ARC: boot log: don't assume SWAPE instruction support
  ARC: boot log: refactor printing abt features not captured in BCRs
  ARCv2: boot log: print IOC exists as well as enabled status
  ubifs: Fix regression in ubifs_readdir()
  ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()
  MAINTAINERS: Add entry for genwqe driver
  VMCI: Doorbell create and destroy fixes
  GenWQE: Fix bad page access during abort of resource allocation
  vme: vme_get_size potentially returning incorrect value on failure
  tty: serial_core: fix NULL struct tty pointer access in uart_write_wakeup
  tty: serial_core: Fix serial console crash on port shutdown
  tty/serial: at91: fix hardware handshake on Atmel platforms
  perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors
  perf/powerpc: Don't call perf_event_disable() from atomic context
  perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic
  x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y
  drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk
  cris/arch-v32: cryptocop: print a hex number after a 0x prefix
  ipack: print a hex number after a 0x prefix
  block: DAC960: print a hex number after a 0x prefix
  fs: exofs: print a hex number after a 0x prefix
  lib/genalloc.c: start search from start of chunk
  mm: memcontrol: do not recurse in direct reclaim
  CREDITS: update credit information for Martin Kepplinger
  proc: fix NULL dereference when reading /proc/<pid>/auxv
  mm: kmemleak: ensure that the task stack is not freed during scanning
  lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB
  latent_entropy: raise CONFIG_FRAME_WARN by default
  kconfig.h: remove config_enabled() macro
  ipc: account for kmem usage on mqueue and msg
  mm/slab: improve performance of gathering slabinfo stats
  mm: page_alloc: use KERN_CONT where appropriate
  mm/list_lru.c: avoid error-path NULL pointer deref
  h8300: fix syscall restarting
  kcov: properly check if we are in an interrupt
  mm/slab: fix kmemcg cache creation delayed issue
  device-dax: fix percpu_ref_exit ordering
  Allow KASAN and HOTPLUG_MEMORY to co-exist when doing build testing
  nvdimm: make CONFIG_NVDIMM_DAX 'bool'
  mm: remove unused variable in memory hotplug
  btrfs: fix races on root_log_ctx lists
  mm: remove per-zone hashtable of bitlock waitqueues
  blk-mq: update hardware and software queues for sleeping alloc
  driver core: Make Kconfig text for DEBUG_TEST_DRIVER_REMOVE stronger
  kernfs: Add noop_fsync to supported kernfs_file_fops
  vt: clear selection before resizing
  sc16is7xx: always write state when configuring GPIO as an output
  sh-sci: document R8A7743/5 support
  tty: serial: 8250: 8250_core: NXP SC16C2552 workaround
  tty: limit terminal size to 4M chars
  tty: serial: fsl_lpuart: Fix Tx DMA edge case
  serial: 8250_lpss: enable MSI for sure
  serial: core: fix console problems on uart_close
  serial: 8250_uniphier: fix clearing divisor latch access bit
  serial: 8250_uniphier: fix more unterminated string
  serial: pch_uart: add terminate entry for dmi_system_id tables
  devicetree: bindings: uart: Add new compatible string for ZynqMP
  serial: xuartps: Add new compatible string for ZynqMP
  serial: SERIAL_STM32 should depend on HAS_DMA
  serial: stm32: Fix comparisons with undefined register
  tty: vt, fix bogus division in csi_J
  powerpc/64s: relocation, register save fixes for system reset interrupt
  powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu
  powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state()
  ALSA: usb-audio: Add quirk for Syntek STK1160
  sched/fair: Remove unused but set variable 'rq'
  objtool: Fix rare switch jump table pattern detection
  security/keys: make BIG_KEYS dependent on stdrng.
  KEYS: Sort out big_key initialisation
  KEYS: Fix short sprintf buffer in /proc/keys show function
  arm64: mm: fix __page_to_voff definition
  arm64/numa: fix incorrect log for memory-less node
  arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity
  block: flush: fix IO hang in case of flood fua req
  x86: Fix export for mcount and __fentry__
  doc: Add missing parameter for msi_setup
  extcon: qcom-spmi-misc: Sync the extcon state on interrupt
  drm/drivers: add support for using the arch wc mapping API.
  x86/io: add interface to reserve io memtype for a resource range. (v1.1)
  MAINTAINERS: Begin module maintainer transition
  ahci: fix the single MSI-X case in ahci_init_one
  timers: Prevent base clock corruption when forwarding
  timers: Prevent base clock rewind when forwarding clock
  timers: Lock base for same bucket optimization
  timers: Plug locking race vs. timer migration
  ALSA: seq: Fix time account regression
  i2c: imx: defer probe if bus recovery GPIOs are not ready
  i2c: designware: Avoid aborted transfers with fast reacting I2C slaves
  i2c: i801: Fix I2C Block Read on 8-Series/C220 and later
  i2c: xgene: Avoid dma_buffer overrun
  i2c: digicolor: Fix module autoload
  i2c: xlr: Fix module autoload for OF registration
  i2c: xlp9xx: Fix module autoload
  i2c: jz4780: Fix module autoload
  i2c: allow configuration of imx driver for ColdFire architecture
  i2c: mark device nodes only in case of successful instantiation
  x86/quirks: Hide maybe-uninitialized warning
  x86/build: Fix build with older GCC versions
  x86/unwind: Fix empty stack dereference in guess unwinder
  i2c: rk3x: Give the tuning value 0 during rk3x_i2c_v0_calc_timings
  i2c: hix5hd2: allow build with ARCH_HISI
  usb: chipidea: host: fix NULL ptr dereference during shutdown
  ALSA: hda - Fix surround output pins for ASRock B150M mobo
  hv: do not lose pending heartbeat vmbus packets
  mm: unexport __get_user_pages()
  proc: don't use FOLL_FORCE for reading cmdline and environment
  cpufreq: intel_pstate: Always set max P-state in performance mode
  nbd: fix incorrect unlock of nbd->sock_lock in sock_shutdown
  orangefs: don't use d_time
  orangefs: user file_inode() where it is due
  mei: txe: don't clean an unprocessed interrupt cause.
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks
  ARCv2: IOC: use @ioc_enable not @ioc_exist where intended
  ARC: syscall for userspace cmpxchg assist
  dm table: fix missing dm_put_target_type() in dm_table_add_target()
  xenbus: check return value of xenbus_scanf()
  xenbus: prefer list_for_each()
  x86: xen: move cpu_up functions out of ifdef
  xenbus: advertise control feature flags
  greybus: fix a leak on error in gb_module_create()
  greybus: es2: fix error return code in ap_probe()
  greybus: arche-platform: Add missing of_node_put() in arche_platform_change_state()
  staging: android: ion: Fix error handling in ion_query_heaps()
  ARM: imx: mach-imx6q: Fix the PHY ID mask for AR8031
  PM / suspend: Fix missing KERN_CONT for suspend message
  usb: renesas_usbhs: add wait after initialization for R-Car Gen3
  ACPI / APEI: Fix incorrect return value of ghes_proc()
  usb: increase ohci watchdog delay to 275 msec
  usb: musb: Call pm_runtime from musb_gadget_queue
  usb: musb: Fix hardirq-safe hardirq-unsafe lock order error
  usb: ehci-platform: increase EHCI_MAX_RSTS to 4
  usb: ohci-at91: Set RemoteWakeupConnected bit explicitly.
  ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs
  ACPI/PCI: pci_link: penalize SCI correctly
  ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages
  ARM: dts: vf610: fix IRQ flag of global timer
  powerpc/64: Fix race condition in setting lock bit in idle/wakeup code
  powerpc/64: Re-fix race condition between going idle and entering guest
  s390/mm: fix zone calculation in arch_add_memory()
  s390/dumpstack: use pr_cont within show_stack and die
  ARM: imx: gpc: Fix the imx_gpc_genpd_init() error path
  ARM: imx: gpc: Initialize all power domains
  xfs: clear cowblocks tag when cow fork is emptied
  xfs: fix up inode cowblocks tracking tracepoints
  fs: Do to trim high file position bits in iomap_page_mkwrite_actor
  cxl: Fix leaking pid refs in some error paths
  gpio: mpc8xxx: Correct irq handler function
  gpio: ath79: Fix module autoload
  arm64: dts: Updated NAND DT properties for NS2 SVK
  iio: accel: sca3000_core: avoid potentially uninitialized variable
  iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value
  arm64: dts: uniphier: change MIO node to SD control node
  ARM: dts: uniphier: change MIO node to SD control node
  reset: uniphier: rename MIO reset to SD reset for Pro5, PXs2, LD20 SoCs
  arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER
  ARM: uniphier: select ARCH_HAS_RESET_CONTROLLER
  badblocks: badblocks_set/clear update unacked_exist
  softirq: Display IRQ_POLL for irq-poll statistics
  powerpc: Convert cmp to cmpd in idle enter sequence
  KVM: PPC: Book3S HV: Fix build error when SMP=n
  cpufreq: intel_pstate: Set P-state upfront in performance mode
  USB: serial: fix potential NULL-dereference at probe
  arm64: dts: Add timer erratum property for LS2080A and LS1043A
  gpio: ts4800: Fix module autoload
  gpio: GPIO_GET_LINEEVENT_IOCTL: Reject invalid line and event flags
  gpio: GPIO_GET_LINEHANDLE_IOCTL: Reject invalid line flags
  gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak
  gpio: GPIO_GET_LINEEVENT_IOCTL: Validate line offset
  gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak
  gpio: GPIO_GET_LINEHANDLE_IOCTL: Validate line offset
  gpio: GPIO_GET_CHIPINFO_IOCTL: Fix information leak
  gpio: GPIO_GET_CHIPINFO_IOCTL: Fix line offset validation
  clk: at91: Fix a return value in case of error
  ahci: fix nvec check
  xhci: use default USB_RESUME_TIMEOUT when resuming ports.
  xhci: workaround for hosts missing CAS bit
  xhci: add restart quirk for Intel Wildcatpoint PCH
  gpio / ACPI: fix returned error from acpi_dev_gpio_irq_get()
  gpio: mockup: add sysfs dependency
  gpio: stmpe: || vs && typo
  gpio: mxs: Unmap region obtained by of_iomap
  gpio/board.txt: point to gpiod_set_value
  ALSA: hda - Fix headset mic detection problem for two Dell laptops
  USB: serial: cp210x: fix tiocmget error handling
  thermal/powerclamp: correct cpu support check
  thermal: intel_pch_thermal: Enable Haswell PCH
  thermal: intel_pch_thermal: Add an ACPI passive trip
  xfs: remove xfs_bunmapi_cow
  xfs: optimize xfs_reflink_end_cow
  xfs: optimize xfs_reflink_cancel_cow_blocks
  xfs: refactor xfs_bunmapi_cow
  xfs: optimize writes to reflink files
  xfs: don't bother looking at the refcount tree for reads
  xfs: handle "raw" delayed extents xfs_reflink_trim_around_shared
  xfs: add xfs_trim_extent
  iomap: add IOMAP_REPORT
  xfs: merge xfs_reflink_remap_range and xfs_file_share_range
  xfs: remove xfs_file_wait_for_io
  xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range
  xfs: fix the same_inode check in xfs_file_share_range
  xfs: remove the same fs check from xfs_file_share_range
  libxfs: v3 inodes are only valid on crc-enabled filesystems
  libxfs: clean up _calc_dquots_per_chunk
  xfs: unset MS_ACTIVE if mount fails
  xfs: remove pointless error goto in xfs_bmap_remap_alloc
  xfs: don't take the IOLOCK exclusive for direct I/O page invalidation
  xfs: add some 'static' annotations
  xfs: Fix uninitialized variable in xfs_reflink_reserve_cow_range()
  xfs: remove redundant assignment of ifp
  ARC: fix build warning in elf.h
  clk: uniphier: rename MIO clock to SD clock for Pro5, PXs2, LD20 SoCs
  clk: uniphier: fix memory overrun bug
  pmem: report error on clear poison failure
  libnvdimm, namespace: potential NULL deref on allocation error
  ahci: only try to use multi-MSI mode if there is more than 1 port
  ARC: Adjust cpuinfo for non-continuous cpu ids
  perf/x86/intel/cstate: Add C-state residency events for Knights Landing
  wusb: fix error return code in wusb_prf()
  hwrng: core - Don't use a stack buffer in add_early_randomness()
  arm64: dts: rockchip: remove the abuse of keep-power-in-suspend
  dm rq: clear kworker_task if kthread_run() returned an error
  dm: free io_barrier after blk_cleanup_queue call
  ALSA: asihpi: fix kernel memory disclosure
  Revert "Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size"
  Revert "usb: dwc2: gadget: fix TX FIFO size and address initialization"
  Revert "usb: dwc2: gadget: change variable name to more meaningful"
  ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table
  ALSA: hda - allow 40 bit DMA mask for NVidia devices
  clk: hi6220: use CLK_OF_DECLARE_DRIVER for sysctrl and mediactrl clock init
  clk: mvebu: armada-37xx-periph: Fix the clock gate flag
  clk: bcm2835: Clamp the PLL's requested rate to the hardware limits.
  clk: max77686: fix number of clocks setup for clk_hw based registration
  clk: mvebu: armada-37xx-periph: Fix the clock provider registration
  clk: core: add __init decoration for CLK_OF_DECLARE_DRIVER function
  clk: mediatek: Add hardware dependency
  clk: samsung: clk-exynos-audss: Fix module autoload
  clk: uniphier: fix type of variable passed to regmap_read()
  clk: uniphier: add system clock support for sLD3 SoC
  ARM: multi_v7_defconfig: Enable Intel e1000e driver
  MAINTAINERS: add myself as Marvell berlin SoC maintainer
  bus: qcom-ebi2: depend on ARCH_QCOM or COMPILE_TEST
  ARM: dts: fix the SD card on the Snowball
  dm raid: fix activation of existing raid4/10 devices
  scsi: NCR5380: no longer mark irq probing as __init
  scsi: be2iscsi: Replace _bh with _irqsave/irqrestore
  scsi: libiscsi: Fix locking in __iscsi_conn_send_pdu
  USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7
  s390/dumpstack: get rid of return_address again
  s390/disassambler: use pr_cont where appropriate
  s390/dumpstack: use pr_cont where appropriate
  s390/dumpstack: restore reliable indicator for call traces
  wusb: Stop using the stack for sg crypto scratch space
  usb: dwc3: Fix size used in dma_free_coherent()
  usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable
  usb: gadget: f_fs: edit epfile->ep under lock
  usb: dwc2: Add msleep for host-only
  s390/mm: use hugetlb_bad_size()
  s390/cio: don't register chpids in reserved state
  s390: ignore pkey system calls
  s390/dasd: avoid undefined behaviour
  usb: dwc3: gadget: never pre-start Isochronous endpoints
  usb: dwc3: gadget: properly account queued requests
  usb: gadget: function: u_ether: don't starve tx request queue
  usb: gadget: udc: atmel: fix endpoint name
  staging/lustre/llite: Move unstable_stats from sysfs to debugfs
  Staging: wilc1000: Fix kernel Oops on opening the device
  staging: android/ion: testing the wrong variable
  Staging: greybus: uart: Use gbphy_dev->dev instead of bundle->dev
  Staging: greybus: gpio: Use gbphy_dev->dev instead of bundle->dev
  ARC: [build] Support gz, lzma compressed uImage
  ARCv2: intc: untangle SMP, MCIP and IDU
  arm64: dts: rockchip: remove always-on and boot-on from vcc_sd
  dm mirror: use all available legs on multiple failures
  dm mirror: fix read error on recovery after default leg failure
  Btrfs: fix incremental send failure caused by balance
  dm raid: fix compat_features validation
  iio: adc: ti-adc081c: Select IIO_TRIGGERED_BUFFER to prevent build errors
  iio: maxim_thermocouple: Align 16 bit big endian value of raw reads
  arm64: dts: marvell: fix clocksource for CP110 master SPI0
  ARM: mvebu: Select corediv clk for all mvebu v7 SoC

Conflicts:
	drivers/android/binder.c
	drivers/staging/android/ion/ion.c
	drivers/usb/gadget/function/u_ether.c

Change-Id: Id87fbdb0731f6f7681787faa4cebf067a402aa7e
Signed-off-by: Channagoud Kadabi <ckadabi@codeaurora.org>
2016-11-05 16:31:24 -07:00
Jiri Olsa
5aab90ce1e perf/powerpc: Don't call perf_event_disable() from atomic context
The trinity syscall fuzzer triggered following WARN() on powerpc:

  WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278
  ...
  NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0
  LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0
  Call Trace:
  [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable)
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

Followed by a lockdep warning:

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.8.0-rc5+ #7 Tainted: G        W
  -------------------------------
  ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section!

  other info that might help us debug this:

  rcu_scheduler_active = 1, debug_locks = 0
  2 locks held by ls/2998:
   #0:  (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0
   #1:  (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0

  stack backtrace:
  CPU: 9 PID: 2998 Comm: ls Tainted: G        W       4.8.0-rc5+ #7
  Call Trace:
  [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable)
  [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180
  [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0
  [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0
  [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380
  [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60
  [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

While it looks like the first WARN() is probably valid, the other one is
triggered by disabling event via perf_event_disable() from atomic context.

The event is disabled here in case we were not able to emulate
the instruction that hit the breakpoint. By disabling the event
we unschedule the event and make sure it's not scheduled back.

But we can't call perf_event_disable() from atomic context, instead
we need to use the event's pending_disable irq_work method to disable it.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-28 11:06:25 +02:00
Jeff Vander Stoep
88b2873135 FROMLIST: security,perf: Allow further restriction of perf_event_open
When kernel.perf_event_open is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.

This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN).  This version doesn't include making
the variable read-only.  It also allows enabling further restriction
at run-time regardless of whether the default is changed.

https://lkml.org/lkml/2016/1/11/587

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

Bug: 29054680
Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8
2016-10-24 23:41:21 +08:00
Linus Torvalds
687ee0ad4e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) BBR TCP congestion control, from Neal Cardwell, Yuchung Cheng and
    co. at Google. https://lwn.net/Articles/701165/

 2) Do TCP Small Queues for retransmits, from Eric Dumazet.

 3) Support collect_md mode for all IPV4 and IPV6 tunnels, from Alexei
    Starovoitov.

 4) Allow cls_flower to classify packets in ip tunnels, from Amir Vadai.

 5) Support DSA tagging in older mv88e6xxx switches, from Andrew Lunn.

 6) Support GMAC protocol in iwlwifi mwm, from Ayala Beker.

 7) Support ndo_poll_controller in mlx5, from Calvin Owens.

 8) Move VRF processing to an output hook and allow l3mdev to be
    loopback, from David Ahern.

 9) Support SOCK_DESTROY for UDP sockets. Also from David Ahern.

10) Congestion control in RXRPC, from David Howells.

11) Support geneve RX offload in ixgbe, from Emil Tantilov.

12) When hitting pressure for new incoming TCP data SKBs, perform a
    partial rathern than a full purge of the OFO queue (which could be
    huge). From Eric Dumazet.

13) Convert XFRM state and policy lookups to RCU, from Florian Westphal.

14) Support RX network flow classification to igb, from Gangfeng Huang.

15) Hardware offloading of eBPF in nfp driver, from Jakub Kicinski.

16) New skbmod packet action, from Jamal Hadi Salim.

17) Remove some inefficiencies in snmp proc output, from Jia He.

18) Add FIB notifications to properly propagate route changes to
    hardware which is doing forwarding offloading. From Jiri Pirko.

19) New dsa driver for qca8xxx chips, from John Crispin.

20) Implement RFC7559 ipv6 router solicitation backoff, from Maciej
    Żenczykowski.

21) Add L3 mode to ipvlan, from Mahesh Bandewar.

22) Support 802.1ad in mlx4, from Moshe Shemesh.

23) Support hardware LRO in mediatek driver, from Nelson Chang.

24) Add TC offloading to mlx5, from Or Gerlitz.

25) Convert various drivers to ethtool ksettings interfaces, from
    Philippe Reynes.

26) TX max rate limiting for cxgb4, from Rahul Lakkireddy.

27) NAPI support for ath10k, from Rajkumar Manoharan.

28) Support XDP in mlx5, from Rana Shahout and Saeed Mahameed.

29) UDP replicast support in TIPC, from Richard Alpe.

30) Per-queue statistics for qed driver, from Sudarsana Reddy Kalluru.

31) Support BQL in thunderx driver, from Sunil Goutham.

32) TSO support in alx driver, from Tobias Regnery.

33) Add stream parser engine and use it in kcm.

34) Support async DHCP replies in ipconfig module, from Uwe
    Kleine-König.

35) DSA port fast aging for mv88e6xxx driver, from Vivien Didelot.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1715 commits)
  mlxsw: switchx2: Fix misuse of hard_header_len
  mlxsw: spectrum: Fix misuse of hard_header_len
  net/faraday: Stop NCSI device on shutdown
  net/ncsi: Introduce ncsi_stop_dev()
  net/ncsi: Rework the channel monitoring
  net/ncsi: Allow to extend NCSI request properties
  net/ncsi: Rework request index allocation
  net/ncsi: Don't probe on the reserved channel ID (0x1f)
  net/ncsi: Introduce NCSI_RESERVED_CHANNEL
  net/ncsi: Avoid unused-value build warning from ia64-linux-gcc
  net: Add netdev all_adj_list refcnt propagation to fix panic
  net: phy: Add Edge-rate driver for Microsemi PHYs.
  vmxnet3: Wake queue from reset work
  i40e: avoid NULL pointer dereference and recursive errors on early PCI error
  qed: Add RoCE ll2 & GSI support
  qed: Add support for memory registeration verbs
  qed: Add support for QP verbs
  qed: PD,PKEY and CQ verb support
  qed: Add support for RoCE hw init
  qede: Add qedr framework
  ...
2016-10-05 10:11:24 -07:00
Alexei Starovoitov
aa6a5f3cb2 perf, bpf: add perf events core support for BPF_PROG_TYPE_PERF_EVENT programs
Allow attaching BPF_PROG_TYPE_PERF_EVENT programs to sw and hw perf events
via overflow_handler mechanism.
When program is attached the overflow_handlers become stacked.
The program acts as a filter.
Returning zero from the program means that the normal perf_event_output handler
will not be called and sampling event won't be stored in the ring buffer.

The overflow_handler_context==NULL is an additional safety check
to make sure programs are not attached to hw breakpoints and watchdog
in case other checks (that prevent that now anyway) get accidentally
relaxed in the future.

The program refcnt is incremented in case perf_events are inhereted
when target task is forked.
Similar to kprobe and tracepoint programs there is no ioctl to
detach the program or swap already attached program. The user space
expected to close(perf_event_fd) like it does right now for kprobe+bpf.
That restriction simplifies the code quite a bit.

The invocation of overflow_handler in __perf_event_overflow() is now
done via READ_ONCE, since that pointer can be replaced when the program
is attached while perf_event itself could have been active already.
There is no need to do similar treatment for event->prog, since it's
assigned only once before it's accessed.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02 10:46:44 -07:00
Alexei Starovoitov
0515e5999a bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type
Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)

The program visible context meta structure is
struct bpf_perf_event_data {
    struct pt_regs regs;
     __u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
  ... ctx->sample_period ...
  ... ctx->regs.ip ...
}

The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02 10:46:44 -07:00
David Carrillo-Cisneros
d6a2f9035b perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG
Introduce the flag PMU_EV_CAP_READ_ACTIVE_PKG, useful for uncore events,
that allows a PMU to signal the generic perf code that an event is readable
in the current CPU if the event is active in a CPU in the same package as
the current CPU.

This is an optimization that avoids a unnecessary IPI for the common case
where uncore events are run and read in the same package but in
different CPUs.

As an example, the IPI removal speeds up perf_read() in my Haswell system
as follows:

  - For event UNC_C_LLC_LOOKUP: From 260 us to 31 us.
  - For event RAPL's power/energy-cores/: From to 255 us to 27 us.

For the optimization to work, all events in the group must have it
(similarly to PERF_EV_CAP_SOFTWARE).

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471467307-61171-4-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:53:59 +02:00
David Carrillo-Cisneros
4ff6a8debf perf/core: Generalize event->group_flags
Currently, PERF_GROUP_SOFTWARE is used in the group_flags field of a
group's leader to indicate that is_software_event(event) is true for all
events in a group. This is the only usage of event->group_flags.

This pattern of setting a group level flags when all events in the group
share a property is useful for the flag introduced in the next patch and
for future CQM/CMT flags. So this patches generalizes group_flags to work
as an aggregate of event level flags.

PERF_GROUP_SOFTWARE denotes an inmutable event's property. All other flags
that I intend to add are also determinable at event initialization.
To better convey the above, this patch renames event's group_flags to
group_caps and PERF_GROUP_SOFTWARE to PERF_EV_CAP_SOFTWARE.

Individual event flags are stored in the new event->event_caps. Since the
cap flags do not change after event initialization, there is no need to
serialize event_caps. This new field is used when events are added to a
context, similarly to how PERF_GROUP_SOFTWARE and is_software_event()
worked.

Lastly, for consistency, updates is_software_event() to rely in event_cap
instead of the context index.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471467307-61171-3-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:44:21 +02:00
Peter Zijlstra
e48c178814 perf/core: Optimize perf_pmu_sched_task()
For perf record -b, which requires the pmu::sched_task callback the
current code is rather expensive:

     7.68%  sched-pipe  [kernel.vmlinux]    [k] perf_pmu_sched_task
     5.95%  sched-pipe  [kernel.vmlinux]    [k] __switch_to
     5.20%  sched-pipe  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     3.95%  sched-pipe  perf                [.] worker_thread

The problem is that it will iterate all registered PMUs, most of which
will not have anything to do. Avoid this by keeping an explicit list
of PMUs that have requested the callback.

The perf_sched_cb_{inc,dec}() functions already takes the required pmu
argument, and now that these functions are no longer called from NMI
context we can use them to manage a list.

With this patch applied the function doesn't show up in the top 4
anymore (it dropped to 18th place).

     6.67%  sched-pipe  [kernel.vmlinux]    [k] __switch_to
     6.18%  sched-pipe  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     3.92%  sched-pipe  [kernel.vmlinux]    [k] switch_mm_irqs_off
     3.71%  sched-pipe  perf                [.] worker_thread

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 13:13:28 +02:00
David Carrillo-Cisneros
db4a835601 perf/core: Set cgroup in CPU contexts for new cgroup events
There's a perf stat bug easy to observer on a machine with only one cgroup:

  $ perf stat -e cycles -I 1000 -C 0 -G /
  #          time             counts unit events
      1.000161699      <not counted>      cycles                    /
      2.000355591      <not counted>      cycles                    /
      3.000565154      <not counted>      cycles                    /
      4.000951350      <not counted>      cycles                    /

We'd expect some output there.

The underlying problem is that there is an optimization in
perf_cgroup_sched_{in,out}() that skips the switch of cgroup events
if the old and new cgroups in a task switch are the same.

This optimization interacts with the current code in two ways
that cause a CPU context's cgroup (cpuctx->cgrp) to be NULL even if a
cgroup event matches the current task. These are:

  1. On creation of the first cgroup event in a CPU: In current code,
  cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
  aforesaid optimization, perf_cgroup_sched_in will run until the next
  cgroup switches in that CPU. This may happen late or never happen,
  depending on system's number of cgroups, CPU load, etc.

  2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
  cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
  because cpuctx->cgrp == NULL until a cgroup switch occurs and
  perf_cgroup_sched_in is executed (updating cpuctx->cgrp).

This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
mirroring what list_del_event does when removing a cgroup event from CPU
context, as introduced in:

  commit 68cacd2916 ("perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()")

With this patch, cpuctx->cgrp is always set/clear when installing/removing
the first/last cgroup event in/from the CPU context. With cpuctx->cgrp
correctly set, event_filter_match works as intended when events are
sched in/out.

After the fix, the output is as expected:

  $ perf stat -e cycles -I 1000 -a -G /
  #         time             counts unit events
     1.004699159          627342882      cycles                    /
     2.007397156          615272690      cycles                    /
     3.010019057          616726074      cycles                    /

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1470124092-113192-1-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 13:05:52 +02:00
Linus Torvalds
a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00
Linus Torvalds
468fc7ed55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Unified UDP encapsulation offload methods for drivers, from
    Alexander Duyck.

 2) Make DSA binding more sane, from Andrew Lunn.

 3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.

 4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.

 5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
    packets as soon as the device sees them, with the option to mirror
    the packet on TX via the same interface.  From Brenden Blanco and
    others.

 6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.

 7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.

 8) Simplify netlink conntrack entry layout, from Florian Westphal.

 9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
    Schimmel, Yotam Gigi, and Jiri Pirko.

10) Add SKB array infrastructure and convert tun and macvtap over to it.
    From Michael S Tsirkin and Jason Wang.

11) Support qdisc packet injection in pktgen, from John Fastabend.

12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.

13) Add NV congestion control support to TCP, from Lawrence Brakmo.

14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.

15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.

16) Support MPLS over IPV4, from Simon Horman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  xgene: Fix build warning with ACPI disabled.
  be2net: perform temperature query in adapter regardless of its interface state
  l2tp: Correctly return -EBADF from pppol2tp_getname.
  net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
  net: ipmr/ip6mr: update lastuse on entry change
  macsec: ensure rx_sa is set when validation is disabled
  tipc: dump monitor attributes
  tipc: add a function to get the bearer name
  tipc: get monitor threshold for the cluster
  tipc: make cluster size threshold for monitoring configurable
  tipc: introduce constants for tipc address validation
  net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
  MAINTAINERS: xgene: Add driver and documentation path
  Documentation: dtb: xgene: Add MDIO node
  dtb: xgene: Add MDIO node
  drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
  drivers: net: xgene: Use exported functions
  drivers: net: xgene: Enable MDIO driver
  drivers: net: xgene: Add backward compatibility
  drivers: net: phy: xgene: Add MDIO driver
  ...
2016-07-27 12:03:20 -07:00
Daniel Borkmann
aa7145c16d bpf, events: fix offset in skb copy handler
This patch fixes the __output_custom() routine we currently use with
bpf_skb_copy(). I missed that when len is larger than the size of the
current handle, we can issue multiple invocations of copy_func, and
__output_custom() advances destination but also source buffer by the
written amount of bytes. When we have __output_custom(), this is actually
wrong since in that case the source buffer points to a non-linear object,
in our case an skb, which the copy_func helper is supposed to walk.
Therefore, since this is non-linear we thus need to pass the offset into
the helper, so that copy_func can use it for extracting the data from
the source object.

Therefore, adjust the callback signatures properly and pass offset
into the skb_header_pointer() invoked from bpf_skb_copy() callback. The
__DEFINE_OUTPUT_COPY_BODY() is adjusted to accommodate for two things:
i) to pass in whether we should advance source buffer or not; this is
a compile-time constant condition, ii) to pass in the offset for
__output_custom(), which we do with help of __VA_ARGS__, so everything
can stay inlined as is currently. Both changes allow for adapting the
__output_* fast-path helpers w/o extra overhead.

Fixes: 555c8a8623 ("bpf: avoid stack copy and use skb ctx for event output")
Fixes: 7e3f977edd ("perf, events: add non-linear data support for raw records")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25 10:34:11 -07:00
Daniel Borkmann
7e3f977edd perf, events: add non-linear data support for raw records
This patch adds support for non-linear data on raw records. It
extends raw records to have one or multiple fragments that will
be written linearly into the ring slot, where each fragment can
optionally have a custom callback handler to walk and extract
complex, possibly non-linear data.

If a callback handler is provided for a fragment, then the new
__output_custom() will be used instead of __output_copy() for
the perf_output_sample() part. perf_prepare_sample() does all
the size calculation only once, so perf_output_sample() doesn't
need to redo the same work anymore, meaning real_size and padding
will be cached in the raw record. The raw record becomes 32 bytes
in size without holes; to not increase it further and to avoid
doing unnecessary recalculations in fast-path, we can reuse
next pointer of the last fragment, idea here is borrowed from
ZERO_OR_NULL_PTR(), which should keep the perf_output_sample()
path for PERF_SAMPLE_RAW minimal.

This facility is needed for BPF's event output helper as a first
user that will, in a follow-up, add an additional perf_raw_frag
to its perf_raw_record in order to be able to more efficiently
dump skb context after a linear head meta data related to it.
skbs can be non-linear and thus need a custom output function to
dump buffers. Currently, the skb data needs to be copied twice;
with the help of __output_custom() this work only needs to be
done once. Future users could be things like XDP/BPF programs
that work on different context though and would thus also have
a different callback function.

The few users of raw records are adapted to initialize their frag
data from the raw record itself, no change in behavior for them.
The code is based upon a PoC diff provided by Peter Zijlstra [1].

  [1] http://thread.gmane.org/gmane.linux.network/421294

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-15 14:23:56 -07:00
Thomas Gleixner
89ab9cb169 perf/core: Remove perf CPU notifier code
All users converted to state machine callbacks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153335.115333381@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14 09:34:42 +02:00
Thomas Gleixner
00e16c3d68 perf/core: Convert to hotplug state machine
Actually a nice symmetric startup/teardown pair which fits properly into
the state machine concept. In the long run we should be able to invoke
the startup callback for the boot CPU via the state machine and get
rid of the init function which invokes it on the boot CPU.

Note: This comes actually before the perf hardware callbacks. In the notifier
model the hardware callbacks have a higher priority than the core
callback. But that's solely for CPU offline so that hardware migration of
events happens before the core is notified about the outgoing CPU.

With the symetric state array model we have the following ordering:

 UP:     core -> hardware
 DOWN:   hardware -> core

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153333.587514098@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14 09:34:31 +02:00
Ingo Molnar
616d1c1b98 Merge branch 'linus' into perf/core, to refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 09:26:46 +02:00
Andi Kleen
fc07e9f983 perf/x86: Support sysfs files depending on SMT status
Add a way to show different sysfs events attributes depending on
HyperThreading is on or off. This is difficult to determine
early at boot, so we just do it dynamically when the sysfs
attribute is read.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03 09:41:22 +02:00
Kan Liang
f2fb6bef92 perf/core: Optimize side-band event delivery
The perf_event_aux() function iterates all PMUs and all events in
their respective per-CPU contexts to find the events to deliver
side-band records to.

For example, the brk test case in lkp triggers many mmap() operations,
which, if we're also running perf, results in many perf_event_aux()
invocations.

If we enable uncore PMU support (even when uncore events are not used),
dozens of uncore PMUs will be iterated, which can significantly
decrease brk_test's throughput.

For example, the brk throughput:

  without uncore PMUs: 2647573 ops_per_sec
  with    uncore PMUs: 1768444 ops_per_sec

... a 33% reduction.

To get at the per-CPU events that need side-band records, this patch
puts these events on a per-CPU list, this avoids iterating the PMUs
and any events that do not need side-band records.

Per task events are unchanged to avoid extra overhead on the context
switch paths.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Huang, Ying <ying.huang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03 09:40:15 +02:00
Arnaldo Carvalho de Melo
97c79a38cd perf core: Per event callchain limit
Additionally to being able to control the system wide maximum depth via
/proc/sys/kernel/perf_event_max_stack, now we are able to ask for
different depths per event, using perf_event_attr.sample_max_stack for
that.

This uses an u16 hole at the end of perf_event_attr, that, when
perf_event_attr.sample_type has the PERF_SAMPLE_CALLCHAIN, if
sample_max_stack is zero, means use perf_event_max_stack, otherwise
it'll be bounds checked under callchain_mutex.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:44 -03:00
Linus Torvalds
bdc6b758e4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Mostly tooling and PMU driver fixes, but also a number of late updates
  such as the reworking of the call-chain size limiting logic to make
  call-graph recording more robust, plus tooling side changes for the
  new 'backwards ring-buffer' extension to the perf ring-buffer"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  perf record: Read from backward ring buffer
  perf record: Rename variable to make code clear
  perf record: Prevent reading invalid data in record__mmap_read
  perf evlist: Add API to pause/resume
  perf trace: Use the ptr->name beautifier as default for "filename" args
  perf trace: Use the fd->name beautifier as default for "fd" args
  perf report: Add srcline_from/to branch sort keys
  perf evsel: Record fd into perf_mmap
  perf evsel: Add overwrite attribute and check write_backward
  perf tools: Set buildid dir under symfs when --symfs is provided
  perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
  perf annotate: Sort list of recognised instructions
  perf annotate: Fix identification of ARM blt and bls instructions
  perf tools: Fix usage of max_stack sysctl
  perf callchain: Stop validating callchains by the max_stack sysctl
  perf trace: Fix exit_group() formatting
  perf top: Use machine->kptr_restrict_warned
  perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
  perf machine: Do not bail out if not managing to read ref reloc symbol
  perf/x86/intel/p4: Trival indentation fix, remove space
  ...
2016-05-25 17:05:40 -07:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Arnaldo Carvalho de Melo
c85b033496 perf core: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.

A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:53 -03:00
Arnaldo Carvalho de Melo
3e4de4ec4c perf core: Add perf_callchain_store_context() helper
We need have different helpers to account how many contexts we have in
the sample and for real addresses, so do it now as a prep patch, to
ease review.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:52 -03:00
Arnaldo Carvalho de Melo
3b1fff0803 perf core: Add a 'nr' field to perf_event_callchain_context
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.

This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:51 -03:00
Arnaldo Carvalho de Melo
cfbcf46845 perf core: Pass max stack as a perf_callchain_entry context
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:50 -03:00
Mark Rutland
5101ef20f0 perf/arm: Special-case hetereogeneous CPUs
Commit:

  2665784850 ("perf/core: Verify we have a single perf_hw_context PMU")

forcefully prevents multiple PMUs from sharing perf_hw_context, as this
generally doesn't make sense. It is a common bug for uncore PMUs to
use perf_hw_context rather than perf_invalid_context, which this detects.

However, systems exist with heterogeneous CPUs (and hence heterogeneous
HW PMUs), for which sharing perf_hw_context is necessary, and possible
in some limited cases.

To make this work we have to perform some gymnastics, as we did in these
commits:

  66eb579e66 ("perf: allow for PMU-specific event filtering")
  c904e32a69 ("arm: perf: filter unschedulable events")

To allow those systems to work, we must allow PMUs for heterogeneous
CPUs to share perf_hw_context, though we must still disallow sharing
otherwise to detect the common misuse of perf_hw_context.

This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates
the core logic to account for this, and makes use of it in the arm_pmu
code that is used for systems with heterogeneous CPUs. Comments are
added to make the rationale clear and hopefully avoid accidental abuse.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:59 +02:00
Alexander Shishkin
375637bc52 perf/core: Introduce address range filtering
Many instruction tracing PMUs out there support address range-based
filtering, which would, for example, generate trace data only for a
given range of instruction addresses, which is useful for tracing
individual functions, modules or libraries. Other PMUs may also
utilize this functionality to allow filtering to or filtering out
code at certain address ranges.

This patch introduces the interface for userspace to specify these
filters and for the PMU drivers to apply these filters to hardware
configuration.

The user interface is an ASCII string that is passed via an ioctl()
and specifies (in the form of an ASCII string) address ranges within
certain object files or within kernel. There is no special treatment
for kernel modules yet, but it might be a worthy pursuit.

The PMU driver interface basically adds two extra callbacks to the
PMU driver structure, one of which validates the filter configuration
proposed by the user against what the hardware is actually capable of
doing and the other one translates hardware-independent filter
configuration into something that can be programmed into the
hardware.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-6-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:57 +02:00
Arnaldo Carvalho de Melo
c5dfd78eb7 perf core: Allow setting up max frame stack depth via sysctl
The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.

And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.

The new file is:

  # cat /proc/sys/kernel/perf_event_max_stack
  127

Chaging it:

  # echo 256 > /proc/sys/kernel/perf_event_max_stack
  # cat /proc/sys/kernel/perf_event_max_stack
  256

But as soon as there is some event using callchains we get:

  # echo 512 > /proc/sys/kernel/perf_event_max_stack
  -bash: echo: write error: Device or resource busy
  #

Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.

Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-27 10:20:39 -03:00
Wang Nan
9ecda41acb perf/core: Add ::write_backward attribute to perf event
This patch introduces 'write_backward' bit to perf_event_attr, which
controls the direction of a ring buffer. After set, the corresponding
ring buffer is written from end to beginning. This feature is design to
support reading from overwritable ring buffer.

Ring buffer can be created by mapping a perf event fd. Kernel puts event
records into ring buffer, user tooling like perf fetch them from
address returned by mmap(). To prevent racing between kernel and tooling,
they communicate to each other through 'head' and 'tail' pointers.
Kernel maintains 'head' pointer, points it to the next free area (tail
of the last record). Tooling maintains 'tail' pointer, points it to the
tail of last consumed record (record has already been fetched). Kernel
determines the available space in a ring buffer using these two
pointers to avoid overwrite unfetched records.

By mapping without 'PROT_WRITE', an overwritable ring buffer is created.
Different from normal ring buffer, tooling is unable to maintain 'tail'
pointer because writing is forbidden. Therefore, for this type of ring
buffers, kernel overwrite old records unconditionally, works like flight
recorder. This feature would be useful if reading from overwritable ring
buffer were as easy as reading from normal ring buffer. However,
there's an obscure problem.

The following figure demonstrates a full overwritable ring buffer. In
this figure, the 'head' pointer points to the end of last record, and a
long record 'E' is pending. For a normal ring buffer, a 'tail' pointer
would have pointed to position (X), so kernel knows there's no more
space in the ring buffer. However, for an overwritable ring buffer,
kernel ignore the 'tail' pointer.

   (X)                              head
    .                                |
    .                                V
    +------+-------+----------+------+---+
    |A....A|B.....B|C........C|D....D|   |
    +------+-------+----------+------+---+

Record 'A' is overwritten by event 'E':

      head
       |
       V
    +--+---+-------+----------+------+---+
    |.E|..A|B.....B|C........C|D....D|E..|
    +--+---+-------+----------+------+---+

Now tooling decides to read from this ring buffer. However, none of these
two natural positions, 'head' and the start of this ring buffer, are
pointing to the head of a record. Even the full ring buffer can be
accessed by tooling, it is unable to find a position to start decoding.

The first attempt tries to solve this problem AFAIK can be found from
[1]. It makes kernel to maintain 'tail' pointer: updates it when ring
buffer is half full. However, this approach introduces overhead to
fast path. Test result shows a 1% overhead [2]. In addition, this method
utilizes no more tham 50% records.

Another attempt can be found from [3], which allows putting the size of
an event at the end of each record. This approach allows tooling to find
records in a backward manner from 'head' pointer by reading size of a
record from its tail. However, because of alignment requirement, it
needs 8 bytes to record the size of a record, which is a huge waste. Its
performance is also not good, because more data need to be written.
This approach also introduces some extra branch instructions to fast
path.

'write_backward' is a better solution to this problem.

Following figure demonstrates the state of the overwritable ring buffer
when 'write_backward' is set before overwriting:

       head
        |
        V
    +---+------+----------+-------+------+
    |   |D....D|C........C|B.....B|A....A|
    +---+------+----------+-------+------+

and after overwriting:
                                     head
                                      |
                                      V
    +---+------+----------+-------+---+--+
    |..E|D....D|C........C|B.....B|A..|E.|
    +---+------+----------+-------+---+--+

In each situation, 'head' points to the beginning of the newest record.
From this record, tooling can iterate over the full ring buffer and fetch
records one by one.

The only limitation that needs to be considered is back-to-back reading.
Due to the non-deterministic of user programs, it is impossible to ensure
the ring buffer keeps stable during reading. Consider an extreme situation:
tooling is scheduled out after reading record 'D', then a burst of events
come, eat up the whole ring buffer (one or multiple rounds). When the
tooling process comes back, reading after 'D' is incorrect now.

To prevent this problem, we need to find a way to ensure the ring buffer
is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is
suggested because its overhead is lower than
ioctl(PERF_EVENT_IOC_ENABLE).

By carefully verifying 'header' pointer, reader can avoid pausing the
ring-buffer. For example:

    /* A union of all possible events */
    union perf_event event;

    p = head = perf_mmap__read_head();
    while (true) {
        /* copy header of next event */
        fetch(&event.header, p, sizeof(event.header));

        /* read 'head' pointer */
        head = perf_mmap__read_head();

        /* check overwritten: is the header good? */
        if (!verify(sizeof(event.header), p, head))
            break;

        /* copy the whole event */
        fetch(&event, p, event.header.size);

        /* read 'head' pointer again */
        head = perf_mmap__read_head();

        /* is the whole event good? */
        if (!verify(event.header.size, p, head))
            break;
        p += event.header.size;
    }

However, the overhead is high because:

 a) In-place decoding is not safe.
    Copying-verifying-decoding is required.
 b) Fetching 'head' pointer requires additional synchronization.

(From Alexei Starovoitov:

Even when this trick works, pause is needed for more than stability of
reading. When we collect the events into overwrite buffer we're waiting
for some other trigger (like all cpu utilization spike or just one cpu
running and all others are idle) and when it happens the buffer has
valuable info from the past. At this point new events are no longer
interesting and buffer should be paused, events read and unpaused until
next trigger comes.)

This patch utilizes event's default overflow_handler introduced
previously. perf_event_output_backward() is created as the default
overflow handler for backward ring buffers. To avoid extra overhead to
fast path, original perf_event_output() becomes __perf_event_output()
and marked '__always_inline'. In theory, there's no extra overhead
introduced to fast path.

Performance testing:

Calling 3000000 times of 'close(-1)', use gettimeofday() to check
duration.  Use 'perf record -o /dev/null -e raw_syscalls:*' to capture
system calls. In ns.

Testing environment:

  CPU    : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
  Kernel : v4.5.0
                    MEAN         STDVAR
 BASE            800214.950    2853.083
 PRE1           2253846.700    9997.014
 PRE2           2257495.540    8516.293
 POST           2250896.100    8933.921

Where 'BASE' is pure performance without capturing. 'PRE1' is test
result of pure 'v4.5.0' kernel. 'PRE2' is test result before this
patch. 'POST' is test result after this patch. See [4] for the detailed
experimental setup.

Considering the stdvar, this patch doesn't introduce performance
overhead to the fast path.

 [1] http://lkml.iu.edu/hypermail/linux/kernel/1304.1/04584.html
 [2] http://lkml.iu.edu/hypermail/linux/kernel/1307.1/00535.html
 [3] http://lkml.iu.edu/hypermail/linux/kernel/1512.0/01265.html
 [4] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: <acme@kernel.org>
Cc: <pi3orama@163.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1459865478-53413-1-git-send-email-wangnan0@huawei.com
[ Fixed the changelog some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23 14:12:39 +02:00
Ingo Molnar
889fac6d67 Merge tag 'v4.6-rc3' into perf/core, to refresh the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 08:57:03 +02:00
Alexei Starovoitov
1e1dcd93b4 perf: split perf_trace_buf_prepare into alloc and update parts
split allows to move expensive update of 'struct trace_entry' to later phase.
Repurpose unused 1st argument of perf_tp_event() to indicate event type.

While splitting use temp variable 'rctx' instead of '*rctx' to avoid
unnecessary loads done by the compiler due to -fno-strict-aliasing

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 21:04:26 -04:00
Alexei Starovoitov
ec5e099d6e perf: optimize perf_fetch_caller_regs
avoid memset in perf_fetch_caller_regs, since it's the critical path of all tracepoints.
It's called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call
with this_cpu_ptr(&__perf_regs[..]) which are zero initialized by perpcu init logic and
subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs,
so we can safely drop memset from all of the above cases and move it into
perf_ftrace_function_call that calls it with stack allocated pt_regs.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 21:04:26 -04:00
Wang Nan
1879445dfa perf/core: Set event's default ::overflow_handler()
Set a default event->overflow_handler in perf_event_alloc() so don't
need to check event->overflow_handler in __perf_event_overflow().
Following commits can give a different default overflow_handler.

Initial idea comes from Peter:

  http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Since the default value of event->overflow_handler is not NULL, existing
'if (!overflow_handler)' checks need to be changed.

is_default_overflow_handler() is introduced for this.

No extra performance overhead is introduced into the hot path because in the
original code we still need to read this handler from memory. A conditional
branch is avoided so actually we remove some instructions.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <pi3orama@163.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1459147292-239310-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 10:30:47 +02:00
Linus Torvalds
3fa2fe2ce0 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains various perf fixes on the kernel side, plus three
  hw/event-enablement late additions:

   - Intel Memory Bandwidth Monitoring events and handling
   - the AMD Accumulated Power Mechanism reporting facility
   - more IOMMU events

  ... and a final round of perf tooling updates/fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  perf llvm: Use strerror_r instead of the thread unsafe strerror one
  perf llvm: Use realpath to canonicalize paths
  perf tools: Unexport some methods unused outside strbuf.c
  perf probe: No need to use formatting strbuf method
  perf help: Use asprintf instead of adhoc equivalents
  perf tools: Remove unused perf_pathdup, xstrdup functions
  perf tools: Do not include stringify.h from the kernel sources
  tools include: Copy linux/stringify.h from the kernel
  tools lib traceevent: Remove redundant CPU output
  perf tools: Remove needless 'extern' from function prototypes
  perf tools: Simplify die() mechanism
  perf tools: Remove unused DIE_IF macro
  perf script: Remove lots of unused arguments
  perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
  perf machine: Rename perf_event__preprocess_sample to machine__resolve
  perf tools: Add cpumode to struct perf_sample
  perf tests: Forward the perf_sample in the dwarf unwind test
  perf tools: Remove misplaced __maybe_unused
  perf list: Fix documentation of :ppp
  perf bench numa: Fix assertion for nodes bitfield
  ...
2016-03-24 10:02:14 -07:00
Huang Rui
c7ab62bfbe perf/x86/amd/power: Add AMD accumulated power reporting mechanism
Introduce an AMD accumlated power reporting mechanism for the Family
15h, Model 60h processor that can be used to calculate the average
power consumed by a processor during a measurement interval. The
feature support is indicated by CPUID Fn8000_0007_EDX[12].

This feature will be implemented both in hwmon and perf. The current
design provides one event to report per package/processor power
consumption by counting each compute unit power value.

Here the gory details of how the computation is done:

* Tsample: compute unit power accumulator sample period
* Tref: the PTSC counter period (PTSC: performance timestamp counter)
* N: the ratio of compute unit power accumulator sample period to the
  PTSC period

* Jmax: max compute unit accumulated power which is indicated by
  MSR_C001007b[MaxCpuSwPwrAcc]

* Jx/Jy: compute unit accumulated power which is indicated by
  MSR_C001007a[CpuSwPwrAcc]

* Tx/Ty: the value of performance timestamp counter which is indicated
  by CU_PTSC MSR_C0010280[PTSC]
* PwrCPUave: CPU average power

i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
	N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].

ii. Read the full range of the cumulative energy value from the new
    MSR MaxCpuSwPwrAcc.
	Jmax = value returned.

iii. At time x, software reads CpuSwPwrAcc and samples the PTSC.
	Jx = value read from CpuSwPwrAcc and Tx = value read from PTSC.

iv. At time y, software reads CpuSwPwrAcc and samples the PTSC.
	Jy = value read from CpuSwPwrAcc and Ty = value read from PTSC.

v. Calculate the average power consumption for a compute unit over
time period (y-x). Unit of result is uWatt:

	if (Jy < Jx) // Rollover has occurred
		Jdelta = (Jy + Jmax) - Jx
	else
		Jdelta = Jy - Jx
	PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)

Simple example:

  root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' make -j4
    CHK     include/config/kernel.release
    CHK     include/generated/uapi/linux/version.h
    CHK     include/generated/utsrelease.h
    CHK     include/generated/timeconst.h
    CHK     include/generated/bounds.h
    CHK     include/generated/asm-offsets.h
    CALL    scripts/checksyscalls.sh
    CHK     include/generated/compile.h
    SKIPPED include/generated/compile.h
    Building modules, stage 2.
  Kernel: arch/x86/boot/bzImage is ready  (#40)
    MODPOST 4225 modules

   Performance counter stats for 'system wide':

              183.44 mWatts power/power-pkg/

       341.837270111 seconds time elapsed

  root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' sleep 10

   Performance counter stats for 'system wide':

                0.18 mWatts power/power-pkg/

        10.012551815 seconds time elapsed

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: jacob.w.shin@gmail.com
Link: http://lkml.kernel.org/r/1457502306-2559-1-git-send-email-ray.huang@amd.com
[ Fixed the modular build. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21 09:37:15 +01:00
Vikas Shivappa
a223c1c7ab perf/x86/cqm: Fix CQM handling of grouping events into a cache_group
Currently CQM (cache quality of service monitoring) is grouping all
events belonging to same PID to use one RMID. However its not counting
all of these different events. Hence we end up with a count of zero
for all events other than the group leader.

The patch tries to address the issue by keeping a flag in the
perf_event.hw which has other CQM related fields. The field is updated
at event creation and during grouping.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
[peterz: Changed hw_perf_event::is_group_event to an int]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fenghua.yu@intel.com
Cc: h.peter.anvin@intel.com
Cc: ravi.v.shankar@intel.com
Cc: vikas.shivappa@intel.com
Link: http://lkml.kernel.org/r/1457652732-4499-2-git-send-email-vikas.shivappa@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21 09:08:18 +01:00
Linus Torvalds
1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Linus Torvalds
e23604edac Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull NOHZ updates from Ingo Molnar:
 "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
  the NOHZ 'can the tick be stopped?' infrastructure and related code to
  be data driven, and harmonizes the naming and handling of all the
  various properties"

[ This makes the ugly "fetch_or()" macro that the scheduler used
  internally a new generic helper, and does a bad job at it.

  I'm pulling it, but I've asked Ingo and Frederic to get this
  fixed up ]

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched-clock: Migrate to use new tick dependency mask model
  posix-cpu-timers: Migrate to use new tick dependency mask model
  sched: Migrate sched to use new tick dependency mask model
  sched: Account rr tasks
  perf: Migrate perf to use new tick dependency mask model
  nohz: Use enum code for tick stop failure tracing message
  nohz: New tick dependency mask
  nohz: Implement wide kick on top of irq work
  atomic: Export fetch_or()
2016-03-14 19:44:38 -07:00
David S. Miller
810813c47a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 12:34:12 -05:00
Ingo Molnar
1f25184656 Merge branch 'timers/core-v9' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz
Pull nohz enhancements from Frederic Weisbecker:

"Currently in nohz full configs, the tick dependency is checked
 asynchronously by nohz code from interrupt and context switch for each
 concerned subsystem with a set of function provided by these. Such
 functions are made of many conditions and details that can be heavyweight
 as they are called on fastpath: sched_can_stop_tick(),
 posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...

 Thomas suggested a few months ago to make that tick dependency check
 synchronous. Instead of checking subsystems details from each interrupt
 to guess if the tick can be stopped, every subsystem that may have a tick
 dependency should set itself a flag specifying the state of that
 dependency. This way we can verify if we can stop the tick with a single
 lightweight mask check on fast path.

 This conversion from a pull to a push model to implement tick dependency
 is the core feature of this patchset that is split into:

  * Nohz wide kick simplification
  * Improve nohz tracing
  * Introduce tick dependency mask
  * Migrate scheduler, posix timers, perf events and sched clock tick
    dependencies to the tick dependency mask."

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 13:17:54 +01:00
Frederic Weisbecker
555e0c1ef7 perf: Migrate perf to use new tick dependency mask model
Instead of providing asynchronous checks for the nohz subsystem to verify
perf event tick dependency, migrate perf to the new mask.

Perf needs the tick for two situations:

1) Freq events. We could set the tick dependency when those are
installed on a CPU context. But setting a global dependency on top of
the global freq events accounting is much easier. If people want that
to be optimized, we can still refine that on the per-CPU tick dependency
level. This patch dooesn't change the current behaviour anyway.

2) Throttled events: this is a per-cpu dependency.

Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2016-03-02 16:43:00 +01:00