mirror of
https://github.com/commaai/agnos-kernel-sdm845.git
synced 2026-06-13 13:54:53 +08:00
49eea524bebea0d2b7dfa1c709a6694de808eb8a
343 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ff97938fbf |
Merge remote-tracking branch '4.9/tmp-8dd0f52' into msm-4.9
* 4.9/tmp-8dd0f52: Linux 4.9.72 sparc32: Export vac_cache_size to fix build error bpf: fix incorrect sign extension in check_alu_op() bpf: reject out-of-bounds stack pointer calculation bpf: fix branch pruning logic bpf: adjust insn_aux_data when patching insns Revert "Bluetooth: btusb: driver to enable the usb-wakeup feature" platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes MIPS: math-emu: Fix final emulation phase for certain instructions thermal/drivers/hisi: Fix multiple alarm interrupts firing thermal/drivers/hisi: Simplify the temperature/step computation thermal/drivers/hisi: Fix kernel panic on alarm interrupt thermal/drivers/hisi: Fix missing interrupt enablement thermal: hisilicon: Handle return value of clk_prepare_enable cpuidle: fix broadcast control when broadcast can not be entered rtc: set the alarm to the next expiring timer tcp: fix under-evaluated ssthresh in TCP Vegas clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision staging: greybus: light: Release memory obtained by kasprintf net: ipv6: send NS for DAD when link operationally up fm10k: ensure we process SM mbx when processing VF mbx vfio/pci: Virtualize Maximum Payload Size scsi: lpfc: PLOGI failures during NPIV testing scsi: lpfc: Fix secure firmware updates fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback tracing: Exclude 'generic fields' from histograms PCI/AER: Report non-fatal errors only to the affected endpoint IB/rxe: check for allocation failure on elem ixgbe: fix use of uninitialized padding igb: check memory allocation failure PM / OPP: Move error message to debug level PCI: Create SR-IOV virtfn/physfn links before attaching driver scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive scsi: cxgb4i: fix Tx skb leak PCI: Avoid bus reset if bridge itself is broken net: phy: at803x: Change error to EINVAL for invalid MAC kvm, mm: account kvm related kmem slabs to kmemcg rtc: pl031: make interrupt optional crypto: crypto4xx - increase context and scatter ring buffer elements backlight: pwm_bl: Fix overflow condition bnxt_en: Fix NULL pointer dereference in reopen failure path cpuidle: powernv: Pass correct drv->cpumask for registration ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory Btrfs: fix an integer overflow check netfilter: nfnetlink_queue: fix secctx memory leak xhci: plat: Register shutdown for xhci_plat net: moxa: fix TX overrun memory leak isdn: kcapi: avoid uninitialized data virtio_balloon: prevent uninitialized variable use virtio-balloon: use actual number of stats for stats queue buffers KVM: pci-assign: do not map smm memory slot pages in vt-d page tables net: ipconfig: fix ic_close_devs() use-after-free cpufreq: Fix creation of symbolic links to policy directories ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to register netfilter: nfnl_cthelper: fix a race when walk the nf_ct_helper_hash table irda: vlsi_ir: fix check for DMA mapping errors RDMA/iser: Fix possible mr leak on device removal event i40e: Do not enable NAPI on q_vectors that have no rings IB/rxe: increment msn only when completing a request IB/rxe: double free on error net: Do not allow negative values for busy_read and busy_poll sysctl interfaces nbd: set queue timeout properly infiniband: Fix alignment of mmap cookies to support VIPT caching IB/core: Protect against self-requeue of a cq work item i40iw: Receive netdev events post INET_NOTIFIER state bna: avoid writing uninitialized data into hw registers s390/qeth: no ETH header for outbound AF_IUCV s390/qeth: size calculation outbound buffers r8152: prevent the driver from transmitting packets with carrier off ASoC: STI: Fix reader substream pointer set HID: xinmo: fix for out of range for THT 2P arcade controller. hwmon: (asus_atk0110) fix uninitialized data access ARM: dts: ti: fix PCI bus dtc warnings KVM: VMX: Fix enable VPID conditions KVM: x86: correct async page present tracepoint kvm: vmx: Flush TLB when the APIC-access address changes scsi: lpfc: Fix PT2PT PRLI reject pinctrl: st: add irq_request/release_resources callbacks inet: frag: release spinlock before calling icmp_send() tipc: fix nametbl deadlock at tipc_nametbl_unsubscribe r8152: fix the rx early size of RTL8153 iommu/exynos: Workaround FLPD cache flush issues for SYSMMU v5 netfilter: nfnl_cthelper: Fix memory leak netfilter: nfnl_cthelper: fix runtime expectation policy updates usb: gadget: udc: remove pointer dereference after free usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed hwmon: (max31790) Set correct PWM value net: qmi_wwan: Add USB IDs for MDM6600 modem on Motorola Droid 4 sctp: out_qlen should be updated when pruning unsent queue bna: integer overflow bug in debugfs sch_dsmark: fix invalid skb_cow() usage vsock: cancel packets when failing to connect vhost-vsock: add pkt cancel capability vsock: track pkt owner vsock crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex r8152: fix the list rx_done may be used without initialization cpuidle: Validate cpu_dev in cpuidle_add_sysfs() nvme-loop: handle cpu unplug when re-establishing the controller arm: kprobes: Align stack to 8-bytes in test code arm: kprobes: Fix the return address of multiple kretprobes HID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair HID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB kvm: fix usage of uninit spinlock in avic_vm_destroy() ALSA: hda - add support for docking station for HP 840 G3 ALSA: hda - add support for docking station for HP 820 G2 arm64: Initialise high_memory global variable earlier cxl: Check if vphb exists before iterating over AFU devices Linux 4.9.71 ath9k: fix tx99 potential info leak icmp: don't fail on fragment reassembly time exceeded IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop RDMA/cma: Avoid triggering undefined behavior macvlan: Only deliver one copy of the frame to the macvlan interface udf: Avoid overflow when session starts at large offset scsi: bfa: integer overflow in debugfs scsi: sd: change allow_restart to bool in sysfs interface scsi: sd: change manage_start_stop to bool in sysfs interface rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend IB/core: Fix calculation of maximum RoCE MTU scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry raid5: Set R5_Expanded on parity devices as well as data. pinctrl: adi2: Fix Kconfig build problem usb: musb: da8xx: fix babble condition handling tty fix oops when rmmod 8250 soc: mediatek: pwrap: fix compiler errors powerpc/perf/hv-24x7: Fix incorrect comparison in memord scsi: hpsa: destroy sas transport properties before scsi_host scsi: hpsa: cleanup sas_phy structures in sysfs when unloading PCI: Detach driver before procfs & sysfs teardown on device remove RDMA/cxgb4: Declare stag as __be32 xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real xfs: fix log block underflow during recovery cycle verification l2tp: cleanup l2tp_tunnel_delete calls nvme: use kref_get_unless_zero in nvme_find_get_ns platform/x86: hp_accel: Add quirk for HP ProBook 440 G4 btrfs: tests: Fix a memory leak in error handling path in 'run_test()' arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27 Ib/hfi1: Return actual operational VLs in port info query bcache: fix wrong cache_misses statistics bcache: explicitly destroy mutex while exiting GFS2: Take inode off order_write list when setting jdata flag scsi: scsi_debug: write_same: fix error report thermal/drivers/step_wise: Fix temperature regulation misbehavior ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod ppp: Destroy the mutex when cleanup clk: tegra: Fix cclk_lp divisor register clk: hi6220: mark clock cs_atb_syspll as critical clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU clk: mediatek: add the option for determining PLL source clock mm: Handle 0 flags in _calc_vm_trans() macro crypto: tcrypt - fix buffer lengths in test_aead_speed() arm-ccn: perf: Prevent module unload while PMU is in use xfs: truncate pagecache before writeback in xfs_setattr_size() iommu/amd: Limit the IOVA page range to the specified addresses badblocks: fix wrong return value in badblocks_set if badblocks are disabled target/file: Do not return error for UNMAP if length is zero target:fix condition return in core_pr_dump_initiator_port() iscsi-target: fix memory leak in lio_target_tiqn_addtpg() target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd() platform/x86: intel_punit_ipc: Fix resource ioremap warning powerpc/ipic: Fix status get and status clear powerpc/opal: Fix EBUSY bug in acquiring tokens netfilter: ipvs: Fix inappropriate output of procfs iommu/mediatek: Fix driver name PCI: Do not allocate more buses than available in parent powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo PCI/PME: Handle invalid data when reading Root Status dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case rtc: pcf8563: fix output clock rate video: fbdev: au1200fb: Return an error code if a memory allocation fails video: fbdev: au1200fb: Release some resources if a memory allocation fails video: udlfb: Fix read EDID timeout fbdev: controlfb: Add missing modes to fix out of bounds access sfc: don't warn on successful change of MAC HID: cp2112: fix broken gpio_direction_input callback Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting" target: fix race during implicit transition work flushes target: fix ALUA transition timeout handling target: Use system workqueue for ALUA transitions btrfs: add missing memset while reading compressed inline extents NFSv4.1 respect server's max size in CREATE_SESSION efi/esrt: Cleanup bad memory map log messages perf symbols: Fix symbols__fixup_end heuristic for corner cases tty: fix data race in tty_ldisc_ref_wait() tty: don't panic on OOM in tty_set_ldisc() rxrpc: Ignore BUSY packets on old calls net: mpls: Fix nexthop alive tracking on down events net/mlx4_core: Avoid delays during VF driver device shutdown nvmet-rdma: Fix a possible uninitialized variable dereference nvmet: confirm sq percpu has scheduled and switched to atomic nvme-loop: fix a possible use-after-free when destroying the admin queue afs: Fix abort on signal while waiting for call completion afs: Fix afs_kill_pages() afs: Fix page leak in afs_write_begin() afs: Populate and use client modification time afs: Better abort and net error handling afs: Invalid op ID should abort with RXGEN_OPCODE afs: Fix the maths in afs_fs_store_data() afs: Prevent callback expiry timer overflow afs: Migrate vlocation fields to 64-bit afs: Flush outstanding writes when an fd is closed afs: Deal with an empty callback array afs: Adjust mode bits processing afs: Populate group ID from vnode status afs: Fix missing put_page() drm/radeon: reinstate oland workaround for sclk mmc: mediatek: Fixed bug where clock frequency could be set wrong sched/deadline: Use deadline instead of period when calculating overflow sched/deadline: Throttle a constrained deadline task activated after the deadline sched/deadline: Make sure the replenishment timer fires in the next period sched/deadline: Add missing update_rq_clock() in dl_task_timer() iwlwifi: mvm: cleanup pending frames in DQA mode Drivers: hv: util: move waiting for release to hv_utils_transport itself drm/radeon/si: add dpm quirk for Oland fjes: Fix wrong netdevice feature flags scsi: hpsa: do not timeout reset operations scsi: hpsa: limit outstanding rescans scsi: hpsa: update check for logical volume status ASoC: rcar: clear DE bit only in PDMACHCR when it stops openrisc: fix issue handling 8 byte get_user calls intel_th: pci: Add Gemini Lake support drm: amd: remove broken include path qed: Fix interrupt flags on Rx LL2 qed: Fix mapping leak on LL2 rx flow qed: Align CIDs according to DORQ requirement mlxsw: reg: Fix SPVMLR max record count mlxsw: reg: Fix SPVM max record count net: Resend IGMP memberships upon peer notification. irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN dmaengine: Fix array index out of bounds warning in __get_unmap_pool() net: wimax/i2400m: fix NULL-deref at probe writeback: fix memory leak in wb_queue_work() blk-mq: Fix tagset reinit in the presence of cpu hot-unplug ASoC: rsnd: fix sound route path when using SRC6/SRC9 netfilter: bridge: honor frag_max_size when refragmenting drm/omap: fix dmabuf mmap for dma_alloc'ed buffers Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list NFSD: fix nfsd_reset_versions for NFSv4. NFSD: fix nfsd_minorversion(.., NFSD_AVAIL) drm/amdgpu: fix parser init error path to avoid crash in parser fini iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it net/mlx5: Don't save PCI state when PCI error is detected net/mlx5: Fix create autogroup prev initializer rxrpc: Wake up the transmitter if Rx window size increases on the peer net: bcmgenet: Power up the internal PHY before probing the MII net: bcmgenet: synchronize irq0 status between the isr and task net: bcmgenet: power down internal phy if open or resume fails net: bcmgenet: reserved phy revisions must be checked first net: bcmgenet: correct MIB access of UniMAC RUNT counters net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values bnxt_en: Ignore 0 value in autoneg supported speed from firmware. net: initialize msg.msg_flags in recvfrom userfaultfd: selftest: vm: allow to build in vm/ directory userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE md-cluster: free md_cluster_info if node leave cluster usb: xhci-mtk: check hcc_params after adding primary hcd KVM: nVMX: do not warn when MSR bitmap address is not backed usb: phy: isp1301: Add OF device ID table mac80211: Fix addition of mesh configuration element ext4: fix crash when a directory's i_size is too small ext4: fix fdatasync(2) after fallocate(2) operation dmaengine: dmatest: move callback wait queue to thread context eeprom: at24: change nvmem stride to 1 sched/rt: Do not pull from current CPU if only one CPU to pull nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests xhci: Don't add a virt_dev to the devs array before it's fully allocated Bluetooth: btusb: driver to enable the usb-wakeup feature usb: xhci: fix TDS for MTK xHCI1.1 ceph: drop negative child dentries before try pruning inode's alias usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input usb: add helper to extract bits 12:11 of wMaxPacketSize usbip: fix stub_rx: get_pipe() to validate endpoint number USB: core: prevent malicious bNumInterfaces overflow USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID tracing: Allocate mask_str buffer dynamically autofs: fix careless error in recent commit crypto: salsa20 - fix blkcipher_walk API usage crypto: hmac - require that the underlying hash algorithm is unkeyed crypto: rsa - fix buffer overread when stripping leading zeroes mfd: fsl-imx25: Clean up irq settings during removal Linux 4.9.70 RDMA/cxgb4: Annotate r2 and stag as __be32 md: free unused memory after bitmap resize audit: ensure that 'audit=1' actually enables audit for PID 1 ipvlan: fix ipv6 outbound device kbuild: do not call cc-option before KBUILD_CFLAGS initialization powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold KVM: arm/arm64: vgic-its: Preserve the revious read from the pending table fix kcm_clone() usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping s390: always save and restore all registers on context switch ipmi: Stop timers before cleaning up the module Fix handling of verdicts after NF_QUEUE tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv() s390/qeth: fix thinko in IPv4 multicast address tracking s390/qeth: fix GSO throughput regression s390/qeth: build max size GSO skbs on L2 devices tcp/dccp: block bh before arming time_wait timer stmmac: reset last TSO segment size after device open net: remove hlist_nulls_add_tail_rcu() usbnet: fix alignment for frames with no ethernet header net/packet: fix a race in packet_bind() and packet_notifier() packet: fix crash in fanout_demux_rollover() sit: update frag_off info rds: Fix NULL pointer dereference in __rds_rdma_map tipc: fix memory leak in tipc_accept_from_sock() s390/qeth: fix early exit from error path net: qmi_wwan: add Quectel BG96 2c7c:0296 ANDROID: dma-buf/sw_sync: Rename active_list to link FROMLIST: android: binder: Fix null ptr dereference in debug msg FROMLIST: android: binder: Move buffer out of area shared with user space FROMLIST: android: binder: Add allocator selftest FROMLIST: android: binder: Refactor prev and next buffer into a helper function Linux 4.9.69 afs: Connect up the CB.ProbeUuid IB/mlx5: Assign send CQ and recv CQ of UMR QP IB/mlx4: Increase maximal message size under UD QP xfrm: Copy policy family in clone_policy jump_label: Invoke jump_label_test() via early_initcall() atm: horizon: Fix irq release error clk: uniphier: fix DAPLL2 clock rate of Pro5 bpf: fix lockdep splat sctp: use the right sk after waking up from wait_buf sleep sctp: do not free asoc when it is already dead in sctp_sendmsg zsmalloc: calling zs_map_object() from irq is a bug sparc64/mm: set fields in deferred pages block: wake up all tasks blocked in get_request() dt-bindings: usb: fix reg-property port-number range xfs: fix forgotten rcu read unlock when skipping inode reclaim sunrpc: Fix rpc_task_begin trace point NFS: Fix a typo in nfs_rename() dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0 lib/genalloc.c: make the avail variable an atomic_long_t drivers/rapidio/devices/rio_mport_cdev.c: fix resource leak in error handling path in 'rio_dma_transfer()' route: update fnhe_expires for redirect when the fnhe exists route: also update fnhe_genid when updating a route cache gre6: use log_ecn_error module parameter in ip6_tnl_rcv() mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl() x86/mpx/selftests: Fix up weird arrays coccinelle: fix parallel build with CHECK=scripts/coccicheck kbuild: pkg: use --transform option to prefix paths in tar EDAC, i5000, i5400: Fix definition of NRECMEMB register EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested drm/amd/amdgpu: fix console deadlock if late init failed axonram: Fix gendisk handling netfilter: don't track fragmented packets zram: set physical queue limits to avoid array out of bounds accesses blk-mq: initialize mq kobjects in blk_mq_init_allocated_queue() i2c: riic: fix restart condition crypto: s5p-sss - Fix completing crypto request in IRQ handler ipv6: reorder icmpv6_init() and ip6_mr_init() ibmvnic: Allocate number of rx/tx buffers agreed on by firmware ibmvnic: Fix overflowing firmware/hardware TX queue rds: tcp: Sequence teardown of listen and acceptor sockets to avoid races bnx2x: do not rollback VF MAC/VLAN filters we did not configure bnx2x: fix detection of VLAN filtering feature for VF bnx2x: fix possible overrun of VFPF multicast addresses array bnx2x: prevent crash when accessing PTP with interface down spi_ks8995: regs_size incorrect for some devices spi_ks8995: fix "BUG: key accdaa28 not in .data!" KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled arm64: KVM: Survive unknown traps from guests arm: KVM: Survive unknown traps from guests KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset irqchip/crossbar: Fix incorrect type of register size scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters scsi: qla2xxx: Fix ql_dump_buffer workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq libata: drop WARN from protocol error in ata_sff_qc_issue() kvm: nVMX: VMCLEAR should not cause the vCPU to shut down usb: gadget: udc: net2280: Fix tmp reusage in net2280 driver usb: gadget: pxa27x: Test for a valid argument pointer usb: dwc3: gadget: Fix system suspend/resume on TI platforms USB: gadgetfs: Fix a potential memory leak in 'dev_config()' usb: gadget: configs: plug memory leak HID: chicony: Add support for another ASUS Zen AiO keyboard gpio: altera: Use handle_level_irq when configured as a level_high ASoC: rcar: avoid SSI_MODEx settings for SSI8 ARM: OMAP2+: Release device node after it is no longer needed. ARM: OMAP2+: Fix device node reference counts powerpc/64: Fix checksum folding in csum_add() module: set __jump_table alignment to 8 lirc: fix dead lock between open and wakeup_filter powerpc: Fix compiling a BE kernel with a powerpc64le toolchain selftest/powerpc: Fix false failures for skipped tests powerpc/64: Invalidate process table caching after setting process table x86/hpet: Prevent might sleep splat on resume sched/fair: Make select_idle_cpu() more aggressive x86/platform/uv/BAU: Fix HUB errors by remove initial write to sw-ack register x86/selftests: Add clobbers for int80 on x86_64 ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure vti6: Don't report path MTU below IPV6_MIN_MTU. ARM: 8657/1: uaccess: consistently check object sizes Revert "spi: SPI_FSL_DSPI should depend on HAS_DMA" Revert "drm/armada: Fix compile fail" mm: drop unused pmdp_huge_get_and_clear_notify() thp: fix MADV_DONTNEED vs. numa balancing race thp: reduce indentation level in change_huge_pmd() ARM: avoid faulting on qemu ARM: BUG if jumping to usermode address in kernel mode usb: f_fs: Force Reserved1=1 in OS_DESC_EXT_COMPAT crypto: talitos - fix ctr-aes-talitos crypto: talitos - fix use of sg_link_tbl_len crypto: talitos - fix AEAD for sha224 on non sha224 capable chips crypto: talitos - fix setkey to check key weakness crypto: talitos - fix memory corruption on SEC2 crypto: talitos - fix AEAD test failures bus: arm-ccn: fix module unloading Error: Removing state 147 which has instances left. bus: arm-ccn: Fix use of smp_processor_id() in preemptible context bus: arm-ccn: Check memory allocation failure bus: arm-cci: Fix use of smp_processor_id() in preemptible context arm64: fpsimd: Prevent registers leaking from dead tasks KVM: arm/arm64: vgic-its: Check result of allocation before use KVM: arm/arm64: vgic-irqfd: Fix MSI entry allocation KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion KVM: VMX: remove I/O port 0x80 bypass on Intel hosts arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one media: dvb: i2c transfers over usb cannot be done from stack drm/exynos: gem: Drop NONCONTIG flag for buffers allocated without IOMMU kdb: Fix handling of kallsyms_symbol_next() return value brcmfmac: change driver unbind order of the sdio function devices powerpc/64s: Initialize ISAv3 MMU registers before setting partition table KVM: s390: Fix skey emulation permission check s390: fix compat system call table smp/hotplug: Move step CPUHP_AP_SMPCFD_DYING to the correct place iommu/vt-d: Fix scatterlist offset handling ALSA: usb-audio: Add check return value for usb_string() ALSA: usb-audio: Fix out-of-bound error ALSA: seq: Remove spurious WARN_ON() at timer check ALSA: pcm: prevent UAF in snd_pcm_info btrfs: fix missing error return in btrfs_drop_snapshot KVM: x86: fix APIC page invalidation x86/PCI: Make broadcom_postcore_init() check acpi_disabled X.509: fix comparisons of ->pkey_algo X.509: reject invalid BIT STRING for subjectPublicKey KEYS: add missing permission check for request_key() destination ASN.1: check for error from ASN1_OP_END__ACT actions ASN.1: fix out-of-bounds read when parsing indefinite length item efi/esrt: Use memunmap() instead of kfree() to free the remapping efi: Move some sysfs files to be read-only by root scsi: libsas: align sata_device's rps_resp on a cacheline scsi: use dma_get_cache_alignment() as minimum DMA alignment scsi: dma-mapping: always provide dma_get_cache_alignment isa: Prevent NULL dereference in isa_bus driver callbacks hv: kvp: Avoid reading past allocated blocks from KVP file virtio: release virtio index when fail to device_register can: usb_8dev: cancel urb on -EPIPE and -EPROTO can: esd_usb2: cancel urb on -EPIPE and -EPROTO can: ems_usb: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: ratelimit errors if incomplete messages are received can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback() can: kvaser_usb: free buf in error paths can: ti_hecc: Fix napi poll return value for repoll usb: gadget: udc: renesas_usb3: fix number of the pipes ANDROID: Revert "arm64: move ELF_ET_DYN_BASE to 4GB / 4MB" ANDROID: Revert "arm: move ELF_ET_DYN_BASE to 4MB" Linux 4.9.68 xen-netfront: avoid crashing on resume after a failure in talk_to_netback() usb: host: fix incorrect updating of offset USB: usbfs: Filter flags passed in from user space USB: devio: Prevent integer overflow in proc_do_submiturb() USB: Increase usbfs transfer limit USB: core: Add type-specific length check of BOS descriptors usb: xhci: fix panic in xhci_free_virt_devices_depth_first usb: hub: Cycle HUB power when initialization fails dma-buf: Update kerneldoc for sync_file_create dma-buf/sync_file: hold reference to fence when creating sync_file dma-buf/sw_sync: force signal all unsignaled fences on dying timeline dma-fence: Introduce drm_fence_set_error() helper dma-fence: Wrap querying the fence->status dma-fence: Clear fence->status during dma_fence_init() dma-buf/sw_sync: clean up list before signaling the fence dma-buf/sw_sync: move timeline_fence_ops around dma-buf/sw-sync: Use an rbtree to sort fences in the timeline dma-buf/sw-sync: Fix locking around sync_timeline lists dma-buf/sw-sync: sync_pt is private and of fixed size dma-buf/sw-sync: Reduce irqsave/irqrestore from known context dma-buf/sw-sync: Prevent user overflow on timeline advance dma-buf/sw-sync: Fix the is-signaled test to handle u32 wraparound dma-buf/dma-fence: Extract __dma_fence_is_later() net: fec: fix multicast filtering hardware setup xen-netback: vif counters from int/long to u64 cec: initiator should be the same as the destination for, poll xen-netfront: Improve error handling during initialization mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers vfio/spapr: Fix missing mutex unlock when creating a window be2net: fix initial MAC setting net: thunderx: avoid dereferencing xcv when NULL net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause gtp: fix cross netns recv on gtp socket gtp: clear DF bit on GTP packet tx nvmet: cancel fatal error and flush async work before free controller i2c: i2c-cadence: Initialize configuration before probing devices tcp: correct memory barrier usage in tcp_check_space() dmaengine: pl330: fix double lock tipc: fix cleanup at module unload tipc: fix nametbl_lock soft lockup at module exit RDMA/qedr: Fix RDMA CM loopback RDMA/qedr: Return success when not changing QP state mac80211: don't try to sleep in rate_control_rate_init() drm/amdgpu: fix unload driver issue for virtual display x86/fpu: Set the xcomp_bv when we fake up a XSAVES area net: sctp: fix array overrun read on sctp_timer_tbl drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement drm/amdgpu: fix bug set incorrect value to vce register qla2xxx: Fix wrong IOCB type assumption powerpc/mm: Fix memory hotplug BUG() on radix perf/x86/intel: Account interrupts for PEBS errors NFSv4: Fix client recovery when server reboots multiple times mac80211: prevent skb/txq mismatch KVM: arm/arm64: Fix occasional warning from the timer work function drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled drm/exynos/decon5433: update shadow registers iff there are active windows nfs: Don't take a reference on fl->fl_file for LOCK operation ravb: Remove Rx overflow log messages mac80211: calculate min channel width correctly mm: fix remote numa hits statistics net: qrtr: Mark 'buf' as little endian libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount net/appletalk: Fix kernel memory disclosure be2net: fix unicast list filling be2net: fix accesses to unicast list vti6: fix device register to report IFLA_INFO_KIND ARM: OMAP1: DMA: Correct the number of logical channels ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate net: systemport: Pad packet before inserting TSB net: systemport: Utilize skb_put_padto() libcxgb: fix error check for ip6_route_output() usb: gadget: f_fs: Fix ExtCompat descriptor validation dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status dmaengine: stm32-dma: Set correct args number for DMA request from DT l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups net/mlx4_en: Fix type mismatch for 32-bit systems dax: Avoid page invalidation races and unnecessary radix tree traversals iio: adc: ti-ads1015: add 10% to conversion wait time tools include: Do not use poison with C++ kprobes/x86: Disable preemption in ftrace-based jprobes perf test attr: Fix ignored test case result usbip: tools: Install all headers needed for libusbip development sysrq : fix Show Regs call trace on ARM EDAC, sb_edac: Fix missing break in switch x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt() serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X usb: phy: tahvo: fix error handling in tahvo_usb_probe() mmc: sdhci-msm: fix issue with power irq spi: spi-axi: fix potential use-after-free after deregistration spi: sh-msiof: Fix DMA transfer size check staging: rtl8188eu: avoid a null dereference on pmlmepriv serial: 8250_fintek: Fix rs485 disablement on invalid ioctl() m68k: fix ColdFire node shift size calculation staging: greybus: loopback: Fix iteration count on async path selftests/x86/ldt_get: Add a few additional tests for limits s390/pci: do not require AIS facility ima: fix hash algorithm initialization USB: serial: option: add Quectel BG96 id s390/runtime instrumentation: simplify task exit handling serial: 8250_pci: Add Amazon PCI serial device ID usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices mm, oom_reaper: gather each vma to prevent leaking TLB entry Revert "crypto: caam - get rid of tasklet" drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume() drm/fsl-dcu: avoid disabling pixel clock twice on suspend bcache: recover data from backing when data is clean bcache: only permit to recovery read error when cache device is clean Linux 4.9.67 drm/i915: Prevent zero length "index" write drm/i915: Don't try indexed reads to alternate slave addresses NFS: revalidate "." etc correctly on "open". Revert "x86/entry/64: Add missing irqflags tracing to native_load_gs_index()" drm/amd/pp: fix typecast error in powerplay. drm/ttm: once more fix ttm_buffer_object_transfer drm/hisilicon: Ensure LDI regs are properly configured. drm/panel: simple: Add missing panel_simple_unprepare() calls drm/radeon: fix atombios on big endian drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() Revert "drm/radeon: dont switch vt on suspend" nvme-pci: add quirk for delay before CHK RDY for WDC SN200 hwmon: (jc42) optionally try to disable the SMBUS timeout bcache: Fix building error on MIPS i2c: i801: Fix Failed to allocate irq -2147483648 error eeprom: at24: check at24_read/write arguments eeprom: at24: correctly set the size for at24mac402 eeprom: at24: fix reading from 24MAC402/24MAC602 mmc: core: prepend 0x to OCR entry in sysfs mmc: core: Do not leave the block driver in a suspended state KVM: lapic: Fixup LDR on load in x2apic KVM: lapic: Split out x2apic ldr calculation KVM: x86: inject exceptions produced by x86_decode_insn KVM: x86: Exit to user-mode on #UD intercept when emulator requires KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate mfd: twl4030-power: Fix pmic for boards that need vmmc1 on reboot nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat nfsd: Fix another OPEN stateid race nfsd: Fix stateid races between OPEN and CLOSE btrfs: clear space cache inode generation always mm/madvise.c: fix madvise() infinite loop under special circumstances mm, hugetlbfs: introduce ->split() to vm_operations_struct mm/cma: fix alloc_contig_range ret code/potential leak mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() ARM: dts: omap3: logicpd-torpedo-37xx-devkit: Fix MMC1 cd-gpio ARM: dts: LogicPD Torpedo: Fix camera pin mux Linux 4.9.66 xen: xenbus driver must not accept invalid transaction ids nvmet: fix KATO offset in Set Features cec: update log_addr[] before finishing configuration cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2 cec: when canceling a message, don't overwrite old status info s390/kbuild: enable modversions for symbols exported from asm ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data btrfs: return the actual error value from from btrfs_uuid_tree_iterate crypto: marvell - Copy IVDIG before launching partial DMA ahash requests ASoC: rsnd: don't double free kctrl netfilter: nf_tables: fix oob access netfilter: nft_queue: use raw_smp_processor_id() spi: SPI_FSL_DSPI should depend on HAS_DMA staging: iio: cdc: fix improper return value iio: light: fix improper return value adm80211: add checks for dma mapping errors mac80211: Suppress NEW_PEER_CANDIDATE event if no room mac80211: Remove invalid flag operations in mesh TSF synchronization drm/mediatek: don't use drm_put_dev clk: qcom: ipq4019: Add all the frequencies for apss cpu drm: Apply range restriction after color adjustment when allocation gpio: mockup: dynamically allocate memory for chip name ALSA: hda - Apply ALC269_FIXUP_NO_SHUTUP on HDA_FIXUP_ACT_PROBE ath10k: set CTS protection VDEV param only if VDEV is up bnxt_en: Set default completion ring for async events. pinctrl: sirf: atlas7: Add missing 'of_node_put()' ath10k: fix potential memory leak in ath10k_wmi_tlv_op_pull_fw_stats() ath10k: ignore configuring the incorrect board_id ath10k: fix incorrect txpower set by P2P_DEVICE interface mwifiex: sdio: fix use after free issue for save_adapter adm80211: return an error if adm8211_alloc_rings() fails rt2800: set minimum MPDU and PSDU lengths to sane values drm/armada: Fix compile fail net: 3com: typhoon: typhoon_init_one: fix incorrect return values net: 3com: typhoon: typhoon_init_one: make return values more specific net: Allow IP_MULTICAST_IF to set index to L3 slave fscrypt: use ENOTDIR when setting encryption policy on nondirectory fscrypt: use ENOKEY when file cannot be created w/o key dmaengine: zx: set DMA_CYCLIC cap_mask bit clk: sunxi-ng: fix PLL_CPUX adjusting on A33 clk: sunxi-ng: A31: Fix spdif clock register drm/sun4i: Fix a return value in case of error PCI: Apply _HPX settings only to relevant devices RDS: RDMA: fix the ib_map_mr_sg_zbva() argument RDS: RDMA: return appropriate error on rdma map failures RDS: make message size limit compliant with spec e1000e: Avoid receiver overrun interrupt bursts e1000e: Separate signaling for link check/link up e1000e: Fix return value test e1000e: Fix error path in link detection Revert "drm/i915: Do not rely on wm preservation for ILK watermarks" PM / OPP: Add missing of_node_put(np) net/9p: Switch to wait_event_killable() fscrypt: lock mutex before checking for bounce page pool sched/rt: Simplify the IPI based RT balancing logic media: v4l2-ctrl: Fix flags field on Control events cx231xx-cards: fix NULL-deref on missing association descriptor media: rc: check for integer overflow media: Don't do DMA on stack for firmware upload in the AS102 driver powerpc/signal: Properly handle return value from uprobe_deny_signal() parisc: Fix validity check of pointer size argument in new CAS implementation ixgbe: Fix skb list corruption on Power systems fm10k: Use smp_rmb rather than read_barrier_depends i40evf: Use smp_rmb rather than read_barrier_depends ixgbevf: Use smp_rmb rather than read_barrier_depends igbvf: Use smp_rmb rather than read_barrier_depends igb: Use smp_rmb rather than read_barrier_depends i40e: Use smp_rmb rather than read_barrier_depends NFC: fix device-allocation error return IB/srp: Avoid that a cable pull can trigger a kernel crash IB/srpt: Do not accept invalid initiator port names libnvdimm, namespace: make 'resource' attribute only readable by root libnvdimm, namespace: fix label initialization to use valid seq numbers libnvdimm, pfn: make 'resource' attribute only readable by root clk: ti: dra7-atl-clock: fix child-node lookups SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status KVM: SVM: obey guest PAT KVM: nVMX: set IDTR and GDTR limits when loading L1 host state lockd: double unregister of inetaddr notifiers irqchip/gic-v3: Fix ppi-partitions lookup block: Fix a race between blk_cleanup_queue() and timeout handling p54: don't unregister leds when they are not initialized mtd: nand: mtk: fix infinite ECC decode IRQ issue mtd: nand: Fix writing mtdoops to nand flash. mtd: nand: omap2: Fix subpage write target: Fix QUEUE_FULL + SCSI task attribute handling iscsi-target: Fix non-immediate TMR reference leak fs/9p: Compare qid.path in v9fs_test_inode fix a page leak in vhost_scsi_iov_to_sgl() error recovery ALSA: hda/realtek - Fix ALC700 family no sound issue ALSA: hda: Fix too short HDMI/DP chmap reporting ALSA: timer: Remove kernel warning at compat ioctl error paths ALSA: usb-audio: Add sanity checks in v2 clock parsers ALSA: usb-audio: Fix potential out-of-bound access at parsing SU ALSA: usb-audio: Add sanity checks to FE parser ALSA: pcm: update tstamp only if audio_tstamp changed ext4: fix interaction between i_size, fallocate, and delalloc after a crash ata: fixes kernel crash while tracing ata_eh_link_autopsy event rtlwifi: fix uninitialized rtlhal->last_suspend_sec time rtlwifi: rtl8192ee: Fix memory leak when loading firmware nfsd: deal with revoked delegations appropriately NFS: Avoid RCU usage in tracepoints nfs: Fix ugly referral attributes NFS: Fix typo in nomigration mount option isofs: fix timestamps beyond 2027 bcache: check ca->alloc_thread initialized before wake up it libceph: don't WARN() if user tries to add invalid key eCryptfs: use after free in ecryptfs_release_messaging() nilfs2: fix race condition that causes file system corruption autofs: don't fail mount for transient error rt2x00usb: mark device removed when get ENOENT usb error MIPS: BCM47XX: Fix LED inversion for WRT54GSv1 MIPS: Fix an n32 core file generation regset support regression MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry MIPS: Fix odd fp register warnings with MIPS64r2 dm: fix race between dm_get_from_kobject() and __dm_destroy() MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver dm: allocate struct mapped_device with kvzalloc dm bufio: fix integer overflow when limiting maximum cache size ALSA: hda: Add Raven PCI ID PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF MIPS: ralink: Fix typo in mt7628 pinmux function MIPS: ralink: Fix MT7628 pinmux ARM: 8721/1: mm: dump: check hardware RO bit for LPAE ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE arm64: Implement arch-specific pte_access_permitted() x86/entry/64: Add missing irqflags tracing to native_load_gs_index() x86/decoder: Add new TEST instruction pattern lib/mpi: call cond_resched() from mpi_powm() loop sched: Make resched_cpu() unconditional vsock: use new wait API for vsock_stream_sendmsg() ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER x86/mm: fix use-after-free of vma during userfaultfd fault ACPI / EC: Fix regression related to triggering source of EC event handling s390/disassembler: increase show_code buffer size s390/disassembler: add missing end marker for e7 table s390/runtime instrumention: fix possible memory corruption s390: fix transactional execution control register handling Conflicts: drivers/android/binder_alloc.c drivers/android/binder_alloc.h drivers/android/binder_alloc_selftest.c drivers/mmc/core/bus.c drivers/mmc/host/sdhci-msm.c drivers/thermal/step_wise.c kernel/cpu.c mm/oom_kill.c sound/usb/mixer.c Change-Id: Id01eb66cafc5970b460321e44ec8ffcfa76971a6 Signed-off-by: Kyle Yan <kyan@codeaurora.org> |
||
|
|
eb4ef2dea9 |
Merge 4.9.68 into android-4.9-o
Changes in 4.9.68 bcache: only permit to recovery read error when cache device is clean bcache: recover data from backing when data is clean drm/fsl-dcu: avoid disabling pixel clock twice on suspend drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume() Revert "crypto: caam - get rid of tasklet" mm, oom_reaper: gather each vma to prevent leaking TLB entry uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub serial: 8250_pci: Add Amazon PCI serial device ID s390/runtime instrumentation: simplify task exit handling USB: serial: option: add Quectel BG96 id ima: fix hash algorithm initialization s390/pci: do not require AIS facility selftests/x86/ldt_get: Add a few additional tests for limits staging: greybus: loopback: Fix iteration count on async path m68k: fix ColdFire node shift size calculation serial: 8250_fintek: Fix rs485 disablement on invalid ioctl() staging: rtl8188eu: avoid a null dereference on pmlmepriv spi: sh-msiof: Fix DMA transfer size check spi: spi-axi: fix potential use-after-free after deregistration mmc: sdhci-msm: fix issue with power irq usb: phy: tahvo: fix error handling in tahvo_usb_probe() serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt() EDAC, sb_edac: Fix missing break in switch sysrq : fix Show Regs call trace on ARM usbip: tools: Install all headers needed for libusbip development perf test attr: Fix ignored test case result kprobes/x86: Disable preemption in ftrace-based jprobes tools include: Do not use poison with C++ iio: adc: ti-ads1015: add 10% to conversion wait time dax: Avoid page invalidation races and unnecessary radix tree traversals net/mlx4_en: Fix type mismatch for 32-bit systems l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups dmaengine: stm32-dma: Set correct args number for DMA request from DT dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status usb: gadget: f_fs: Fix ExtCompat descriptor validation libcxgb: fix error check for ip6_route_output() net: systemport: Utilize skb_put_padto() net: systemport: Pad packet before inserting TSB ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate ARM: OMAP1: DMA: Correct the number of logical channels vti6: fix device register to report IFLA_INFO_KIND be2net: fix accesses to unicast list be2net: fix unicast list filling net/appletalk: Fix kernel memory disclosure libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount net: qrtr: Mark 'buf' as little endian mm: fix remote numa hits statistics mac80211: calculate min channel width correctly ravb: Remove Rx overflow log messages nfs: Don't take a reference on fl->fl_file for LOCK operation drm/exynos/decon5433: update shadow registers iff there are active windows drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled KVM: arm/arm64: Fix occasional warning from the timer work function mac80211: prevent skb/txq mismatch NFSv4: Fix client recovery when server reboots multiple times perf/x86/intel: Account interrupts for PEBS errors powerpc/mm: Fix memory hotplug BUG() on radix qla2xxx: Fix wrong IOCB type assumption drm/amdgpu: fix bug set incorrect value to vce register drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement net: sctp: fix array overrun read on sctp_timer_tbl x86/fpu: Set the xcomp_bv when we fake up a XSAVES area drm/amdgpu: fix unload driver issue for virtual display mac80211: don't try to sleep in rate_control_rate_init() RDMA/qedr: Return success when not changing QP state RDMA/qedr: Fix RDMA CM loopback tipc: fix nametbl_lock soft lockup at module exit tipc: fix cleanup at module unload dmaengine: pl330: fix double lock tcp: correct memory barrier usage in tcp_check_space() i2c: i2c-cadence: Initialize configuration before probing devices nvmet: cancel fatal error and flush async work before free controller gtp: clear DF bit on GTP packet tx gtp: fix cross netns recv on gtp socket net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause net: thunderx: avoid dereferencing xcv when NULL be2net: fix initial MAC setting vfio/spapr: Fix missing mutex unlock when creating a window mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers xen-netfront: Improve error handling during initialization cec: initiator should be the same as the destination for, poll xen-netback: vif counters from int/long to u64 net: fec: fix multicast filtering hardware setup dma-buf/dma-fence: Extract __dma_fence_is_later() dma-buf/sw-sync: Fix the is-signaled test to handle u32 wraparound dma-buf/sw-sync: Prevent user overflow on timeline advance dma-buf/sw-sync: Reduce irqsave/irqrestore from known context dma-buf/sw-sync: sync_pt is private and of fixed size dma-buf/sw-sync: Fix locking around sync_timeline lists dma-buf/sw-sync: Use an rbtree to sort fences in the timeline dma-buf/sw_sync: move timeline_fence_ops around dma-buf/sw_sync: clean up list before signaling the fence dma-fence: Clear fence->status during dma_fence_init() dma-fence: Wrap querying the fence->status dma-fence: Introduce drm_fence_set_error() helper dma-buf/sw_sync: force signal all unsignaled fences on dying timeline dma-buf/sync_file: hold reference to fence when creating sync_file dma-buf: Update kerneldoc for sync_file_create usb: hub: Cycle HUB power when initialization fails usb: xhci: fix panic in xhci_free_virt_devices_depth_first USB: core: Add type-specific length check of BOS descriptors USB: Increase usbfs transfer limit USB: devio: Prevent integer overflow in proc_do_submiturb() USB: usbfs: Filter flags passed in from user space usb: host: fix incorrect updating of offset xen-netfront: avoid crashing on resume after a failure in talk_to_netback() Linux 4.9.68 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
a88ff235e8 |
perf/x86/intel: Account interrupts for PEBS errors
[ Upstream commit 475113d937adfd150eb82b5e2c5507125a68e7af ]
It's possible to set up PEBS events to get only errors and not
any data, like on SNB-X (model 45) and IVB-EP (model 62)
via 2 perf commands running simultaneously:
taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10
This leads to a soft lock up, because the error path of the
intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
for error PEBS interrupts, so in case you're getting ONLY
errors you don't have a way to stop the event when it's over
the max_samples_per_tick limit:
NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
...
RIP: 0010:[<ffffffff81159232>] [<ffffffff81159232>] smp_call_function_single+0xe2/0x140
...
Call Trace:
? trace_hardirqs_on_caller+0xf5/0x1b0
? perf_cgroup_attach+0x70/0x70
perf_install_in_context+0x199/0x1b0
? ctx_resched+0x90/0x90
SYSC_perf_event_open+0x641/0xf90
SyS_perf_event_open+0x9/0x10
do_syscall_64+0x6c/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
Add perf_event_account_interrupt() which does the interrupt
and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
error path.
We keep the pending_kill and pending_wakeup logic only in the
__perf_event_overflow() path, because they make sense only if
there's any data to deliver.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
b5fd2edf6f |
perf: Fix event cleanup across CPU hotplugs
Perf hardware events are generally tied to a particular CPU. During the cleanup process perf_remove_from_context() tries to execute the PMU related cleanup on the CPU that the event is tied to. But if the CPU is offline, the cleanup is not completed successfully at the PMU level, but the event is freed (at the core level). This fix is to collect the events that are approaching the release when it's CPU is down. These zombie events are cleaned up when the CPU is hotplugged on again. Change-Id: I9ddf8f32cfcec4e0cb1c0910fb68b6db984d7553 Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org> |
||
|
|
459740ae0d |
perf: Enable user and kernel event sharing
The kernel and the user-space are now able to share the similar events. The kernel and user's attr->type (hardware/raw) and attr->config should be same for them to share the same counter. Change-Id: I4a4b35bde6beaf8f2aef74e683a9804e31807013 Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org> |
||
|
|
daf4560d30 |
perf: protect group_leader from races that cause ctx double-free
When moving a group_leader perf event from a software-context to a hardware-context, there's a race in checking and updating that context. The existing locking solution doesn't work; note that it tries to grab a lock inside the group_leader's context object, which you can only get at by going through a pointer that should be protected from these races. To avoid that problem, and to produce a simple solution, we can just use a lock per group_leader to protect all checks on the group_leader's context. The new lock is grabbed and released when no context locks are held. Bug: 30955111 Bug: 31095224 Change-Id: If37124c100ca6f4aa962559fba3bd5dbbec8e052 Git-repo: https://android.googlesource.com/kernel/msm Git-commit: 5b87e00be9ca28ea32cab49b92c0386e4a91f730 Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org> |
||
|
|
b02d7648fb |
perf: enable perf to continue across hotplug
Currently perf hardware, software and tracepoint events are
deleted when a cpu is hotplugged out. This change restarts
the events after hotplug. In arm_pmu.c most of the code
for handline power collapse is reused for hotplug.
This change supercedes commit 1f0f95c5fe9e ("perf: add hotplug
support so that perf continues after hotplug") and uses the
new hotplug notification method.
Change-Id: Ia9de1c150558125e2da5f2c4354c268ce4b1154a
Signed-off-by: Patrick Fay <pfay@codeaurora.org>
|
||
|
|
d28f856d2c |
FROMLIST: security,perf: Allow further restriction of perf_event_open
When kernel.perf_event_open is set to 3 (or greater), disallow all access to performance events by users without CAP_SYS_ADMIN. Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that makes this value the default. This is based on a similar feature in grsecurity (CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making the variable read-only. It also allows enabling further restriction at run-time regardless of whether the default is changed. https://lkml.org/lkml/2016/1/11/587 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Bug: 29054680 Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8 |
||
|
|
e6d4a923be |
Merge remote-tracking branch 'origin/tmp-a909d3e' into 1105
* origin/tmp-a909d3e: Linux 4.9-rc3 x86/smpboot: Init apic mapping before usage ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region() ACPICA: Dispatcher: Fix an unbalanced lock exit path in acpi_ds_auto_serialize_method() ACPICA: Dispatcher: Fix order issue of method termination ARC: module: print pretty section names ARC: module: elide loop to save reference to .eh_frame ARC: mm: retire ARC_DBG_TLB_MISS_COUNT... ARC: build: retire old toggles ARC: boot log: refactor cpu name/release printing ARC: boot log: remove awkward space comma from MMU line ARC: boot log: don't assume SWAPE instruction support ARC: boot log: refactor printing abt features not captured in BCRs ARCv2: boot log: print IOC exists as well as enabled status ubifs: Fix regression in ubifs_readdir() ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap() MAINTAINERS: Add entry for genwqe driver VMCI: Doorbell create and destroy fixes GenWQE: Fix bad page access during abort of resource allocation vme: vme_get_size potentially returning incorrect value on failure tty: serial_core: fix NULL struct tty pointer access in uart_write_wakeup tty: serial_core: Fix serial console crash on port shutdown tty/serial: at91: fix hardware handshake on Atmel platforms perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors perf/powerpc: Don't call perf_event_disable() from atomic context perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk cris/arch-v32: cryptocop: print a hex number after a 0x prefix ipack: print a hex number after a 0x prefix block: DAC960: print a hex number after a 0x prefix fs: exofs: print a hex number after a 0x prefix lib/genalloc.c: start search from start of chunk mm: memcontrol: do not recurse in direct reclaim CREDITS: update credit information for Martin Kepplinger proc: fix NULL dereference when reading /proc/<pid>/auxv mm: kmemleak: ensure that the task stack is not freed during scanning lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB latent_entropy: raise CONFIG_FRAME_WARN by default kconfig.h: remove config_enabled() macro ipc: account for kmem usage on mqueue and msg mm/slab: improve performance of gathering slabinfo stats mm: page_alloc: use KERN_CONT where appropriate mm/list_lru.c: avoid error-path NULL pointer deref h8300: fix syscall restarting kcov: properly check if we are in an interrupt mm/slab: fix kmemcg cache creation delayed issue device-dax: fix percpu_ref_exit ordering Allow KASAN and HOTPLUG_MEMORY to co-exist when doing build testing nvdimm: make CONFIG_NVDIMM_DAX 'bool' mm: remove unused variable in memory hotplug btrfs: fix races on root_log_ctx lists mm: remove per-zone hashtable of bitlock waitqueues blk-mq: update hardware and software queues for sleeping alloc driver core: Make Kconfig text for DEBUG_TEST_DRIVER_REMOVE stronger kernfs: Add noop_fsync to supported kernfs_file_fops vt: clear selection before resizing sc16is7xx: always write state when configuring GPIO as an output sh-sci: document R8A7743/5 support tty: serial: 8250: 8250_core: NXP SC16C2552 workaround tty: limit terminal size to 4M chars tty: serial: fsl_lpuart: Fix Tx DMA edge case serial: 8250_lpss: enable MSI for sure serial: core: fix console problems on uart_close serial: 8250_uniphier: fix clearing divisor latch access bit serial: 8250_uniphier: fix more unterminated string serial: pch_uart: add terminate entry for dmi_system_id tables devicetree: bindings: uart: Add new compatible string for ZynqMP serial: xuartps: Add new compatible string for ZynqMP serial: SERIAL_STM32 should depend on HAS_DMA serial: stm32: Fix comparisons with undefined register tty: vt, fix bogus division in csi_J powerpc/64s: relocation, register save fixes for system reset interrupt powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() ALSA: usb-audio: Add quirk for Syntek STK1160 sched/fair: Remove unused but set variable 'rq' objtool: Fix rare switch jump table pattern detection security/keys: make BIG_KEYS dependent on stdrng. KEYS: Sort out big_key initialisation KEYS: Fix short sprintf buffer in /proc/keys show function arm64: mm: fix __page_to_voff definition arm64/numa: fix incorrect log for memory-less node arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity block: flush: fix IO hang in case of flood fua req x86: Fix export for mcount and __fentry__ doc: Add missing parameter for msi_setup extcon: qcom-spmi-misc: Sync the extcon state on interrupt drm/drivers: add support for using the arch wc mapping API. x86/io: add interface to reserve io memtype for a resource range. (v1.1) MAINTAINERS: Begin module maintainer transition ahci: fix the single MSI-X case in ahci_init_one timers: Prevent base clock corruption when forwarding timers: Prevent base clock rewind when forwarding clock timers: Lock base for same bucket optimization timers: Plug locking race vs. timer migration ALSA: seq: Fix time account regression i2c: imx: defer probe if bus recovery GPIOs are not ready i2c: designware: Avoid aborted transfers with fast reacting I2C slaves i2c: i801: Fix I2C Block Read on 8-Series/C220 and later i2c: xgene: Avoid dma_buffer overrun i2c: digicolor: Fix module autoload i2c: xlr: Fix module autoload for OF registration i2c: xlp9xx: Fix module autoload i2c: jz4780: Fix module autoload i2c: allow configuration of imx driver for ColdFire architecture i2c: mark device nodes only in case of successful instantiation x86/quirks: Hide maybe-uninitialized warning x86/build: Fix build with older GCC versions x86/unwind: Fix empty stack dereference in guess unwinder i2c: rk3x: Give the tuning value 0 during rk3x_i2c_v0_calc_timings i2c: hix5hd2: allow build with ARCH_HISI usb: chipidea: host: fix NULL ptr dereference during shutdown ALSA: hda - Fix surround output pins for ASRock B150M mobo hv: do not lose pending heartbeat vmbus packets mm: unexport __get_user_pages() proc: don't use FOLL_FORCE for reading cmdline and environment cpufreq: intel_pstate: Always set max P-state in performance mode nbd: fix incorrect unlock of nbd->sock_lock in sock_shutdown orangefs: don't use d_time orangefs: user file_inode() where it is due mei: txe: don't clean an unprocessed interrupt cause. ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct ANDROID: binder: Add strong ref checks ARCv2: IOC: use @ioc_enable not @ioc_exist where intended ARC: syscall for userspace cmpxchg assist dm table: fix missing dm_put_target_type() in dm_table_add_target() xenbus: check return value of xenbus_scanf() xenbus: prefer list_for_each() x86: xen: move cpu_up functions out of ifdef xenbus: advertise control feature flags greybus: fix a leak on error in gb_module_create() greybus: es2: fix error return code in ap_probe() greybus: arche-platform: Add missing of_node_put() in arche_platform_change_state() staging: android: ion: Fix error handling in ion_query_heaps() ARM: imx: mach-imx6q: Fix the PHY ID mask for AR8031 PM / suspend: Fix missing KERN_CONT for suspend message usb: renesas_usbhs: add wait after initialization for R-Car Gen3 ACPI / APEI: Fix incorrect return value of ghes_proc() usb: increase ohci watchdog delay to 275 msec usb: musb: Call pm_runtime from musb_gadget_queue usb: musb: Fix hardirq-safe hardirq-unsafe lock order error usb: ehci-platform: increase EHCI_MAX_RSTS to 4 usb: ohci-at91: Set RemoteWakeupConnected bit explicitly. ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs ACPI/PCI: pci_link: penalize SCI correctly ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages ARM: dts: vf610: fix IRQ flag of global timer powerpc/64: Fix race condition in setting lock bit in idle/wakeup code powerpc/64: Re-fix race condition between going idle and entering guest s390/mm: fix zone calculation in arch_add_memory() s390/dumpstack: use pr_cont within show_stack and die ARM: imx: gpc: Fix the imx_gpc_genpd_init() error path ARM: imx: gpc: Initialize all power domains xfs: clear cowblocks tag when cow fork is emptied xfs: fix up inode cowblocks tracking tracepoints fs: Do to trim high file position bits in iomap_page_mkwrite_actor cxl: Fix leaking pid refs in some error paths gpio: mpc8xxx: Correct irq handler function gpio: ath79: Fix module autoload arm64: dts: Updated NAND DT properties for NS2 SVK iio: accel: sca3000_core: avoid potentially uninitialized variable iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value arm64: dts: uniphier: change MIO node to SD control node ARM: dts: uniphier: change MIO node to SD control node reset: uniphier: rename MIO reset to SD reset for Pro5, PXs2, LD20 SoCs arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER ARM: uniphier: select ARCH_HAS_RESET_CONTROLLER badblocks: badblocks_set/clear update unacked_exist softirq: Display IRQ_POLL for irq-poll statistics powerpc: Convert cmp to cmpd in idle enter sequence KVM: PPC: Book3S HV: Fix build error when SMP=n cpufreq: intel_pstate: Set P-state upfront in performance mode USB: serial: fix potential NULL-dereference at probe arm64: dts: Add timer erratum property for LS2080A and LS1043A gpio: ts4800: Fix module autoload gpio: GPIO_GET_LINEEVENT_IOCTL: Reject invalid line and event flags gpio: GPIO_GET_LINEHANDLE_IOCTL: Reject invalid line flags gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak gpio: GPIO_GET_LINEEVENT_IOCTL: Validate line offset gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak gpio: GPIO_GET_LINEHANDLE_IOCTL: Validate line offset gpio: GPIO_GET_CHIPINFO_IOCTL: Fix information leak gpio: GPIO_GET_CHIPINFO_IOCTL: Fix line offset validation clk: at91: Fix a return value in case of error ahci: fix nvec check xhci: use default USB_RESUME_TIMEOUT when resuming ports. xhci: workaround for hosts missing CAS bit xhci: add restart quirk for Intel Wildcatpoint PCH gpio / ACPI: fix returned error from acpi_dev_gpio_irq_get() gpio: mockup: add sysfs dependency gpio: stmpe: || vs && typo gpio: mxs: Unmap region obtained by of_iomap gpio/board.txt: point to gpiod_set_value ALSA: hda - Fix headset mic detection problem for two Dell laptops USB: serial: cp210x: fix tiocmget error handling thermal/powerclamp: correct cpu support check thermal: intel_pch_thermal: Enable Haswell PCH thermal: intel_pch_thermal: Add an ACPI passive trip xfs: remove xfs_bunmapi_cow xfs: optimize xfs_reflink_end_cow xfs: optimize xfs_reflink_cancel_cow_blocks xfs: refactor xfs_bunmapi_cow xfs: optimize writes to reflink files xfs: don't bother looking at the refcount tree for reads xfs: handle "raw" delayed extents xfs_reflink_trim_around_shared xfs: add xfs_trim_extent iomap: add IOMAP_REPORT xfs: merge xfs_reflink_remap_range and xfs_file_share_range xfs: remove xfs_file_wait_for_io xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range xfs: fix the same_inode check in xfs_file_share_range xfs: remove the same fs check from xfs_file_share_range libxfs: v3 inodes are only valid on crc-enabled filesystems libxfs: clean up _calc_dquots_per_chunk xfs: unset MS_ACTIVE if mount fails xfs: remove pointless error goto in xfs_bmap_remap_alloc xfs: don't take the IOLOCK exclusive for direct I/O page invalidation xfs: add some 'static' annotations xfs: Fix uninitialized variable in xfs_reflink_reserve_cow_range() xfs: remove redundant assignment of ifp ARC: fix build warning in elf.h clk: uniphier: rename MIO clock to SD clock for Pro5, PXs2, LD20 SoCs clk: uniphier: fix memory overrun bug pmem: report error on clear poison failure libnvdimm, namespace: potential NULL deref on allocation error ahci: only try to use multi-MSI mode if there is more than 1 port ARC: Adjust cpuinfo for non-continuous cpu ids perf/x86/intel/cstate: Add C-state residency events for Knights Landing wusb: fix error return code in wusb_prf() hwrng: core - Don't use a stack buffer in add_early_randomness() arm64: dts: rockchip: remove the abuse of keep-power-in-suspend dm rq: clear kworker_task if kthread_run() returned an error dm: free io_barrier after blk_cleanup_queue call ALSA: asihpi: fix kernel memory disclosure Revert "Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size" Revert "usb: dwc2: gadget: fix TX FIFO size and address initialization" Revert "usb: dwc2: gadget: change variable name to more meaningful" ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table ALSA: hda - allow 40 bit DMA mask for NVidia devices clk: hi6220: use CLK_OF_DECLARE_DRIVER for sysctrl and mediactrl clock init clk: mvebu: armada-37xx-periph: Fix the clock gate flag clk: bcm2835: Clamp the PLL's requested rate to the hardware limits. clk: max77686: fix number of clocks setup for clk_hw based registration clk: mvebu: armada-37xx-periph: Fix the clock provider registration clk: core: add __init decoration for CLK_OF_DECLARE_DRIVER function clk: mediatek: Add hardware dependency clk: samsung: clk-exynos-audss: Fix module autoload clk: uniphier: fix type of variable passed to regmap_read() clk: uniphier: add system clock support for sLD3 SoC ARM: multi_v7_defconfig: Enable Intel e1000e driver MAINTAINERS: add myself as Marvell berlin SoC maintainer bus: qcom-ebi2: depend on ARCH_QCOM or COMPILE_TEST ARM: dts: fix the SD card on the Snowball dm raid: fix activation of existing raid4/10 devices scsi: NCR5380: no longer mark irq probing as __init scsi: be2iscsi: Replace _bh with _irqsave/irqrestore scsi: libiscsi: Fix locking in __iscsi_conn_send_pdu USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7 s390/dumpstack: get rid of return_address again s390/disassambler: use pr_cont where appropriate s390/dumpstack: use pr_cont where appropriate s390/dumpstack: restore reliable indicator for call traces wusb: Stop using the stack for sg crypto scratch space usb: dwc3: Fix size used in dma_free_coherent() usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable usb: gadget: f_fs: edit epfile->ep under lock usb: dwc2: Add msleep for host-only s390/mm: use hugetlb_bad_size() s390/cio: don't register chpids in reserved state s390: ignore pkey system calls s390/dasd: avoid undefined behaviour usb: dwc3: gadget: never pre-start Isochronous endpoints usb: dwc3: gadget: properly account queued requests usb: gadget: function: u_ether: don't starve tx request queue usb: gadget: udc: atmel: fix endpoint name staging/lustre/llite: Move unstable_stats from sysfs to debugfs Staging: wilc1000: Fix kernel Oops on opening the device staging: android/ion: testing the wrong variable Staging: greybus: uart: Use gbphy_dev->dev instead of bundle->dev Staging: greybus: gpio: Use gbphy_dev->dev instead of bundle->dev ARC: [build] Support gz, lzma compressed uImage ARCv2: intc: untangle SMP, MCIP and IDU arm64: dts: rockchip: remove always-on and boot-on from vcc_sd dm mirror: use all available legs on multiple failures dm mirror: fix read error on recovery after default leg failure Btrfs: fix incremental send failure caused by balance dm raid: fix compat_features validation iio: adc: ti-adc081c: Select IIO_TRIGGERED_BUFFER to prevent build errors iio: maxim_thermocouple: Align 16 bit big endian value of raw reads arm64: dts: marvell: fix clocksource for CP110 master SPI0 ARM: mvebu: Select corediv clk for all mvebu v7 SoC Conflicts: drivers/android/binder.c drivers/staging/android/ion/ion.c drivers/usb/gadget/function/u_ether.c Change-Id: Id87fbdb0731f6f7681787faa4cebf067a402aa7e Signed-off-by: Channagoud Kadabi <ckadabi@codeaurora.org> |
||
|
|
5aab90ce1e |
perf/powerpc: Don't call perf_event_disable() from atomic context
The trinity syscall fuzzer triggered following WARN() on powerpc: WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278 ... NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0 LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 Call Trace: [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable) [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 Followed by a lockdep warning: =============================== [ INFO: suspicious RCU usage. ] 4.8.0-rc5+ #7 Tainted: G W ------------------------------- ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ls/2998: #0: (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0 #1: (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0 stack backtrace: CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7 Call Trace: [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable) [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180 [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0 [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0 [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380 [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60 [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0 [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 While it looks like the first WARN() is probably valid, the other one is triggered by disabling event via perf_event_disable() from atomic context. The event is disabled here in case we were not able to emulate the instruction that hit the breakpoint. By disabling the event we unschedule the event and make sure it's not scheduled back. But we can't call perf_event_disable() from atomic context, instead we need to use the event's pending_disable irq_work method to disable it. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
88b2873135 |
FROMLIST: security,perf: Allow further restriction of perf_event_open
When kernel.perf_event_open is set to 3 (or greater), disallow all access to performance events by users without CAP_SYS_ADMIN. Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that makes this value the default. This is based on a similar feature in grsecurity (CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making the variable read-only. It also allows enabling further restriction at run-time regardless of whether the default is changed. https://lkml.org/lkml/2016/1/11/587 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Bug: 29054680 Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8 |
||
|
|
687ee0ad4e |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) BBR TCP congestion control, from Neal Cardwell, Yuchung Cheng and
co. at Google. https://lwn.net/Articles/701165/
2) Do TCP Small Queues for retransmits, from Eric Dumazet.
3) Support collect_md mode for all IPV4 and IPV6 tunnels, from Alexei
Starovoitov.
4) Allow cls_flower to classify packets in ip tunnels, from Amir Vadai.
5) Support DSA tagging in older mv88e6xxx switches, from Andrew Lunn.
6) Support GMAC protocol in iwlwifi mwm, from Ayala Beker.
7) Support ndo_poll_controller in mlx5, from Calvin Owens.
8) Move VRF processing to an output hook and allow l3mdev to be
loopback, from David Ahern.
9) Support SOCK_DESTROY for UDP sockets. Also from David Ahern.
10) Congestion control in RXRPC, from David Howells.
11) Support geneve RX offload in ixgbe, from Emil Tantilov.
12) When hitting pressure for new incoming TCP data SKBs, perform a
partial rathern than a full purge of the OFO queue (which could be
huge). From Eric Dumazet.
13) Convert XFRM state and policy lookups to RCU, from Florian Westphal.
14) Support RX network flow classification to igb, from Gangfeng Huang.
15) Hardware offloading of eBPF in nfp driver, from Jakub Kicinski.
16) New skbmod packet action, from Jamal Hadi Salim.
17) Remove some inefficiencies in snmp proc output, from Jia He.
18) Add FIB notifications to properly propagate route changes to
hardware which is doing forwarding offloading. From Jiri Pirko.
19) New dsa driver for qca8xxx chips, from John Crispin.
20) Implement RFC7559 ipv6 router solicitation backoff, from Maciej
Żenczykowski.
21) Add L3 mode to ipvlan, from Mahesh Bandewar.
22) Support 802.1ad in mlx4, from Moshe Shemesh.
23) Support hardware LRO in mediatek driver, from Nelson Chang.
24) Add TC offloading to mlx5, from Or Gerlitz.
25) Convert various drivers to ethtool ksettings interfaces, from
Philippe Reynes.
26) TX max rate limiting for cxgb4, from Rahul Lakkireddy.
27) NAPI support for ath10k, from Rajkumar Manoharan.
28) Support XDP in mlx5, from Rana Shahout and Saeed Mahameed.
29) UDP replicast support in TIPC, from Richard Alpe.
30) Per-queue statistics for qed driver, from Sudarsana Reddy Kalluru.
31) Support BQL in thunderx driver, from Sunil Goutham.
32) TSO support in alx driver, from Tobias Regnery.
33) Add stream parser engine and use it in kcm.
34) Support async DHCP replies in ipconfig module, from Uwe
Kleine-König.
35) DSA port fast aging for mv88e6xxx driver, from Vivien Didelot.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1715 commits)
mlxsw: switchx2: Fix misuse of hard_header_len
mlxsw: spectrum: Fix misuse of hard_header_len
net/faraday: Stop NCSI device on shutdown
net/ncsi: Introduce ncsi_stop_dev()
net/ncsi: Rework the channel monitoring
net/ncsi: Allow to extend NCSI request properties
net/ncsi: Rework request index allocation
net/ncsi: Don't probe on the reserved channel ID (0x1f)
net/ncsi: Introduce NCSI_RESERVED_CHANNEL
net/ncsi: Avoid unused-value build warning from ia64-linux-gcc
net: Add netdev all_adj_list refcnt propagation to fix panic
net: phy: Add Edge-rate driver for Microsemi PHYs.
vmxnet3: Wake queue from reset work
i40e: avoid NULL pointer dereference and recursive errors on early PCI error
qed: Add RoCE ll2 & GSI support
qed: Add support for memory registeration verbs
qed: Add support for QP verbs
qed: PD,PKEY and CQ verb support
qed: Add support for RoCE hw init
qede: Add qedr framework
...
|
||
|
|
aa6a5f3cb2 |
perf, bpf: add perf events core support for BPF_PROG_TYPE_PERF_EVENT programs
Allow attaching BPF_PROG_TYPE_PERF_EVENT programs to sw and hw perf events via overflow_handler mechanism. When program is attached the overflow_handlers become stacked. The program acts as a filter. Returning zero from the program means that the normal perf_event_output handler will not be called and sampling event won't be stored in the ring buffer. The overflow_handler_context==NULL is an additional safety check to make sure programs are not attached to hw breakpoints and watchdog in case other checks (that prevent that now anyway) get accidentally relaxed in the future. The program refcnt is incremented in case perf_events are inhereted when target task is forked. Similar to kprobe and tracepoint programs there is no ioctl to detach the program or swap already attached program. The user space expected to close(perf_event_fd) like it does right now for kprobe+bpf. That restriction simplifies the code quite a bit. The invocation of overflow_handler in __perf_event_overflow() is now done via READ_ONCE, since that pointer can be replaced when the program is attached while perf_event itself could have been active already. There is no need to do similar treatment for event->prog, since it's assigned only once before it's accessed. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
0515e5999a |
bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type
Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)
The program visible context meta structure is
struct bpf_perf_event_data {
struct pt_regs regs;
__u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
... ctx->sample_period ...
... ctx->regs.ip ...
}
The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
||
|
|
d6a2f9035b |
perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG
Introduce the flag PMU_EV_CAP_READ_ACTIVE_PKG, useful for uncore events, that allows a PMU to signal the generic perf code that an event is readable in the current CPU if the event is active in a CPU in the same package as the current CPU. This is an optimization that avoids a unnecessary IPI for the common case where uncore events are run and read in the same package but in different CPUs. As an example, the IPI removal speeds up perf_read() in my Haswell system as follows: - For event UNC_C_LLC_LOOKUP: From 260 us to 31 us. - For event RAPL's power/energy-cores/: From to 255 us to 27 us. For the optimization to work, all events in the group must have it (similarly to PERF_EV_CAP_SOFTWARE). Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1471467307-61171-4-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
4ff6a8debf |
perf/core: Generalize event->group_flags
Currently, PERF_GROUP_SOFTWARE is used in the group_flags field of a group's leader to indicate that is_software_event(event) is true for all events in a group. This is the only usage of event->group_flags. This pattern of setting a group level flags when all events in the group share a property is useful for the flag introduced in the next patch and for future CQM/CMT flags. So this patches generalizes group_flags to work as an aggregate of event level flags. PERF_GROUP_SOFTWARE denotes an inmutable event's property. All other flags that I intend to add are also determinable at event initialization. To better convey the above, this patch renames event's group_flags to group_caps and PERF_GROUP_SOFTWARE to PERF_EV_CAP_SOFTWARE. Individual event flags are stored in the new event->event_caps. Since the cap flags do not change after event initialization, there is no need to serialize event_caps. This new field is used when events are added to a context, similarly to how PERF_GROUP_SOFTWARE and is_software_event() worked. Lastly, for consistency, updates is_software_event() to rely in event_cap instead of the context index. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1471467307-61171-3-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
e48c178814 |
perf/core: Optimize perf_pmu_sched_task()
For perf record -b, which requires the pmu::sched_task callback the
current code is rather expensive:
7.68% sched-pipe [kernel.vmlinux] [k] perf_pmu_sched_task
5.95% sched-pipe [kernel.vmlinux] [k] __switch_to
5.20% sched-pipe [kernel.vmlinux] [k] __intel_pmu_disable_all
3.95% sched-pipe perf [.] worker_thread
The problem is that it will iterate all registered PMUs, most of which
will not have anything to do. Avoid this by keeping an explicit list
of PMUs that have requested the callback.
The perf_sched_cb_{inc,dec}() functions already takes the required pmu
argument, and now that these functions are no longer called from NMI
context we can use them to manage a list.
With this patch applied the function doesn't show up in the top 4
anymore (it dropped to 18th place).
6.67% sched-pipe [kernel.vmlinux] [k] __switch_to
6.18% sched-pipe [kernel.vmlinux] [k] __intel_pmu_disable_all
3.92% sched-pipe [kernel.vmlinux] [k] switch_mm_irqs_off
3.71% sched-pipe perf [.] worker_thread
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
db4a835601 |
perf/core: Set cgroup in CPU contexts for new cgroup events
There's a perf stat bug easy to observer on a machine with only one cgroup:
$ perf stat -e cycles -I 1000 -C 0 -G /
# time counts unit events
1.000161699 <not counted> cycles /
2.000355591 <not counted> cycles /
3.000565154 <not counted> cycles /
4.000951350 <not counted> cycles /
We'd expect some output there.
The underlying problem is that there is an optimization in
perf_cgroup_sched_{in,out}() that skips the switch of cgroup events
if the old and new cgroups in a task switch are the same.
This optimization interacts with the current code in two ways
that cause a CPU context's cgroup (cpuctx->cgrp) to be NULL even if a
cgroup event matches the current task. These are:
1. On creation of the first cgroup event in a CPU: In current code,
cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
aforesaid optimization, perf_cgroup_sched_in will run until the next
cgroup switches in that CPU. This may happen late or never happen,
depending on system's number of cgroups, CPU load, etc.
2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
because cpuctx->cgrp == NULL until a cgroup switch occurs and
perf_cgroup_sched_in is executed (updating cpuctx->cgrp).
This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
mirroring what list_del_event does when removing a cgroup event from CPU
context, as introduced in:
commit
|
||
|
|
a6408f6cb6 |
Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
"This is the next part of the hotplug rework.
- Convert all notifiers with a priority assigned
- Convert all CPU_STARTING/DYING notifiers
The final removal of the STARTING/DYING infrastructure will happen
when the merge window closes.
Another 700 hundred line of unpenetrable maze gone :)"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
timers/core: Correct callback order during CPU hot plug
leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
powerpc/numa: Convert to hotplug state machine
arm/perf: Fix hotplug state machine conversion
irqchip/armada: Avoid unused function warnings
ARC/time: Convert to hotplug state machine
clocksource/atlas7: Convert to hotplug state machine
clocksource/armada-370-xp: Convert to hotplug state machine
clocksource/exynos_mct: Convert to hotplug state machine
clocksource/arm_global_timer: Convert to hotplug state machine
rcu: Convert rcutree to hotplug state machine
KVM/arm/arm64/vgic-new: Convert to hotplug state machine
smp/cfd: Convert core to hotplug state machine
x86/x2apic: Convert to CPU hotplug state machine
profile: Convert to hotplug state machine
timers/core: Convert to hotplug state machine
hrtimer: Convert to hotplug state machine
x86/tboot: Convert to hotplug state machine
arm64/armv8 deprecated: Convert to hotplug state machine
hwtracing/coresight-etm4x: Convert to hotplug state machine
...
|
||
|
|
468fc7ed55 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Unified UDP encapsulation offload methods for drivers, from
Alexander Duyck.
2) Make DSA binding more sane, from Andrew Lunn.
3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.
4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.
5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
packets as soon as the device sees them, with the option to mirror
the packet on TX via the same interface. From Brenden Blanco and
others.
6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.
7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.
8) Simplify netlink conntrack entry layout, from Florian Westphal.
9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
Schimmel, Yotam Gigi, and Jiri Pirko.
10) Add SKB array infrastructure and convert tun and macvtap over to it.
From Michael S Tsirkin and Jason Wang.
11) Support qdisc packet injection in pktgen, from John Fastabend.
12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.
13) Add NV congestion control support to TCP, from Lawrence Brakmo.
14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.
15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.
16) Support MPLS over IPV4, from Simon Horman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
xgene: Fix build warning with ACPI disabled.
be2net: perform temperature query in adapter regardless of its interface state
l2tp: Correctly return -EBADF from pppol2tp_getname.
net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
net: ipmr/ip6mr: update lastuse on entry change
macsec: ensure rx_sa is set when validation is disabled
tipc: dump monitor attributes
tipc: add a function to get the bearer name
tipc: get monitor threshold for the cluster
tipc: make cluster size threshold for monitoring configurable
tipc: introduce constants for tipc address validation
net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
MAINTAINERS: xgene: Add driver and documentation path
Documentation: dtb: xgene: Add MDIO node
dtb: xgene: Add MDIO node
drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
drivers: net: xgene: Use exported functions
drivers: net: xgene: Enable MDIO driver
drivers: net: xgene: Add backward compatibility
drivers: net: phy: xgene: Add MDIO driver
...
|
||
|
|
aa7145c16d |
bpf, events: fix offset in skb copy handler
This patch fixes the __output_custom() routine we currently use with bpf_skb_copy(). I missed that when len is larger than the size of the current handle, we can issue multiple invocations of copy_func, and __output_custom() advances destination but also source buffer by the written amount of bytes. When we have __output_custom(), this is actually wrong since in that case the source buffer points to a non-linear object, in our case an skb, which the copy_func helper is supposed to walk. Therefore, since this is non-linear we thus need to pass the offset into the helper, so that copy_func can use it for extracting the data from the source object. Therefore, adjust the callback signatures properly and pass offset into the skb_header_pointer() invoked from bpf_skb_copy() callback. The __DEFINE_OUTPUT_COPY_BODY() is adjusted to accommodate for two things: i) to pass in whether we should advance source buffer or not; this is a compile-time constant condition, ii) to pass in the offset for __output_custom(), which we do with help of __VA_ARGS__, so everything can stay inlined as is currently. Both changes allow for adapting the __output_* fast-path helpers w/o extra overhead. Fixes: |
||
|
|
7e3f977edd |
perf, events: add non-linear data support for raw records
This patch adds support for non-linear data on raw records. It extends raw records to have one or multiple fragments that will be written linearly into the ring slot, where each fragment can optionally have a custom callback handler to walk and extract complex, possibly non-linear data. If a callback handler is provided for a fragment, then the new __output_custom() will be used instead of __output_copy() for the perf_output_sample() part. perf_prepare_sample() does all the size calculation only once, so perf_output_sample() doesn't need to redo the same work anymore, meaning real_size and padding will be cached in the raw record. The raw record becomes 32 bytes in size without holes; to not increase it further and to avoid doing unnecessary recalculations in fast-path, we can reuse next pointer of the last fragment, idea here is borrowed from ZERO_OR_NULL_PTR(), which should keep the perf_output_sample() path for PERF_SAMPLE_RAW minimal. This facility is needed for BPF's event output helper as a first user that will, in a follow-up, add an additional perf_raw_frag to its perf_raw_record in order to be able to more efficiently dump skb context after a linear head meta data related to it. skbs can be non-linear and thus need a custom output function to dump buffers. Currently, the skb data needs to be copied twice; with the help of __output_custom() this work only needs to be done once. Future users could be things like XDP/BPF programs that work on different context though and would thus also have a different callback function. The few users of raw records are adapted to initialize their frag data from the raw record itself, no change in behavior for them. The code is based upon a PoC diff provided by Peter Zijlstra [1]. [1] http://thread.gmane.org/gmane.linux.network/421294 Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
89ab9cb169 |
perf/core: Remove perf CPU notifier code
All users converted to state machine callbacks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153335.115333381@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
00e16c3d68 |
perf/core: Convert to hotplug state machine
Actually a nice symmetric startup/teardown pair which fits properly into the state machine concept. In the long run we should be able to invoke the startup callback for the boot CPU via the state machine and get rid of the init function which invokes it on the boot CPU. Note: This comes actually before the perf hardware callbacks. In the notifier model the hardware callbacks have a higher priority than the core callback. But that's solely for CPU offline so that hardware migration of events happens before the core is notified about the outgoing CPU. With the symetric state array model we have the following ordering: UP: core -> hardware DOWN: hardware -> core Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Siewior <bigeasy@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153333.587514098@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
616d1c1b98 |
Merge branch 'linus' into perf/core, to refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
fc07e9f983 |
perf/x86: Support sysfs files depending on SMT status
Add a way to show different sysfs events attributes depending on HyperThreading is on or off. This is difficult to determine early at boot, so we just do it dynamically when the sysfs attribute is read. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
f2fb6bef92 |
perf/core: Optimize side-band event delivery
The perf_event_aux() function iterates all PMUs and all events in their respective per-CPU contexts to find the events to deliver side-band records to. For example, the brk test case in lkp triggers many mmap() operations, which, if we're also running perf, results in many perf_event_aux() invocations. If we enable uncore PMU support (even when uncore events are not used), dozens of uncore PMUs will be iterated, which can significantly decrease brk_test's throughput. For example, the brk throughput: without uncore PMUs: 2647573 ops_per_sec with uncore PMUs: 1768444 ops_per_sec ... a 33% reduction. To get at the per-CPU events that need side-band records, this patch puts these events on a per-CPU list, this avoids iterating the PMUs and any events that do not need side-band records. Per task events are unchanged to avoid extra overhead on the context switch paths. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Huang, Ying <ying.huang@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
97c79a38cd |
perf core: Per event callchain limit
Additionally to being able to control the system wide maximum depth via /proc/sys/kernel/perf_event_max_stack, now we are able to ask for different depths per event, using perf_event_attr.sample_max_stack for that. This uses an u16 hole at the end of perf_event_attr, that, when perf_event_attr.sample_type has the PERF_SAMPLE_CALLCHAIN, if sample_max_stack is zero, means use perf_event_max_stack, otherwise it'll be bounds checked under callchain_mutex. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
bdc6b758e4 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Mostly tooling and PMU driver fixes, but also a number of late updates such as the reworking of the call-chain size limiting logic to make call-graph recording more robust, plus tooling side changes for the new 'backwards ring-buffer' extension to the perf ring-buffer" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) perf record: Read from backward ring buffer perf record: Rename variable to make code clear perf record: Prevent reading invalid data in record__mmap_read perf evlist: Add API to pause/resume perf trace: Use the ptr->name beautifier as default for "filename" args perf trace: Use the fd->name beautifier as default for "fd" args perf report: Add srcline_from/to branch sort keys perf evsel: Record fd into perf_mmap perf evsel: Add overwrite attribute and check write_backward perf tools: Set buildid dir under symfs when --symfs is provided perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced perf annotate: Sort list of recognised instructions perf annotate: Fix identification of ARM blt and bls instructions perf tools: Fix usage of max_stack sysctl perf callchain: Stop validating callchains by the max_stack sysctl perf trace: Fix exit_group() formatting perf top: Use machine->kptr_restrict_warned perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1 perf machine: Do not bail out if not managing to read ref reloc symbol perf/x86/intel/p4: Trival indentation fix, remove space ... |
||
|
|
a7fd20d1c4 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Highlights:
1) Support SPI based w5100 devices, from Akinobu Mita.
2) Partial Segmentation Offload, from Alexander Duyck.
3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.
4) Allow cls_flower stats offload, from Amir Vadai.
5) Implement bpf blinding, from Daniel Borkmann.
6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
actually using FASYNC these atomics are superfluous. From Eric
Dumazet.
7) Run TCP more preemptibly, also from Eric Dumazet.
8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
driver, from Gal Pressman.
9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.
10) Improve BPF usage documentation, from Jesper Dangaard Brouer.
11) Support tunneling offloads in qed, from Manish Chopra.
12) aRFS offloading in mlx5e, from Maor Gottlieb.
13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
Leitner.
14) Add MSG_EOR support to TCP, this allows controlling packet
coalescing on application record boundaries for more accurate
socket timestamp sampling. From Martin KaFai Lau.
15) Fix alignment of 64-bit netlink attributes across the board, from
Nicolas Dichtel.
16) Per-vlan stats in bridging, from Nikolay Aleksandrov.
17) Several conversions of drivers to ethtool ksettings, from Philippe
Reynes.
18) Checksum neutral ILA in ipv6, from Tom Herbert.
19) Factorize all of the various marvell dsa drivers into one, from
Vivien Didelot
20) Add VF support to qed driver, from Yuval Mintz"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
Revert "phy dp83867: Make rgmii parameters optional"
r8169: default to 64-bit DMA on recent PCIe chips
phy dp83867: Make rgmii parameters optional
phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
bpf: arm64: remove callee-save registers use for tmp registers
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
switchdev: pass pointer to fib_info instead of copy
net_sched: close another race condition in tcf_mirred_release()
tipc: fix nametable publication field in nl compat
drivers: net: Don't print unpopulated net_device name
qed: add support for dcbx.
ravb: Add missing free_irq() calls to ravb_close()
qed: Remove a stray tab
net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fec-mpc52xx: use phydev from struct net_device
bpf, doc: fix typo on bpf_asm descriptions
stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fs-enet: use phydev from struct net_device
...
|
||
|
|
c85b033496 |
perf core: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.
So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.
A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
3e4de4ec4c |
perf core: Add perf_callchain_store_context() helper
We need have different helpers to account how many contexts we have in the sample and for real addresses, so do it now as a prep patch, to ease review. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
3b1fff0803 |
perf core: Add a 'nr' field to perf_event_callchain_context
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.
This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
cfbcf46845 |
perf core: Pass max stack as a perf_callchain_entry context
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
5101ef20f0 |
perf/arm: Special-case hetereogeneous CPUs
Commit: |
||
|
|
375637bc52 |
perf/core: Introduce address range filtering
Many instruction tracing PMUs out there support address range-based filtering, which would, for example, generate trace data only for a given range of instruction addresses, which is useful for tracing individual functions, modules or libraries. Other PMUs may also utilize this functionality to allow filtering to or filtering out code at certain address ranges. This patch introduces the interface for userspace to specify these filters and for the PMU drivers to apply these filters to hardware configuration. The user interface is an ASCII string that is passed via an ioctl() and specifies (in the form of an ASCII string) address ranges within certain object files or within kernel. There is no special treatment for kernel modules yet, but it might be a worthy pursuit. The PMU driver interface basically adds two extra callbacks to the PMU driver structure, one of which validates the filter configuration proposed by the user against what the hardware is actually capable of doing and the other one translates hardware-independent filter configuration into something that can be programmed into the hardware. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-6-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
c5dfd78eb7 |
perf core: Allow setting up max frame stack depth via sysctl
The default remains 127, which is good for most cases, and not even hit most of the time, but then for some cases, as reported by Brendan, 1024+ deep frames are appearing on the radar for things like groovy, ruby. And in some workloads putting a _lower_ cap on this may make sense. One that is per event still needs to be put in place tho. The new file is: # cat /proc/sys/kernel/perf_event_max_stack 127 Chaging it: # echo 256 > /proc/sys/kernel/perf_event_max_stack # cat /proc/sys/kernel/perf_event_max_stack 256 But as soon as there is some event using callchains we get: # echo 512 > /proc/sys/kernel/perf_event_max_stack -bash: echo: write error: Device or resource busy # Because we only allocate the callchain percpu data structures when there is a user, which allows for changing the max easily, its just a matter of having no callchain users at that point. Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
9ecda41acb |
perf/core: Add ::write_backward attribute to perf event
This patch introduces 'write_backward' bit to perf_event_attr, which
controls the direction of a ring buffer. After set, the corresponding
ring buffer is written from end to beginning. This feature is design to
support reading from overwritable ring buffer.
Ring buffer can be created by mapping a perf event fd. Kernel puts event
records into ring buffer, user tooling like perf fetch them from
address returned by mmap(). To prevent racing between kernel and tooling,
they communicate to each other through 'head' and 'tail' pointers.
Kernel maintains 'head' pointer, points it to the next free area (tail
of the last record). Tooling maintains 'tail' pointer, points it to the
tail of last consumed record (record has already been fetched). Kernel
determines the available space in a ring buffer using these two
pointers to avoid overwrite unfetched records.
By mapping without 'PROT_WRITE', an overwritable ring buffer is created.
Different from normal ring buffer, tooling is unable to maintain 'tail'
pointer because writing is forbidden. Therefore, for this type of ring
buffers, kernel overwrite old records unconditionally, works like flight
recorder. This feature would be useful if reading from overwritable ring
buffer were as easy as reading from normal ring buffer. However,
there's an obscure problem.
The following figure demonstrates a full overwritable ring buffer. In
this figure, the 'head' pointer points to the end of last record, and a
long record 'E' is pending. For a normal ring buffer, a 'tail' pointer
would have pointed to position (X), so kernel knows there's no more
space in the ring buffer. However, for an overwritable ring buffer,
kernel ignore the 'tail' pointer.
(X) head
. |
. V
+------+-------+----------+------+---+
|A....A|B.....B|C........C|D....D| |
+------+-------+----------+------+---+
Record 'A' is overwritten by event 'E':
head
|
V
+--+---+-------+----------+------+---+
|.E|..A|B.....B|C........C|D....D|E..|
+--+---+-------+----------+------+---+
Now tooling decides to read from this ring buffer. However, none of these
two natural positions, 'head' and the start of this ring buffer, are
pointing to the head of a record. Even the full ring buffer can be
accessed by tooling, it is unable to find a position to start decoding.
The first attempt tries to solve this problem AFAIK can be found from
[1]. It makes kernel to maintain 'tail' pointer: updates it when ring
buffer is half full. However, this approach introduces overhead to
fast path. Test result shows a 1% overhead [2]. In addition, this method
utilizes no more tham 50% records.
Another attempt can be found from [3], which allows putting the size of
an event at the end of each record. This approach allows tooling to find
records in a backward manner from 'head' pointer by reading size of a
record from its tail. However, because of alignment requirement, it
needs 8 bytes to record the size of a record, which is a huge waste. Its
performance is also not good, because more data need to be written.
This approach also introduces some extra branch instructions to fast
path.
'write_backward' is a better solution to this problem.
Following figure demonstrates the state of the overwritable ring buffer
when 'write_backward' is set before overwriting:
head
|
V
+---+------+----------+-------+------+
| |D....D|C........C|B.....B|A....A|
+---+------+----------+-------+------+
and after overwriting:
head
|
V
+---+------+----------+-------+---+--+
|..E|D....D|C........C|B.....B|A..|E.|
+---+------+----------+-------+---+--+
In each situation, 'head' points to the beginning of the newest record.
From this record, tooling can iterate over the full ring buffer and fetch
records one by one.
The only limitation that needs to be considered is back-to-back reading.
Due to the non-deterministic of user programs, it is impossible to ensure
the ring buffer keeps stable during reading. Consider an extreme situation:
tooling is scheduled out after reading record 'D', then a burst of events
come, eat up the whole ring buffer (one or multiple rounds). When the
tooling process comes back, reading after 'D' is incorrect now.
To prevent this problem, we need to find a way to ensure the ring buffer
is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is
suggested because its overhead is lower than
ioctl(PERF_EVENT_IOC_ENABLE).
By carefully verifying 'header' pointer, reader can avoid pausing the
ring-buffer. For example:
/* A union of all possible events */
union perf_event event;
p = head = perf_mmap__read_head();
while (true) {
/* copy header of next event */
fetch(&event.header, p, sizeof(event.header));
/* read 'head' pointer */
head = perf_mmap__read_head();
/* check overwritten: is the header good? */
if (!verify(sizeof(event.header), p, head))
break;
/* copy the whole event */
fetch(&event, p, event.header.size);
/* read 'head' pointer again */
head = perf_mmap__read_head();
/* is the whole event good? */
if (!verify(event.header.size, p, head))
break;
p += event.header.size;
}
However, the overhead is high because:
a) In-place decoding is not safe.
Copying-verifying-decoding is required.
b) Fetching 'head' pointer requires additional synchronization.
(From Alexei Starovoitov:
Even when this trick works, pause is needed for more than stability of
reading. When we collect the events into overwrite buffer we're waiting
for some other trigger (like all cpu utilization spike or just one cpu
running and all others are idle) and when it happens the buffer has
valuable info from the past. At this point new events are no longer
interesting and buffer should be paused, events read and unpaused until
next trigger comes.)
This patch utilizes event's default overflow_handler introduced
previously. perf_event_output_backward() is created as the default
overflow handler for backward ring buffers. To avoid extra overhead to
fast path, original perf_event_output() becomes __perf_event_output()
and marked '__always_inline'. In theory, there's no extra overhead
introduced to fast path.
Performance testing:
Calling 3000000 times of 'close(-1)', use gettimeofday() to check
duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture
system calls. In ns.
Testing environment:
CPU : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Kernel : v4.5.0
MEAN STDVAR
BASE 800214.950 2853.083
PRE1 2253846.700 9997.014
PRE2 2257495.540 8516.293
POST 2250896.100 8933.921
Where 'BASE' is pure performance without capturing. 'PRE1' is test
result of pure 'v4.5.0' kernel. 'PRE2' is test result before this
patch. 'POST' is test result after this patch. See [4] for the detailed
experimental setup.
Considering the stdvar, this patch doesn't introduce performance
overhead to the fast path.
[1] http://lkml.iu.edu/hypermail/linux/kernel/1304.1/04584.html
[2] http://lkml.iu.edu/hypermail/linux/kernel/1307.1/00535.html
[3] http://lkml.iu.edu/hypermail/linux/kernel/1512.0/01265.html
[4] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: <acme@kernel.org>
Cc: <pi3orama@163.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1459865478-53413-1-git-send-email-wangnan0@huawei.com
[ Fixed the changelog some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
889fac6d67 |
Merge tag 'v4.6-rc3' into perf/core, to refresh the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
1e1dcd93b4 |
perf: split perf_trace_buf_prepare into alloc and update parts
split allows to move expensive update of 'struct trace_entry' to later phase. Repurpose unused 1st argument of perf_tp_event() to indicate event type. While splitting use temp variable 'rctx' instead of '*rctx' to avoid unnecessary loads done by the compiler due to -fno-strict-aliasing Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
ec5e099d6e |
perf: optimize perf_fetch_caller_regs
avoid memset in perf_fetch_caller_regs, since it's the critical path of all tracepoints. It's called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call with this_cpu_ptr(&__perf_regs[..]) which are zero initialized by perpcu init logic and subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs, so we can safely drop memset from all of the above cases and move it into perf_ftrace_function_call that calls it with stack allocated pt_regs. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
1879445dfa |
perf/core: Set event's default ::overflow_handler()
Set a default event->overflow_handler in perf_event_alloc() so don't need to check event->overflow_handler in __perf_event_overflow(). Following commits can give a different default overflow_handler. Initial idea comes from Peter: http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net Since the default value of event->overflow_handler is not NULL, existing 'if (!overflow_handler)' checks need to be changed. is_default_overflow_handler() is introduced for this. No extra performance overhead is introduced into the hot path because in the original code we still need to read this handler from memory. A conditional branch is avoided so actually we remove some instructions. Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459147292-239310-3-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
3fa2fe2ce0 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "This tree contains various perf fixes on the kernel side, plus three hw/event-enablement late additions: - Intel Memory Bandwidth Monitoring events and handling - the AMD Accumulated Power Mechanism reporting facility - more IOMMU events ... and a final round of perf tooling updates/fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) perf llvm: Use strerror_r instead of the thread unsafe strerror one perf llvm: Use realpath to canonicalize paths perf tools: Unexport some methods unused outside strbuf.c perf probe: No need to use formatting strbuf method perf help: Use asprintf instead of adhoc equivalents perf tools: Remove unused perf_pathdup, xstrdup functions perf tools: Do not include stringify.h from the kernel sources tools include: Copy linux/stringify.h from the kernel tools lib traceevent: Remove redundant CPU output perf tools: Remove needless 'extern' from function prototypes perf tools: Simplify die() mechanism perf tools: Remove unused DIE_IF macro perf script: Remove lots of unused arguments perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve perf machine: Rename perf_event__preprocess_sample to machine__resolve perf tools: Add cpumode to struct perf_sample perf tests: Forward the perf_sample in the dwarf unwind test perf tools: Remove misplaced __maybe_unused perf list: Fix documentation of :ppp perf bench numa: Fix assertion for nodes bitfield ... |
||
|
|
c7ab62bfbe |
perf/x86/amd/power: Add AMD accumulated power reporting mechanism
Introduce an AMD accumlated power reporting mechanism for the Family
15h, Model 60h processor that can be used to calculate the average
power consumed by a processor during a measurement interval. The
feature support is indicated by CPUID Fn8000_0007_EDX[12].
This feature will be implemented both in hwmon and perf. The current
design provides one event to report per package/processor power
consumption by counting each compute unit power value.
Here the gory details of how the computation is done:
* Tsample: compute unit power accumulator sample period
* Tref: the PTSC counter period (PTSC: performance timestamp counter)
* N: the ratio of compute unit power accumulator sample period to the
PTSC period
* Jmax: max compute unit accumulated power which is indicated by
MSR_C001007b[MaxCpuSwPwrAcc]
* Jx/Jy: compute unit accumulated power which is indicated by
MSR_C001007a[CpuSwPwrAcc]
* Tx/Ty: the value of performance timestamp counter which is indicated
by CU_PTSC MSR_C0010280[PTSC]
* PwrCPUave: CPU average power
i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
ii. Read the full range of the cumulative energy value from the new
MSR MaxCpuSwPwrAcc.
Jmax = value returned.
iii. At time x, software reads CpuSwPwrAcc and samples the PTSC.
Jx = value read from CpuSwPwrAcc and Tx = value read from PTSC.
iv. At time y, software reads CpuSwPwrAcc and samples the PTSC.
Jy = value read from CpuSwPwrAcc and Ty = value read from PTSC.
v. Calculate the average power consumption for a compute unit over
time period (y-x). Unit of result is uWatt:
if (Jy < Jx) // Rollover has occurred
Jdelta = (Jy + Jmax) - Jx
else
Jdelta = Jy - Jx
PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
Simple example:
root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' make -j4
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
SKIPPED include/generated/compile.h
Building modules, stage 2.
Kernel: arch/x86/boot/bzImage is ready (#40)
MODPOST 4225 modules
Performance counter stats for 'system wide':
183.44 mWatts power/power-pkg/
341.837270111 seconds time elapsed
root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' sleep 10
Performance counter stats for 'system wide':
0.18 mWatts power/power-pkg/
10.012551815 seconds time elapsed
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: jacob.w.shin@gmail.com
Link: http://lkml.kernel.org/r/1457502306-2559-1-git-send-email-ray.huang@amd.com
[ Fixed the modular build. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
a223c1c7ab |
perf/x86/cqm: Fix CQM handling of grouping events into a cache_group
Currently CQM (cache quality of service monitoring) is grouping all events belonging to same PID to use one RMID. However its not counting all of these different events. Hence we end up with a count of zero for all events other than the group leader. The patch tries to address the issue by keeping a flag in the perf_event.hw which has other CQM related fields. The field is updated at event creation and during grouping. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> [peterz: Changed hw_perf_event::is_group_event to an int] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: fenghua.yu@intel.com Cc: h.peter.anvin@intel.com Cc: ravi.v.shankar@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1457652732-4499-2-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
1200b6809d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Highlights:
1) Support more Realtek wireless chips, from Jes Sorenson.
2) New BPF types for per-cpu hash and arrap maps, from Alexei
Starovoitov.
3) Make several TCP sysctls per-namespace, from Nikolay Borisov.
4) Allow the use of SO_REUSEPORT in order to do per-thread processing
of incoming TCP/UDP connections. The muxing can be done using a
BPF program which hashes the incoming packet. From Craig Gallek.
5) Add a multiplexer for TCP streams, to provide a messaged based
interface. BPF programs can be used to determine the message
boundaries. From Tom Herbert.
6) Add 802.1AE MACSEC support, from Sabrina Dubroca.
7) Avoid factorial complexity when taking down an inetdev interface
with lots of configured addresses. We were doing things like
traversing the entire address less for each address removed, and
flushing the entire netfilter conntrack table for every address as
well.
8) Add and use SKB bulk free infrastructure, from Jesper Brouer.
9) Allow offloading u32 classifiers to hardware, and implement for
ixgbe, from John Fastabend.
10) Allow configuring IRQ coalescing parameters on a per-queue basis,
from Kan Liang.
11) Extend ethtool so that larger link mode masks can be supported.
From David Decotigny.
12) Introduce devlink, which can be used to configure port link types
(ethernet vs Infiniband, etc.), port splitting, and switch device
level attributes as a whole. From Jiri Pirko.
13) Hardware offload support for flower classifiers, from Amir Vadai.
14) Add "Local Checksum Offload". Basically, for a tunneled packet
the checksum of the outer header is 'constant' (because with the
checksum field filled into the inner protocol header, the payload
of the outer frame checksums to 'zero'), and we can take advantage
of that in various ways. From Edward Cree"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
bonding: fix bond_get_stats()
net: bcmgenet: fix dma api length mismatch
net/mlx4_core: Fix backward compatibility on VFs
phy: mdio-thunder: Fix some Kconfig typos
lan78xx: add ndo_get_stats64
lan78xx: handle statistics counter rollover
RDS: TCP: Remove unused constant
RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
net: smc911x: convert pxa dma to dmaengine
team: remove duplicate set of flag IFF_MULTICAST
bonding: remove duplicate set of flag IFF_MULTICAST
net: fix a comment typo
ethernet: micrel: fix some error codes
ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
bpf, dst: add and use dst_tclassid helper
bpf: make skb->tc_classid also readable
net: mvneta: bm: clarify dependencies
cls_bpf: reset class and reuse major in da
ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
ldmvsw: Add ldmvsw.c driver code
...
|
||
|
|
e23604edac |
Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull NOHZ updates from Ingo Molnar: "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors the NOHZ 'can the tick be stopped?' infrastructure and related code to be data driven, and harmonizes the naming and handling of all the various properties" [ This makes the ugly "fetch_or()" macro that the scheduler used internally a new generic helper, and does a bad job at it. I'm pulling it, but I've asked Ingo and Frederic to get this fixed up ] * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched-clock: Migrate to use new tick dependency mask model posix-cpu-timers: Migrate to use new tick dependency mask model sched: Migrate sched to use new tick dependency mask model sched: Account rr tasks perf: Migrate perf to use new tick dependency mask model nohz: Use enum code for tick stop failure tracing message nohz: New tick dependency mask nohz: Implement wide kick on top of irq work atomic: Export fetch_or() |
||
|
|
810813c47a |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
1f25184656 |
Merge branch 'timers/core-v9' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz
Pull nohz enhancements from Frederic Weisbecker:
"Currently in nohz full configs, the tick dependency is checked
asynchronously by nohz code from interrupt and context switch for each
concerned subsystem with a set of function provided by these. Such
functions are made of many conditions and details that can be heavyweight
as they are called on fastpath: sched_can_stop_tick(),
posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...
Thomas suggested a few months ago to make that tick dependency check
synchronous. Instead of checking subsystems details from each interrupt
to guess if the tick can be stopped, every subsystem that may have a tick
dependency should set itself a flag specifying the state of that
dependency. This way we can verify if we can stop the tick with a single
lightweight mask check on fast path.
This conversion from a pull to a push model to implement tick dependency
is the core feature of this patchset that is split into:
* Nohz wide kick simplification
* Improve nohz tracing
* Introduce tick dependency mask
* Migrate scheduler, posix timers, perf events and sched clock tick
dependencies to the tick dependency mask."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
555e0c1ef7 |
perf: Migrate perf to use new tick dependency mask model
Instead of providing asynchronous checks for the nohz subsystem to verify perf event tick dependency, migrate perf to the new mask. Perf needs the tick for two situations: 1) Freq events. We could set the tick dependency when those are installed on a CPU context. But setting a global dependency on top of the global freq events accounting is much easier. If people want that to be optimized, we can still refine that on the per-CPU tick dependency level. This patch dooesn't change the current behaviour anyway. 2) Throttled events: this is a per-cpu dependency. Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> |