linux.git
2 weeks agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block master torvalds/master
Linus Torvalds [Thu, 17 Nov 2016 21:59:39 +0000 (13:59 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A set of fixes, one for NVMe from Keith, and a set for nvme-{rdma,t,f}
  from the usual suspects, fixing actual problems that would be a shame
  to release 4.9 with"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme/pci: Don't free queues on error
  nvmet-rdma: drain the queue-pair just before freeing it
  nvme-rdma: stop and free io queues on connect failure
  nvmet-rdma: don't forget to delete a queue from the list of connection failed
  nvmet: Don't queue fatal error work if csts.cfs is set
  nvme-rdma: reject non-connect commands before the queue is live
  nvmet-rdma: Fix possible NULL deref when handling rdma cm events

2 weeks agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Thu, 17 Nov 2016 21:53:02 +0000 (13:53 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rmda fixes from Doug Ledford.
 "First round of -rc fixes.

  Due to various issues, I've been away and couldn't send a pull request
  for about three weeks. There were a number of -rc patches that built
  up in the meantime (some where there already from the early -rc
  stages). Obviously, there were way too many to send now, so I tried to
  pare the list down to the more important patches for the -rc cycle.

  Most of the code has had plenty of soak time at the various vendor's
  testing setups, so I doubt there will be another -rc pull request this
  cycle. I also tried to limit the patches to those with smaller
  footprints, so even though a shortlog is longer than I would like, the
  actual diffstat is mostly very small with the exception of just three
  files that had more changes, and a couple files with pure removals.

  Summary:
   - Misc Intel hfi1 fixes
   - Misc Mellanox mlx4, mlx5, and rxe fixes
   - A couple cxgb4 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
  iw_cxgb4: invalidate the mr when posting a read_w_inv wr
  iw_cxgb4: set *bad_wr for post_send/post_recv errors
  IB/rxe: Update qp state for user query
  IB/rxe: Clear queue buffer when modifying QP to reset
  IB/rxe: Fix handling of erroneous WR
  IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
  IB/mlx4: Fix create CQ error flow
  IB/mlx4: Check gid_index return value
  IB/mlx5: Fix NULL pointer dereference on debug print
  IB/mlx5: Fix fatal error dispatching
  IB/mlx5: Resolve soft lock on massive reg MRs
  IB/mlx5: Use cache line size to select CQE stride
  IB/mlx5: Validate requested RQT size
  IB/mlx5: Fix memory leak in query device
  IB/core: Avoid unsigned int overflow in sg_alloc_table
  IB/core: Add missing check for addr_resolve callback return value
  IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
  IB/cm: Mark stale CM id's whenever the mad agent was unregistered
  IB/uverbs: Fix leak of XRC target QPs
  IB/hfi1: Remove incorrect IS_ERR check
  ...

2 weeks agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Thu, 17 Nov 2016 21:49:30 +0000 (13:49 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "A couple of regression fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix iov_iter_advance() for ITER_PIPE
  xattr: Fix setting security xattrs on sockfs

2 weeks agoMerge tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 17 Nov 2016 21:45:57 +0000 (13:45 -0800)]
Merge tag 'for-linus-4.9-rc5-ofs-1' of git://git./linux/kernel/git/hubcap/linux

Pull orangefs fix from Mike Marshall:
 "orangefs: add .owner to debugfs file_operations

  Without ".owner = THIS_MODULE" it is possible to crash the kernel by
  unloading the Orangefs module while someone is reading debugfs files"

* tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: add .owner to debugfs file_operations

2 weeks agomremap: fix race between mremap() and page cleanning
Aaron Lu [Thu, 10 Nov 2016 09:16:33 +0000 (17:16 +0800)]
mremap: fix race between mremap() and page cleanning

Prior to 3.15, there was a race between zap_pte_range() and
page_mkclean() where writes to a page could be lost.  Dave Hansen
discovered by inspection that there is a similar race between
move_ptes() and page_mkclean().

We've been able to reproduce the issue by enlarging the race window with
a msleep(), but have not been able to hit it without modifying the code.
So, we think it's a real issue, but is difficult or impossible to hit in
practice.

The zap_pte_range() issue is fixed by commit 1cf35d47712d("mm: split
'tlb_flush_mmu()' into tlb flushing and memory freeing parts").  And
this patch is to fix the race between page_mkclean() and mremap().

Here is one possible way to hit the race: suppose a process mmapped a
file with READ | WRITE and SHARED, it has two threads and they are bound
to 2 different CPUs, e.g.  CPU1 and CPU2.  mmap returned X, then thread
1 did a write to addr X so that CPU1 now has a writable TLB for addr X
on it.  Thread 2 starts mremaping from addr X to Y while thread 1
cleaned the page and then did another write to the old addr X again.
The 2nd write from thread 1 could succeed but the value will get lost.

        thread 1                           thread 2
     (bound to CPU1)                    (bound to CPU2)

  1: write 1 to addr X to get a
     writeable TLB on this CPU

                                        2: mremap starts

                                        3: move_ptes emptied PTE for addr X
                                           and setup new PTE for addr Y and
                                           then dropped PTL for X and Y

  4: page laundering for N by doing
     fadvise FADV_DONTNEED. When done,
     pageframe N is deemed clean.

  5: *write 2 to addr X

                                        6: tlb flush for addr X

  7: munmap (Y, pagesize) to make the
     page unmapped

  8: fadvise with FADV_DONTNEED again
     to kick the page off the pagecache

  9: pread the page from file to verify
     the value. If 1 is there, it means
     we have lost the written 2.

  *the write may or may not cause segmentation fault, it depends on
  if the TLB is still on the CPU.

Please note that this is only one specific way of how the race could
occur, it didn't mean that the race could only occur in exact the above
config, e.g. more than 2 threads could be involved and fadvise() could
be done in another thread, etc.

For anonymous pages, they could race between mremap() and page reclaim:
THP: a huge PMD is moved by mremap to a new huge PMD, then the new huge
PMD gets unmapped/splitted/pagedout before the flush tlb happened for
the old huge PMD in move_page_tables() and we could still write data to
it.  The normal anonymous page has similar situation.

To fix this, check for any dirty PTE in move_ptes()/move_huge_pmd() and
if any, did the flush before dropping the PTL.  If we did the flush for
every move_ptes()/move_huge_pmd() call then we do not need to do the
flush in move_pages_tables() for the whole range.  But if we didn't, we
still need to do the whole range flush.

Alternatively, we can track which part of the range is flushed in
move_ptes()/move_huge_pmd() and which didn't to avoid flushing the whole
range in move_page_tables().  But that would require multiple tlb
flushes for the different sub-ranges and should be less efficient than
the single whole range flush.

KBuild test on my Sandybridge desktop doesn't show any noticeable change.
v4.9-rc4:
  real    5m14.048s
  user    32m19.800s
  sys     4m50.320s

With this commit:
  real    5m13.888s
  user    32m19.330s
  sys     4m51.200s

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agofix iov_iter_advance() for ITER_PIPE
Abhi Das [Thu, 17 Nov 2016 03:44:23 +0000 (21:44 -0600)]
fix iov_iter_advance() for ITER_PIPE

iov_iter_advance() needs to decrement iter->count by the number of
bytes we'd moved beyond.  Normal flavours do that, but ITER_PIPE
doesn't and ITER_PIPE generic_file_read_iter() for O_DIRECT files
ends up with a bogus fallback to page cache read, resulting in incorrect
values for file offset and bytes read.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2 weeks agoxattr: Fix setting security xattrs on sockfs
Andreas Gruenbacher [Sun, 13 Nov 2016 20:23:34 +0000 (21:23 +0100)]
xattr: Fix setting security xattrs on sockfs

The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
setxattr support as well.  This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity.  The
smack security module relies on socket labels (xattrs).

Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.

We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity.  A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2 weeks agoMerge tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Thu, 17 Nov 2016 01:24:21 +0000 (17:24 -0800)]
Merge tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes fr9om Dave Airlie:
 "Fixes for amdgpu, and a bunch of arm drivers.

  There seems to be an uptick in the ARM drivers sending things for
  fixes which is good, so I've decided to dequeue a bit early, more
  stuff may arrive before the weekend.

  This contains mediatek, arcpgu, sunxi, fsl-dcu display controller
  fixes along with 3 amdgpu fixes, one for a fencing issue with
  secondary GPUs"

* tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/amdgpu:fix vpost_needed routine
  drm/amdgpu/powerplay: drop a redundant NULL check
  drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
  drm/arcpgu: Accommodate adv7511 switch to DRM bridge
  drm/fsl-dcu: disable planes before disabling CRTC
  drm/fsl-dcu: update all registers on flush
  drm/fsl-dcu: do not update when modifying irq registers
  drm/sun4i: Propagate error to the caller
  drm/sun4i: Fix error handling
  drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range
  drm/mediatek: enhance the HDMI driving current
  drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable
  drm/mediatek: clear IRQ status before enable OVL interrupt
  drm/mediatek: set vblank_disable_allowed to true
  drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE
  drm/sun4i: rgb: Remove the bridge enable/disable functions
  drm/sun4i: rgb: Enable panel after controller

2 weeks agoiw_cxgb4: invalidate the mr when posting a read_w_inv wr
Steve Wise [Thu, 3 Nov 2016 19:09:38 +0000 (12:09 -0700)]
iw_cxgb4: invalidate the mr when posting a read_w_inv wr

Also, rearrange things a bit to have a common c4iw_invalidate_mr()
function used everywhere that we need to invalidate.

Fixes: 49b53a93a64a ("iw_cxgb4: add fast-path for small REG_MR operations")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoiw_cxgb4: set *bad_wr for post_send/post_recv errors
Steve Wise [Tue, 18 Oct 2016 21:04:39 +0000 (14:04 -0700)]
iw_cxgb4: set *bad_wr for post_send/post_recv errors

There are a few cases in c4iw_post_send() and c4iw_post_receive()
where *bad_wr is not set when an error is returned.  This can
cause a crash if the application tries to use bad_wr.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoMerge branches 'hfi1' and 'mlx' into k.o/for-4.9-rc
Doug Ledford [Thu, 17 Nov 2016 01:05:10 +0000 (20:05 -0500)]
Merge branches 'hfi1' and 'mlx' into k.o/for-4.9-rc

2 weeks agoIB/rxe: Update qp state for user query
Yonatan Cohen [Wed, 16 Nov 2016 08:39:18 +0000 (10:39 +0200)]
IB/rxe: Update qp state for user query

The method rxe_qp_error() transitions QP to error state
and make sure the QP is drained. It did not though update
the QP state for user's query.

This patch fixes this.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/rxe: Clear queue buffer when modifying QP to reset
Yonatan Cohen [Wed, 16 Nov 2016 08:39:17 +0000 (10:39 +0200)]
IB/rxe: Clear queue buffer when modifying QP to reset

RXE resets the send-q only once in rxe_qp_init_req() when
QP is created, but when the QP is reused after QP reset, the send-q
holds previous garbage data.

This garbage data wrongly fails CQEs that otherwise
should have completed successfully.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/rxe: Fix handling of erroneous WR
Yonatan Cohen [Wed, 16 Nov 2016 08:39:15 +0000 (10:39 +0200)]
IB/rxe: Fix handling of erroneous WR

To correctly handle a erroneous WR this fix does the following
1. Make sure the bad WQE causes a user completion event.
2. Call rxe_completer to handle the erred WQE.

Before the fix, when rxe_requester found a bad WQE, it changed its
status to IB_WC_LOC_PROT_ERR and exit with 0 for non RC QPs.

If this was the 1st WQE then there would be no ACK to invoke the
completer and this bad WQE would be stuck in the QP's send-q.

On top of that the requester exiting with 0 caused rxe_do_task to
endlessly invoke rxe_requester, resulting in a soft-lockup attached
below.

In case the WQE was not the 1st and rxe_completer did get a chance to
handle the bad WQE, it did not cause a complete event since the WQE's
IB_SEND_SIGNALED flag was not set.

Setting WQE status to IB_SEND_SIGNALED is subject to IBA spec
version 1.2.1, section 10.7.3.1 Signaled Completions.

NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[<ffffffffa0590145>] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe]
[<ffffffffa05952ec>] lookup_mem+0x3c/0xc0 [rdma_rxe]
[<ffffffffa0595534>] copy_data+0x1c4/0x230 [rdma_rxe]
[<ffffffffa058c180>] rxe_requester+0x9d0/0x1100 [rdma_rxe]
[<ffffffff8158e98a>] ? kfree_skbmem+0x5a/0x60
[<ffffffffa05962c9>] rxe_do_task+0x89/0xf0 [rdma_rxe]
[<ffffffffa05963e2>] rxe_run_task+0x12/0x30 [rdma_rxe]
[<ffffffffa059110a>] rxe_post_send+0x41a/0x550 [rdma_rxe]
[<ffffffff811ef922>] ? __kmalloc+0x182/0x200
[<ffffffff816ba512>] ? down_read+0x12/0x40
[<ffffffffa054bd32>] ib_uverbs_post_send+0x532/0x540 [ib_uverbs]
[<ffffffff815f8722>] ? tcp_sendmsg+0x402/0xb80
[<ffffffffa05453dc>] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs]
[<ffffffff81623c2e>] ? inet_recvmsg+0x7e/0xb0
[<ffffffff8158764d>] ? sock_recvmsg+0x3d/0x50
[<ffffffff81215b87>] __vfs_write+0x37/0x140
[<ffffffff81216892>] vfs_write+0xb2/0x1b0
[<ffffffff81217ce5>] SyS_write+0x55/0xc0
[<ffffffff816bc672>] entry_SYSCALL_64_fastpath+0x1a/0xa

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
Yonatan Cohen [Wed, 16 Nov 2016 08:39:14 +0000 (10:39 +0200)]
IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum

Missing initialization of udp_tunnel_sock_cfg causes to following
kernel panic, while kernel tries to execute gro_receive().

While being there, we converted udp_port_cfg to use the same
initialization scheme as udp_tunnel_sock_cfg.

------------[ cut here ]------------
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffffffffa0588c50
IP: [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
PGD 1c09067 PUD 1c0a063 PMD bb394067 PTE 80000000ad5e8163
Oops: 0011 [#1] SMP
Modules linked in: ib_rxe ip6_udp_tunnel udp_tunnel
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc3+ #2
Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
task: ffff880235e4e680 ti: ffff880235e68000 task.ti: ffff880235e68000
RIP: 0010:[<ffffffffa0588c50>]
[<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
RSP: 0018:ffff880237343c80  EFLAGS: 00010282
RAX: 00000000dffe482d RBX: ffff8800ae330900 RCX: 000000002001b712
RDX: ffff8800ae330900 RSI: ffff8800ae102578 RDI: ffff880235589c00
RBP: ffff880237343cb0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ae33e262
R13: ffff880235589c00 R14: 0000000000000014 R15: ffff8800ae102578
FS:  0000000000000000(0000) GS:ffff880237340000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa0588c50 CR3: 0000000001c06000 CR4: 00000000000006e0
Stack:
ffffffff8160860e ffff8800ae330900 ffff8800ae102578 0000000000000014
000000000000004e ffff8800ae102578 ffff880237343ce0 ffffffff816088fb
0000000000000000 ffff8800ae330900 0000000000000000 00000000ffad0000
Call Trace:
<IRQ>
[<ffffffff8160860e>] ? udp_gro_receive+0xde/0x130
[<ffffffff816088fb>] udp4_gro_receive+0x10b/0x2d0
[<ffffffff81611373>] inet_gro_receive+0x1d3/0x270
[<ffffffff81594e29>] dev_gro_receive+0x269/0x3b0
[<ffffffff81595188>] napi_gro_receive+0x38/0x120
[<ffffffffa011caee>] mlx5e_handle_rx_cqe+0x27e/0x340 [mlx5_core]
[<ffffffffa011d076>] mlx5e_poll_rx_cq+0x66/0x6d0 [mlx5_core]
[<ffffffffa011d7ae>] mlx5e_napi_poll+0x8e/0x400 [mlx5_core]
[<ffffffff815949a0>] net_rx_action+0x160/0x380
[<ffffffff816a9197>] __do_softirq+0xd7/0x2c5
[<ffffffff81085c35>] irq_exit+0xf5/0x100
[<ffffffff816a8f16>] do_IRQ+0x56/0xd0
[<ffffffff816a6dcc>] common_interrupt+0x8c/0x8c
<EOI>
[<ffffffff81061f96>] ? native_safe_halt+0x6/0x10
[<ffffffff81037ade>] default_idle+0x1e/0xd0
[<ffffffff8103828f>] arch_cpu_idle+0xf/0x20
[<ffffffff810c37dc>] default_idle_call+0x3c/0x50
[<ffffffff810c3b13>] cpu_startup_entry+0x323/0x3c0
[<ffffffff81050d8c>] start_secondary+0x15c/0x1a0
RIP  [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
RSP <ffff880237343c80>
CR2: ffffffffa0588c50
---[ end trace 489ee31fa7614ac5 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx4: Fix create CQ error flow
Matan Barak [Thu, 10 Nov 2016 09:30:55 +0000 (11:30 +0200)]
IB/mlx4: Fix create CQ error flow

Currently, if ib_copy_to_udata fails, the CQ
won't be deleted from the radix tree and the HW (HW2SW).

Fixes: 225c7b1feef1 ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx4: Check gid_index return value
Daniel Jurgens [Thu, 10 Nov 2016 09:30:54 +0000 (11:30 +0200)]
IB/mlx4: Check gid_index return value

Check the returned GID index value and return an error if it is invalid.

Fixes: 5070cd2239bd ('IB/mlx4: Replace mechanism for RoCE GID management')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Fix NULL pointer dereference on debug print
Eli Cohen [Thu, 27 Oct 2016 13:36:46 +0000 (16:36 +0300)]
IB/mlx5: Fix NULL pointer dereference on debug print

For XRC QP CQs may not exist. Check before attempting dereference.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Fix fatal error dispatching
Eli Cohen [Thu, 27 Oct 2016 13:36:44 +0000 (16:36 +0300)]
IB/mlx5: Fix fatal error dispatching

When an internal error condition is detected, make sure to set the
device inactive after dispatching the event so ULPs can get a
notification of this event.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Resolve soft lock on massive reg MRs
Moshe Lazer [Thu, 27 Oct 2016 13:36:42 +0000 (16:36 +0300)]
IB/mlx5: Resolve soft lock on massive reg MRs

When calling reg_mr of large MRs (e.g. 4GB) from multiple processes
and MR caches can't supply the required amount of MRs the slow-path
of MR allocation may be used. In this case we need to serialize the
slow-path between the processes to avoid soft lock.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Use cache line size to select CQE stride
Daniel Jurgens [Thu, 27 Oct 2016 13:36:41 +0000 (16:36 +0300)]
IB/mlx5: Use cache line size to select CQE stride

When creating kernel CQs use 128B CQE stride if the
cache line size is 128B, 64B otherwise.  This prevents
multiple CQEs from residing in a 128B cache line,
which can cause retries when there are concurrent
read and writes in one cache line.

Tested with IPoIB on PPC64, saw ~5% throughput
improvement.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Validate requested RQT size
Maor Gottlieb [Thu, 27 Oct 2016 13:36:40 +0000 (16:36 +0300)]
IB/mlx5: Validate requested RQT size

Validate that the requested size of RQT is supported by firmware.

Fixes: c5f9092936fe ('IB/mlx5: Add Receive Work Queue Indirection table operations')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/mlx5: Fix memory leak in query device
Majd Dibbiny [Thu, 27 Oct 2016 13:36:39 +0000 (16:36 +0300)]
IB/mlx5: Fix memory leak in query device

We need to free dev->port when we fail to enable RoCE or
initialize node data.

Fixes: 0837e86a7a34 ('IB/mlx5: Add per port counters')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/core: Avoid unsigned int overflow in sg_alloc_table
Mark Bloch [Thu, 27 Oct 2016 13:36:31 +0000 (16:36 +0300)]
IB/core: Avoid unsigned int overflow in sg_alloc_table

sg_alloc_table gets unsigned int as parameter while the driver
returns it as size_t. Check npages isn't greater than maximum
unsigned int.

Fixes: eeb8461e36c9 ("IB: Refactor umem to use linear SG table")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/core: Add missing check for addr_resolve callback return value
Mark Bloch [Thu, 27 Oct 2016 13:36:29 +0000 (16:36 +0300)]
IB/core: Add missing check for addr_resolve callback return value

When calling rdma_resolve_ip inside rdma_addr_find_l2_eth_by_grh,
the return status of the request was ignored in the callback function
causing a successful return and an empty dmac.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/core: Set routable RoCE gid type for ipv4/ipv6 networks
Leon Romanovsky [Mon, 31 Oct 2016 05:50:56 +0000 (07:50 +0200)]
IB/core: Set routable RoCE gid type for ipv4/ipv6 networks

On Thu, Oct 27, 2016 at 04:36:28PM +0300, Leon Romanovsky wrote:
> From: Mark Bloch <markb@mellanox.com>
>
> If the underlying netowrk type is ipv4 or ipv6 and the device supports
> routable RoCE, prefer it so the traffic could cross subnets.
>
> Signed-off-by: Mark Bloch <markb@mellanox.com>
> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
> ---

Hi Doug,

Please take the following v1 of this patch where I fixed spelling error
from "netowrk" to be "network".

Thanks.

>From 09f96ba3e9b4442cfb44dca04c6726e55525c9c3 Mon Sep 17 00:00:00 2001
From: Mark Bloch <markb@mellanox.com>
Date: Sun, 11 Sep 2016 06:25:10 +0000
Subject: [PATCH rdma-rc v1 3/6] IB/core: Set routable RoCE gid type for ipv4/ipv6
 networks

If the underlying network type is ipv4 or ipv6 and the device supports
routable RoCE, prefer it so the traffic could cross subnets.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/cm: Mark stale CM id's whenever the mad agent was unregistered
Mark Bloch [Thu, 27 Oct 2016 13:36:27 +0000 (16:36 +0300)]
IB/cm: Mark stale CM id's whenever the mad agent was unregistered

When there is a CM id object that has port assigned to it, it means that
the cm-id asked for the specific port that it should go by it, but if
that port was removed (hot-unplug event) the cm-id was not updated.
In order to fix that the port keeps a list of all the cm-id's that are
planning to go by it, whenever the port is removed it marks all of them
as invalid.

This commit fixes a kernel panic which happens when running traffic between
guests and we force reboot a guest mid traffic, it triggers a kernel panic:

 Call Trace:
  [<ffffffff815271fa>] ? panic+0xa7/0x16f
  [<ffffffff8152b534>] ? oops_end+0xe4/0x100
  [<ffffffff8104a00b>] ? no_context+0xfb/0x260
  [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
  [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
  [<ffffffff81084240>] ? process_timeout+0x0/0x10
  [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
  [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
  [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
  [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
  [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
  [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
  [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
  [<ffffffff8152a815>] ? page_fault+0x25/0x30
  [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
  [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
  [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
  [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
  [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
  [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
  [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
  [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
  [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
  [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
  [<ffffffff8109aef6>] ? kthread+0x96/0xa0
  [<ffffffff8100c20a>] ? child_rip+0xa/0x20
  [<ffffffff8109ae60>] ? kthread+0x0/0xa0
  [<ffffffff8100c200>] ? child_rip+0x0/0x20

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/uverbs: Fix leak of XRC target QPs
Tariq Toukan [Thu, 27 Oct 2016 13:36:26 +0000 (16:36 +0300)]
IB/uverbs: Fix leak of XRC target QPs

The real QP is destroyed in case of the ref count reaches zero, but
for XRC target QPs this call was missed and caused to QP leaks.

Let's call to destroy for all flows.

Fixes: 0e0ec7e0638e ('RDMA/core: Export ib_open_qp() to share XRC...')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoMerge tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Thu, 17 Nov 2016 00:39:01 +0000 (16:39 -0800)]
Merge tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:

 - fix register dumps, stack dumps and stack traces that got torn due to
   recent printk changes

 - wire up pkey_{mprotect,alloc,free} syscalls

* tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: wire up new pkey_{mprotect,alloc,free} syscalls
  xtensa: clean up printk usage for boot/crash logging

2 weeks agoMerge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Wed, 16 Nov 2016 23:45:27 +0000 (09:45 +1000)]
Merge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Just a few bug fixes for 4.9.  The big one is Mario's prime fencing fix.

* 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu:fix vpost_needed routine
  drm/amdgpu/powerplay: drop a redundant NULL check
  drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)

2 weeks agoMerge branch 'mediatek-drm-fixes-2016-11-11' of https://github.com/ckhu-mediatek...
Dave Airlie [Wed, 16 Nov 2016 23:44:52 +0000 (09:44 +1000)]
Merge branch 'mediatek-drm-fixes-2016-11-11' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes

This branch include one patch to fix a typo, two patches to disable
vblank interrupt, and three patches to support HDMI 4K resolution.

* 'mediatek-drm-fixes-2016-11-11' of https://github.com/ckhu-mediatek/linux.git-tags:
  drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range
  drm/mediatek: enhance the HDMI driving current
  drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable
  drm/mediatek: clear IRQ status before enable OVL interrupt
  drm/mediatek: set vblank_disable_allowed to true
  drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE

2 weeks agonvme/pci: Don't free queues on error
Keith Busch [Tue, 15 Nov 2016 20:56:26 +0000 (15:56 -0500)]
nvme/pci: Don't free queues on error

The nvme_remove function tears down all allocated resources in the correct
order, so no need to free queues on error during initialization. This
fixes possible use-after-free errors when queues are still associated
with a blk-mq hctx.

Reported-by: Scott Bauer <scott.bauer@intel.com>
Tested-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
2 weeks agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Wed, 16 Nov 2016 17:20:10 +0000 (09:20 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:
 "A regression fix and bug fix bound for stable"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix fuse_write_end() if zero bytes were copied
  fuse: fix root dentry initialization

2 weeks agoMerge tag 'mfd-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Linus Torvalds [Wed, 16 Nov 2016 17:09:00 +0000 (09:09 -0800)]
Merge tag 'mfd-fixes-4.9' of git://git./linux/kernel/git/lee/mfd

Pull MFD fixes from Lee Jones:
 - Fix PCI properties in intel-lpss-pci
 - Fix Resetting issue during suspend in intel-lpss-pci
 - Seperate IRQs for USBC device and CHRG in intel_soc_pmic_bxtwc
 - Add timeout to fix Resetting issue in stmpe
 - Ensure we 'put' reference to device when done in mfd-core

* tag 'mfd-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: core: Fix device reference leak in mfd_clone_cell
  mfd: stmpe: Fix RESET regression on STMPE2401
  mfd: intel_soc_pmic_bxtwc: Fix usbc interrupt
  mfd: intel-lpss: Do not put device in reset state on suspend
  mfd: lpss: Fix Intel Kaby Lake PCH-H properties

2 weeks agoorangefs: add .owner to debugfs file_operations
Mike Marshall [Wed, 16 Nov 2016 16:52:19 +0000 (11:52 -0500)]
orangefs: add .owner to debugfs file_operations

Without ".owner = THIS_MODULE" it is possible to crash the kernel
by unloading the Orangefs module while someone is reading debugfs
files.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2 weeks agomfd: core: Fix device reference leak in mfd_clone_cell
Johan Hovold [Tue, 1 Nov 2016 10:38:18 +0000 (11:38 +0100)]
mfd: core: Fix device reference leak in mfd_clone_cell

Make sure to drop the reference taken by bus_find_device_by_name()
before returning from mfd_clone_cell().

Fixes: a9bbba996302 ("mfd: add platform_device sharing support for mfd")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2 weeks agomfd: stmpe: Fix RESET regression on STMPE2401
Linus Walleij [Tue, 1 Nov 2016 09:22:53 +0000 (10:22 +0100)]
mfd: stmpe: Fix RESET regression on STMPE2401

Since commit c4dd1ba355aae2bc3d1213da6c66c53e3c31e028
("mfd: stmpe: Add reset support for all STMPE variant")
we're resetting the STMPE expanders before use.

This caused a regression on the STMP2401 on the Nomadik
NHK8815:

stmpe-i2c 0-0043: stmpe2401 detected, chip id: 0x101
nmk-i2c 101f8000.i2c0: write to slave 0x43 timed out
nmk-i2c 101f8000.i2c0: no ack received after address transmission
stmpe-i2c 0-0044: stmpe2401 detected, chip id: 0x101
nmk-i2c 101f8000.i2c0: write to slave 0x44 timed out
nmk-i2c 101f8000.i2c0: no ack received after address transmission

It turns out that we start to poll for the reset bit to
go low again too quickly: the STMPE2401 is not yet online and
ready to be asked for the status of the RESET bit.

By introducing a 10ms delay before starting to hammer
the register for information, we get back to normal:

stmpe-i2c 0-0043: stmpe2401 detected, chip id: 0x101
stmpe-i2c 0-0044: stmpe2401 detected, chip id: 0x101

Cc: stable@vger.kernel.org
Cc: Amelie Delaunay <amelie.delaunay@st.com>
Fixes: c4dd1ba355aa ("mfd: stmpe: Add reset support for all STMPE variant")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2 weeks agomfd: intel_soc_pmic_bxtwc: Fix usbc interrupt
Heikki Krogerus [Mon, 17 Oct 2016 07:32:13 +0000 (10:32 +0300)]
mfd: intel_soc_pmic_bxtwc: Fix usbc interrupt

The wcove USB Type-C driver is currently being flooded with
interrupts that are not targeted to it. The reason for that
is because all CHRG first level interrupts are mapped to it.
This fixes the issue by introducing separate irq for the
usbc device, and mapping only USB Type-C PHY interrupts to
it.

Fixes: 9c6235c86332 ("mfd: intel_soc_pmic_bxtwc: Add bxt_wcove_usbc device")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2 weeks agomfd: intel-lpss: Do not put device in reset state on suspend
Azhar Shaikh [Wed, 12 Oct 2016 17:12:20 +0000 (10:12 -0700)]
mfd: intel-lpss: Do not put device in reset state on suspend

Commit 41a3da2b8e163 ("mfd: intel-lpss: Save register context on
suspend") saved the register context while going to suspend and
also put the device in reset state.

Due to the resetting of device, system cannot enter S3/S0ix
states when no_console_suspend flag is enabled. The system
and serial console both hang. The resetting of device is not
needed while going to suspend. Hence remove this code.

Cc: stable@vger.kernel.org
Fixes: 41a3da2b8e163 ("mfd: intel-lpss: Save register context on suspend")
Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2 weeks agomfd: lpss: Fix Intel Kaby Lake PCH-H properties
Jarkko Nikula [Thu, 29 Sep 2016 09:59:39 +0000 (12:59 +0300)]
mfd: lpss: Fix Intel Kaby Lake PCH-H properties

There are a few issues on Intel Kaby Lake PCH-H properties added by
commit a6a576b78e09 ("mfd: lpss: Add Intel Kaby Lake PCH-H PCI IDs"):

- Input clock of I2C controller on Intel Kaby Lake PCH-H is 120 MHz not
  133 MHz. This was probably copy-paste error from Intel Broxton I2C
  properties.
- There is no default I2C SDA hold time specified which is used when
  ACPI doesn't provide it. I got information from Windows driver team
  that Kaby Lake PCH-H can use the same configuration than Intel
  Sunrisepoint PCH.
- Common HS-UART properties are not used.

Fix these by reusing the Sunrisepoint properties on Kaby Lake PCH-H.

Fixes: a6a576b78e09 ("mfd: lpss: Add Intel Kaby Lake PCH-H PCI IDs")
Reported-by: Xiang A Wang <xiang.a.wang@intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2 weeks agoMerge tag 'sunxi-drm-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel...
Dave Airlie [Tue, 15 Nov 2016 23:41:08 +0000 (09:41 +1000)]
Merge tag 'sunxi-drm-fixes-for-4.9' of https://git./linux/kernel/git/mripard/linux into drm-fixes

sun4i-drm fixes for 4.9

A few patches to fix our error handling and our panel / bridge calls.

* tag 'sunxi-drm-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  drm/sun4i: Propagate error to the caller
  drm/sun4i: Fix error handling
  drm/sun4i: rgb: Remove the bridge enable/disable functions
  drm/sun4i: rgb: Enable panel after controller

2 weeks agoIB/hfi1: Remove incorrect IS_ERR check
Dennis Dalessandro [Tue, 25 Oct 2016 20:12:46 +0000 (13:12 -0700)]
IB/hfi1: Remove incorrect IS_ERR check

Remove IS_ERR check from caching code as the function being called does
not actually return error pointers.

Fixes: f19bd643dbde: "IB/hfi1: Prevent NULL pointer deferences in caching code"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Prevent hardware counter names from being cut off
Jianxin Xiong [Tue, 25 Oct 2016 20:12:40 +0000 (13:12 -0700)]
IB/hfi1: Prevent hardware counter names from being cut off

Increase the size of the buffer that is used to construct per-VL
and per-SDMA counter names.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix ECN processing in prescan_rxq
Dasaratharaman Chandramouli [Tue, 25 Oct 2016 20:12:23 +0000 (13:12 -0700)]
IB/hfi1: Fix ECN processing in prescan_rxq

When processing ECN via the prescan_rxq path, some fields in the packet
structure are passed uninitialized. This can potentially
cause NULL pointer exceptions during ECN handling.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix status error code for unsupported packets
Jakub Pawlak [Tue, 25 Oct 2016 20:12:17 +0000 (13:12 -0700)]
IB/hfi1: Fix status error code for unsupported packets

Set the status code BAD_L2 when unsupported type of packet
is received and dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Relocate rcvhdrcnt module parameter check.
Krzysztof Blaszkowski [Tue, 25 Oct 2016 20:12:11 +0000 (13:12 -0700)]
IB/hfi1: Relocate rcvhdrcnt module parameter check.

Validate the rcvhdrcnt module parameter in a single function at module
load time. This allows proper error reporting.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix rnr_timer addition
Ira Weiny [Mon, 17 Oct 2016 11:20:09 +0000 (04:20 -0700)]
IB/hfi1: Fix rnr_timer addition

The new s_rnr_timeout was not properly being set and the code was
incorrectly setting a different timer.

Found by code inspection.

Cc: <stable@vger.kernel.org> # 4.7.x
Fixes: 08279d5c9424 ("staging/rdma/hfi1: use new RNR timer")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Delete unused lock
Easwar Hariharan [Mon, 17 Oct 2016 11:20:04 +0000 (04:20 -0700)]
IB/hfi1: Delete unused lock

The lock is an unused vestige from qib. Remove it.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Clean up unused argument
Easwar Hariharan [Mon, 17 Oct 2016 11:19:58 +0000 (04:19 -0700)]
IB/hfi1: Clean up unused argument

hfi1_pcie_ddinit takes the PCI device id as an argument but never
uses it. Clean it up.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Remove leftover snoop references
Dennis Dalessandro [Mon, 17 Oct 2016 11:19:52 +0000 (04:19 -0700)]
IB/hfi1: Remove leftover snoop references

A few snoop related variables were missed in the snoop/capture removal
to get out of staging. Go back and clean those up too.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix a potential memory leak in hfi1_create_ctxts()
Jianxin Xiong [Mon, 17 Oct 2016 11:19:41 +0000 (04:19 -0700)]
IB/hfi1: Fix a potential memory leak in hfi1_create_ctxts()

In the function hfi1_create_ctxts the array "dd->rcd" is allocated and
then populated with allocated resources in a loop. Previously, if
error happened during the loop, only resource allocated in the current
iteration would be freed. The array itself would then be freed, leaving
the resources that were allocated in previous iterations and referenced
by the array elements in limbo.

This patch makes sure all allocated resources are freed before freeing
the array "dd->rcd". Also the resource allocation now takes account of
the numa node the device is attached to.

Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Return ENODEV for unsupported PCI device ids.
Krzysztof Blaszkowski [Mon, 17 Oct 2016 11:19:24 +0000 (04:19 -0700)]
IB/hfi1: Return ENODEV for unsupported PCI device ids.

Clean up device type checking.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix an Oops on pci device force remove
Tadeusz Struk [Tue, 25 Oct 2016 15:57:55 +0000 (08:57 -0700)]
IB/hfi1: Fix an Oops on pci device force remove

This patch fixes an Oops on device unbind, when the device is used
by a PSM user process. PSM processes access device resources which
are freed on device removal. Similar protection exists in uverbs
in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence
a separate protection is required for PSM clients.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Fix integrity check flags default values
Jakub Pawlak [Mon, 10 Oct 2016 13:14:56 +0000 (06:14 -0700)]
IB/hfi1: Fix integrity check flags default values

Prevent setting up integrity check flags when module is loaded
with NO_INTEGRITY capability.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/hfi1: Remove redundant sysfs irq affinity entry
Tadeusz Struk [Mon, 10 Oct 2016 13:14:50 +0000 (06:14 -0700)]
IB/hfi1: Remove redundant sysfs irq affinity entry

The IRQ affinity entry is not needed after the irq notifier patch has been
added to the hfi1 driver.
The irq affinity settings for SDMA engine should be set using the standard
/proc/irq/<N>/ interface.

Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agoIB/rdmavt: rdmavt can handle non aligned page maps
Dennis Dalessandro [Mon, 10 Oct 2016 13:14:45 +0000 (06:14 -0700)]
IB/rdmavt: rdmavt can handle non aligned page maps

The initial code for rdmavt carried with it a restriction that was a
vestige from the qib driver, that to dma map a page it had to be less
than a page size. This is not the case on modern hardware, both qib and
hfi1 will be just fine with unaligned map requests.

This fixes a 4.8 regression where by an IPoIB transfer of > PAGE_SIZE
will hang because the dma map page call always fails. This was
introduced after commit 5faba5469522 ("IB/ipoib: Report SG feature
regardless of HW UD CSUM capability") added the capability to use SG by
default. Rather than override this, the HW supports it, so allow SG.

Cc: Stable <stable@vger.kernel.org> # 4.8
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2 weeks agodrm/amdgpu:fix vpost_needed routine
Monk Liu [Fri, 11 Nov 2016 03:24:29 +0000 (11:24 +0800)]
drm/amdgpu:fix vpost_needed routine

1,cleanup description/comments
2,for FIJI & passthrough, force post when smc fw version below 22.15
3,for other cases, follow regular rules

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks agodrm/amdgpu/powerplay: drop a redundant NULL check
Alex Deucher [Tue, 15 Nov 2016 16:39:08 +0000 (11:39 -0500)]
drm/amdgpu/powerplay: drop a redundant NULL check

Left over from an earlier rev of the patch.

Acked-by: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Colin King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks agoMerge tag 'trace-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 15 Nov 2016 16:49:13 +0000 (08:49 -0800)]
Merge tag 'trace-v4.9-rc5' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Alexei discovered a race condition in modules failing to load that can
  cause a ftrace check to trigger and disable ftrace.

  This is because of the way modules are registered to ftrace. Their
  functions are loaded in the ftrace function tables but set to
  "disabled" since they are still in the process of being loaded by the
  module. After the module is finished, it calls back into the ftrace
  infrastructure to enable it.

  Looking deeper into the locations that access all the functions in the
  table, I found more locations that should ignore the disabled ones"

* tag 'trace-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records
  ftrace: Ignore FTRACE_FL_DISABLED while walking dyn_ftrace records

2 weeks agoMerge tag 'fbdev-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba...
Linus Torvalds [Tue, 15 Nov 2016 16:28:59 +0000 (08:28 -0800)]
Merge tag 'fbdev-fixes-4.9' of git://git./linux/kernel/git/tomba/linux

Pull fbdev fix from Tomi Valkeinen:
 "Fix CLCD regression on Vexpress"

* tag 'fbdev-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  video: ARM CLCD: fix Vexpress regression

2 weeks agoMerge branch 'nvmf-4.9-rc' of git://git.infradead.org/nvme-fabrics into for-linus
Jens Axboe [Tue, 15 Nov 2016 14:51:41 +0000 (07:51 -0700)]
Merge branch 'nvmf-4.9-rc' of git://git.infradead.org/nvme-fabrics into for-linus

Sagi writes:

These are the relevant fixes for rc6
- fix possible crash in nvmet-rdma cm_handler from Bart
- fix possible memory leak in nvmet-rdma for connection failures
- fix possible use-after-free conditions in nvmet-rdma
- fix possible IO errors during reconnect stage from Christoph
- fix possible memory leak in nvme-rdma during IO queues connect
  failures from Steve

2 weeks agofuse: fix fuse_write_end() if zero bytes were copied
Miklos Szeredi [Thu, 18 Aug 2016 07:10:44 +0000 (09:10 +0200)]
fuse: fix fuse_write_end() if zero bytes were copied

If pos is at the beginning of a page and copied is zero then page is not
zeroed but is marked uptodate.

Fix by skipping everything except unlock/put of page if zero bytes were
copied.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 6b12c1b37e55 ("fuse: Implement write_begin/write_end callbacks")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
3 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 14 Nov 2016 22:15:53 +0000 (14:15 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix off by one wrt. indexing when dumping /proc/net/route entries,
    from Alexander Duyck.

 2) Fix lockdep splats in iwlwifi, from Johannes Berg.

 3) Cure panic when inserting certain netfilter rules when NFT_SET_HASH
    is disabled, from Liping Zhang.

 4) Memory leak when nft_expr_clone() fails, also from Liping Zhang.

 5) Disable UFO when path will apply IPSEC tranformations, from Jakub
    Sitnicki.

 6) Don't bogusly double cwnd in dctcp module, from Florian Westphal.

 7) skb_checksum_help() should never actually use the value "0" for the
    resulting checksum, that has a special meaning, use CSUM_MANGLED_0
    instead. From Eric Dumazet.

 8) Per-tx/rx queue statistic strings are wrong in qed driver, fix from
    Yuval MIntz.

 9) Fix SCTP reference counting of associations and transports in
    sctp_diag. From Xin Long.

10) When we hit ip6tunnel_xmit() we could have come from an ipv4 path in
    a previous layer or similar, so explicitly clear the ipv6 control
    block in the skb. From Eli Cooper.

11) Fix bogus sleeping inside of inet_wait_for_connect(), from WANG
    Cong.

12) Correct deivce ID of T6 adapter in cxgb4 driver, from Hariprasad
    Shenai.

13) Fix potential access past the end of the skb page frag array in
    tcp_sendmsg(). From Eric Dumazet.

14) 'skb' can legitimately be NULL in inet{,6}_exact_dif_match(). Fix
    from David Ahern.

15) Don't return an error in tcp_sendmsg() if we wronte any bytes
    successfully, from Eric Dumazet.

16) Extraneous unlocks in netlink_diag_dump(), we removed the locking
    but forgot to purge these unlock calls. From Eric Dumazet.

17) Fix memory leak in error path of __genl_register_family(). We leak
    the attrbuf, from WANG Cong.

18) cgroupstats netlink policy table is mis-sized, from WANG Cong.

19) Several XDP bug fixes in mlx5, from Saeed Mahameed.

20) Fix several device refcount leaks in network drivers, from Johan
    Hovold.

21) icmp6_send() should use skb dst device not skb->dev to determine L3
    routing domain. From David Ahern.

22) ip_vs_genl_family sets maxattr incorrectly, from WANG Cong.

23) We leak new macvlan port in some cases of maclan_common_netlink()
    errors. Fix from Gao Feng.

24) Similar to the icmp6_send() fix, icmp_route_lookup() should
    determine L3 routing domain using skb_dst(skb)->dev not skb->dev.
    Also from David Ahern.

25) Several fixes for route offloading and FIB notification handling in
    mlxsw driver, from Jiri Pirko.

26) Properly cap __skb_flow_dissect()'s return value, from Eric Dumazet.

27) Fix long standing regression in ipv4 redirect handling, wrt.
    validating the new neighbour's reachability. From Stephen Suryaputra
    Lin.

28) If sk_filter() trims the packet excessively, handle it reasonably in
    tcp input instead of exploding. From Eric Dumazet.

29) Fix handling of napi hash state when copying channels in sfc driver,
    from Bert Kenward.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (121 commits)
  mlxsw: spectrum_router: Flush FIB tables during fini
  net: stmmac: Fix lack of link transition for fixed PHYs
  sctp: change sk state only when it has assocs in sctp_shutdown
  bnx2: Wait for in-flight DMA to complete at probe stage
  Revert "bnx2: Reset device during driver initialization"
  ps3_gelic: fix spelling mistake in debug message
  net: ethernet: ixp4xx_eth: fix spelling mistake in debug message
  ibmvnic: Fix size of debugfs name buffer
  ibmvnic: Unmap ibmvnic_statistics structure
  sfc: clear napi_hash state when copying channels
  mlxsw: spectrum_router: Correctly dump neighbour activity
  mlxsw: spectrum: Fix refcount bug on span entries
  bnxt_en: Fix VF virtual link state.
  bnxt_en: Fix ring arithmetic in bnxt_setup_tc().
  Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"
  tcp: take care of truncations done by sk_filter()
  ipv4: use new_gw for redirect neigh lookup
  r8152: Fix error path in open function
  net: bpqether.h: remove if_ether.h guard
  net: __skb_flow_dissect() must cap its return value
  ...

3 weeks agoMerge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux...
Linus Torvalds [Mon, 14 Nov 2016 22:07:13 +0000 (14:07 -0800)]
Merge branch 'stable' of git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile bugfix from Chris Metcalf:
 "This just fixes an incompatibility with tile __ro_after_init"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: handle __ro_after_init like parisc does

3 weeks agoMerge tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds [Mon, 14 Nov 2016 22:00:29 +0000 (14:00 -0800)]
Merge tag 'rtc-4.9-2' of git://git./linux/kernel/git/abelloni/linux

Pull RTC fixes from Alexandre Belloni:
 "Here are a few driver fixes for 4.9. It has been calm for a while so I
  don't expect more for this cycle.

  Drivers:
   - asm9260: fix module autoload
   - cmos: fix crashes
   - omap: fix clock handling"

* tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: omap: prevent disabling of clock/module during suspend
  rtc: omap: Fix selecting external osc
  rtc: cmos: Don't enable interrupts in the middle of the interrupt handler
  rtc: cmos: remove all __exit_p annotations
  rtc: asm9260: fix module autoload

3 weeks agotile: handle __ro_after_init like parisc does
Chris Metcalf [Mon, 7 Nov 2016 19:32:02 +0000 (14:32 -0500)]
tile: handle __ro_after_init like parisc does

The tile architecture already marks RO_DATA as read-only in
the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
done by default, means the kernel faults in init when it tries
to write to RO_AFTER_INIT_DATA.  For now, just arrange that
__ro_after_init is handled like __write_once, i.e. __read_mostly.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
3 weeks agomlxsw: spectrum_router: Flush FIB tables during fini
Ido Schimmel [Mon, 14 Nov 2016 10:26:32 +0000 (11:26 +0100)]
mlxsw: spectrum_router: Flush FIB tables during fini

Since commit b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications
instead of switchdev calls") we reflect to the device the entire FIB
table and not only FIBs that point to netdevs created by the driver.

During module removal, FIBs of the second type are removed following
NETDEV_UNREGISTER events sent. The other FIBs are still present in both
the driver's cache and the device's table.

Fix this by iterating over all the FIB tables in the device and flush
them. There's no need to take locks, as we're the only writer.

Fixes: b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agonet: stmmac: Fix lack of link transition for fixed PHYs
Florian Fainelli [Mon, 14 Nov 2016 01:50:35 +0000 (17:50 -0800)]
net: stmmac: Fix lack of link transition for fixed PHYs

Commit 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch
is attached") added some logic to avoid polling the fixed PHY and
therefore invoking the adjust_link callback more than once, since this
is a fixed PHY and link events won't be generated.

This works fine the first time, because we start with phydev->irq =
PHY_POLL, so we call adjust_link, then we set phydev->irq =
PHY_IGNORE_INTERRUPT and we stop polling the PHY.

Now, if we called ndo_close(), which calls both phy_stop() and does an
explicit netif_carrier_off(), we end up with a link down. Upon calling
ndo_open() again, despite starting the PHY state machine, we have
PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
link is permanently down.

Fixes: 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch is attached")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records
Steven Rostedt (Red Hat) [Mon, 14 Nov 2016 21:31:49 +0000 (16:31 -0500)]
ftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records

When a module is first loaded and its function ip records are added to the
ftrace list of functions to modify, they are set to DISABLED, as their text
is still in a read only state. When the module is fully loaded, and can be
updated, the flag is cleared, and if their's any functions that should be
tracing them, it is updated at that moment.

But there's several locations that do record accounting and should ignore
records that are marked as disabled, or they can cause issues.

Alexei already fixed one location, but others need to be addressed.

Cc: stable@vger.kernel.org
Fixes: b7ffffbb46f2 "ftrace: Add infrastructure for delayed enabling of module functions"
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
3 weeks agoftrace: Ignore FTRACE_FL_DISABLED while walking dyn_ftrace records
Alexei Starovoitov [Mon, 7 Nov 2016 23:14:20 +0000 (15:14 -0800)]
ftrace: Ignore FTRACE_FL_DISABLED while walking dyn_ftrace records

ftrace_shutdown() checks for sanity of ftrace records
and if dyn_ftrace->flags is not zero, it will warn.
It can happen that 'flags' are set to FTRACE_FL_DISABLED at this point,
since some module was loaded, but before ftrace_module_enable()
cleared the flags for this module.

In other words the module.c is doing:
ftrace_module_init(mod); // calls ftrace_update_code() that sets flags=FTRACE_FL_DISABLED
... // here ftrace_shutdown() is called that warns, since
err = prepare_coming_module(mod); // didn't have a chance to clear FTRACE_FL_DISABLED

Fix it by ignoring disabled records.
It's similar to what __ftrace_hash_rec_update() is already doing.

Link: http://lkml.kernel.org/r/1478560460-3818619-1-git-send-email-ast@fb.com
Cc: stable@vger.kernel.org
Fixes: b7ffffbb46f2 "ftrace: Add infrastructure for delayed enabling of module functions"
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
3 weeks agosctp: change sk state only when it has assocs in sctp_shutdown
Xin Long [Sun, 13 Nov 2016 13:44:37 +0000 (21:44 +0800)]
sctp: change sk state only when it has assocs in sctp_shutdown

Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), sk state would be changed to
SCTP_SS_CLOSING, which is not as we expect.

Besides, after that if users try to listen on this sock, kernel
could even panic when it dereference sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is null when sock has no assoc.

This patch is to move sk state change after checking sk assocs
is not empty, and also merge these two if() conditions and reduce
indent level.

Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoMerge branch 'bnx2-kdump-fix'
David S. Miller [Mon, 14 Nov 2016 21:20:54 +0000 (16:20 -0500)]
Merge branch 'bnx2-kdump-fix'

Baoquan He says:

====================
bnx2: Wait for in-flight DMA to complete at probe stage

This is v2 post.

In commit 3e1be7a ("bnx2: Reset device during driver initialization"),
firmware requesting code was moved from open stage to probe stage.
The reason is in kdump kernel hardware iommu need device be reset in
driver probe stage, otherwise those in-flight DMA from 1st kernel
will continue going and look up into the newly created io-page tables.
However bnx2 chip resetting involves firmware requesting issue, that
need be done in open stage.

Michale Chan suggested we can just wait for the old in-flight DMA to
complete at probe stage, then though without device resetting, we
don't need to worry the old in-flight DMA could continue looking up
the newly created io-page tables.

v1->v2:
    Michael suggested to wait for the in-flight DMA to complete at probe
    stage. So give up the old method of trying to reset chip at probe
    stage, take the new way accordingly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agobnx2: Wait for in-flight DMA to complete at probe stage
Baoquan He [Sun, 13 Nov 2016 05:01:33 +0000 (13:01 +0800)]
bnx2: Wait for in-flight DMA to complete at probe stage

In-flight DMA from 1st kernel could continue going in kdump kernel.
New io-page table has been created before bnx2 does reset at open stage.
We have to wait for the in-flight DMA to complete to avoid it look up
into the newly created io-page table at probe stage.

Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoRevert "bnx2: Reset device during driver initialization"
Baoquan He [Sun, 13 Nov 2016 05:01:32 +0000 (13:01 +0800)]
Revert "bnx2: Reset device during driver initialization"

This reverts commit 3e1be7ad2d38c6bd6aeef96df9bd0a7822f4e51c.

When people build bnx2 driver into kernel, it will fail to detect
and load firmware because firmware is contained in initramfs and
initramfs has not been uncompressed yet during do_initcalls. So
revert commit 3e1be7a and work out a new way in the later patch.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoxtensa: wire up new pkey_{mprotect,alloc,free} syscalls
Max Filippov [Mon, 14 Nov 2016 20:31:49 +0000 (12:31 -0800)]
xtensa: wire up new pkey_{mprotect,alloc,free} syscalls

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
3 weeks agops3_gelic: fix spelling mistake in debug message
Colin Ian King [Sat, 12 Nov 2016 17:20:30 +0000 (17:20 +0000)]
ps3_gelic: fix spelling mistake in debug message

Trivial fix to spelling mistake "unmached" to "unmatched" in
debug message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoASoC: lpass-platform: fix uninitialized variable
Linus Torvalds [Mon, 14 Nov 2016 17:46:08 +0000 (09:46 -0800)]
ASoC: lpass-platform: fix uninitialized variable

In commit 022d00ee0b55 ("ASoC: lpass-platform: Fix broken pcm data
usage") the stream specific information initialization was broken, with
the dma channel information not being initialized if there was no
alloc_dma_channel() helper function.

Before that, the DMA channel number was implicitly initialized to zero
because the backing store was allocated with devm_kzalloc().  When the
init code was rewritten, that implicit initialization was lost, and gcc
rightfully complains about an uninitialized variable being used.

Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 weeks agoRevert "printk: make reading the kernel log flush pending lines"
Linus Torvalds [Mon, 14 Nov 2016 17:31:52 +0000 (09:31 -0800)]
Revert "printk: make reading the kernel log flush pending lines"

This reverts commit bfd8d3f23b51018388be0411ccbc2d56277fe294.

It turns out that this flushes things much too aggressiverly, and causes
lines to break up when the system logger races with new continuation
lines being printed.

There's a pending patch to make printk() flushing much more
straightforward, but it's too invasive for 4.9, so in the meantime let's
just not make the system message logging flush continuation lines.
They'll be flushed by the final newline anyway.

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 weeks agogp8psk-fe: add missing MODULE_foo() macros
Mauro Carvalho Chehab [Mon, 14 Nov 2016 13:14:37 +0000 (11:14 -0200)]
gp8psk-fe: add missing MODULE_foo() macros

This file was converted to a separate module at commit 7a0786c19d65
("gp8psk: Fix DVB frontend attach"), because the DVB attach routines
require it to work.  However, I forgot to copy the MODULE_foo() macros
from the original module, causing this warning:

    WARNING: modpost: missing MODULE_LICENSE() in drivers/media/dvb-frontends/gp8psk-fe.o

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 7a0786c19d65 ("gp8psk: Fix DVB frontend attach")
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 weeks agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 14 Nov 2016 16:39:56 +0000 (08:39 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - fix an Intel/MID boot crash/hang bug

   - fix a cache topology mis-parsing bug on certain AMD CPUs

   - fix a virtualization firmware bug by adding a check+quirk
     workaround on the kernel side"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Deal with broken firmware (VMWare/XEN)
  x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
  x86/platform/intel-mid: Retrofit pci_platform_pm_ops ->get_state hook

3 weeks agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 14 Nov 2016 16:34:56 +0000 (08:34 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "This fixes a genirq regression that resulted in the Intel/Broxton
  pinctrl/GPIO driver (and possibly others) spewing warnings"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Use irq type from irqdata instead of irqdesc

3 weeks agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 14 Nov 2016 16:30:06 +0000 (08:30 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "An uncore PMU driver hardware enablement change for Intel SkyLake
  uncore PMUs (Skylake Y, U, H and S platforms), plus a number of
  tooling fixes for the histogram handling/displaying code"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Add more Intel uncore IMC PCI IDs for SkyLake
  perf hists: Fix column length on --hierarchy
  perf hists browser: Fix column indentation on --hierarchy
  perf hists browser: Show folded sign properly on --hierarchy
  perf hists browser: Fix indentation of folded sign on --hierarchy
  perf hist browser: Fix hierarchy column counts

3 weeks agoMerge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 14 Nov 2016 16:26:24 +0000 (08:26 -0800)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull EFI fixes from Ingo Molnar:
 "A boot crash fix and a build warning fix"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y
  x86/efi: Fix EFI memmap pointer size warning

3 weeks agoMerge tag 'ntb-4.9' of git://github.com/jonmason/ntb
Linus Torvalds [Mon, 14 Nov 2016 16:14:49 +0000 (08:14 -0800)]
Merge tag 'ntb-4.9' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
 "NTB bug fixes for ntb_hw_intel, ntb_perf, and ntb_pingpong.

  Also, a fixup to use jiffies in schedule_timeout_* call instead of a
  constant"

* tag 'ntb-4.9' of git://github.com/jonmason/ntb:
  ntb_perf: potential info leak in debugfs
  ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev
  ntb: make DMA_OUT_RESOURCE_TO HZ independent
  ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent
  NTB: ntb_hw_intel: Fix typo in module parameter descriptions
  ntb_pingpong: Fix db_init parameter description

3 weeks agonvmet-rdma: drain the queue-pair just before freeing it
Sagi Grimberg [Sun, 6 Nov 2016 09:03:59 +0000 (11:03 +0200)]
nvmet-rdma: drain the queue-pair just before freeing it

draining the qp right after disconnect might not suffice because
the nvmet sq is not fully drained (in nvmet_sq_destroy) and we might
see completions after the drain. Instead, drain right before the
qp destroy which comes after the sq destruction and we can be sure
that no posts come after the drain.

Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agonvme-rdma: stop and free io queues on connect failure
Steve Wise [Tue, 8 Nov 2016 17:16:02 +0000 (09:16 -0800)]
nvme-rdma: stop and free io queues on connect failure

While testing nvme-rdma with the spdk nvmf target over iw_cxgb4, I
configured the target (mistakenly) to generate an error creating the
NVMF IO queues.  This resulted a "Invalid SQE Parameter" error sent back
to the host on the first IO queue connect:

[ 9610.928182] nvme nvme1: queue_size 128 > ctrl maxcmd 120, clamping down
[ 9610.938745] nvme nvme1: creating 32 I/O queues.

So nvmf_connect_io_queue() returns an error to
nvmf_connect_io_queue() / nvmf_connect_io_queues(), and that
is returned to nvme_rdma_create_io_queues().  In the error path,
nvmf_rdma_create_io_queues() frees the queue tagset memory _before_
stopping and freeing the IB queues, which causes yet another
touch-after-free crash due to SQ CQEs being flushed after the ib_cqe
structs pointed-to by the flushed WRs have been freed (since they are
part of the nvme_rdma_request struct).

The fix is to stop and free the queues in nvmf_connect_io_queues()
if there is an error connecting any of the queues.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agonvmet-rdma: don't forget to delete a queue from the list of connection failed
Sagi Grimberg [Sun, 6 Nov 2016 09:09:49 +0000 (11:09 +0200)]
nvmet-rdma: don't forget to delete a queue from the list of connection failed

In case we accepted a queue connection and it failed, we might not
remove the queue from the list until we unload and clean it up.
We should delete it from the queue list on the relevant handler.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agonvmet: Don't queue fatal error work if csts.cfs is set
Sagi Grimberg [Sun, 6 Nov 2016 09:03:30 +0000 (11:03 +0200)]
nvmet: Don't queue fatal error work if csts.cfs is set

In the transport, in case of an interal queue error like
error completion in rdma we trigger a fatal error. However,
multiple queues in the same controller can serr error completions
and we don't want to trigger fatal error work more than once.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agonvme-rdma: reject non-connect commands before the queue is live
Christoph Hellwig [Wed, 2 Nov 2016 14:49:18 +0000 (08:49 -0600)]
nvme-rdma: reject non-connect commands before the queue is live

If we reconncect we might have command queue up that get resent as soon
as the queue is restarted.  But until the connect command succeeded we
can't send other command.  Add a new flag that marks a queue as live when
connect finishes, and delay any non-connect command until the queue is
live based on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
[sagig: fixes admin queue LIVE setting]
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agonvmet-rdma: Fix possible NULL deref when handling rdma cm events
Bart Van Assche [Tue, 1 Nov 2016 16:36:46 +0000 (18:36 +0200)]
nvmet-rdma: Fix possible NULL deref when handling rdma cm events

When we initiate queue teardown sequence we call rdma_destroy_qp
which clears cm_id->qp, afterwards we call rdma_destroy_id, but
we might see a rdma_cm event in between with a cleared cm_id->qp
so watch out for that and silently ignore the event because this
means that the queue teardown sequence is in progress.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
3 weeks agontb_perf: potential info leak in debugfs
Dan Carpenter [Fri, 14 Oct 2016 07:34:18 +0000 (10:34 +0300)]
ntb_perf: potential info leak in debugfs

This is a static checker warning, not something I'm desperately
concerned about.  But snprintf() returns the number of bytes that
would have been copied if there were space.  We really care about the
number of bytes that actually were copied so we should use scnprintf()
instead.

It probably won't overrun, and in that case we may as well just use
sprintf() but these sorts of things make static checkers and code
reviewers happier.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agontb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev
Dave Jiang [Thu, 27 Oct 2016 18:06:44 +0000 (11:06 -0700)]
ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev

The peer_addr member of intel_ntb_dev is not set, therefore when
acquiring ntb_peer_db and ntb_peer_spad we only get the offset rather
than the actual physical address. Adding fix to correct that.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agontb: make DMA_OUT_RESOURCE_TO HZ independent
Nicholas Mc Guire [Mon, 22 Aug 2016 16:51:36 +0000 (18:51 +0200)]
ntb: make DMA_OUT_RESOURCE_TO HZ independent

schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agontb_transport: make DMA_OUT_RESOURCE_TO HZ independent
Nicholas Mc Guire [Mon, 22 Aug 2016 16:51:35 +0000 (18:51 +0200)]
ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent

schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agoNTB: ntb_hw_intel: Fix typo in module parameter descriptions
Wei Yongjun [Mon, 8 Aug 2016 09:48:42 +0000 (09:48 +0000)]
NTB: ntb_hw_intel: Fix typo in module parameter descriptions

Fix typo in module parameter descriptions.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agontb_pingpong: Fix db_init parameter description
Wei Yongjun [Mon, 8 Aug 2016 09:48:00 +0000 (09:48 +0000)]
ntb_pingpong: Fix db_init parameter description

Fix 'db_init' parameter description.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
3 weeks agonet: ethernet: ixp4xx_eth: fix spelling mistake in debug message
Colin Ian King [Sat, 12 Nov 2016 17:44:06 +0000 (17:44 +0000)]
net: ethernet: ixp4xx_eth: fix spelling mistake in debug message

Trivial fix to spelling mistake "successed" to "succeeded"
in debug message.  Also unwrap multi-line literal string.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoibmvnic: Fix size of debugfs name buffer
Thomas Falcon [Fri, 11 Nov 2016 17:00:46 +0000 (11:00 -0600)]
ibmvnic: Fix size of debugfs name buffer

This mistake was causing debugfs directory creation
failures when multiple ibmvnic devices were probed.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agoibmvnic: Unmap ibmvnic_statistics structure
Thomas Falcon [Fri, 11 Nov 2016 17:00:45 +0000 (11:00 -0600)]
ibmvnic: Unmap ibmvnic_statistics structure

This structure was mapped but never subsequently unmapped.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 weeks agosfc: clear napi_hash state when copying channels
Bert Kenward [Fri, 11 Nov 2016 15:56:51 +0000 (15:56 +0000)]
sfc: clear napi_hash state when copying channels

efx_copy_channel() doesn't correctly clear the napi_hash related state.
This means that when napi_hash_add is called for that channel nothing is
done, and we are left with a copy of the napi_hash_node from the old
channel. When we later call napi_hash_del() on this channel we have a
stale napi_hash_node.

Corruption is only seen when there are multiple entries in one of the
napi_hash lists. This is made more likely by having a very large number
of channels. Testing was carried out with 512 channels - 32 channels on
each of 16 ports.

This failure typically appears as protection faults within napi_by_id()
or napi_hash_add(). efx_copy_channel() is only used when tx or rx ring
sizes are changed (ethtool -G).

Fixes: 36763266bbe8 ("sfc: Add support for busy polling")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>