Huazhong Tan [Fri, 9 Nov 2018 14:07:53 +0000 (22:07 +0800)]
net: hns3: implement the IMP reset processing for PF
The current code only print the prompt message after receiving
the IMP reset interrupt and does not perform the corresponding driver
reset operation. This patch implements the missing IMP reset handling
in the driver.
1. The driver sets the HCLGE_STATE_CMD_DISABLE to stop sending command
after receiving the IMP reset interrupt.
2. The driver needs to notify the hardware to reload the IMP firmware.
3. The IMP firmware reloading makes the reset time of hardware longer,
so it is necessary to extend the driver's waiting time to wait for
the hardware reset to complete.
4. In hclge_check_event_cause, IMP reset event should have higher
priority than other events.
Also, after clearing HCLGE_STATE_CMD_DISABLE in the hclge_cmd_init(),
it needs to check whether there is a pending reset, if so, just set
the HCLGE_STATE_CMD_DISABLE back and return.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:52 +0000 (22:07 +0800)]
net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set
When calling napi_disable during reset down process, if NAPIF_STATE_MISSED
is set, napi_complete will call __napi_schedule to do the polling again.
So this patch uses HNS3_NIC_STATE_DOWN to ensure the polling is not
scheduled again.
Also, when napi_complete returns true, it means polling is scheduled
again, it is not neccssary to enable the interrupt.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:51 +0000 (22:07 +0800)]
net: hns3: add error handler for hclgevf_reset()
Since hclgevf_reset() may fail for some reasons, so it needs an error
handler to deal with it. When VF reset failed, VF can only be restored
by a higher level reset asserted by PF. So, it needs to reinitialize
its command queue, then it can respond to higher level reset.
Also, this patch adds error logging in the hclgevf_notify_client().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:50 +0000 (22:07 +0800)]
net: hns3: stop handling command queue while resetting VF
According to hardware's description, after the reset occurs, the driver
needs to re-initialize the command queue before sending and receiving
any commands. Therefore, the VF's driver needs to identify the command
queue needs to re-initialize with HCLGEVF_STATE_CMD_DISABLE, and does
not allow sending or receiving commands before the re-initialization.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:49 +0000 (22:07 +0800)]
net: hns3: add reset handling for VF when doing Core/Global/IMP reset
When a Core/Global/IMP reset occurs, the hardware sets the reset status
register of all PF/VF and reports a reset interrupt to all PF/VF and
firmware.
When receiving the reset interrupt:
1. The firmware will wait for 100 ms before resetting the hardware and
clear the reset status register of all PF when hardware reset is done.
2. The PF/VF driver needs to down the netdev within 100 ms and then wait
for hardware reset to finish.
3. After firmware clearing the reset status register of all PF, the PF
driver reinitializes the hardware and clear the reset status register
of it's VF.
4. After PF driver clearing the reset status register of VF, the VF driver
reinitializes the hardware.
This patch mainly add handling for the step 4.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:48 +0000 (22:07 +0800)]
net: hns3: add reset handling for VF when doing PF reset
When PF performs a function reset, the hardware will reset both PF
and all the VF belong to this PF. Hence, both PF's driver and VF's
driver need to perform corresponding reset operations.
Before PF driver asserting function reset to hardware, it firstly
set up VF's hardware reset status, and inform the VF driver with
HNAE3_VF_PF_FUNC_RESET, then VF driver sets this reset type to
reset_pending and shechule reset task to stop IO and waits for the
hardware reset status to clear. When PF driver has reinitialized the
hardware and is ready to process mailbox from VF, PF driver clears
VF's hardware reset status for VF to continue its reset process.
Also, this patch uses readl_poll_timeout to simplify the hardware reset
status waitting.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:47 +0000 (22:07 +0800)]
net: hns3: adjust VF's reset process
Currently when VF need to reset itself, it will send a cmd to PF,
after receiving the VF reset requset, PF sends a cmd to inform
VF to enter the reset process and send a cmd to firmware to do the
actual reset for the VF, it is possible that firmware has resetted
the VF, but VF has not entered the reset process, which may cause
IO not stopped problem when firmware is resetting VF.
This patch fixes it by adjusting the VF reset process, when VF
need to reset itself, it will enter the reset process first, and
it will tell the PF to send cmd to firmware to reset itself.
Add member reset_pending to struct hclgevf_dev, which indicates that
there is reset event need to be processed by the VF's reset task, and
the VF's reset task chooses the highest-level one and clears other
low-level one when it processes reset_pending.
hclge_inform_reset_assert_to_vf function is unused now, but it will
be used to support the PF reset with VF working, so declare it in
the header file.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Fri, 9 Nov 2018 14:07:46 +0000 (22:07 +0800)]
net: hns3: add reset_hdev to reinit the hdev in VF's reset process
When doing reset, the reset handling function only need to
reinitialize hardware, it makes sense to add a function to
do that job. Also the error handling of hclgevf_init_hdev is
different when it is used in reset process.
This patch adds reset_hdev to reinitialize hardware when resetting.
Also, this patch removes the hclgevf_dev_ongoing_full_reset because
it is unused now.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 23:38:11 +0000 (15:38 -0800)]
Merge branch 'aquantia-fixes'
Igor Russkikh says:
====================
net: aquantia: 2018-11 bugfixes
The patchset fixes a number of bugs found in various areas after
driver validation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Fri, 9 Nov 2018 11:54:03 +0000 (11:54 +0000)]
net: aquantia: allow rx checksum offload configuration
RX Checksum offloads could not be configured and ignored netdev features
flag for checksumming.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Fri, 9 Nov 2018 11:54:01 +0000 (11:54 +0000)]
net: aquantia: invalid checksumm offload implementation
Packets with marked invalid IP/UDP/TCP checksums were considered as good
by the driver. The error was in a logic, processing offload bits in
RX descriptor.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 9 Nov 2018 11:53:59 +0000 (11:53 +0000)]
net: aquantia: fixed enable unicast on 32 macvlan
Fixed a condition mistake due to which macvlans unicast
item number 32 was not added in the unicast filter.
The consequence is that when exactly 32 macvlans are created
on NIC, the last created macvlan receives no traffic because
its MAC was not registered in HW.
Fixes:
94b3b542303f ("net: aquantia: vlan unicast address list correct handling")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Fri, 9 Nov 2018 11:53:57 +0000 (11:53 +0000)]
net: aquantia: fix potential IOMMU fault after driver unbind
IOMMU fault may occurr on unbind/bind or if_down/if_up sequence.
Although driver disables the rings on down, this is not enough.
Due to internal HW design, during subsequent initialization
NIC sometimes may reuse RX descriptors cache and write to the
host memory from the descriptor cache.
That's get catched by IOMMU on host.
This patch invalidates the descriptor cache in NIC on interface down
to prevent writing to the cached descriptors and to the memory pointed
in those descriptors.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 9 Nov 2018 11:53:56 +0000 (11:53 +0000)]
net: aquantia: synchronized flow control between mac/phy
Flow control statuses were not synchronized between blocks,
that caused packets/link drop on some corner cases, when
MAC sent PFC although Phy was not expecting these to come.
Driver should readout the negotiated FC from phy and
configure RX block accordigly.
This is done on each link change event with information from FW.
Fixes:
288551de45aa ("net: aquantia: Implement rx/tx flow control ethtools callback")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 9 Nov 2018 09:22:53 +0000 (14:52 +0530)]
cxgb4vf: free mac_hlist properly
The locally maintained list for tracking hash mac table was
not freed during driver remove.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 9 Nov 2018 09:22:01 +0000 (14:52 +0530)]
cxgb4vf: fix memleak in mac_hlist initialization
mac_hlist was initialized during adapter_up, which will be called
every time a vf device is first brought up, or every time when device
is brought up again after bringing all devices down. This means our
state of previous list is lost, causing a memleak if entries are
present in the list. To fix that, move list init to the condition
that performs initial one time adapter setup.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 9 Nov 2018 09:20:25 +0000 (14:50 +0530)]
cxgb4: free mac_hlist properly
The locally maintained list for tracking hash mac table was
not freed during driver remove.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 9 Nov 2018 09:04:51 +0000 (17:04 +0800)]
net: tcp: remove BUG_ON from tcp_v4_err
if skb is NULL pointer, and the following access of skb's
skb_mstamp_ns will trigger panic, which is same as BUG_ON
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Wen [Fri, 9 Nov 2018 06:53:59 +0000 (14:53 +0800)]
xen/netfront: remove unnecessary wmb
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY is already able to make sure backend sees
requests before req_prod is updated.
Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 9 Nov 2018 22:41:58 +0000 (16:41 -0600)]
Merge tag 'devicetree-fixes-for-4.20-2' of git://git./linux/kernel/git/robh/linux
Pull Devicetree fixes from Rob Herring:
- Add validation of NUMA distance map to prevent crashes with bad map
- Fix setting of dma_mask
* tag 'devicetree-fixes-for-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of, numa: Validate some distance map rules
of/device: Really only set bus DMA mask when appropriate
Linus Torvalds [Fri, 9 Nov 2018 22:31:51 +0000 (16:31 -0600)]
Merge tag 'for-linus-
20181109' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
- Two fixes for an ubd regression, one for missing locking, and one for
a missing initialization of a field. The latter was an old latent
bug, but it's now visible and triggers (Me, Anton Ivanov)
- Set of NVMe fixes via Christoph, but applied manually due to a git
tree mixup (Christoph, Sagi)
- Fix for a discard split regression, in three patches (Ming)
- Update libata git trees (Geert)
- SPDX identifier for sata_rcar (Kuninori Morimoto)
- Virtual boundary merge fix (Johannes)
- Preemptively clear memory we are going to pass to userspace, in case
the driver does a short read (Keith)
* tag 'for-linus-
20181109' of git://git.kernel.dk/linux-block:
block: make sure writesame bio is aligned with logical block size
block: cleanup __blkdev_issue_discard()
block: make sure discard bio is aligned with logical block size
Revert "nvmet-rdma: use a private workqueue for delete"
nvme: make sure ns head inherits underlying device limits
nvmet: don't try to add ns to p2p map unless it actually uses it
sata_rcar: convert to SPDX identifiers
ubd: fix missing initialization of io_req
block: Clear kernel memory before copying to user
MAINTAINERS: Fix remaining pointers to obsolete libata.git
ubd: fix missing lock around request issue
block: respect virtual boundary mask in bvecs
Linus Torvalds [Fri, 9 Nov 2018 22:26:18 +0000 (16:26 -0600)]
Merge tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client
Pull Ceph fixes from Ilya Dryomov:
"Two CephFS fixes (copy_file_range and quota) and a small feature bit
cleanup"
* tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client:
libceph: assume argonaut on the server side
ceph: quota: fix null pointer dereference in quota check
ceph: add destination file data sync before doing any remote copy
Linus Torvalds [Fri, 9 Nov 2018 22:21:24 +0000 (16:21 -0600)]
Merge tag 'mips_fixes_4.20_2' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A couple of small MIPS fixes for 4.20:
- Extend an array to avoid overruns on some Octeon hardware, fixing a
bug introduced in 4.3.
- Fix a coherent DMA regression for systems without cache-coherent
DMA introduced in the 4.20 merge window"
* tag 'mips_fixes_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
MIPS: OCTEON: fix out of bounds array access on CN68XX
Vinod Koul [Fri, 9 Nov 2018 09:50:54 +0000 (15:20 +0530)]
clk: qcom: gcc: Fix board clock node name
Device tree node name are not supposed to have "_" in them so fix the
node name use of xo_board to xo-board
Fixes:
652f1813c113 ("clk: qcom: gcc: Add global clock controller driver for QCS404")
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Steven Rostedt (VMware) [Fri, 9 Nov 2018 20:22:07 +0000 (15:22 -0500)]
x86/cpu/vmware: Do not trace vmware_sched_clock()
When running function tracing on a Linux guest running on VMware
Workstation, the guest would crash. This is due to tracing of the
sched_clock internal call of the VMware vmware_sched_clock(), which
causes an infinite recursion within the tracing code (clock calls must
not be traced).
Make vmware_sched_clock() not traced by ftrace.
Fixes:
80e9a4f21fd7c ("x86/vmware: Add paravirt sched clock")
Reported-by: GwanYeong Kim <gy741.kim@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: Alok Kataria <akataria@vmware.com>
CC: GwanYeong Kim <gy741.kim@gmail.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
CC: virtualization@lists.linux-foundation.org
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181109152207.4d3e7d70@gandalf.local.home
Ajay Gupta [Fri, 26 Oct 2018 16:36:59 +0000 (09:36 -0700)]
usb: typec: ucsi: add support for Cypress CCGx
Latest NVIDIA GPU cards have a Cypress CCGx Type-C controller
over I2C interface.
This UCSI I2C driver uses I2C bus driver interface for communicating
with Type-C controller.
Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Yoshihiro Shimoda [Tue, 30 Oct 2018 06:13:35 +0000 (15:13 +0900)]
serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout
This patch fixes an issue that the sci_remove() could not remove
dev_attr_rx_fifo_timeout because uart_remove_one_port() set
the port->port.type to PORT_UNKNOWN.
Reported-by: Hiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com>
Fixes:
5d23188a473d ("serial: sh-sci: make RX FIFO parameters tunable via sysfs")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wolfram Sang [Fri, 9 Nov 2018 16:54:38 +0000 (17:54 +0100)]
i2c: nvidia-gpu: make pm_ops static
sparse rightfully says:
warning: symbol 'gpu_i2c_driver_pm' was not declared. Should it be static?
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Ajay Gupta [Fri, 26 Oct 2018 16:36:58 +0000 (09:36 -0700)]
i2c: add i2c bus driver for NVIDIA GPU
Latest NVIDIA GPU card has USB Type-C interface. There is a
Type-C controller which can be accessed over I2C.
This driver adds I2C bus driver to communicate with Type-C controller.
I2C client driver will be part of USB Type-C UCSI driver.
Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[wsa: kept Makefile sorting]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Vasily Averin [Fri, 9 Nov 2018 16:34:40 +0000 (11:34 -0500)]
ext4: missing !bh check in ext4_xattr_inode_write()
According to Ted Ts'o ext4_getblk() called in ext4_xattr_inode_write()
should not return bh = NULL
The only time that bh could be NULL, then, would be in the case of
something really going wrong; a programming error elsewhere (perhaps a
wild pointer dereference) or I/O error causing on-disk file system
corruption (although that would be highly unlikely given that we had
*just* allocated the blocks and so the metadata blocks in question
probably would still be in the cache).
Fixes:
e50e5129f384 ("ext4: xattr-in-inode support")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 4.13
Stephen Boyd [Fri, 2 Nov 2018 20:57:32 +0000 (13:57 -0700)]
i2c: qcom-geni: Fix runtime PM mismatch with child devices
We need to enable runtime PM on this i2c controller before populating
child devices with i2c_add_adapter(). Otherwise, if a child device uses
runtime PM and stays runtime PM enabled we'll get the following warning
at boot.
Enabling runtime PM for inactive device (a98000.i2c) with active children
[...]
Call trace:
pm_runtime_enable+0xd8/0xf8
geni_i2c_probe+0x440/0x460
platform_drv_probe+0x74/0xc8
[...]
Let's move the runtime PM enabling and setup to before we add the
adapter, so that this device can respond to runtime PM requests from
children.
Fixes:
37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Vignesh R [Fri, 9 Nov 2018 11:14:12 +0000 (16:44 +0530)]
MAINTAINERS: Add entry for i2c-omap driver
Add separate entry for i2c-omap and add my name as maintainer for this
driver.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Vignesh R [Fri, 9 Nov 2018 11:14:11 +0000 (16:44 +0530)]
i2c: omap: Enable for ARCH_K3
Allow I2C_OMAP to be built for K3 platforms.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Vignesh R [Fri, 9 Nov 2018 11:14:10 +0000 (16:44 +0530)]
dt-bindings: i2c: omap: Add new compatible for AM654 SoCs
AM654 SoCs have same I2C IP as OMAP SoCs. Add new compatible to
handle AM654 SoCs. While at that reformat the existing compatible list
for older SoCs to list one valid compatible per line.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Juergen Gross [Thu, 1 Nov 2018 12:33:07 +0000 (13:33 +0100)]
xen: remove size limit of privcmd-buf mapping interface
Currently the size of hypercall buffers allocated via
/dev/xen/hypercall is limited to a default of 64 memory pages. For live
migration of guests this might be too small as the page dirty bitmask
needs to be sized according to the size of the guest. This means
migrating a 8GB sized guest is already exhausting the default buffer
size for the dirty bitmap.
There is no sensible way to set a sane limit, so just remove it
completely. The device node's usage is limited to root anyway, so there
is no additional DOS scenario added by allowing unlimited buffers.
While at it make the error path for the -ENOMEM case a little bit
cleaner by setting n_pages to the number of successfully allocated
pages instead of the target size.
Fixes:
c51b3c639e01f2 ("xen: add new hypercall buffer mapping device")
Cc: <stable@vger.kernel.org> #4.18
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Thu, 8 Nov 2018 07:35:06 +0000 (08:35 +0100)]
xen: fix xen_qlock_wait()
Commit
a856531951dc80 ("xen: make xen_qlock_wait() nestable")
introduced a regression for Xen guests running fully virtualized
(HVM or PVH mode). The Xen hypervisor wouldn't return from the poll
hypercall with interrupts disabled in case of an interrupt (for PV
guests it does).
So instead of disabling interrupts in xen_qlock_wait() use a nesting
counter to avoid calling xen_clear_irq_pending() in case
xen_qlock_wait() is nested.
Fixes:
a856531951dc80 ("xen: make xen_qlock_wait() nestable")
Cc: stable@vger.kernel.org
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Ming Lei [Mon, 29 Oct 2018 12:57:19 +0000 (20:57 +0800)]
block: make sure writesame bio is aligned with logical block size
Obviously the created writesame bio has to be aligned with logical block
size, and use bio_allowed_max_sectors() to retrieve this number.
Cc: stable@vger.kernel.org
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Fixes:
b49a0871be31a745b2ef ("block: remove split code in blkdev_issue_{discard,write_same}")
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Mon, 29 Oct 2018 12:57:18 +0000 (20:57 +0800)]
block: cleanup __blkdev_issue_discard()
Cleanup __blkdev_issue_discard() a bit:
- remove local variable of 'end_sect'
- remove code block of 'fail'
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Mon, 29 Oct 2018 12:57:17 +0000 (20:57 +0800)]
block: make sure discard bio is aligned with logical block size
Obviously the created discard bio has to be aligned with logical block size.
This patch introduces the helper of bio_allowed_max_sectors() for
this purpose.
Cc: stable@vger.kernel.org
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Fixes:
744889b7cbb56a6 ("block: don't deal with discard limit in blkdev_issue_discard()")
Fixes:
a22c4d7e34402cc ("block: re-add discard_granularity and alignment checks")
Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Christoph Hellwig [Wed, 7 Nov 2018 08:20:25 +0000 (09:20 +0100)]
Revert "nvmet-rdma: use a private workqueue for delete"
This reverts commit
2acf70ade79d26b97611a8df52eb22aa33814cd4.
The commit never really fixed the intended issue and caused all
kinds of other issues, including a use before initialization.
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Sagi Grimberg [Fri, 2 Nov 2018 18:22:13 +0000 (11:22 -0700)]
nvme: make sure ns head inherits underlying device limits
Whenever we update ns_head info, we need to make sure it is still
compatible with all underlying backing devices because although nvme
multipath doesn't have any explicit use of these limits, other devices
can still be stacked on top of it which may rely on the underlying limits.
Start with unlimited stacking limits, and every info update iterate over
siblings and adjust queue limits.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Sagi Grimberg [Fri, 2 Nov 2018 23:12:21 +0000 (16:12 -0700)]
nvmet: don't try to add ns to p2p map unless it actually uses it
Even without CONFIG_P2PDMA this results in a error print:
nvmet: no peer-to-peer memory is available that's supported by rxe0 and /dev/nullb0
Fixes:
c6925093d0b2 ("nvmet: Optionally use PCI P2P memory")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 9 Nov 2018 12:30:44 +0000 (06:30 -0600)]
Merge tag 's390-4.20-2' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- A fix for the pgtable_bytes misaccounting on s390. The patch changes
common code part in regard to page table folding and adds extra
checks to mm_[inc|dec]_nr_[pmds|puds].
- Add FORCE for all build targets using if_changed
- Use non-loadable phdr for the .vmlinux.info section to avoid a
segment overlap that confuses kexec
- Cleanup the attribute definition for the diagnostic sampling
- Increase stack size for CONFIG_KASAN=y builds
- Export __node_distance to fix a build error
- Correct return code of a PMU event init function
- An update for the default configs
* tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/perf: Change CPUM_CF return code in event init function
s390: update defconfigs
s390/mm: Fix ERROR: "__node_distance" undefined!
s390/kasan: increase instrumented stack size to 64k
s390/cpum_sf: Rework attribute definition for diagnostic sampling
s390/mm: fix mis-accounting of pgtable_bytes
mm: add mm_pxd_folded checks to pgtable_bytes accounting functions
mm: introduce mm_[p4d|pud|pmd]_folded
mm: make the __PAGETABLE_PxD_FOLDED defines non-empty
s390: avoid vmlinux segments overlap
s390/vdso: add missing FORCE to build targets
s390/decompressor: add missing FORCE to build targets
Juergen Gross [Wed, 7 Nov 2018 17:01:00 +0000 (18:01 +0100)]
x86/xen: fix pv boot
Commit
9da3f2b7405440 ("x86/fault: BUG() when uaccess helpers fault on
kernel addresses") introduced a regression for booting Xen PV guests.
Xen PV guests are using __put_user() and __get_user() for accessing the
p2m map (physical to machine frame number map) as accesses might fail
in case of not populated areas of the map.
With above commit using __put_user() and __get_user() for accessing
kernel pages is no longer valid. So replace the Xen hack by adding
appropriate p2m access functions using the default fixup handler.
Fixes:
9da3f2b7405440 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
David S. Miller [Fri, 9 Nov 2018 04:48:01 +0000 (20:48 -0800)]
Merge branch 'nfp-abm-move-code-and-improve-parameter-validation'
Jakub Kicinski says:
====================
nfp: abm: move code and improve parameter validation
This set starts by separating Qdisc handling code into a new file.
Next two patches allow early access to TLV-based capabilities during
probe, previously the capabilities were parsed just before netdevs
were registered, but its cleaner to do some basic validation earlier
and avoid cleanup work.
Next three patches improve RED's parameter validation. First we provide
a more precise message about why offload failed (and move the parameter
validation to a helper). Next we make sure we don't set the top bit
in the 32 bit max RED threshold value. Because FW is treating the value
as signed it reportedly causes slow downs (unnecessary queuing and
marking) when top bit is set with recent firmwares. Last (and perhaps
least importantly) we offload the harddrop parameter of the Qdisc.
We don't plan to offload harddrop RED, but it seems prudent to make
sure user didn't set that flag as device behaviour would have differed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:39 +0000 (19:50 -0800)]
nfp: abm: refuse RED offload with harddrop set
RED Qdisc will now inform the drivers about the state of the harddrop
flag. Refuse to offload in case harddrop is set.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:38 +0000 (19:50 -0800)]
net: sched: red: inform offloads about harddrop setting
To mirror software behaviour on offload more precisely inform
the drivers about the state of the harddrop flag.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:37 +0000 (19:50 -0800)]
nfp: abm: don't set negative threshold
Turns out the threshold value is used in signed compares in the FW,
so we should avoid setting the top bit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:36 +0000 (19:50 -0800)]
nfp: abm: provide more precise info about offload parameter validation
Improve log messages printed when RED can't be offloaded because
of Qdisc parameters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:35 +0000 (19:50 -0800)]
nfp: parse vNIC TLV capabilities at alloc time
In certain cases initialization logic which follows allocation of
the vNIC structure may want to validate the capabilities of that vNIC.
This is easy before vNIC is initialized for normal capabilities which
are at fixed offsets in control memory, easy to locate and read, but
poses a challenge if the capabilities are in form of TLVs. Parse
the TLVs early on so other code can just access parsed info, instead
of having to do the parsing by itself.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:34 +0000 (19:50 -0800)]
nfp: pass ctrl_bar pointer to nfp_net_alloc
Move setting ctrl_bar pointer to the nfp_net_alloc function,
to make sure we can parse capabilities early in the following
patch.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 9 Nov 2018 03:50:33 +0000 (19:50 -0800)]
nfp: abm: split qdisc offload code into a separate file
The Qdisc offload code is logically separate, and we will soon
do significant surgery on it to support more Qdiscs, so move
it to a separate file.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Fri, 9 Nov 2018 02:54:00 +0000 (21:54 -0500)]
tcp_bbr: update comments to reflect pacing_margin_percent
Recently, in commit
ab408b6dc744 ("tcp: switch tcp and sch_fq to new
earliest departure time model"), the TCP BBR code switched to a new
approach of using an explicit bbr_pacing_margin_percent for shaving a
pacing rate "haircut", rather than the previous implict
approach. Update an old comment to reflect the new approach.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 04:45:05 +0000 (20:45 -0800)]
Merge branch 'net-Use-__vlan_hwaccel_-helpers'
Michał Mirosław says:
====================
net: Use __vlan_hwaccel_*() helpers
This series removes from networking core and driver code an assumption
about how VLAN tag presence is stored in an skb. This will allow to free
up overloading of VLAN.CFI bit to incidate tag's presence.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:06 +0000 (00:18 +0100)]
sky2: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:05 +0000 (00:18 +0100)]
mlx4: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:04 +0000 (00:18 +0100)]
benet: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:04 +0000 (00:18 +0100)]
ipv4/tunnel: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:03 +0000 (00:18 +0100)]
bridge: use __vlan_hwaccel helpers
This removes assumption than vlan_tci != 0 when tag is present.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:03 +0000 (00:18 +0100)]
8021q: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:02 +0000 (00:18 +0100)]
nfnetlink/queue: use __vlan_hwaccel helpers
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:02 +0000 (00:18 +0100)]
net/core: use __vlan_hwaccel helpers
This removes assumptions about VLAN_TAG_PRESENT bit.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 23:18:01 +0000 (00:18 +0100)]
cxgb4: use __vlan_hwaccel helpers
Use __vlan_hwaccel_put_tag() to set vlan tag and proto fields.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Thu, 8 Nov 2018 22:05:42 +0000 (14:05 -0800)]
net: move __skb_checksum_complete*() to skbuff.c
__skb_checksum_complete_head() and __skb_checksum_complete()
are both declared in skbuff.h, they fit better in skbuff.c
than datagram.c.
Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 04:30:58 +0000 (20:30 -0800)]
Merge branch 'net-ethernet-ti-cpsw-fix-vlan-mcast'
Ivan Khoronzhuk says:
====================
net: ethernet: ti: cpsw: fix vlan mcast
The cpsw holds separate mcast entires for vlan entries. At this moment
driver adds only not vlan mcast addresses, omitting vlan/mcast entries.
As result mcast for vlans doesn't work. It can be fixed by adding same
mcast entries for every created vlan, but this patchseries uses more
sophisticated way and allows to create mcast entries only for vlans
that really require it. Generic functions from this series can be
reused for fixing vlan and macvlan unicast.
Simple example of ALE table before and after this series, having same
mcast entries as for vlan 100 as for real device (reserved vlan 2),
and one mcast address only for vlan 100 - 01:1b:19:00:00:00.
<---- Before this patchset ---->
vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5, mem_list = 0x5
mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
ucast, vid = 2, addr = 74:da:ea:47:7d:9d, persistant, port_num = 0x0
vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0, mem_list = 0x7
mcast, vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1
mcast, vid = 2, addr = 01:00:5e:00:00:01, port_mask = 0x1
vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3, mem_list = 0x3
mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
ucast, vid = 1, addr = 74:da:ea:47:7d:9c, persistant, port_num = 0x0
mcast, vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1
mcast, vid = 1, addr = 01:00:5e:00:00:01, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:00, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:03, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:0e, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:00, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:03, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:0e, port_mask = 0x1
mcast, vid = 2, addr = 33:33:ff:47:7d:9d, port_mask = 0x1
mcast, vid = 2, addr = 33:33:00:00:00:fb, port_mask = 0x1
mcast, vid = 2, addr = 33:33:00:01:00:03, port_mask = 0x1
mcast, vid = 1, addr = 33:33:ff:47:7d:9c, port_mask = 0x1
mcast, vid = 1, addr = 33:33:00:00:00:fb, port_mask = 0x1
mcast, vid = 1, addr = 33:33:00:01:00:03, port_mask = 0x1
mcast, vid = 1, addr = 01:00:5e:00:00:fb, port_mask = 0x1
mcast, vid = 1, addr = 01:00:5e:00:00:fc, port_mask = 0x1
vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5, mem_list = 0x5
ucast, vid = 100, addr = 74:da:ea:47:7d:9d, persistant, port_num = 0x0
mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
mcast, vid = 2, addr = 01:1b:19:00:00:00, port_mask = 0x1
^^^
Here mcast entry (ptpl2), has to be added only for vlan 100
but added for reserved vlan 2...that's not enough.
<---- After this patchset ---->
vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5, mem_list = 0x5
mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
ucast, vid = 2, addr = 74:da:ea:47:7d:9d, persistant, port_num = 0x0
vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0, mem_list = 0x7
mcast, vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1
mcast, vid = 2, addr = 01:00:5e:00:00:01, port_mask = 0x1
vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3, mem_list = 0x3
mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
ucast, vid = 1, addr = 74:da:ea:47:7d:9c, persistant, port_num = 0x0
mcast, vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1
mcast, vid = 1, addr = 01:00:5e:00:00:01, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:00, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:03, port_mask = 0x1
mcast, vid = 2, addr = 01:80:c2:00:00:0e, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:00, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:03, port_mask = 0x1
mcast, vid = 1, addr = 01:80:c2:00:00:0e, port_mask = 0x1
mcast, vid = 2, addr = 33:33:ff:47:7d:9d, port_mask = 0x1
mcast, vid = 1, addr = 33:33:ff:47:7d:9c, port_mask = 0x1
mcast, vid = 2, addr = 33:33:00:00:00:fb, port_mask = 0x1
mcast, vid = 2, addr = 33:33:00:01:00:03, port_mask = 0x1
mcast, vid = 1, addr = 33:33:00:00:00:fb, port_mask = 0x1
mcast, vid = 1, addr = 33:33:00:01:00:03, port_mask = 0x1
vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5, mem_list = 0x5
ucast, vid = 100, addr = 74:da:ea:47:7d:9d, persistant, port_num = 0x0
mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, port_mask = 0x1
mcast, vid = 100, addr = 33:33:00:00:00:01, port_mask = 0x1
mcast, vid = 100, addr = 01:00:5e:00:00:01, port_mask = 0x1
mcast, vid = 100, addr = 33:33:ff:47:7d:9d, port_mask = 0x1
mcast, vid = 100, addr = 01:80:c2:00:00:00, port_mask = 0x1
mcast, vid = 100, addr = 01:80:c2:00:00:03, port_mask = 0x1
mcast, vid = 100, addr = 01:80:c2:00:00:0e, port_mask = 0x1
mcast, vid = 100, addr = 33:33:00:00:00:fb, port_mask = 0x1
mcast, vid = 100, addr = 33:33:00:01:00:03, port_mask = 0x1
mcast, vid = 100, addr = 01:1b:19:00:00:00, port_mask = 0x1
^^^
Here mcast entry (ptpl2), is added only for vlan 100
as it should be.
Based on net-next/master
v2..v1:
net: ethernet: ti: cpsw: fix vlan mcast
- removed limit for legacy switch cpsw mode
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 8 Nov 2018 20:27:57 +0000 (22:27 +0200)]
net: ethernet: ti: cpsw: fix vlan configuration while down/up
The vlan configuration is not restored after interface donw/up sequence
(if dual-emac - both interfaces). Tested on am572x EVM.
Steps to check:
~# ip link add link eth1 name eth1.100 type vlan id 100
~# ifconfig eth0 down
~# ifconfig eth1 down
Try to remove vid and observe warning:
~# ip link del eth1.100
[ 739.526757] net eth1: removing vlanid 100 from vlan filter
[ 739.533322] failed to kill vid 0081/100 for device eth1
This patch fixes it, restoring only vlan ALE entries and all other
unicast/multicast entries are restored by system calling rx_mode ndo.
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 8 Nov 2018 20:27:56 +0000 (22:27 +0200)]
net: ethernet: ti: cpsw: fix vlan mcast
At this moment, mcast addresses are added for real device only
(reserved vlans for dual-emac mode), even if a mcast address was added
for some vlan only, thus ALE doesn't have corresponding vlan mcast
entries after vlan socket joined multicast group. So ALE drops vlan
frames with mcast addresses intended for vlans and potentially can
receive mcast frames for base ndev. That's not correct. So, fix it by
creating only vlan/mcast entries as requested. Patch doesn't use any
additional lists and is based on device mc address list and cpsw ALE
table entries.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 8 Nov 2018 20:27:55 +0000 (22:27 +0200)]
net: 8021q: vlan_core: allow use list of vlans for real device
It's redundancy for the drivers to hold the list of vlans when
absolutely the same list exists in vlan core. In most cases it's
needed only to traverse the vlan devices, their vids and sync some
settings with h/w, so add API to simplify this.
At least some of these drivers also can benefit:
grep "for_each.*vid" -r drivers/net/ethernet/
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:
drivers/net/ethernet/synopsys/dwc-xlgmac-hw.c:
drivers/net/ethernet/qlogic/qlge/qlge_main.c:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:
drivers/net/ethernet/via/via-rhine.c:
drivers/net/ethernet/via/via-velocity.c:
drivers/net/ethernet/intel/igb/igb_main.c:
drivers/net/ethernet/intel/ice/ice_main.c:
drivers/net/ethernet/intel/e1000/e1000_main.c:
drivers/net/ethernet/intel/i40e/i40e_main.c:
drivers/net/ethernet/intel/e1000e/netdev.c:
drivers/net/ethernet/intel/igbvf/netdev.c:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:
drivers/net/ethernet/intel/ixgb/ixgb_main.c:
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:
drivers/net/ethernet/amd/xgbe/xgbe-dev.c:
drivers/net/ethernet/emulex/benet/be_main.c:
drivers/net/ethernet/neterion/vxge/vxge-main.c:
drivers/net/ethernet/adaptec/starfire.c:
drivers/net/ethernet/brocade/bna/bnad.c:
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 8 Nov 2018 20:27:54 +0000 (22:27 +0200)]
net: core: dev_addr_lists: add auxiliary func to handle reference address updates
In order to avoid all table update, and only remove or add new
address, the auxiliary function exists, named __hw_addr_sync_dev().
It allows end driver do nothing when nothing changed and add/rm when
concrete address is firstly added or lastly removed. But it doesn't
include cases when an address of real device or vlan was reused by
other vlans or vlan/macval devices.
For handaling events when address was reused/unreused the patch adds
new auxiliary routine - __hw_addr_ref_sync_dev(). It allows to do
nothing when nothing was changed and do updates only for an address
being added/reused/deleted/unreused. Thus, clone address changes for
vlans can be mirrored in the table. The function is exclusive with
__hw_addr_sync_dev(). It's responsibility of the end driver to
identify address vlan device, if it needs so.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Thu, 8 Nov 2018 19:47:19 +0000 (19:47 +0000)]
sfc: use the new __netdev_tx_sent_queue BQL optimisation
As added in
3e59020abf0f ("net: bql: add __netdev_tx_sent_queue()"), which
see for performance rationale.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Thu, 8 Nov 2018 19:38:26 +0000 (20:38 +0100)]
net: smsc95xx: Fix MTU range
The commit
f77f0aee4da4 ("net: use core MTU range checking in USB NIC
drivers") introduce a common MTU handling for usbnet. But it's missing
the necessary changes for smsc95xx. So set the MTU range accordingly.
This patch has been tested on a Raspberry Pi 3.
Fixes:
f77f0aee4da4 ("net: use core MTU range checking in USB NIC drivers")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 03:49:32 +0000 (19:49 -0800)]
Merge branch 'net-Remove-VLAN_TAG_PRESENT-from-drivers'
Michał Mirosław says:
====================
net: Remove VLAN_TAG_PRESENT from drivers
This series removes VLAN_TAG_PRESENT use from network drivers in
preparation to removing its special meaning.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 17:44:50 +0000 (18:44 +0100)]
gianfar: remove use of VLAN_TAG_PRESENT
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 17:44:50 +0000 (18:44 +0100)]
OVS: remove use of VLAN_TAG_PRESENT
This is a minimal change to allow removing of VLAN_TAG_PRESENT.
It leaves OVS unable to use CFI bit, as fixing this would need
a deeper surgery involving userspace interface.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 17:44:50 +0000 (18:44 +0100)]
cnic: remove use of VLAN_TAG_PRESENT
This just removes VLAN_TAG_PRESENT use. VLAN TCI=0 special meaning is
deeply embedded in the driver code and so is left as is.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Thu, 8 Nov 2018 17:44:49 +0000 (18:44 +0100)]
i40iw: remove use of VLAN_TAG_PRESENT
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thor Thayer [Thu, 8 Nov 2018 17:42:14 +0000 (11:42 -0600)]
net: stmmac: Fix RX packet size > 8191
Ping problems with packets > 8191 as shown:
PING 192.168.1.99 (192.168.1.99) 8150(8178) bytes of data.
8158 bytes from 192.168.1.99: icmp_seq=1 ttl=64 time=0.669 ms
wrong data byte 8144 should be 0xd0 but was 0x0
16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
%< ---------------snip--------------------------------------
8112 b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
8144 0 0 0 0 d0 d1
^^^^^^^
Notice the 4 bytes of 0 before the expected byte of d0.
Databook notes that the RX buffer must be a multiple of 4/8/16
bytes [1].
Update the DMA Buffer size define to 8188 instead of 8192. Remove
the -1 from the RX buffer size allocations and use the new
DMA Buffer size directly.
[1] Synopsys DesignWare Cores Ethernet MAC Universal v3.70a
[section 8.4.2 - Table 8-24]
Tested on SoCFPGA Stratix10 with ping sweep from 100 to 8300 byte packets.
Fixes:
286a83721720 ("stmmac: add CHAINED descriptor mode support (V4)")
Suggested-by: Jose Abreu <jose.abreu@synopsys.com>
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Thu, 8 Nov 2018 15:19:55 +0000 (17:19 +0200)]
net: socionext: refactor netsec_alloc_dring()
return -ENOMEM directly instead of assigning it in a variable
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Thu, 8 Nov 2018 15:19:54 +0000 (17:19 +0200)]
net: socionext: different approach on DMA
Current driver dynamically allocates an skb and maps it as DMA Rx
buffer. In order to prepare for upcoming XDP changes, let's introduce a
different allocation scheme.
Buffers are allocated dynamically and mapped into hardware.
During the Rx operation the driver uses build_skb() to produce the
necessary buffers for the network stack.
This change increases performance ~15% on 64b packets with smmu disabled
and ~5% with smmu enabled
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Thu, 8 Nov 2018 13:38:21 +0000 (14:38 +0100)]
net: qca_spi: Add available buffer space verification
Interferences on the SPI line could distort the response of
available buffer space. So at least we should check that the
response doesn't exceed the maximum available buffer space.
In error case increase a new error counter and retry it later.
This behavior avoids buffer errors in the QCA7000, which
results in an unnecessary chip reset including packet loss.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 03:38:19 +0000 (19:38 -0800)]
Merge branch 'qed-Slowpath-Queue-bug-fixes'
Denis Bolotin says:
====================
qed: Slowpath Queue bug fixes
This patch series fixes several bugs in the SPQ mechanism.
It deals with SPQ entries management, preventing resource leaks, memory
corruptions and handles error cases throughout the driver.
Please consider applying to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sagiv Ozeri [Thu, 8 Nov 2018 14:46:11 +0000 (16:46 +0200)]
qed: Fix potential memory corruption
A stuck ramrod should be deleted from the completion_pending list,
otherwise it will be added again in the future and corrupt the list.
Return error value to inform that ramrod is stuck and should be deleted.
Signed-off-by: Sagiv Ozeri <sagiv.ozeri@cavium.com>
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Bolotin [Thu, 8 Nov 2018 14:46:10 +0000 (16:46 +0200)]
qed: Fix SPQ entries not returned to pool in error flows
qed_sp_destroy_request() API was added for SPQ users that need to
free/return the entry they acquired in their error flows.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Bolotin [Thu, 8 Nov 2018 14:46:09 +0000 (16:46 +0200)]
qed: Fix blocking/unlimited SPQ entries leak
When there are no SPQ entries left in the free_pool, new entries are
allocated and are added to the unlimited list. When an entry in the pool
is available, the content is copied from the original entry, and the new
entry is sent to the device. qed_spq_post() is not aware of that, so the
additional entry is stored in the original entry as p_post_ent, which can
later be returned to the pool.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Bolotin [Thu, 8 Nov 2018 14:46:08 +0000 (16:46 +0200)]
qed: Fix memory/entry leak in qed_init_sp_request()
Free the allocated SPQ entry or return the acquired SPQ entry to the free
list in error flows.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Barmann [Thu, 8 Nov 2018 14:13:35 +0000 (08:13 -0600)]
sock: Reset dst when changing sk_mark via setsockopt
When setting the SO_MARK socket option, if the mark changes, the dst
needs to be reset so that a new route lookup is performed.
This fixes the case where an application wants to change routing by
setting a new sk_mark. If this is done after some packets have already
been sent, the dst is cached and has no effect.
Signed-off-by: David Barmann <david.barmann@stackpath.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 9 Nov 2018 01:34:27 +0000 (17:34 -0800)]
inet: frags: better deal with smp races
Multiple cpus might attempt to insert a new fragment in rhashtable,
if for example RPS is buggy, as reported by 배석진 in
https://patchwork.ozlabs.org/patch/994601/
We use rhashtable_lookup_get_insert_key() instead of
rhashtable_insert_fast() to let cpus losing the race
free their own inet_frag_queue and use the one that
was inserted by another cpu.
Fixes:
648700f76b03 ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: 배석진 <soukjin.bae@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 01:22:24 +0000 (17:22 -0800)]
Merge branch 's390-qeth-next'
Julian Wiedmann says:
====================
s390/qeth: updates 2018-11-08
please apply the following qeth patches to net-next.
The first patch allows one more device type to query the FW for a MAC address,
the others are all basically just removal of duplicated or unused code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:22 +0000 (15:06 +0100)]
s390/qeth: don't process hsuid in qeth_l3_setup_netdev()
qeth_l3_setup_netdev() checks if the hsuid attribute is set on the qeth
device, and propagates it to the net_device. In the past this was needed
to pick up any hsuid that was set before allocation of the net_device.
With commit
d3d1b205e89f ("s390/qeth: allocate netdevice early") this
is no longer necessary, qeth_l3_dev_hsuid_store() always stores the
hsuid straight into dev->perm_addr.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:21 +0000 (15:06 +0100)]
s390/qeth: remove unused fallback in Layer3's MAC code
If the CREATE ADDR sent by qeth_l3_iqd_read_initial_mac() fails, its
callback sets a random MAC address on the net_device. The error then
propagates back, and qeth_l3_setup_netdev() bails out without
registering the net_device.
Any subsequent call to qeth_l3_setup_netdev() will then attempt a fresh
CREATE ADDR which either 1) also fails, or 2) sets a proper MAC address
on the net_device. Consequently, the net_device will never be registered
with a random MAC and we can drop the fallback code.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:20 +0000 (15:06 +0100)]
s390/qeth: remove two IPA command helpers
qeth_l3_send_ipa_arp_cmd() is merely a wrapper around
qeth_send_control_data() now. So push the length adjustment into
QETH_SETASS_BASE_LEN, and remove the wrapper. While at it, also remove
some redundant 0-initializations.
qeth_send_setassparms() requires that callers prepare their command
parameters, so that they can be copied into the parameter area in one
go. Skip the indirection, and just let callers set up the command
themselves.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:19 +0000 (15:06 +0100)]
s390/qeth: replace open-coded cmd setup
Call qeth_prepare_ipa_cmd() during setup of a new IPA cmd buffer, so
that it is used for all commands. Thus ARP and SNMP requests don't have
to do their own initialization.
This will now also set the proper MPC protocol version for SNMP requests
on L2 devices.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:18 +0000 (15:06 +0100)]
s390/qeth: remove card list
Re-implement the card-by-RDEV lookup by using device model concepts, and
remove the now redundant list of all qeth card instances in the system.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:17 +0000 (15:06 +0100)]
s390/qeth: unify transmit code
Since commit
82bf5c0867f6 ("s390/qeth: add support for IPv6 TSO"),
qeth_xmit() also knows how to build TSO packets and is practically
identical to qeth_l3_xmit().
Convert qeth_l3_xmit() into a thin wrapper that merely strips the
L2 header off a packet, and calls qeth_xmit() for the actual
TX processing.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:16 +0000 (15:06 +0100)]
s390/qeth: handle af_iucv skbs in qeth_l3_fill_header()
Filling the HW header from one single function will make it easier to
rip out all the duplicated transmit code in qeth_l3_xmit(). On top, this
saves one conditional branch in the TSO path.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 8 Nov 2018 14:06:15 +0000 (15:06 +0100)]
s390/qeth: utilize virtual MAC for Layer2 OSD devices
By default, READ MAC on a Layer2 OSD device returns the adapter's
burnt-in MAC address. Given the default scenario of many virtual devices
on the same adapter, qeth can't make any use of this address and
therefore skips the READ MAC call for this device type.
But in some configurations, the READ MAC command for a Layer2 OSD device
actually returns a pre-provisioned, virtual MAC address. So enable the
READ MAC code to detect this situation, and let the L2 subdriver
call READ MAC for OSD devices.
This also removes the QETH_LAYER2_MAC_READ flag, which protects L2
devices against calling READ MAC multiple times. Instead protect the
whole call to qeth_l2_request_initial_mac().
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Thu, 8 Nov 2018 12:40:20 +0000 (20:40 +0800)]
openvswitch: remove BUG_ON from get_dpdev
if local is NULL pointer, and the following access of local's
dev will trigger panic, which is same as BUG_ON
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Nov 2018 01:13:09 +0000 (17:13 -0800)]
Merge branch 'ICMP-error-handling-for-UDP-tunnels'
Stefano Brivio says:
====================
ICMP error handling for UDP tunnels
This series introduces ICMP error handling for UDP tunnels and
encapsulations and related selftests. We need to handle ICMP errors to
support PMTU discovery and route redirection -- this support is entirely
missing right now:
- patch 1/11 adds a socket lookup for UDP tunnels that use, by design,
the same destination port on both endpoints -- i.e. VXLAN and GENEVE
- patches 2/11 to 7/11 are specific to VxLAN and GENEVE
- patches 8/11 and 9/11 add infrastructure for lookup of encapsulations
where sent packets cannot be matched via receiving socket lookup, i.e.
FoU and GUE
- patches 10/11 and 11/11 are specific to FoU and GUE
v2: changes are listed in the single patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Thu, 8 Nov 2018 11:19:24 +0000 (12:19 +0100)]
selftests: pmtu: Introduce FoU and GUE PMTU exceptions tests
Introduce eight tests, for FoU and GUE, with IPv4 and IPv6 payload,
on IPv4 and IPv6 transport, that check that PMTU exceptions are created
with the right value when exceeding the MTU on a link of the path.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Thu, 8 Nov 2018 11:19:23 +0000 (12:19 +0100)]
fou, fou6: ICMP error handlers for FoU and GUE
As the destination port in FoU and GUE receiving sockets doesn't
necessarily match the remote destination port, we can't associate errors
to the encapsulating tunnels with a socket lookup -- we need to blindly
try them instead. This means we don't even know if we are handling errors
for FoU or GUE without digging into the packets.
Hence, implement a single handler for both, one for IPv4 and one for IPv6,
that will check whether the packet that generated the ICMP error used a
direct IP encapsulation or if it had a GUE header, and send the error to
the matching protocol handler, if any.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>