summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2013-03-28net: add ETH_P_802_3_MINSimon Horman
Add a new constant ETH_P_802_3_MIN, the minimum ethernet type for an 802.3 frame. Frames with a lower value in the ethernet type field are Ethernet II. Also update all the users of this value that David Miller and I could find to use the new constant. Also correct a bug in util.c. The comparison with ETH_P_802_3_MIN should be >= not >. As suggested by Jesse Gross. Compile tested only. Cc: David Miller <davem@davemloft.net> Cc: Jesse Gross <jesse@nicira.com> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: John W. Linville <linville@tuxdriver.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Bart De Schuymer <bart.de.schuymer@pandora.be> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: linux-bluetooth@vger.kernel.org Cc: netfilter-devel@vger.kernel.org Cc: bridge@lists.linux-foundation.org Cc: linux-wireless@vger.kernel.org Cc: linux1394-devel@lists.sourceforge.net Cc: linux-media@vger.kernel.org Cc: netdev@vger.kernel.org Cc: dev@openvswitch.org Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-27metag: ptrace: Implement NT_METAG_TLSPaul Clothier
Implement functionality to get the TLS pointer for the metag architecture using regsets. This provides multi-threaded debug support for GDB. Signed-off-by: Paul Clothier <Paul.Clothier@imgtec.com> Signed-off-by: James Hogan <james.hogan@imgtec.com>
2013-03-26netlink: remove duplicated NLMSG_ALIGNHong zhi guo
NLMSG_HDRLEN is already aligned value. It's for directly reference without extra alignment. The redundant alignment here may confuse the API users. Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-26Merge tag 'arizona-extcon-asoc' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc into char-misc-next Mark writes: ASoC/extcon: arizona: Fix interaction between HPDET and headphone outputs This patch series covers both ASoC and extcon subsystems and fixes an interaction between the HPDET function and the headphone outputs - we really shouldn't run HPDET while the headphone is active. The first patch is a refactoring to make the extcon side easier.
2013-03-25USB: cdc-wdm: implement IOCTL_WDM_MAX_COMMANDBjørn Mork
Userspace applications need to know the maximum supported message size. The cdc-wdm driver translates between a character device stream and a message based protocol. Each message is transported as a usb control message with no further encapsulation or syncronization. Each read or write on the character device should translate to exactly one usb control message to ensure that message boundaries are kept intact. That means that the userspace application must know the maximum message size supported by the device and driver, making this size a vital part of the cdc-wdm character device API. CDC WDM and CDC MBIM functions export the maximum supported message size through CDC functional descriptors. The cdc-wdm and cdc_mbim drivers will parse these descriptors and use the value chosen by the device. The only current way for a userspace application to retrive the value is by duplicating the descriptor parsing. This is an unnecessary complex task, and application writers are likely to postpone it, using a fixed value and adding a "todo" item. QMI functions have no way to tell the host what message size they support. The qmi_wwan driver use a fixed value based on protocol recommendations and observed device behaviour. Userspace applications must know and hard code the same value. This scheme will break if we ever encounter a QMI device needing a device specific message size quirk. We are currently unable to support such a device because using a non default size would break the implicit userspace API. The message size is currently a hidden attribute of the cdc-wdm userspace API. Retrieving it is unnecessarily complex, increasing the possibility of drivers and applications using different limits. The resulting errors are hard to debug, and can only be replicated on identical hardware. Exporting the maximum message size from the driver simplifies the task for the userspace application, and creates a unified information source independent of device and function class. It also serves to document that the message size is part of the cdc-wdm userspace API. This proposed API extension has been presented for the authors of userspace applications and libraries using the current API: libmbim, libqmi, uqmi, oFono and ModemManager. The replies were: Aleksander Morgado: "We do really need max message size for MBIM; and as you say, it may be good to have the max message size info also for QMI, so the new ioctl seems a good addition. So +1 from my side, for what it's worth." Dan Williams: "Yeah, +1 here. I'd prefer the sysfs file, but the fact that that doesn't work for fd passing pretty much kills it." No negative replies are so far received. Cc: Aleksander Morgado <aleksander@lanedo.com> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-25[media] v4l2: add new VIDIOC_DBG_G_CHIP_NAME ioctlHans Verkuil
Simplify the debugging ioctls by creating the VIDIOC_DBG_G_CHIP_NAME ioctl. This will eventually replace VIDIOC_DBG_G_CHIP_IDENT. Chip matching is done by the name or index of subdevices or an index to a bridge chip. Most of this can all be done automatically, so most drivers just need to provide get/set register ops. In particular, it is now possible to get/set subdev registers without requiring assistance of the bridge driver. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-24[media] videodev2.h: remove obsolete DV_PRESET APIHans Verkuil
This API is now obsolete and can be removed. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-24[media] v4l2-ctrls: add V4L2_CID_MPEG_VIDEO_REPEAT_SEQ_HEADER controlHans Verkuil
Control whether video sequence headers should be repeated. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-22timekeeping: Add CLOCK_TAI clockidJohn Stultz
This add a CLOCK_TAI clockid and the needed accessors. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Pull to get the thermal netlink multicast group name fix, otherwise the assertion added in net-next to netlink to detect that kind of bug makes systems unbootable for some folks. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21[media] media: add support for decoder as one of media entity typesManjunath Hadli
A lot of SOCs including Texas Instruments Davinci family mainly use video decoders as input devices. This patch adds a flag 'MEDIA_ENT_T_V4L2_SUBDEV_DECODER' media entity type for decoder's. Along side updates the documentation for this media entity type. Signed-off-by: Manjunath Hadli <manjunath.hadli@ti.com> Signed-off-by: Lad, Prabhakar <prabhakar.lad@ti.com> Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-21netlink: Diag core and basic socket info dumping (v2)Andrey Vagin
The netlink_diag can be built as a module, just like it's done in unix sockets. The core dumping message carries the basic info about netlink sockets: family, type and protocol, portis, dst_group, dst_portid, state. Groups can be received as an optional parameter NETLINK_DIAG_GROUPS. Netlink sockets cab be filtered by protocols. The socket inode number and cookie is reserved for future per-socket info retrieving. The per-protocol filtering is also reserved for future by requiring the sdiag_protocol to be zero. The file /proc/net/netlink doesn't provide enough information for dumping netlink sockets. It doesn't provide dst_group, dst_portid, groups above 32. v2: fix NETLINK_DIAG_MAX. Now it's equal to the last constant. Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21net: fix *_DIAG_MAX constantsAndrey Vagin
Follow the common pattern and define *_DIAG_MAX like: [...] __XXX_DIAG_MAX, }; Because everyone is used to do: struct nlattr *attrs[XXX_DIAG_MAX+1]; nla_parse([...], XXX_DIAG_MAX, [...] Reported-by: Thomas Graf <tgraf@suug.ch> Cc: "David S. Miller" <davem@davemloft.net> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21Merge remote-tracking branch 'upstream/master' into queueMarcelo Tosatti
Merge reason: From: Alexander Graf <agraf@suse.de> "Just recently this really important patch got pulled into Linus' tree for 3.9: commit 1674400aaee5b466c595a8fc310488263ce888c7 Author: Anton Blanchard <anton <at> samba.org> Date: Tue Mar 12 01:51:51 2013 +0000 Without that commit, I can not boot my G5, thus I can't run automated tests on it against my queue. Could you please merge kvm/next against linus/master, so that I can base my trees against that?" * upstream/master: (653 commits) PCI: Use ROM images from firmware only if no other ROM source available sparc: remove unused "config BITS" sparc: delete "if !ULTRA_HAS_POPULATION_COUNT" KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED inet: limit length of fragment queue hash table bucket lists qeth: Fix scatter-gather regression qeth: Fix invalid router settings handling qeth: delay feature trace sgy-cts1000: Remove __dev* attributes KVM: x86: fix deadlock in clock-in-progress request handling KVM: allow host header to be included even for !CONFIG_KVM hwmon: (lm75) Fix tcn75 prefix hwmon: (lm75.h) Update header inclusion MAINTAINERS: Remove Mark M. Hoffman xfs: ensure we capture IO errors correctly xfs: fix xfs_iomap_eof_prealloc_initial_size type ... Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-20Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2013-03-20connector: Added coredumping event to the process connectorJesper Derehag
Process connector can now also detect coredumping events. Main aim of patch is get notified at start of coredumping, instead of having to wait for it to finish and then being notified through EXIT event. Could be used for instance by process-managers that want to get notified as soon as possible about process failures, and not necessarily beeing notified after coredump, which could be in the order of minutes depending on size of coredump, piping and so on. Signed-off-by: Jesper Derehag <jderehag@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20filter: add ANC_PAY_OFFSET instruction for loading payload start offsetDaniel Borkmann
It is very useful to do dynamic truncation of packets. In particular, we're interested to push the necessary header bytes to the user space and cut off user payload that should probably not be transferred for some reasons (e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET, we can load it into the accumulator, and return it. E.g. in bpfc syntax ... ld #poff ; { 0x20, 0, 0, 0xfffff034 }, ret a ; { 0x16, 0, 0, 0x00000000 }, ... as a filter will accomplish this without having to do a big hackery in a BPF filter itself. Follow-up JIT implementations are welcome. Thanks to Eric Dumazet for suggesting and discussing this during the Netfilter Workshop in Copenhagen. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Pull in the 'net' tree to get Daniel Borkmann's flow dissector infrastructure change. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20caif_virtio: Introduce caif over virtioErwan Yvin
Add the CAIF Virtio shared memory driver for talking to a modem. This CAIF Link layer communicates to the modem over shared memory. It is implemented as a virtio_driver. The underlying virtio device is managed by the remoteproc framework. The Virtio queue is used for transmitting data to the modem, and the new vringh is used for receiving data. Genalloc is used for managing the shared memory used for TX data. The default dma-alloc-coherent allocator can only allocate whole pages, and this wastes too much shared memory. Flow control is implemented by stopping the TX-queues if the virtio queues go full or we run out of memory. Queued are reopened when queues are below the watermark. NAPI is used in RX path, and a dedicated tasklet is used for releasing TX buffers. Signed-off-by: Erwan Yvin <erwan.yvin@stericsson.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor fixes)
2013-03-19smack: SMACK_MAGIC to include/uapi/linux/magic.hJarkko Sakkinen
SMACK_MAGIC moved to a proper place for easy user space access (i.e. libsmack). Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>
2013-03-19packet: packet fanout rollover during socket overloadWillem de Bruijn
Changes: v3->v2: rebase (no other changes) passes selftest v2->v1: read f->num_members only once fix bug: test rollover mode + flag Minimize packet drop in a fanout group. If one socket is full, roll over packets to another from the group. Maintain flow affinity during normal load using an rxhash fanout policy, while dispersing unexpected traffic storms that hit a single cpu, such as spoofed-source DoS flows. Rollover breaks affinity for flows arriving at saturated sockets during those conditions. The patch adds a fanout policy ROLLOVER that rotates between sockets, filling each socket before moving to the next. It also adds a fanout flag ROLLOVER. If passed along with any other fanout policy, the primary policy is applied until the chosen socket is full. Then, rollover selects another socket, to delay packet drop until the entire system is saturated. Probing sockets is not free. Selecting the last used socket, as rollover does, is a greedy approach that maximizes chance of success, at the cost of extreme load imbalance. In practice, with sufficiently long queues to absorb bursts, sockets are drained in parallel and load balance looks uniform in `top`. To avoid contention, scales counters with number of sockets and accesses them lockfree. Values are bounds checked to ensure correctness. Tested using an application with 9 threads pinned to CPUs, one socket per thread and sufficient busywork per packet operation to limits each thread to handling 32 Kpps. When sent 500 Kpps single UDP stream packets, a FANOUT_CPU setup processes 32 Kpps in total without this patch, 270 Kpps with the patch. Tested with read() and with a packet ring (V1). Also, passes psock_fanout.c unit test added to selftests. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17Merge tag 'v3.9-rc3' into nextDmitry Torokhov
Merge with mainline to bring in module_platform_driver_probe() and devm_ioremap_resource().
2013-03-17tcp: Remove TCPCTChristoph Paasch
TCPCT uses option-number 253, reserved for experimental use and should not be used in production environments. Further, TCPCT does not fully implement RFC 6013. As a nice side-effect, removing TCPCT increases TCP's performance for very short flows: Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests for files of 1KB size. before this patch: average (among 7 runs) of 20845.5 Requests/Second after: average (among 7 runs) of 21403.6 Requests/Second Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17vxlan: generalize forwarding tablesDavid Stevens
This patch generalizes VXLAN forwarding table entries allowing an administrator to: 1) specify multiple destinations for a given MAC 2) specify alternate vni's in the VXLAN header 3) specify alternate destination UDP ports 4) use multicast MAC addresses as fdb lookup keys 5) specify multicast destinations 6) specify the outgoing interface for forwarded packets The combination allows configuration of more complex topologies using VXLAN encapsulation. Changes since v1: rebase to 3.9.0-rc2 Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15Drivers: hv: Add a new driver to support host initiated backupK. Y. Srinivasan
This driver supports host initiated backup of the guest. On Windows guests, the host can generate application consistent backups using the Windows VSS framework. On Linux, we ensure that the backup will be file system consistent. This driver allows the host to initiate a "Freeze" operation on all the mounted file systems in the guest. Once the mounted file systems in the guest are frozen, the host snapshots the guest's file systems. Once this is done, the guest's file systems are "thawed". This driver has a user-level component (daemon) that invokes the appropriate operation on all the mounted file systems in response to the requests from the host. The duration for which the guest is frozen is very short - a few seconds. During this interval, the diff disk is comitted. In this version of the patch I have addressed the feedback from Olaf Herring. Also, some of the connector related issues have been fixed. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-13Merge branch 'akpm' (fixes from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: - A bunch of fixes - Finish off the idr API conversions before someone starts to use the old interfaces again. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: idr: idr_alloc() shouldn't trigger lowmem warning when preloaded UAPI: fix endianness conditionals in M32R's asm/stat.h UAPI: fix endianness conditionals in linux/raid/md_p.h UAPI: fix endianness conditionals in linux/acct.h UAPI: fix endianness conditionals in linux/aio_abi.h decompressors: fix typo "POWERPC" mm/fremap.c: fix oops on error path idr: deprecate idr_pre_get() and idr_get_new[_above]() tidspbridge: convert to idr_alloc() zcache: convert to idr_alloc() mlx4: remove leftover idr_pre_get() call workqueue: convert to idr_alloc() nfsd: convert to idr_alloc() nfsd: remove unused get_new_stid() kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER signal: always clear sa_restorer on execve mm: remove_memory(): fix end_pfn setting include/linux/res_counter.h needs errno.h
2013-03-13UAPI: fix endianness conditionals in linux/raid/md_p.hDavid Howells
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be compared against __BYTE_ORDER in preprocessor conditionals where these are exposed to userspace (that is they're not inside __KERNEL__ conditionals). However, in the main kernel the norm is to check for "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and this has incorrectly leaked into the userspace headers. The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in this way. Note that userspace will likely interpret the ordering of the fields incorrectly as the big-endian variant on a little-endian machines - depending on header inclusion order. [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might be better to fix the ordering of events_hi, events_lo, cp_events_hi and cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13UAPI: fix endianness conditionals in linux/acct.hDavid Howells
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be compared against __BYTE_ORDER in preprocessor conditionals where these are exposed to userspace (that is they're not inside __KERNEL__ conditionals). However, in the main kernel the norm is to check for "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and this has incorrectly leaked into the userspace headers. The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way. Note that userspace will likely interpret this incorrectly as the big-endian variant on little-endian machines - depending on header inclusion order. [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might be better to fix the value of ACCT_BYTEORDER. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13UAPI: fix endianness conditionals in linux/aio_abi.hDavid Howells
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be compared against __BYTE_ORDER in preprocessor conditionals where these are exposed to userspace (that is they're not inside __KERNEL__ conditionals). However, in the main kernel the norm is to check for "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and this has incorrectly leaked into the userspace headers. The definition of PADDED() in linux/aio_abi.h is wrong in this way. Note that userspace will likely interpret this and thus the order of fields in struct iocb incorrectly as the little-endian variant on big-endian machines - depending on header inclusion order. [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might be better to fix the ordering of aio_key and aio_reserved1 in struct iocb. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12tty/serial: Add support for Altera serial portLey Foon Tan
Add support for Altera 8250/16550 compatible serial port. Signed-off-by: Ley Foon Tan <lftan@altera.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-12Input: add new keycodes for passenger control unitsDmitry Torokhov
Entertainment systems used in aircraft need additional keycodes for their Passenger Control Units, so let's add them. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-03-12tcp: TLP loss detection.Nandita Dukkipati
This is the second of the TLP patch series; it augments the basic TLP algorithm with a loss detection scheme. This patch implements a mechanism for loss detection when a Tail loss probe retransmission plugs a hole thereby masking packet loss from the sender. The loss detection algorithm relies on counting TLP dupacks as outlined in Sec. 3 of: http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01 The basic idea is: Sender keeps track of TLP "episode" upon retransmission of a TLP packet. An episode ends when the sender receives an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the episode. We want to make sure that before the episode ends the sender receives a "TLP dupack", indicating that the TLP retransmission was unnecessary, so there was no loss/hole that needed plugging. If the sender gets no TLP dupack before the end of the episode, then it reduces ssthresh and the congestion window, because the TLP packet arriving at the receiver probably plugged a hole. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12tcp: Tail loss probe (TLP)Nandita Dukkipati
This patch series implement the Tail loss probe (TLP) algorithm described in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The first patch implements the basic algorithm. TLP's goal is to reduce tail latency of short transactions. It achieves this by converting retransmission timeouts (RTOs) occuring due to tail losses (losses at end of transactions) into fast recovery. TLP transmits one packet in two round-trips when a connection is in Open state and isn't receiving any ACKs. The transmitted packet, aka loss probe, can be either new or a retransmission. When there is tail loss, the ACK from a loss probe triggers FACK/early-retransmit based fast recovery, thus avoiding a costly RTO. In the absence of loss, there is no change in the connection state. PTO stands for probe timeout. It is a timer event indicating that an ACK is overdue and triggers a loss probe packet. The PTO value is set to max(2*SRTT, 10ms) and is adjusted to account for delayed ACK timer when there is only one oustanding packet. TLP Algorithm On transmission of new data in Open state: -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms). -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms) -> PTO = min(PTO, RTO) Conditions for scheduling PTO: -> Connection is in Open state. -> Connection is either cwnd limited or no new data to send. -> Number of probes per tail loss episode is limited to one. -> Connection is SACK enabled. When PTO fires: new_segment_exists: -> transmit new segment. -> packets_out++. cwnd remains same. no_new_packet: -> retransmit the last segment. Its ACK triggers FACK or early retransmit based recovery. ACK path: -> rearm RTO at start of ACK processing. -> reschedule PTO if need be. In addition, the patch includes a small variation to the Early Retransmit (ER) algorithm, such that ER and TLP together can in principle recover any N-degree of tail loss through fast recovery. TLP is controlled by the same sysctl as ER, tcp_early_retrans sysctl. tcp_early_retrans==0; disables TLP and ER. ==1; enables RFC5827 ER. ==2; delayed ER. ==3; TLP and delayed ER. [DEFAULT] ==4; TLP only. The TLP patch series have been extensively tested on Google Web servers. It is most effective for short Web trasactions, where it reduced RTOs by 15% and improved HTTP response time (average by 6%, 99th percentile by 10%). The transmitted probes account for <0.5% of the overall transmissions. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-11VFIO-AER: Vfio-pci driver changes for supporting AERVijay Mohan Pandarathil
- New VFIO_SET_IRQ ioctl option to pass the eventfd that is signaled when an error occurs in the vfio_pci_device - Register pci_error_handler for the vfio_pci driver - When the device encounters an error, the error handler registered by the vfio_pci driver gets invoked by the AER infrastructure - In the error handler, signal the eventfd registered for the device. - This results in the qemu eventfd handler getting invoked and appropriate action taken for the guest. Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil@hp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-03-10NFC: llcp: Service Name Lookup netlink interfaceThierry Escande
This adds a netlink interface for service name lookup support. Multiple URIs can be passed nested into the NFC_ATTR_LLC_SDP attribute using the NFC_CMD_LLC_SDREQ netlink command. When the SNL reply is received, a NFC_EVENT_LLC_SDRES event is sent to the user space. URI and SAP tuples are passed back, nested into NFC_ATTR_LLC_SDP attribute. Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-03-10NFC: llcp: Implement socket optionsSamuel Ortiz
Some LLCP services (e.g. the validation ones) require some control over the LLCP link parameters like the receive window (RW) or the MIU extension (MIUX). This can only be done through socket options. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-03-08VSOCK: Split vm_sockets.h into kernel/uapiAndy King
Split the vSockets header into kernel and UAPI parts. The former gets the bits that used to be in __KERNEL__ guards, while the latter gets everything that is user-visible. Tested by compiling vsock (+transport) and a simple user-mode vSockets application. Reported-by: David Howells <dhowells@redhat.com> Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andy King <acking@vmware.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06htb: add HTB_DIRECT_QLEN attributeEric Dumazet
HTB uses an internal pfifo queue, which limit is not reported to userland tools (tc), and value inherited from device tx_queue_len at setup time. Introduce TCA_HTB_DIRECT_QLEN attribute to allow finer control. Remove two obsolete pr_err() calls as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06nl80211: user_mpm overrides auto_open_plinksThomas Pedersen
If the user requested a userspace MPM, automatically disable auto_open_plinks to fully disable the kernel MPM. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06nl80211: explicit userspace MPMThomas Pedersen
Secure mesh had the implicit requirement that the Mesh Peering Management entity be in userspace. However userspace might want to implement an open MPM as well, so specify a mesh setup parameter to indicate this. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06cfg80211: Extend support for IEEE 802.11r Fast BSS TransitionJouni Malinen
Add NL80211_CMD_UPDATE_FT_IES to support update of FT IEs to the WLAN driver and NL80211_CMD_FT_EVENT to send FT events from the WLAN driver. This will carry the target AP's MAC address along with the relevant Information Elements. This event is used to report received FT IEs (MDIE, FTIE, RSN IE, TIE, RICIE). These changes allow FT to be supported with drivers that use an internal SME instead of user space option (like FT implementation in wpa_supplicant with mac80211-based drivers). Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06cfg80211: add ability to override VHT capabilitiesJohannes Berg
For testing it's sometimes useful to be able to override certain VHT capability advertisement, add the ability to do that in cfg80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06nl80211: allow splitting wiphy information in dumpsJohannes Berg
The per-wiphy information is getting large, to the point where with more than the typical number of channels it's too large and overflows, and userspace can't get any of the information at all. To address this (in a way that doesn't require making all messages bigger) allow userspace to specify that it can deal with wiphy information split across multiple parts of the dump, and if it can split up the data. This also splits up each channel separately so an arbitrary number of channels can be supported. Additionally, since GET_WIPHY has the same problem, add support for filtering the wiphy dump and get information for a single wiphy only, this allows userspace apps to use dump in this case to retrieve all data from a single device. As userspace needs to know if all this this is supported, add a global nl80211 feature set and include a bit for this behaviour in it. Cc: Dennis H Jensen <dennis.h.jensen@siemens.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06cfg80211: comprehensively check station changesJohannes Berg
The station change API isn't being checked properly before drivers are called, and as a result it is difficult to see what should be allowed and what not. In order to comprehensively check the API parameters parse everything first, and then have the driver call a function (cfg80211_check_station_change()) with the additionally information about the kind of station that is being changed; this allows the function to make better decisions than the old code could. While at it, also add a few checks, particularly in mesh and clarify the TDLS station lifetime in documentation. To be able to reduce a few checks, ignore any flag set bits when the mask isn't set, they shouldn't be applied then. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06cfg80211: clean up mesh plink station change APIJohannes Berg
Make the ability to leave the plink_state unchanged not use a magic -1 variable that isn't in the enum, but an explicit change flag; reject invalid plink states or actions and move the needed constants for plink actions to the right header file. Also reject plink_state changes for non-mesh interfaces. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06xfrm: allow to avoid copying DSCP during encapsulationNicolas Dichtel
By default, DSCP is copying during encapsulation. Copying the DSCP in IPsec tunneling may be a bit dangerous because packets with different DSCP may get reordered relative to each other in the network and then dropped by the remote IPsec GW if the reordering becomes too big compared to the replay window. It is possible to avoid this copy with netfilter rules, but it's very convenient to be able to configure it for each SA directly. This patch adds a toogle for this purpose. By default, it's not set to maintain backward compatibility. Field flags in struct xfrm_usersa_info is full, hence I add a new attribute. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-03-05KVM: ioeventfd for virtio-ccw devices.Cornelia Huck
Enhance KVM_IOEVENTFD with a new flag that allows to attach to virtio-ccw devices on s390 via the KVM_VIRTIO_CCW_NOTIFY_BUS. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-05[media] s2255: convert to the control frameworkHans Verkuil
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-05[media] v4l: Define video buffer flag for the COPY timestamp typeKamil Debski
Define video buffer flag for the COPY timestamp. In this case the timestamp value is copied from the OUTPUT to the corresponding CAPTURE buffer. Signed-off-by: Kamil Debski <k.debski@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-03-05[media] bttv: convert to the control frameworkHans Verkuil
Note that the private chroma agc control has been replaced with the standard CHROMA_AGC control. Also fixes a mute/automute problem where closing the file handle would force mute on. That's not what you want since that would make the mute state out of sync with the mute control. Instead check against the user count. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>