linux-bcache.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2013-01-22	net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly	Daniel Wagner
	Commit 6a328d8c6f03501657ad580f6f98bf9a42583ff7 changed the update logic for the socket but it does not update the SCM_RIGHTS update as well. This patch is based on the net_prio fix commit 48a87cc26c13b68f6cce4e9d769fcb17a6b3e4b8 net: netprio: fd passed in SCM_RIGHTS datagram not set correctly A socket fd passed in a SCM_RIGHTS datagram was not getting updated with the new tasks cgrp prioidx. This leaves IO on the socket tagged with the old tasks priority. To fix this add a check in the scm recvmsg path to update the sock cgrp prioidx with the new tasks value. Let's apply the same fix for net_cls. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Reported-by: Li Zefan <lizefan@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21	openvswitch: Move LRO check from transmit to receive.	Jesse Gross
	The check for LRO packets was incorrectly put in the transmit path instead of on receive. Since this check is supposed to protect OVS (and other parts of the system) from packets that it cannot handle it is obviously not useful on egress. Therefore, this commit moves it back to the receive side. The primary problem that this caused is upcalls to userspace tried to segment the packet even though no segmentation information is available. This would later cause NULL pointer dereferences when skb_gso_segment() did nothing. Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-01-21	ipv4: Add a socket release callback for datagram sockets	Steffen Klassert
	This implements a socket release callback function to check if the socket cached route got invalid during the time we owned the socket. The function is used from udp, raw and ping sockets. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21	ipv4: Invalidate the socket cached route on pmtu events if possible	Steffen Klassert
	The route lookup in ipv4_sk_update_pmtu() might return a route different from the route we cached at the socket. This is because standart routes are per cpu, so each cpu has it's own struct rtable. This means that we do not invalidate the socket cached route if the NET_RX_SOFTIRQ is not served by the same cpu that the sending socket uses. As a result, the cached route reused until we disconnect. With this patch we invalidate the socket cached route if possible. If the socket is owened by the user, we can't update the cached route directly. A followup patch will implement socket release callback functions for datagram sockets to handle this case. Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21	xfrm4: Invalidate all ipv4 routes on IPsec pmtu events	Steffen Klassert
	On IPsec pmtu events we can't access the transport headers of the original packet, so we can't find the socket that sent the packet. The only chance to notify the socket about the pmtu change is to force a relookup for all routes. This patch implenents this for the IPsec protocols. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-21	xfrm: fix freed block size calculation in xfrm_policy_fini()	Michal Kubecek
	Missing multiplication of block size by sizeof(struct hlist_head) can cause xfrm_hash_free() to be called with wrong second argument so that kfree() is called on a block allocated with vzalloc() or __get_free_pages() or free_pages() is called with wrong order when a namespace with enough policies is removed. Bug introduced by commit a35f6c5d, i.e. versions >= 2.6.29 are affected. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-20	net: splice: fix __splice_segment()	Eric Dumazet
	commit 9ca1b22d6d2 (net: splice: avoid high order page splitting) forgot that skb->head could need a copy into several page frags. This could be the case for loopback traffic mostly. Also remove now useless skb argument from linear_to_page() and __splice_segment() prototypes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-20	net: splice: avoid high order page splitting	Eric Dumazet
	splice() can handle pages of any order, but network code tries hard to split them in PAGE_SIZE units. Not quite successfully anyway, as __splice_segment() assumed poff < PAGE_SIZE. This is true for the skb->data part, not necessarily for the fragments. This patch removes this logic to give the pages as they are in the skb. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-20	tcp: fix incorrect LOCKDROPPEDICMPS counter	Eric Dumazet
	commit 563d34d057 (tcp: dont drop MTU reduction indications) added an error leading to incorrect accounting of LINUX_MIB_LOCKDROPPEDICMPS If socket is owned by the user, we want to increment this SNMP counter, unless the message is a (ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one. Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-18	ipv6: Add an error handler for icmp6	Steffen Klassert
	pmtu and redirect events are now handled in the protocols error handler, so add an error handler for icmp6 to do this. It is needed in the case when we have no socket context. Based on a patch by Duan Jiong. Reported-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-18	net/xfrm/xfrm_replay: avoid division by zero	Nickolai Zeldovich
	All of the xfrm_replay->advance functions in xfrm_replay.c check if x->replay_esn->replay_window is zero (and return if so). However, one of them, xfrm_replay_advance_bmp(), divides by that value (in the '%' operator) before doing the check, which can potentially trigger a divide-by-zero exception. Some compilers will also assume that the earlier division means the value cannot be zero later, and thus will eliminate the subsequent zero check as dead code. This patch moves the division to after the check. Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-01-17	sctp: refactor sctp_outq_teardown to insure proper re-initalization	Neil Horman
	Jamie Parsons reported a problem recently, in which the re-initalization of an association (The duplicate init case), resulted in a loss of receive window space. He tracked down the root cause to sctp_outq_teardown, which discarded all the data on an outq during a re-initalization of the corresponding association, but never reset the outq->outstanding_data field to zero. I wrote, and he tested this fix, which does a proper full re-initalization of the outq, fixing this problem, and hopefully future proofing us from simmilar issues down the road. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17	libceph: pass num_op with ops	Alex Elder
	Both ceph_osdc_alloc_request() and ceph_osdc_build_request() are provided an array of ceph osd request operations. Rather than just passing the number of operations in the array, the caller is required append an additional zeroed operation structure to signal the end of the array. All callers know the number of operations at the time these functions are called, so drop the silly zero entry and supply that number directly. As a result, get_num_ops() is no longer needed. This also means that ceph_osdc_alloc_request() never uses its ops argument, so that can be dropped. Also rbd_create_rw_ops() no longer needs to add one to reserve room for the additional op. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: don't set pages or bio in ceph_osdc_alloc_request()	Alex Elder
	Only one of the two callers of ceph_osdc_alloc_request() provides page or bio data for its payload. And essentially all that function was doing with those arguments was assigning them to fields in the osd request structure. Simplify ceph_osdc_alloc_request() by having the caller take care of making those assignments Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: don't set flags in ceph_osdc_alloc_request()	Alex Elder
	The only thing ceph_osdc_alloc_request() really does with the flags value it is passed is assign it to the newly-created osd request structure. Do that in the caller instead. Both callers subsequently call ceph_osdc_build_request(), so have that function (instead of ceph_osdc_alloc_request()) issue a warning if a request comes through with neither the read nor write flags set. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: drop osdc from ceph_calc_raw_layout()	Alex Elder
	The osdc parameter to ceph_calc_raw_layout() is not used, so get rid of it. Consequently, the corresponding parameter in calc_layout() becomes unused, so get rid of that as well. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: drop snapid in ceph_calc_raw_layout()	Alex Elder
	A snapshot id must be provided to ceph_calc_raw_layout() even though it is not needed at all for calculating the layout. Where the snapshot id is needed is when building the request message for an osd operation. Drop the snapid parameter from ceph_calc_raw_layout() and pass that value instead in ceph_osdc_build_request(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: pass length to ceph_calc_file_object_mapping()	Alex Elder
	ceph_calc_file_object_mapping() takes (among other things) a "file" offset and length, and based on the layout, determines the object number ("bno") backing the affected portion of the file's data and the offset into that object where the desired range begins. It also computes the size that should be used for the request--either the amount requested or something less if that would exceed the end of the object. This patch changes the input length parameter in this function so it is used only for input. That is, the argument will be passed by value rather than by address, so the value provided won't get updated by the function. The value would only get updated if the length would surpass the current object, and in that case the value it got updated to would be exactly that returned in *oxlen. Only one of the two callers is affected by this change. Update ceph_calc_raw_layout() so it records any updated value. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: pass length to ceph_osdc_build_request()	Alex Elder
	The len argument to ceph_osdc_build_request() is set up to be passed by address, but that function never updates its value so there's no need to do this. Tighten up the interface by passing the length directly. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: kill op_needs_trail()	Alex Elder
	Since every osd message is now prepared to include trailing data, there's no need to check ahead of time whether any operations will make use of the trail portion of the message. We can drop the second argument to get_num_ops(), and as a result we can also get rid of op_needs_trail() which is no longer used. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: always allow trail in osd request	Alex Elder
	An osd request structure contains an optional trail portion, which if present will contain data to be passed in the payload portion of the message containing the request. The trail field is a ceph_pagelist pointer, and if null it indicates there is no trail. A ceph_pagelist structure contains a length field, and it can legitimately hold value 0. Make use of this to change the interpretation of the "trail" of an osd request so that every osd request has trailing data, it just might have length 0. This means we change the r_trail field in a ceph_osd_request structure from a pointer to a structure that is always initialized. Note that in ceph_osdc_start_request(), the trail pointer (or now address of that structure) is assigned to a ceph message's trail field. Here's why that's still OK (looking at net/ceph/messenger.c): - What would have resulted in a null pointer previously will now refer to a 0-length page list. That message trail pointer is used in two functions, write_partial_msg_pages() and out_msg_pos_next(). - In write_partial_msg_pages(), a null page list pointer is handled the same as a message with 0-length trail, and both result in a "in_trail" variable set to false. The trail pointer is only used if in_trail is true. - The only other place the message trail pointer is used is out_msg_pos_next(). That function is only called by write_partial_msg_pages() and only touches the trail pointer if the in_trail value it is passed is true. Therefore a null ceph_msg->trail pointer is equivalent to a non-null pointer referring to a 0-length page list structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	rbd: drop oid parameters from ceph_osdc_build_request()	Alex Elder
	The last two parameters to ceph_osd_build_request() describe the object id, but the values passed always come from the osd request structure whose address is also provided. Get rid of those last two parameters. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	libceph: reformat __reset_osd()	Alex Elder
	Reformat __reset_osd() into three distinct blocks of code handling the three return cases. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-17	crush: avoid recursion if we have already collided	Sage Weil
	This saves us some cycles, but does not affect the placement result at all. This corresponds to ceph.git commit 4abb53d4f. Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-17	libceph: for chooseleaf rules, retry CRUSH map descent from root if leaf is ↵	Jim Schutt
	failed Add libceph support for a new CRUSH tunable recently added to Ceph servers. Consider the CRUSH rule step chooseleaf firstn 0 type <node_type> This rule means that <n> replicas will be chosen in a manner such that each chosen leaf's branch will contain a unique instance of <node_type>. When an object is re-replicated after a leaf failure, if the CRUSH map uses a chooseleaf rule the remapped replica ends up under the <node_type> bucket that held the failed leaf. This causes uneven data distribution across the storage cluster, to the point that when all the leaves but one fail under a particular <node_type> bucket, that remaining leaf holds all the data from its failed peers. This behavior also limits the number of peers that can participate in the re-replication of the data held by the failed leaf, which increases the time required to re-replicate after a failure. For a chooseleaf CRUSH rule, the tree descent has two steps: call them the inner and outer descents. If the tree descent down to <node_type> is the outer descent, and the descent from <node_type> down to a leaf is the inner descent, the issue is that a down leaf is detected on the inner descent, so only the inner descent is retried. In order to disperse re-replicated data as widely as possible across a storage cluster after a failure, we want to retry the outer descent. So, fix up crush_choose() to allow the inner descent to return immediately on choosing a failed leaf. Wire this up as a new CRUSH tunable. Note that after this change, for a chooseleaf rule, if the primary OSD in a placement group has failed, choosing a replacement may result in one of the other OSDs in the PG colliding with the new primary. This requires that OSD's data for that PG to need moving as well. This seems unavoidable but should be relatively rare. This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc. Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-17	ceph: re-calculate truncate_size for strip object	Yan, Zheng
	Otherwise osd may truncate the object to larger size. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-17	Merge branch 'master' of ↵	John W. Linville
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-01-17	ipv4: Don't update the pmtu on mtu locked routes	Steffen Klassert
	Routes with locked mtu should not use learned pmtu informations, so do not update the pmtu on these routes. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17	ipv4: Remove output route check in ipv4_mtu	Steffen Klassert
	The output route check was introduced with git commit 261663b0 (ipv4: Don't use the cached pmtu informations for input routes) during times when we cached the pmtu informations on the inetpeer. Now the pmtu informations are back in the routes, so this check is obsolete. It also had some unwanted side effects, as reported by Timo Teras and Lukas Tribus. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-17	ipv6: fix header length calculation in ip6_append_data()	Romain KUNTZ
	Commit 299b0767 (ipv6: Fix IPsec slowpath fragmentation problem) has introduced a error in the header length calculation that provokes corrupted packets when non-fragmentable extensions headers (Destination Option or Routing Header Type 2) are used. rt->rt6i_nfheader_len is the length of the non-fragmentable extension header, and it should be substracted to rt->dst.header_len, and not to exthdrlen, as it was done before commit 299b0767. This patch reverts to the original and correct behavior. It has been successfully tested with and without IPsec on packets that include non-fragmentable extensions headers. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-16	mac80211: add encrypt headroom to PERR frames	Bob Copeland
	Mesh PERR action frames are robust and thus may be encrypted, so add proper head/tailroom to allow this. Fixes this warning when operating a Mesh STA on ath5k: WARNING: at net/mac80211/wpa.c:427 ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]() Call Trace: [<c011c5e7>] warn_slowpath_common+0x63/0x78 [<c011c60b>] warn_slowpath_null+0xf/0x13 [<e090621d>] ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211] [<e090685c>] ieee80211_crypto_ccmp_encrypt+0x1f/0x37 [mac80211] [<e0917113>] invoke_tx_handlers+0xcad/0x10bd [mac80211] [<e0917665>] ieee80211_tx+0x87/0xb3 [mac80211] [<e0918932>] ieee80211_tx_pending+0xcc/0x170 [mac80211] [<c0121c43>] tasklet_action+0x3e/0x65 Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16	mac80211: set NEED_TXPROCESSING for PERR frames	Bob Copeland
	A user reported warnings in ath5k due to transmitting frames with no rates set up. The frames were Mesh PERR frames, and some debugging showed an empty control block with just the vif pointer: > [ 562.522682] XXX txinfo: 00000000: 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 00 ................ > [ 562.522688] XXX txinfo: 00000010: 00 00 00 00 00 00 00 00 54 b8 f2 > db 00 00 00 00 ........T....... > [ 562.522693] XXX txinfo: 00000020: 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 00 ................ Set the IEEE80211_TX_INTFL_NEED_TXPROCESSING flag to ensure that rate control gets run before the frame is sent. Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16	mac80211: fix monitor mode injection	Felix Fietkau
	Channel contexts are not always used with monitor interfaces. If no channel context is set, use the oper channel, otherwise tx fails. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [check local->use_chanctx] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16	mac80211: synchronize scan off/on-channel and PS states	Stanislaw Gruszka
	Since: commit b23b025fe246f3acc2988eb6d400df34c27cb8ae Author: Ben Greear <greearb@candelatech.com> Date: Fri Feb 4 11:54:17 2011 -0800 mac80211: Optimize scans on current operating channel. we do not disable PS while going back to operational channel (on ieee80211_scan_state_suspend) and deffer that until scan finish. But since we are allowed to send frames, we can send a frame to AP without PM bit set, so disable PS on AP side. Then when we switch to off-channel (in ieee80211_scan_state_resume) we do not enable PS. Hence we are off-channel with PS disabled, frames are not buffered by AP. To fix remove offchannel_ps_disable argument and always enable PS when going off-channel and disable it when going on-channel, like it was before. Cc: stable@vger.kernel.org # 2.6.39+ Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-16	mac80211: fix FT roaming	Johannes Berg
	During FT roaming, wpa_supplicant attempts to set the key before association. This used to be rejected, but as a side effect of my commit 66e67e418908442389d3a9e ("mac80211: redesign auth/assoc") the key was accepted causing hardware crypto to not be used for it as the station isn't added to the driver yet. It would be possible to accept the key and then add it to the driver when the station has been added. However, this may run into issues with drivers using the state- based station adding if they accept the key only after association like it used to be. For now, revert to the behaviour from before the auth and assoc change. Cc: stable@vger.kernel.org Reported-by: Cédric Debarge <cedric.debarge@acksys.fr> Tested-by: Cédric Debarge <cedric.debarge@acksys.fr> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-14	Merge branch 'master' of git://1984.lsi.us.es/nf	David S. Miller
	Pablo Neira Ayuso says: ==================== The following patchset contains netfilter fixes for 3.8-rc3, they are: * fix possible BUG_ON if several netns are in use and the nf_conntrack module is removed, initial patch from Gao feng, final patch from myself. * fix unset return value if conntrack zone are disabled at compile-time, reported by Borislav Petkov, fix from myself. * fix display error message via dmesg for arp_tables, from Jan Engelhardt. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14	tcp: fix a panic on UP machines in reqsk_fastopen_remove	Eric Dumazet
	spin_is_locked() on a non !SMP build is kind of useless. BUG_ON(!spin_is_locked(xx)) is guaranteed to crash. Just remove this check in reqsk_fastopen_remove() as the callers do hold the socket lock. Reported-by: Ketan Kulkarni <ketkulka@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Dave Taht <dave.taht@gmail.com> Acked-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Fix regression allowing IP_TTL setting of zero, fix from Cong Wang. 2) Fix leak regressions in tunap, from Jason Wang. 3) be2net driver always returns IRQ_HANDLED in INTx handler, fix from Sathya Perla. 4) qlge doesn't really support NETIF_F_TSO6, don't set that flag. Fix from Amerigo Wang. 5) Add 802.11ad Atheros wil6210 driver, from Vladimir Kondratiev. 6) Fix MTU calculations in mac80211 layer, from T Krishna Chaitanya. 7) Station info layer of mac80211 needs to use del_timer_sync(), from Johannes Berg. 8) tcp_read_sock() can loop forever, because we don't immediately stop when recv_actor() returns zero. Fix from Eric Dumazet. 9) Fix WARN_ON() in tcp_cleanup_rbuf(). We have to use sk_eat_skb() in tcp_recv_skb() to handle the case where a large GRO packet is split up while it is use by a splice() operation. Fix also from Eric Dumazet. 10) addrconf_get_prefix_route() in ipv6 tests flags incorrectly, it does: if (X && (p->flags & Y) != 0) when it really meant to go: if (X && (p->flags & X) != 0) fix from Romain Kuntz. 11) Fix lost Kconfig dependency for bfin_mac driver hardware timestamping. From Lars-Peter Clausen. 12) Fix regression in handling of RST without ACK in TCP, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits) be2net: fix unconditionally returning IRQ_HANDLED in INTx tuntap: fix leaking reference count tuntap: forbid calling TUNSETIFF when detached tuntap: switch to use rtnl_dereference() net, wireless: overwrite default_ethtool_ops qlge: remove NETIF_F_TSO6 flag tcp: accept RST without ACK flag net: ethernet: xilinx: Do not use NO_IRQ in axienet net: ethernet: xilinx: Do not use axienet on PPC bnx2x: Allow management traffic after boot from SAN bnx2x: Fix fastpath structures when memory allocation fails bfin_mac: Restore hardware time-stamping dependency on BF518 tun: avoid owner checks on IFF_ATTACH_QUEUE bnx2x: move debugging code before the return tuntap: refuse to re-attach to different tun_struct ipv6: use addrconf_get_prefix_route for prefix route lookup [v2] ipv6: fix the noflags test in addrconf_get_prefix_route tcp: fix splice() and tcp collapsing interaction tcp: splice: fix an infinite loop in tcp_read_sock() net: prevent setting ttl=0 via IP_TTL ...
2013-01-13	netfilter: x_tables: print correct hook names for ARP	Jan Engelhardt
	arptables 0.0.4 (released on 10th Jan 2013) supports calling the CLASSIFY target, but on adding a rule to the wrong chain, the diagnostic is as follows: # arptables -A INPUT -j CLASSIFY --set-class 0:0 arptables: Invalid argument # dmesg \| tail -n1 x_tables: arp_tables: CLASSIFY target: used from hooks PREROUTING, but only usable from INPUT/FORWARD This is incorrect, since xt_CLASSIFY.c does specify (1 << NF_ARP_OUT) \| (1 << NF_ARP_FORWARD). This patch corrects the x_tables diagnostic message to print the proper hook names for the NFPROTO_ARP case. Affects all kernels down to and including v2.6.31. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-01-12	netfilter: nf_conntrack: fix BUG_ON while removing nf_conntrack with netns	Pablo Neira Ayuso
	canqun zhang reported that we're hitting BUG_ON in the nf_conntrack_destroy path when calling kfree_skb while rmmod'ing the nf_conntrack module. Currently, the nf_ct_destroy hook is being set to NULL in the destroy path of conntrack.init_net. However, this is a problem since init_net may be destroyed before any other existing netns (we cannot assume any specific ordering while releasing existing netns according to what I read in recent emails). Thanks to Gao feng for initial patch to address this issue. Reported-by: canqun zhang <canqunzhang@gmail.com> Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-01-11	net, wireless: overwrite default_ethtool_ops	Stanislaw Gruszka
	Since: commit 2c60db037034d27f8c636403355d52872da92f81 Author: Eric Dumazet <edumazet@google.com> Date: Sun Sep 16 09:17:26 2012 +0000 net: provide a default dev->ethtool_ops wireless core does not correctly assign ethtool_ops. After alloc_netdev*() call, some cfg80211 drivers provide they own ethtool_ops, but some do not. For them, wireless core provide generic cfg80211_ethtool_ops, which is assigned in NETDEV_REGISTER notify call: if (!dev->ethtool_ops) dev->ethtool_ops = &cfg80211_ethtool_ops; But after Eric's commit, dev->ethtool_ops is no longer NULL (on cfg80211 drivers without custom ethtool_ops), but points to &default_ethtool_ops. In order to fix the problem, provide function which will overwrite default_ethtool_ops and use it by wireless core. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-11	Merge tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client bugfix from Trond Myklebust: - Fix a socket lock leak in net/sunrpc/xprt.c * tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
2013-01-10	tcp: accept RST without ACK flag	Eric Dumazet
	commit c3ae62af8e755 (tcp: should drop incoming frames without ACK flag set) added a regression on the handling of RST messages. RST should be allowed to come even without ACK bit set. We validate the RST by checking the exact sequence, as requested by RFC 793 and 5961 3.2, in tcp_validate_incoming() Reported-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10	nfs: fix sunrpc/clnt.c kernel-doc warnings	Randy Dunlap
	Fix new kernel-doc warnings in clnt.c: Warning(net/sunrpc/clnt.c:561): No description found for parameter 'flavor' Warning(net/sunrpc/clnt.c:561): Excess function parameter 'auth' description in 'rpc_clone_client_set_auth' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-10	ipv6: use addrconf_get_prefix_route for prefix route lookup [v2]	Romain Kuntz
	Replace ip6_route_lookup() with addrconf_get_prefix_route() when looking up for a prefix route. This ensures that the connected prefix is looked up in the main table, and avoids the selection of other matching routes located in different tables as well as blackhole or prohibited entries. In addition, this fixes an Opps introduced by commit 64c6d08e (ipv6: del unreachable route when an addr is deleted on lo), that would occur when a blackhole or prohibited entry is selected by ip6_route_lookup(). Such entries have a NULL rt6i_table argument, which is accessed by __ip6_del_rt() when trying to lock rt6i_table->tb6_lock. The function addrconf_is_prefix_route() is not used anymore and is removed. [v2] Minor indentation cleanup and log updates. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10	ipv6: fix the noflags test in addrconf_get_prefix_route	Romain Kuntz
	The tests on the flags in addrconf_get_prefix_route() does no make much sense: the 'noflags' parameter contains the set of flags that must not match with the route flags, so the test must be done against 'noflags', and not against 'flags'. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10	tcp: fix splice() and tcp collapsing interaction	Eric Dumazet
	Under unusual circumstances, TCP collapse can split a big GRO TCP packet while its being used in a splice(socket->pipe) operation. skb_splice_bits() releases the socket lock before calling splice_to_pipe(). [ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc() [ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]- [ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13 To fix this problem, we must eat skbs in tcp_recv_skb(). Remove the inline keyword from tcp_recv_skb() definition since it has three call sites. Reported-by: Christian Becker <c.becker@traviangames.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10	tcp: splice: fix an infinite loop in tcp_read_sock()	Eric Dumazet
	commit 02275a2ee7c0 (tcp: don't abort splice() after small transfers) added a regression. [ 83.843570] INFO: rcu_sched self-detected stall on CPU [ 83.844575] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 0, t=21002 jiffies, g=4457, c=4456, q=13132) [ 83.844582] Task dump for CPU 6: [ 83.844584] netperf R running task 0 8966 8952 0x0000000c [ 83.844587] 0000000000000000 0000000000000006 0000000000006c6c 0000000000000000 [ 83.844589] 000000000000006c 0000000000000096 ffffffff819ce2bc ffffffffffffff10 [ 83.844592] ffffffff81088679 0000000000000010 0000000000000246 ffff880c4b9ddcd8 [ 83.844594] Call Trace: [ 83.844596] [<ffffffff81088679>] ? vprintk_emit+0x1c9/0x4c0 [ 83.844601] [<ffffffff815ad449>] ? schedule+0x29/0x70 [ 83.844606] [<ffffffff81537bd2>] ? tcp_splice_data_recv+0x42/0x50 [ 83.844610] [<ffffffff8153beaa>] ? tcp_read_sock+0xda/0x260 [ 83.844613] [<ffffffff81537b90>] ? tcp_prequeue_process+0xb0/0xb0 [ 83.844615] [<ffffffff8153c0f0>] ? tcp_splice_read+0xc0/0x250 [ 83.844618] [<ffffffff814dc0c2>] ? sock_splice_read+0x22/0x30 [ 83.844622] [<ffffffff811b820b>] ? do_splice_to+0x7b/0xa0 [ 83.844627] [<ffffffff811ba4bc>] ? sys_splice+0x59c/0x5d0 [ 83.844630] [<ffffffff8119745b>] ? putname+0x2b/0x40 [ 83.844633] [<ffffffff8118bcb4>] ? do_sys_open+0x174/0x1e0 [ 83.844636] [<ffffffff815b6202>] ? system_call_fastpath+0x16/0x1b if recv_actor() returns 0, we should stop immediately, because looping wont give a chance to drain the pipe. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 patches from Martin Schwidefsky: "Add the finit_module system call, fix the irq statistics in /proc/stat, fix a s390dbf lockdep problem, a patch revert for a problem that is not 100% understood yet, and a few patches to fix warnings." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: define read*_relaxed functions s390/topology: export cpu_topology s390/pm: export pm_power_off s390/pci: define isa_dma_bridge_buggy s390/3215: partially revert tty close handling fix s390/irq: count cpu restart events s390/irq: remove split irq fields from /proc/stat s390/irq: enable irq sum accounting for /proc/stat again s390/syscalls: wire up finit_module syscall s390/pci: remove dead code s390/smp: fix section mismatch for smp_add_present_cpu() s390/debug: Fix s390dbf lockdep problem in debug_(un)register_view()
2013-01-10	netfilter: xt_CT: fix unset return value if conntrack zone are disabled	Pablo Neira Ayuso
	net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v1’: net/netfilter/xt_CT.c:250:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v0’: net/netfilter/xt_CT.c:112:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized] Reported-by: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>