Age | Commit message | Author
2014-08-22 | ARM: shmobile: r8a7791: add missing 0x0100 for SDCKCR | Kuninori Morimoto
4bfb358b1d6cdeff8c6a13677f01ed78e9696b98 (ARM: shmobile: Add r8a7791 legacy SDHI clocks) added r8a7791 SDHI clock support, but it is missing the "0x0100: x 1/8" division ratio. This patch fixes that hidden bug. It is based on R-Car H2 v0.7, R-Car M2 v0.9. Reported-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2014-08-22 | ARM: shmobile: r8a7790: add missing 0x0100 for SDCKCR | Kuninori Morimoto
9f13ee6f83c52065112d3e396e42e3780911ef53 (ARM: shmobile: r8a7790: add div4 clocks) added r8a7790 DIV4 clock support, but it is missing the "0x0100: x 1/8" division ratio. This patch fixes that hidden bug. It is based on R-Car H2 v0.7, R-Car M2 v0.9. Reported-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2014-08-21 | bnx2x: Revert UNDI flushing mechanism | Yuval Mintz
Commit 91ebb929b6f8 ("bnx2x: Add support for Multi-Function UNDI") [which was later supposedly fixed by de682941eef3 ("bnx2x: Fix UNDI driver unload")] introduced a bug: in some [yet-to-be-determined] scenarios, the alternative flushing mechanism, which was meant to guarantee that the Rx buffers are empty before resetting them during device probe, fails. If this happens, a fatal attention occurs the next time the device is loaded. Since this most likely happens in boot-from-SAN scenarios, the machine will fail to load. Notice this may occur not only in the 'Multi-Function' scenario but in the regular scenario as well, i.e., this introduced a regression in the driver's ability to perform boot from SAN. The patch reverts the mechanism and applies the old scheme to multi-function devices as well as to single-function devices. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | Merge branch 'qlcnic' | David S. Miller
Shahed Shaikh says: ==================== qlcnic: Bug fixes This series fixes some bugs related to endianness. Please apply this series to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | qlcnic: Fix endianess issue in firmware load from file operation | Shahed Shaikh
The firmware binary file is little endian. On big-endian architectures, while writing this binary FW file to the adapter's memory, writel() swaps the data, resulting in corruption of the FW image. So, swap the data before writing it into the adapter's memory. Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
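The corruption described above can be reproduced outside the kernel. The following is a minimal sketch in Python, not the driver's code: `writel_sim` is a hypothetical stand-in that models `writel()` delivering a CPU-order value to the device in little-endian byte order, and the firmware bytes are made up.

```python
import struct

def load_word(fw_bytes: bytes, cpu_order: str) -> int:
    """Read 4 firmware bytes as a native u32, as a CPU of the given order would."""
    fmt = "<I" if cpu_order == "little" else ">I"
    return struct.unpack(fmt, fw_bytes)[0]

def writel_sim(value: int) -> bytes:
    """Hypothetical model of writel(): the device receives little-endian bytes."""
    return struct.pack("<I", value)

def swab32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")

fw = bytes([0x11, 0x22, 0x33, 0x44])  # made-up little-endian firmware word

# Little-endian CPU: the bytes reach the adapter unchanged.
le = writel_sim(load_word(fw, "little"))
# Big-endian CPU without the fix: the word arrives byte-reversed.
be_broken = writel_sim(load_word(fw, "big"))
# Big-endian CPU with a swap before the write: the word is intact again.
be_fixed = writel_sim(swab32(load_word(fw, "big")))
```

In this model the swap is only needed on the big-endian path, which matches the commit's description of the bug as specific to big-endian architectures.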
2014-08-21 | qlcnic: Fix endianess issue in FW dump template header | Rajesh Borundia
The firmware dump template header is read from the adapter using readl(), which swaps the data. So, align the structure elements on 32-bit dword boundaries. Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | qlcnic: Fix flash access interface to application | Jitendra Kalsaria
The application expects flash data in little endian, but the driver reads/writes flash data using the readl()/writel() APIs, which swap data on big-endian machines. So, swap the data after reading from and before writing to flash memory. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver | Alan Ott
Alan is the original author of the driver. This change was discussed with the 802.15.4 subsystem maintainer, Alexander Aring. Signed-off-by: Alan Ott <alan@signal11.us> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | macvlan: Allow setting multicast filter on all macvlan types | Vlad Yasevich
Currently, macvlan code restricts multicast and unicast filter setting only to passthru devices. As a result, if a guest using macvtap wants to receive multicast traffic, it has to set IFF_ALLMULTI or IFF_PROMISC. This patch makes it possible to use the fdb interface to add multicast addresses to the filter thus allowing a guest to receive only targeted multicast traffic. CC: John Fastabend <john.r.fastabend@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Jason Wang <jasowang@redhat.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | packet: handle too big packets for PACKET_V3 | Eric Dumazet
af_packet can currently overwrite kernel memory by out-of-bounds accesses, because it assumed a [new] block can always hold one frame. This is not generally the case, even if most existing tools do it right. This patch clamps too-long frames, as the API permits, and issues a one-time error on syslog.
[ 394.357639] tpacket_rcv: packet too big, clamped from 5042 to 3966. macoff=82
In this example, packet header tp_snaplen was set to 3966, and tp_len was set to 5042 (skb->len). Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
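The clamping arithmetic in the log line can be sketched as follows. This is an illustration only, not the kernel code: `clamp_snaplen` and `max_frame_len` are hypothetical names, and the capacity 4048 is an assumption chosen so the numbers match the example (3966 + 82); it is not stated in the commit message.

```python
def clamp_snaplen(snaplen: int, macoff: int, max_frame_len: int) -> int:
    """Return a snapshot length such that macoff + snaplen fits in max_frame_len."""
    if macoff + snaplen > max_frame_len:
        # The frame would run past the end of the block: shorten the copy.
        return max(max_frame_len - macoff, 0)
    return snaplen

# Values from the syslog example; 4048 is an assumed block capacity.
clamped = clamp_snaplen(5042, macoff=82, max_frame_len=4048)
unchanged = clamp_snaplen(1500, macoff=82, max_frame_len=4048)
```

The original packet length (tp_len, 5042 here) is reported unchanged, so a reader of the ring can still tell that the frame was truncated.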
2014-08-21 | MAINTAINERS: add entry for ec_bhf driver | Dariusz Marcinkiewicz
Added entry for ec_bhf driver. Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-21 | lec: Use rtnl lock/unlock when updating MTU | chas williams - CONTRACTOR
The LECS response contains the MTU that should be used. Correctly synchronize with other layers when updating. Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-22 | Merge tag 'drm-intel-fixes-2014-08-21' of git://anongit.freedesktop.org/drm-intel | Dave Airlie
Display fixes from Ville and Imre, all cc: stable.
* tag 'drm-intel-fixes-2014-08-21' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: don't try to retrain a DP link on an inactive CRTC
  drm/i915: make sure VDD is turned off during system suspend
  drm/i915: cancel hotplug and dig_port work during suspend and unload
  drm/i915: fix HPD IRQ reenable work cancelation
  drm/i915: take display port power domain in DP HPD handler
  drm/i915: Don't try to enable cursor from setplane when crtc is disabled
  drm/i915: Skip load detect when intel_crtc->new_enable==true
  drm/i915: Fix locking for intel_enable_pipe_a()
2014-08-22 | Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux | Dave Airlie
more radeon fixes
* 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux:
  Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe"
  drm/radeon: fix active_cu mask on SI and CIK after re-init (v3)
  drm/radeon: fix active cu count for SI and CIK
  drm/radeon: re-enable selective GPUVM flushing
  drm/radeon: Sync ME and PFP after CP semaphore waits v4
  drm/radeon: fix display handling in radeon_gpu_reset
  drm/radeon: fix pm handling in radeon_gpu_reset
  drm/radeon: Only flush HDP cache for indirect buffers from userspace
  drm/radeon: properly document reloc priority mask
2014-08-21 | Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata | Linus Torvalds
Pull libata fixes from Tejun Heo:
"Nothing drastic but pushing out early due to build breakage in the new tegra platform. Additionally:
 - M550 tagged trim blacklist pattern is widened so that it matches the new 1TB model
 - three controller specific fixes"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: widen Crucial M550 blacklist matching
  pata_scc: propagate return value of scc_wait_after_reset
  ata: ahci_tegra: Change include to fix compilation
  pata_samsung_cf: change ret type to signed
  ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host Controller driver.
2014-08-21 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid | Linus Torvalds
Pull HID fixes from Jiri Kosina:
 - fixes for a couple potential memory corruption problems (the HW would have to be manufactured to be deliberately evil to trigger those) found by Ben Hawkes
 - fix for potential infinite loop when using sysfs interface of logitech driver, from Simon Wood
 - a couple more simple driver fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: fix a couple of off-by-ones
  HID: logitech: perform bounds checking on device_id early enough
  HID: logitech: fix bounds checking on LED report size
  HID: logitech: Prevent possibility of infinite loop when using /sys interface
  HID: rmi: print an error if F11 is not found instead of stopping the device
  HID: hid-sensor-hub: use devm_ functions consistently
  HID: huion: Use allocated buffer for DMA
  HID: huion: Fail on parameter retrieval errors
2014-08-21 | Merge tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound | Linus Torvalds
Pull sound fixes from Takashi Iwai:
"A bunch of ASoC fixes with a few HD-audio fixes in this pull request. All fairly small, boring and device-specific fixes, in addition to MAINTAINERS update for better reviewing"
* tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/hdmi - apply Valleyview fix-ups to Cherryview display codec
  ALSA: hda/hdmi - set depop_delay for haswell plus
  ALSA: hda - restore the gpio led after resume
  ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co
  ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE
  ASoC: fsl-esai: Revert .xlate_tdm_slot_mask() support
  ASoC: mcasp: Fix implicit BLCK divider setting
  ASoC: arizona: Fix TDM slot length handling in arizona_hw_params
  ASoC: pcm512x: Correct Digital Playback control names
  ASoC: dapm: Fix uninitialized variable in snd_soc_dapm_get_enum_double()
  ASoC: Intel: Restore Baytrail ADSP streams only when ADSP was in reset
  ASoC: Intel: Wait Baytrail ADSP boot at resume_early stage
  ASoC: Intel: Merge Baytrail ADSP suspend_noirq into suspend_late
  MAINTAINERS: Add i.MX maintainers and paths to Freescale ASoC entry
  ASoC: Intel: Update Baytrail ADSP firmware name
2014-08-21 | Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux | Linus Torvalds
Pull i2c fixes from Wolfram Sang:
"Here is the fixup for the 'lowlight' of my last pull request. I2C is not selected anymore by I2C_ACPI. Instead, the code in question now depends on I2C=y. Also, Mika has agreed to support me and be the maintainer for I2C-ACPI related patches. Finally, a new-ID-patch came along last week"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: add maintainer for ACPI parts of I2C
  i2c: i801: Add PCI ID for Intel Braswell
  i2c: rework kernel config I2C_ACPI
2014-08-21 | Merge tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux | Linus Torvalds
Pull ia64 update from Tony Luck:
"Add memfd_create syscall to ia64"
* tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Wire up memfd_create() system call
2014-08-21 | Merge tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze | Linus Torvalds
Pull microblaze update from Michal Simek:
"Wire-up seccomp/getrandom/memfd_create syscalls"
* tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Wire-up memfd_create syscall
  microblaze: Wire-up getrandom syscall
  microblaze: Wire-up seccomp syscall
2014-08-21 | f2fs: introduce need_do_checkpoint for readability | Chao Yu
This patch introduces need_do_checkpoint() to gather the numerous judgment conditions in one place for readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: fix incorrect calculation with total/free inode num | Chao Yu
Theoretically, our total inode number is the same as the total node number, but three node ids are reserved in f2fs: 0, 1 (node nid), and 2 (meta nid). They should never be used by users, so the total/free inode number calculated in ->statfs is wrong. This patch introduces F2FS_RESERVED_NODE_NUM and then fixes this issue by recalculating the total/free inode number with the macro. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
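The corrected accounting can be sketched as plain arithmetic. This is a hedged illustration rather than the f2fs code: the function name and the sample counts are hypothetical, and the value 3 for F2FS_RESERVED_NODE_NUM is inferred from the three reserved node ids named in the message.

```python
from typing import Tuple

# Inferred from the message: node ids 0, 1 (node nid) and 2 (meta nid).
F2FS_RESERVED_NODE_NUM = 3

def statfs_inode_counts(total_node_count: int, valid_inode_count: int) -> Tuple[int, int]:
    """Return (total, free) inode numbers with the reserved node ids excluded."""
    total = total_node_count - F2FS_RESERVED_NODE_NUM
    free = total - valid_inode_count
    return total, free

# Hypothetical file system with 1000 node ids, 10 of them in use as inodes.
total, free = statfs_inode_counts(total_node_count=1000, valid_inode_count=10)
```

Without the subtraction, both figures would be overstated by the number of reserved ids.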
2014-08-21 | f2fs: remove rename and use rename2 | Jaegeuk Kim
Refer to the following patch.
commit 7177a9c4b509eb357cc450256bc3cf39f1a1e639
Author: Miklos Szeredi <mszeredi@suse.cz>
Date: Wed Jul 23 15:15:30 2014 +0200
    fs: call rename2 if exists
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: skip if inline_data was converted already | Jaegeuk Kim
This patch checks once more, under the inode page lock, whether the inode's inline_data has already been converted. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: remove rewrite_node_page | Jaegeuk Kim
I think we need to let the dirty node pages remain in the page cache instead of rewriting them in their places. So, after done with successful recovery, write_checkpoint will flush all of them through the normal write path. Through this, we can avoid potential error cases in terms of block allocation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: avoid double lock in truncate_blocks | Jaegeuk Kim
init_inode_metadata calls truncate_blocks when an error occurs. The callers hold f2fs_lock_op, so we should not call it again in truncate_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: prevent checkpoint during roll-forward | Jaegeuk Kim
No checkpoint should be done during the core roll-forward procedure, and this includes the error cases too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: add WARN_ON in f2fs_bug_on | Jaegeuk Kim
This patch adds WARN_ON when f2fs_bug_on is disabled, so that kernel messages are still visible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: handle EIO not to break fs consistency | Jaegeuk Kim
There are two rules when EIO occurs.
1. don't write any checkpoint data to preserve the previous checkpoint
2. don't lose the cached dentry/node/meta pages
So, at first, this patch adds set_page_dirty in f2fs_write_end_io's failure. Then, writing checkpoint/dentry/node blocks is not allowed. Note that, for the data pages, we can't just throw away by redirtying them. Otherwise, kworker can fall into an infinite loop to flush them. (Ref. xfstests/019) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | spi: davinci: fix SPI_NO_CS functionality | Grygorii Strashko
The driver should not touch CS lines if SPI_NO_CS flag is set. This patch fixes it as this functionality was broken accidentally by commit a88e34ea213e1b ("spi: davinci: add support to configure gpio cs through dt"). Fixes: a88e34ea213e1b ("spi: davinci: add support to configure gpio cs through dt") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Mark Brown <broonie@linaro.org>
2014-08-21 | cifs: remove unneeded check of null checking in if condition | Namjae Jeon
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Steve French <smfrench@gmail.com>
2014-08-21 | cifs: fix a possible use of uninit variable in SMB2_sess_setup | Namjae Jeon
In case of error, goto ssetup_exit can be hit and we could end up using uninitialized value of resp_buftype Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Steve French <smfrench@gmail.com>
2014-08-21 | cifs: fix memory leak when password is supplied multiple times | Namjae Jeon
Unlikely but possible. When password is supplied multiple times, we have to free the previous allocation. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Steve French <smfrench@gmail.com>
2014-08-21 | cifs: fix a possible null pointer deref in decode_ascii_ssetup | Namjae Jeon
When kzalloc fails, we will end up doing a NULL pointer dereference. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Steve French <smfrench@gmail.com>
2014-08-21 | f2fs: check s_dirty under cp_mutex | Jaegeuk Kim
It needs to check s_dirty under cp_mutex, since s_dirty is reset under that mutex. And previous condition was not correct, since we can omit doing checkpoint when checkpoint was done followed by all the node pages were written back. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: unlock_page when node page is redirtied out | Jaegeuk Kim
This patch fixes missing unlock_page when a node page is redirtied out. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: introduce f2fs_cp_error for readability | Jaegeuk Kim
This patch adds f2fs_cp_error for readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: give a chance to mount again when encountering errors | Jaegeuk Kim
This patch gives the mount process another chance when we encounter an error. This affects roll-forward recovery failures as well. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | f2fs: trigger release_dirty_inode in f2fs_put_super | Jaegeuk Kim
generic_shutdown_super calls sync_filesystem, evict_inode, and then f2fs_put_super. f2fs_evict_inode leaves some dirty inode information behind, so we should release it in f2fs_put_super. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 | HID: fix a couple of off-by-ones | Jiri Kosina
There are a few very theoretical off-by-one bugs in report descriptor size checking when performing a pre-parsing fixup. Fix those. Cc: stable@vger.kernel.org Reported-by: Ben Hawkes <hawkes@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-08-21 | HID: logitech: perform bounds checking on device_id early enough | Jiri Kosina
device_index is a char type and the size of paired_dj_devices is 7 elements, therefore proper bounds checking has to be applied to device_index before it is used. We are currently performing the bounds checking in logi_dj_recv_add_djhid_device(), which is too late, as a malicious device could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the problem in one of the report forwarding functions called from logi_dj_raw_event(). Fix this by performing the check at the earliest possible occasion in logi_dj_raw_event(). Cc: stable@vger.kernel.org Reported-by: Ben Hawkes <hawkes@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
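The idea of validating the index before any handler touches the array can be sketched as below. This is an illustration under assumptions, not the driver's code: the valid range 1 to 6 and the names `DJ_DEVICE_INDEX_MIN`, `DJ_DEVICE_INDEX_MAX`, and `raw_event` are assumed for the example; only the array size of 7 comes from the message. The sketch also assumes `char` is signed, which is platform dependent.

```python
import struct

PAIRED_DJ_DEVICES_LEN = 7   # array size stated in the commit message
DJ_DEVICE_INDEX_MIN = 1     # assumed lowest valid index
DJ_DEVICE_INDEX_MAX = 6     # assumed highest valid index

def as_signed_char(raw: int) -> int:
    """Interpret one byte the way a signed C char would."""
    return struct.unpack("b", bytes([raw]))[0]

def device_index_ok(raw: int) -> bool:
    """Range check that must run before the index is used anywhere."""
    idx = as_signed_char(raw)
    return DJ_DEVICE_INDEX_MIN <= idx <= DJ_DEVICE_INDEX_MAX

def raw_event(raw_index: int, paired_devices: list) -> str:
    """Hypothetical event entry point: reject bad indexes first, then dispatch."""
    if not device_index_ok(raw_index):
        return "rejected"
    # Only reached with a validated index, so the lookup cannot go out of bounds.
    return "forwarded to " + str(paired_devices[as_signed_char(raw_index)])

devices = ["slot%d" % i for i in range(PAIRED_DJ_DEVICES_LEN)]
```

A byte such as 0xFF reads as -1 when `char` is signed, which is why the check looks at both ends of the range.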
2014-08-21 | HID: logitech: fix bounds checking on LED report size | Jiri Kosina
The check on report size for REPORT_TYPE_LEDS in logi_dj_ll_raw_request() is wrong; the current check doesn't make any sense -- the report allocated by HID core in hid_hw_raw_request() can be much larger than DJREPORT_SHORT_LENGTH, and currently logi_dj_ll_raw_request() doesn't handle this properly at all. Fix the check by actually trimming down the report size properly if it is too large. Cc: stable@vger.kernel.org Reported-by: Ben Hawkes <hawkes@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-08-21 | ARM: dts: set 'ti,set-rate-parent' for dpll4_m5x2 clock | Stefan Herbrechtsmeier
Set 'ti,set-rate-parent' property for the dpll4_m5x2_ck clock, which is used for the ISP functional clock. This fixes the OMAP3 ISP driver's clock rate configuration on OMAP34xx, which needs the rate to be propagated properly to the divider node (dpll4_m5_ck). Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Tony Lindgren <tony@atomide.com> Cc: Tero Kristo <t-kristo@ti.com> Cc: <linux-media@vger.kernel.org> Cc: <linux-omap@vger.kernel.org> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Tero Kristo <t-kristo@ti.com>
2014-08-21 | Btrfs: fix filemap_flush call in btrfs_file_release | Chris Mason
We should only be flushing on close if the file was flagged as needing it during truncate. I broke this with my ordered data vs transaction commit deadlock fix. Thanks to Miao Xie for catching this. Signed-off-by: Chris Mason <clm@fb.com> Reported-by: Miao Xie <miaox@cn.fujitsu.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
2014-08-21 | Btrfs: fix crash on endio of reading corrupted block | Liu Bo
The crash is
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
[...]
Workqueue: btrfs-endio normal_work_helper [btrfs]
RIP: 0010:[<ffffffffa02d6055>] [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
This is in fact a regression. It is because we forgot to increase @offset properly when reading a corrupted block, so @offset stays the same, and this leads to checksum errors while reading the remaining blocks queued up in the same bio, and then ends up hitting the above BUG_ON. Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-08-21 | btrfs: fix leak in qgroup_subtree_accounting() error path | Eric Sandeen
Coverity pointed this out; in the newly added qgroup_subtree_accounting(), if btrfs_find_all_roots() returns an error, we leak at least the parents pointer, and possibly the roots pointer, depending on what failure occurs. If btrfs_find_all_roots() returns an error, we need to free up all allocations before we return. "roots" is initialized to NULL, so it should be safe to free it unconditionally (ulist_free() handles that case). Cc: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Chris Mason <clm@fb.com>
2014-08-21 | btrfs: Use right extent length when inserting overlap extent map. | Qu Wenruo
When current btrfs finds that a new extent map is going to be inserted but failed with -EEXIST, it will try again to insert the extent map but with the length of sectorsize. This is OK if we don't enable the 'no-holes' feature, since all extent space is continuous and we will not go into the not found->insert routine. But if we enable the 'no-holes' feature, it will make things out of control.
e.g. in 4K sectorsize, we pass the following args to btrfs_get_extent():
btrfs_get_extent() args: start: 27874 len 4100
28672             27874       28672   27874+4100      32768
                  |-----------------------|
|---------hole--------------------|---------data----------|
1) not found and insert
Since no extent map contains the range, btrfs_get_extent() will go into the not_found and insert routine, which will try to insert the extent map (27874, 27874 + 4100).
2) first overlap
But it overlaps with the (28672, 32768) extent, so -EEXIST will be returned by add_extent_mapping().
3) retry but still overlap
After catching the -EEXIST, btrfs_get_extent() will try to insert it again but with 4K length, which still overlaps, so -EEXIST will be returned.
This makes the following patch fail to punch hole.
d77815461f047e561f77a07754ae923ade597d4e btrfs: Avoid trucating page or punching hole in a already existed hole.
This patch will use the right length, which is (existing->start - em->start), to insert, making the above patch work in 'no-holes' mode. Also, some small code style problems in the above patch are fixed too. Reported-by: Filipe David Manana <fdmanana@gmail.com> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: Filipe David Manana <fdmanana@suse.com> Tested-by: Filipe David Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
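The three steps above can be checked with plain interval arithmetic. The following is a minimal sketch, not the btrfs code: `overlaps` and `retry_length` are hypothetical helpers, and extents are modelled as half-open ranges [start, start + len).

```python
SECTORSIZE = 4096

def overlaps(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    """True if the half-open ranges [a_start, a_start+a_len) and [b_start, b_start+b_len) intersect."""
    return a_start < b_start + b_len and b_start < a_start + a_len

def retry_length(em_start: int, existing_start: int) -> int:
    """Length proposed by the patch: stop where the existing extent map begins."""
    return existing_start - em_start

em_start, em_len = 27874, 4100                   # request from the example
existing_start, existing_len = 28672, 4096       # the (28672, 32768) data extent

first_try = overlaps(em_start, em_len, existing_start, existing_len)
old_retry = overlaps(em_start, SECTORSIZE, existing_start, existing_len)
new_len = retry_length(em_start, existing_start)
new_retry = overlaps(em_start, new_len, existing_start, existing_len)
```

With these numbers the shortened map covers only the 798 bytes of hole before the data extent, so the retry no longer collides with it.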
2014-08-21 | Btrfs: clone, don't create invalid hole extent map | Filipe Manana
When cloning a file that consists of an inline extent, we were creating an extent map that represents a non-existing trailing hole starting at a file offset that isn't a multiple of the sector size. This happened because when processing an inline extent we weren't aligning the extent's length to the sector size, and therefore incorrectly treating the range [inline_extent_length; sector_size[ as a hole. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
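Rounding a length up to the sector size is the step that was missing. The sketch below illustrates it with a hypothetical `align_up` helper; it assumes the sector size is a power of two, and the inline extent length of 100 bytes is made up.

```python
def align_up(length: int, sector_size: int) -> int:
    """Round length up to the next multiple of sector_size (a power of two)."""
    return (length + sector_size - 1) & ~(sector_size - 1)

SECTOR_SIZE = 4096
inline_len = 100  # made-up inline extent length

# Unaligned: the range [100, 4096) would be mistaken for a trailing hole.
bogus_hole_start = inline_len
# Aligned: the inline extent is treated as covering the whole sector.
aligned_end = align_up(inline_len, SECTOR_SIZE)
```

Once the extent end is aligned, any following hole can only start at a multiple of the sector size.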
2014-08-21 | Btrfs: don't monopolize a core when evicting inode | Filipe Manana
If an inode has a very large number of extent maps, we can spend a lot of time freeing them, which triggers a soft lockup warning. Therefore reschedule if we need to when freeing the extent maps while evicting the inode. I could trigger this all the time by running xfstests/generic/299 on a file system with the no-holes feature enabled. That test creates an inode with 11386677 extent maps.
$ mkfs.btrfs -f -O no-holes $TEST_DEV
$ MKFS_OPTIONS="-O no-holes" ./check generic/299
generic/299 382s ...
Message from syslogd@debian-vm3 at Aug 7 10:44:29 ...
 kernel:[85304.208017] BUG: soft lockup - CPU#0 stuck for 22s! [umount:25330]
 384s
Ran: generic/299
Passed all 1 tests
$ dmesg
(...)
[86304.300017] BUG: soft lockup - CPU#0 stuck for 23s! [umount:25330]
(...)
[86304.300036] Call Trace:
[86304.300036] [<ffffffff81698ba9>] __slab_free+0x54/0x295
[86304.300036] [<ffffffffa02ee9cc>] ? free_extent_map+0x5c/0xb0 [btrfs]
[86304.300036] [<ffffffff811a6cd2>] kmem_cache_free+0x282/0x2a0
[86304.300036] [<ffffffffa02ee9cc>] free_extent_map+0x5c/0xb0 [btrfs]
[86304.300036] [<ffffffffa02e3775>] btrfs_evict_inode+0xd5/0x660 [btrfs]
[86304.300036] [<ffffffff811e7c8d>] ? __inode_wait_for_writeback+0x6d/0xc0
[86304.300036] [<ffffffff816a389b>] ? _raw_spin_unlock+0x2b/0x40
[86304.300036] [<ffffffff811d8cbb>] evict+0xab/0x180
[86304.300036] [<ffffffff811d8dce>] dispose_list+0x3e/0x60
[86304.300036] [<ffffffff811d9b04>] evict_inodes+0xf4/0x110
[86304.300036] [<ffffffff811bd953>] generic_shutdown_super+0x53/0x110
[86304.300036] [<ffffffff811bdaa6>] kill_anon_super+0x16/0x30
[86304.300036] [<ffffffffa02a78ba>] btrfs_kill_super+0x1a/0xa0 [btrfs]
[86304.300036] [<ffffffff811bd3a9>] deactivate_locked_super+0x59/0x80
[86304.300036] [<ffffffff811be44e>] deactivate_super+0x4e/0x70
[86304.300036] [<ffffffff811dec14>] mntput_no_expire+0x174/0x1f0
[86304.300036] [<ffffffff811deab7>] ? mntput_no_expire+0x17/0x1f0
[86304.300036] [<ffffffff811e0517>] SyS_umount+0x97/0x100
(...)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Tested-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-08-21 | Btrfs: fix hole detection during file fsync | Filipe Manana
The file hole detection logic during a file fsync wasn't correct, because it didn't look back (in a previous leaf) for the last file extent item that can be in a leaf to the left of our leaf and that has a generation lower than the current transaction id. This made it assume that a hole exists when it really doesn't exist in the file.
Such false positive hole detection happens in the following scenario:
* We have a file that has many file extent items, covering 3 or more btree leafs (the first leaf must contain non file extent items too).
* Two ranges of the file are modified, with their extent items being located at 2 different leafs and those leafs aren't consecutive.
* When processing the second modified leaf, we weren't checking if some file extent item exists that is located in some leaf that is between our 2 modified leafs, and therefore assumed the range defined between the last file extent item in the first leaf and the first file extent item in the second leaf matched a hole.
Fortunately this didn't result in overriding the log with wrong data. Instead, it made the last loop in copy_items() attempt to insert a duplicated key (for a hole file extent item), which makes the file fsync code return with -EEXIST to file.c:btrfs_sync_file(), which in turn ends up doing a full transaction commit. That is much more expensive than writing only to the log tree and waiting for it to be durably persisted (as well as the file's modified extents/pages). Therefore fix the hole detection logic, so that we don't pay the cost of doing full transaction commits.
I could trigger this issue with the following test for xfstests (which never fails, either without or with this patch). The last fsync call results in a full transaction commit, due to the -EEXIST error mentioned above. I could also observe this behaviour happening frequently when running xfstests/generic/075 in a loop.
Test:
    _cleanup()
    {
        _cleanup_flakey
        rm -fr $tmp
    }

    # get standard environment, filters and checks
    . ./common/rc
    . ./common/filter
    . ./common/dmflakey

    # real QA test starts here
    _supported_fs btrfs
    _supported_os Linux
    _require_scratch
    _require_dm_flakey
    _need_to_be_root

    rm -f $seqres.full

    # Create a file with many file extent items, each representing a 4Kb extent.
    # These items span 3 btree leaves, of 16Kb each (default mkfs.btrfs leaf size
    # as of btrfs-progs 3.12).
    _scratch_mkfs -l 16384 >/dev/null 2>&1
    _init_flakey
    SAVE_MOUNT_OPTIONS="$MOUNT_OPTIONS"
    MOUNT_OPTIONS="$MOUNT_OPTIONS -o commit=999"
    _mount_flakey

    # First fsync, inode has BTRFS_INODE_NEEDS_FULL_SYNC flag set.
    $XFS_IO_PROG -f -c "pwrite -S 0x01 -b 4096 0 4096" -c "fsync" \
        $SCRATCH_MNT/foo | _filter_xfs_io

    # For any of the following fsync calls, inode doesn't have the flag
    # BTRFS_INODE_NEEDS_FULL_SYNC set.
    for ((i = 1; i <= 500; i++)); do
        OFFSET=$((4096 * i))
        LEN=4096
        $XFS_IO_PROG -c "pwrite -S 0x01 $OFFSET $LEN" -c "fsync" \
            $SCRATCH_MNT/foo | _filter_xfs_io
    done

    # Commit transaction and bump next transaction's id (to 7).
    sync

    # Truncate will set the BTRFS_INODE_NEEDS_FULL_SYNC flag in the btrfs's
    # inode runtime flags.
    $XFS_IO_PROG -c "truncate 2048000" $SCRATCH_MNT/foo

    # Commit transaction and bump next transaction's id (to 8).
    sync

    # Touch 1 extent item from the first leaf and 1 from the last leaf. The leaf
    # in the middle, containing only file extent items, isn't touched. So the
    # next fsync, when calling btrfs_search_forward(), won't visit that middle
    # leaf. First and 3rd leaf have now a generation with value 8, while the
    # middle leaf remains with a generation with value 6.
    $XFS_IO_PROG \
        -c "pwrite -S 0xee -b 4096 0 4096" \
        -c "pwrite -S 0xff -b 4096 2043904 4096" \
        -c "fsync" \
        $SCRATCH_MNT/foo | _filter_xfs_io

    _load_flakey_table $FLAKEY_DROP_WRITES
    md5sum $SCRATCH_MNT/foo | _filter_scratch
    _unmount_flakey

    _load_flakey_table $FLAKEY_ALLOW_WRITES
    # During mount, we'll replay the log created by the fsync above, and the file's
    # md5 digest should be the same we got before the unmount.
    _mount_flakey
    md5sum $SCRATCH_MNT/foo | _filter_scratch
    _unmount_flakey
    MOUNT_OPTIONS="$SAVE_MOUNT_OPTIONS"

    status=0
    exit
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>