summaryrefslogtreecommitdiff
path: root/drivers/pci/pci-sysfs.c
AgeCommit message (Collapse)Author
2025-06-04Merge branch 'pci/pm'Bjorn Helgaas
- Add pm_runtime_put() cleanup helper for use with __free() to automatically drop the device usage count when a pointer goes out of scope (Alex Williamson) - Increment PM usage counter when probing reset methods so we don't try to read config space of a powered-off device (Alex Williamson) - Set all devices to D0 during enumeration to ensure ACPI opregion is connected via _REG (Mario Limonciello) * pci/pm: PCI: Explicitly put devices into D0 when initializing PCI: Increment PM usage counter when probing reset methods PM: runtime: Define pm_runtime_put cleanup helper
2025-05-23PCI/AER: Add sysfs attributes for log ratelimitsJon Pan-Doh
Allow userspace to read/write log ratelimits per device (including enable/disable). Create aer/ sysfs directory to store them and any future AER configs. The new sysfs files are: /sys/bus/pci/devices/*/aer/correctable_ratelimit_burst /sys/bus/pci/devices/*/aer/correctable_ratelimit_interval_ms /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_burst /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_interval_ms The default values are ratelimit_burst=10, ratelimit_interval_ms=5000, so if we try to emit more than 10 messages in a 5 second period, some are suppressed. Update AER sysfs ABI filename to reflect the broader scope of AER sysfs attributes (e.g. stats and ratelimits). Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats -> sysfs-bus-pci-devices-aer Tested using aer-inject[1]. Configured correctable log ratelimit to 5. Sent 6 AER errors. Observed 5 errors logged while AER stats (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6. Disabled ratelimiting and sent 6 more AER errors. Observed all 6 errors logged and accounted in AER stats (12 total errors). [1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git [bhelgaas: note fatal errors are not ratelimited, "aer_report" -> "aer_info", replace ratelimit_log_enable toggle with *_ratelimit_interval_ms] Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com> Signed-off-by: Jon Pan-Doh <pandoh@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/20250522232339.1525671-21-helgaas@kernel.org
2025-04-23PCI: Increment PM usage counter when probing reset methodsAlex Williamson
We can get different results probing reset methods for a device depending on its power state. For example, reading the PM control register of a device in D3cold will always indicate NoSoftRst+ because we get ~0 data when the config read fails on PCI, preventing us from correctly probing PM reset support. Increment the PM usage counter before any probes and use the cleanup __free facility to automatically drop the usage counter out of scope. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250422230534.2295291-3-alex.williamson@redhat.com
2025-03-27Merge branch 'pci/resource'Bjorn Helgaas
- Use pci_resource_n() to simplify BAR/window resource lookup (Ilpo Järvinen) - Fix typo that repeatedly distributed resources to a bridge instead of iterating over subordinate bridges, which resulted in too little space to assign some BARs (Kai-Heng Feng) - Relax bridge window tail sizing for optional resources, e.g., IOV BARs, to avoid failures when removing and re-adding devices (Ilpo Järvinen) - Fix a double counting error for I/O resources, as we previously did for memory resources (Ilpo Järvinen) - Use resource_set_{range,size}() helpers in more places (Ilpo Järvinen) - Add pci_resource_is_iov() to identify IOV resources (Ilpo Järvinen) - Add pci_resource_num() to look up the BAR number from the resource pointer (Ilpo Järvinen) - Add restore_dev_resource() to simplify code that resources saved device resources (Ilpo Järvinen) - Allow drivers to enable devices even if we haven't assigned optional IOV resources to them (Ilpo Järvinen) - Improve debug output during resource reallocation (Ilpo Järvinen) - Rework handling of optional resources (IOV BARs, ROMs) to reduce failures if we can't allocate them (Ilpo Järvinen) - Move declarations of pci_rescan_bus_bridge_resize(), pci_reassign_bridge_resources(), and CardBus-related sizes from include/linux/pci.h to drivers/pci/pci.h since they're not used outside the PCI core (Ilpo Järvinen) - Make pci_setup_bridge() static (Ilpo Järvinen) - Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory) - Fix s390 mmio_read/write syscalls, which didn't cause page faults in some cases, which broke vfio-pci lazy mapping on first access (Niklas Schnelle) - Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was disabled only for s390 (Niklas Schnelle) - Support mmap of PCI resources on s390 except for ISM devices (Niklas Schnelle) * pci/resource: s390/pci: Support mmap() of PCI resources except for ISM devices s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAP s390/pci: Fix s390_mmio_read/write syscall page fault handling PCI: Fix NULL dereference in SR-IOV VF creation error path PCI: Move cardbus IO size declarations into pci/pci.h PCI: Make pci_setup_bridge() static PCI: Move resource reassignment func declarations into pci/pci.h PCI: Move pci_rescan_bus_bridge_resize() declaration to pci/pci.h PCI: Fix BAR resizing when VF BARs are assigned PCI: Do not claim to release resource falsely PCI: Increase Resizable BAR support from 512 GB to 128 TB PCI: Rework optional resource handling PCI: Perform reset_resource() and build fail list in sync PCI: Use res->parent to check if resource is assigned PCI: Add debug print when releasing resources before retry PCI: Indicate optional resource assignment failures PCI: Always have realloc_head in __assign_resources_sorted() PCI: Extend enable to check for any optional resource PCI: Add restore_dev_resource() PCI: Remove incorrect comment from pci_reassign_resource() PCI: Consolidate assignment loop next round preparation PCI: Rename retval to ret PCI: Use while loop and break instead of gotos PCI: Refactor pdev_sort_resources() & __dev_sort_resources() PCI: Converge return paths in __assign_resources_sorted() PCI: Add dev & res local variables to resource assignment funcs PCI: Add pci_resource_num() helper PCI: Check resource_size() separately PCI: Add pci_resource_is_iov() to identify IOV resources PCI: Use resource_set_{range,size}() helpers PCI: Use SZ_* instead of literals in setup-bus.c PCI: Fix old_size lower bound in calculate_iosize() too PCI: Allow relaxed bridge window tail sizing for optional resources PCI: Simplify size1 assignment logic PCI: Use min_align, not unrelated add_align, for size0 PCI: Remove add_align overwrite unrelated to size0 PCI: Use downstream bridges for distributing resources PCI: Cleanup dev->resource + resno to use pci_resource_n()
2025-03-21PCI/DOE: Expose DOE features via sysfsAlistair Francis
PCIe r6.0 added support for Data Object Exchange (DOE). When DOE is supported, the DOE Discovery Feature must be implemented per PCIe r6.1, sec 6.30.1.1. DOE allows a requester to obtain information about the other DOE features supported by the device. The kernel already queries the DOE features supported and caches the values. Expose the values in sysfs to allow user space to determine which DOE features are supported by the PCIe device. By exposing the information to userspace, tools like lspci can relay the information to users. By listing all of the supported features we can allow userspace to parse the list, which might include vendor specific features as well as yet to be supported features. As the DOE Discovery feature must always be supported we treat it as a special named attribute case. This allows the usual PCI attribute_group handling to correctly create the doe_features directory when registering pci_doe_sysfs_group (otherwise it doesn't and sysfs_add_file_to_group() will seg fault). After this patch is supported you can see something like this when attaching a DOE device: $ ls /sys/devices/pci0000:00/0000:00:02.0//doe* 0001:01 0001:02 doe_discovery Link: https://lore.kernel.org/r/20250306075211.1855177-3-alistair@alistair23.me Signed-off-by: Alistair Francis <alistair@alistair23.me> [bhelgaas: drop pci_doe_sysfs_init() stub return, make DEVICE_ATTR_RO(doe_discovery) static] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-21s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAPNiklas Schnelle
The ability to map PCI resources to user-space is controlled by global defines. For vfio there is VFIO_PCI_MMAP which is only disabled on s390 and controls mapping of PCI resources using vfio-pci with a fallback option via the pread()/pwrite() interface. For the PCI core there is ARCH_GENERIC_PCI_MMAP_RESOURCE which enables a generic implementation for mapping PCI resources plus the newer sysfs interface. Then there is HAVE_PCI_MMAP which can be used with custom definitions of pci_mmap_resource_range() and the historical /proc/bus/pci interface. Both mechanisms are all or nothing. For s390 mapping PCI resources is possible and useful for testing and certain applications such as QEMU's vfio-pci based user-space NVMe driver. For certain devices, however access to PCI resources via mappings to user-space is not possible and these must be excluded from the general PCI resource mapping mechanisms. Introduce pdev->non_mappable_bars to indicate that a PCI device's BARs can not be accessed via mappings to user-space. In the future this enables per-device restrictions of PCI resource mapping. For now, set this flag for all PCI devices on s390 in line with the existing, general disable of PCI resource mapping. As s390 is the only user of the VFI_PCI_MMAP Kconfig options this can already be replaced with a check of this new flag. Also add similar checks in the other code protected by HAVE_PCI_MMAP respectively ARCH_GENERIC_PCI_MMAP in preparation for enabling these for supported devices. Link: https://lore.kernel.org/lkml/20250212132808.08dcf03c.alex.williamson@redhat.com/ Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-2-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20PCI: Fix BAR resizing when VF BARs are assignedIlpo Järvinen
__resource_resize_store() attempts to release all resources of the device before attempting the resize. The loop, however, only covers standard BARs (< PCI_STD_NUM_BARS). If a device has VF BARs that are assigned, pci_reassign_bridge_resources() finds the bridge window still has some assigned child resources and returns -NOENT which makes pci_resize_resource() to detect an error and abort the resize. Change the release loop to cover all resources up to VF BARs which allows the resize operation to release the bridge windows and attempt to assigned them again with the different size. If SR-IOV is enabled, disallow resize as it requires releasing also IOV resources. Link: https://lore.kernel.org/r/20250320142837.8027-1-ilpo.jarvinen@linux.intel.com Fixes: 91fa127794ac ("PCI: Expose PCIe Resizable BAR support via sysfs") Reported-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2025-01-23Merge branch 'pci/misc'Bjorn Helgaas
- Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM ACPI hotplug driver (Thomas Weißschuh) - Update PCI_EXP_LNKCAP_SLS comment (Lukas Wunner) - Drop superfluous pm_wakeup.h include (Wolfram Sang) - Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong Zhang) - Correct documentation of the 'config_acs=' kernel parameter (Akihiko Odaki) * pci/misc: Documentation: Fix pci=config_acs= example PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT PCI: Don't include 'pm_wakeup.h' directly PCI: Update code comment on PCI_EXP_LNKCAP_SLS for PCIe r3.0 PCI/ACPI: Constify 'struct bin_attribute' PCI/P2PDMA: Constify 'struct bin_attribute' PCI/VPD: Constify 'struct bin_attribute' PCI/sysfs: Constify 'struct bin_attribute'
2025-01-15PCI/sysfs: Remove unnecessary zero in initializerIlpo Järvinen
Providing empty initializer for an array is enough to set its elements to zero. Thus, remove the redundant 0 from the initializer. Link: https://lore.kernel.org/r/20241028174046.1736-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15PCI/sysfs: Use __free() in reset_method_store()Ilpo Järvinen
Use __free() from cleanup.h to handle freeing options in reset_method_store() as it simplifies the code flow. Link: https://lore.kernel.org/r/20241028174046.1736-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15PCI/sysfs: Move reset related sysfs code to correct fileIlpo Järvinen
Most PCI sysfs code and structs are in a dedicated file but a few reset related things remain in pci.c. Move also them to pci-sysfs.c and drop pci_dev_reset_method_attr_is_visible() as it is 100% duplicate of pci_dev_reset_attr_is_visible(). Link: https://lore.kernel.org/r/20241028174046.1736-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2024-12-03PCI/sysfs: Constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-1-c32360f495a7@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-29Merge tag 'driver-core-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of driver core changes for 6.13-rc1. Nothing major for this merge cycle, except for the two simple merge conflicts are here just to make life interesting. Included in here are: - sysfs core changes and preparations for more sysfs api cleanups that can come through all driver trees after -rc1 is out - fw_devlink fixes based on many reports and debugging sessions - list_for_each_reverse() removal, no one was using it! - last-minute seq_printf() format string bug found and fixed in many drivers all at once. - minor bugfixes and changes full details in the shortlog" * tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Fix a potential abuse of seq_printf() format string in drivers cpu: Remove spurious NULL in attribute_group definition s390/con3215: Remove spurious NULL in attribute_group definition perf: arm-ni: Remove spurious NULL in attribute_group definition driver core: Constify bin_attribute definitions sysfs: attribute_group: allow registration of const bin_attribute firmware_loader: Fix possible resource leak in fw_log_firmware_info() drivers: core: fw_devlink: Fix excess parameter description in docstring driver core: class: Correct WARN() message in APIs class_(for_each|find)_device() cacheinfo: Use of_property_present() for non-boolean properties cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap() drivers: core: fw_devlink: Make the error message a bit more useful phy: tegra: xusb: Set fwnode for xusb port devices drm: display: Set fwnode for aux bus devices driver core: fw_devlink: Stop trying to optimize cycle detection logic driver core: Constify attribute arguments of binary attributes sysfs: bin_attribute: add const read/write callback variants sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR() sysfs: treewide: constify attribute callback of bin_attribute::llseek() sysfs: treewide: constify attribute callback of bin_attribute::mmap() ...
2024-11-13PCI: Add 'reset_subordinate' to reset hierarchy below bridgeKeith Busch
The "bus" and "cxl_bus" reset methods reset a device by asserting Secondary Bus Reset on the bridge leading to the device. These only work if the device is the only device below the bridge. Add a sysfs 'reset_subordinate' attribute on bridges that can assert Secondary Bus Reset regardless of how many devices are below the bridge. This resets all the devices below a bridge in a single command, including the locking and config space save/restore that reset methods normally do. This may be the only way to reset devices that don't support other reset methods (ACPI, FLR, PM reset, etc). Link: https://lore.kernel.org/r/20241025222755.3756162-1-kbusch@meta.com Signed-off-by: Keith Busch <kbusch@kernel.org> [bhelgaas: commit log, add capable(CAP_SYS_ADMIN) check] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
2024-11-05sysfs: treewide: constify attribute callback of bin_attribute::llseek()Thomas Weißschuh
The llseek() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-7-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05sysfs: treewide: constify attribute callback of bin_attribute::mmap()Thomas Weißschuh
The mmap() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-6-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05sysfs: treewide: constify attribute callback of bin_is_visible()Thomas Weißschuh
The is_bin_visible() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-5-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05PCI/sysfs: Calculate bin_attribute size through bin_size()Thomas Weißschuh
Stop abusing the is_bin_visible() callback to calculate the attribute size. Instead use the new, dedicated bin_size() one. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-3-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-09s390/pci: Stop usurping pdev->dev.groupsLukas Wunner
Bjorn suggests using pdev->dev.groups for attribute_groups constructed on PCI device enumeration: "Is it feasible to build an attribute group in pci_doe_init() and add it to dev->groups so device_add() will automatically add them?" https://lore.kernel.org/r/20231019165829.GA1381099@bhelgaas Unfortunately on s390, pcibios_device_add() usurps pdev->dev.groups for arch-specific attribute_groups, preventing its use for anything else. Introduce an ARCH_PCI_DEV_GROUPS macro which arches can define in <asm/pci.h>. The macro is visible in drivers/pci/pci-sysfs.c through the inclusion of <linux/pci.h>, which in turn includes <asm/pci.h>. On s390, define the macro to the three attribute_groups previously assigned to pdev->dev.groups. Thereby pdev->dev.groups is made available for use by the PCI core. As a side effect, arch/s390/pci/pci_sysfs.c no longer needs to be compiled into the kernel if CONFIG_SYSFS=n. Link: https://lore.kernel.org/r/7b970f7923e373d1b23784721208f93418720485.1722870934.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
2024-03-05PCI/sysfs: Demacrofy pci_dev_resource_resize_attr(n) functionsIlpo Järvinen
pci_dev_resource_resize_attr(n) macro is invoked for six resources, creating a large footprint function for each resource. Rework the macro to only create a function that calls a helper function so the compiler can decide if it warrants to inline the function or not. With x86_64 defconfig, this saves roughly 2.5kB: $ scripts/bloat-o-meter drivers/pci/pci-sysfs.o{.old,.new} add/remove: 1/0 grow/shrink: 0/6 up/down: 512/-2934 (-2422) Function old new delta __resource_resize_store - 512 +512 resource5_resize_store 503 14 -489 resource4_resize_store 503 14 -489 resource3_resize_store 503 14 -489 resource2_resize_store 503 14 -489 resource1_resize_store 503 14 -489 resource0_resize_store 500 11 -489 Total: Before=13399, After=10977, chg -18.08% (The compiler seemingly chose to still inline __resource_resize_show() which is fine, those functions are not very complex/large.) Link: https://lore.kernel.org/r/20240222114607.1837-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-03-05PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=yLukas Wunner
It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
2023-11-03Merge tag 'driver-core-6.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core updates for 6.7-rc1. Nothing major in here at all, just a small number of changes including: - minor cleanups and updates from Andy Shevchenko - __counted_by addition - firmware_loader update for aborting loads cleaner - other minor changes, details in the shortlog - documentation update All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (21 commits) firmware_loader: Abort all upcoming firmware load request once reboot triggered firmware_loader: Refactor kill_pending_fw_fallback_reqs() Documentation: security-bugs.rst: linux-distros relaxed their rules driver core: Release all resources during unbind before updating device links driver core: class: remove boilerplate code driver core: platform: Annotate struct irq_affinity_devres with __counted_by resource: Constify resource crosscheck APIs resource: Unify next_resource() and next_resource_skip_children() resource: Reuse for_each_resource() macro PCI: Implement custom llseek for sysfs resource entries kernfs: sysfs: support custom llseek method for sysfs entries debugfs: Fix __rcu type comparison warning device property: Replace custom implementation of COUNT_ARGS() drivers: base: test: Make property entry API test modular driver core: Add missing parameter description to __fwnode_link_add() device property: Clarify usage scope of some struct fwnode_handle members devres: rename the first parameter of devm_add_action(_or_reset) driver core: platform: Unify the firmware node type check driver core: platform: Use temporary variable in platform_device_add() driver core: platform: Refactor error path in a couple places ...
2023-10-28Merge branch 'pci/field-get'Bjorn Helgaas
- Use FIELD_GET()/FIELD_PREP() when possible throughout drivers/pci/ (Ilpo Järvinen, Bjorn Helgaas) - Rework DPC control programming for clarity (Ilpo Järvinen) * pci/field-get: PCI/portdrv: Use FIELD_GET() PCI/VC: Use FIELD_GET() PCI/PTM: Use FIELD_GET() PCI/PME: Use FIELD_GET() PCI/ATS: Use FIELD_GET() PCI/ATS: Show PASID Capability register width in bitmasks PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk PCI: Use FIELD_GET() PCI/MSI: Use FIELD_GET/PREP() PCI/DPC: Use defines with DPC reason fields PCI/DPC: Use defined fields with DPC_CTL register PCI/DPC: Use FIELD_GET() PCI: hotplug: Use FIELD_GET/PREP() PCI: dwc: Use FIELD_GET/PREP() PCI: cadence: Use FIELD_GET() PCI: Use FIELD_GET() to extract Link Width PCI: mvebu: Use FIELD_PREP() with Link Width PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields # Conflicts: # drivers/pci/controller/dwc/pcie-tegra194.c
2023-10-28Merge branch 'pci/vga'Bjorn Helgaas
- Add pci_is_vga() helper, which checks for both PCI_CLASS_DISPLAY_VGA and PCI_CLASS_NOT_DEFINED_VGA (which catches ancient devices built before Class Codes were defined) (Sui Jingfeng) - Use the new pci_is_vga() to identify devices for the VGA arbiter, the sysfs "boot_vga" attribute, and the virtio and qxl drivers (SUi Jingfeng) * pci/vga: drm/qxl: Use pci_is_vga() to identify VGA devices drm/virtio: Use pci_is_vga() to identify VGA devices PCI/sysfs: Enable 'boot_vga' attribute via pci_is_vga() PCI/VGA: Select VGA devices earlier PCI/VGA: Use pci_is_vga() to identify VGA devices PCI: Add pci_is_vga() helper
2023-10-17PCI: Use FIELD_GET() to extract Link WidthIlpo Järvinen
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting. Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2023-10-06PCI/sysfs: Enable 'boot_vga' attribute via pci_is_vga()Sui Jingfeng
Enable the 'boot_vga' sysfs attribute via pci_is_vga(). This exposes 'boot_vga' for old PCI_CLASS_NOT_DEFINED_VGA (0x0001) devices as well as for the PCI_CLASS_DISPLAY_VGA (0x0300) devices where it was previously exposed. Link: https://lore.kernel.org/r/20230830111532.444535-4-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: "Maciej W. Rozycki" <macro@orcam.me.uk>
2023-10-05PCI: Implement custom llseek for sysfs resource entriesValentine Sinitsyn
Since commit 636b21b50152 ("PCI: Revoke mappings like devmem"), mmappable sysfs entries have started to receive their f_mapping from the iomem pseudo filesystem, so that CONFIG_IO_STRICT_DEVMEM is honored in sysfs (and procfs) as well as in /dev/[k]mem. This resulted in a userspace-visible regression: 1. Open a sysfs PCI resource file (eg. /sys/bus/pci/devices/*/resource0) 2. Use lseek(fd, 0, SEEK_END) to determine its size Expected result: a PCI region size is returned. Actual result: 0 is returned. The reason is that PCI resource files residing in sysfs use generic_file_llseek(), which relies on f_mapping->host inode to get the file size. As f_mapping is now redefined, f_mapping->host points to an anonymous zero-sized iomem_inode which has nothing to do with sysfs file in question. Implement a custom llseek method for sysfs PCI resources, which is almost the same as proc_bus_pci_lseek() used for procfs entries. This makes sysfs and procfs entries consistent with regards to seeking, but also introduces userspace-visible changes to seeking PCI resources in sysfs: - SEEK_DATA and SEEK_HOLE are no longer supported; - Seeking past the end of the file is prohibited while previously offsets up to MAX_NON_LFS were accepted (reading from these offsets was always invalid). Signed-off-by: Valentine Sinitsyn <valesini@yandex-team.ru> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230925084013.309399-2-valesini@yandex-team.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-29PCI/sysfs: Protect driver's D3cold preference from user spaceLukas Wunner
struct pci_dev contains two flags which govern whether the device may suspend to D3cold: * no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold) * d3cold_allowed provides an opt-out for user space (default is true, user space may set to false) Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working. That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device. That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out. Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # v4.8+
2023-07-18PCI/sysfs: Make I/O resource depend on HAS_IOPORTNiklas Schnelle
If legacy I/O spaces are not supported simply return an error when trying to access them via pci_resource_io(). This allows inb() and friends to become undefined when they are known at compile time to be non-functional in a later patch. Co-developed-by: Arnd Bergmann <arnd@kernel.org> Link: https://lore.kernel.org/r/20230703135255.2202721-3-schnelle@linux.ibm.com Signed-off-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-03-23driver core: bus: mark the struct bus_type for sysfs callbacks as constantGreg Kroah-Hartman
struct bus_type should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct bus_type to be moved to read-only memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hu Haowen <src.res@email.cn> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Stuart Yoder <stuyoder@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl Reviewed-by: Alex Shi <alexs@kernel.org> Acked-by: Iwona Winiarska <iwona.winiarska@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-14Merge tag 'pci-v6.2-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and make more things static. - Make portdrv bind to Switch Ports that have AER. Previously, if these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant the Ports couldn't be suspended to low-power states. AER on these Ports doesn't use interrupts, and the AER driver doesn't need to claim them. - Assign PCI domain IDs using ida_alloc(), which makes host bridge add/remove work better. Resource management: - To work better with recent BIOSes that use EfiMemoryMappedIO for PCI host bridge apertures, remove those regions from the E820 map (E820 entries normally prevent us from allocating BARs). In v5.19, we added some quirks to disable E820 checking, but that's not very maintainable. EfiMemoryMappedIO means the OS needs to map the region for use by EFI runtime services; it shouldn't prevent OS from using it. PCIe native device hotplug: - Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4 PCIe tunneling depends on native PCIe hotplug. - Enable Command Completed Interrupt only if supported to avoid user confusion from lspci output that says this is enabled but not supported. - Prevent pciehp from binding to Switch Upstream Ports; this happened because of interaction with acpiphp and caused devices below the Upstream Port to disappear. Power management: - Convert AGP drivers to generic power management. We hope to remove legacy power management from the PCI core eventually. Virtualization: - Fix pci_device_is_present(), which previously always returned "false" for VFs, causing virtio hangs when unbinding the driver. Miscellaneous: - Convert drivers to gpiod API to prepare for dropping some legacy code. - Fix DOE fencepost error for the maximum data object length. Baikal-T1 PCIe controller driver: - Add driver and DT bindings. Broadcom STB PCIe controller driver: - Enable Multi-MSI. - Delay 100ms after PERST# deassert to allow power and clocks to stabilize. - Configure Read Completion Boundary to 64 bytes. Freescale i.MX6 PCIe controller driver: - Initialize PHY before deasserting core reset to fix a regression in v6.0 on boards where the PHY provides the reference. - Fix imx6sx and imx8mq clock names in DT schema. Intel VMD host bridge driver: - Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe SSDs in VT-d pass-through scenarios. - Disable MSI remapping, which gets re-enabled by firmware during suspend/resume. MediaTek PCIe Gen3 controller driver: - Add MT7986 and MT8195 support. Qualcomm PCIe controller driver: - Add SC8280XP/SA8540P basic interconnect support. Rockchip DesignWare PCIe controller driver: - Base DT schema on common Synopsys schema. Synopsys DesignWare PCIe core: - Collect DT items shared between Root Port and Endpoint (PERST GPIO, PHY info, clocks, resets, link speed, number of lanes, number of iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml. - Add dma-ranges support for Root Ports and Endpoints. - Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to reduce code duplication. - Add generic names for clocks and resets to encourage more consistent naming across drivers using DesignWare IP. - Stop advertising PTM Responder role for Endpoints, which aren't allowed to be responders. TI J721E PCIe driver: - Add j721s2 host mode ID to DT schema. - Add interrupt properties to DT schema. Toshiba Visconti PCIe controller driver: - Fix interrupts array max constraints in DT schema" * tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits) x86/PCI: Use pr_info() when possible x86/PCI: Fix log message typo x86/PCI: Tidy E820 removal messages PCI: Skip allocate_resource() if too little space available efi/x86: Remove EfiMemoryMappedIO from E820 map PCI/portdrv: Allow AER service only for Root Ports & RCECs PCI: xilinx-nwl: Fix coding style violations PCI: mvebu: Switch to using gpiod API PCI: pciehp: Enable Command Completed Interrupt only if supported PCI: aardvark: Switch to using devm_gpiod_get_optional() dt-bindings: PCI: mediatek-gen3: add support for mt7986 dt-bindings: PCI: mediatek-gen3: add SoC based clock config dt-bindings: PCI: qcom: Allow 'dma-coherent' property PCI: mt7621: Add sentinel to quirks table PCI: vmd: Fix secondary bus reset for Intel bridges PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32) PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path ...
2022-11-14PCI: Allow drivers to request exclusive config regionsIra Weiny
PCI config space access from user space has traditionally been unrestricted with writes being an understood risk for device operation. Unfortunately, device breakage or odd behavior from config writes lacks indicators that can leave driver writers confused when evaluating failures. This is especially true with the new PCIe Data Object Exchange (DOE) mailbox protocol where backdoor shenanigans from user space through things such as vendor defined protocols may affect device operation without complete breakage. A prior proposal restricted read and writes completely.[1] Greg and Bjorn pointed out that proposal is flawed for a couple of reasons. First, lspci should always be allowed and should not interfere with any device operation. Second, setpci is a valuable tool that is sometimes necessary and it should not be completely restricted.[2] Finally methods exist for full lock of device access if required. Even though access should not be restricted it would be nice for driver writers to be able to flag critical parts of the config space such that interference from user space can be detected. Introduce pci_request_config_region_exclusive() to mark exclusive config regions. Such regions trigger a warning and kernel taint if accessed via user space. Create pci_warn_once() to restrict the user from spamming the log. [1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/ [2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/ Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-09PCI/sysfs: Fix double free in error pathSascha Hauer
When pci_create_attr() fails, pci_remove_resource_files() is called which will iterate over the res_attr[_wc] arrays and frees every non NULL entry. To avoid a double free here set the array entry only after it's clear we successfully initialized it. Fixes: b562ec8f74e4 ("PCI: Don't leak memory if sysfs_create_bin_file() fails") Link: https://lore.kernel.org/r/20221007070735.GX986@pengutronix.de/ Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2022-10-05PCI: Expose PCIe Resizable BAR support via sysfsAlex Williamson
Add a simple sysfs interface to Resizable BAR support, largely for the purposes of assigning such devices to a VM through VFIO. Resizable BARs present a difficult feature to expose to a VM through emulation, as resizing a BAR is done on the host. It can fail, and often does, but we have no means via emulation of a PCIe REBAR capability to handle the error cases. A vfio-pci specific ioctl interface is also cumbersome as there are often multiple devices within the same bridge aperture and handling them is a challenge. In the interface proposed here, expanding a BAR potentially requires such devices to be soft-removed during the resize operation and rescanned after, in order for all the necessary resources to be released. A pci-sysfs interface is also more universal than a vfio specific interface. Please see the ABI documentation update for usage. Link: https://lore.kernel.org/r/166336088796.3597940.14973499936692558556.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Krzysztof Wilczyński <kw@linux.com>
2022-04-22PCI: Use driver_set_override() instead of open-codingKrzysztof Kozlowski
Use a helper to set driver_override to the reduce amount of duplicated code. Make the driver_override field const char, because it is not modified by the core and it matches other subsystems. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220419113435.246203-6-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-22PCI: Remove unused assignmentsBjorn Helgaas
Remove variables and assignments that are never used. Found by Krzysztof using cppcheck, e.g., $ cppcheck --enable=all --force uselessAssignmentPtrArg drivers/pci/proc.c:102 Assignment of function parameter has no effect outside the function. Did you forget dereferencing it? unreadVariable drivers/pci/setup-bus.c:1528 Variable 'old_flags' is assigned a value that is never used. Reported-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20220313192933.434746-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-12-09PCI/sysfs: Use pci_irq_vector()Thomas Gleixner
instead of fiddling with MSI descriptors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.265589103@linutronix.de
2021-11-06Merge tag 'pci-v5.16-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Conserve IRQs by setting up portdrv IRQs only when there are users (Jan Kiszka) - Rework and simplify _OSC negotiation for control of PCIe features (Joerg Roedel) - Remove struct pci_dev.driver pointer since it's redundant with the struct device.driver pointer (Uwe Kleine-König) Resource management: - Coalesce contiguous host bridge apertures from _CRS to accommodate BARs that cover more than one aperture (Kai-Heng Feng) Sysfs: - Check CAP_SYS_ADMIN before parsing user input (Krzysztof Wilczyński) - Return -EINVAL consistently from "store" functions (Krzysztof Wilczyński) - Use sysfs_emit() in endpoint "show" functions to avoid buffer overruns (Kunihiko Hayashi) PCIe native device hotplug: - Ignore Link Down/Up caused by resets during error recovery so endpoint drivers can remain bound to the device (Lukas Wunner) Virtualization: - Avoid bus resets on Atheros QCA6174, where they hang the device (Ingmar Klein) - Work around Pericom PI7C9X2G switch packet drop erratum by using store and forward mode instead of cut-through (Nathan Rossi) - Avoid trying to enable AtomicOps on VFs; the PF setting applies to all VFs (Selvin Xavier) MSI: - Document that /sys/bus/pci/devices/.../irq contains the legacy INTx interrupt or the IRQ of the first MSI (not MSI-X) vector (Barry Song) VPD: - Add pci_read_vpd_any() and pci_write_vpd_any() to access anywhere in the possible VPD space; use these to simplify the cxgb3 driver (Heiner Kallweit) Peer-to-peer DMA: - Add (not subtract) the bus offset when calculating DMA address (Wang Lu) ASPM: - Re-enable LTR at Downstream Ports so they don't report Unsupported Requests when reset or hot-added devices send LTR messages (Mingchuang Qiao) Apple PCIe controller driver: - Add driver for Apple M1 PCIe controller (Alyssa Rosenzweig, Marc Zyngier) Cadence PCIe controller driver: - Return success when probe succeeds instead of falling into error path (Li Chen) HiSilicon Kirin PCIe controller driver: - Reorganize PHY logic and add support for external PHY drivers (Mauro Carvalho Chehab) - Support PERST# GPIOs for HiKey970 external PEX 8606 bridge (Mauro Carvalho Chehab) - Add Kirin 970 support (Mauro Carvalho Chehab) - Make driver removable (Mauro Carvalho Chehab) Intel VMD host bridge driver: - If IOMMU supports interrupt remapping, leave VMD MSI-X remapping enabled (Adrian Huang) - Number each controller so we can tell them apart in /proc/interrupts (Chunguang Xu) - Avoid building on UML because VMD depends on x86 bare metal APIs (Johannes Berg) Marvell Aardvark PCIe controller driver: - Define macros for PCI_EXP_DEVCTL_PAYLOAD_* (Pali Rohár) - Set Max Payload Size to 512 bytes per Marvell spec (Pali Rohár) - Downgrade PIO Response Status messages to debug level (Marek Behún) - Preserve CRS SV (Config Request Retry Software Visibility) bit in emulated Root Control register (Pali Rohár) - Fix issue in configuring reference clock (Pali Rohár) - Don't clear status bits for masked interrupts (Pali Rohár) - Don't mask unused interrupts (Pali Rohár) - Avoid code repetition in advk_pcie_rd_conf() (Marek Behún) - Retry config accesses on CRS response (Pali Rohár) - Simplify emulated Root Capabilities initialization (Pali Rohár) - Fix several link training issues (Pali Rohár) - Fix link-up checking via LTSSM (Pali Rohár) - Fix reporting of Data Link Layer Link Active (Pali Rohár) - Fix emulation of W1C bits (Marek Behún) - Fix MSI domain .alloc() method to return zero on success (Marek Behún) - Read entire 16-bit MSI vector in MSI handler, not just low 8 bits (Marek Behún) - Clear Root Port I/O Space, Memory Space, and Bus Master Enable bits at startup; PCI core will set those as necessary (Pali Rohár) - When operating as a Root Port, set class code to "PCI Bridge" instead of the default "Mass Storage Controller" (Pali Rohár) - Add emulation for PCI_BRIDGE_CTL_BUS_RESET since aardvark doesn't implement this per spec (Pali Rohár) - Add emulation of option ROM BAR since aardvark doesn't implement this per spec (Pali Rohár) MediaTek MT7621 PCIe controller driver: - Add MediaTek MT7621 PCIe host controller driver and DT binding (Sergio Paracuellos) Qualcomm PCIe controller driver: - Add SC8180x compatible string (Bjorn Andersson) - Add endpoint controller driver and DT binding (Manivannan Sadhasivam) - Restructure to use of_device_get_match_data() (Prasad Malisetty) - Add SC7280-specific pcie_1_pipe_clk_src handling (Prasad Malisetty) Renesas R-Car PCIe controller driver: - Remove unnecessary includes (Geert Uytterhoeven) Rockchip DesignWare PCIe controller driver: - Add DT binding (Simon Xue) Socionext UniPhier Pro5 controller driver: - Serialize INTx masking/unmasking (Kunihiko Hayashi) Synopsys DesignWare PCIe controller driver: - Run dwc .host_init() method before registering MSI interrupt handler so we can deal with pending interrupts left by bootloader (Bjorn Andersson) - Clean up Kconfig dependencies (Andy Shevchenko) - Export symbols to allow more modular drivers (Luca Ceresoli) TI DRA7xx PCIe controller driver: - Allow host and endpoint drivers to be modules (Luca Ceresoli) - Enable external clock if present (Luca Ceresoli) TI J721E PCIe driver: - Disable PHY when probe fails after initializing it (Christophe JAILLET) MicroSemi Switchtec management driver: - Return error to application when command execution fails because an out-of-band reset has cleared the device BARs, Memory Space Enable, etc (Kelvin Cao) - Fix MRPC error status handling issue (Kelvin Cao) - Mask out other bits when reading of management VEP instance ID (Kelvin Cao) - Return EOPNOTSUPP instead of ENOTSUPP from sysfs show functions (Kelvin Cao) - Add check of event support (Logan Gunthorpe) Miscellaneous: - Remove unused pci_pool wrappers, which have been replaced by dma_pool (Cai Huoqing) - Use 'unsigned int' instead of bare 'unsigned' (Krzysztof Wilczyński) - Use kstrtobool() directly, sans strtobool() wrapper (Krzysztof Wilczyński) - Fix some sscanf(), sprintf() format mismatches (Krzysztof Wilczyński) - Update PCI subsystem information in MAINTAINERS (Krzysztof Wilczyński) - Correct some misspellings (Krzysztof Wilczyński)" * tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (137 commits) PCI: Add ACS quirk for Pericom PI7C9X2G switches PCI: apple: Configure RID to SID mapper on device addition iommu/dart: Exclude MSI doorbell from PCIe device IOVA range PCI: apple: Implement MSI support PCI: apple: Add INTx and per-port interrupt support PCI: kirin: Allow removing the driver PCI: kirin: De-init the dwc driver PCI: kirin: Disable clkreq during poweroff sequence PCI: kirin: Move the power-off code to a common routine PCI: kirin: Add power_off support for Kirin 960 PHY PCI: kirin: Allow building it as a module PCI: kirin: Add MODULE_* macros PCI: kirin: Add Kirin 970 compatible PCI: kirin: Support PERST# GPIOs for HiKey970 external PEX 8606 bridge PCI: apple: Set up reference clocks when probing PCI: apple: Add initial hardware bring-up PCI: of: Allow matching of an interrupt-map local to a PCI device of/irq: Allow matching of an interrupt-map local to an interrupt controller irqdomain: Make of_phandle_args_to_fwspec() generally available PCI: Do not enable AtomicOps on VFs ...
2021-11-05Merge branch 'pci/sysfs'Bjorn Helgaas
- Check for CAP_SYS_ADMIN before validating sysfs user input, not after (Krzysztof Wilczyński) - Always return -EINVAL from sysfs "store" functions for invalid user input instead of -EINVAL sometimes and -ERANGE others (Krzysztof Wilczyński) - Use kstrtobool() directly instead of the strtobool() wrapper (Krzysztof Wilczyński) * pci/sysfs: PCI: Use kstrtobool() directly, sans strtobool() wrapper PCI/sysfs: Return -EINVAL consistently from "store" functions PCI/sysfs: Check CAP_SYS_ADMIN before parsing user input # Conflicts: # drivers/pci/iov.c
2021-10-18PCI/sysfs: Explicitly show first MSI IRQ for 'irq'Barry Song
The sysfs "irq" file contains the legacy INTx IRQ. Or, if the device has MSI enabled, it contains the first MSI IRQ instead. Previously this file showed the pci_dev.irq value directly. But we'd prefer to use pci_dev.irq only for the INTx IRQ and decouple that from any MSI or MSI-X IRQs. If the device has MSI enabled, explicitly look up and show the first MSI IRQ in the sysfs "irq" file. Otherwise, show the INTx IRQ. This removes the requirement that msi_capability_init() set pci_dev.irq to the first MSI IRQ when enabling MSI and pci_msi_shutdown() restore the INTx IRQ when disabling MSI. [bhelgaas: commit log] Link: https://lore.kernel.org/r/20210825102636.52757-3-21cnbao@gmail.com Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-10-05PCI/sysfs: use NUMA_NO_NODE macroMax Gurtovoy
Use the proper macro instead of hard-coded (-1) value. Suggested-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20211004133453.18881-2-mgurtovoy@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-28PCI/sysfs: Return -EINVAL consistently from "store" functionsKrzysztof Wilczyński
Most of the "store" functions that handle userspace input via sysfs return -EINVAL should the value fail validation and/or type conversion. This error code is a clear message to userspace that the value is not a valid input. However, some of the "show" functions return input parsing error codes as-is, which may be either -EINVAL or -ERANGE. The former would often be from kstrtobool(), and the latter typically from other kstr*() functions such as kstrtou8(), kstrtou32(), kstrtoint(), etc. -EINVAL is commonly returned as the error code to indicate that the value provided is invalid, but -ERANGE is not very useful in userspace. Therefore, normalize the return error code to be -EINVAL for when the validation and/or type conversion fails. Link: https://lore.kernel.org/r/20210915230127.2495723-2-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-09-28PCI/sysfs: Check CAP_SYS_ADMIN before parsing user inputKrzysztof Wilczyński
Check if the "CAP_SYS_ADMIN" capability flag is set before parsing user input as it makes more sense to first check whether the current user actually has the right permissions before accepting any input from such user. This will also make order in which enable_store() and msi_bus_store() perform the "CAP_SYS_ADMIN" capability check consistent with other PCI-related sysfs objects that first verify whether user has this capability set. Link: https://lore.kernel.org/r/20210915230127.2495723-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-09-07Merge tag 'pci-v5.15-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Convert controller drivers to generic_handle_domain_irq() (Marc Zyngier) - Simplify VPD (Vital Product Data) access and search (Heiner Kallweit) - Update bnx2, bnx2x, bnxt, cxgb4, cxlflash, sfc, tg3 drivers to use simplified VPD interfaces (Heiner Kallweit) - Run Max Payload Size quirks before configuring MPS; work around ASMedia ASM1062 SATA MPS issue (Marek Behún) Resource management: - Refactor pci_ioremap_bar() and pci_ioremap_wc_bar() (Krzysztof Wilczyński) - Optimize pci_resource_len() to reduce kernel size (Zhen Lei) PCI device hotplug: - Fix a double unmap in ibmphp (Vishal Aslot) PCIe port driver: - Enable Bandwidth Notification only if port supports it (Stuart Hayes) Sysfs/proc/syscalls: - Add schedule point in proc_bus_pci_read() (Krzysztof Wilczyński) - Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure (Krzysztof Wilczyński) - Return "int" from pciconfig_read() syscall (Krzysztof Wilczyński) Virtualization: - Extend "pci=noats" to also turn on Translation Blocking to protect against some DMA attacks (Alex Williamson) - Add sysfs mechanism to control the type of reset used between device assignments to VMs (Amey Narkhede) - Add support for ACPI _RST reset method (Shanker Donthineni) - Add ACS quirks for Cavium multi-function devices (George Cherian) - Add ACS quirks for NXP LX2xx0 and LX2xx2 platforms (Wasim Khan) - Allow HiSilicon AMBA devices that appear as fake PCI devices to use PASID and SVA (Zhangfei Gao) Endpoint framework: - Add support for SR-IOV Endpoint devices (Kishon Vijay Abraham I) - Zero-initialize endpoint test tool parameters so we don't use random parameters (Shunyong Yang) APM X-Gene PCIe controller driver: - Remove redundant dev_err() call in xgene_msi_probe() (ErKun Yang) Broadcom iProc PCIe controller driver: - Don't fail devm_pci_alloc_host_bridge() on missing 'ranges' because it's optional on BCMA devices (Rob Herring) - Fix BCMA probe resource handling (Rob Herring) Cadence PCIe driver: - Work around J7200 Link training electrical issue by increasing delays in LTSSM (Nadeem Athani) Intel IXP4xx PCI controller driver: - Depend on ARCH_IXP4XX to avoid useless config questions (Geert Uytterhoeven) Intel Keembay PCIe controller driver: - Add Intel Keem Bay PCIe controller (Srikanth Thokala) Marvell Aardvark PCIe controller driver: - Work around config space completion handling issues (Evan Wang) - Increase timeout for config access completions (Pali Rohár) - Emulate CRS Software Visibility bit (Pali Rohár) - Configure resources from DT 'ranges' property to fix I/O space access (Pali Rohár) - Serialize INTx mask/unmask (Pali Rohár) MediaTek PCIe controller driver: - Add MT7629 support in DT (Chuanjia Liu) - Fix an MSI issue (Chuanjia Liu) - Get syscon regmap ("mediatek,generic-pciecfg"), IRQ number ("pci_irq"), PCI domain ("linux,pci-domain") from DT properties if present (Chuanjia Liu) Microsoft Hyper-V host bridge driver: - Add ARM64 support (Boqun Feng) - Support "Create Interrupt v3" message (Sunil Muthuswamy) NVIDIA Tegra PCIe controller driver: - Use seq_puts(), move err_msg from stack to static, fix OF node leak (Christophe JAILLET) NVIDIA Tegra194 PCIe driver: - Disable suspend when in Endpoint mode (Om Prakash Singh) - Fix MSI-X address programming error (Om Prakash Singh) - Disable interrupts during suspend to avoid spurious AER link down (Om Prakash Singh) Renesas R-Car PCIe controller driver: - Work around hardware issue that prevents Link L1->L0 transition (Marek Vasut) - Fix runtime PM refcount leak (Dinghao Liu) Rockchip DesignWare PCIe controller driver: - Add Rockchip RK356X host controller driver (Simon Xue) TI J721E PCIe driver: - Add support for J7200 and AM64 (Kishon Vijay Abraham I) Toshiba Visconti PCIe controller driver: - Add Toshiba Visconti PCIe host controller driver (Nobuhiro Iwamatsu) Xilinx NWL PCIe controller driver: - Enable PCIe reference clock via CCF (Hyun Kwon) Miscellaneous: - Convert sta2x11 from 'pci_' to 'dma_' API (Christophe JAILLET) - Fix pci_dev_str_match_path() alloc while atomic bug (used for kernel parameters that specify devices) (Dan Carpenter) - Remove pointless Precision Time Management warning when PTM is present but not enabled (Jakub Kicinski) - Remove surplus "break" statements (Krzysztof Wilczyński)" * tag 'pci-v5.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (132 commits) PCI: ibmphp: Fix double unmap of io_mem x86/PCI: sta2x11: switch from 'pci_' to 'dma_' API PCI/VPD: Use unaligned access helpers PCI/VPD: Clean up public VPD defines and inline functions cxgb4: Use pci_vpd_find_id_string() to find VPD ID string PCI/VPD: Add pci_vpd_find_id_string() PCI/VPD: Include post-processing in pci_vpd_find_tag() PCI/VPD: Stop exporting pci_vpd_find_info_keyword() PCI/VPD: Stop exporting pci_vpd_find_tag() PCI: Set dma-can-stall for HiSilicon chips PCI: rockchip-dwc: Add Rockchip RK356X host controller driver PCI: dwc: Remove surplus break statement after return PCI: artpec6: Remove local code block from switch statement PCI: artpec6: Remove surplus break statement after return MAINTAINERS: Add entries for Toshiba Visconti PCIe controller PCI: visconti: Add Toshiba Visconti PCIe host controller driver PCI/portdrv: Enable Bandwidth Notification only if port supports it PCI: Allow PASID on fake PCIe devices without TLP prefixes PCI: mediatek: Use PCI domain to handle ports detection PCI: mediatek: Add new method to get irq number ...
2021-09-01Merge tag 'driver-core-5.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core patches for 5.15-rc1. These do change a number of different things across different subsystems, and because of that, there were 2 stable tags created that might have already come into your tree from different pulls that did the following - changed the bus remove callback to return void - sysfs iomem_get_mapping rework Other than those two things, there's only a few small things in here: - kernfs performance improvements for huge numbers of sysfs users at once - tiny api cleanups - other minor changes All of these have been in linux-next for a while with no reported problems, other than the before-mentioned merge issue" * tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits) MAINTAINERS: Add dri-devel for component.[hc] driver core: platform: Remove platform_device_add_properties() ARM: tegra: paz00: Handle device properties with software node API bitmap: extend comment to bitmap_print_bitmask/list_to_buf drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI topology: use bin_attribute to break the size limitation of cpumap ABI lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list sysfs: Rename struct bin_attribute member to f_mapping sysfs: Invoke iomem_get_mapping() from the sysfs open callback debugfs: Return error during {full/open}_proxy_open() on rmmod zorro: Drop useless (and hardly used) .driver member in struct zorro_dev zorro: Simplify remove callback sh: superhyway: Simplify check in remove callback nubus: Simplify check in remove callback nubus: Make struct nubus_driver::remove return void kernfs: dont call d_splice_alias() under kernfs node lock kernfs: use i_lock to protect concurrent inode updates kernfs: switch kernfs to use an rwsem kernfs: use VFS negative dentry caching ...
2021-08-19PCI/sysfs: Use correct variable for the legacy_mem sysfs objectKrzysztof Wilczyński
Two legacy PCI sysfs objects "legacy_io" and "legacy_mem" were updated to use an unified address space in the commit 636b21b50152 ("PCI: Revoke mappings like devmem"). This allows for revocations to be managed from a single place when drivers want to take over and mmap() a /dev/mem range. Following the update, both of the sysfs objects should leverage the iomem_get_mapping() function to get an appropriate address range, but only the "legacy_io" has been correctly updated - the second attribute seems to be using a wrong variable to pass the iomem_get_mapping() function to. Thus, correct the variable name used so that the "legacy_mem" sysfs object would also correctly call the iomem_get_mapping() function. Fixes: 636b21b50152 ("PCI: Revoke mappings like devmem") Link: https://lore.kernel.org/r/20210812132144.791268-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-08-18PCI: Allow userspace to query and set device reset mechanismAmey Narkhede
Add "reset_method" sysfs attribute to enable user to query and set preferred device reset methods and their ordering. [bhelgaas: on invalid sysfs input, return error and preserve previous config, as in earlier patch versions] Co-developed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20210817180500.1253-6-ameynarkhede03@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2021-08-17PCI: Remove reset_fn field from pci_devAmey Narkhede
"reset_fn" indicates whether the device supports any reset mechanism. Remove the use of reset_fn in favor of the reset_methods array that tracks supported reset mechanisms of a device and their ordering. The octeon driver incorrectly used reset_fn to detect whether the device supports FLR or not. Use pcie_reset_flr() to probe whether it supports FLR. Co-developed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20210817180500.1253-5-ameynarkhede03@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2021-08-05sysfs: Rename struct bin_attribute member to f_mappingKrzysztof Wilczyński
There are two users of iomem_get_mapping(), the struct file and struct bin_attribute. The former has a member called "f_mapping" and the latter has a member called "mapping", and both are poniters to struct address_space. Rename struct bin_attribute member to "f_mapping" to keep both meaning and the usage consistent with other users of iomem_get_mapping(). Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20210729233235.1508920-3-kw@linux.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-05sysfs: Invoke iomem_get_mapping() from the sysfs open callbackKrzysztof Wilczyński
Defer invocation of the iomem_get_mapping() to the sysfs open callback so that it can be executed as needed when the binary sysfs object has been accessed. To do that, convert the "mapping" member of the struct bin_attribute from a pointer to the struct address_space into a function pointer with a signature that requires the same return type, and then updates the sysfs_kf_bin_open() to invoke provided function should the function pointer be valid. Also, convert every invocation of iomem_get_mapping() into a function pointer assignment, therefore allowing for the iomem_get_mapping() invocation to be deferred to when the sysfs open callback runs. Thus, this change removes the need for the fs_initcalls to complete before any other sub-system that uses the iomem_get_mapping() would be able to invoke it safely without leading to a failure and an Oops related to an invalid iomem_get_mapping() access. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20210729233235.1508920-2-kw@linux.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>