Age | Commit message (Collapse) | Author |
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In ktest, we try to keep all essential information on test failure in a
single log file - dumping seqres.full to stdout will end up in that log
file.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds a new flag to check which exits immediately after the first
test failure, so as to leave test/scratch devices untouched and make it
easier to debug rare test failures.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Commit 60054d51 ("check: fix excluded tests are only expunged in the
first iteration") changed check to use the exclude_tests array instead
of a file.
The old check for whether a test is in the expunge file used
grep -q $TEST_ID FILE, so it accepted a non-exact match against any of
the lines. A common example: "generic/001 # exclude this test" would be
a match for test generic/001.
The commit regressed this example, because the new code checks for exact
match of [ "generic/001" == "generic/001 " ]. Change the code to match a
regular expression to deal with this case and any other suffix correctly.
NOTE that the original code would have matched test generic/100 with lines
like "generic/1000" when we get to 4 digit seqnum, so the regular
expression does an exact match to the first word of the line.
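The first-word match described above can be sketched as follows. This
is a minimal illustration, not the code from this commit: the helper
name is_excluded and the sample entries are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if $1 is the first word of any of the
# remaining arguments (each one stands for a line of the expunge list).
is_excluded() {
    local test_id="$1"
    shift
    # The id must be followed by whitespace or end of line, so trailing
    # comments are tolerated but "generic/1000" does not match
    # "generic/100".
    local re="^${test_id}([[:space:]]|\$)"
    local line
    for line in "$@"; do
        if [[ "$line" =~ $re ]]; then
            return 0
        fi
    done
    return 1
}

exclude_tests=("generic/001 # exclude this test" "generic/1000")
is_excluded "generic/001" "${exclude_tests[@]}" && echo "generic/001 excluded"
is_excluded "generic/100" "${exclude_tests[@]}" || echo "generic/100 not excluded"
```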
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The filesystem configuration file does not allow you to use symlinks to
real devices, because the existing sanity checks verify that the target
device matches the source. Device mapper links work, but symlinks to
real drives do not.
Using a symlink is desirable if you want tests to persist across
reboots. For example, you may want to use /dev/disk/by-id/nvme-eui.*
to ensure that the same drives are used even after a reboot. This
is very useful if you are testing, for example, in a virtualized
environment and are using PCIe passthrough of one or many NVMe drives
alongside other qemu NVMe drives.
To enable support just add a helper to canonicalize devices prior to
running the tests.
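A minimal sketch of such a canonicalization step, assuming readlink -f
from coreutils is available. The helper name canonicalize_dev is
hypothetical and the helper added by this commit may differ:

```shell
#!/bin/bash
# Hypothetical helper: print the real path of a device if it is a
# symlink; print anything else (e.g. a network filesystem spec)
# unchanged.
canonicalize_dev() {
    local dev="$1"
    if [ -L "$dev" ]; then
        readlink -f "$dev"
    else
        echo "$dev"
    fi
}

# Demonstration with a temporary file standing in for a block device.
workdir=$(mktemp -d)
touch "$workdir/nvme0n1"
ln -s "$workdir/nvme0n1" "$workdir/nvme-eui.example"
canonicalize_dev "$workdir/nvme-eui.example"
rm -rf "$workdir"
```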
This benefits one test runner, kdevops, which I just extended with
support for real NVMe drives: it now uses NVMe EUI symlinks and falls
back to NVMe model + serial symlinks, as not all NVMe drives support
EUIs. The drives it puts in the filesystem configuration can
optionally be NVMe EUI symlinks, so that the same drives are used
across reboots.
For instance this works today with real nvme drives:
mkfs.xfs -f /dev/nvme0n1
mount /dev/nvme0n1 /mnt
TEST_DIR=/mnt TEST_DEV=/dev/nvme0n1 FSTYP=xfs ./check generic/110
FSTYP -- xfs (debug)
PLATFORM -- Linux/x86_64 flax-mtr01 6.5.0-rc3-djwx #rc3 SMP PREEMPT_DYNAMIC Wed Jul 26 14:26:48 PDT 2023
generic/110 2s
Ran: generic/110
Passed all 1 tests
But this does not:
TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
common/rc: retrying test device mount with external set
mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
common/rc: could not mount /dev/disk/by-id/nvme-eui.0035385411904c1e on /mnt
umount /mnt
TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e is mounted but not on TEST_DIR=/mnt - aborting
Already mounted result:
/dev/disk/by-id/nvme-eui.0035385411904c1e /mnt
This change fixes that, allowing the same real drives to be used for a
test over and over across reboots.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Support collecting kernel code coverage information as reported in
debugfs. At the start of each section, we reset the gcov counters;
during the section wrapup, we'll collect the kernel gcov data.
If lcov is installed and the kernel source code is available, it will
also generate a nice html report. If a CLI web browser is available, it
will also format the html report into text for easy grepping.
This requires the test runner to set REPORT_GCOV=1 explicitly and gcov
to be enabled in the kernel.
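As a rough sketch of the reset step: the kernel's gcov debugfs
interface resets all counters when its reset file is written to. The
helper name gcov_reset is hypothetical, and the directory is a
parameter only so that the function can be tried against a scratch
directory.

```shell
#!/bin/bash
# Hypothetical helper: reset the kernel gcov counters by writing to the
# debugfs reset file. Returns 1 if the file is not writable (gcov not
# enabled, debugfs not mounted, or insufficient privileges).
gcov_reset() {
    local gcov_dir="${1:-/sys/kernel/debug/gcov}"
    if [ -w "$gcov_dir/reset" ]; then
        echo 1 > "$gcov_dir/reset"
    else
        echo "gcov reset file not writable: $gcov_dir/reset" >&2
        return 1
    fi
}

# Only touch gcov when the test runner asked for it explicitly.
if [ "${REPORT_GCOV:-0}" = "1" ]; then
    gcov_reset || echo "kernel gcov support not available" >&2
fi
```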
Cc: tytso@mit.edu
Cc: kent.overstreet@linux.dev
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
If iterating more than once and excluding some tests, the
excluded tests are expunged in the first iteration, but run in
subsequent iterations. This is not expected.
The problem was caused by the temporary file saving the excluded
tests being deleted by `rm -f $tmp.*` in _wrapup() at the end of
the first iteration.
This commit saves the excluded tests into a variable instead of a
temporary file.
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@foxmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
kvm-xfstests invokes check -n twice to pre-process and generate the
tests-to-run list, which is then passed as a long list of tests
when invoking check on the command line.
check invokes dirname, basename and sed several times per test just
for doing basic string prefix/suffix trimming.
Use bash string pattern matching instead which is much faster.
Note that the following pattern matching expression change:
< test_dir=${test_dir#$SRC_DIR/*}
> t=${t#$SRC_DIR/}
does not change the meaning of the expression, because the
shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
and removing the tests/ prefix is what this code intended to do.
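For illustration, the kind of trimming that previously needed dirname,
basename or sed can be done with parameter expansion alone. The sample
path and the test_name variable are made up for this sketch:

```shell
#!/bin/bash
SRC_DIR="tests"
t="tests/generic/001"

t=${t#$SRC_DIR/}      # strip the leading "tests/"   -> generic/001
test_dir=${t%/*}      # like dirname, without a fork -> generic
test_name=${t##*/}    # like basename                -> 001

echo "$t $test_dir $test_name"
```

No external process is spawned for any of these expansions, which is
where the time saving comes from when the list of tests is long.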
With check -n, there is no need to cleanup the results dir,
but check -n is doing that for every single listed test.
Move the cleanup of results dir to before actually running the test.
These improvements to check pre-test code cut down several minutes
from the time until tests actually start to run with kvm-xfstests.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Make it so that test runners can schedule long soak stress test programs
for an exact number of seconds by setting the SOAK_DURATION config
variable. Change the definition of the 'soak' test to specify that
these tests can be controlled via SOAK_DURATION.
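A sketch of how a stress loop might honour such a duration, assuming
SOAK_DURATION holds a plain integer number of seconds. run_one_pass and
soak are hypothetical names, and bash's SECONDS counter only has
one-second granularity, so the bound is approximate:

```shell
#!/bin/bash
# Hypothetical stand-in for one iteration of a stress workload.
run_one_pass() { :; }

# Run passes until roughly $1 seconds have elapsed; print the number of
# passes completed.
soak() {
    local duration="$1"
    local deadline=$((SECONDS + duration))
    local passes=0
    while [ "$SECONDS" -lt "$deadline" ]; do
        run_one_pass
        passes=$((passes + 1))
    done
    echo "$passes"
}

# The 1 second default is only for this demonstration.
soak "${SOAK_DURATION:-1}" > /dev/null
```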
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Add support for the AFS filesystem. AFS is a network filesystem and there
are a number of features it doesn't support.
- No mkfs. (Kind of. An AFS volume server can be asked to create a new
volume, but that's probably best left to AFS-specific test suites.
Further, a volume would need to be destroyed before another of the same
name could be created; it's not simply a matter of overwriting the old
one as it is on a blockdev with a block-based filesystem.)
- No fsck. (Kind of - the server can be asked to salvage a volume, but it
may involve taking the server offline).
- No richacls. AFS has its own ACL system.
- No atimes.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Previously, we would only run _check_filesystems to ensure that a test
that appeared to pass did not have any filesystem corruption. However,
in _check_filesystems, we also repair any errors found in the filesystem.
Let's do this even if we already know the test failed so that subsequent
tests aren't affected.
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
If the test device gets corrupted, all subsequent tests will fail. To
prevent this from rendering all subsequent tests useless, try to
repair the file system on TEST_DEV if possible. We don't need to do
this with the scratch device since that file system gets recreated
each time anyway.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Report two new timestamps in the xml report: the time that ./check was
started, and the time that the report was generated. We introduce new
timestamps to minimize breakage with parsing scripts.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
We've never specified what the timestamp attribute of the testsuite
element actually means, and its history is rather murky.
Prior to the introduction of the xml report format in commit f9fde7db2f,
the "date_time" variable was used only to scrape dmesg via the /dev/kmsg
device after each test. If /dev/kmsg was not a writable path, the
variable was not set at all. In this case, the report timestamp would
be blank.
In commit ffdecf7498a1, Ted changed the xunit report code to handle
empty date_time values by setting date_time to the time of report
generation. This change was done to handle the case where no tests are
run at all. However, it did not change the behavior that date_time is
not set if /dev/kmsg is not writable.
Clear up all this confusion by defining the timestamp attribute to
reflect the start time of the most recent test, regardless of the state
of /dev/kmsg. If no tests are run, then define the attribute to be the
time of report generation.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Generate the section report between tests so that the summary report
always reflects the outcome of the most recent test. Two usecases are
envisioned here -- if a cluster-based test runner anticipates that the
testrun could crash the VM, they can set REPORT_DIR to (say) an NFS
mount to preserve the intermediate results. If the VM does indeed
crash, the scheduler can examine the state of the crashed VM and move
the tests to another VM. The second usecase is a reporting agent that
runs in the VM to upload live results to a test dashboard.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Leah Rumancik <leah.rumancik@gmail.com>
Tested-by: Leah Rumancik <leah.rumancik@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
This allows using any fuse filesystem that can be mounted with
mount -t fuse$FUSE_SUBTYP ...
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jakob Unterwurzacher <jakobunt@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Generally, FSTYP is used to specify OVL_BASE_FSTYP. When we specify FSTYP
through an environment variable, it is not converted to OVL_BASE_FSTYP.
In addition, sometimes we do not even specify the filesystem type. For
example, we only use `./check -n -overlay -g auto` to list
overlay-related cases.
If OVL_BASE_FSTYP is NULL, mounting fails and the test fails.
To solve this problem, try to assign a value to OVL_BASE_FSTYP when
specifying -overlay. In addition, in the _overlay_base_mount function,
the basic file system type of the overlay is specified only when
OVL_BASE_FSTYP is not NULL.
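The "only when OVL_BASE_FSTYP is not NULL" part can be sketched with
the ${var:+word} expansion. The helper below is hypothetical and only
prints the mount command it would run; the device and mount point are
made up:

```shell
#!/bin/bash
# Hypothetical helper: print the base mount command instead of running it.
overlay_base_mount_cmd() {
    local dev="$1" mnt="$2"
    # ${var:+word} expands to word only when var is set and non-empty,
    # so "-t <type>" is left out when the base fs type is unknown.
    echo mount ${OVL_BASE_FSTYP:+-t "$OVL_BASE_FSTYP"} "$dev" "$mnt"
}

OVL_BASE_FSTYP=xfs
overlay_base_mount_cmd /dev/sdb1 /mnt/base   # mount -t xfs /dev/sdb1 /mnt/base
OVL_BASE_FSTYP=
overlay_base_mount_cmd /dev/sdb1 /mnt/base   # mount /dev/sdb1 /mnt/base
```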
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Make sure tmp.arglist is wiped before each run to avoid
accidentally rerunning tests.
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
sect_stop is normally set immediately prior to calling _wrapup() via
run_section(). However, when called via a trap signal handler,
sect_stop may be uninitialized, leading to a negative section time
(sect_stop - sect_start) in the xunit report. E.g.
Interrupted!
Passed all 1 tests
Xunit report: /home/david/xfstests/results//result.xml
rapido1:/# head /home/david/xfstests/results//result.xml
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="xfstests" failures="0" skipped="0" tests="1"
time="-1670885797" ... >
This commit uses the existing $interrupt flag to determine when
sect_stop needs to be initialised.
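A reduced sketch of the idea, assuming interrupt holds "true" or
"false". section_elapsed is a hypothetical name and date +%s stands in
for whatever clock helper check actually uses:

```shell
#!/bin/bash
# Print the section run time. When we got here via the signal handler,
# sect_stop has not been set by the normal path, so set it now.
section_elapsed() {
    if $interrupt; then
        sect_stop=$(date +%s)
    fi
    echo $((sect_stop - sect_start))
}

sect_start=$(date +%s)
sect_stop=0        # stands in for the uninitialised value
interrupt=true
section_elapsed    # non-negative instead of -<epoch seconds>
```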
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
[BUG]
When KEEP_DMESG=yes is specified, passed test cases will also keep their
$seqres.dmesg files.
However for failed test cases (caused by _fail calls), their dmesg files
are not saved at all:
# rm -rf results/btrfs/219*
# ./check btrfs/219
# ls results/btrfs/219*
results/btrfs/219.full results/btrfs/219.out.bad
[CAUSE]
$seqres.dmesg is created (and later deleted depending on config) by
_check_dmesg() function.
But if a test case fails by calling _fail, then we no longer call
_check_dmesg(), thus no dmesg will be saved, whatever the config is.
[FIX]
If the test case itself failed, then still call _check_dmesg() to either
save the dmesg unconditionally (KEEP_DMESG=yes case), or save the dmesg
if there is something wrong (default).
The dmesg can be a pretty handy debugging clue in both cases.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
While trying to do
./check -s <some section>
I was failing because I had a section defined higher than <some section>
that had TEST_DEV=/some/nonexistent/device, since I was using the other
section to test an experimental drive. This appears to be because we
run through all of the sections, and when getting the section config we
check to see if it's valid, and in this case the section wasn't valid.
The section I was actually trying to use was valid however. Fix check
to see if the section we're trying to run is in our list of sections to
run first, and then if it is get the config at that point.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
coredumps generated by fstests might have names that are longer than
just "core". Since the pid isn't all that useful by itself, let's
record the coredumps by hash when we save them, so that we don't waste
space storing identical crash dumps.
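A sketch of saving by content hash. The helper name, the core.<hash>
naming scheme and the choice of sha256sum are assumptions made for
illustration, not necessarily what this commit uses:

```shell
#!/bin/bash
# Hypothetical helper: move coredump $1 into directory $2 under a name
# derived from its content hash, dropping it if an identical dump has
# already been saved.
save_core_by_hash() {
    local core="$1" outdir="$2"
    local hash
    hash=$(sha256sum "$core" | cut -d ' ' -f 1)
    local dest="$outdir/core.$hash"
    if [ -e "$dest" ]; then
        rm -f "$core"
    else
        mv "$core" "$dest"
    fi
}
```

Two dumps named core.100 and core.200 with identical contents then end
up as a single file in the output directory.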
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Since this grep commit:
commit a9515624709865d480e3142fd959bccd1c9372d1
Author: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun Aug 15 10:52:13 2021 -0700
egrep, fgrep: now obsolete
egrep will trigger a warning like:
+egrep: warning: egrep is obsolescent; using grep -E
This will break many golden outputs.
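The replacement is mechanical; a small example with made-up input:

```shell
#!/bin/bash
# Before: printf 'foo\nbar\nbaz\n' | egrep 'foo|baz'
# After: same matches, no obsolescence warning in the output.
printf 'foo\nbar\nbaz\n' | grep -E 'foo|baz'
```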
Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The goal of an EIO shutdown test is to examine the shutdown and recovery
behavior if we make the underlying storage device return EIO. On XFS,
it's possible that the shutdown will come from a thread that cancels a
dirty transaction due to the EIO. This is expected behavior, but
_check_dmesg will flag it as a test failure.
Make it so that we can add simple regexps to the default check_dmesg
filter function, then add the "Internal error" string to filter function
when we invoke an EIO test. This fixes periodic regressions in
generic/019 and generic/475.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The xunit-quiet format excludes the NNN.{full,dmesg,bad} files in
<system-out> and <system-err> nodes which are included in the xunit
report format.
For test runners that save the entire results directory to preserve
all of the test artifacts, capturing the NNN.{full,dmesg,bad} in the
results.xml file is redundant. In addition, if the NNN.bad is too
large, it can cause the junitparser python library to refuse to parse
the XML file to prevent potential denial of service attacks[1]. A
simple way to avoid this problem is to omit the <system-out>
and <system-err> nodes in the results.xml file.
[1] https://gitlab.com/gitlab-org/gitlab/-/issues/268035
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
If check is run with -L <n>, then a failed test will be rerun <n> times
before proceeding to the next test. Following completion of the rerun
loop, aggregate pass/fail statistics are printed.
Rerun tests will be tracked as a single failure in overall pass/fail
metrics (via @try and @bad), with .out.bad, .dmesg, .core, .hints,
.notrun and .full saved using a .rerun# suffix.
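The rerun behaviour can be sketched roughly as below. run_with_reruns
is a hypothetical helper, not the loop in check, and it omits the
saving of the .rerun# artifacts:

```shell
#!/bin/bash
# Run "$2..." once; if it fails, rerun it $1 more times and print
# aggregate statistics. The whole thing still counts as one failure.
run_with_reruns() {
    local max_reruns="$1"
    shift
    local pass=0 fail=0 i
    if "$@"; then
        return 0
    fi
    fail=1
    for ((i = 1; i <= max_reruns; i++)); do
        if "$@"; then
            pass=$((pass + 1))
        else
            fail=$((fail + 1))
        fi
    done
    echo "pass=$pass fail=$fail"
    return 1
}
```

A test that always fails and is rerun three times reports
"pass=0 fail=4"; a test that passes on the first attempt is not rerun
at all.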
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lwn.net/Articles/897061/
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Currently the @try, @bad and @notrun arrays are appended with seqnum at
different points in the main run_section() loop:
- @try: shortly prior to test script execution
- @notrun: on list (check -n), or after .notrun flagged test completion
- @bad: at the start of subsequent test loop and loop exit
For future loop-test-following-failure functionality it makes sense to
combine some of these steps. This change moves both @notrun and @bad
appends into a helper function which is called at the end of each loop
iteration.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The variables aren't used outside of function scope. Also convert one
timestamp output to use the helper.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Report generation currently involves reaching into a whole bunch of
globals for things like section name and start/end times. Pass these
through as explicit function parameters to avoid unintentional breakage.
One minor fix included is the default xunit error message, which used
$sequm instead of $seqnum.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
There are a number of fstests that employ special (and now unsupported)
XFS filesystem configurations to perform testing in a controlled
environment. The presence of the QA_CHECK_FS and MSGVERB variables is
used by mkfs.xfs to detect that it's running inside fstests, which
enables the unsupported configurations. Nobody else should be using
filesystems with tiny logs, non-redundant superblocks, or smaller than
the (new) minimum supported size.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
tc_status can be used for both of these.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The separate n_try, n_bad and n_notrun counters are unnecessary when
the corresponding lists are switched to bash arrays.
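For illustration (the test names are made up), the element count of a
bash array makes a separate counter redundant:

```shell
#!/bin/bash
try=()
bad=()
notrun=()

try+=("generic/001")
try+=("generic/002")
try+=("generic/003")
bad+=("generic/002")
notrun+=("generic/003")

echo "Ran: ${try[*]}"
echo "Failed ${#bad[@]} of ${#try[@]} tests; not run: ${#notrun[@]}"
```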
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
The xunit "section report" provides a tests attribute, which according
to https://llg.cubic.org/docs/junit/ represents:
tests="" <!-- The total number of tests in the suite, required. -->
The current value is generated as a sum of the $n_try and $n_notrun
counters. This is incorrect as the $n_try counter already includes tests
which are run but complete with _notrun.
One special case exists for $showme (check -n), where $n_try remains
zero, so $n_notrun can be used as-is.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
There's no need to use grep and awk when the latter can do all that's
needed, including the pretty printing.
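A sketch of the pattern, assuming a file of "<test> <seconds>" lines.
The file layout, the helper name lookup_time and the output format are
assumptions for illustration:

```shell
#!/bin/bash
# Two processes:
#   grep "^$id " "$file" | awk '{ printf " %ss\n", $2 }'
# One process doing both the matching and the formatting:
lookup_time() {
    local file="$1" id="$2"
    awk -v id="$id" '$1 == id { printf " %ss\n", $2 }' "$file"
}

times=$(mktemp)
printf 'generic/001 5\ngeneric/002 12\n' > "$times"
lookup_time "$times" "generic/002"
rm -f "$times"
```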
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Introduce helpers _fixed_by_{kernel,git}_commit() and
_fixed_in_{kernel_,}version() that can be used to hint to testers why a
test might be failing and to aid in auto-generating expunge lists for
downstream kernel testing.
A test may be annotated with multiple hints, for example:
_fixed_by_kernel_commit 09889695864 xfs: foo
_fixed_by_kernel_commit 46464565465 ext4: bar
_fixed_in_version xfsprogs v5.15
Annotate fix kernel commits for some overlayfs tests.
Annotate fix kernel version for some overlayfs tests testing
for legacy behavior whose fixes are not likely to be backported
to stable kernels.
This is modeled after LTP's 'make filter-known-fails' and
print_failure_hints() using struct tst_tag annotations.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
|
|
Dave Chinner complained that fstests really shouldn't be running at
-1000 oom score adjustment because that makes it more "important" than
certain system daemons (e.g. journald, udev). That's true, so increase
it to -500.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Unmount the scratch filesystem if a test decides to _notrun itself
because _try_wipe_scratch_devs will not be able to wipe the scratch
device prior to the next test run. We don't want to let scratch state
from one test leak into subsequent tests if we can help it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
While running fstests one night, I observed that fstests stopped
abruptly because ./check ran _check_filesystems to run xfs_repair.
In turn, repair (which inherited oom_score_adj=-1000 from ./check)
consumed so much memory that the OOM killer ran around killing other
daemons, rendering the system nonfunctional.
This is silly -- we set an OOM score adjustment of -1000 on the
./check process so that the test framework itself wouldn't get
OOM-killed, because that aborts the entire run. Everything else is
fair game for that, including subprocesses started by
_check_filesystems.
Therefore, adapt _check_filesystems (and its children) to run in a
subshell with a much higher oom score adjustment.
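The subshell trick can be sketched like this (Linux only). The helper
name run_oom_killable and the value 250 are illustrative:

```shell
#!/bin/bash
# Run "$2..." in a subshell whose oom_score_adj is set to $1. The
# adjustment is inherited by the command and its children, while the
# calling shell keeps its own value.
run_oom_killable() {
    local adj="$1"
    shift
    (
        if [ -w /proc/self/oom_score_adj ]; then
            # Inside the subshell, /proc/self is the subshell itself.
            echo "$adj" 2>/dev/null > /proc/self/oom_score_adj || true
        fi
        "$@"
    )
}

if [ -r /proc/self/oom_score_adj ]; then
    run_oom_killable 250 cat /proc/self/oom_score_adj
fi
```

Raising the adjustment needs no special privileges, which is why this
works even when check itself runs with a strongly negative value.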
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
If check is passed an invalid command line option, exit with a
non-zero exit code so that a script calling check can detect the
failure. The check script already performs an "exit 1" if a valid
option has an invalid argument, so this is consistent with existing
practice.
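A heavily reduced sketch of the pattern. parse_args is a hypothetical
name and only one real option is shown:

```shell
#!/bin/bash
tests=()
showme=false

parse_args() {
    while [ $# -gt 0 ]; do
        case "$1" in
        -n) showme=true ;;
        -*) # Unknown option: complain and fail so callers notice.
            echo "unknown option: $1" >&2
            exit 1 ;;
        *)  tests+=("$1") ;;
        esac
        shift
    done
}
```

A wrapper script can then rely on the exit status instead of scraping
the usage text.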
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Convert the ./check script to use the automatically generated group list
membership files, as the transition is now complete.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Currently, with the -i <n> option, a test can run for many iterations.
If we want to stop iterating when a failure occurs, it is much easier
to have an option that checks the failed status and stops the test
from proceeding further.
This patch adds such an option (-I <n>), thereby extending the -i <n>
option functionality.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Introduce a new --exact-order switch to disable all sorting, filtering
of repeated lines, and shuffling of test order. The goal of this is to
be able to run tests in a specific order, namely to try to reproduce
test failures that could be the result of a -r(andomize) run getting
lucky.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Don't abort the whole test run if we asked to exclude groups that aren't
included in the candidate group list, since we actually /are/ satisfying
the user's request.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
This enables us to mask off specific tests.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
If TEST_DEV is recreated by check, the FSTYP previously derived from
TEST_DEV could change too and might no longer reflect reality.
So source common/rc again with the correct FSTYP to get fs-specific
configs, e.g. common/xfs.
For example, using the config file with sections shown below and running
section ext4 first and then xfs, you can see:
our local _scratch_mkfs routine ...
./common/rc: line 825: _scratch_mkfs_xfs: command not found
check: failed to mkfs $SCRATCH_DEV using specified options
local.config:
[default]
RECREATE_TEST_DEV=true
TEST_DEV=/dev/sda5
SCRATCH_DEV=/dev/sda6
TEST_DIR=/mnt/test
SCRATCH_MNT=/mnt/scratch
[ext4]
MKFS_OPTIONS="-b 4096"
FSTYP=ext4
[xfs]
FSTYP=xfs
MKFS_OPTIONS="-f -b size=4k"
Tested-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
|
|
TLDR: If systemd is available, run each test in its own temporary
systemd scope. This enables the test harness to forcibly clean up all
of the test's child processes (if it does not do so itself) so that we
can move into the post-test unmount and check cleanly.
I frequently run fstests in "low" memory situations (2GB!) to force the
kernel to do interesting things. There are a few tests like generic/224
and generic/561 that put processes in the background and occasionally
trigger the OOM killer. Most of the time the OOM killer correctly
shoots down fsstress or duperemove, but once in a while it's stupid
enough to shoot down the test control process (i.e. tests/generic/224)
instead. fsstress is still running in the background, and the one
process that knew about that is dead.
When the control process dies, ./check moves on to the post-test fsck,
which fails because fsstress is still running and we can't unmount.
After fsck fails, ./check moves on to the next test, which fails because
fsstress is /still/ writing to the filesystem and we can't unmount or
format.
The end result is that that one OOM kill causes cascading test failures,
and I have to re-start fstests to see if I get a clean(er) run.
So, the solution I present in this patch is to teach ./check to try to
run the test script in a systemd scope. If that succeeds, ./check will
tell systemd to kill the scope when the test script exits and returns
control to ./check. Concretely, this means that systemd creates a new
cgroup, stuffs the processes in that cgroup, and when we kill the scope,
systemd kills all the processes in that cgroup and deletes the cgroup.
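A rough sketch of the mechanism. run_in_scope is a hypothetical helper
and the HAVE_SYSTEMD_SCOPES switch is an assumption used here to select
the fallback path; systemd-run --scope and systemctl stop are the only
systemd commands relied on:

```shell
#!/bin/bash
# Run "$2..." inside a transient systemd scope named $1 when scopes are
# available, then stop the scope so that everything left in its cgroup
# is killed. Without systemd, just run the command directly.
run_in_scope() {
    local unit="$1"
    shift
    if [ "${HAVE_SYSTEMD_SCOPES:-no}" = "yes" ]; then
        local ret=0
        systemd-run --quiet --unit "$unit" --scope "$@" || ret=$?
        systemctl stop "$unit" &> /dev/null || true
        return $ret
    fi
    "$@"
}
```

For example, `run_in_scope fstests-demo.scope ./tests/generic/224`
would leave no stray fsstress behind even if the test script itself
were OOM-killed.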
The end result is that fstests now has an easy way to ensure that /all/
child processes of a test are dead before we try to unmount the test and
scratch devices. I've designed this to be optional, because not
everyone does or wants or likes to run systemd, but it makes QA easier.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
We have some places that refer to the variable OPTIONS_HAVE_SECTIONS
as OPTIONS_HAVE_SECIONS, obviously a typo. So fix them.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Right now we only track check.log and check.time globally; it would
be nice to do it per-section as well. This makes it easier to parse
results from systems that run a bunch of different configurations at
once.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|
|
Optionally reload the module between each test to try to pinpoint slab
cache errors and whatnot.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
|