diff options
author | Darrick J. Wong <djwong@kernel.org> | 2023-02-14 17:50:48 -0800 |
---|---|---|
committer | Darrick J. Wong <djwong@kernel.org> | 2023-02-14 17:50:48 -0800 |
commit | 571dc9ae4eefb452d32cfb3761a87089e8f37ca7 (patch) | |
tree | 0e222d843e98ce03b4c70fa14f744ae3a44a2507 /fs/xfs/libxfs/xfs_ialloc.c | |
parent | dd07bb8b6baf2389caff221f043d9188ce6bab8c (diff) | |
parent | bd4f5d09cc93c8ca51e4efea86ac90a4bb553d6e (diff) |
Merge tag 'xfs-alloc-perag-conversion' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-6.3-merge-A
xfs: per-ag centric allocation alogrithms
This series continues the work towards making shrinking a filesystem
possible. We need to be able to stop operations from taking place
on AGs that need to be removed by a shrink, so before shrink can be
implemented we need to have the infrastructure in place to prevent
incursion into AGs that are going to be, or are in the process, of
being removed from active duty.
The focus of this is making operations that depend on access to AGs
use the perag to access and pin the AG in active use, thereby
creating a barrier we can use to delay shrink until all active uses
of an AG have been drained and new uses are prevented.
This series starts by fixing some existing issues that are exposed
by changes later in the series. They stand alone, so can be picked
up independently of the rest of this patchset.
The most complex of these fixes is cleaning up the mess that is the
AGF deadlock avoidance algorithm. This algorithm stores the first
block that is allocated in a transaction in tp->t_firstblock, then
uses this to try to limit future allocations within the transaction
to AGs at or higher than the filesystem block stored in
tp->t_firstblock. This depends on one of the initial bug fixes in
the series to move the deadlock avoidance checks to
xfs_alloc_vextent(), and then builds on it to relax the constraints
of the avoidance algorithm to only be active when a deadlock is
possible.
We also update the algorithm to record allocations from higher AGs
that are allocated from, because we when we need to lock more than
two AGs we still have to ensure lock order is correct. Therefore we
can't lock AGs in the order 1, 3, 2, even though tp->t_firstblock
indicates that we've allocated from AG 1 and so AG is valid to lock.
It's not valid, because we already hold AG 3 locked, and so
tp->t-first_block should actually point at AG 3, not AG 1 in this
situation.
It should now be obvious that the deadlock avoidance algorithm
should record AGs, not filesystem blocks. So the series then changes
the transaction to store the highest AG we've allocated in rather
than a filesystem block we allocated. This makes it obvious what
the constraints are, and trivial to update as we lock and allocate
from various AGs.
With all the bug fixes out of the way, the series then starts
converting the code to use active references. Active reference
counts are used by high level code that needs to prevent the AG from
being taken out from under it by a shrink operation. The high level
code needs to be able to handle not getting an active reference
gracefully, and the shrink code will need to wait for active
references to drain before continuing.
Active references are implemented just as reference counts right now
- an active reference is taken at perag init during mount, and all
other active references are dependent on the active reference count
being greater than zero. This gives us an initial method of stopping
new active references without needing other infrastructure; just
drop the reference taken at filesystem mount time and when the
refcount then falls to zero no new references can be taken.
In future, this will need to take into account AG control state
(e.g. offline, no alloc, etc) as well as the reference count, but
right now we can implement a basic barrier for shrink with just
reference count manipulations. As such, patches to convert the perag
state to atomic opstate fields similar to the xfs_mount and xlog
opstate fields follow the initial active perag reference counting
patches.
The first target for active reference conversion is the
for_each_perag*() iterators. This captures a lot of high level code
that should skip offline AGs, and introduces the ability to
differentiate between a lookup that didn't have an online AG and the
end of the AG iteration range.
From there, the inode allocation AG selection is converted to active
references, and the perag is driven deeper into the inode allocation
and btree code to replace the xfs_mount. Most of the inode
allocation code operates on a single AG once it is selected, hence
it should pass the perag as the primary referenced object around for
allocation, not the xfs_mount. There is a bit of churn here, but it
emphasises that inode allocation is inherently an allocation group
based operation.
Next the bmap/alloc interface undergoes a major untangling,
reworking xfs_bmap_btalloc() into separate allocation operations for
different contexts and failure handling behaviours. This then allows
us to completely remove the xfs_alloc_vextent() layer via
restructuring the xfs_alloc_vextent/xfs_alloc_ag_vextent() into a
set of realtively simple helper function that describe the
allocation that they are doing. e.g. xfs_alloc_vextent_exact_bno().
This allows the requirements for accessing AGs to be allocation
context dependent. The allocations that require operation on a
single AG generally can't tolerate failure after the allocation
method and AG has been decided on, and hence the caller needs to
manage the active references to ensure the allocation does not race
with shrink removing the selected AG for the duration of the
operation that requires access to that allocation group.
Other allocations iterate AGs and so the first AG is just a hint -
these do not need to pin a perag first as they can tolerate not
being able to access an AG by simply skipping over it. These require
new perag iteration functions that can start at arbitrary AGs and
wrap around at arbitrary AGs, hence a new set for
for_each_perag_wrap*() helpers to do this.
Next is the rework of the filestreams allocator. This doesn't change
any functionality, but gets rid of the unnecessary multi-pass
selection algorithm when the selected AG is not available. It
currently does a lookup pass which might iterate all AGs to select
an AG, then checks if the AG is acceptible and if not does a "new
AG" pass that is essentially identical to the lookup pass. Both of
these scans also do the same "longest extent in AG" check before
selecting an AG as is done after the AG is selected.
IOWs, the filestreams algorithm can be greatly simplified into a
single new AG selection pass if the there is no current association
or the currently associated AG doesn't have enough contiguous free
space for the allocation to proceed. With this simplification of
the filestreams allocator, it's then trivial to convert it to use
for_each_perag_wrap() for the AG scan algorithm.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
* tag 'xfs-alloc-perag-conversion' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (42 commits)
xfs: refactor the filestreams allocator pick functions
xfs: return a referenced perag from filestreams allocator
xfs: pass perag to filestreams tracing
xfs: use for_each_perag_wrap in xfs_filestream_pick_ag
xfs: track an active perag reference in filestreams
xfs: factor out MRU hit case in xfs_filestream_select_ag
xfs: remove xfs_filestream_select_ag() longest extent check
xfs: merge new filestream AG selection into xfs_filestream_select_ag()
xfs: merge filestream AG lookup into xfs_filestream_select_ag()
xfs: move xfs_bmap_btalloc_filestreams() to xfs_filestreams.c
xfs: use xfs_bmap_longest_free_extent() in filestreams
xfs: get rid of notinit from xfs_bmap_longest_free_extent
xfs: factor out filestreams from xfs_bmap_btalloc_nullfb
xfs: convert trim to use for_each_perag_range
xfs: convert xfs_alloc_vextent_iterate_ags() to use perag walker
xfs: move the minimum agno checks into xfs_alloc_vextent_check_args
xfs: fold xfs_alloc_ag_vextent() into callers
xfs: move allocation accounting to xfs_alloc_vextent_set_fsbno()
xfs: introduce xfs_alloc_vextent_prepare()
xfs: introduce xfs_alloc_vextent_exact_bno()
...
Diffstat (limited to 'fs/xfs/libxfs/xfs_ialloc.c')
-rw-r--r-- | fs/xfs/libxfs/xfs_ialloc.c | 241 |
1 files changed, 111 insertions, 130 deletions
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 5118dedf9267..82ea6aa5af10 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -169,10 +169,9 @@ xfs_inobt_insert_rec( */ STATIC int xfs_inobt_insert( - struct xfs_mount *mp, + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, xfs_agino_t newino, xfs_agino_t newlen, xfs_btnum_t btnum) @@ -182,7 +181,7 @@ xfs_inobt_insert( int i; int error; - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, btnum); + cur = xfs_inobt_init_cursor(pag, tp, agbp, btnum); for (thisino = newino; thisino < newino + newlen; @@ -514,20 +513,20 @@ __xfs_inobt_rec_merge( */ STATIC int xfs_inobt_insert_sprec( - struct xfs_mount *mp, + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, int btnum, struct xfs_inobt_rec_incore *nrec, /* in/out: new/merged rec. */ bool merge) /* merge or replace */ { + struct xfs_mount *mp = pag->pag_mount; struct xfs_btree_cur *cur; int error; int i; struct xfs_inobt_rec_incore rec; - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, btnum); + cur = xfs_inobt_init_cursor(pag, tp, agbp, btnum); /* the new record is pre-aligned so we know where to look */ error = xfs_inobt_lookup(cur, nrec->ir_startino, XFS_LOOKUP_EQ, &i); @@ -609,9 +608,9 @@ error: */ STATIC int xfs_ialloc_ag_alloc( + struct xfs_perag *pag, struct xfs_trans *tp, - struct xfs_buf *agbp, - struct xfs_perag *pag) + struct xfs_buf *agbp) { struct xfs_agi *agi; struct xfs_alloc_arg args; @@ -631,6 +630,7 @@ xfs_ialloc_ag_alloc( args.mp = tp->t_mountp; args.fsbno = NULLFSBLOCK; args.oinfo = XFS_RMAP_OINFO_INODES; + args.pag = pag; #ifdef DEBUG /* randomly do sparse inode allocations */ @@ -662,8 +662,6 @@ xfs_ialloc_ag_alloc( goto sparse_alloc; if (likely(newino != NULLAGINO && (args.agbno < be32_to_cpu(agi->agi_length)))) { - args.fsbno = XFS_AGB_TO_FSB(args.mp, pag->pag_agno, args.agbno); - args.type = XFS_ALLOCTYPE_THIS_BNO; args.prod = 1; /* @@ -684,7 +682,10 @@ xfs_ialloc_ag_alloc( /* Allow space for the inode btree to split. */ args.minleft = igeo->inobt_maxlevels; - if ((error = xfs_alloc_vextent(&args))) + error = xfs_alloc_vextent_exact_bno(&args, + XFS_AGB_TO_FSB(args.mp, pag->pag_agno, + args.agbno)); + if (error) return error; /* @@ -717,22 +718,17 @@ xfs_ialloc_ag_alloc( } else args.alignment = igeo->cluster_align; /* - * Need to figure out where to allocate the inode blocks. - * Ideally they should be spaced out through the a.g. - * For now, just allocate blocks up front. - */ - args.agbno = be32_to_cpu(agi->agi_root); - args.fsbno = XFS_AGB_TO_FSB(args.mp, pag->pag_agno, args.agbno); - /* * Allocate a fixed-size extent of inodes. */ - args.type = XFS_ALLOCTYPE_NEAR_BNO; args.prod = 1; /* * Allow space for the inode btree to split. */ args.minleft = igeo->inobt_maxlevels; - if ((error = xfs_alloc_vextent(&args))) + error = xfs_alloc_vextent_near_bno(&args, + XFS_AGB_TO_FSB(args.mp, pag->pag_agno, + be32_to_cpu(agi->agi_root))); + if (error) return error; } @@ -741,11 +737,11 @@ xfs_ialloc_ag_alloc( * alignment. */ if (isaligned && args.fsbno == NULLFSBLOCK) { - args.type = XFS_ALLOCTYPE_NEAR_BNO; - args.agbno = be32_to_cpu(agi->agi_root); - args.fsbno = XFS_AGB_TO_FSB(args.mp, pag->pag_agno, args.agbno); args.alignment = igeo->cluster_align; - if ((error = xfs_alloc_vextent(&args))) + error = xfs_alloc_vextent_near_bno(&args, + XFS_AGB_TO_FSB(args.mp, pag->pag_agno, + be32_to_cpu(agi->agi_root))); + if (error) return error; } @@ -757,9 +753,6 @@ xfs_ialloc_ag_alloc( igeo->ialloc_min_blks < igeo->ialloc_blks && args.fsbno == NULLFSBLOCK) { sparse_alloc: - args.type = XFS_ALLOCTYPE_NEAR_BNO; - args.agbno = be32_to_cpu(agi->agi_root); - args.fsbno = XFS_AGB_TO_FSB(args.mp, pag->pag_agno, args.agbno); args.alignment = args.mp->m_sb.sb_spino_align; args.prod = 1; @@ -781,7 +774,9 @@ sparse_alloc: args.mp->m_sb.sb_inoalignmt) - igeo->ialloc_blks; - error = xfs_alloc_vextent(&args); + error = xfs_alloc_vextent_near_bno(&args, + XFS_AGB_TO_FSB(args.mp, pag->pag_agno, + be32_to_cpu(agi->agi_root))); if (error) return error; @@ -831,7 +826,7 @@ sparse_alloc: * if necessary. If a merge does occur, rec is updated to the * merged record. */ - error = xfs_inobt_insert_sprec(args.mp, tp, agbp, pag, + error = xfs_inobt_insert_sprec(pag, tp, agbp, XFS_BTNUM_INO, &rec, true); if (error == -EFSCORRUPTED) { xfs_alert(args.mp, @@ -856,20 +851,20 @@ sparse_alloc: * existing record with this one. */ if (xfs_has_finobt(args.mp)) { - error = xfs_inobt_insert_sprec(args.mp, tp, agbp, pag, + error = xfs_inobt_insert_sprec(pag, tp, agbp, XFS_BTNUM_FINO, &rec, false); if (error) return error; } } else { /* full chunk - insert new records to both btrees */ - error = xfs_inobt_insert(args.mp, tp, agbp, pag, newino, newlen, + error = xfs_inobt_insert(pag, tp, agbp, newino, newlen, XFS_BTNUM_INO); if (error) return error; if (xfs_has_finobt(args.mp)) { - error = xfs_inobt_insert(args.mp, tp, agbp, pag, newino, + error = xfs_inobt_insert(pag, tp, agbp, newino, newlen, XFS_BTNUM_FINO); if (error) return error; @@ -981,9 +976,9 @@ xfs_inobt_first_free_inode( */ STATIC int xfs_dialloc_ag_inobt( + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, xfs_ino_t parent, xfs_ino_t *inop) { @@ -999,12 +994,12 @@ xfs_dialloc_ag_inobt( int i, j; int searchdistance = 10; - ASSERT(pag->pagi_init); - ASSERT(pag->pagi_inodeok); + ASSERT(xfs_perag_initialised_agi(pag)); + ASSERT(xfs_perag_allows_inodes(pag)); ASSERT(pag->pagi_freecount > 0); restart_pagno: - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_INO); + cur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_INO); /* * If pagino is 0 (this is the root inode allocation) use newino. * This must work because we've just allocated some. @@ -1429,9 +1424,9 @@ xfs_dialloc_ag_update_inobt( */ static int xfs_dialloc_ag( + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, xfs_ino_t parent, xfs_ino_t *inop) { @@ -1448,7 +1443,7 @@ xfs_dialloc_ag( int i; if (!xfs_has_finobt(mp)) - return xfs_dialloc_ag_inobt(tp, agbp, pag, parent, inop); + return xfs_dialloc_ag_inobt(pag, tp, agbp, parent, inop); /* * If pagino is 0 (this is the root inode allocation) use newino. @@ -1457,7 +1452,7 @@ xfs_dialloc_ag( if (!pagino) pagino = be32_to_cpu(agi->agi_newino); - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_FINO); + cur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_FINO); error = xfs_check_agi_freecount(cur); if (error) @@ -1500,7 +1495,7 @@ xfs_dialloc_ag( * the original freecount. If all is well, make the equivalent update to * the inobt using the finobt record and offset information. */ - icur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_INO); + icur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_INO); error = xfs_check_agi_freecount(icur); if (error) @@ -1577,25 +1572,10 @@ xfs_dialloc_roll( return error; } -static xfs_agnumber_t -xfs_ialloc_next_ag( - xfs_mount_t *mp) -{ - xfs_agnumber_t agno; - - spin_lock(&mp->m_agirotor_lock); - agno = mp->m_agirotor; - if (++mp->m_agirotor >= mp->m_maxagi) - mp->m_agirotor = 0; - spin_unlock(&mp->m_agirotor_lock); - - return agno; -} - static bool xfs_dialloc_good_ag( - struct xfs_trans *tp, struct xfs_perag *pag, + struct xfs_trans *tp, umode_t mode, int flags, bool ok_alloc) @@ -1606,10 +1586,12 @@ xfs_dialloc_good_ag( int needspace; int error; - if (!pag->pagi_inodeok) + if (!pag) + return false; + if (!xfs_perag_allows_inodes(pag)) return false; - if (!pag->pagi_init) { + if (!xfs_perag_initialised_agi(pag)) { error = xfs_ialloc_read_agi(pag, tp, NULL); if (error) return false; @@ -1620,7 +1602,7 @@ xfs_dialloc_good_ag( if (!ok_alloc) return false; - if (!pag->pagf_init) { + if (!xfs_perag_initialised_agf(pag)) { error = xfs_alloc_read_agf(pag, tp, flags, NULL); if (error) return false; @@ -1665,8 +1647,8 @@ xfs_dialloc_good_ag( static int xfs_dialloc_try_ag( - struct xfs_trans **tpp, struct xfs_perag *pag, + struct xfs_trans **tpp, xfs_ino_t parent, xfs_ino_t *new_ino, bool ok_alloc) @@ -1689,7 +1671,7 @@ xfs_dialloc_try_ag( goto out_release; } - error = xfs_ialloc_ag_alloc(*tpp, agbp, pag); + error = xfs_ialloc_ag_alloc(pag, *tpp, agbp); if (error < 0) goto out_release; @@ -1705,7 +1687,7 @@ xfs_dialloc_try_ag( } /* Allocate an inode in the found AG */ - error = xfs_dialloc_ag(*tpp, agbp, pag, parent, &ino); + error = xfs_dialloc_ag(pag, *tpp, agbp, parent, &ino); if (!error) *new_ino = ino; return error; @@ -1737,8 +1719,9 @@ xfs_dialloc( struct xfs_perag *pag; struct xfs_ino_geometry *igeo = M_IGEO(mp); bool ok_alloc = true; + bool low_space = false; int flags; - xfs_ino_t ino; + xfs_ino_t ino = NULLFSINO; /* * Directories, symlinks, and regular files frequently allocate at least @@ -1746,7 +1729,7 @@ xfs_dialloc( * an AG has enough space for file creation. */ if (S_ISDIR(mode)) - start_agno = xfs_ialloc_next_ag(mp); + start_agno = atomic_inc_return(&mp->m_agirotor) % mp->m_maxagi; else { start_agno = XFS_INO_TO_AGNO(mp, parent); if (start_agno >= mp->m_maxagi) @@ -1768,41 +1751,55 @@ xfs_dialloc( } /* + * If we are near to ENOSPC, we want to prefer allocation from AGs that + * have free inodes in them rather than use up free space allocating new + * inode chunks. Hence we turn off allocation for the first non-blocking + * pass through the AGs if we are near ENOSPC to consume free inodes + * that we can immediately allocate, but then we allow allocation on the + * second pass if we fail to find an AG with free inodes in it. + */ + if (percpu_counter_read_positive(&mp->m_fdblocks) < + mp->m_low_space[XFS_LOWSP_1_PCNT]) { + ok_alloc = false; + low_space = true; + } + + /* * Loop until we find an allocation group that either has free inodes * or in which we can allocate some inodes. Iterate through the * allocation groups upward, wrapping at the end. */ - agno = start_agno; flags = XFS_ALLOC_FLAG_TRYLOCK; - for (;;) { - pag = xfs_perag_get(mp, agno); - if (xfs_dialloc_good_ag(*tpp, pag, mode, flags, ok_alloc)) { - error = xfs_dialloc_try_ag(tpp, pag, parent, +retry: + for_each_perag_wrap_at(mp, start_agno, mp->m_maxagi, agno, pag) { + if (xfs_dialloc_good_ag(pag, *tpp, mode, flags, ok_alloc)) { + error = xfs_dialloc_try_ag(pag, tpp, parent, &ino, ok_alloc); if (error != -EAGAIN) break; + error = 0; } if (xfs_is_shutdown(mp)) { error = -EFSCORRUPTED; break; } - if (++agno == mp->m_maxagi) - agno = 0; - if (agno == start_agno) { - if (!flags) { - error = -ENOSPC; - break; - } + } + if (pag) + xfs_perag_rele(pag); + if (error) + return error; + if (ino == NULLFSINO) { + if (flags) { flags = 0; + if (low_space) + ok_alloc = true; + goto retry; } - xfs_perag_put(pag); + return -ENOSPC; } - - if (!error) - *new_ino = ino; - xfs_perag_put(pag); - return error; + *new_ino = ino; + return 0; } /* @@ -1885,14 +1882,14 @@ next: STATIC int xfs_difree_inobt( - struct xfs_mount *mp, + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, xfs_agino_t agino, struct xfs_icluster *xic, struct xfs_inobt_rec_incore *orec) { + struct xfs_mount *mp = pag->pag_mount; struct xfs_agi *agi = agbp->b_addr; struct xfs_btree_cur *cur; struct xfs_inobt_rec_incore rec; @@ -1907,7 +1904,7 @@ xfs_difree_inobt( /* * Initialize the cursor. */ - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_INO); + cur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_INO); error = xfs_check_agi_freecount(cur); if (error) @@ -2019,20 +2016,20 @@ error0: */ STATIC int xfs_difree_finobt( - struct xfs_mount *mp, + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agbp, - struct xfs_perag *pag, xfs_agino_t agino, struct xfs_inobt_rec_incore *ibtrec) /* inobt record */ { + struct xfs_mount *mp = pag->pag_mount; struct xfs_btree_cur *cur; struct xfs_inobt_rec_incore rec; int offset = agino - ibtrec->ir_startino; int error; int i; - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_FINO); + cur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_FINO); error = xfs_inobt_lookup(cur, ibtrec->ir_startino, XFS_LOOKUP_EQ, &i); if (error) @@ -2179,7 +2176,7 @@ xfs_difree( /* * Fix up the inode allocation btree. */ - error = xfs_difree_inobt(mp, tp, agbp, pag, agino, xic, &rec); + error = xfs_difree_inobt(pag, tp, agbp, agino, xic, &rec); if (error) goto error0; @@ -2187,7 +2184,7 @@ xfs_difree( * Fix up the free inode btree. */ if (xfs_has_finobt(mp)) { - error = xfs_difree_finobt(mp, tp, agbp, pag, agino, &rec); + error = xfs_difree_finobt(pag, tp, agbp, agino, &rec); if (error) goto error0; } @@ -2200,15 +2197,15 @@ error0: STATIC int xfs_imap_lookup( - struct xfs_mount *mp, - struct xfs_trans *tp, struct xfs_perag *pag, + struct xfs_trans *tp, xfs_agino_t agino, xfs_agblock_t agbno, xfs_agblock_t *chunk_agbno, xfs_agblock_t *offset_agbno, int flags) { + struct xfs_mount *mp = pag->pag_mount; struct xfs_inobt_rec_incore rec; struct xfs_btree_cur *cur; struct xfs_buf *agbp; @@ -2229,7 +2226,7 @@ xfs_imap_lookup( * we have a record, we need to ensure it contains the inode number * we are looking up. */ - cur = xfs_inobt_init_cursor(mp, tp, agbp, pag, XFS_BTNUM_INO); + cur = xfs_inobt_init_cursor(pag, tp, agbp, XFS_BTNUM_INO); error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &i); if (!error) { if (i) @@ -2263,12 +2260,13 @@ xfs_imap_lookup( */ int xfs_imap( - struct xfs_mount *mp, /* file system mount structure */ - struct xfs_trans *tp, /* transaction pointer */ + struct xfs_perag *pag, + struct xfs_trans *tp, xfs_ino_t ino, /* inode to locate */ struct xfs_imap *imap, /* location map structure */ uint flags) /* flags for inode btree lookup */ { + struct xfs_mount *mp = pag->pag_mount; xfs_agblock_t agbno; /* block number of inode in the alloc group */ xfs_agino_t agino; /* inode number within alloc group */ xfs_agblock_t chunk_agbno; /* first block in inode chunk */ @@ -2276,17 +2274,15 @@ xfs_imap( int error; /* error code */ int offset; /* index of inode in its buffer */ xfs_agblock_t offset_agbno; /* blks from chunk start to inode */ - struct xfs_perag *pag; ASSERT(ino != NULLFSINO); /* * Split up the inode number into its parts. */ - pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino)); agino = XFS_INO_TO_AGINO(mp, ino); agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (!pag || agbno >= mp->m_sb.sb_agblocks || + if (agbno >= mp->m_sb.sb_agblocks || ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { error = -EINVAL; #ifdef DEBUG @@ -2295,20 +2291,14 @@ xfs_imap( * as they can be invalid without implying corruption. */ if (flags & XFS_IGET_UNTRUSTED) - goto out_drop; - if (!pag) { - xfs_alert(mp, - "%s: agno (%d) >= mp->m_sb.sb_agcount (%d)", - __func__, XFS_INO_TO_AGNO(mp, ino), - mp->m_sb.sb_agcount); - } + return error; if (agbno >= mp->m_sb.sb_agblocks) { xfs_alert(mp, "%s: agbno (0x%llx) >= mp->m_sb.sb_agblocks (0x%lx)", __func__, (unsigned long long)agbno, (unsigned long)mp->m_sb.sb_agblocks); } - if (pag && ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { + if (ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { xfs_alert(mp, "%s: ino (0x%llx) != XFS_AGINO_TO_INO() (0x%llx)", __func__, ino, @@ -2316,7 +2306,7 @@ xfs_imap( } xfs_stack_trace(); #endif /* DEBUG */ - goto out_drop; + return error; } /* @@ -2327,10 +2317,10 @@ xfs_imap( * in all cases where an untrusted inode number is passed. */ if (flags & XFS_IGET_UNTRUSTED) { - error = xfs_imap_lookup(mp, tp, pag, agino, agbno, + error = xfs_imap_lookup(pag, tp, agino, agbno, &chunk_agbno, &offset_agbno, flags); if (error) - goto out_drop; + return error; goto out_map; } @@ -2346,8 +2336,7 @@ xfs_imap( imap->im_len = XFS_FSB_TO_BB(mp, 1); imap->im_boffset = (unsigned short)(offset << mp->m_sb.sb_inodelog); - error = 0; - goto out_drop; + return 0; } /* @@ -2359,10 +2348,10 @@ xfs_imap( offset_agbno = agbno & M_IGEO(mp)->inoalign_mask; chunk_agbno = agbno - offset_agbno; } else { - error = xfs_imap_lookup(mp, tp, pag, agino, agbno, + error = xfs_imap_lookup(pag, tp, agino, agbno, &chunk_agbno, &offset_agbno, flags); if (error) - goto out_drop; + return error; } out_map: @@ -2390,14 +2379,9 @@ out_map: __func__, (unsigned long long) imap->im_blkno, (unsigned long long) imap->im_len, XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)); - error = -EINVAL; - goto out_drop; + return -EINVAL; } - error = 0; -out_drop: - if (pag) - xfs_perag_put(pag); - return error; + return 0; } /* @@ -2613,10 +2597,10 @@ xfs_ialloc_read_agi( return error; agi = agibp->b_addr; - if (!pag->pagi_init) { + if (!xfs_perag_initialised_agi(pag)) { pag->pagi_freecount = be32_to_cpu(agi->agi_freecount); pag->pagi_count = be32_to_cpu(agi->agi_count); - pag->pagi_init = 1; + set_bit(XFS_AGSTATE_AGI_INIT, &pag->pag_opstate); } /* @@ -2924,26 +2908,24 @@ xfs_ialloc_calc_rootino( */ int xfs_ialloc_check_shrink( + struct xfs_perag *pag, struct xfs_trans *tp, - xfs_agnumber_t agno, struct xfs_buf *agibp, xfs_agblock_t new_length) { struct xfs_inobt_rec_incore rec; struct xfs_btree_cur *cur; - struct xfs_mount *mp = tp->t_mountp; - struct xfs_perag *pag; - xfs_agino_t agino = XFS_AGB_TO_AGINO(mp, new_length); + xfs_agino_t agino; int has; int error; - if (!xfs_has_sparseinodes(mp)) + if (!xfs_has_sparseinodes(pag->pag_mount)) return 0; - pag = xfs_perag_get(mp, agno); - cur = xfs_inobt_init_cursor(mp, tp, agibp, pag, XFS_BTNUM_INO); + cur = xfs_inobt_init_cursor(pag, tp, agibp, XFS_BTNUM_INO); /* Look up the inobt record that would correspond to the new EOFS. */ + agino = XFS_AGB_TO_AGINO(pag->pag_mount, new_length); error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &has); if (error || !has) goto out; @@ -2964,6 +2946,5 @@ xfs_ialloc_check_shrink( } out: xfs_btree_del_cursor(cur, error); - xfs_perag_put(pag); return error; } |