[Devel] [RFC PATCH vz9 v6 39/62] dm-ploop: prepare bat updates under bat_lock
Alexander Atanasov
alexander.atanasov at virtuozzo.com
Fri Dec 20 18:01:31 MSK 2024
On 20.12.24 15:24, Andrey Zhadchenko wrote:
>
>
> On 12/5/24 22:56, Alexander Atanasov wrote:
>> Prepare for threads. When preparing bat updates there are two important
>> things to protect - md->status MD_DIRTY bit and holes bitmap.
>> Use bat_lock to protect them.
>>
>> https://virtuozzo.atlassian.net/browse/VSTOR-91821
>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>> ---
>> drivers/md/dm-ploop-map.c | 83 ++++++++++++++++++++++++++++-----------
>> 1 file changed, 61 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>> index bb15fa1f854b..0eaab61a0c31 100644
>> --- a/drivers/md/dm-ploop-map.c
>> +++ b/drivers/md/dm-ploop-map.c
>> @@ -555,13 +555,18 @@ static bool ploop_md_make_dirty(struct ploop *ploop, struct md_page *md)
>> unsigned long flags;
>> bool new = false;
>> - spin_lock_irqsave(&ploop->bat_lock, flags);
>> + /*
>> + * The usual pattern is: check for dirty, prepare the bat update,
>> + * and then call make_dirty - to avoid races, callers must lock.
>> + */
>> +
>> + lockdep_assert_held(&ploop->bat_lock);
>> +
>> WARN_ON_ONCE(test_bit(MD_WRITEBACK, &md->status));
>> if (!test_and_set_bit(MD_DIRTY, &md->status)) {
>> llist_add((struct llist_node *)&md->wb_link,
>> &ploop->wb_batch_llist);
>> new = true;
>> }
>> - spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> md->dirty_timeout = jiffies + ploop->md_submit_delay_ms*HZ/1000;
>> return new;
>> @@ -807,6 +812,8 @@ static void ploop_advance_local_after_bat_wb(struct ploop *ploop,
>> dst_clu = piwb->kmpage;
>> + /* holes bit map requires bat_lock */
>> + spin_lock_irqsave(&ploop->bat_lock, flags);
>> spin_lock(&md->md_lock);
>> if (piwb->type == PIWB_TYPE_ALLOC)
>> goto skip_apply;
>> @@ -833,6 +840,7 @@ static void ploop_advance_local_after_bat_wb(struct ploop *ploop,
>> clear_bit(MD_WRITEBACK, &md->status);
>> md->piwb = NULL;
>> spin_unlock(&md->md_lock);
>> + spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> wait_llist_pending = llist_del_all(&md->wait_llist);
>> if (wait_llist_pending) {
>> @@ -1077,10 +1085,22 @@ static int ploop_allocate_cluster(struct ploop *ploop, u32 *dst_clu)
>> u32 clu_size = CLU_SIZE(ploop);
>> loff_t off, pos, end, old_size;
>> struct file *file = top->file;
>> + unsigned long flags;
>> int ret;
>> - if (ploop_find_dst_clu_bit(ploop, dst_clu) < 0)
>> + spin_lock_irqsave(&ploop->bat_lock, flags);
>> + if (ploop_find_dst_clu_bit(ploop, dst_clu) < 0) {
>> + spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> return -EIO;
>> + }
>> + /*
>> + * Mark clu as used. Find & clear bit must be locked,
>> + * since this may be called from different threads.
>> + * Note that set_bit may be called from many places.
>> + * We only care to clear what find got; a parallel set is ok.
>> + */
>> + ploop_hole_clear_bit(*dst_clu, ploop);
>> + spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> pos = CLU_TO_POS(ploop, *dst_clu);
>> end = pos + clu_size;
>> @@ -1094,6 +1114,8 @@ static int ploop_allocate_cluster(struct ploop *ploop, u32 *dst_clu)
>> else
>> ret = ploop_zero_range(file, pos, off - pos);
>> if (ret) {
>> + ploop_hole_set_bit(*dst_clu, ploop);
>> +
>> if (ret != -EOPNOTSUPP)
>> PL_ERR("punch/zero area: %d", ret);
>> else if (ploop->falloc_new_clu)
>> @@ -1108,18 +1130,15 @@ static int ploop_allocate_cluster(struct ploop *ploop, u32 *dst_clu)
>> if (end > old_size) {
>> ret = ploop_truncate_prealloc_safe(ploop, top, end, __func__);
>> - if (ret)
>> + if (ret) {
>> + ploop_hole_set_bit(*dst_clu, ploop);
>> return ret;
>> + }
>> }
>> if (end > top->file_preallocated_area_start)
>> top->file_preallocated_area_start = end;
>
> I think the whole ploop_md_make_dirty() is also racy. If we have parallel
> allocations, we also need to protect the ploop fields. And if there are
> too many allocations at the same time (more than the prealloc size), we
> may end up with racing truncates using different file lengths.
While I agree with you about ploop_md_make_dirty(), I do not think it is
possible to have parallel allocations running: if there are concurrent
ones, one will end up in a delay queue, either because the cluster is
locked or because the md page is dirty. But I will double-check this.
According to the code comments, the problem is a parallel allocation
racing with a discard, but I think the bitmap locking handles that case.
--
Regards,
Alexander Atanasov