1

I am trying to write a remapping target for usage with DM.

I followed instructions from several places (including this Answer) all essentially giving the same code.

This is ok, but not enough for me.

I need to modify "in transit" data of struct bio being remapped.

This means I need to make a deep-clone of the bio, including the data; apparently the provided functions (e.g.: bio_clone_bioset()) do not copy data at all, but point iovec's to the original pages/offsets.

I tried some variations of the following scheme:

void
mt_copy(struct bio *dst, struct bio *src) {
    struct bvec_iter src_iter, dst_iter;
    struct bio_vec src_bv, dst_bv;
    void *src_p, *dst_p;
    unsigned bytes;
        unsigned salt;

    src_iter = src->bi_iter;
    dst_iter = dst->bi_iter;
        salt = src_iter.bi_sector;

    while (1) {
        if (!src_iter.bi_size) {
            break;
        }

        if (!dst_iter.bi_size) {
            break;
        }

        src_bv = bio_iter_iovec(src, src_iter);
        dst_bv = bio_iter_iovec(dst, dst_iter);

        bytes = min(src_bv.bv_len, dst_bv.bv_len);

        src_p = kmap_atomic(src_bv.bv_page);
        dst_p = kmap_atomic(dst_bv.bv_page);

        memcpy(dst_p + dst_bv.bv_offset, src_p + src_bv.bv_offset, bytes);

        kunmap_atomic(dst_p);
        kunmap_atomic(src_p);

        bio_advance_iter(src, &src_iter, bytes);
        bio_advance_iter(dst, &dst_iter, bytes);
    }
}

static struct bio *
mt_clone(struct bio *bio) {
        struct bio    *clone;

        clone = bio_clone_bioset(bio, GFP_KERNEL, NULL);
        if (!clone) {
                return NULL;
        }
        if (bio_alloc_pages(clone, GFP_KERNEL)) {
                bio_put(clone);
                return NULL;
        }

        clone->bi_private = bio;

        if (bio_data_dir(bio) == WRITE) {
                mt_copy(clone, bio);
        }

        return clone;
}

static int
mt_map(struct dm_target *ti, struct bio *bio) {
        struct mt_private *mdt = (struct mt_private *) ti->private;

        bio->bi_bdev = mdt->dev->bdev;

        bio = mt_clone(bio);
        submit_bio(bio->bi_rw, bio);

        return DM_MAPIO_SUBMITTED;
}

This, however, does not work.

When I submit_bio() using the cloned bio I do not get the .end_io call and the calling task becomes blocked ("INFO: task mount:488 blocked for more than 120 seconds."). This with a READ request consisting of a single iovec (1024 bytes). In this case, of course the in buffers do not need copying because they should be overwritten; I need to copy back the incoming data unto the original buffers after the request has completed... but I don't get there.

I'm quite evidently missing some piece, but I'm unable to understand what.

Note: I didn't do any optimization (e.g.: use smarter allocation strategies) specifically because I need to get the basics first.

Note: I corrected a mistake (thanks @RuslanRLaishev), unfortunately ininfluent; see my own answer.

ZioByte
  • 2,690
  • 1
  • 32
  • 68
  • 2
    Are you sure you are using the correct version of [struct bio](https://elixir.free-electrons.com/linux/latest/source/include/linux/blk_types.h) for your kernel. The one in the link you provide seems to be several revisions old. (this was the only thing that stood out immediately) – David C. Rankin Jan 02 '18 at 02:11
  • 2
    @DavidC.Rankin: I hope I didn't goof on that. I'm using kernel v4.4.9, which surely isn't bleeding edge, but is what is on my target (an oldish at91samg25). – ZioByte Jan 02 '18 at 02:29
  • 2
    The only reason I commented was the link you provided still used the old hard-coded 512b sector size which is absent in the current version (I suspect to handled the default 512/4096 sector sizes). The changes occurred well before 4.4.9, so I'd double check the version of `struct bio` if you are using the one from the link you provided. – David C. Rankin Jan 02 '18 at 03:05
  • 2
    @DavidC.Rankin: the code, before my attempts to clone `struct bio` was working as advertised... unfortunately it was also doing nothing useful. I actually started from [this code](https://github.com/huwan/dm-target/blob/master/mapper.c) which is very similar. As said my first attempts with code just resubmitting the original bio, were successful. I seem unable to make even a plain clone (useless, of course, but a first step). Thanks for Your patience... why are you saying we are using hard-coded 512b? that is just for loopback setup and I am not using that at all, my target device is an SD. – ZioByte Jan 02 '18 at 04:18

2 Answers2

1

It's correct ?

if (bio_alloc_pages(**bio**, GFP_KERNEL)) {
                bio_put(clone);
                return NULL;
        }

or

if (bio_alloc_pages(**clone**, GFP_KERNEL)) {
                bio_put(bio);
                return NULL;
        }
  • 2
    Good catch, you're half right. It should be if (bio_alloc_pages(**clone**, GFP_KERNEL)) { bio_put(**clone**); return NULL; } Unfortunately this has nothing to do with the root problem. Have a look to my own answer. – ZioByte Jan 12 '18 at 14:10
  • 1
    It's looks like that I have a task like your, I process data to be written in the make_request_fn(), and read data in the bi_end_io() routine. To keep an original data (WRITE request) I cloning bvecs only. Chang bi_io_vec to copy? and restore it in the bio_end_io(). Hope this will help you. –  Jan 12 '18 at 14:29
  • 2
    Beware of multiple inflight requests... and associated memory consumption. – ZioByte Jan 12 '18 at 14:34
  • Hi ! Did u finished the project successfuly ? –  Jul 10 '18 at 13:16
  • 2
    No, I was unable to solve the memory consumption problem (I also tried sending mail to the proper mailing list, but I got no answer; it seems to be a "gurus only" ML). I "solved" my problem going much lower level (directly at MMC device driver) where requests are already serialized (just one in-flight), but that is vastly suboptimal because it's very specific to my current hardware, unfortunately. Any hint welcome. – ZioByte Jul 20 '18 at 19:40
  • 1
    At the time I fighting with memory leaking too ... Do u meant that cloned bio's has not been free ? Can we got out of the "blog" to direct conversation ? –  Jul 25 '18 at 08:54
  • 2
    No, I don't have a memory leak. What happens is the upper layer (ext4 filesystem), in certain conditions (e.g.: when un-tarring a very large tarfile) sends zillions (>10000) parallel requests, each one needing it's own allocation I cannot free until it's completed, obviously. This exhausts the memory pool and tar operation fails, after that memory is freed and all is well again. You can contact me directly via mail using myname at gmail dot com (I hope I'm not breaching policy by giving it; I WILL post a solution, if found). – ZioByte Jul 25 '18 at 12:56
  • 1
    I resolved the memory leaking problem in my code! The reason was in exhausting of internal contexts (to carry original data modified by driver) area. I recode it by using SLAB memory pools and at the time all work good with 10k+ outstanding BIOs requests. – Ruslan R. Laishev Jul 28 '18 at 08:45
1

It turns out bio_clone_bioset() and friends do not copy the callback address to call when request is over.

Trivial solution is to add clone->bi_end_io = bio->bi_end_io; before the end of mt_clone().

Unfortunately this is not enough to make the code functional because it turns out upper layers can spawn thousands of inflight requests (i.e.: requests queued and preprocessed before the previous ones complete) leading to memory starvation. Trying to slow upper layers by returning DM_MAPIO_REQUEUE does not seem to work (see: https://unix.stackexchange.com/q/410525/130498). This has nothing to do with current question, however.

ZioByte
  • 2,690
  • 1
  • 32
  • 68