Writing Your Own Device Mapper Target



Device Mapper 

In the Linux kernel, the device mapper is a generic framework for mapping a virtual layer of block devices onto existing block devices. The framework promotes a clean separation of policy and mechanism: policy lives in user space and mechanism in the kernel. It forms the foundation of LVM2 and EVMS, software RAID, and dm-crypt disk encryption, and offers additional features such as file-system snapshots. The device mapper works by processing data passed in from the virtual block devices it provides and then passing the resulting data on to existing block devices.
"Device mapper can be defined as a generic way to add required functionality in the storage stack by creating virtual layer of block devices and mapping them to existing block devices." It creates virtual layers of block devices that can do different things on top of the existing underlying block devices, such as striping, concatenation, mirroring, and snapshots.
The device mapper is a modular kernel driver that provides a generic framework for volume management. It was introduced in kernel version 2.6 and is used by the LVM2 and EVMS 2.x tools.

Device Mapper Target


As stated above, we can create various logical layers through the device mapper to carry out the required functionality. Each such layer is created by defining “a device mapper target” for that layer.
There is a one-to-one correspondence between a virtual layer in the device mapper and the dm target for that layer. The dm target contains the code that implements the functionality the virtual layer is meant to provide. For example, a device mapper target can be written to implement mirroring over existing block devices. This dm target presents a virtual layer to the upper layers that does the task of mirroring.
Several such features have been added to the device mapper through device mapper targets, including the following:

  • Linear
  • RAID-0 / Striped
  • RAID-1 / Mirrored RAID
  • Snapshot
  • DM-Crypt

Write Our Own Device Mapper Target

Our device mapper target is going to be a kernel module. Let's call our dm target 'basic_target'; the corresponding source file is basic_target.c.

basic_target.c

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/bio.h>
#include <linux/device-mapper.h>

/* This structure stores information about the underlying device.
 * Params:
 *  dev   : underlying device
 *  start : starting sector number of the device
 */

struct my_dm_target {
        struct dm_dev *dev;
        sector_t start;
};



/* This is the map function of the basic target. It gets called whenever a new bio
 * request arrives. Its job is to map a bio request to the underlying device.
 * The request we receive was submitted to our device, so bio->bi_bdev points to our device.
 * We should point the bio->bi_bdev field to the bdev of the underlying device. In this function
 * we can also do other processing, such as changing the sector number of the bio request or
 * splitting the bio.
 *
 * Params:
 *  ti          : the dm_target structure representing our basic target
 *  bio         : the block I/O request from the upper layer
 *  map_context : the mapping context of the target
 *
 * Return values from a target map function:
 *  DM_MAPIO_SUBMITTED : the target has submitted the bio request to the underlying device
 *  DM_MAPIO_REMAPPED  : the bio request is remapped; the device mapper should submit the bio
 *  DM_MAPIO_REQUEUE   : a problem occurred while mapping the bio, so requeue the request;
 *                       the bio will be passed to the map function again
 */
static int basic_target_map(struct dm_target *ti, struct bio *bio,union map_info *map_context)
{
        struct my_dm_target *mdt = (struct my_dm_target *) ti->private;
        printk(KERN_CRIT "\n<<in function basic_target_map \n");

        bio->bi_bdev = mdt->dev->bdev;

        if((bio->bi_rw & WRITE) == WRITE)
                printk(KERN_CRIT "\n basic_target_map : bio is a write request.... \n");
        else
                printk(KERN_CRIT "\n basic_target_map : bio is a read request.... \n");
        submit_bio(bio->bi_rw,bio);

        printk(KERN_CRIT "\n>>out function basic_target_map \n");      
        return DM_MAPIO_SUBMITTED;
}

/* This is the constructor function of the basic target.
 * The constructor gets called when we create a device of type 'basic_target',
 * so it will get called when we execute the command 'dmsetup create'.
 * This function gets called for each device over which you want to create a basic
 * target. Here it is just a basic target that takes only one device, so it
 * will get called once.
 */

static int basic_target_ctr(struct dm_target *ti,unsigned int argc,char **argv)
{
        struct my_dm_target *mdt;
        unsigned long long start;

        printk(KERN_CRIT "\n >>in function basic_target_ctr \n");

        if (argc != 2) {
                printk(KERN_CRIT "\n Invalid no.of arguments.\n");
                ti->error = "Invalid argument count";
                return -EINVAL;
        }

        mdt = kmalloc(sizeof(struct my_dm_target), GFP_KERNEL);

        if(mdt==NULL)
        {
                printk(KERN_CRIT "\n Mdt is null\n");
                ti->error = "dm-basic_target: Cannot allocate linear context";
                return -ENOMEM;
        }      

        if(sscanf(argv[1], "%llu", &start)!=1)
        {
                ti->error = "dm-basic_target: Invalid device sector";
                goto bad;
        }

        mdt->start=(sector_t)start;

        /* dm_table_get_mode
         * Gives you the permissions of the device mapper table.
         * This table is nothing but the table which gets created
         * when we execute dmsetup create. It is one of the
         * data structures used by the device mapper for keeping track of its devices.
         *
         * dm_get_device
         * Sets the mdt->dev field to the dm_dev structure of the underlying device.
         */
   
        if (dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), &mdt->dev)) {
                ti->error = "dm-basic_target: Device lookup failed";
                goto bad;
        }

        ti->private = mdt;

        printk(KERN_CRIT "\n>>out function basic_target_ctr \n");                      
        return 0;

bad:
        kfree(mdt);
        printk(KERN_CRIT "\n>>out function basic_target_ctr with error \n");
        return -EINVAL;
}

/*
 * This is the destructor function.
 * It gets called when we remove a device of type basic target. The function gets
 * called per device.
 */
static void basic_target_dtr(struct dm_target *ti)
{
        struct my_dm_target *mdt = (struct my_dm_target *) ti->private;
        printk(KERN_CRIT "\n<<in function basic_target_dtr \n");        
        dm_put_device(ti, mdt->dev);
        kfree(mdt);
        printk(KERN_CRIT "\n>>out function basic_target_dtr \n");              
}

/*
* This structure is fops for basic target.
*/
static struct target_type basic_target = {
       
        .name = "basic_target",
        .version = {1,0,0},
        .module = THIS_MODULE,
        .ctr = basic_target_ctr,
        .dtr = basic_target_dtr,
        .map = basic_target_map,
};
       
/*---------Module Functions -----------------*/

static int init_basic_target(void)
{
        int result;

        result = dm_register_target(&basic_target);
        if (result < 0)
                printk(KERN_CRIT "\n Error in registering target \n");
        /* Propagate the error so a failed registration also fails module load. */
        return result;
}

static void cleanup_basic_target(void)
{
        dm_unregister_target(&basic_target);
}
module_init(init_basic_target);
module_exit(cleanup_basic_target);
MODULE_LICENSE("GPL");

Makefile 

obj-m +=basic_target.o

all:
        make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
        make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

USAGE

echo 0 <size_of_device> basic_target /Path/to/your/device 0 | dmsetup create my_basic_dm_device

Compilation:
  1. make
  2. insmod basic_target.ko
Setup:
  1. Let's create a temp file of about 10 MB (20000 sectors of 512 bytes) for the device.
     dd if=/dev/zero of=/tmp/disk1 bs=512 count=20000
  2. Attach a loop device to this file.
     losetup /dev/loop6 /tmp/disk1
  3. Let's create a device with 'basic_target'.
     echo 0 20000 basic_target /dev/loop6 0 | dmsetup create my_basic_target_device
  4. You can see the constructor getting called in /var/log/messages.
  5. A new device is created at '/dev/mapper/my_basic_target_device'.
Experiments:

We can create a file system on my_basic_target_device, but that will trigger a lot of I/O on the device and fill up the logs.

So let's try writing one sector to our device using the dd command.
dd if=/dev/zero of=/dev/mapper/my_basic_target_device bs=512 seek=10 count=1
Now you should see a write request for your device with starting sector 10 and size 512 bytes.
References:
  1. http://linuxgazette.net/114/kapil.html

For any questions please feel free to mail me at gauravmmh@gmai.com or add comments below.
Stay tuned to http://techgmm.blogspot.in/
Thanks !!!

7 comments:

Marcelo said...

Hi,

Very interesting post. However I have a couple of doubts regarding the example:

1. Is start used somewhere? Or is it a mandatory parameter?
2. You say "We should point to the bio-> bi_dev field to bdev of underlying device.", but I couldn't match anything in the code to that.

Thanks!

Gaurav Mahajan said...

Hi Marcelo,

1. I don't know whether start is mandatory or not, but here are my thoughts on this.
Basic_target is just a one-to-one mapping target, meaning that if you send a read request for sector 100, the read will be sent to the underlying device at sector 100.
But suppose you have a device with, say, 1024 sectors and you wish to use sectors 500 to 700 for the basic target; then start=500 and size=200. Now if you send a read request at sector 100, it can't be sent to the underlying device as it is. We need to map it with start, so start + read offset (500 + 100) = 600, and we will have to send the read request to the underlying device at sector 600.
I would suggest you study dm-linear after this.

2. Yes, we need to point the bio->bi_bdev field to the bdev of the underlying device.
So in basic_target_ctr we call dm_get_device, which basically fills in mdt->dev. This mdt->dev is a dm_dev structure and represents the underlying device in the device mapper. Now mdt->dev->bdev is the bdev of the underlying device.
In basic_target_map we do something like this: bio->bi_bdev = mdt->dev->bdev; and then we submit the bio.

Let's see the reasoning for this. Every block device is represented by a bdev. With dm we create logical devices over physical devices, so the logical device and the physical device each have their own bdev. Now when you say read sector 100 from /dev/mapper/basic-target (the logical device), a bio will be created with sector number 100 and the bdev of basic-target. I get this bio in the basic-target map function. If I don't change the bdev to the underlying device, the system will send the bio to basic_target_map again, the request will loop, and we will have a deadlock. That's why we have to map it to the underlying bdev, saying that I have done my processing, now send the request to the underlying device.

Hope it clears your doubts!

Gaurav Mahajan said...

You can see more comments in here about the post.

https://www.blogger.com/comment.g?blogID=52214455755903509&postID=851989371271561682

Kaushik chug said...

Really helpful! Thanks a lot!

Rashid said...

Hi Gaurav,

Incredibly useful article.
I am also creating a device mapper target. However I also have a cache. So when the cache gets full, I need to move its contents to the disk.
I have hardly any kernel coding experience. Could you guide with this. I am looking for some API's like submit_bio that could read by cache and write it to the disk.

Jitendra Kumar Khasdev said...

Hi Gaurav,

Looks interesting post. I was wondering how dmsetup create will work for root disk, I mean how can create /dev/mapper/target_name before the root filesystem get mounted.

muniyappan said...

Hi gaurav,
I need to parse the data from kernel space to user space....!