Questions about storage virtualization












I am trying to understand and compare storage virtualization methods, including RAID and LVM. I hope to get a general idea and the big picture of how the various concepts relate.





  1. I was wondering if the various storage
    virtualization methods can be
    classified into virtualization at
    the device (disk), partition or filesystem
    levels, as follows:

    • RAID belongs to virtualization at the device/disk level, which replaces
      physical disks with logical/virtual
      disks.

    • LVM belongs to virtualization at the partition level, which
      replaces partitions with logical/virtual
      partitions (also called logical
      volumes).

    • There is also virtualization at the filesystem level, which
      replaces filesystems with logical/virtual
      filesystems, for example,
      Network-attached storage (NAS).



  2. If my above understanding is
    correct, does virtualization at each
    level also implement virtualization
    at all lower levels? For example,
    virtualization at partition level
    also implements virtualization at
    device level, and virtualization at
    filesystem level also implements
    virtualization at both partition and
    device levels?


  3. How do different levels of
    virtualization affect/determine
    their different areas of
    applications? For example, are there
    applications suitable for RAID but
    not for LVM, and for LVM but not for
    RAID?



  4. There is a Wikipedia article for
    storage virtualization, where
    there are two main categories of
    methods, block virtualization (which
    can further be classified into
    storage device-based and host-based
    and network-based) and file
    virtualization.



    Comparing the article with my
    understanding in part 1:

    • Is it correct that storage device-based block virtualization is the same as virtualization at the device level, host-based block virtualization is the same as virtualization at the partition level, and file virtualization is the same as virtualization at the filesystem level?

    • But in Host-based block Virtualization#Specific_examples,
      it looks like host-based block virtualization includes virtualization at the filesystem level. How should one understand file virtualization, then?



  5. I would rather single out
    network-based virtualization from block
    virtualization in the aforementioned
    Wikipedia article, because for
    storage virtualization over a
    network, I think we can also
    classify the various methods into
    the levels of device, partition and
    filesystem. For example, can I say
    Storage Area Network (SAN) belongs
    to the level of device, and
    Network-attached storage (NAS) to
    the level of filesystem?


Thanks and regards!










  • I don't think you are being very accurate in describing RAID and LVM as "disk" or "partition" level virtualization. Storage virtualization refers to the abstraction of multiple, commonly network-linked pieces of equipment that are centrally managed and allow access to the system as a whole rather than on a per-server basis. RAID/LVM has little to do with storage virtualization per se, although (of course) they are commonly used in SAN clusters.

    – bubu
    May 31 '11 at 19:17











  • Thanks! But I don't understand "RAID/LVM has little to do with Storage virtualization per se". From the Wikipedia articles for storage virtualization (en.wikipedia.org/wiki/Storage_virtualization), LVM (en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux)) and RAID (en.wikipedia.org/wiki/RAID), both RAID and LVM are methods of storage virtualization and storage virtualization is not just about the case of network-linked storage devices.

    – Tim
    May 31 '11 at 19:40








  • The Wikipedia article on storage virtualization is mediocre at best.

    – bubu
    May 31 '11 at 19:51











  • Then are there any references worth recommending?

    – Tim
    May 31 '11 at 19:53











  • have a look: www-03.ibm.com/systems/resources/…

    – bubu
    May 31 '11 at 20:08
















virtualization storage














asked May 31 '11 at 19:01 by Tim, edited May 31 '11 at 20:02


















3 Answers














Might as well provide an answer. Refer to @soandos's answer for more detailed responses to your specific questions.



LVM vs RAID



RAID, as many have mentioned, is a family of techniques in which multiple disk drives are combined into an array, providing varying levels of performance and reliability. For example, RAID 0 gives the best throughput one can get from the hard drives but is extremely sensitive to disk loss (one loss = essentially total loss), whereas RAID 6 still provides redundancy while the array is rebuilding after a single drive loss. A RAID array is usually seen by the OS as one single drive.



One can say that RAID is a many-to-one mapping.
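To make the many-to-one idea concrete, here is a minimal Python sketch of how a RAID 0 style layout could map one logical block address onto several disks. The function name `raid0_locate` and the `stripe_blocks` parameter are made up for illustration; real controllers and software RAID use their own layouts and chunk sizes.

```python
from typing import Tuple


def raid0_locate(logical_block: int, num_disks: int, stripe_blocks: int = 1) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block number on that disk)."""
    # Which stripe unit the block falls in, and its offset inside that unit.
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    # Stripe units are dealt out to the disks round-robin.
    disk = stripe_index % num_disks
    row = stripe_index // num_disks
    return disk, row * stripe_blocks + offset
```

The OS only ever sees the logical block numbers; the array decides which member disk actually holds each one, which is why losing any single disk in RAID 0 leaves holes across the whole logical drive.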



LVM, on the other hand, allows logical "disk drives" (block devices, to be accurate) to be formed from parts of different disk drives, in a many-to-many mapping. While one can use LVM to accomplish much of what RAID does, LVM can accomplish much more. For example, adding another disk drive to a RAID array typically means reshaping the array or, with some implementations, rebuilding it from scratch. With LVM, you add the disk drive to the machine, add it to the volume group, extend a logical volume onto it, and use it (the actual configuration is a little more complicated, but certainly less work than rebuilding a whole array).
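The many-to-many mapping can be sketched roughly like this in Python. This is a toy model, not how LVM is implemented: the class `VolumeGroup` and its methods are hypothetical names, and real LVM tracks extents in on-disk metadata with many more options.

```python
from typing import Dict, List, Tuple

# (physical volume name, extent index on that volume)
Extent = Tuple[str, int]


class VolumeGroup:
    """Toy pool of fixed-size extents shared by several logical volumes."""

    def __init__(self) -> None:
        self.free: List[Extent] = []
        self.volumes: Dict[str, List[Extent]] = {}

    def add_disk(self, name: str, extents: int) -> None:
        # A new disk only contributes free extents; existing data is untouched.
        self.free.extend((name, i) for i in range(extents))

    def extend(self, volume: str, extents: int) -> None:
        # Grow (or create) a logical volume from whatever extents are free.
        if extents > len(self.free):
            raise ValueError("not enough free extents")
        taken, self.free = self.free[:extents], self.free[extents:]
        self.volumes.setdefault(volume, []).extend(taken)

    def locate(self, volume: str, logical_extent: int) -> Extent:
        # Translate a logical extent of a volume to its physical location.
        return self.volumes[volume][logical_extent]


vg = VolumeGroup()
vg.add_disk("sda", 2)
vg.extend("home", 2)   # "home" fills the first disk
vg.add_disk("sdb", 2)  # later, a second disk is added
vg.extend("home", 1)   # "home" now spans both disks
vg.extend("swap", 1)   # and "sdb" is shared by two volumes
```

One volume spans several disks and one disk backs several volumes, which is the difference from the many-to-one picture of a RAID array.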





















    1. RAID is a redundancy technology (not a backup) that, in most of its levels, ensures that data remains intact if a drive fails; LVM is the Logical Volume Manager, which can do many things.

    2. It's not.

    3. That seems like an odd question. RAID is a way of distributing data across drives so that (except in RAID 0) no data is lost if one fails. LVM is a volume manager that can be used to change the way a user/OS sees all the hard drives. They have nothing to do with each other (though LVM can implement RAID 1 and RAID 0, that is not its primary focus).

    4. The first means you don't have to know what physical device the data is on, and the second means that you can store, for lack of a better word, the links between files in a more abstract way.

    5. As stated above, there is no "device level" or "partition level" to talk about, so no, you can't refer to them as such.






    • Thanks! But I still can't see how RAID and LVM are fundamentally different, as both are realizations of storage virtualization. So I still don't understand why they are used for different purposes.

      – Tim
      May 31 '11 at 19:36











    • RAID is not about virtualization. It stands for "Redundant Array of Independent Disks." The word virtualization does not even appear in the Wikipedia article about it. Read this: en.wikipedia.org/wiki/RAID for a longer explanation.

      – soandos
      May 31 '11 at 19:40













    • Thanks! But I still don't understand your opinion. (1) "virtual disk" and "virtual device" appear in the article for RAID, and I think RAID is storage virtualization at the device level. In the article for storage virtualization, RAID is mentioned under storage device-based block virtualization (en.wikipedia.org/wiki/…).

      – Tim
      May 31 '11 at 19:50













    • They might have something in common, but the goals are totally different. In RAID, you need a virtual disk because of data striping. Since the data that would generally be written to one disk is now written to more than one, you need a way to read the whole file as if it was contiguous. This is a side point to the idea of RAID though, and is not needed in RAID 1 for example.

      – soandos
      May 31 '11 at 19:50











    • (2) As to "RAID is not about virtualization", I wonder whether RAID makes several physical storage devices look like, and be used as, a single virtual/logical storage device? Quoted from the RAID article: "this is achieved by combining multiple disk drive components into a logical unit".

      – Tim
      May 31 '11 at 19:51

































    The article is not well written.



    One of the biggest problems is that there are multiple layers of abstraction built into the full stack of storage, and "virtualization" is a fuzzy enough word that it is hard to assign it a definitive place. For a good look at the many layers of abstraction in storage, I'll point you at a blog piece I did last year (read it here for the gory details).



    In marketing-speak, "Storage Virtualization" is just introducing abstraction where there previously hasn't been any. That can happen at many points depending on the market segment. But that's just marketing. Time for technical.



    The storage stack (somewhat simplified):




    1. Disk

    2. RAID controller

    3. Software RAID

    4. Volume manager

    5. Filesystem

    6. Network filesystem

    7. Network filesystem client


    Disks, even old-school spinning magnetic disks, do a level of virtualization. They present a logical view of the actual blocks on the platters (or storage cells for an SSD), and have done so in some form since the mid '80s. Magnetic drives reserve a certain number of blocks for reassigning blocks that go bad, and the logical view is how this is abstracted away from the disk controller. Technologies like SMART can catch this in the act and report that the drive is "pre-fail" so you can plan your transition accordingly.
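As a rough illustration of that remapping, here is a small Python sketch. The class `RemappingDisk` and its methods are invented for this example, and real drive firmware is far more involved; the point is only that the host keeps using the same logical block address while the drive quietly substitutes a spare.

```python
from typing import Dict, List


class RemappingDisk:
    """Toy drive that hides bad sectors behind a pool of spares."""

    def __init__(self, user_sectors: int, spare_sectors: int) -> None:
        self.user_sectors = user_sectors
        # Assumption: spare physical sectors sit after the user-visible range.
        self.spares: List[int] = list(range(user_sectors, user_sectors + spare_sectors))
        self.remapped: Dict[int, int] = {}

    def physical(self, lba: int) -> int:
        # The host addresses by LBA; the drive decides where the data really is.
        if not 0 <= lba < self.user_sectors:
            raise IndexError("LBA out of range")
        return self.remapped.get(lba, lba)

    def mark_bad(self, lba: int) -> None:
        # Point the LBA at the next spare sector, if any are left.
        if not 0 <= lba < self.user_sectors:
            raise IndexError("LBA out of range")
        if not self.spares:
            raise RuntimeError("no spare sectors left")
        self.remapped[lba] = self.spares.pop(0)

    def reallocated_count(self) -> int:
        # Roughly the kind of counter a SMART report exposes.
        return len(self.remapped)
```

A climbing reallocated count with a shrinking spare pool is the sort of signal that lets monitoring flag a drive before it actually fails.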



    RAID cards provide another abstraction layer, hiding the true shape of storage from an operating system. This has been in place since the first RAID cards came out in the late 80's, and they've only gotten more complex since then. Cards with write-caches on them provide still another abstraction layer, as writes can be reported as committed before they're actually on a disk somewhere. The really fancy ones (such as those in Storage Area Network arrays) can even write to two separate disk arrays for realtime replication, and the OS is none the wiser.



    Once you get into the operating system things get a lot more murky, as each does its own thing. Software RAID (md in Linux) is typically implemented as a low-level storage driver that presents the logically combined storage to higher storage layers. As with the RAID cards, you can do all sorts of interesting things here. Some of the "Storage Virtualization" products you see out there are implemented at this stage.



    Going higher you get to the volume managers (LVM), which can provide for some seriously complex configurations. Where the next layer down aggregates disks into a single virtual volume, the volume managers can combine multiple volumes into a single bigger volume... or split a pool of volumes into an arbitrary number of volumes. Again, some of the Storage Virtualization products you see have a presence in this layer as well.



    The next step up is the filesystem. This is the layer where the well known abstractions of "file" and "directory" come into existence. Some filesystems (btrfs, zfs) have volume-manager like features built into them which allows things like snapshotting, deduplication, replication to other devices, and even migration of files between storage tiers. That last bit is not in many filesystems yet, but is definitely a target for Storage Virtualization vendors.



    The next step up is the network filesystem. This covers things like Samba/CIFS, Netatalk/AppleTalk, NFS, and others. If written the right way, these network filesystems can further abstract storage. One product I'm thinking of, Novell's Open Enterprise Server and their ShadowVolumes, takes multiple volumes on different storage (presumably of differing speeds/cost) and presents them as a single volume to the network user, and then migrates files between the volumes based on usage statistics. Some of the "Storage Virtualization" appliances you can buy actually do their heavy lifting at this layer.



    The last stop on our trip up the storage stack is the network filesystem client in the client machine. It is at this level that the Distributed File System (DFS) exists, which allows a single logical presentation of a filesystem to span multiple network filesystems. The client knows that this is a DFS share and that a specific object is a DFS link, and when following it presents the specified network share as a sub-directory of the parent directory. There have been other examples of abstraction at this level, but DFS is perhaps the most common.
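A minimal Python sketch of that client-side link following might look like this. It is not the DFS protocol: the function `resolve`, the link table, and the server names `fs1`/`fs2` are all hypothetical, and a real client gets its referrals from a namespace server.

```python
from typing import Dict


def resolve(path: str, links: Dict[str, str]) -> str:
    """Rewrite a namespace path to the share backing it (longest link wins)."""
    parts = [p for p in path.split("/") if p]
    # Try the longest prefix first so nested links override their parents.
    for cut in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:cut])
        if prefix in links:
            rest = "/".join(parts[cut:])
            target = links[prefix]
            return f"{target}/{rest}" if rest else target
    raise KeyError(f"no link covers {path}")


# Hypothetical namespace: one logical tree backed by two file servers.
links = {
    "/corp/projects": "//fs1/projects",
    "/corp/projects/archive": "//fs2/archive",
}
```

The user browses a single `/corp/projects` tree, and only the client knows that the `archive` sub-directory actually lives on a different server.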





    One thing to keep in mind is that through all of this, each layer of the storage stack is independent of those above it. Many layers are already doing block-level abstraction, so adding one more doesn't change a whole lot. File-level abstraction has to happen near the top of the stack (for that's where the file-systems live) and the impacts lower down are highly decoupled to the point that it may not even be noticed.





    At its core, "Storage Virtualization" is still mostly a marketing term for something that has been happening since the dawn of the PC era (if not earlier), only this time the new abstraction layers are arriving when virtualization is the buzzword of the moment.



    The one new abstraction layer I know of is something called a "Storage Router", which you'll only ever see on large Storage Area Networks. This device has several different storage arrays behind it, and presents those separate arrays as a single array with multiple LUNs. The fancier ones can do interesting block-level abstractions like moving rarely used blocks to slower/cheaper storage and moving the highly used blocks to SSD layers, or handle realtime replication between storage arrays that normally wouldn't allow that kind of thing.
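The block tiering idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a fixed number of fast slots, placement decided purely by access count); the function `plan_tiers` and the tier labels are made up, and real products use much richer policies.

```python
from typing import Dict


def plan_tiers(access_counts: Dict[int, int], fast_slots: int) -> Dict[int, str]:
    """Place the most-accessed blocks on the fast tier, the rest on the slow one."""
    # Rank by access count (highest first), breaking ties by block number.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    hot = set(ranked[:fast_slots])
    return {b: ("ssd" if b in hot else "hdd") for b in access_counts}
```

The hosts keep addressing the same LUN and block numbers throughout; only the device in the middle knows which tier a block currently lives on.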



    P.S.: RAID is not just device-level virtualization. I'm working with a storage array right now that takes slices of disks and assigns them to different RAID groups. It works just fine, and I have both RAID1 and RAID5 volumes on the same disk devices. Lose two drives and the RAID5 volumes are toast, but a RAID1 volume on the same disks is fine as long as one of its mirrors sits on a surviving drive.
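Here is a small Python sketch of why volumes built from slices of the same disks can fare differently. It assumes the textbook rules (RAID 5 tolerates one lost member, RAID 1 survives while at least one mirror copy remains); the function `survives` and the disk names are hypothetical.

```python
from typing import Set


def survives(level: str, members: Set[str], failed: Set[str]) -> bool:
    """Report whether a volume built from slices on `members` is still readable."""
    lost = len(members & failed)
    if level == "raid1":
        # Mirrors: fine as long as at least one copy is left.
        return lost < len(members)
    if level == "raid5":
        # Single parity: at most one member may be missing.
        return lost <= 1
    raise ValueError(f"unsupported level: {level}")


# Two volumes carved from slices of the same four disks.
raid5_members = {"d0", "d1", "d2", "d3"}
raid1_members = {"d0", "d1"}
failed = {"d0", "d2"}
```

With `d0` and `d2` gone, the RAID 5 volume has lost two members while the RAID 1 volume still has its copy on `d1`, so the same double failure has different outcomes per volume.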






    share|improve this answer

























      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "3"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsuperuser.com%2fquestions%2f291175%2fquestions-about-storage-virtualization%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      3














      Might as well provide an answer. Refer to @soandos 's answer for more detailed answering of your specific questions.



      LVM vs RAID



      RAID, as many have mentioned, is a family of techniques in which multiple disk drives are combined into an array, providing varying levels of performance and reliability. For example, RAID 0 provides the best performance one can get from a set of hard drives but is extremely sensitive to disk loss (one loss = essentially total loss), whereas RAID 6 still provides redundancy while the array is rebuilding after the loss of one drive. A RAID array is usually seen by the OS as a single drive.



      One can say that RAID is a many-to-one mapping.
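To make the "many-to-one" idea concrete, here is a minimal sketch of how a striped (RAID 0 style) array might translate one logical block number into a location on one of several member disks. The function name and the `stripe_blocks` parameter are made up for illustration; real controllers and drivers differ in detail.

```python
def raid0_locate(logical_block: int, num_disks: int, stripe_blocks: int = 1):
    """Map a logical block number to (disk index, block offset on that disk).

    Toy model of striping: consecutive stripes rotate across the member disks.
    """
    stripe_index, within = divmod(logical_block, stripe_blocks)
    disk = stripe_index % num_disks
    offset = (stripe_index // num_disks) * stripe_blocks + within
    return disk, offset


# Six consecutive logical blocks spread over three disks, one block per stripe.
layout = [raid0_locate(block, num_disks=3) for block in range(6)]
print(layout)
```

The OS only ever sees the logical block numbers; every block lives on exactly one member disk, which is also why losing any single disk in such a layout leaves holes throughout the whole array.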



      LVM, on the other hand, allows logical "disk drives" (block devices, to be accurate) to be formed from parts of different disk drives. They exist in a "many-to-many" mapping. While LVM can accomplish much of what RAID does, it can also do much more. For example, to add another disk drive to a RAID array it would likely be necessary to rebuild the whole array (some implementations can reshape an existing array in place instead). With LVM, you add the disk drive to the machine, add it to the pool that backs a logical volume, and start using it (the actual configuration is a little more complicated, but certainly less so than rebuilding a whole array).
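As a rough illustration of the "many-to-many" mapping, the sketch below models a pool of fixed-size extents contributed by several disks, with logical volumes built from whichever extents are free. It is a toy model, not how LVM is implemented; the class and method names are invented for the example (on a real system the corresponding steps are done with tools such as `pvcreate`, `vgextend` and `lvextend`).

```python
class VolumeGroup:
    """Toy model: a pool of extents from several disks, carved into volumes."""

    def __init__(self):
        self.free = []      # (disk_name, extent_index) pairs not yet allocated
        self.volumes = {}   # logical volume name -> list of (disk_name, extent_index)

    def add_disk(self, name, extents):
        """Add a disk's extents to the pool; existing volumes are untouched."""
        self.free.extend((name, i) for i in range(extents))

    def extend_volume(self, lv_name, extents):
        """Create or grow a logical volume using free extents from the pool."""
        if extents > len(self.free):
            raise ValueError("not enough free extents in the pool")
        taken, self.free = self.free[:extents], self.free[extents:]
        self.volumes.setdefault(lv_name, []).extend(taken)


vg = VolumeGroup()
vg.add_disk("disk_a", 4)
vg.extend_volume("home", 3)
vg.extend_volume("var", 1)
vg.add_disk("disk_b", 4)      # a new disk joins the pool
vg.extend_volume("home", 2)   # "home" now spans both disks
disks_used_by_home = sorted({disk for disk, _ in vg.volumes["home"]})
print(disks_used_by_home)
```

One disk ends up holding parts of two volumes, and one volume ends up spread over two disks, which is the many-to-many relationship described above.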
















          answered May 31 '11 at 20:07









          bubububu


























              1















              1. RAID is a redundancy technology (not a substitute for backups) that, in its redundant levels, ensures that data remains intact when a drive fails. LVM is the Logical Volume Manager, which can do many things.

              2. It's not.

              3. That seems like an odd question. RAID is a way of distributing data across drives so that if one fails no data is lost. LVM is a volume manager that can be used to change the way a user/OS sees all the hard drives. They have nothing to do with each other (though LVM can implement RAID 1 and RAID 0, that is not its primary focus).

              4. The first means you don't have to know which physical device the data is on, and the second means that you can store, for lack of a better word, the links between files in a more abstract way.

              5. As stated above, there is no "device level" or "partition level" to talk about, so no, you can't refer to them as such.
































              • Thanks! But I still can't see how RAID and LVM are fundamentally different, as both are realizations of storage virtualization. So I still don't understand why they are used for different purposes.

                – Tim
                May 31 '11 at 19:36











              • RAID is not about virtualization. It stands for "Redundant Array of Independent Disks." The word virtualization does not even appear in the Wikipedia article about it. Read en.wikipedia.org/wiki/RAID for a longer explanation.

                – soandos
                May 31 '11 at 19:40













              • Thanks! But I still don't understand your opinion. (1) "virtual disk" and "virtual device" appear in the article for RAID, and I think RAID is storage virtualization at the device level. In the article for storage virtualization, RAID is mentioned under storage device-based block virtualization (en.wikipedia.org/wiki/…).

                – Tim
                May 31 '11 at 19:50













              • They might have something in common, but the goals are totally different. In RAID, you need a virtual disk because of data striping. Since the data that would normally be written to one disk is now written to more than one, you need a way to read the whole file as if it were contiguous. This is a side point to the idea of RAID though, and is not needed in RAID 1, for example.

                – soandos
                May 31 '11 at 19:50











              • (2) As to "RAID is not about virtualization": doesn't RAID make several physical storage devices look like, and be used as, a single virtual/logical storage device? Quoting the RAID article: "this is achieved by combining multiple disk drive components into a logical unit".

                – Tim
                May 31 '11 at 19:51
























              edited Feb 7 at 11:28









              karel











              answered May 31 '11 at 19:31









              soandos












              0














              The article is not well written.



              One of the biggest problems is that there are multiple layers of abstraction built into the full storage stack, and "virtualization" is a fuzzy enough word that it is hard to pin it to any one of them. For a good look at the many layers of abstraction in storage, I'll point you at a blog piece I did last year (read it here for the gory details).



              In marketing-speak, "Storage Virtualization" just means introducing abstraction where there previously wasn't any. That can happen at many points depending on the market segment. But that's just marketing. Time for the technical view.



              The storage stack (somewhat simplified):




              1. Disk

              2. RAID controller

              3. Software RAID

              4. Volume manager

              5. Filesystem

              6. Network filesystem

              7. Network filesystem client


              Disks, even old-school spinning magnetic disks, do a level of virtualization. They present a logical view of the actual blocks on the platters (or storage cells for an SSD), and it has been this way since the mid-80s or so. Magnetic drives reserve a certain number of blocks for replacing blocks that go bad, and the logical view is how this reassignment is hidden from the disk controller. Technologies like SMART can report these reassignments and flag the drive as "pre-fail" so you can plan your transition accordingly.
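A small sketch of that remapping idea, assuming a made-up `ToyDisk` class: the host keeps addressing the same logical block while the drive quietly points it at a spare. Real firmware is far more involved; this only shows the shape of the abstraction.

```python
class ToyDisk:
    """Toy drive that hides bad-block reassignment behind logical addresses."""

    def __init__(self, visible_blocks, spare_blocks):
        self.visible_blocks = visible_blocks
        # Spare physical blocks sit beyond the range the host can address.
        self.spares = list(range(visible_blocks, visible_blocks + spare_blocks))
        self.remap = {}   # logical block -> spare physical block

    def physical(self, logical):
        """Return the physical block that currently backs a logical block."""
        if not 0 <= logical < self.visible_blocks:
            raise IndexError("logical block out of range")
        return self.remap.get(logical, logical)

    def mark_bad(self, logical):
        """Reassign a failing logical block to the next free spare."""
        if not self.spares:
            raise RuntimeError("no spare blocks left")
        self.remap[logical] = self.spares.pop(0)

    def reallocated_count(self):
        """Rough analogue of the reallocated-sector figure a drive reports."""
        return len(self.remap)


disk = ToyDisk(visible_blocks=100, spare_blocks=2)
before = disk.physical(7)
disk.mark_bad(7)
after = disk.physical(7)
print(before, after, disk.reallocated_count())
```

The host still asks for block 7 in both cases; only the drive knows the data moved.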



              RAID cards provide another abstraction layer, hiding the true shape of storage from the operating system. This has been the case since the first RAID cards came out in the late 80s, and they've only gotten more complex since then. Cards with write caches provide still another abstraction layer, as writes can be reported as committed before they are actually on a disk. The really fancy ones (such as those in Storage Area Network arrays) can even write to two separate disk arrays for realtime replication, and the OS is none the wiser.



              Once you get into the operating system things get a lot murkier, as each OS does its own thing. Software RAID (md in Linux) is typically implemented as a low-level storage driver that presents the logically combined storage to higher storage layers. As with the RAID cards, you can do all sorts of interesting things here. Some of the "Storage Virtualization" products you see out there are implemented at this stage.



              Going higher, you get to the volume managers (such as LVM), which can provide for some seriously complex configurations. Where the layer below aggregates disks into a single virtual volume, a volume manager can combine multiple volumes into a single bigger volume, or split a pool of volumes into an arbitrary number of volumes. Again, some of the Storage Virtualization products you see have a presence in this layer as well.



              The next step up is the filesystem. This is the layer where the well-known abstractions of "file" and "directory" come into existence. Some filesystems (btrfs, zfs) have volume-manager-like features built in, which allow things like snapshotting, deduplication, replication to other devices, and even migration of files between storage tiers. That last bit is not in many filesystems yet, but is definitely a target for Storage Virtualization vendors.



              The next step up is the network filesystem. These are things like Samba/CIFS, Netatalk/AppleTalk, NFS, and others. If written the right way, these network filesystems can further abstract storage. One product I'm thinking of, Novell's Open Enterprise Server and their ShadowVolumes, takes multiple volumes on different storage (presumably of differing speed/cost), presents them as a single volume to the network user, and then migrates files between the volumes based on usage statistics. Some of the "Storage Virtualization" appliances you can buy actually do their heavy lifting at this layer.



              The last stop on our trip up the storage stack is the network filesystem client in the client machine. It is at this level that the Distributed File System (DFS) exists, which allows a single logical presentation of a filesystem to span multiple network filesystems. The client knows that a given share is a DFS share and that a specific object in it is a DFS link, and when following the link it presents the specified network share as a sub-directory of the parent directory. There have been other examples of abstraction at this level, but DFS is perhaps the most common.
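To illustrate the client-side idea only (this is not the actual DFS referral protocol), here is a sketch of resolving a path in a single logical namespace to whichever network share backs it. The `resolve` function, the namespace paths and the server names are all hypothetical.

```python
def resolve(path, links):
    """Return (backing_share, remaining_path) for a path in a toy namespace.

    `links` maps a namespace prefix to the network share that backs it.
    The longest matching prefix wins, so nested links override their parents.
    """
    parts = [p for p in path.split("/") if p]
    for cut in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:cut])
        if prefix in links:
            return links[prefix], "/".join(parts[cut:])
    raise KeyError(f"no link covers {path}")


links = {
    "/corp/projects": "//fileserver1/projects",
    "/corp/projects/archive": "//fileserver2/old_projects",
}
current = resolve("/corp/projects/alpha/plan.txt", links)
archived = resolve("/corp/projects/archive/2009/plan.txt", links)
print(current)
print(archived)
```

To the user, `archive` looks like an ordinary sub-directory of `projects`, even though in this made-up layout it lives on a different server.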





              One thing to keep in mind is that through all of this, each layer of the storage stack is independent of those above it. Many layers are already doing block-level abstraction, so adding one more doesn't change a whole lot. File-level abstraction has to happen near the top of the stack (for that's where the filesystems live), and it is so decoupled from the lower layers that they may not even notice it.





              At its core, "Storage Virtualization" is still mostly a marketing term for something that has been happening since the dawn of the PC era (if not earlier), only this time the new abstraction layers are arriving when virtualization is the buzzword of the moment.



              The one new abstraction layer I know of is something called a "Storage Router", which you'll only ever see on large Storage Area Networks. This device has several different storage arrays behind it, and presents those separate arrays as a single array with multiple LUNs. The fancier ones can do interesting block-level abstractions like moving rarely used blocks to slower/cheaper storage and highly used blocks to SSD tiers, or handle realtime replication between storage arrays that normally wouldn't allow that kind of thing.
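A very simplified sketch of the block-tiering idea: count how often each block is touched and keep the hottest ones on the fast tier. The `plan_tiers` function and the access log are invented for illustration; real products use much more elaborate policies.

```python
from collections import Counter


def plan_tiers(access_log, fast_capacity):
    """Split accessed block ids into (fast_tier, slow_tier) by access count."""
    counts = Counter(access_log)
    # Most-accessed first; ties are broken by block id so the result is stable.
    ranked = sorted(counts, key=lambda block: (-counts[block], block))
    return ranked[:fast_capacity], ranked[fast_capacity:]


log = [5, 5, 5, 9, 9, 2, 5, 9, 7]
fast, slow = plan_tiers(log, fast_capacity=2)
print(fast, slow)
```

Hosts keep addressing the same LUN and block numbers throughout; where a block physically lives is decided behind the abstraction.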



              P.S.: RAID is not just device-level virtualization. I'm working with a storage array right now that takes slices of disks and assigns them to different RAID groups. It works just fine, and I have both RAID1 and RAID5 volumes on the same disk devices. Lose two drives and the RAID5 volumes are toast, but the RAID1 volumes on the same disks are just fine.
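Here is a toy check of that last point, assuming a hypothetical layout in which one RAID5 group takes a slice from each of four disks and two RAID1 groups mirror slices across disk pairs. The group names, disk names and the `group_survives` helper are made up; the actual array's layout is not described above.

```python
def group_survives(level, member_disks, failed_disks):
    """Whether a RAID group still holds its data after some disks fail."""
    lost = len(set(member_disks) & set(failed_disks))
    if level == "raid1":
        return lost < len(member_disks)   # at least one mirror copy remains
    if level == "raid5":
        return lost <= 1                  # single parity tolerates one loss
    raise ValueError(f"unsupported level: {level}")


# Each group uses a slice of the listed disks, so disks are shared between groups.
groups = {
    "r5_data": ("raid5", ["d0", "d1", "d2", "d3"]),
    "r1_logs": ("raid1", ["d0", "d1"]),
    "r1_boot": ("raid1", ["d2", "d3"]),
}
failed = {"d0", "d2"}
status = {name: group_survives(level, disks, failed)
          for name, (level, disks) in groups.items()}
print(status)
```

In this layout the RAID5 group is lost while both mirrors survive. Note that the outcome depends on which drives fail: if both members of one mirror (say `d0` and `d1`) went down, that RAID1 group would be lost too.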






              share|improve this answer






























                0














                The article is not well written.



                One of the biggest problems is that there are multiple layers of abstraction built into the full stack of storage, and "virtualization" is a fuzzy enough word as to be hard to definitively assign a place to put it. For a good look at the many layers of abstraction in storage, I'll point you at a blog piece I did last year (read it here for the gory details).



                In marketing-speak, "Storage Virtualization" is just introducing abstraction where there previously hasn't been any. That can happen at many points depending on the market segment. But that's just marketing. Time for technical.



                The storage stack (somewhat simplified):




                1. Disk

                2. RAID controller

                3. Software RAID

                4. Volume manager

                5. Filesystem

                6. Network filesystem

                7. Network filesystem client


                Disk, even old school spinning magnetic disks, do a level of virtualization. They present a logical view of the actual blocks on the platters (or storage cells for an SSD), and this has been this way since the mid 80's or so. Magnetic drives reserve a certain number of blocks for reassigning blocks that go bad, and the logical view is how this is abstracted away from the disk controller. Technologies like SMART can catch this in the act and report that the drive is "pre-fail" so you can plan your transition accordingly. This has been in place in some form since the 80's.



                RAID cards provide another abstraction layer, hiding the true shape of storage from an operating system. This has been in place since the first RAID cards came out in the late 80's, and they've only gotten more complex since then. Cards with write-caches on them provide still another abstraction layer, as writes can be reported as committed before they're actually on a disk somewhere. The really fancy ones (such as those in Storage Area Network arrays) can even write to two separate disk arrays for realtime replication, and the OS is none the wiser.



                Once you get into the operating system things get a lot more murky, as each does their own thing. Software RAID (md in Linux) is typically implemented as a low level storage driver that presents the logically combined storage to higher storage layers. As with the RAID cards, you can do all sorts of interesting things here. Some of the "Storage Virtualization" products you see out there are implemented at this stage.



                Going higher you get to the volume managers (LVM) can provide for some seriously complex configurations. Where the next layer down aggregates disks into a single virtual volume, the volume managers can combine multiple volumes into a single bigger volume... or split a pool of volumes into an arbitrary number of volumes. Again, some of the Storage Virtualization products you see have a presence in this layer as well.



                The next step up is the filesystem. This is the layer where the well known abstractions of "file" and "directory" come into existence. Some filesystems (btrfs, zfs) have volume-manager like features built into them which allows things like snapshotting, deduplication, replication to other devices, and even migration of files between storage tiers. That last bit is not in many filesystems yet, but is definitely a target for Storage Virtualization vendors.



                The next step up is the network filesystem. This is things like Samba/CIFS, NetATalk/Appletalk, NFS, and others. If written the right way, these network filesystems can further abstract storage. One product I'm thinking of, Novell's Open Enterprise Server and their ShadowVolumes, takes multiple volumes on different storage (presumably differing speeds/cost) and presents them as a single volume to the network user, and then migrates files between the volumes based on usage statistics. Some of the "Storage Virtualization" appliances you can buy actually do their heavy lifting at this layer.



                The last stop on our trip up the storage stack is the network filesystem client on the client machine. It is at this level that the Distributed File System (DFS) exists, which allows a single logical presentation of a filesystem to be spread across multiple network filesystems. The client knows that this is a DFS share and that a specific object is a DFS link; when following the link, it presents the specified network share as a subdirectory of the parent directory. There have been other examples of abstraction at this level, but DFS is perhaps the most common.
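The client-side redirection can be pictured as a prefix lookup. The sketch below is only that: the server and share names are invented, and real DFS uses a referral protocol between client and namespace server rather than a local table like this.

```python
def resolve(path, links):
    """Rewrite a namespace path to the share its longest matching link targets."""
    best = None
    for prefix in links:
        # Match only on a path-component boundary, not on a partial name.
        if path == prefix or path.startswith(prefix + "\\"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return path  # not under any link; leave the path untouched
    return links[best] + path[len(best):]


# Hypothetical namespace: one logical tree backed by two file servers.
links = {
    r"\\corp\dfs\projects": r"\\fileserver1\projects",
    r"\\corp\dfs\home": r"\\fileserver2\users",
}
```

With this table, `resolve(r"\\corp\dfs\home\alice", links)` yields a path on `fileserver2`, while the user only ever browses the single `\\corp\dfs` tree.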





                One thing to keep in mind is that through all of this, each layer of the storage stack is independent of those above it. Many layers are already doing block-level abstraction, so adding one more doesn't change a whole lot. File-level abstraction has to happen near the top of the stack (for that's where the filesystems live), and it is so decoupled from the lower layers that they may not even notice it.





                At its core, "Storage Virtualization" is still mostly a marketing term for something that has been happening since the dawn of the PC era (if not earlier), only this time the new abstraction layers are arriving when virtualization is the buzzword of the moment.



                The one new abstraction layer I know of is something called a "Storage Router", which you'll only ever see on large Storage Area Networks. This device has several different storage arrays behind it, and presents those separate arrays as a single array with multiple LUNs. The fancier ones can do interesting block-level abstractions, like moving rarely used blocks to slower/cheaper storage and heavily used blocks to SSD tiers, or handle real-time replication between storage arrays that normally wouldn't allow that kind of thing.
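The block-tiering idea can be sketched as ranking blocks by how often they are accessed and giving the hottest ones the fast tier. The function name, the access-count input, and the "fast"/"slow" labels are all assumptions for illustration; real products use their own heuristics.

```python
def plan_tiers(access_counts, fast_capacity):
    """Assign each block to the 'fast' or 'slow' tier by access count."""
    # Hottest blocks first; ties are broken by block number for determinism.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    hot = set(ranked[:fast_capacity])
    return {block: ("fast" if block in hot else "slow")
            for block in access_counts}
```

Because this happens below the LUN presentation, the hosts using the storage would not see blocks moving between tiers.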



                P.S.: RAID is not just device-level virtualization. I'm working with a storage array right now that takes slices of disks and assigns them to different RAID groups. It works just fine, and I have both RAID1 and RAID5 volumes on the same disk devices. Lose two drives and the RAID5 volumes are toast, but the RAID1 volumes on the same disks are just fine.
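A small sketch of why two RAID levels sharing the same disks can fail differently. The group layout here is an assumption (a three-way mirror and a four-disk RAID5, with made-up disk names), not the configuration of the array described above; a RAID1 group survives only as long as at least one of its members does.

```python
def surviving_groups(groups, failed_disks):
    """Report which RAID groups remain readable after some disks fail."""
    failed = set(failed_disks)
    result = {}
    for name, (level, members) in groups.items():
        lost = len(failed & set(members))
        if level == "raid1":
            # A mirror survives while at least one copy is still present.
            result[name] = lost < len(members)
        elif level == "raid5":
            # Single parity tolerates the loss of at most one member.
            result[name] = lost <= 1
        else:
            raise ValueError(f"unsupported RAID level: {level}")
    return result


# Hypothetical layout: both groups are built from slices of the same disks.
groups = {
    "mirror": ("raid1", ["d0", "d1", "d2"]),
    "parity": ("raid5", ["d0", "d1", "d2", "d3"]),
}
```

With this layout, losing `d0` and `d1` kills the RAID5 group while the mirror still has a copy on `d2`.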


































                  edited Jun 3 '11 at 12:02

























                  answered Jun 3 '11 at 4:20









                  SysAdmin1138





























