Which archival formats efficiently extract a single file from an archive?












2














Extracting a single file from a zip file is a fast operation, so I assumed the same would be true for tar. It isn't: even though a tar file is uncompressed, extracting one file can take a very long time. I had used tar to back up my home folder on OS X and later needed a single file from it. Since tar doesn't know where the file is, it had to scan the entire 300 GB archive before it could extract it. That makes tar a poor fit for most backup scenarios, so I'd like to know my options.



So, which archival file formats are suitable for quickly extracting a single file?



Even though this question isn't really about compression, I don't mind answers listing formats that combine archiving and compression (like zip), in which case "solid compression" will matter.







compression zip tar archiving














edited Dec 18 '18 at 23:20

























asked Dec 18 '18 at 22:36









oligofren













  • Remember that tar stands for "tape archive": it was originally designed (in the '70s) to work with tapes, and it still works with tape drives today. It definitely wasn't meant for random or quick access.
    – LawrenceC
    Dec 18 '18 at 23:40










  • It is also designed for streaming through pipes, which doesn't work well with indices. GNU tar does add an index though.
    – oligofren
    Dec 19 '18 at 11:12




























3 Answers
        2






        It sounds like speed and efficiency of extraction are your main concerns, and I'm assuming you're on Linux or macOS and so want to preserve special file attributes (the ones zip & 7z ignore). In that case, an excellent archive format would be:





        • An ext[2/3/4] filesystem - Just copy the files onto it; extracting a single file is then as quick and easy as mounting the filesystem and reading the file. You could keep the whole archive filesystem inside a single file if you wish: create a file that's big enough, format it, and mount it (you don't even need the -o loop option anymore).



          Pros:




          • A nice bonus is you can easily add encryption (LUKS) to the whole archive file too, or any other encryption the filesystem supports (eCryptFS, EncFS, etc).


          • You can also use rsync-based backup solutions easily.


          • It's easy to add/delete files (up to the overall archive file's size).



          Cons:




          • If using a single archive file, you have to pick its size before adding files, and it doesn't change size dynamically.

          • It's still possible to expand or shrink the entire archive even if it's in a single file, but you need tools like resize2fs to shrink the filesystem, then truncate to shrink the file (or vice versa to expand).



        • The same filesystem you're already using, in case you're using macOS and it likes something other than ext. I'm pretty sure macOS's mount command works with a single large archive file too.
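The file-backed ext4 idea above could be scripted roughly like this on Linux. It is only a sketch: the image name, size and mount point are made-up placeholders, the mount step needs root, and by default the script only prints the commands instead of running them.

```python
import shlex
import subprocess


def image_commands(image, size, mountpoint):
    """Return the commands that create, format and mount a file-backed ext4 image."""
    return [
        # Create a sparse file of the chosen size (this is the fixed archive size).
        ["truncate", "-s", size, image],
        # -F: go ahead even though the target is a regular file, not a block device.
        ["mkfs.ext4", "-F", image],
        ["mkdir", "-p", mountpoint],
        # "-o loop" is spelled out here; recent versions of mount can infer it.
        ["mount", "-o", "loop", image, mountpoint],
    ]


def run_all(commands, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    run_all(image_commands("backup.img", "10G", "/mnt/backup"))
```

Once mounted, restoring one file is an ordinary copy out of the mount point, which is why lookup cost no longer depends on the archive's size.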



        If you also want compression, that's usually where solid archives and slow reads come in. Some filesystems support compression directly (btrfs, reiserfs/reiser4, planned for ext?) but I'd just go with:





        • SquashFS - It might be the compression king: it saves file attributes and allows quick extraction of a single file (in fact, you can mount it and browse every file). It has adjustable compression levels and is great for archives.



          Or perhaps combine it with incremental backups & overlay mounts for a nice "partial backups but full files" solution.



          A con is that the image is read-only once built: you can't delete or change files, or shrink it, without rebuilding it (mksquashfs can only append new files to an existing image).



          Or just use an existing backup product (Time Machine?).
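For the SquashFS route, building an image and pulling one file back out without mounting might look like the following sketch. The directory, image and member names are invented for illustration, and the xz compressor is an assumption that depends on how squashfs-tools was built; the helpers only assemble the command lines so they can be inspected before running.

```python
import shlex


def squashfs_create(src_dir, image, comp="xz"):
    """Command that packs src_dir into a compressed SquashFS image."""
    return ["mksquashfs", src_dir, image, "-comp", comp]


def squashfs_extract_one(image, member, dest_dir):
    """Command that extracts a single path from the image into dest_dir.

    unsquashfs takes the image followed by the paths to extract,
    so only the requested member is unpacked.
    """
    return ["unsquashfs", "-d", dest_dir, image, member]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(shlex.join(squashfs_create("/Users/me", "home.sqsh")))
    print(shlex.join(squashfs_extract_one("home.sqsh", "Documents/notes.txt", "restored")))
```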




        If you really want to use an archive like 7z/zip anyway but still keep the file attributes, you could tar each file individually (saving the attributes), then store the separate tar files in a 7z/zip archive. It takes an extra step and more hassle, but it would let you easily extract a single (tar'd) file, and grow or shrink the archive without re-compressing everything (as long as it's not a solid archive).
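The tar-inside-zip idea can be sketched with Python's standard tarfile and zipfile modules. This is an illustration under assumptions, not a finished backup tool: it works on in-memory data, the helper names are made up, and it only carries the mode and mtime to show that the attributes survive the round trip.

```python
import io
import tarfile
import zipfile


def wrap_in_tar(name, data, mode, mtime):
    """Wrap one file in its own tiny tar so its attributes are recorded."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        info.mode = mode
        info.mtime = mtime
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def build_archive(files):
    """files maps name -> (data, mode, mtime); each entry becomes name.tar in a zip."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, (data, mode, mtime) in files.items():
            zf.writestr(name + ".tar", wrap_in_tar(name, data, mode, mtime))
    return out.getvalue()


def read_one(archive_bytes, name):
    """Fetch a single wrapped file via the zip index, then unwrap its tar."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        inner = zf.read(name + ".tar")
    with tarfile.open(fileobj=io.BytesIO(inner), mode="r:") as tf:
        member = tf.getmember(name)
        return tf.extractfile(member).read(), member.mode, member.mtime
```

Because each member is compressed on its own, the zip stays non-solid, which is what keeps single-file extraction cheap; the price is a tar header of overhead per file.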















        edited Dec 19 '18 at 15:18

























        answered Dec 19 '18 at 0:48









        Xen2050


























            -1






            The Zip format was designed for extracting single files randomly and efficiently. A Zip archive ends with a catalog (the central directory) that lets a reader locate any single file quickly, compressed or not.
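As a rough illustration of that catalog, here is a sketch using Python's standard zipfile module (the member names are invented for the example). The wanted member is looked up in the central directory, which records the offset of its local header, so the reader can seek straight to it instead of walking the archive.

```python
import io
import zipfile


def build_zip(names):
    """Create an in-memory zip with one small text member per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.writestr(name, f"contents of {name}\n")
    return buf.getvalue()


def read_member(zip_bytes, wanted):
    """Return (data, header_offset) for one member, found via the central directory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(wanted)          # lookup in the catalog, no scan of member data
        return zf.read(info), info.header_offset


if __name__ == "__main__":
    names = [f"file{i:03d}.txt" for i in range(100)]
    data, offset = read_member(build_zip(names), "file099.txt")
    print(data, "starts at byte offset", offset)
```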

















            answered Dec 18 '18 at 23:16









            Zerte













            • Cool, but we knew this. Do you know of any other formats doing the same?
              – oligofren
              Dec 18 '18 at 23:21










            • OP already said this in his Question. He's looking for other suggestions besides .zip.
              – Spiff
              Dec 18 '18 at 23:38





























            -1














            Most modern compression archive formats include a database or catalog of the files and folders stored within them. These include: 7-Zip, ACE, ARC, ARJ, BZIP2, CAB, CPIO, GZIP, IMG, ISO (ISO9660), LHA, RAR, RPM, SFX, SQX, TAR, TBZ (TAR.BZ), TGZ (TAR.GZ), TXZ (TAR.XZ), XZ, ZIP, Zip64, and ZOO. These formats will allow you to extract an individual file or folder, as needed.



            ZIP is by far the most common and widely used. Some operating systems, like Windows, have native support for ZIP files, letting you use a ZIP file as if it were a standard folder.



            As for the efficiency of extracting an individual file, I have never seen a test of this. However, I have used ZIP archives in this manner, and I can say it is pretty fast, depending on the size of the file.
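For what it's worth, here is a small sketch of how one could check this for tar with Python's standard tarfile module (the member names are invented). It counts how many headers have to be read before the wanted member turns up, which is the cost the comments below describe for plain tar: there is no catalog to consult, so the reader walks the archive in order.

```python
import io
import tarfile


def build_tar(names):
    """Create an in-memory, uncompressed tar with one small member per name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name in names:
            data = f"contents of {name}\n".encode()
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def extract_by_scanning(fileobj, wanted):
    """Return (data, headers_visited); members are visited in archive order."""
    visited = 0
    with tarfile.open(fileobj=fileobj, mode="r:") as tf:
        for member in tf:                  # each step reads the next header
            visited += 1
            if member.name == wanted:
                return tf.extractfile(member).read(), visited
    raise KeyError(wanted)


if __name__ == "__main__":
    names = [f"file{i:03d}.txt" for i in range(100)]
    data, visited = extract_by_scanning(build_tar(names), "file099.txt")
    print(data, "found after reading", visited, "headers")
```

On a seekable file the member data can be skipped between headers, but on a pipe or a compressed stream everything before the wanted member has to be read.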





























            • Many of the formats you listed are just compression formats, not archive formats. ZIP is both-in-one, but TAR is just an uncompressed archive format, and GZIP is just a compression format. If you want to take a directory full of files and put them all inside one compressed file, you can't use TAR alone or GZIP alone; you have to use TAR to make the archive, and GZIP to compress it. Also, as OP said, TAR doesn't meet his needs because it does not contain any kind of catalog/database/table-of-contents data structure up front.
              – Spiff
              Dec 18 '18 at 23:47










            • @Spiff compression formats are a type of archive format. It doesn't matter whether TAR meets his needs; you are capable of removing a single file. He can determine his needs as necessary.
              – Keltari
              Dec 19 '18 at 0:11








            • No, not all compression formats are archive formats. Unix has always distinguished between compression (making a single file smaller) and archiving (storing a bunch of files inside a single file). If you come from a DOS/Windows or classic Mac background where formats like PKZIP and StuffIt! always combined both roles in one, you might not have learned that there are archive formats that don't compress, and compression formats that don't archive. Here, Wikipedia is smart enough to keep it straight: en.wikipedia.org/wiki/List_of_archive_formats
              – Spiff
              Dec 19 '18 at 3:14










            • This is incorrect. Neither tar nor cpio has such an index (in the POSIX versions; GNU tar does, but BSD tar does not). Listing the contents is done by scanning the entire archive, which is what keeps the format pipe-friendly. So listing the files of a 100 GB archive involves reading up to 100 GB. The same goes for extracting single files; if you are lucky, they might be near the start of the archive.
              – oligofren
              Dec 19 '18 at 10:36
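
            The sequential scan described in the comment above can be sketched with Python's standard `tarfile` module. This is only an illustration under assumed data: `build_sample_tar` and `find_member` are hypothetical helpers, and the archive is a small uncompressed one held in memory. The loop counts how many member headers have to be visited before the wanted file is found.

```python
import io
import tarfile


def build_sample_tar(count: int = 100) -> bytes:
    """Create an in-memory, uncompressed tar with `count` small members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for i in range(count):
            payload = f"contents of file {i}\n".encode()
            info = tarfile.TarInfo(name=f"docs/file_{i:04d}.txt")
            info.size = len(payload)
            tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def find_member(archive_bytes: bytes, wanted: str):
    """Walk the archive header by header; return (data, headers_visited)."""
    headers_visited = 0
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tf:
        # Iterating a TarFile reads member headers lazily, in archive order.
        for member in tf:
            headers_visited += 1
            if member.name == wanted:
                return tf.extractfile(member).read(), headers_visited
    raise KeyError(wanted)


if __name__ == "__main__":
    archive = build_sample_tar()
    _, first = find_member(archive, "docs/file_0000.txt")
    _, last = find_member(archive, "docs/file_0099.txt")
    print(first, last)
```

            On a seekable, uncompressed file a reader can skip over each member's data and only read the headers, but it still has to visit every header before the target. With a compressed stream such as `.tar.gz`, everything before the target has to be decompressed as well.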
















            edited Dec 18 '18 at 23:30
            answered Dec 18 '18 at 23:24
            Keltari






























