Is it safe to use standard input & output with binary data?












I need to split a binary file in two. I was wondering whether head and/or tail could be used, but then I wondered: is it safe to use redirection, piping, etc. with binary data? Do newlines get altered, are NULs ignored, or do backspace or delete characters do something special? (bash, Kubuntu 18.04 LTS)










  • 1





    Take a look at the split command.

    – egmont
    Dec 29 '18 at 22:11
















command-line 18.04 bash kubuntu

edited Dec 29 '18 at 12:58 by Ketan Patel

asked Dec 29 '18 at 12:55 by B.Tanner









2 Answers


















Yes, it's safe if you pipe the data to another process or redirect it to a file: pipes and redirections pass bytes through unchanged, including newlines, NULs, backspace and delete. The potential "weirdness" only arises if you let binary output print to a terminal, since it can contain (by chance) escape sequences that temporarily mess up the terminal display.
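A quick way to convince yourself is a round trip. The following is a minimal sketch, assuming a POSIX shell with mktemp and cmp available; the directory variable and the file names (all-bytes.bin, copy.bin) are made up for illustration. It writes every byte value from 0 to 255 to a file, sends the file through a redirection and two pipes, and compares the result with the original.

```shell
# Hypothetical scratch directory and file names, for illustration only.
rt_dir=$(mktemp -d)

# Write all 256 byte values (NUL, newline, backspace, DEL, ...) to a file.
i=0
while [ "$i" -lt 256 ]; do
    printf '%b' "\\0$(printf '%03o' "$i")"
    i=$((i + 1))
done > "$rt_dir/all-bytes.bin"

# Send the data through an input redirection, two pipes and an output redirection.
cat < "$rt_dir/all-bytes.bin" | cat | cat > "$rt_dir/copy.bin"

# cmp exits with status 0 only when the two files are byte-for-byte identical.
if cmp "$rt_dir/all-bytes.bin" "$rt_dir/copy.bin"; then
    echo "identical"
fi
```

If cmp reports a difference instead, it prints the offset of the first byte that differs.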






  • 5





    In which case you can type reset and press enter to fix it.

    – Baard Kopperud
    Dec 29 '18 at 16:42






  • 3





    @BaardKopperud I thought I read somewhere about some corner cases where tset/reset wouldn't work

    – Xen2050
    Dec 30 '18 at 0:04











  • @Xen2050 I don't know. The only case where that would happen is if some escape sequence changes the keyboard layout/encoding, so that typing reset<enter> does not actually produce that sequence of characters as seen by the terminal...

    – Bakuriu
    Dec 30 '18 at 10:02






  • 2





    See also "Fix terminal after displaying a binary file" and "Why does the console need sometimes a reset after CTRL+C". As suggested in the first link, the command sequence stty sane; tput rs1 will do the trick in corner cases where reset does not work. Such cases, in addition to the one mentioned by Bakuriu, could include the terminal's line/column width or, I'm guessing, the settings related to serial communication (baud rate/parity).

    – Sergiy Kolodyazhnyy
    Dec 30 '18 at 10:34
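To look at binary data without sending raw bytes to the terminal in the first place, one option is to pipe it through a dump tool. A small sketch using od; the three sample bytes are invented for the example:

```shell
# Three sample bytes: 'A', NUL and newline (illustrative data only).
# -A x prints offsets in hexadecimal, -t x1 prints each byte as two hex digits.
printf 'A\0\n' | od -A x -t x1
```

For a real file, something like od -A x -t x1 file | less keeps the dump in a pager.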





















The main problem with using commands like head or tail is that they are line-oriented by default and binary files are not (both accept -c to count bytes instead of lines). If a binary file does contain newlines, they often do not mark the end of a line, and where they do, they may just be part of strings such as program messages or data fields.



If the data is structured in any way, then you have to take that into account in choosing split points so you don't break structures in the middle.
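When all that is needed is a single split point at a known byte offset, head and tail can do the job in byte mode. This is a sketch, assuming GNU coreutils as shipped with Ubuntu; the file names and the 1024-byte split point are arbitrary choices for the example, and 2000 random bytes stand in for the real file.

```shell
# Hypothetical scratch files; random data stands in for the real binary file.
ht_dir=$(mktemp -d)
head -c 2000 /dev/urandom > "$ht_dir/input.bin"

# First part: the first 1024 bytes.
head -c 1024 "$ht_dir/input.bin" > "$ht_dir/part1.bin"
# Second part: everything from byte 1025 onward (tail -c +N counts from 1).
tail -c +1025 "$ht_dir/input.bin" > "$ht_dir/part2.bin"

# Joining the parts must reproduce the original exactly.
cat "$ht_dir/part1.bin" "$ht_dir/part2.bin" | cmp - "$ht_dir/input.bin" && echo "parts rejoin cleanly"
```

The split point still has to respect whatever structure the file has; the commands only guarantee that no bytes are lost or altered.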



If you know the structure of the file, you can use a command such as



dd if=input-file of=output-file ...


with options that copy only a given number of blocks of a specific size (bs= and count=), starting at a particular (incremented) offset into the file (skip=).
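As a concrete sketch of the dd approach, assuming a 512-byte block size and a split after the second block (both numbers, and the file names, are invented for the example):

```shell
# Hypothetical scratch files; 2000 random bytes stand in for the real binary file.
dd_dir=$(mktemp -d)
head -c 2000 /dev/urandom > "$dd_dir/input.bin"

# bs= sets the block size, count= limits how many blocks are copied.
dd if="$dd_dir/input.bin" of="$dd_dir/part1.bin" bs=512 count=2 2>/dev/null
# skip= skips that many input blocks, then the rest of the file is copied.
dd if="$dd_dir/input.bin" of="$dd_dir/part2.bin" bs=512 skip=2 2>/dev/null

# Joining the parts must reproduce the original exactly.
cat "$dd_dir/part1.bin" "$dd_dir/part2.bin" | cmp - "$dd_dir/input.bin" && echo "parts rejoin cleanly"
```

The 2>/dev/null only hides dd's transfer statistics; drop it to see how many blocks were copied.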



The split command mentioned by @egmont automates this process for you, but it is line-oriented by default, so you'll have to specify an option such as --bytes=SIZE to tell it how large each piece of the file should be.
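A minimal sketch of the split approach, assuming GNU split from coreutils; the piece- prefix, the file names and the 1024-byte piece size are arbitrary choices for the example:

```shell
# Hypothetical scratch files; 2000 random bytes stand in for the real binary file.
sp_dir=$(mktemp -d)
head -c 2000 /dev/urandom > "$sp_dir/input.bin"

# --bytes=1024 writes pieces of at most 1024 bytes, named piece-aa, piece-ab, ...
split --bytes=1024 "$sp_dir/input.bin" "$sp_dir/piece-"

# The shell expands piece-* in sorted order, so cat rebuilds the original.
cat "$sp_dir"/piece-* | cmp - "$sp_dir/input.bin" && echo "pieces rejoin cleanly"
```

GNU split also offers --number=2 to cut a file into two pieces of roughly equal size without working out the byte count by hand.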





As a side note, if you don't know what's in a file, but suspect it contains at least some meaningful textual data, the strings command is a great way of taking a first look to see what you're dealing with.



strings -n 6 file | less


will find all runs of printable characters at least six characters in length and display them in a pager so they don't fly by on the terminal. Using a number a bit larger than the default of 4 characters helps eliminate tiny snippets of data that just happen to be printable, but are not being used that way in the file.



If you later have to explore the file in more detail with a binary editor such as hexedit, you'll have some landmarks that point out where something interesting might be found.



strings has an option -t x that precedes each printed string with its offset into the file in hexadecimal (-t o for octal, -t d for decimal), so you know where to find it later. Even very short files are a lot to deal with when you have to look at them character by character.
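To see what the offsets look like, here is a small sketch, assuming strings from GNU binutils is installed; the sample file and its contents are invented for the example:

```shell
# Hypothetical sample: 5 filler bytes, a 13-character text run, then 3 non-text bytes.
st_dir=$(mktemp -d)
printf 'ABC\0\0landmark-text\0\1\2' > "$st_dir/sample.bin"

# -n 6 ignores the 3-character run "ABC"; -t x prefixes each match with its hex offset.
strings -n 6 -t x "$st_dir/sample.bin"
```

The text run starts right after the five filler bytes, so it is reported at offset 5, which is the position to jump to in a binary editor.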






First answer: answered Dec 29 '18 at 13:05 by Eric Mintz








Second answer: answered Jan 3 at 13:40 by Joe





























