Why is the behavior of the `#!` syntax unspecified by POSIX?

From the Shell Command Language page of the POSIX specification:

> If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.

Why is the behavior of #! unspecified by POSIX? I find it baffling that something so portable and widely used would have unspecified behavior.










  • Standards leave things unspecified to not tie down implementations to particular behaviours. For example, a "login" is "The unspecified activity by which a user gains access to the system."
    – Kusalananda, Dec 18 '18 at 7:41

  • Since POSIX doesn't specify executable paths, a shebang line is inherently non-portable anyway; I'm not sure much would be gained by specifying it regardless.
    – Michael Homer, Dec 18 '18 at 7:57

  • @MichaelHomer, surely not? The standard could specify that the line contains a path to use for the interpreter, even without telling what that path should be.
    – ilkkachu, Dec 18 '18 at 9:44

  • @HaroldFischer Except it's not interpreted by the shell, it's interpreted by either the OS kernel (done at least on Linux, which can actually disable this support during build time), or whatever library implements the exec() function. So checking against multiple shells doesn't really tell you how portable it is.
    – Austin Hemmelgarn, Dec 18 '18 at 20:36

  • @HaroldFischer Furthermore, even among POSIX-compliant OSes the behavior isn't consistent. Linux and macOS behave differently: Linux does not fully tokenize the shebang line by spaces. macOS does not allow the script interpreter to be another script. Also see en.wikipedia.org/wiki/Shebang_(Unix)#Portability
    – jamesdlin, Dec 18 '18 at 21:30
shell posix shebang

asked Dec 18 '18 at 7:37 by Harold Fischer, edited Dec 18 '18 at 9:43 by ilkkachu








3 Answers

I think primarily because:

  • The behaviour varies greatly between implementations. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.

    It could however now specify a minimum subset of most Unix-like implementations: like #! *[^ ]+( +[^ ]+)?\n (with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long, behaviour is unspecified if the executable is setuid/setgid, and it is implementation-defined whether the interpreter path or the script path is passed as argv[0] to the interpreter.

  • POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in /bin or /usr/bin and have the POSIX utilities somewhere else (like on Solaris 10, where /bin/sh is a Bourne shell and the POSIX one is in /usr/xpg4/bin; Solaris 11 replaced it with ksh93, which is more POSIX compliant, but most of the other tools in /bin are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.

    See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the shebang is honoured when a script is invoked by a Cygwin application, but not by a native Windows application.

    So even if POSIX specified the shebang mechanism, it could not be used to write POSIX sh/sed/awk... scripts (also note that the shebang mechanism cannot be used to write reliable sed/awk scripts, as it doesn't allow passing an end-of-option marker).
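To make the "minimum subset" from the first bullet concrete, here is a rough sketch of a check for it. The function name `is_minimal_shebang_line` is made up for illustration, and the pattern is one strict reading of the description above: an absolute path plus at most one extra word, both limited to the portable filename character set (with `/` allowed in the path only). It does not check line length, setuid/setgid bits, or whether the interpreter is a native executable.

```sh
# Hypothetical helper, not part of any standard: succeed if the string "$1"
# looks like a conservative one-or-two-word shebang line.
# Assumptions: only spaces separate words, and the optional second word is
# limited to the portable filename character set (A-Z a-z 0-9 . _ -).
is_minimal_shebang_line() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -Eq '^#! */[A-Za-z0-9._/-]+( +[A-Za-z0-9._-]+)?$'
}
```

It could be used as `is_minimal_shebang_line "$(head -n 1 ./myscript)"`, where `./myscript` is a placeholder path. A line such as `#! /bin/sh -` passes, while a line with a relative interpreter or with more than one argument does not.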




Now, the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with #! if you expect it to be only a regular comment and not a shebang), only that POSIX gives you no guarantee if you do.

In my experience, using shebangs gives you more of a portability guarantee than POSIX's way of writing shell scripts, which is: leave off the shebang, write the script in POSIX sh syntax, and hope that whatever invokes the script invokes a POSIX-compliant sh on it. That is fine if you know the script will be invoked in the right environment by the right tool, but not otherwise.

You may have to do things like:

    #! /bin/sh -
    if : ^ false; then : fine, POSIX system by default
    else
      # cover Solaris 10 or older. ": ^ false" returns false
      # in the Bourne shell as ^ is an alias for | there for
      # compatibility with the Thompson shell.
      PATH=`getconf PATH`:$PATH; export PATH
      exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
    fi
    # rest of script

If you want to be portable to Windows+Cygwin, you may have to name your file with a .bat or .ps1 extension and use some similar trick for cmd.exe or powershell.exe to invoke the Cygwin sh on the same file.






  • Interestingly, from issue 5: "The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment."
    – muru, Dec 18 '18 at 7:59

  • @muru, thanks, see edit.
    – Stéphane Chazelas, Dec 18 '18 at 8:08

  • @muru If the script was truly portable, on a truly POSIX system running a POSIX sh, it would not need a hashbang line as it would be executed by POSIX sh.
    – Kusalananda, Dec 18 '18 at 8:11

  • @Kusalananda that's only true if execlp or execvp were used, right? If I were to use execve, it would result in ENOEXEC?
    – muru, Dec 18 '18 at 8:18



















> [T]he behavior seems consistent between all POSIX-compliant shells. I don't see the need for wiggle room here.

You aren't looking deeply enough.

Back in the 1980s, this mechanism was not de facto standardized. Although Dennis Ritchie had implemented it, that implementation had not reached the public in the AT&T side of the universe. It was effectively only publicly available and known in BSD, with executable shell scripts not available on AT&T Unix. Thus it was not reasonable to standardize it. The state of affairs is exemplified by this contemporary doco, one of many such:

> Note that BSD allows files which begin with #! interpreter to be executed directly, while SysV allows only a.out files to be executed directly. This means that an instance of one of the exec…() routines in a BSD program may have to be changed under SysV to execute the interpreter (typically /bin/sh) for that program instead.
>
> – Stephen Frede (1988). "Programming on System X Release Y". Australian Unix Systems User Group Newsletter. Volume 9. Number 4. p. 111.

An important point here is that you are looking at shells, whereas the existence of executable shell scripts is actually a matter for the exec…() functions. What shells do includes the precursors of the executable script mechanism, still to be found in some shells even today (and also nowadays mandated for the exec…p() subset of functions), and is somewhat misleading. What the standard needs to address in this regard is how exec…() on an interpreted script works, and at the time that POSIX was originally created it simply did not work in the first place across a major part of the spectrum of target operating systems.
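The shell-side fallback mentioned above can be observed with a small experiment. This is only a sketch under a few assumptions: `run_without_shebang` is a made-up name, `mktemp -d` is widely available but not specified by POSIX, and the scratch directory must live on a filesystem that permits execution. The file it creates has no `#!` line at all, so any successful run comes from the invoking shell falling back to interpreting the file itself after the plain exec is refused.

```sh
# Hypothetical demo: create an executable text file with NO shebang line
# and run it by path. A raw execve() would be expected to fail with
# ENOEXEC; the calling shell then runs the file as a shell script.
run_without_shebang() {
    dir=$(mktemp -d) || return 1          # mktemp -d: common, not POSIX
    printf '%s\n' 'echo "ran without a shebang"' > "$dir/noshebang"
    chmod +x "$dir/noshebang"
    "$dir/noshebang"                      # the shell's fallback kicks in here
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling `run_without_shebang` from sh, dash, or bash prints `ran without a shebang`. The point of the experiment is that this tells you about the shell (and about execlp/execvp), not about what execve() does with a `#!` line.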



A subordinate question is why this has not been standardized since, especially as the magic number mechanism for script interpreters had reached the public in the AT&T side of the universe, and had been documented for exec…() in the System V Interface Definition, by the turn of the 1990s:

> An interpreter file begins with a line of the form
>
>     #! pathname [arg]
>
> where pathname is the path of the interpreter, and arg is an optional argument. When you exec an interpreter file, the system execs the specified interpreter.
>
> – exec. System V Interface Definition. Volume 1. 1991.

Unfortunately, the behaviour remains today almost as widely divergent as it was in the 1980s, and there is no truly common behaviour to standardize. Some Unices (famously HP-UX and FreeBSD, for example) do not support scripts as interpreters for scripts. Whether the first line is one, two, or many elements separated by whitespace varies between macOS (and versions of FreeBSD before 2005) and others. The maximum supported path length varies, and characters outwith the POSIX portable filename character set are tricky, as are leading and trailing whitespace. What the 0th, 1st, and 2nd arguments end up being is also tricky, with significant variation across systems. Some currently POSIX-conformant but non-Unix systems still do not support any such mechanism, and mandating it would convert them into no longer being POSIX conformant.
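To illustrate the "one, two, or many elements" divergence, the following sketch models two splitting strategies in plain sh. It is a simulation, not a probe of the running kernel: both function names are invented here, only spaces are treated as separators, and the mapping of each style to particular systems follows the descriptions in this thread (Linux passing the rest of the line as a single argument, macOS splitting it into words) rather than anything the code verifies.

```sh
# Hypothetical models: print the argument vector that an exec
# implementation might build from shebang line "$1" for script path "$2".

# Style A: interpreter plus at most ONE argument (the rest of the line).
shebang_argv_single() {
    rest=${1#'#!'}
    rest=${rest#"${rest%%[! ]*}"}      # drop leading spaces
    rest=${rest%"${rest##*[! ]}"}      # drop trailing spaces
    interp=${rest%% *}                 # first word is the interpreter
    arg=${rest#"$interp"}
    arg=${arg#"${arg%%[! ]*}"}         # drop spaces before the argument
    printf '[%s]' "$interp"
    if [ -n "$arg" ]; then printf '[%s]' "$arg"; fi
    printf '[%s]\n' "$2"
}

# Style B: every space-separated word becomes its own argument.
shebang_argv_split() {
    rest=${1#'#!'}
    set -f                             # no pathname expansion while splitting
    old_ifs=$IFS; IFS=' '
    for word in $rest; do printf '[%s]' "$word"; done
    IFS=$old_ifs
    set +f                             # note: re-enables globbing unconditionally
    printf '[%s]\n' "$2"
}
```

For the line `#!/usr/bin/env python3 -u` and script path `./script`, the first model prints `[/usr/bin/env][python3 -u][./script]` and the second prints `[/usr/bin/env][python3][-u][./script]`. A script that relies on either shape is tied to the systems that produce it.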



Further reading




  • Which shell interpreter runs a script with no shebang?

  • Why am I able to pass arguments to /usr/bin/env in this case?


  • script. NetBSD Miscellaneous Information Manual. 2005-05-06.






As noted by some of the other answers, implementations vary. This makes it hard to standardize and preserve backward compatibility with existing scripts. This is true even for modern POSIX systems. For example, Linux does not fully tokenize the shebang line by spaces. macOS does not allow the script interpreter to be another script.

Also see http://en.wikipedia.org/wiki/Shebang_(Unix)#Portability






    share|improve this answer





















      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "106"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f489628%2fwhy-is-the-behavior-of-the-syntax-unspecified-by-posix%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      19














      I think primarily because:





      • the behaviour varies greatly between implementation. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.



        It could however now specify a minimum subset of most Unix-like implementations: like #! *[^ ]+( +[^ ]+)?n (with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long and behaviour unspecified if the executable is setuid/setgid, and implementation defined whether the interpreter path or the script path is passed as argv[0] to the interpreter.




      • POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in /bin//usr/bin and have the POSIX utilities somewhere else (like on Solaris 10 where /bin/sh is a Bourne shell and the POSIX one is in /usr/xpg4/bin; Solaris 11 replaced it with ksh93 which is more POSIX compliant, but most of the other tools in /bin are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.



        See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the she-bang is honoured when a script is invoked by a cygwin application, but not by a native Windows application.



        So even if POSIX specified the shebang mechanism it could not be used to write POSIX sh/sed/awk... scripts (also note that the shebang mechanism cannot be used to write reliable sed/awk script as it doesn't allow passing an end-of-option marker).




      Now the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with #! if you expect it to be only a regular comment and not a she-bang), but that POSIX gives you no guarantee if you do.



      In my experience, using shebangs gives you more guarantee of portability than using POSIX's way of writing shell scripts: leave off the she-bang, write the script in POSIX sh syntax and hope that whatever invokes the script invokes a POSIX compliant sh on it, which is fine if you know the script will be invoked in the right environment by the right tool but not otherwise.



      You may have to do things like:



      #! /bin/sh -
      if : ^ false; then : fine, POSIX system by default
      else
      # cover Solaris 10 or older. ": ^ false" returns false
      # in the Bourne shell as ^ is an alias for | there for
      # compatibility with the Thomson shell.
      PATH=`getconf PATH`:$PATH; export PATH
      exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
      fi
      # rest of script


      If you want to be portable to Windows+Cygwin, you may have to name your file with a .bat or .ps1 extension and use some similar trick for cmd.exe or powershell.exe to invoke the cygwin sh on the same file.






      share|improve this answer























      • Interestingly, from issue 5: "The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment."
        – muru
        Dec 18 '18 at 7:59










      • @muru, thanks, see edit.
        – Stéphane Chazelas
        Dec 18 '18 at 8:08










      • @muru If the script was truly portable, on a truly POSIX system running a POSIX sh, it would not need a hashbang line as it would be executed by POSIX sh.
        – Kusalananda
        Dec 18 '18 at 8:11






      • 1




        @Kusalananda that's only true if execlp or execvp were used, right? If I were to use execve, it would result in ENOEXEC?
        – muru
        Dec 18 '18 at 8:18
















      19














      I think primarily because:





      • the behaviour varies greatly between implementation. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.



        It could however now specify a minimum subset of most Unix-like implementations: like #! *[^ ]+( +[^ ]+)?n (with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long and behaviour unspecified if the executable is setuid/setgid, and implementation defined whether the interpreter path or the script path is passed as argv[0] to the interpreter.




      • POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in /bin//usr/bin and have the POSIX utilities somewhere else (like on Solaris 10 where /bin/sh is a Bourne shell and the POSIX one is in /usr/xpg4/bin; Solaris 11 replaced it with ksh93 which is more POSIX compliant, but most of the other tools in /bin are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.



        See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the she-bang is honoured when a script is invoked by a cygwin application, but not by a native Windows application.



        So even if POSIX specified the shebang mechanism it could not be used to write POSIX sh/sed/awk... scripts (also note that the shebang mechanism cannot be used to write reliable sed/awk script as it doesn't allow passing an end-of-option marker).




      Now the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with #! if you expect it to be only a regular comment and not a she-bang), but that POSIX gives you no guarantee if you do.



      In my experience, using shebangs gives you more guarantee of portability than using POSIX's way of writing shell scripts: leave off the she-bang, write the script in POSIX sh syntax and hope that whatever invokes the script invokes a POSIX compliant sh on it, which is fine if you know the script will be invoked in the right environment by the right tool but not otherwise.



      You may have to do things like:



      #! /bin/sh -
      if : ^ false; then : fine, POSIX system by default
      else
      # cover Solaris 10 or older. ": ^ false" returns false
      # in the Bourne shell as ^ is an alias for | there for
      # compatibility with the Thomson shell.
      PATH=`getconf PATH`:$PATH; export PATH
      exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
      fi
      # rest of script


      If you want to be portable to Windows+Cygwin, you may have to name your file with a .bat or .ps1 extension and use some similar trick for cmd.exe or powershell.exe to invoke the cygwin sh on the same file.






      share|improve this answer























      • Interestingly, from issue 5: "The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment."
        – muru
        Dec 18 '18 at 7:59










      • @muru, thanks, see edit.
        – Stéphane Chazelas
        Dec 18 '18 at 8:08










      • @muru If the script was truly portable, on a truly POSIX system running a POSIX sh, it would not need a hashbang line as it would be executed by POSIX sh.
        – Kusalananda
        Dec 18 '18 at 8:11






      • 1




        @Kusalananda that's only true if execlp or execvp were used, right? If I were to use execve, it would result in ENOEXEC?
        – muru
        Dec 18 '18 at 8:18














      19












      19








      19






      I think primarily because:





      • the behaviour varies greatly between implementation. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.



        It could however now specify a minimum subset of most Unix-like implementations: like #! *[^ ]+( +[^ ]+)?n (with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long and behaviour unspecified if the executable is setuid/setgid, and implementation defined whether the interpreter path or the script path is passed as argv[0] to the interpreter.




      • POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in /bin//usr/bin and have the POSIX utilities somewhere else (like on Solaris 10 where /bin/sh is a Bourne shell and the POSIX one is in /usr/xpg4/bin; Solaris 11 replaced it with ksh93 which is more POSIX compliant, but most of the other tools in /bin are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.



        See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the she-bang is honoured when a script is invoked by a cygwin application, but not by a native Windows application.



        So even if POSIX specified the shebang mechanism it could not be used to write POSIX sh/sed/awk... scripts (also note that the shebang mechanism cannot be used to write reliable sed/awk script as it doesn't allow passing an end-of-option marker).




      Now the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with #! if you expect it to be only a regular comment and not a she-bang), but that POSIX gives you no guarantee if you do.



      In my experience, using shebangs gives you more guarantee of portability than using POSIX's way of writing shell scripts: leave off the she-bang, write the script in POSIX sh syntax and hope that whatever invokes the script invokes a POSIX compliant sh on it, which is fine if you know the script will be invoked in the right environment by the right tool but not otherwise.



      You may have to do things like:



      #! /bin/sh -
      if : ^ false; then : fine, POSIX system by default
      else
      # cover Solaris 10 or older. ": ^ false" returns false
      # in the Bourne shell as ^ is an alias for | there for
      # compatibility with the Thomson shell.
      PATH=`getconf PATH`:$PATH; export PATH
      exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
      fi
      # rest of script


      If you want to be portable to Windows+Cygwin, you may have to name your file with a .bat or .ps1 extension and use some similar trick for cmd.exe or powershell.exe to invoke the cygwin sh on the same file.






      share|improve this answer














      I think primarily because:





      • the behaviour varies greatly between implementation. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.



        It could however now specify a minimum subset of most Unix-like implementations: like #! *[^ ]+( +[^ ]+)?n (with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long and behaviour unspecified if the executable is setuid/setgid, and implementation defined whether the interpreter path or the script path is passed as argv[0] to the interpreter.




      • POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in /bin//usr/bin and have the POSIX utilities somewhere else (like on Solaris 10 where /bin/sh is a Bourne shell and the POSIX one is in /usr/xpg4/bin; Solaris 11 replaced it with ksh93 which is more POSIX compliant, but most of the other tools in /bin are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.



        See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the she-bang is honoured when a script is invoked by a cygwin application, but not by a native Windows application.



        So even if POSIX specified the shebang mechanism, it could not be used to write POSIX sh/sed/awk... scripts (also note that the shebang mechanism cannot be used to write reliable sed/awk scripts, as it doesn't allow passing an end-of-option marker).
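As a rough illustration of the "minimum subset" idea from the first point, the sketch below checks whether a file's first line fits a conservative pattern: `#!`, optional spaces, an absolute path, and at most one further word. The function name `fits_conservative_shebang` is hypothetical, and the exact character classes are an assumption (portable filename characters plus `/` in the path); no length limit is checked.

```shell
# Hypothetical helper: prints "yes" when the first line of file "$1"
# fits a conservative shebang subset, "no" otherwise.
# Assumption: words use only the portable filename character set
# (A-Z a-z 0-9 . _ -), with "/" additionally allowed in the path.
fits_conservative_shebang() {
    if head -n 1 "$1" |
        LC_ALL=C grep -Eq '^#! *(/[A-Za-z0-9._-]+)+( +[A-Za-z0-9._-]+)?$'
    then
        echo yes
    else
        echo no
    fi
}
```

For example, `fits_conservative_shebang ./myscript` would report whether `./myscript` (a placeholder name) stays inside that subset; a line such as `#!/bin/sh -e -u`, with two arguments, falls outside it.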




      Now the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with #! if you expect it to be only a regular comment and not a she-bang), but that POSIX gives you no guarantee if you do.



      In my experience, using shebangs gives you more of a guarantee of portability than POSIX's way of writing shell scripts, which is to leave off the she-bang, write the script in POSIX sh syntax and hope that whatever invokes the script invokes a POSIX-compliant sh on it. That is fine if you know the script will be invoked in the right environment by the right tool, but not otherwise.



      You may have to do things like:



      #! /bin/sh -
      if : ^ false; then
        : fine, POSIX system by default
      else
        # Cover Solaris 10 or older. ": ^ false" returns false
        # in the Bourne shell, as ^ is an alias for | there for
        # compatibility with the Thompson shell.
        PATH=`getconf PATH`:$PATH; export PATH
        exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
      fi
      # rest of script


      If you want to be portable to Windows+Cygwin, you may have to name your file with a .bat or .ps1 extension and use some similar trick for cmd.exe or powershell.exe to invoke the cygwin sh on the same file.
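On the earlier remark that a shebang cannot carry an end-of-option marker for sed or awk: with a first line like `#! /bin/sed -f`, running the script as `./script -n` would typically end up as `sed -f ./script -n`, so `-n` is taken as an option rather than a file name. One possible workaround, sketched here under that assumption, is to go through sh and place `--` explicitly. The function name `capitalise_first_a` and the sed expression are made up for illustration.

```shell
# Hypothetical filter: capitalise the first "a" on each line.
# As a standalone script this would be "#! /bin/sh -" followed by
#     exec sed -e 's/a/A/' -- "$@"
# The "--" marks the end of options, so an operand such as a file
# literally named "-n" is not mistaken for sed's -n option.
capitalise_first_a() {
    sed -e 's/a/A/' -- "$@"
}
```

Called as `capitalise_first_a -n` in a directory containing a file named `-n`, the function reads that file instead of switching sed into quiet mode.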















      edited Dec 18 '18 at 14:57

























      answered Dec 18 '18 at 7:57









      Stéphane Chazelas













      • Interestingly, from issue 5: "The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment."
        – muru
        Dec 18 '18 at 7:59










      • @muru, thanks, see edit.
        – Stéphane Chazelas
        Dec 18 '18 at 8:08










      • @muru If the script was truly portable, on a truly POSIX system running a POSIX sh, it would not need a hashbang line as it would be executed by POSIX sh.
        – Kusalananda
        Dec 18 '18 at 8:11






      • 1




        @Kusalananda that's only true if execlp or execvp were used, right? If I were to use execve, it would result in ENOEXEC?
        – muru
        Dec 18 '18 at 8:18































      9















      [T]he behavior seems consistent between all POSIX-compliant shells. I don't see the need for wiggle room here.




      You aren't looking deeply enough.



      Back in the 1980s, this mechanism was not de facto standardized. Although Dennis Ritchie had implemented it, that implementation had not reached the public in the AT&T side of the universe. It was effectively only publicly available and known in BSD, with executable shell scripts not available on AT&T Unix. Thus it was not reasonable to standardize it. The state of affairs is exemplified by this contemporary doco, one of many such:


      Note that BSD allows files which begin with #! interpreter to be executed directly, while SysV allows only a.out files to be executed directly. This means that an instance of one of the exec…() routines in a BSD program may have to be changed under SysV to execute the interpreter (typically /bin/sh) for that program instead.

      — Stephen Frede (1988). "Programming on System X Release Y". Australian Unix Systems User Group Newsletter. Volume 9. Number 4. p. 111.

      An important point here is that you are looking at shells, whereas the existence of executable shell scripts is actually a matter for the exec…() functions. What shells do includes the precursors of the executable script mechanism, still to be found in some shells even today (and also nowadays mandated for the exec…p() subset of functions), and is somewhat misleading. What the standard needs to address in this regard is how exec…() on an interpreted script works, and at the time that POSIX was originally created it simply did not work in the first place across a major part of the spectrum of target operating systems.
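The shell-side precursor mentioned above can be observed with a small experiment: an executable text file with no `#!` line at all is still run when a shell launches it, because the shell itself falls back to interpreting the file. This is only a sketch; the function name `run_without_shebang` is hypothetical, and it assumes the temporary directory allows executing files.

```shell
# Hypothetical demo: create an executable file that has no "#!" line
# and launch it from the current shell.
run_without_shebang() {
    demo_dir=$(mktemp -d) || return
    printf 'echo "ran without a shebang"\n' > "$demo_dir/noshebang"
    chmod +x "$demo_dir/noshebang"
    # exec fails on this file with ENOEXEC; the calling shell then
    # runs the file's contents as a shell script itself.
    "$demo_dir/noshebang"
    status=$?
    rm -rf "$demo_dir"
    return "$status"
}
```

Which shell ends up interpreting such a file depends on the caller, which is part of why this fallback says little about what exec…() does with `#!`.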



      A subordinate question is why this has not been standardized since, especially as the magic number mechanism for script interpreters had reached the public in the AT&T side of the universe and had been documented for exec…() in the System V Interface Definition by the turn of the 1990s:


      An interpreter file begins with a line of the form
      # ! pathname [arg]
      where pathname is the path of the interpreter, and arg is an optional argument.
      When you exec an interpreter file, the system execs the specified interpreter.

      exec. System V Interface Definition. Volume 1. 1991.
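To make the quoted description concrete, the sketch below uses `/bin/echo` as the "interpreter" so that the arguments the system hands over become visible. It assumes a system that supports `#!` and has a native `/bin/echo`; the function name `show_interpreter_args` and the word `optional-arg` are placeholders.

```shell
# Hypothetical demo: an "interpreter file" whose interpreter is
# /bin/echo, so the arguments passed by the system get printed.
show_interpreter_args() {
    demo_dir=$(mktemp -d) || return
    printf '#!/bin/echo optional-arg\n' > "$demo_dir/script"
    chmod +x "$demo_dir/script"
    # Where "#!" is supported, this ends up running something like:
    #   /bin/echo optional-arg <path to script> first second
    "$demo_dir/script" first second
    status=$?
    rm -rf "$demo_dir"
    return "$status"
}
```

The printed line shows the optional argument first, then the script's path, then the arguments given on the command line.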

      Unfortunately, the behaviour remains today almost as widely divergent as it was in the 1980s and there is no truly common behaviour to standardize. Some Unices (famously HP-UX and FreeBSD, for example) do not support scripts as interpreters for scripts. Whether the first line is one, two, or many elements separated by whitespace varies between MacOS (and versions of FreeBSD before 2005) and others. The maximum supported path length varies, and characters outwith the POSIX portable filename character set are tricky, as are leading and trailing whitespace. What the 0th, 1st, and 2nd argument end up being is also tricky, with significant variation across systems. Some currently POSIX-conformant but non-Unix systems still do not support any such mechanism, and mandating it would convert them into no longer being POSIX conformant.



      Further reading




      • Which shell interpreter runs a script with no shebang?

      • Why am I able to pass arguments to /usr/bin/env in this case?


      • script. NetBSD Miscellaneous Information Manual. 2005-05-06.














          edited Dec 28 '18 at 7:03

























          answered Dec 18 '18 at 13:26









          JdeBP
























              1














              As noted by some of the other answers, implementations vary. This makes it hard to standardize and preserve backward-compatibility with existing scripts. This is true even for modern POSIX systems. For example, Linux does not fully tokenize the shebang line by spaces. macOS does not allow the script interpreter to be another script.
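One way to see the tokenization difference on a given machine is a script whose interpreter is `printf`, with a format string followed by extra words on the `#!` line. This is a sketch that assumes a native `/usr/bin/printf` exists; the function name `show_shebang_split` is hypothetical, and no particular output is claimed since it depends on the system.

```shell
# Hypothetical demo: reveal how this system splits the "#!" line.
show_shebang_split() {
    demo_dir=$(mktemp -d) || return
    # The file's first line is literally: #!/usr/bin/printf <%s>\n a b
    printf '%s\n' '#!/usr/bin/printf <%s>\n a b' > "$demo_dir/split-demo"
    chmod +x "$demo_dir/split-demo"
    # If the rest of the line arrives as ONE argument, printf sees the
    # format "<%s>\n a b" and only the script path gets bracketed.
    # If the line is split on spaces, "a" and "b" are bracketed too.
    "$demo_dir/split-demo"
    status=$?
    echo
    rm -rf "$demo_dir"
    return "$status"
}
```

Running `show_shebang_split` on two different systems and comparing how many bracketed items appear gives a quick check of which behaviour each one has.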



              Also see http://en.wikipedia.org/wiki/Shebang_(Unix)#Portability
















                  answered Dec 19 '18 at 7:10









                  jamesdlin






























