Why is the behavior of the `#!` syntax unspecified by POSIX?
From the Shell Command Language page of the POSIX specification:
If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.
Why is the behavior of `#!` unspecified by POSIX? I find it baffling that something so portable and widely used would have an unspecified behavior.
shell posix shebang
1
Standards leave things unspecified to not tie down implementations to particular behaviours. For example, a "login" is "The unspecified activity by which a user gains access to the system."
– Kusalananda
Dec 18 '18 at 7:41
2
Since POSIX doesn't specify executable paths, a shebang line is inherently non-portable anyway; I'm not sure much would be gained by specifying it regardless.
– Michael Homer
Dec 18 '18 at 7:57
1
@MichaelHomer, surely not? The standard could specify that the line contains a path to use for the interpreter, even without telling what that path should be.
– ilkkachu
Dec 18 '18 at 9:44
1
@HaroldFischer Except it's not interpreted by the shell, it's interpreted by either the OS kernel (done at least on Linux, which can actually disable this support during build time), or whatever library implements the `exec()` function. So checking against multiple shells doesn't really tell you how portable it is.
– Austin Hemmelgarn
Dec 18 '18 at 20:36
2
@HaroldFischer Furthermore, even among POSIX-compliant OSes the behavior isn't consistent. Linux and macOS behave differently: Linux does not fully tokenize the shebang line by spaces. macOS does not allow the script interpreter to be another script. Also see en.wikipedia.org/wiki/Shebang_(Unix)#Portability
– jamesdlin
Dec 18 '18 at 21:30
edited Dec 18 '18 at 9:43 by ilkkachu
asked Dec 18 '18 at 7:37 by Harold Fischer
3 Answers
[T]he behavior seems consistent between all POSIX-compliant shells. I don't see the need for wiggle room here.
You aren't looking deeply enough.
Back in the 1980s, this mechanism was not de facto standardized. Although Dennis Ritchie had implemented it, that implementation had not reached the public in the AT&T side of the universe. It was effectively only publicly available and known in BSD, with executable shell scripts not available on AT&T Unix. Thus it was not reasonable to standardize it. The state of affairs is exemplified by this contemporary doco, one of many such:
Note that BSD allows files which begin with `#! interpreter` to be executed directly, while SysV allows only a.out files to be executed directly. This means that an instance of one of the `exec…()` routines in a BSD program may have to be changed under SysV to execute the interpreter (typically `/bin/sh`) for that program instead.
— Stephen Frede (1988). "Programming on System X Release Y". Australian Unix Systems User Group Newsletter. Volume 9. Number 4. p. 111.
An important point here is that you are looking at shells, whereas the existence of executable shell scripts is actually a matter for the `exec…()` functions. What shells do includes the precursors of the executable script mechanism, still to be found in some shells even today (and also nowadays mandated for the `exec…p()` subset of functions), and is somewhat misleading. What the standard needs to address in this regard is how `exec…()` on an interpreted script works, and at the time that POSIX was originally created it simply did not work in the first place across a major part of the spectrum of target operating systems.
A subordinate question is why this has not been standardized since, especially as the magic number mechanism for script interpreters had reached the public in the AT&T side of the universe and had been documented for `exec…()` in the System V Interface Definition, by the turn of the 1990s:
An interpreter file begins with a line of the form
`# ! pathname [arg]`
where pathname is the path of the interpreter, and arg is an optional argument.
When you `exec` an interpreter file, the system `exec`s the specified interpreter.
— `exec`. System V Interface Definition. Volume 1. 1991.
Unfortunately, the behaviour remains today almost as widely divergent as it was in the 1980s and there is no truly common behaviour to standardize. Some Unices (famously HP-UX and FreeBSD, for example) do not support scripts as interpreters for scripts. Whether the first line is one, two, or many elements separated by whitespace varies between macOS (and versions of FreeBSD before 2005) and others. The maximum supported path length varies. ␀ and characters outwith the POSIX portable filename character set are tricky, as are leading and trailing whitespace. What the 0th, 1st, and 2nd argument end up being is also tricky, with significant variation across systems. Some currently POSIX-conformant but non-Unix systems still do not support any such mechanism, and mandating it would convert them into no longer being POSIX conformant.
Further reading
- Which shell interpreter runs a script with no shebang?
- Why am I able to pass arguments to /usr/bin/env in this case?
- `script`. NetBSD Miscellaneous Information Manual. 2005-05-06.
As noted by some of the other answers, implementations vary. This makes it hard to standardize and preserve backward-compatibility with existing scripts. This is true even for modern POSIX systems. For example, Linux does not fully tokenize the shebang line by spaces. macOS does not allow the script interpreter to be another script.
Also see http://en.wikipedia.org/wiki/Shebang_(Unix)#Portability
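The tokenization difference can be pictured with a small model. This is only a sketch: `ltrim`, `shebang_args` and the `single`/`split` modes are names invented here, default `IFS` is assumed, trailing whitespace is not handled, and no claim is made that any particular kernel behaves exactly like either mode.

```sh
# Print "$1" with leading spaces removed.
ltrim() {
    s=$1
    while [ "${s# }" != "$s" ]; do
        s=${s# }
    done
    printf '%s' "$s"
}

# shebang_args MODE LINE
# Prints the interpreter and its argument(s), one per line, in <...>.
shebang_args() {
    mode=$1
    line=$(ltrim "${2#??}")          # drop the two-character "#!" prefix
    interp=${line%% *}               # first word: the interpreter path
    rest=$(ltrim "${line#"$interp"}")
    case $mode in
        single)
            # Everything after the interpreter stays one argument.
            if [ -n "$rest" ]; then
                printf '<%s>\n' "$interp" "$rest"
            else
                printf '<%s>\n' "$interp"
            fi
            ;;
        split)
            # Every space-separated word becomes its own argument.
            set -f                   # no globbing during word splitting
            set -- "$interp" $rest
            set +f
            printf '<%s>\n' "$@"
            ;;
        *)
            return 2
            ;;
    esac
}
```

For the line `#!/usr/bin/env python3 -u`, the `single` mode yields two items (`/usr/bin/env` and `python3 -u`), while `split` yields three, so the same script can start a different command depending on the system.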
I think primarily because:
The behaviour varies greatly between implementations. See https://www.in-ulm.de/~mascheck/various/shebang/ for all the details.
It could however now specify a minimum subset of most Unix-like implementations, like
`#! *[^ ]+( +[^ ]+)?\n`
(with only characters from the portable filename character set in those one or two words) where the first word is an absolute path to a native executable, the thing is not too long, behaviour is unspecified if the executable is setuid/setgid, and it is implementation-defined whether the interpreter path or the script path is passed as `argv[0]` to the interpreter.
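As a rough illustration of that conservative subset, the following sketch checks whether a script's first line fits it. The function name `is_conservative_shebang` is made up for this example, the character classes add `/` to the portable filename character set so that paths can match, and the length and setuid/setgid conditions are not checked. `grep -E` is assumed to be available.

```sh
# Succeeds if the first line of file "$1" fits the conservative subset:
# "#!", optional spaces, an absolute path, and at most one argument.
is_conservative_shebang() {
    first_line=
    # read returns non-zero on a missing final newline or an unreadable
    # file; first_line still holds whatever was read (possibly nothing).
    IFS= read -r first_line < "$1" || :
    printf '%s\n' "$first_line" |
        grep -Eq '^#! *(/[A-Za-z0-9._-]+)+( +[A-Za-z0-9._/-]+)?$'
}
```

A first line such as `#! /bin/sh -` passes, while `#!/usr/bin/env python3 -u` is rejected because it carries two arguments after the interpreter.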
POSIX doesn't specify the path of executables anyway. Several systems have pre-POSIX utilities in `/bin` / `/usr/bin` and have the POSIX utilities somewhere else (like on Solaris 10 where `/bin/sh` is a Bourne shell and the POSIX one is in `/usr/xpg4/bin`; Solaris 11 replaced it with ksh93 which is more POSIX compliant, but most of the other tools in `/bin` are still ancient non-POSIX ones). Some systems are not POSIX ones but have a POSIX mode/emulation. All POSIX requires is that there be a documented environment in which a system behaves POSIXly.
See Windows+Cygwin for instance. Actually, with Windows+Cygwin, the she-bang is honoured when a script is invoked by a cygwin application, but not by a native Windows application.
So even if POSIX specified the shebang mechanism it could not be used to write POSIX `sh`/`sed`/`awk`... scripts (also note that the shebang mechanism cannot be used to write reliable `sed`/`awk` scripts as it doesn't allow passing an end-of-option marker).
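To make the end-of-option point concrete, here is a small model of the argument list an interpreter typically receives: the interpreter path, at most one argument from the `#!` line, the script path, then the caller's arguments. The function name `simulate_shebang_argv` is hypothetical, and the layout shown is the common case only; real systems differ.

```sh
# simulate_shebang_argv INTERPRETER SHEBANG_ARG SCRIPT [ARGS...]
# Pass an empty string as SHEBANG_ARG when the "#!" line has no argument.
# Requires at least three arguments.
simulate_shebang_argv() {
    interpreter=$1
    shebang_arg=$2
    script=$3
    shift 3
    if [ -n "$shebang_arg" ]; then
        set -- "$interpreter" "$shebang_arg" "$script" "$@"
    else
        set -- "$interpreter" "$script" "$@"
    fi
    # printf reuses the format for every argument, giving <a><b><c>...
    printf '<%s>' "$@"
    printf '\n'
}
```

With `#! /bin/sh -`, `simulate_shebang_argv /bin/sh - ./-x a` prints `</bin/sh><-><./-x><a>`: the lone `-` ends option processing before the script path. With `#! /usr/bin/awk -f` there is no way to place a `--` after the script path, so caller arguments that start with `-` may be parsed as `awk` options.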
Now the fact that it's unspecified doesn't mean you can't use it (well, it says you shouldn't have the first line start with `#!` if you expect it to be only a regular comment and not a she-bang), but that POSIX gives you no guarantee if you do.
In my experience, using shebangs gives you more guarantee of portability than using POSIX's way of writing shell scripts: leave off the she-bang, write the script in POSIX `sh` syntax and hope that whatever invokes the script invokes a POSIX compliant `sh` on it, which is fine if you know the script will be invoked in the right environment by the right tool but not otherwise.
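The shebang-less approach can be tried out with a sketch like the one below, which writes a script with no `#!` line, marks it executable, and runs it from the current shell. `run_shebangless` is a made-up name, `mktemp` is assumed to exist (it is widespread but not required by POSIX), and the temporary directory must allow execution. Which interpreter ends up running the file depends on the invoking shell, which is exactly the uncertainty described above.

```sh
# Create, run, and remove a temporary script that has no "#!" line.
run_shebangless() {
    script=$(mktemp) || return 1
    # Deliberately no "#!" line: the file is plain sh commands.
    printf '%s\n' 'echo "ran without a shebang"' > "$script"
    chmod +x "$script"
    # The shell's own fallback handles the file when direct execution
    # reports that it is not a recognised executable format.
    "$script"
    status=$?
    rm -f "$script"
    return "$status"
}
```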
You may have to do things like:

    #! /bin/sh -
    if : ^ false; then : fine, POSIX system by default
    else
      # cover Solaris 10 or older. ": ^ false" returns false
      # in the Bourne shell as ^ is an alias for | there for
      # compatibility with the Thompson shell.
      PATH=`getconf PATH`:$PATH; export PATH
      exec /usr/xpg4/bin/sh - "$0" ${1+"$@"}
    fi
    # rest of script

If you want to be portable to Windows+Cygwin, you may have to name your file with a `.bat` or `.ps1` extension and use some similar trick for `cmd.exe` or `powershell.exe` to invoke the cygwin `sh` on the same file.
answered Dec 18 '18 at 7:57 by Stéphane Chazelas, edited Dec 18 '18 at 14:57
Interestingly, from issue 5: "The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment."
– muru
Dec 18 '18 at 7:59
@muru, thanks, see edit.
– Stéphane Chazelas
Dec 18 '18 at 8:08
@muru If the script was truly portable, on a truly POSIX system running a POSIX `sh`, it would not need a hashbang line as it would be executed by POSIX `sh`.
– Kusalananda
Dec 18 '18 at 8:11
1
@Kusalananda that's only true if `execlp` or `execvp` were used, right? If I were to use `execve`, it would result in ENOEXEC?
– muru
Dec 18 '18 at 8:18
[T]he behavior seems consistent between all POSIX-compliant shells. I don't see the need for wiggle room here.
You aren't looking deeply enough.
Back in the 1980s, this mechanism was not de facto standardized. Although Dennis Ritchie had implemented it, that implementation had not reached the public in the AT&T side of the universe. It was effectively only publicly available and known in BSD, and executable shell scripts were not available on AT&T Unix. Thus it was not reasonable to standardize it. The state of affairs is exemplified by this contemporary document, one of many such:
Note that BSD allows files which begin with `#! interpreter` to be executed directly, while SysV allows only a.out files to be executed directly. This means that an instance of one of the `exec…()` routines in a BSD program may have to be changed under SysV to execute the interpreter (typically `/bin/sh`) for that program instead.
— Stephen Frede (1988). "Programming on System X Release Y". Australian Unix Systems User Group Newsletter. Volume 9. Number 4. p. 111.
An important point here is that you are looking at shells, whereas the existence of executable shell scripts is actually a matter for the `exec…()` functions. What shells do includes the precursors of the executable script mechanism, still to be found in some shells even today (and also nowadays mandated for the `exec…p()` subset of functions), and is somewhat misleading. What the standard needs to address in this regard is how `exec…()` on an interpreted script works, and at the time that POSIX was originally created it simply did not work in the first place across a major part of the spectrum of target operating systems.
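To make the distinction between a plain exec and the `exec…p()` fallback concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any C library implements it: `write_script` and `run_like_execvp` are hypothetical helper names, `subprocess.run` with a path containing a slash stands in for a direct exec, and `/bin/sh` is an assumed shell location (POSIX does not fix one). On a system whose kernel rejects a text file with no `#!` line, the direct attempt should fail with ENOEXEC and the helper then retries through `sh`.

```python
import errno
import os
import subprocess
import tempfile

SH = "/bin/sh"  # assumption: POSIX does not mandate where sh lives


def write_script(directory, name, body):
    """Create an executable text file; the caller decides whether body has a #! line."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(body)
    os.chmod(path, 0o755)
    return path


def run_like_execvp(path, *args):
    """Run path directly; if the system answers ENOEXEC, retry it as an sh script."""
    try:
        result = subprocess.run([path, *args], capture_output=True, text=True)
        return "direct", result.stdout
    except OSError as exc:
        # Only the "exec format error" case triggers the fallback.
        if exc.errno != errno.ENOEXEC:
            raise
    result = subprocess.run([SH, path, *args], capture_output=True, text=True)
    return "sh fallback", result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = write_script(tmp, "no-shebang", 'echo "hello from a script"\n')
        print(run_like_execvp(script))
```

A caller that uses only the direct attempt sees the error itself, which is the situation muru's comment about `execve` and ENOEXEC describes.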
A subordinate question is why this has not been standardized since, especially as the magic number mechanism for script interpreters had reached the public in the AT&T side of the universe and had been documented for `exec…()` in the System V Interface Definition by the turn of the 1990s:
An interpreter file begins with a line of the form
# ! pathname [arg]
where pathname is the path of the interpreter, and arg is an optional argument. When you `exec` an interpreter file, the system `exec`s the specified interpreter.
— `exec`. System V Interface Definition. Volume 1. 1991.
Unfortunately, the behaviour remains today almost as widely divergent as it was in the 1980s and there is no truly common behaviour to standardize. Some Unices (famously HP-UX and FreeBSD, for example) do not support scripts as interpreters for scripts. Whether the first line is one, two, or many elements separated by whitespace varies between macOS (and versions of FreeBSD before 2005) and others. The maximum supported path length varies. ␀ and characters outwith the POSIX portable filename character set are tricky, as are leading and trailing whitespace. What the 0th, 1st, and 2nd argument end up being is also tricky, with significant variation across systems. Some currently POSIX-conformant but non-Unix systems still do not support any such mechanism, and mandating it would convert them into no longer being POSIX conformant.
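The whitespace divergence is easier to see with a small model. The Python sketch below is not any kernel's actual parser: `interpreter_argv` and its `split_all` flag are hypothetical names, and argv[0] is simply assumed to be the interpreter path as written, which real systems do not agree on either. It only contrasts the two families described here: everything after the interpreter passed as one argument (as on Linux), versus the remainder split on whitespace (as attributed to macOS and pre-2005 FreeBSD).

```python
def interpreter_argv(first_line, script_path, script_args=(), split_all=False):
    """Model the argv handed to the interpreter named on a #! line.

    split_all=False: the text after the interpreter is ONE optional argument.
    split_all=True:  the text after the interpreter is split on whitespace.
    """
    if not first_line.startswith("#!"):
        raise ValueError("not an interpreter line")
    # Assumption: surrounding blanks and the line terminator are ignored.
    rest = first_line[2:].strip(" \t\r\n")
    if not rest:
        raise ValueError("no interpreter named")
    parts = rest.split(None, 1)
    interpreter = parts[0]
    remainder = parts[1] if len(parts) > 1 else ""
    if split_all:
        extra = remainder.split()
    else:
        extra = [remainder] if remainder else []
    return [interpreter, *extra, script_path, *script_args]


if __name__ == "__main__":
    line = "#!/usr/bin/env python3 -u"
    print(interpreter_argv(line, "./s", ["x"]))
    # → ['/usr/bin/env', 'python3 -u', './s', 'x']
    print(interpreter_argv(line, "./s", ["x"], split_all=True))
    # → ['/usr/bin/env', 'python3', '-u', './s', 'x']
```

Under the single-argument model, `env` is asked to find a program literally named `python3 -u`, which is why the same script can work on one system and fail on another.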
Further reading
- Which shell interpreter runs a script with no shebang?
- Why am I able to pass arguments to /usr/bin/env in this case?
- `script`. NetBSD Miscellaneous Information Manual. 2005-05-06.
edited Dec 28 '18 at 7:03
answered Dec 18 '18 at 13:26
JdeBP
As noted by some of the other answers, implementations vary. This makes it hard to standardize and preserve backward-compatibility with existing scripts. This is true even for modern POSIX systems. For example, Linux does not fully tokenize the shebang line by spaces: everything after the interpreter path is passed to the interpreter as a single argument. macOS does not allow the script interpreter to be another script.
Also see http://en.wikipedia.org/wiki/Shebang_(Unix)#Portability
answered Dec 19 '18 at 7:10
jamesdlin