Demystifying the #! (Shebang): Kernel Adventures

thunderbong | 189 points

Fun fact: you can stick a null byte into the shebang line to terminate it, as an alternative to the newline.

It's possible to have a scripting language support extra command line arguments after the null byte, which is less disruptive to the syntax than recognizing arguments from a second line.

I.e.

  #!/path/to/interpreter --arg<NUL>--more --args<LF>

Or

  #!/usr/bin/env interpreter<NUL>--all --args<LF>

On some OSes, you only get one arg: everything after the space, to the end of the line, is one argument.

When we stick a <NUL> there, that argument stops there; but our interpreter can read the whole line, including the <NUL>, up to the <LF>, and then extract additional arguments between the <NUL> and the <LF>.

https://www.nongnu.org/txr/txr-manpage.html#N-74C247FD

The interpreter could get the arguments in other ways, like from a second line after the hash bang line. But with the null hack, all the processing revolves around just the one hash bang line. You can retrofit this logic into an interpreter that already knows how to ignore the hash bang line, without doing any work beyond getting it to load the line properly with the embedded NUL, and extract the arguments. You don't have to alter the syntax to specially recognize a hash bang continuation line.
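
A minimal sketch of the interpreter side, in Python. It reads the first line in binary mode, so the embedded NUL does not end the read, and splits whatever sits between the NUL and the LF on whitespace. The function names and the whitespace splitting rule are assumptions for illustration; TXR's own handling is described at the link above.

```python
def hashbang_extra_args(first_line):
    """Return the arguments hidden between NUL and LF on a hash-bang line."""
    if not first_line.startswith(b"#!"):
        return []
    _, nul, tail = first_line.partition(b"\0")
    if not nul:
        return []  # ordinary hash-bang line, nothing hidden after a NUL
    # Assumption: the hidden arguments are separated by plain whitespace.
    return tail.rstrip(b"\r\n").decode("utf-8", "replace").split()


def read_hashbang_extra_args(script_path):
    """Read a script's first line as bytes and extract the hidden arguments."""
    with open(script_path, "rb") as f:
        # readline() stops at LF only; the NUL byte is read like any other byte.
        return hashbang_extra_args(f.readline())
```

An interpreter would call something like `read_hashbang_extra_args(sys.argv[1])` at startup and merge the result with the arguments it received from the kernel.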

kazinator | 7 days ago

If you found this article interesting, you might also enjoy "My Own Private Binary: An Idiosyncratic Introduction to Linux Kernel Modules"[0] and the previous discussion[1] of it on HN.

[0]: https://www.muppetlabs.com/~breadbox/txt/mopb.html

[1]: https://news.ycombinator.com/item?id=29291804

spudlyo | 7 days ago

Read your article, it's really nice. I really feel much less mystified by this.

But can you, or somebody, please explain what this means:

> According to the official Kernel Admin Guide:

> This Kernel feature allows you to invoke almost (for restrictions see below) every program by simply typing its name in the shell. This includes for example compiled Java(TM), Python or Emacs programs. To achieve this you must tell binfmt_misc which interpreter has to be invoked with which binary. Binfmt_misc recognises the binary-type by matching some bytes at the beginning of the file with a magic byte sequence (masking out specified bits) you have supplied. Binfmt_misc can also recognise a filename extension aka .com or .exe.

> It’s another way to tell the Kernel what interpreter to run when invoking a program that’s not native (ELF). For scripts (text files) we mostly use a shebang, but for byte-coded binaries, such as Java’s JAR or Mono EXE files, it’s the way to go!

Like, can you give me an example of what you mean? What are its use cases, if any? I read it many times, always with some enthusiasm, because the sentence ending in an exclamation point makes it feel huge, yet I just can't understand its significance.

Does it mean we can have .jar files that run the way shebang scripts do, so we don't need #!? Can this also be used for main.go, or any other language that has issues with #!?

I see there are some interpreters for Go, Rust, etc. that just compile the file, but that seemed too complex. I am imagining something like a simple Go file that is valid Go but can be run on Linux simply with ./ and gets compiled automatically...
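
To make the quoted passage concrete, here is a rough sketch in Python that only builds binfmt_misc registration strings in the kernel's `:name:type:offset:magic:mask:interpreter:flags` form. It does not register anything: that would mean writing the string to `/proc/sys/fs/binfmt_misc/register` as root. The helper name `binfmt_rule` and every wrapper path are placeholders, and the Go rule is purely hypothetical, since it assumes a wrapper script that compiles and runs the file.

```python
def binfmt_rule(name, kind, match, interpreter, offset="", mask="", flags=""):
    """Build a binfmt_misc rule: :name:type:offset:magic:mask:interpreter:flags

    kind is "M" to match magic bytes at the start of the file,
    or "E" to match a filename extension (given without the dot).
    """
    if kind not in ("M", "E"):
        raise ValueError("kind must be 'M' (magic) or 'E' (extension)")
    return ":" + ":".join([name, kind, offset, match, mask, interpreter, flags])


# Java class files start with the bytes CA FE BA BE; the kernel parses the
# \x escapes itself, so they are kept as literal text via a raw string.
java_rule = binfmt_rule("Java", "M", r"\xca\xfe\xba\xbe", "/usr/local/bin/javawrapper")

# Match by extension instead: any executable file whose name ends in .jar.
jar_rule = binfmt_rule("ExecutableJAR", "E", "jar", "/usr/local/bin/jarwrapper")

# Hypothetical: hand *.go files to a wrapper that compiles and runs them.
go_rule = binfmt_rule("GoSource", "E", "go", "/usr/local/bin/gorun-wrapper")
```

With a rule like `jar_rule` registered, an executable `app.jar` could be started as `./app.jar`, and the kernel would run the wrapper with the file's path as an argument. No `#!` line is involved, which is why this suits binary formats that cannot carry one.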

Imustaskforhelp | 7 days ago

One of my favourite old-school Perl magic spells used to portably handle broken shells is:

  #!/usr/bin/perl
  eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if 0;
See: https://www.perl.com/article/bang-bang/

AdieuToLogic | 7 days ago

> Since I never remember which one is which, a good way to check is using the utility `file`: `file $(which useradd)`

While we're here, can someone explain why `which` prints some locations, and for others the whole darn file? Like `which npm` prints the location; `which nvm` prints the whole darn file.

lelandfe | 7 days ago

Articles like this are just such a delight. History + common software + code snippets is a great combo

davis | 7 days ago

How do I fix my kernel so that I can use the setuid bit with shebang?

amelius | 7 days ago

The shebang seems underbaked to me. There is no way to reference a user's home directory AFAIK. I came across this annoyance when trying to make my Python virtual environment portable.
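
One possible workaround, sketched here as an assumption rather than a recommendation, is a sh/Python polyglot header in the spirit of the Perl trick earlier in the thread: the kernel starts `/bin/sh`, which expands `$HOME` and replaces itself with the virtual environment's interpreter. The path `$HOME/venv/bin/python` and the `describe` function are hypothetical.

```python
#!/bin/sh
# /bin/sh treats these lines as comments, then runs the quoted "exec" line
# below, which replaces the shell with the venv's Python and passes the
# script path and its arguments along. Python treats the same line as
# adjacent string literals (it becomes the module docstring), so it is a no-op.
# "$HOME/venv" is a hypothetical location for the virtual environment.
"exec" "$HOME/venv/bin/python" "$0" "$@"

import sys


def describe(executable):
    """Report which interpreter ended up running the script."""
    return "running under " + executable


if __name__ == "__main__":
    print(describe(sys.executable))
```

The cost is one extra process start for the shell, and the file has to stay valid in both languages up to the `exec` line.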

fracus | 7 days ago