https://github.com/jart/cosmopolitan/wiki/FAQ#my-program-isnt-behaving-the-way-i-expect
- Have cosmopolitan binaries be able to use `argv[0]` normally, e.g. for a shell to determine whether it is a login shell by checking `*argv[0] == '-'`, or for a busybox-type binary to have the command to run passed as `argv[0]`, without necessarily having to have a link to it with that name.
- Have binaries be generally able to locate the file they were launched as, in order to read `/zip`.
- Support reasonable set-id cases, namely when the binary is assimilated, or when the platform implements secure set-id shebangs via `/dev/fd` (which OpenBSD, NetBSD, and MacOS can do, according to this).
- Try not to introduce regressions in terms of either program runtime or binary size, especially in the loader.
- We don't care to sanitize the path that the loader is given. We needn't even prepend `getcwd`. If we are invoked set-id, it is the kernel's job to have done this. Otherwise, it is the user's prerogative to structure the paths as they please.
- `__sys_execve` (and therefore `execve`, whenever `__sys_execve` succeeds on the first try) clobbers `argv[0]`. As far as I can tell, the only time `argv[0]` is preserved across exec is when `__sys_execve` fails and the ape fallback code is called.
- The loader's broken realpath logic prevents FreeBSD and NetBSD from receiving the program executable name.
- `GetProgramExecutableName` needs to match what the loader does.
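To make concrete why a clobbered `argv[0]` matters, here is a minimal sketch of the two uses named in the goals: the login-shell check and busybox-style dispatch. The function names `is_login_shell` and `applet_name` are hypothetical and are not taken from cosmopolitan or busybox.

```c
#include <string.h>

// Hypothetical helper: a shell conventionally treats itself as a login
// shell when the first character of argv[0] is '-'.
static int is_login_shell(const char *argv0) {
  return argv0 && *argv0 == '-';
}

// Hypothetical helper: a multi-call binary picks the command to run from
// the last path component of argv[0]. If exec replaces argv[0] with the
// path of the binary itself, this returns the binary's name instead of
// the command the caller asked for.
static const char *applet_name(const char *argv0) {
  if (!argv0) return "";
  const char *slash = strrchr(argv0, '/');
  return slash ? slash + 1 : argv0;
}
```

Both helpers only see what the process was handed, so neither can recover the intended name once `argv[0]` has been overwritten on the way in.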
The four proposals that follow are mostly independent of each other and can each be taken or left on their own (although the first should probably be taken.) Proposal three is the most speculative.
My preference would be: do 1 (sanity) immediately. If we're going to do 2a (set-id), then do that alongside it since they go hand-in-hand. Probably just forget about 2b. Do 3 (`cosmo_execve`) some time later. Do not do 4, as assimilation is just better whenever anything like it is needed.
- Roll back all `realpath` and `getcwd` code in the loaders, and pass the (possibly relative) resolved program path as `x2`/`%rdx` on all platforms.
- Rework `GetProgramExecutableName` so it does the same thing for each of the platform-generic options, and include `__program_executable_name` as the top priority among those options (i.e. prefer the platform-specific ones first). Specifically, for each of `__program_executable_name`, `argv[0]`, and `_`, in that order (including `COSMOPOLITAN_PROGRAM_EXECUTABLE` last if there is a desire for compatibility with loaders that were never officially minted):
  - Try to `sys_faccessat` it from `AT_FDCWD`.
  - After that succeeds, prepend `getcwd` if it is relative and use it.
- Try to
Nothing further is needed on the loader side for this.
As a proposed modification to `GetProgramExecutableName`, if `issetugid` and `__program_executable_name` looks secure (i.e. it is `/dev/fd/[n]` on a platform that implements that), then use that as the top choice. Since the platform-specific methods will all return the name of the loader, not the name of the binary, and all the other methods are vulnerable to TOCTOU between the loader and the binary (as well as between the kernel and the loader), if `issetugid` and `__program_executable_name` is set to something that does not look secure, use the empty string.
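A rough sketch of that set-id decision, under assumptions: `looks_like_dev_fd` and `setid_choice` are hypothetical names, the "platform implements that" check is left out, and the case where `__program_executable_name` is unset is deferred because the text above does not specify it.

```c
#include <ctype.h>   // isdigit
#include <stddef.h>  // NULL
#include <string.h>  // strncmp

// Hypothetical predicate: 1 only if `path` is exactly "/dev/fd/" followed
// by one or more decimal digits, with nothing after them.
static int looks_like_dev_fd(const char *path) {
  static const char prefix[] = "/dev/fd/";
  if (!path || strncmp(path, prefix, sizeof prefix - 1) != 0) return 0;
  const char *p = path + (sizeof prefix - 1);
  if (!*p) return 0;
  for (; *p; p++) {
    if (!isdigit((unsigned char)*p)) return 0;
  }
  return 1;
}

// Hypothetical policy function. Returns:
//   loader_name  if running set-id and the name looks secure,
//   ""           if running set-id and the name is set but looks insecure,
//   NULL         if the usual lookup order should be used instead.
static const char *setid_choice(int is_setugid, const char *loader_name) {
  if (!is_setugid || !loader_name) return NULL;
  if (looks_like_dev_fd(loader_name)) return loader_name;
  return "";
}
```

The strict digits-only match is a design choice of the sketch: anything like `/dev/fd/3/../x` is rejected rather than normalized.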
As a questionable further proposed modification, some time very early, if `issetugid` and `__program_executable_name` is non-null and set to something that does not look secure, then halt, melt, and catch fire.

Decided not to do this. Assuring the sanity of the particular set-id interpreter script setup being used is not our job.
Implement a `cosmo_execve` that is like `sys_execve` except it checks for an ape binary before the kernel call. (So pull the ape-specific code out into `__ape_execve`, and then `sys_execve` is `__sys_execve(); __ape_execve();` whereas `cosmo_execve` is `__ape_execve(); __sys_execve();`.) In places where `argv[0]` correctness trumps performance concerns (e.g. bash and zsh in superconfigure), patch the program to use `cosmo_execve`.
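The difference between the two entry points is only the order of the two attempts, which the following sketch illustrates. Everything here is a stand-in: the `exec_fn` signature, the `exec_impl` struct, and the recording fakes are assumptions made so the ordering can be observed, and the real `__sys_execve` and `__ape_execve` would replace the process on success instead of returning.

```c
#include <stddef.h>  // size_t

// Assumed signature for both attempts; each returns only on failure.
typedef int exec_fn(const char *prog, char *const argv[], char *const envp[]);

// Hypothetical bundle of the two attempts, so they can be swapped out.
struct exec_impl {
  exec_fn *kernel;  // stands in for __sys_execve
  exec_fn *ape;     // stands in for the proposed __ape_execve
};

// Current order: kernel call first, ape-specific fallback second.
static int sketch_sys_execve(const struct exec_impl *impl, const char *prog,
                             char *const argv[], char *const envp[]) {
  impl->kernel(prog, argv, envp);
  return impl->ape(prog, argv, envp);
}

// Proposed order: check for an ape binary first, kernel call second.
static int sketch_cosmo_execve(const struct exec_impl *impl, const char *prog,
                               char *const argv[], char *const envp[]) {
  impl->ape(prog, argv, envp);
  return impl->kernel(prog, argv, envp);
}

// Recording fakes: each appends one letter to `trace` and reports failure.
static char trace[8];
static size_t ntrace;

static int fake_kernel(const char *prog, char *const argv[],
                       char *const envp[]) {
  (void)prog; (void)argv; (void)envp;
  if (ntrace + 1 < sizeof trace) trace[ntrace++] = 'K';
  return -1;
}

static int fake_ape(const char *prog, char *const argv[],
                    char *const envp[]) {
  (void)prog; (void)argv; (void)envp;
  if (ntrace + 1 < sizeof trace) trace[ntrace++] = 'A';
  return -1;
}
```

With the fakes plugged in, `sketch_sys_execve` records `KA` and `sketch_cosmo_execve` records `AK`. The extra ape check before every kernel call is the performance cost that makes this opt-in.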
Another interesting option that would get the usage benefits without the generalized performance hit would be to make it so that ape binaries always fail if they are passed through `__sys_execve` from a running ape binary; this is what happens on my Apple Silicon machine, and it actually leads to very nice behavior in that `argv[0]` is not clobbered by exec. As a discussion-starter, one way to do this would be to set e.g. `"APE_YOUDIENOW=SYS_EXECVE"` in the initial `__sys_execve` call, but that particular approach has too many problems to be reasonable. The failure would have to happen in `__sys_execve` itself, before control transitions to the child process, so it's difficult to imagine how this could be done without breaking other things.
Just to bring this up one more time: while assimilation ought to be the preferred option for using cosmo binaries as login shells, it would be nice to be able to keep a single binary with a heavily customized `/zip` on e.g. an NFS volume and use that as a shell everywhere, and the only way I can imagine to do this is the `$prog.ape` hack. But I admit I may just not be used to the possibility of assimilation. In any event, I lightly suggest using some of the space savings from reverting `RealPath` to do this.