Why isn't there a CreateProcess that completely isolates Windows processes?
This has perplexed me for a very long time, so I'm just going to ask.
One of the things everyone wants to do nowadays is run untrusted code in a secure environment. To do this on Windows, programs like Chrome go to extraordinary lengths. They call dozens of Windows functions to create somewhat-sandboxed separate processes for running browser pages, thus reducing the chances that malicious code can directly access system functions.
Despite years of incrementally adding calls, flags, and parameters that restrict Windows processes, the Windows kernel still - for reasons I do not understand - doesn’t provide a way to create a process that can’t make any system calls at all (or rather, can only make one very specific kind of syscall, as I’ll discuss later).
This is truly bizarre to me. If I were working on platform security, the very first thing I would add is a call like this:
HANDLE SandboxedProcess = CreateZeroProcess();
This would instruct the kernel to create a blank process, with literally nothing in it, and without the ability to make any syscalls at all. It would be prevented from having any DLLs mapped into it whatsoever - loaded, injected or otherwise.
You would then be expected to VirtualAllocEx or MapViewOfFile3 some memory inside that process, write the code you want to execute into that memory, and then call CreateRemoteThread to start some threads executing in the process.
Since these threads can’t make system calls, they would not be able to do anything at all other than read from and write to memory that has been VirtualAllocEx’d or MapViewOfFile3’d by the parent process. A base pointer to whatever this memory is would presumably be passed in the lpParameter of the CreateRemoteThread call(s).
And that’s it.
That’s the best case you can hope for in a sandboxed process.
I Assume This Doesn’t Exist
One unfortunate aspect of the massive Windows API is that honestly, I just don’t know what’s in it. I know a lot about Windows programming - I’ve been doing it for thirty years now - but I assure you I still barely know anything relative to the full scope of the API. I am constantly surprised to find out about new things I didn’t know were added, or old things I never knew were there. If you don’t spend all day keeping up with the Windows API, you’re probably missing something.
So, it’s entirely possible that this call already exists by a different name. I just assume it doesn’t, because a) I couldn’t find anything like it, and b) serious sandboxes like Chrome don’t use anything like it. All the attempts at Windows sandboxes I have seen do dozens of calls - they set mitigation policies, put processes under job objects, do a bunch of token shenanigans, etc. So I assume that there is nothing so simple as CreateZeroProcess, because if there were, I can’t imagine they wouldn’t use it.
But maybe they just don’t know about it either. If so, please someone tell me where to find this API call, because I would love to use it!
Now, not having an API like this is not the same as the concept not existing in the kernel. Thanks to Mārtiņš Možeiko, I was pointed to Pico Processes, a feature sort-of like CreateZeroProcess that already exists in Windows, just (unfortunately) without a user-mode API. It’s there for supporting WSL, so these minimal processes get used internally, but as far as I know, you can’t actually create them from an application like Chrome.
But, since Pico Processes exist, perhaps it would not be a huge leap to get something like CreateZeroProcess in userland? It might not require that much work to turn one into the other…
Power Management and Performance Issues
I can’t think of any reason I couldn’t use literally the CreateZeroProcess API, as I just presented it, to do most of what I would want to do with a sandbox. However, there is one concern that suggests you’d want slightly more than zero syscalls.
The concern is that, with no syscalls, a secure process waiting for something in its parent process has no recourse but to enter a spin loop until it arrives.
These days, it’s bad form to max out all the cores, all the time. If it’s a gaming rig and the user has RGB lights all over everything, they may thank you for it, but everyone else will be grumpy that their laptop lasted less than 30 minutes on a full battery.
So if you provide literally zero system calls with CreateZeroProcess, there is no way for the secure process to tell the operating system it wants to wait for something, so it can be put to sleep. Furthermore, there wouldn’t even be a way to tell the operating system it’s in a spin loop, so once the CPU cores are oversubscribed, it would tank performance because the OS would keep thinking it should schedule the secure process to “do its work”, even though it actually isn’t doing anything but waiting.
Thus, I would suggest that perhaps the “right” design here, if it were feasible, is to allow CreateZeroProcess processes the ability to call some subset of the synchronization primitives to allow power- and performance-efficient coordination with the parent process.
These would not need to be the existing syscalls, and probably should not be. They would presumably want to be their own special syscall handler, which would handle only these “special” calls for the secure process, and which can be hardened separately from the rest of the Windows API. But they would basically provide the equivalent of, say, WaitForMultipleObjects, SetEvent, and EnterSynchronizationBarrier on handles provided by the parent process writing them into the secure process's memory.
Furthermore, a certain amount of “future-proofing” could be added to CreateZeroProcess in order to ensure that syscall updates would not make older applications less secure. For example, CreateZeroProcess could take a single parameter that is the “security version”. A value of 0 would mean no syscalls at all, a value of 1 would mean the aforementioned sync primitives, and then 2 and up would be reserved for expansions of the allowable syscalls.
This way, future decisions to add more syscall abilities in a zero process wouldn’t jeopardize older programs, because they would still get the original more-restrictive version they were requesting.
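As a minimal sketch of how the “security version” gate described above might look inside such a dedicated handler, here is a portable C version. Every name in it (ZeroCall, zero_call_min_version, zero_call_allowed) is hypothetical and invented for illustration; none of it is an existing Windows API, and it assumes each call simply becomes available at a fixed minimum version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers for the calls the special handler would accept. */
typedef enum {
    ZERO_CALL_WAIT = 0,       /* equivalent of WaitForMultipleObjects */
    ZERO_CALL_SET_EVENT,      /* equivalent of SetEvent */
    ZERO_CALL_ENTER_BARRIER,  /* equivalent of EnterSynchronizationBarrier */
    ZERO_CALL_COUNT
} ZeroCall;

/* Minimum security version at which each call becomes available.
   Version 0 allows nothing, so every entry must be at least 1. */
static const uint32_t zero_call_min_version[] = {
    [ZERO_CALL_WAIT]          = 1,
    [ZERO_CALL_SET_EVENT]     = 1,
    [ZERO_CALL_ENTER_BARRIER] = 1,
};

/* Fail the build if a call is added to the enum without a table entry;
   a missing entry would otherwise default to 0 and be allowed everywhere. */
static_assert(sizeof(zero_call_min_version) / sizeof(zero_call_min_version[0])
                  == ZERO_CALL_COUNT,
              "every ZeroCall needs a minimum security version");

/* Returns true only if a process created with `security_version`
   may issue `call`. Unknown call numbers are always rejected. */
static bool zero_call_allowed(uint32_t security_version, uint32_t call)
{
    if (call >= ZERO_CALL_COUNT) {
        return false;
    }
    return security_version >= zero_call_min_version[call];
}
```

Under these assumptions, a process created with version 0 is refused every call, while a process that asked for version 1 keeps exactly the same set even if later versions add more entries to the table.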
Remote Thread Stacks
I would also say, in an ideal world, perhaps there would be a replacement for CreateRemoteThread that was specific to the sandboxed version. This would not be a call for the secure process, but rather a new Windows call for the parent process in the normal Win32 API.
I say this because one of the really annoying things about CreateThread in general (and the CreateRemoteThread variant) is that you cannot map the stack memory yourself. This isn’t a huge deal in ordinary programming, but it can be annoying when you’re trying to write your own execution environments.
So in a perfect world, perhaps we’d also get
HANDLE CreateZeroThread(
  [in]            HANDLE                       hProcess,
  [in]            LPVOID                       lpStackAddress,
  [in]            LPTHREAD_START_ROUTINE       lpStartAddress,
  [in, optional]  LPVOID                       lpParameter,
  [in]            DWORD                        dwCreationFlags,
  [in, optional]  LPPROC_THREAD_ATTRIBUTE_LIST lpAttributeList,
  [out, optional] LPDWORD                      lpThreadId
);
which would allow you to handle the stack yourself. Furthermore, if it made the implementation easier on the kernel, perhaps CreateZeroProcess processes could be restricted to only CreateZeroThread threads, so you could cabin off that part of the API and make it specific to secure processes.
I’m assuming there must be something wrong with this design, because it seems too simple and useful to not have happened already. But I can’t quite think of a practical reason it wouldn’t be feasible.
It does take some work, to be sure - but it seems like it would take significantly less work to implement than the vast array of piecemeal mitigations that have been steadily added since Windows 2000.
And unlike all those mitigations, it seems like this would actually have a good chance of remaining secure over time, since it starts from a clean slate and adds only the very minimum necessary to work, rather than trying to continually remove things as they are found to be exploitable.
Furthermore, it would be somewhat difficult for a programmer to misunderstand something about this design. CreateZeroProcess creates a process that basically can’t do anything, and the programmer then has to add capabilities to it by proxying system calls through a memory communication channel to the parent process. If they don’t understand something, they can just not proxy that thing, and be assured it won’t cause them a security headache.
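To make the “proxying through a memory communication channel” idea concrete, here is a minimal single-slot mailbox in portable C11. It is only a sketch under stated assumptions: ProxySlot, the PROXY_* constants, and the three functions are hypothetical names, the toy “add” operation stands in for whatever real service a parent might choose to expose, and both sides run in one address space here rather than across a real process boundary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum { PROXY_SLOT_EMPTY = 0, PROXY_SLOT_REQUEST = 1, PROXY_SLOT_REPLY = 2 };
enum { PROXY_OP_ADD = 1 };  /* the only operation this parent proxies */
enum { PROXY_STATUS_OK = 0, PROXY_STATUS_DENIED = 1 };

/* Lives in memory that the parent mapped into the sandboxed process. */
typedef struct {
    _Atomic uint32_t state;  /* one of the PROXY_SLOT_* values */
    uint32_t op;
    uint64_t args[2];
    uint32_t status;
    uint64_t result;
} ProxySlot;

/* Sandbox side: publish a request. The release store makes the payload
   visible to the parent before it can observe PROXY_SLOT_REQUEST. */
static void sandbox_post(ProxySlot *slot, uint32_t op, uint64_t a, uint64_t b)
{
    assert(atomic_load_explicit(&slot->state, memory_order_acquire)
           == PROXY_SLOT_EMPTY);
    slot->op = op;
    slot->args[0] = a;
    slot->args[1] = b;
    atomic_store_explicit(&slot->state, PROXY_SLOT_REQUEST, memory_order_release);
}

/* Parent side: service one pending request, if there is one.
   Anything the parent does not explicitly handle is denied. */
static bool parent_service(ProxySlot *slot)
{
    if (atomic_load_explicit(&slot->state, memory_order_acquire)
        != PROXY_SLOT_REQUEST) {
        return false;
    }
    /* Copy the request out first: the sandbox's memory is untrusted
       and could be rewritten while the parent is looking at it. */
    uint32_t op = slot->op;
    uint64_t a = slot->args[0];
    uint64_t b = slot->args[1];

    if (op == PROXY_OP_ADD) {
        slot->result = a + b;
        slot->status = PROXY_STATUS_OK;
    } else {
        slot->result = 0;
        slot->status = PROXY_STATUS_DENIED;
    }
    atomic_store_explicit(&slot->state, PROXY_SLOT_REPLY, memory_order_release);
    return true;
}

/* Sandbox side: collect the reply if it has arrived, then free the slot. */
static bool sandbox_take_reply(ProxySlot *slot, uint32_t *status, uint64_t *result)
{
    if (atomic_load_explicit(&slot->state, memory_order_acquire)
        != PROXY_SLOT_REPLY) {
        return false;
    }
    *status = slot->status;
    *result = slot->result;
    atomic_store_explicit(&slot->state, PROXY_SLOT_EMPTY, memory_order_release);
    return true;
}
```

The point of the shape is that the default is denial: an operation exists for the sandboxed code only if the parent wrote a branch for it, so leaving something out is always the safe choice.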
By contrast, the current Windows sandbox model is highly error-prone. It involves so many security-specific concepts that it’s unlikely anyone really understands them all well enough to write and maintain secure code. If you don’t believe me, try reading this description of a Chrome sandbox exploit that arose due to one line change Microsoft made to Windows! The sheer number of elements involved is staggering.
The closest thing I know of might be CreateRestrictedToken, AdjustTokenGroups, and AdjustTokenPrivileges, but those don't quite fit the bill. There's a bit more info on Pico processes at https://fourcore.io/blogs/how-a-windows-process-is-created-part-1 that you might find interesting.
I have only read this article and the comments section. And I'm not a systems programmer, so I know basically nothing about OSes. So, what I'm asking might be trivial or just nonsense...
I have tried to read the code of the V8 engine with limited success. I have not seen anything that would be along the lines of checking for syscalls. All I saw was the language they have (Torque) implementing the ECMAScript spec in phrasing that is close to the spec itself, and then C++ implementations of the algorithms that support functions like `indexOf`, `replace` etc.
2. How did you know where to look for determining that Chromium doesn't use anything like the process you are suggesting? I also tried to read the code of Chromium, but it's so huge I couldn't figure out where to even start. I guess if I had to look for it I could grep the whole project for something like a platform layer and look in those files. I think I saw at least one switch statement somewhere that looked like it was checking for macOS, Windows and Linux. Is that what you did?