Email File Retrieval in Python
Solved
Hello! :)
I found an open-source script that fetches emails matching a tag filter and forwards them to an incident response tool (TheHive).
The link to the script: https://github.com/xme/dockers/blob/master/imap2thehive/imap2thehive.py
It works really well, except that I'm worried about retrieving attachments :/ I don't fully understand the potential risks if the email contains a malicious file that this script will download.
I think we need to look precisely at this part:
    # Extract MIME parts
    filename = part.get_filename()
    mimetype = part.get_content_type()
    if filename and mimetype:
        if mimetype in config['caseFiles'] or not config['caseFiles']:
            log.info("Found attachment: %s (%s)" % (filename, mimetype))
            # Decode the attachment and save it in a temporary file
            charset = part.get_content_charset()
            if charset is None:
                charset = chardet.detect(bytes(part))['encoding']
            # Get filename extension to not break TheHive analysers (see Github #11)
            fname, fextension = os.path.splitext(filename)
            fd, path = tempfile.mkstemp(prefix=slugify(fname) + "_", suffix=fextension)
            try:
                with os.fdopen(fd, 'w+b') as tmp:
                    tmp.write(part.get_payload(decode=1))
                attachments.append(path)
            except OSError as e:
                log.error("Cannot dump attachment to %s: %s" % (path, e.errno))
                return False

From what I understand, it creates a temporary file, opens it, and writes the decoded attachment into it.
Can someone guide me?
Thanks! :)
2 answers
Hello,
Several things:
- If a malicious email arrives, checking it is more the email server's responsibility than your script's (typically through anti-spam and antivirus filtering). The recipient of the email (including the email client) should have their own antivirus and anti-spam as a supplement.
- Nothing prevents you from adding basic checks (on the MIME type, for example) to be extra cautious.
- That said, such a check is easily bypassed by changing the extension. The extension does not change the file's content; it only determines which application the system will use to open the file.
- If an allowed MIME type is handled by vulnerable software (say your PDF reader has a flaw that lets it execute code embedded in a PDF file), then a PDF file can indeed carry malicious code and start infecting your machine (with the rights of the user who launched the PDF reader).
- If you download a malicious attachment locally, as yg_be says, as long as it is neither executed nor opened by software in which it exploits a security flaw, it cannot spontaneously infect your machine. This means that if the user does not execute or open a malicious file, there is no risk. If they open it, the malicious file must be able to exploit a vulnerability in the software used to open it, which is not always the case.
- In your message #6, you say "my mail server (Thunderbird)", but Thunderbird is an email client. A client allows for querying an email server, typically an IMAP or POP3 server (for example imap.gmail.com). So it is indeed another machine.
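As an illustration of the "basic checks" idea from the list above, here is a minimal sketch that compares the declared MIME type with the first bytes of the decoded payload instead of trusting the extension. The table `MAGIC_SIGNATURES` and the function `declared_type_matches_content` are hypothetical names, not part of imap2thehive, and the signature list is deliberately tiny. A match only shows that the file looks like what it claims to be; it says nothing about whether a PDF carries an exploit.

```python
# Hypothetical helper, not part of imap2thehive: a very small table of
# well-known file signatures ("magic bytes") per MIME type.
MAGIC_SIGNATURES = {
    "application/pdf": [b"%PDF-"],
    "image/png": [b"\x89PNG\r\n\x1a\n"],
    "image/jpeg": [b"\xff\xd8\xff"],
    "application/zip": [b"PK\x03\x04"],
}


def declared_type_matches_content(mimetype: str, payload: bytes) -> bool:
    """Return True only if payload starts with a signature known for mimetype.

    Unknown MIME types are rejected, so the table doubles as an allowlist.
    """
    signatures = MAGIC_SIGNATURES.get(mimetype)
    if signatures is None:
        return False
    return any(payload.startswith(sig) for sig in signatures)


# A Windows executable (starts with "MZ") declared as a PDF is rejected,
# whatever extension the sender gave it.
print(declared_type_matches_content("application/pdf", b"MZ\x90\x00rest"))
print(declared_type_matches_content("application/pdf", b"%PDF-1.7\nrest"))
```

In the script, `payload` would be the result of `part.get_payload(decode=1)` and `mimetype` the value of `part.get_content_type()`.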
Good luck
Hello,
I think you understood well.
You might want to check the file extension and refuse it if that extension seems dangerous to you.
Good evening, I have no idea, I'm not much of a specialist on attacks concerning Windows ^^
Judging by what is commonly done in PHP, a whitelist combined with renaming the file and its extension should, I think, address the problem of extensions hidden by Windows... (I'm not sure whether Windows still hides them in recent versions).
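The whitelist-plus-renaming approach could look roughly like the sketch below. It is an assumption of mine, not code from the script: `ALLOWED_EXTENSIONS` and `safe_storage_name` are hypothetical names, and the list of extensions is only an example. Only the last extension is considered, so a double extension such as `invoice.pdf.exe` is judged on `.exe`, and the stored name is random so nothing from the sender's filename survives.

```python
import os
import uuid

# Hypothetical allowlist: adjust to what the incident response tool really needs.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".png", ".jpg", ".eml"}


def safe_storage_name(filename: str):
    """Return a neutral file name keeping only an allowlisted final extension.

    Returns None when the final extension is not in the allowlist.
    """
    # Drop any directory component, whichever path separator the sender used.
    base = os.path.basename(filename.replace("\\", "/"))
    _, ext = os.path.splitext(base)
    ext = ext.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return None
    # Random name: nothing from the original file name is reused.
    return uuid.uuid4().hex + ext


print(safe_storage_name("invoice.pdf.exe"))  # rejected: final extension is .exe
print(safe_storage_name("Report.PDF"))       # accepted under a random name
```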
Hello,
In the end, I prefer to avoid the risk of sending the file to the incident response tool.
In fact, what bothers me is that if I analyze the extension, there will always be methods that I won't be able to detect. For example, a PDF containing executable code inside (I'm making this up, but that's the idea).
I thought I could calculate the SHA256 of each file and send this hash along with the filename to the tool, which can then query VirusTotal to check whether the file is known to be malicious.
However, I have one more question, maybe a silly one. Regarding the same block of code, we agree that I still need to download the file from the email and put it in the temporary file to allow for the SHA256 calculation? So downloading it still poses a risk? Since the temporary file is placed locally...
To be clearer: my script is located on the server hosting the incident response tool, it will query my mail server (Thunderbird) and directly send the response (formulated in a certain way) to the tool via its API.
Thank you! :)
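On the temporary-file question: the decoded attachment is already in memory as bytes (that is what `part.get_payload(decode=1)` returns in the script), so the hash can be computed without writing anything to disk. A minimal sketch, assuming a standard `email.message` object; the function name `attachment_digests` and the sample message are mine, for illustration only.

```python
import hashlib
from email.message import EmailMessage


def attachment_digests(msg):
    """Return (filename, sha256 hex digest) for each attachment, hashed in memory."""
    results = []
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue  # not an attachment (container or body part)
        payload = part.get_payload(decode=True)  # decoded bytes, never written to disk
        if payload is None:
            continue
        results.append((filename, hashlib.sha256(payload).hexdigest()))
    return results


# Build a small test message with one attachment.
msg = EmailMessage()
msg.set_content("see attachment")
msg.add_attachment(b"hello", maintype="application",
                   subtype="octet-stream", filename="sample.bin")
print(attachment_digests(msg))
```

The email still has to be fetched from the IMAP server, but no file is ever created locally, so there is nothing for anyone to open by mistake.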
Hello,
"In your message #6, you say 'my mail server (Thunderbird)', but Thunderbird is a mail client. A client allows you to query a mail server, typically an IMAP or POP3 server (for example imap.gmail.com). So it is indeed another machine."
Yes, sorry, I didn't use the right terms :( ... for my mail server.
"If you download a malicious attachment locally, as yg_be says, as long as it is neither executed nor read by software that exploits a security vulnerability, it cannot spontaneously infect your machine. This means that if the user does not execute or open a malicious file, there is no risk. If they open it, the malicious file must be able to exploit a potential vulnerability in the software used to open it, which is not systematic."
Alright. I don't know why, but for me it represented a risk to download a malicious file; aren't there cases (probably rare) where downloading a malicious file can be dangerous? (without having to touch it afterwards).
Thank you :)
No, it is not possible.
Regardless of the operating system, a download is simply a bit-by-bit copy of content to a storage space, without trying to understand what those bits mean. Therefore, there cannot be any execution of code (malicious or otherwise) at this stage, even if the downloaded content is an executable.
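Code can only run later, when the system or a program acts on the file. There are then two possibilities: either the file is itself an executable and someone runs it, or it is a data file opened by software in which it exploits a vulnerability.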
As you can see, there is no spontaneous execution at the moment of downloading a file.
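To make this concrete, here is a small sketch of what "downloading" amounts to on the script's side: bytes are copied into a file and nothing interprets them. It uses `tempfile.mkstemp`, like the script does; the helper name `store_inert` and the sample bytes are hypothetical. According to the Python documentation, a file created by `mkstemp` is not executable by anyone on platforms that use permission bits for this.

```python
import os
import stat
import tempfile


def store_inert(payload: bytes) -> str:
    """Copy payload byte for byte into a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(payload)  # plain copy: the bytes are never interpreted
    return path


# Bytes that start like a Windows executable; writing them runs nothing.
path = store_inert(b"MZ\x90\x00 pretend this is an executable")
mode = stat.S_IMODE(os.stat(path).st_mode)
print("permission bits:", oct(mode))
print("executable by someone:", bool(mode & 0o111))
os.remove(path)
```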
Hello! Thank you for this explanation :)
It makes sense now that you mention it.
So I will stick with my idea of only retrieving the SHA256, which will be sufficient to know whether the file is referenced in databases like VirusTotal. I prefer not to retrieve the entire file, as that prevents anyone from downloading it from the incident response tool and opening it.
Have a great day!
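For the hash lookup itself, a rough sketch using only the standard library is given below. It assumes the VirusTotal API v3 file endpoint (`/api/v3/files/<hash>`, authenticated with an `x-apikey` header) and a valid API key; the function names are mine. Keep in mind that a hash unknown to VirusTotal only means the file has not been seen there, not that it is safe.

```python
import json
import urllib.error
import urllib.request

VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_vt_request(sha256_hex: str, api_key: str) -> urllib.request.Request:
    """Prepare the lookup request for one file hash (nothing is sent yet)."""
    return urllib.request.Request(VT_FILE_URL.format(sha256_hex),
                                  headers={"x-apikey": api_key})


def malicious_count(report: dict) -> int:
    """Read the number of engines flagging the file from a parsed report."""
    stats = (report.get("data", {})
                   .get("attributes", {})
                   .get("last_analysis_stats", {}))
    return stats.get("malicious", 0)


def lookup(sha256_hex: str, api_key: str):
    """Return the malicious count, or None if the hash is unknown (HTTP 404)."""
    try:
        request = build_vt_request(sha256_hex, api_key)
        with urllib.request.urlopen(request, timeout=30) as resp:
            return malicious_count(json.load(resp))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

In practice, TheHive users usually delegate this kind of lookup to an analyzer rather than calling the API by hand, so treat this only as an illustration of the idea.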
As you see fit. Thank you for your feedback, and best wishes for the future :-)