Retrieve the sha256 hash of a file in Python.
Solved. Posted by Lou_2044 (Member, 11 messages)
Hello,
I had already come here to talk about my Python script (see here), which allows me to retrieve email information, especially attachments.
I thought I could manage it without any issues, but it's more complex than expected.
I ultimately decided to calculate the sha256 without directly retrieving the files. I tried using hashlib, but the script retrieves the file into a temporary file, so I'm having difficulty calculating its hash due to the rather special file types. Here’s a part of the code; the rest is accessible via this link:
    # Extract MIME parts
    filename = part.get_filename()
    print(filename)
    mimetype = part.get_content_type()
    print(mimetype)
    if filename and mimetype:
        if mimetype in config['caseFiles'] or not config['caseFiles']:
            log.info("Found attachment: %s (%s)" % (filename, mimetype))
            # Decode the attachment and save it in a temporary file
            charset = part.get_content_charset()
            if charset is None:
                charset = chardet.detect(bytes(part))['encoding']
            # Get filename extension to not break TheHive analysers (see Github #11)
            fname, fextension = os.path.splitext(filename)
            fd, path = tempfile.mkstemp(prefix=slugify(fname) + "_", suffix=fextension)
            try:
                with os.fdopen(fd, 'w+b') as tmp:
                    tmp.write(part.get_payload(decode=1))
                    print(tmp)
                    for line in tmp:
                        m = hashlib.sha256(line)
                        print("test :", m.hexdigest())
                attachments.append(path)
            except OSerror as e:
                log.error("Cannot dump attachment to %s: %s" % (path,e.errno))
                return False

I tried this to see:
    tmp.write(part.get_payload(decode=1))
    print(tmp)
    fichier = part.get_payload(decode=1)
    d = hashlib.sha256()
    print("test :",d.hexdigest())

I thought that if we retrieve the payload, we should be able to calculate the hash from that payload, but the returned hash is the same for every attachment...
2 answers
Hello,
Why are you using a file?
Have you tried:
    payload = part.get_payload(decode=1)
    hash = hashlib.sha256(payload)
    print("hash: ", hash.hexdigest())
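For reference, here is a minimal sketch of hashing the decoded payload directly, assuming the message is parsed with the standard email module. The sample message and the helper name attachment_hashes are made up for illustration and are not part of the original script. It also suggests why the attempt in the question printed the same value each time: hashlib.sha256() called with no data returns the digest of empty input, whatever the attachment is.

```python
import hashlib
from email import message_from_bytes
from email.message import EmailMessage


def attachment_hashes(msg):
    """Return {filename: sha256 hex digest} for every part that has a filename."""
    hashes = {}
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue
        payload = part.get_payload(decode=True)  # decoded attachment bytes
        if payload is None:  # multipart containers carry no payload of their own
            continue
        hashes[filename] = hashlib.sha256(payload).hexdigest()
    return hashes


# Hypothetical sample message with two attachments, for illustration only.
sample = EmailMessage()
sample["Subject"] = "demo"
sample.set_content("body text")
sample.add_attachment(b"first file", maintype="application",
                      subtype="octet-stream", filename="a.bin")
sample.add_attachment(b"second file", maintype="application",
                      subtype="octet-stream", filename="b.bin")

msg = message_from_bytes(sample.as_bytes())
result = attachment_hashes(msg)
for name, digest in result.items():
    print(name, digest)

# A hash object that was never given any data: identical for every attachment.
empty_digest = hashlib.sha256().hexdigest()
```

Nothing is written to disk here; the digest is computed from the bytes already held in memory.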
Hello,
As yg_be mentioned in #1, you don’t need to create a temporary file. That is all the more surprising since you said you wanted to avoid storing the email on the hard drive.
However, if you still wish to do so:
- it's better not to use readlines, which is more suited for text files
- it's better to create your temporary file using the tempfile module
    import tempfile

    with tempfile.TemporaryFile() as tmp:
        tmp.write(bytes(0x1234))
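If the temporary file is kept, the hash can be computed from it roughly as follows. This is only a sketch: the helper name sha256_of_file, the chunk size, and the placeholder data are assumptions, not taken from the original script. Two points matter. After write() the file position is at the end, so the file must be rewound with seek(0) before reading. And a single hash object should be updated with each binary chunk, rather than creating a new one per line.

```python
import hashlib
import tempfile


def sha256_of_file(fileobj, chunk_size=65536):
    """Hash a binary file object from its start, reading fixed-size chunks."""
    fileobj.seek(0)  # after write() the position is at the end of the file
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)  # one hash object accumulates the whole file
    return digest.hexdigest()


# Placeholder bytes standing in for part.get_payload(decode=1).
data = b"\x00\x01binary\npayload\xff" * 1000

with tempfile.TemporaryFile() as tmp:
    tmp.write(data)
    file_digest = sha256_of_file(tmp)

print(file_digest)
```

Reading in fixed-size chunks gives the same digest as hashing the payload in one call, without assuming the content is made of text lines.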