Retrieve the sha256 hash of a file in Python.

Solved
Lou -  
Lou_2044 Posted messages 11 Status Member -

Hello,

I had already come here to talk about my Python script (see here), which allows me to retrieve email information, especially attachments.

I thought I could manage it without any issues, but it's more complex than expected.

I ultimately decided to calculate the sha256 without directly retrieving the files. I tried using hashlib, but the script retrieves the file into a temporary file, so I'm having difficulty calculating its hash due to the rather special file types. Here’s a part of the code; the rest is accessible via this link:

# Extract MIME parts
filename = part.get_filename()
print(filename)
mimetype = part.get_content_type()
print(mimetype)
if filename and mimetype:
    if mimetype in config['caseFiles'] or not config['caseFiles']:
        log.info("Found attachment: %s (%s)" % (filename, mimetype))
        # Decode the attachment and save it in a temporary file
        charset = part.get_content_charset()
        if charset is None:
            charset = chardet.detect(bytes(part))['encoding']
        # Get filename extension to not break TheHive analysers (see Github #11)
        fname, fextension = os.path.splitext(filename)
        fd, path = tempfile.mkstemp(prefix=slugify(fname) + "_", suffix=fextension)
        try:
            with os.fdopen(fd, 'w+b') as tmp:
                tmp.write(part.get_payload(decode=1))
                print(tmp)
                for line in tmp:
                    m = hashlib.sha256(line)
                    print("test :", m.hexdigest())
                attachments.append(path)
        except OSerror as e:
            log.error("Cannot dump attachment to %s: %s" % (path,e.errno))
            return False

I tried this to see:

tmp.write(part.get_payload(decode=1))
print(tmp)
fichier = part.get_payload(decode=1)
d = hashlib.sha256()
print("test :",d.hexdigest())

I thought that if we retrieve the payload, we should be able to calculate the hash with this payload, but the returned hash is the same for each attachment...

2 answers

yg_be Posted messages 23437 Registration date   Status Contributor Last intervention   Ambassadeur 1 588
 

Hello,

why are you using a file?

have you tried

payload=part.get_payload(decode=1)
hash=hashlib.sha256(payload)
print("hash: ",hash.hexdigest())
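
To illustrate the idea on a complete message, here is a minimal sketch that hashes every attachment straight from its decoded payload, without any file. It assumes the message is an email.message.EmailMessage (so that iter_attachments() is available); the helper name attachment_hashes and the two test attachments are made up for the example and are not part of the original script.

```python
import hashlib
from email.message import EmailMessage


def attachment_hashes(msg):
    """Return {filename: SHA-256 hex digest} for each attachment of msg."""
    hashes = {}
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g. base64) and returns bytes
        payload = part.get_payload(decode=True)
        hashes[part.get_filename()] = hashlib.sha256(payload).hexdigest()
    return hashes


# Hypothetical test message with two made-up attachments
msg = EmailMessage()
msg["Subject"] = "demo"
msg.set_content("body text")
msg.add_attachment(b"first file", maintype="application",
                   subtype="octet-stream", filename="a.bin")
msg.add_attachment(b"second file", maintype="application",
                   subtype="octet-stream", filename="b.bin")

for name, digest in attachment_hashes(msg).items():
    print(name, digest)
```

Each digest now depends on the attachment's content, so two different attachments give two different hashes.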
mamiemando Posted messages 33228 Registration date   Status Moderator Last intervention   7 940
 

Hello,

As yg_be mentioned in #1, you don’t need to create a temporary file. Using one is all the more surprising since you stated you wanted to avoid storing the email on the hard drive.

However, if you still wish to do so:

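  • it's better not to read the file line by line (readlines, or a loop such as "for line in tmp"), which is more suited to text files than to binary attachments
  • it's better to create your temporary file using the tempfile module

import tempfile

with tempfile.TemporaryFile() as tmp:
    # bytes(0x1234) is just dummy data: 4660 zero bytes
    tmp.write(bytes(0x1234))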
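
If the temporary file is kept, one way to hash it could look like the sketch below. Two points matter: after write() the file position is at the end, so the file has to be rewound with seek(0) before reading, and the data should be fed to a single hash object in fixed-size chunks rather than creating a new hash per line. The function name sha256_of_file, the chunk size, and the test payload are assumptions chosen for the example.

```python
import hashlib
import tempfile


def sha256_of_file(fileobj, chunk_size=65536):
    """Hash a binary file object from its beginning, chunk by chunk."""
    digest = hashlib.sha256()
    fileobj.seek(0)  # after write(), the position is at the end of the file
    # read() returns b"" at end of file, which stops the iteration
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)  # one hash object accumulates all chunks
    return digest.hexdigest()


# Made-up binary payload standing in for part.get_payload(decode=1)
payload = b"\x00\x01binary\ndata\n" * 1000

with tempfile.TemporaryFile() as tmp:
    tmp.write(payload)
    file_hash = sha256_of_file(tmp)

# Same digest as hashing the payload directly in memory
print(file_hash == hashlib.sha256(payload).hexdigest())
```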
Lou_2044 Posted messages 11 Status Member
 

Hello @mamiemando and @yg_be,

Indeed, I wanted to avoid retrieving the file but I thought I had to. Following @yg_be's response, I feel silly because that was what I wanted to do, but I did it wrong x)

Thanks again :)

yg_be Posted messages 23437 Registration date   Status Contributor Last intervention   1 588 > Lou_2044 Posted messages 11 Status Member
 

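The height of absurdity was the instruction "d = hashlib.sha256()", which calculated the hash of nothing. Since no data was ever passed to it, the result was the hash of empty input, which is why it was the same for each attachment.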
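
A quick way to see this, as a small sketch (the payload value is made up for the demonstration):

```python
import hashlib

# SHA-256 of empty input, a well-known constant
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# No data is given to the hash object, so this is the digest of b""
no_data = hashlib.sha256().hexdigest()
print(no_data == EMPTY_SHA256)

# Passing the payload (or calling update() with it) makes the digest depend on the data
payload = b"some attachment bytes"  # made-up stand-in for part.get_payload(decode=1)
with_data = hashlib.sha256(payload).hexdigest()
print(with_data != no_data)
```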

Can you then mark the discussion as resolved?

Lou_2044 Posted messages 11 Status Member > yg_be Posted messages 23437 Registration date   Status Contributor Last intervention  
 

Yes, I wanted to do

d = hashlib.sha256(fichier)

X)

"Can you then mark the discussion as resolved?"

I can't: I only just created my account, because the username "Lou" was already taken.
