## Why don't we use composite hashes?

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

Tub
Posts: 382
Joined: Wed Jul 27, 2011 3:13 pm UTC

### Why don't we use composite hashes?

Hello,

Let's use a simple definition: a hash is a function which transforms a message m into a fixed-length value H(m). And all we care about are preimage attacks: Given a message m1, how difficult is it to produce a message m2 (with m1 != m2) so that H(m1) = H(m2)?

We used MD5. Flaws were found. We migrated to SHA-1. Flaws were found. We're migrating to SHA-2. Flaws will be found. We'll migrate to something else.

But why don't we use composite hashes, e.g. the concatenation of multiple structurally different hash functions?

For example, we went from 128 bit MD5 to 160 bit SHA-1. What if we instead use a hash that's 128 bit MD5 + 32 bit CRC32? The trivial padding-attacks against CRC32 would leave you with a mismatch in the MD5 part; the not-so-trivial bitflip-attacks against MD5 would leave a mismatch in the CRC32. As far as I can see, none of the known attacks are immediately useful against the composite thing. I understand the reluctance to adapt something this duct-tapey, but given the way SHA-1 turned out, could that composite hash have been a better choice?

And then there's SHA-2, with SHA-2-256 and SHA-2-512, the second having a larger state, block size and slightly more rounds. If a structural vulnerability is found in SHA-2, would those increases even be useful? If you need a 512-bit hash, why don't you diversify your hash creation and use something like SHA-2-256 + SHA-1 + MD5? Wouldn't that be more resilient against the eventual flaw in SHA-2?

>-)
Posts: 525
Joined: Tue Apr 24, 2012 1:10 am UTC

### Re: Why don't we use composite hashes?

i think it's possible that there could be some problems with this, even though it sounds pretty good in practice
for example, suppose it just happened that (sha2(x) and md5(x)) xor sha1(x) = x (the first 256 bits of x, or something)
[edit: just saw your bit in the post about m1 != m2, but isn't allowing someone to find the original input a pretty bad idea?]
of course, this is probably not the case, but there could potentially some function of the hash values which allowed you to retrieve the input, or at least help generate a preimage
hashing x with multiple functions gives the attacker more information to work with, in that sense, and increases the attack surface

it seems to me that this concatenation does sound better, since this possible weakness sounds pretty unlikely compared to the chance of a weakness in say, sha2 -- but then again, i'm not a cryptographer

this thesis seems to be all about good ways to combine hash functions to make better hash functions

other sources:
https://security.stackexchange.com/ques ... ore-secure
https://crypto.stackexchange.com/questi ... 4490#24490
Last edited by >-) on Wed Feb 22, 2017 2:24 pm UTC, edited 5 times in total.

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 3493
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: Why don't we use composite hashes?

Are we talking of one (say) MDxCRCy standard to be rolled out as the New Thing, or the possibility to specify TypeAHash+TypeBHash(+TypeCHash..?) for any given A≠B(≠C≠A...) from the whole range, chosen 'randomly' from amongst all agreeably shared type-standards expected to be available to all parties?

(With the latter, especially, I could even imagine a process to just request as many different 'component' hashes as deemed necessary to the importance of the validation. And available to create at both ends, of course.)

Tub
Posts: 382
Joined: Wed Jul 27, 2011 3:13 pm UTC

### Re: Why don't we use composite hashes?

>-) wrote:hashing x with multiple functions gives the attacker more information to work with, in that sense, and increases the attack surface

But.. in my definition, the attacker already knows one plaintext. If running additional hashes on that plaintext was an advantage, then the attacker could already do it.

I'm concerned with the case of creating a fake certificate/executable that still matches the hash (or digital signature) of the original.
To retrieve passwords from hashes, the attacker would certainly prefer to have multiple independent hash values of the same password, because they just need to attack the weakest one (assuming that the shortest matching plaintext is the actual password). So that's a good point for not using concatenations in a general-purpose hash like the SHA families, but that's not a case I'm concerned with right now.

Thanks for the links. That thesis seems highly relevant, but it will take me a while to work through.

Soupspoon wrote:Are we talking of one (say) MDxCRCy standard to be rolled out as the New Thing, or the possibility to specify TypeAHash+TypeBHash(+TypeCHash..?) for any given A≠B(≠C≠A...) from the whole range, chosen 'randomly' from amongst all agreeably shared type-standards expected to be available to all parties?

I'm not proposing rolling out anything, I'm just curious if the added structural complexity given by a concatenation of different families of hashes increases collision resistance more than, say, a new, shiny and unproven algorithm (as in my first example) or larger state and additional rounds (as in my second example).

>-)
Posts: 525
Joined: Tue Apr 24, 2012 1:10 am UTC

### Re: Why don't we use composite hashes?

yes in that case, where you want to fake a certificate, definitely makes the security at least as good as before, because any algorithm which can create fake certificate for the concatenated hash also makes a fake certificate for all the component hashes

wumpus
Posts: 531
Joined: Thu Feb 21, 2008 12:16 am UTC

### Re: Why don't we use composite hashes?

The two big reasons that we don't use composite hashes are efficiency and hard coded hashes.

It is much more likely that simply using a current hash is going to be more efficient than using an old one (for equal values of security).
The exception is if your old hash has been hard coded. That situation likely makes things worse as you are likely to only have the final value available (you would need to use your "new" hash to feed constant state into the old one, if the state isn't available). This limits the output to the size of the old hash.

There *might* be a reason to convolve SHA2 with SHA3. Since these hashes aren't even vaguely related, you would need separate attacks on both SHA2 and SHA3. I'd expect any attack that works on SHA to work even better on MD5. Not so much with an attack on SHA3 on SHA2. Doing something so related is simply piling weaker rounds on top of stronger rounds. In general, just run the most rounds you can (and make the rounds as fast as possible).

If you want "better" security from SHA2, I'd suggest looking at SHA512/256. Maintaining 512 bits of state makes it especially hard to work back from the 256 bits of output (which should be enough output for any hash). SHA3 uses this to a much lesser extent (a little bit goes a long way) as the "sponge construction".