I usually just store something like a row ID to the hash record and an SHA1 of the hash itself (primary key, for sorting proposed, allowing duplicates).
That opens up options like:
- a left join through the SHA1 temp table to the actual hash table
- exists check prompting the expensive current lookup on actual hash
- count > 1 prompting a less expensive current lookup on row ID (assuming it’s a key or indexed field) and hash
Again this all is based on my assumptions of the issue, I still haven’t dug into the code yet.
Oh, and if you stuck around this far into the post you get to know that my recreate finished after 4 days, 5 hours, 37 minutes.