Finding the right mnemonic passphrase

Here’s the passphrase word list (not the original, but will work as an example):

fiscal bomb mutual one alley mistake unfair they proof unveil month prepare logic yard daring adapt eyebrow turn burst mandate win report maximum giraffe

Aight, first things first, what does the Mnemonic Code Converter / BIP39 Tool by Ian Coleman (and used by Ledger) say when you enter this passphrase?

invalid mnemonic

Well, it’s clear the provided passphrase is wrong. I didn’t know the details at the time, but the last word in a BIP39 word sequence (whether 12 words, 24 words, or another multiple of 3) carries the checksum. For 24 words, the final 8 bits of the sequence are a hash-derived check on the 256 bits before them. The checksum tells you whether the rest of the sequence is valid, so if you picked 24 random words from the 2048 words of the BIP39 spec, they’d most likely be an invalid sequence. FYI, everywhere I read strongly recommended that you do not generate your own passphrase, and let it be generated for you. Essentially, one that you try to create will never be as entropic, or random, as one algorithmically generated for you.
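For anyone curious how that checksum works, here is a rough sketch based on my reading of the BIP39 spec, not on any particular library’s code. The method name mnemonic_checksum_valid? is made up for illustration, and word_list is assumed to be the full 2048-word list in its official order.

```ruby
require 'digest'

# Returns true when the checksum embedded in the words matches their entropy.
# `word_list` is assumed to be the 2048-word BIP39 list, in order.
def mnemonic_checksum_valid?(words, word_list)
  return false if words.empty? || words.length % 3 != 0

  indices = words.map { |word| word_list.index(word) }
  return false if indices.include?(nil) # a word that is not in the list

  # Each word encodes 11 bits (2048 = 2**11).
  bits = indices.map { |i| i.to_s(2).rjust(11, '0') }.join

  # One checksum bit per 32 entropy bits, i.e. per 33 bits in total.
  checksum_length = bits.length / 33
  entropy_bits    = bits[0, bits.length - checksum_length]
  checksum_bits   = bits[-checksum_length, checksum_length]

  # The checksum is the leading bits of SHA-256 over the entropy bytes.
  entropy_bytes = [entropy_bits].pack('B*')
  expected_bits = Digest::SHA256.digest(entropy_bytes).unpack('B*').first

  checksum_bits == expected_bits[0, checksum_length]
end
```

For a 24-word phrase that works out to 264 bits: 256 bits of entropy followed by an 8-bit checksum, so the last word is part entropy (3 bits) and part checksum (8 bits).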

Anywayyyy, I first asked them if they had any ideas of words that could be wrong, like copied over incorrectly, hard to read, or written down by row instead of by column. In the original word list, the word “rigid” appeared twice, at the top of both columns on their recovery phrase card (card pictured below).




BIP39 allows repeated words, so the double “rigid” wasn’t necessarily an issue. But better than going off of nothing at all I guess.

Say we can try substituting one of those “rigid”s with other words. But that’s already 2048 possibilities…not something you want to do manually, especially since it’s likely a dead end. Well, easy enough to write a script to produce the passphrases.

# an array of the 2048 word list
bip39_word_list = %w(
  abandon
  ability
  able
  about
  above
  absent
  absorb
  ...
)

# array of our incorrect 24 word phrase
passphrase = %w(fiscal bomb mutual ... )

# generate test passphrases by replacing each word in the original passphrase with all 2048 words
passphrase.each_with_index do |current_word, index|
  bip39_word_list.each do |test_word|
    passphrase_copy = passphrase.clone
    passphrase_copy[index] = test_word
  end
end

Instead of just replacing the “rigids”, I figured I’d just make loops to generate all 24*2048 possible test passphrases. That’s 49,152 in total. This is testing the “best case” scenario, where only 1 word in the passphrase is incorrect. And only the first step. If it’s more than 1 word, or if the order of the words is wrong, then we’ll have a much bigger problem to deal with.

As we learned earlier, the last word in these test passphrases acts as a checksum, so hopefully the majority of the 49,152 are not valid mnemonics at all. How do we test that automatically? I looked into Ian’s JavaScript code to learn more and then found the Ruby gem BipMnemonic. Let’s expand the script to check for validity.

require 'bip_mnemonic'

valid_count = 0

# generate and validate test passphrases by replacing each word in the original passphrase with all 2048 words
passphrase.each_with_index do |current_word, index|
  bip39_word_list.each do |test_word|
    passphrase_copy = passphrase.clone
    passphrase_copy[index] = test_word
    passphrase_copy_string = passphrase_copy.join(' ')

    begin
      # raises SecurityError if the mnemonic's checksum is invalid
      BipMnemonic.to_entropy(mnemonic: passphrase_copy_string)
      valid_count += 1
    rescue SecurityError
      # invalid mnemonic, skip it
    end
  end
end

puts valid_count

Ahhhh, only 199 valid combinations! 0.4%, that’s kind of interesting on its own? I wonder if it’s a similar percentage if I randomly selected 24 words. That’d be really easy to check right now…real shame.
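That 0.4% is not a coincidence. A 24-word mnemonic carries an 8-bit checksum, so an arbitrary 24-word sequence should pass with probability 1 in 256, assuming the SHA-256 output behaves like a uniformly random value. A quick back-of-the-envelope check (the variable names are just for illustration):

```ruby
words_in_phrase = 24
word_list_size  = 2048
checksum_bits   = 8 # a 24-word mnemonic carries an 8-bit checksum

# Chance that an arbitrary 24-word sequence has a matching checksum.
valid_fraction = 1.0 / 2**checksum_bits

candidates     = words_in_phrase * word_list_size
expected_valid = candidates * valid_fraction

puts "candidates:     #{candidates}"
puts "valid fraction: #{(valid_fraction * 100).round(2)}%"
puts "expected valid: #{expected_valid.round}"
```

So roughly 192 valid candidates are expected, which is in the same ballpark as the 199 found. For the last position the count is exact: the final word holds 3 entropy bits plus the 8 checksum bits, so for any fixed first 23 words exactly 8 of the 2048 possible last words are valid.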

Ok, now what. Say we did have the correct passphrase. Then we should be able to generate the private key and get access to the wallet. Is there an app (software wallet) that I can easily throw these test phrases in? That’d make things easy. Repetitive, but easy. If one allowed bulk checking, that’d be really nice. I don’t really remember what the search results were. I feel like there may have been an app, but it seemed to support Bitcoin only, and my impression was that his wallet had other altcoins. And I definitely didn’t want to manually try 199 phrases. I also didn’t understand how a wallet could have more than 1 type of coin in it. My understanding was that a wallet is really just a corresponding public and private key and a blockchain address. Well, looks like that isn’t the case here, and we’ll figure this out later.

No matter – if we can generate addresses at least, we can check for transactions on the blockchain. If we found a transaction, that means the mnemonic passphrase it came from would be the right one. Transactions are all completely public, and various websites make it easy to check, e.g. blockchain.info. Luckily I also found a Reddit post where someone posted their broken mnemonic for a bitcoin wallet, and another redditor figured out the issue and posted some info (not the same story I referred to at the beginning of this post). This could be useful as a reference because it’s a real wallet with transactions you can check.

Using the Mnemonic Code Converter for the Reddit post passphrase honey relief scale kite dose lyrics they middle globe exhaust smooth galaxy horror ensure grape way gift embody spring cupboard horror hurt image swift, we get all sorts of data: seed, coin, root key, derivation path, derived addresses…what is all this crap. Why are there so many addresses? Here are some screenshots.

[Screenshot: mnemonic seed]

[Screenshot: mnemonic extended keys]

[Screenshot: mnemonic derived addresses]

Well, if we check the Account Extended Public Key on blockchain.info, we see some transactions! Note that some of those addresses on the left match the derived addresses above.

[Screenshot: blockchain extended key]

[Screenshot: blockchain transactions]

Cool, so if we can generate whatever these account extended public keys are for our 199 passphrases, we can check those online. Maybe there is a way to check in bulk. For now, let’s just see if we can generate the keys. Could also be a fun exercise to download the entire Bitcoin blockchain and figure out how to read it myself…another time perhaps.

I looked back at Ian’s javascript code to see how the Mnemonic Code Converter worked. Lots of jumping around functions and some unfamiliar javascript stuff. Pleeassseeee let there be a Ruby gem I can use. Wooo, looks like this MoneyTree gem might work.

Not sure what all this stuff means. Let’s try to make one of these Master Nodes. The example is @master = MoneyTree::Master.new seed_hex: "000102030405060708090a0b0c0d0e0f". Aight, we need a seed_hex for each mnemonic passphrase, and BipMnemonic lets us do that: BipMnemonic.to_seed(mnemonic: passphrase). Let’s try using the Reddit post.

seed_hex = BipMnemonic.to_seed(mnemonic: 'honey relief scale kite dose lyrics they middle globe exhaust smooth galaxy horror ensure grape way gift embody spring cupboard horror hurt image swift')
=> "37757ee9da759364544a655593057c991958e6014fa8ebfe242bc11f5c8373b3735b0f0e1e731c3355efe4bc09cdbf65ba5ca79af1061a5847c6a6528d8d5d4a"

@master = MoneyTree::Master.new seed_hex: seed_hex
=> #<MoneyTree::Master:0x007ff651ebea70 @depth=0, @index=0, ...
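For the curious, the mnemonic-to-seed step is defined by BIP39 as PBKDF2 with HMAC-SHA512, 2048 iterations, and a salt of "mnemonic" plus an optional extra passphrase. Here is a minimal sketch using only Ruby’s standard library. I have not checked how the BipMnemonic gem implements it internally, and bip39_seed_hex is a made-up name.

```ruby
require 'openssl'

# BIP39 seed derivation: PBKDF2-HMAC-SHA512, 2048 rounds, 64-byte output.
# `extra_passphrase` is the optional BIP39 passphrase (sometimes called the
# "25th word"), which is a different thing from the word list itself.
def bip39_seed_hex(mnemonic, extra_passphrase = '')
  password = mnemonic.unicode_normalize(:nfkd)
  salt     = ('mnemonic' + extra_passphrase).unicode_normalize(:nfkd)

  seed = OpenSSL::PKCS5.pbkdf2_hmac(password, salt, 2048, 64, OpenSSL::Digest.new('SHA512'))
  seed.unpack('H*').first
end

puts bip39_seed_hex((['abandon'] * 11 + ['about']).join(' '))
```

Note that this step does not look at the checksum at all. Any string produces a seed, which is why the validity check has to happen separately.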

Now there is a method available on the master node object we can try: @master.to_bip32(:private).

=> "xprv9s21ZrQH143K3yx2Tn5J3G2mYTpUnjrdQUkwXZxPiid5eJgWYKYQfpDCMDWQqfq8whGfts9q5txq6ERRz3rX67GgFzAfv9E3Re4ecDoG3FF"

Aha, that matches the BIP32 Root Key. We’re getting somewhere. Ok, we can do the same thing to find the public key, but if you try searching the result on blockchain.info, nada. We gotta figure out this derivation path stuff to get that account extended public key. MoneyTree lets you create child nodes off the master, and you specify a BIP32 Derivation Path. Let’s try whatever this m/44'/0'/0'/0 thing is.

@node = @master.node_for_path "m/44'/0'/0'/0"
@node.to_bip32(:public)
=> "xpub6EtUqtb2dbwBrv8gdqvppvbu3MwNkSfKvd3duxQdyErmHm6PUq9P6AwqtaZp24oB12eEyqkbGnnJeR2JAVkossGdgghx7KhxyGQj6hNvVpX"

That’s the BIP32 Extended Public Key, which does not give results on blockchain.info. Quick googling didn’t reveal any sites that let you search using this key (not that there isn’t one). How do we get the account extended public key…?

Looking back at those screenshots, we see the derived addresses. Let’s try to generate those. For each address, we see a path on the left. Looks like the derivation path from before but with an extra number at the end: m/44'/0'/0'/0/0, m/44'/0'/0'/0/1, m/44'/0'/0'/0/2, etc. I’m starting to understand! Ok, so these derived addresses are really just nodes further down some tree, and the depths are separated by a “/”. Not sure what those apostrophes mean, but maybe doesn’t matter. Learn more about them here.
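As it turns out, BIP32 uses the apostrophe to mark a “hardened” child: its index gets 2^31 added, and it can only be derived from the parent private key, not from the parent public key. Here is a small sketch of how a path string maps to the list of child indices the tree walk uses (parse_derivation_path and HARDENED_OFFSET are names I made up, not MoneyTree’s):

```ruby
HARDENED_OFFSET = 2**31

# Turns "m/44'/0'/0'/0/1" into the child indices BIP32 walks through,
# one per level below the master node "m".
def parse_derivation_path(path)
  levels = path.split('/')
  raise ArgumentError, "path must start with 'm'" unless levels.shift == 'm'

  levels.map do |level|
    hardened = level.end_with?("'")
    index = Integer(level.delete("'"), 10)
    hardened ? index + HARDENED_OFFSET : index
  end
end

p parse_derivation_path("m/44'/0'/0'/0/1")
```

The first three levels come back as numbers of 2^31 and above (hardened), while the last two are plain 0 and 1.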

@node = @master.node_for_path "m/44'/0'/0'/0/0"
@node.to_address
=> "1BoTRHKRU4TLGdBpZ8Yjf1hXEmtxjdgb6v"

Well well, that’s the first derived address. And a wallet address is definitely easy to check. I asked about what coins were on the wallet, and he actually did have some bitcoin (as well as Stratis, Dash, and Litecoin). Hmmm, if we can check these in bulk, and we see any transactions, then we have a hit! Bitref.com allows bulk checking (found it from some Stack Overflow post). Butttt…there are tons of possible derived addresses for each passphrase. There’s nothing stopping someone (like your wallet provider) from starting at m/44'/0'/0'/0/999 instead of m/44'/0'/0'/0/0. Let’s just hope not, and we’ll try the first three nodes. Time to update the script to generate all of this.

require 'money-tree'

def get_keys_from_seed_hex seed_hex, passphrase_copy_string
  @master = MoneyTree::Master.new seed_hex: seed_hex
  @node = @master.node_for_path "m/44'/0'/0'/0"
  # the account node sits one level above the node used for addresses
  @account_node = @master.node_for_path "m/44'/0'/0'"

  bip32_public_key = @node.to_bip32(:public)
  bip32_private_key = @node.to_bip32(:private)
  account_extended_public_key = @account_node.to_bip32(:public)

  # the first three receiving addresses under this account
  first_address = @master.node_for_path("m/44'/0'/0'/0/0").to_address
  second_address = @master.node_for_path("m/44'/0'/0'/0/1").to_address
  third_address = @master.node_for_path("m/44'/0'/0'/0/2").to_address

  @addresses_to_lookup << first_address
  @addresses_to_lookup << second_address
  @addresses_to_lookup << third_address

  # just to get some visual feedback of all this stuff
  puts "account public key: #{account_extended_public_key}"
  puts "public  key: #{bip32_public_key}"
  puts "private key: #{bip32_private_key}"
end

@addresses_to_lookup = []
passphrase.each_with_index do |current_word, index|
  bip39_word_list.each do |test_word|
    passphrase_copy = passphrase.clone
    passphrase_copy[index] = test_word
    passphrase_copy_string = passphrase_copy.join(' ')
    
    begin
      seed_hex = BipMnemonic.to_seed(mnemonic: passphrase_copy_string)

      get_keys_from_seed_hex seed_hex, passphrase_copy_string

    rescue SecurityError => e
    end
  end
end

Note: Above you can see account_extended_public_key included. At some point I figured out the account extended public key was one parent up in the derivation path. The ‘44’ corresponds to BIP44, and the first ‘0’ indicates Bitcoin. For Litecoin, that number is 2, and the BIP44 derivation path would be “m/44'/2'/0'/0”.
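Putting that together, a BIP44 path has the shape m/44'/coin'/account'/change/index. Here is a small sketch of building those paths per coin. The helper names are made up, and the coin numbers are the SLIP-44 values as I understand them (Bitcoin and Litecoin as noted above, plus Dash as 5).

```ruby
# SLIP-44 coin types for a few coins (assumed values, check the registry).
COIN_TYPES = { bitcoin: 0, litecoin: 2, dash: 5 }.freeze

# Path of the account node, e.g. "m/44'/0'/0'" for the first Bitcoin account.
def bip44_account_path(coin, account: 0)
  "m/44'/#{COIN_TYPES.fetch(coin)}'/#{account}'"
end

# Path of a single address; change 0 is the external (receiving) chain.
def bip44_address_path(coin, index, account: 0, change: 0)
  "#{bip44_account_path(coin, account: account)}/#{change}/#{index}"
end

# First three receiving-address paths for every coin in the table.
paths = COIN_TYPES.keys.flat_map do |coin|
  (0..2).map { |index| bip44_address_path(coin, index) }
end
puts paths
```

One caveat: the path is only half the story for other coins. Each coin also encodes its addresses with its own version bytes, so an address string generated with Bitcoin settings won’t match what a Litecoin or Dash explorer expects.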

That’s 597 addresses to check. Bitref didn’t respond when trying 200 at a time, but 100 worked. @addresses_to_lookup[0..100].join("+"). No hits. 101-200….no hits. 401-500 no hits. I lost a lot of hope at this point. 501-600….woahhhh wait, is that a legit transaction??
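A side note on the slicing: in Ruby, [0..100] is inclusive and returns 101 items, so hand-written ranges are easy to get subtly wrong. Array#each_slice is a tidier way to cut the list into batches. A quick sketch with stand-in data (lookup_queries is a made-up helper, and the fake addresses are placeholders):

```ruby
BATCH_SIZE = 100

# Splits the address list into "+"-joined query strings of at most
# `batch_size` addresses each.
def lookup_queries(addresses, batch_size = BATCH_SIZE)
  addresses.each_slice(batch_size).map { |batch| batch.join('+') }
end

# Stand-in data: 597 placeholder strings instead of real addresses.
addresses = (1..597).map { |i| "addr#{i}" }
queries = lookup_queries(addresses)

puts queries.length
```

With 597 addresses that gives 6 queries: five of 100 addresses and a last one of 97.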

[Screenshot: bitref transactions]

I got really excited at this point and figured this meant one of the test passphrases was correct. Unfortunately Bitref doesn’t put the address alongside the transactions (didn’t check the source). Not the most efficient, but I just did a manual binary search (checking the addresses between 500-550, then 500-525, etc.) until I could narrow it down to the address behind at least 1 of the 3 transactions that were found. The 4th transaction, the one on top, was the coins getting moved to a new wallet (after this recovery). Ok, got 1. Now let’s update the :get_keys_from_seed_hex method and print out the passphrase when we get an address match.
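That manual halving could be scripted too. Here is a sketch that assumes some way of asking “does any address in this batch have transactions?”. I did this by hand in the browser, so the has_transactions callable below is purely hypothetical and is stubbed out in the example.

```ruby
# Narrows a batch down to one address with activity by halving it repeatedly.
# `has_transactions` is any callable that takes an array of addresses and
# returns true if at least one of them shows transactions.
def find_active_address(addresses, has_transactions)
  return nil unless has_transactions.call(addresses)

  while addresses.length > 1
    middle = addresses.length / 2
    left   = addresses.first(middle)
    right  = addresses.drop(middle)
    # If the hit is not in the left half, it must be in the right half.
    addresses = has_transactions.call(left) ? left : right
  end

  addresses.first
end

# Stubbed lookup: pretend exactly one placeholder address has transactions.
batch   = (500..599).map { |i| "addr#{i}" }
active  = ['addr537']
checker = ->(candidates) { (candidates & active).any? }

puts find_active_address(batch, checker)
```

For a batch of 100 this needs at most 8 lookups instead of up to 100 single-address checks.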

def get_keys_from_seed_hex seed_hex, passphrase_copy_string
  @master = MoneyTree::Master.new seed_hex: seed_hex

  first_address = @master.node_for_path("m/44'/0'/0'/0/0").to_address
  second_address = @master.node_for_path("m/44'/0'/0'/0/1").to_address
  third_address = @master.node_for_path("m/44'/0'/0'/0/2").to_address

  @addresses_to_lookup << first_address
  @addresses_to_lookup << second_address
  @addresses_to_lookup << third_address

  # an example value here would be 1BoTRHKRU4TLGdBpZ8Yjf1hXEmtxjdgb6v
  address_to_check = "removed for privacy"

  if first_address == address_to_check || second_address == address_to_check || third_address == address_to_check
    puts passphrase_copy_string
  end

end

So I run that, and out pops the matching passphrase:

=> "fiscal bomb mutual one alley mistake unfair they roof unveil month prepare logic yard daring adapt eyebrow turn burst mandate win report maximum giraffe"

That’s our example passphrase from the beginning, but what changed? Not “proof”…it should be “roof” !! I didn’t want to create my own wallet to check or invade privacy or anything, so I sent off the fix to the coin owner to try it out himself. SUCCESS. He got all his coins back – $10,000 worth – and looks like he moved them to a new wallet.

In Conclusion

That was a really fun problem to try to solve and how awesome is it that it worked out?? It was lucky that only 1 word was wrong, but on the other hand, it doesn’t seem likely that several transcription errors would have happened when hand copying the passphrase.

Knowing the solution, it’s easy to look back and realize that checking for pairs of words that are very similar would have solved this a lot faster. Still, got to learn about how all this works and how HD Wallets have this tree structure that allows for basically many wallets all stored in one master wallet. Learn more about HD Wallets here and here too. All in all, this took about 5-6 hours.

For anyone in the future that runs into this same issue, let’s generate that pairs list, so that can be tried first. This is 2048^2 = 4,194,304 checks.

word_pairs = []
bip39_word_list.each do |first_word|
  bip39_word_list.each do |second_word|

    # Reject if the same word
    # Check if words are similar lengths, otherwise garment matches arm
    
    if  first_word != second_word && 
        ( difference = (first_word.length-second_word.length).abs )<= 1 

        # Check if compared words contain the other, like proof contains roof
        if  first_word.include?(second_word) || second_word.include?(first_word)
          word_pairs << [first_word, second_word].sort
        
        elsif [first_word.length, second_word.length].max > 3
          pair = [first_word, second_word].sort_by { |w| -w.length }
          
          # Check if all but 1 letter match up
          if pair[0][0..-2] == pair[1][0..(pair[1].length-2+difference)] ||
             pair[0][0..-2] == pair[1][1..(pair[1].length-1+difference)] ||
             pair[0][1..-1] == pair[1][0..(pair[1].length-2+difference)] ||
             pair[0][1..-1] == pair[1][1..(pair[1].length-1+difference)]  

            word_pairs << [first_word, second_word].sort
          end
        end
    end

  end
end

# remove duplicates and sort alphabetically
# (uniq! returns nil when nothing was removed, so don't chain on it)
word_pairs = word_pairs.uniq.sort

word_pairs.each do |word_pair|
  puts word_pair.join(', ')
end
word_pairs.count
=> 505

I’ve made the word pairs list available here.

Some examples:

issue, tissue
item, kite
item, stem
just, must
keen, keep
kick, sick
kind, mind

...

save, wave
sea, seat
seed, seek
sell, tell
ship, whip
shoe, shop
side, tide
side, wide
sing, wing
ski, skin
slab, slam
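To actually try those pairs first, one option is to swap each word of the broken passphrase for its look-alikes and test only those candidates before falling back to all 49,152. A sketch, with made-up helper names and a tiny hand-picked sample of pairs standing in for the full list:

```ruby
# Builds a lookup of word => look-alike words from [a, b] pairs.
def similar_words_index(word_pairs)
  index = Hash.new { |hash, key| hash[key] = [] }
  word_pairs.each do |a, b|
    index[a] << b
    index[b] << a
  end
  index
end

# Returns every phrase obtained by swapping exactly one word for a look-alike.
def similar_word_candidates(passphrase, word_pairs)
  index = similar_words_index(word_pairs)
  passphrase.each_with_index.flat_map do |word, position|
    index.fetch(word, []).map do |replacement|
      candidate = passphrase.dup
      candidate[position] = replacement
      candidate.join(' ')
    end
  end
end

# Sample only: the real input would be the full pairs list.
sample_pairs = [%w[proof roof], %w[kind mind], %w[side tide]]
passphrase = %w[fiscal bomb mutual one alley mistake unfair they proof unveil
                month prepare logic yard daring adapt eyebrow turn burst
                mandate win report maximum giraffe]

puts similar_word_candidates(passphrase, sample_pairs)
```

Each candidate can then go through the same checksum and address checks as before, with far fewer phrases to test.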

