Do CoinJoin Mixes Really Require Equal Transaction Amounts for Privacy? Part Two: Knapsack
Although Satoshi Nakamoto’s white paper suggests that privacy was a design goal of the Bitcoin protocol, blockchain analysis can often break users’ privacy. This is a problem. Bitcoin users might not necessarily want the world to know where they spend their money, what they earn or how much they own, while businesses may not want to leak transaction details to competitors — to name some examples.
But there are solutions to regain privacy, like CoinJoin. Some of the most popular mixing solutions available today use this trick, including Wasabi Wallet (which leverages ZeroLink) and Samourai Wallet (which leverages Whirlpool). In both cases, users chop their coins into equal amounts to mix them with each other. Using equal amounts is considered a crucial step for the mix to be effective.
Part one of this miniseries covered a new mixing protocol in development for Bitcoin Cash called CashFusion, which challenges the assumption that equal amounts are necessary for a successful mix.
But even in 2017, in a paper analyzing the privacy of non-equal amount CoinJoins in depth, researchers from RWTH Aachen University and Karlsruhe Institute of Technology proposed a solution to gain privacy through CoinJoin without the need to use equal amounts: knapsack mixing.
Author’s note: If you do not know what a CoinJoin transaction is or why equal amounts are assumed necessary for mixing, you should first read part one of this miniseries — or at least read the first two sections of that article.
Mixing Versus Paying
As explained in part one of this miniseries, equal-amount bitcoin mixing probably offers the best achievable privacy on the Bitcoin blockchain today. But it does leave users with unequal-change outputs. These don’t offer the same level of privacy and could even be a privacy risk. CashFusion could help deal with these unequal outputs.
But there’s another problem. The requirement to use equal amounts prevents users from making actual payments through CoinJoin transactions: It’s unlikely that a merchant would charge the exact amount required in the CoinJoin. So, instead, equal-amount CoinJoins are really only used for mixing: Participants put funds in and get the same amount of funds back. Unfortunately, this means that mixing requires extra blockchain transactions, which cost transaction fees and time.
Researchers Felix Konstantin Maurer (of RWTH Aachen University), Till Neudecker and Martin Florian (both of Karlsruhe Institute of Technology) set out to solve this problem in their 2017 paper titled “Anonymous CoinJoin Transactions with Arbitrary Values.” They proposed a CoinJoin solution that could be useful for actual payments — that is, it uses unequal amounts — while still offering privacy.
Named after the knapsack problem, their solution is referred to as knapsack mixing.
Like CashFusion, the core idea behind knapsack mixing is to generate a CoinJoin transaction that can be puzzled together into several different configurations of potential original transactions. Different configurations would link different inputs to different outputs, thereby breaking the trail of coins on the blockchain.
Knapsack mixing achieves this by cutting the original outputs from the original transactions into smaller outputs for the CoinJoin transaction. Furthermore, it uses relatively simple tricks to ensure that the smaller outputs result in several potential configurations being possible.
Maurer, Neudecker and Florian’s paper includes three variants of knapsack mixing. The first variant is the most fleshed out in the white paper itself. The second and third versions are fairly similar, where the third version is really a superior version of the second version. (The authors of the paper only came up with the third version in a late stage of writing the paper; it would have probably been given a more prominent place in the study otherwise.)
Let’s look at the different variants.
To explain the first variant of knapsack mixing, let’s take a CoinJoin example from the first article in this miniseries. Alice wants to pay Carol 3.2 coins and has two inputs worth 2.3 and 1.4 coins, respectively. Meanwhile, Bob wants to pay Dave 4 coins and has two inputs worth 3 and 2 coins, respectively.
Simplified, these transactions look like so:
2.3 + 1.4 = 3.2 + 0.5
3 + 2 = 4 + 1
(The 0.5 BTC and 1 BTC outputs are change.)
Merged together, the CoinJoin transaction would then look like so:
3 + 2.3 + 2 + 1.4 = 4 + 3.2 + 1 + 0.5
As pointed out in the previous article, the transactions were merged, but assuming you know that there are two payers, the amounts can be puzzled together in only one configuration: the original transactions. As such, it’s trivial to rediscover which inputs paid which outputs, defeating the point of making a CoinJoin.
Knapsack mixing changes this. In short, it uses the value difference between the two original transactions to split an original output from the biggest transaction into smaller pieces. This at least ensures that there are two configurations, where most outputs could be linked to any input.
Let’s look at this step by step. First, the total amount of the outputs are added up per transaction. For Alice and Carol’s transaction, this is 2.3 + 1.4 = 3.7. For Bob and Dave’s transaction, this is 3 + 2 = 5. Bob and Dave’s transaction is the biggest one.
Next, the difference between the two is calculated: 5 – 3.7 = 1.3. Then, this difference is subtracted from the biggest transaction. Bob and Dave’s is the biggest transaction, and we’ll split the 4 output, so: 4 – 1.3 = 2.7.
Hence, the four outputs from the biggest transaction is in the CoinJoin split into 1.3 and 2.7.
This time, the CoinJoin looks like so:
3 + 2.3 + 2 + 1.4 = 3.2 + 2.7 + 1.3 + 1 + 0.5
Now we get back to puzzling…
Of course, the original configuration is still possible. It’s just that Dave now receives two outputs instead of one.
This would look like so:
2.3 + 1.4 = 3.2 + 0.5
3 + 2 = 2.7 + 1.3 + 1
But on top of that, a whole new configuration is now possible:
2.3 + 1.4 = 2.7 + 1
3 + 2 = 3.2 + 1.3 + 0.5
As a result, blockchain analysts can no longer link outputs 3.2, 2.7, 1 or 0.5 to any input with certainty! A boon for privacy, even though the CoinJoin transaction did not use equal amounts.
To add a new transaction to the mix, the value all previous transactions (put differently: the existing CoinJoin) would be added up as if it were one transaction. Then, like the first time around, the value difference between these previous transactions and the new transaction would be used to split an output. And so forth for the next transaction and any additional transaction after that.
Variants Two and Three
While variant one of knapsack mixing does a good job of delinking most outputs from any of the inputs, the inputs themselves can still be linked to other inputs. These sets are the same for both configurations. This is not ideal for privacy either.
Knapsack mixing variants two and three are specifically designed to unlink the inputs. Variant two does, however, require that all participants in the CoinJoin learn each other’s inputs and outputs, which means it doesn’t actually offer much privacy: Variant three fixes this. Yet, for the purpose of the article (which focuses on blockchain privacy), the difference is small enough to cover both variants at once.
We’re taking the same examples as above. Alice wants to pay Carol 3.2 coins, and Bob wants to pay Dave 4 coins.
2.3 + 1.4 = 3.2 + 0.5
3 + 2 = 4 + 1
For variants two and three, a “virtual transaction” is generated. This virtual transaction doesn’t otherwise exist, but blockchain analysts will be tricked to think that it might.
To create this virtual transaction, one input from each original transaction is taken. Then, the value of these inputs is added up.
For example, like so:
1.4 + 2 = 3.4
The value of our selected inputs is 3.4. Therefore, the value of the outputs of the virtual transaction must also be 3.4.
This is easy to accomplish. We once again take an output from the biggest original transaction, which, in our example, is again 4. We also look at the output it was originally matched with in this original transaction: 1. Then we split the big output (4) so that one of the halves can be combined with its original match (1) to generate the virtual value (3.4). In this case, that means that 4 is split into 2.4 and 1.6. (After all, 2.4 + 1 = 3.4.)
Now, the CoinJoin looks like so:
3 + 2.3 + 2 + 1.4 = 3.2 + 2.4 + 1.6 + 1 + 0.5
Again, based on this CoinJoin, the original configuration is of course still possible. It’s just that Dave once again receives two outputs instead of one.
This would look like so:
2.3 + 1.4 = 3.2 + 0.5
3 + 2 = 2.4 + 1.6 + 1
But on top of that, a new “virtual configuration” is also possible:
3 + 2.3 = 3.2 + 1.6 + 0.5
2 + 1.4 = 2.4 + 1
Not only do different configurations match different inputs to different outputs, different configurations also match different inputs with each other!
Knapsack mixing, based on a simple trick, offers a significant privacy improvement, especially compared to making normal transactions.
Still, knapsack mixing is not quite as private as equal-amount mixes. Equal-amount mixes essentially allow for a maximum amount of configurations; necessarily more than even the best knapsack mix. And perhaps more notably, knapsack mixing still allows for some linking of certain inputs and outputs — or at least more likely linkages.
Indeed, in the examples above, certain inputs and outputs were matched in both potential configurations. In variant one, the 1.3 output was matched with the 3 and 2 inputs either way. So while blockchain analysis wouldn’t reveal what the original transactions were, it’d still reveal a link between the 3 and 2 inputs and the 1.3 outputs. Variants two and three, while delinking inputs from each other, allow for even more matches between inputs and outputs.
It’s also worth pointing out that a knapsack CoinJoin for payment requires extra outputs and would, therefore, still cost more fees than regular transactions or even regular CoinJoin transactions would. It may also require merchants to provide two addresses when they are paid, instead of just one.
In other words, while an improvement over equal-amount mixing when it comes to blockspace efficiency and fees, and a big improvement versus regular transaction or even regular CoinJoin transactions for privacy, knapsack mixing still comes with a little extra hassle and cost.
Author’s note: There is a bit more to the knapsack mixing proposal, like how the CoinJoin transaction is constructed. There are also several more subtle risks and trade-offs when it comes to privacy, like how users handle their coins before and after the mix. For simplicity and readability, this article focuses only on the central and arguably most interesting idea behind knapsack mixing: unequal-amount mixing.