How the Mathematical Conundrum Called the ‘Knapsack Problem’ Is All Around Us
A litany of issues in business, finance, container ship loading and aircraft loading derive from this one simple dilemma
published : 06 February 2024
Imagine you’re a thief robbing a museum exhibit of tantalizing jewelry, geodes and rare gems. You're new at this, so you only brought a single backpack. Your goal should be to get away with the most valuable objects without overloading your bag until it breaks or becomes too heavy to carry. How do you choose among the objects to maximize your loot? You could list all the artifacts and their weights to work out the answer by hand. But the more objects there are, the more taxing this calculation becomes for a person—or a computer.
This fictional dilemma, the “knapsack problem,” belongs to a class of mathematical problems famous for pushing the limits of computing. And the knapsack problem is more than a thought experiment. “A lot of problems we face in life, be it business, finance, including logistics, container ship loading, aircraft loading — these are all knapsack problems,” says Carsten Murawski, professor at the University of Melbourne in Australia. “From a practical perspective, the knapsack problem is ubiquitous in everyday life.”
Researchers once took advantage of the problem’s complexity to create computer security systems, but these can now be cracked since the problem has been so well studied. Today, as technology capable of shattering the locks on our digital communications loom on the horizon, the knapsack problem may inspire new ways to prepare for that revolution.
All or Nothing
The knapsack problem belongs to a class of “NP” problems, which stands for “nondeterministic polynomial time.” The name references how these problems force a computer to go through many steps to arrive at a solution, and the number increases dramatically based on the size of the inputs—for example, the inventory of items to choose from when stuffing a particular knapsack. By definition, NP problems also have solutions that are easy to verify (it would be trivial to check that a particular list of items does, in fact, fit in a backpack).
“The problem the theoreticians started to look at was how efficiently a particular task can be carried out on a computer,” writes Keith Devlin in the book The Millennium Problems. For example: Given a list of 1 million museum artifacts with their weights and monetary values, and a backpack limited to 25 pounds, a computer would have to run through every possible combination to generate the single one with the most lucrative haul. Given an indefinite amount of time, a computer could use brute force to optimize large cases like this, but not on timescales that would be practical.
“We think you could cover the entire Earth with processors and run them until the heat death of the universe and still fail to solve relatively small instances of appropriate versions of these problems,” says Noah Stephens-Davidowitz, a Microsoft Research Fellow at the Simons Institute in Berkeley, California.
Some NP problems like the knapsack example have a special property: In the early 1970s, Stephen Cook and Richard Karp showed that a variety of NP problems could be converted into a single problem of formal logic. Therefore, if one could be solved and verified efficiently with an algorithm, they all could. This property is known as “NP completeness.”
One of the most stubborn questions in computer science and mathematics is whether these “NP” problems, including the knapsack problem, are truly different from “P” problems, those that can be solved in what is called polynomial time. If P=NP, then it’s possible to solve every problem whose solutions are easy to verify, says Stephens-Davidowitz. So, if this inequality persists, the general knapsack problem will always be hard.
The Knapsack Problem
Keeping Things Secret
Cryptography researchers love problems that are difficult for computers to solve because they’re useful in encrypting digital messages. Knapsack-problem-like security codes are not useful for this, as they're too easily cracked, but more complicated methods inspired by this problem are being developed, and may one day play a role in outwitting the next generation of computing.
In an early knapsack-style encryption method, one person’s private key would be a list of numbers in which each is larger than the sum of its predecessors. Exchanges involving that person would use a public key that looks random but is made up of numbers from the first list with specific transformations applied. For example, if the public key is [2, 3, 4, 5], the transmitted message “1, 0, 0, 1” would be encoded as 2+0+0+5 = 7 (because 2*1=2, 3*0=0, 4*0=0, and 5*1=5). Secret numbers involved in the conversions between keys allow the original message to be unveiled.
For this to work, a computer must also figure out whether any given number can be written as the sum of a subset of numbers in the private key, which becomes an easy knapsack problem. It’s akin to filling a backpack with a batch of such differently sized items — like a ring, a painting, a car and a house — and knowing you can’t stuff in anything else after you’ve checked that the ring and the painting fit. Cryptographers Ralph Merkle and Martin Hellman described this idea in 1978, but others figured out how to crack it by the early 1980s.
Private information exchanges on today’s internet often use keys involving large prime numbers, and while factoring big numbers is difficult, it’s not thought to belong to the same “NP complete” class as the knapsack problem. However, computer scientists are already gearing up for a future in which quantum computers can quickly unlock these keys.