Warning! If you aren't into math or number theory, you may want to skip this particular article. I'm no mathematician, but I am something of a hobbyist when it comes to math. This article provides an overview of the approximately four months I have put into exploring a forest of prime trees.
What's a prime tree? Well, that requires a little background…
Much study has been devoted throughout history to the prime numbers, that is, those numbers evenly divisibly by only themselves and 1. These numbers begin with 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31… and extend onward to inifinity. People have examined the distribution of primes, the behaviors of primes, and have devoted massive quantities of computer time to find larger and larger primes. As of this writing the largest known prime has 6,320,430 digits.
People have researched clusters, patterns, related lists of primes. The most basic of these are arithmetic progressions of primes, i.e. for some starting prime P and some increment I: P+I is prime, P+I+I is prime, P+I+I+I is prime and so forth. Currently the longest known arithmetic progressions have a length of 22.
Another type of related prime list is called a Cunningham Chain. A Cunningham Chain is a list of primes such that the next prime is the result of some simple formula applied to the previous prime. For example 2, 5, 11, 23, 47 is such a chain of length 5, where each subsequent term is 2p+1 where p is the previous term. 5=2×2+1, 11=2×5+1, 23=2×11+1 etc. Cunningham Chains are associated specifically with the formula 2p+1 or 2p-1. For 2p-1 one of the longest starts at 69257563144280941 and has a length of 15. For 2p+1, one of the longest starts at 95405042230542329 and has a length of 14.
A prime tree is a name I came up with for a connected series of primes that are generated with a simple formula just as in a Cunningham Chain, but where the offset can be either added OR subtracted. So for example, the formula of interest might be 3p+-2 starting at 5. Note: the terminology I tend to use to describe these three values are as follows: 3 is the coefficient (c), 2 is the offset (d), and 5 is the root (r).
Letting p=5, we would apply both the 3p-2 and 3p+2 formulas, and keep track of the results. As it turns out both of the results (13 and 17) are prime, so we would add these to our tree as the “children” of 5 like so:
 - =  + = 
The “- =” line shows the child when you multiply and deduct the offset, the “+ =” line shows the child when you multiply and add the offset. Then we would continue the process by letting p=13 and then 17, and adding the children of these numbers to the tree if the children are prime. 3×13-2 is 37 and 3×13+2 is 41. Both of these are prime. 3×17-2 is 49 which is not prime, but 3×17+2 is 53 which is prime:
 - =  . - =  . + =  + =  + = 
Pushing onward we find that 37 has 2 prime children (109 and 113), 41 has none, and 53 has one (3×53-2 = 157):
 - =  . - =  . . - =  . . + =  . + =  + =  + =  - = 
At this point the 109 and 157 branches die, no further primes are generated, but 3×113-2 yields 337 which is prime, so a new generation is added to the tree:
 - =  . - =  . . - =  . . + =  . . - =  . + =  + =  + =  - = 
Continuing this process with 337 and pushing on until no more prime children are generated gives us the completed 3p+/-2 at 5 tree:
 - =  . - =  . . - =  . . + =  . . - =  . . - =  . . + =  . . - =  . . . - =  . . . + =  . . + =  . + =  + =  + =  - = 
Note that at 16 primes, this tree contains more primes than the longest known Cunningham Chain. Of course this takes into account the width of the tree. The maximal depth of the tree is measured in generations, of which there are 9. The longest path through this tree is 5, 13, 37, 113, 337, 1013, 3037, 9109, 27329 which is 9 primes long. It stands to reason though that because each prime in the tree has 2 possible choices for continuing the analysis, that there must be prime trees out there that are deeper than the longest known Cunningham Chains, and perhaps even deeper than the largest arithmetic progression.
I am trying to find such trees. My ultimate goal is to find one 23 generations deep, for such a tree would contain a connected series of primes longer than both the world-record Cunningham Chain and the world-record Arithmetic Progression.
Using a high-end computational tool called Maple I have spent perhaps the last four months on this project. The problem space is infinite in three directions (coefficient, offset, and root prime) and therefore tracking down large trees has become increasingly difficult.
The essential problem is that as prime numbers get bigger, so do the spaces between them There are far more prime numbers between 100 and 200, then there are between 100,000,100 and 100,000,200. Since the growth of a tree constantly moves towards a more sparsely populated region of primes, your chances of continuing to produce primes on any particular branch drop and eventually the branch peters out.
One day while taking my typical shotgun approach to the search, I noticed something interesting that many very large trees had in common… they grew backward for awhile. How do you make a tree grow backward? Simple, just make the offset so large that after you multiply and subtract that offset, you get a child prime which is smaller than the root prime. If you can get a tree to grow backward for a few generations, you are adding depth to the tree but moving toward a denser prime space! I called trees that did this “inverted”, and it occurred to me that if I focussed specifically on these types of trees, I might find my record breakers faster.
I devised an algorithm to try, for any root prime, to build a tree with every prior prime as the first child, for a few different coefficients. After running this algorithm for a few weeks and reaching primes up around 170,000 I had succeeded in finding a few trees 21 generations deep! However the time required to run the computations was getting longer every day, and my little computer simply didn't have enough power to move through the problem space any faster.
So, during a weekend when I was sick with a stomach virus, I built a pair of applications designed to distribute the inverted prime tree search across multiple computers. One application, called “the dispatcher” (left image), runs on a single machine. The other application “the client” (right image) is installed on any machines that are going to assist in the search. The Dispatcher doles out small pieces of the problem to each client, and collates each client's results into a single datafile. In this way, a much larger “distributed computer” can be put to work on the problem. Collectively this pair of apps is called “ShareCal”. Armed with ShareCal, I entreated people where I work to let me make use of their unused computer time. I built the apps to run in the background, and share the CPU with any active applications. These apps were slower than Maple, being written in VB, but if I could get enough clients running, them, I figured the collective computing power would process the problem space much faster than Maple.
That was about 3 weeks ago. Right now I have 25 client computers working on the problem. The combined processing power of all of these boxes is over 20 gHz. That's pretty frickin' cool!
In three weeks time, we've gone from primes around 170,000 to primes around 966,000. In the next few days, clients will begin working on primes over 1,000,000. At that time, the distributed computer will have generated and examined approximately 13 BILLION trees.
The world record for a connected series of primes is not yet broken, but it has been tied! At about 4 in the morning on the Sunday before last, a client running on a workstation owned by my colleague Dan Royalty discovered a prime tree with a depth of 22 generations. Hopefully the D23 tree will be found before I get tired of working on the problem.
I have arbitrarily assigned trees into three different categories based on their depth and/or population (the number of primes in the tree).
- Trees which have a depth less than 10 and a population less than 18 are “small” and therefore uninteresting.
- Trees which have a depth of 10 to 15 or a population of 18 to 29 are “big”. Big trees are fairly common. With 25 clients, ShareCal identifies approximately 22,000 big trees in a 12-hour period. I've found well over 300,000 big trees thus far.
- Trees with a depth of 16 or higher, or a population of 30 or higher are called “monsters”. To date, only 1,646 monsters have been found in over 200 trillion trees, so they are pretty rare.
Here's the current world-record prime tree for depth (22 generations):
2p+-569415 at 384187 [Depth: 22; Population: 29]:
 - =  + =  - =  . - =  . - =  . + =  . + =  . - =  . + =  . - =  . . - =  . . . - =  . . . - =  . . . + =  . . . - =  . . . - =  . . . - =  . . . + =  . . . + =  . . . - =  . . . + =  . . . + =  . . + =  . + =  . + =  + =  + =  - = 
And here's the enormous world-record prime tree for population (71 primes):
2p+-15 at 13 [Depth: 15; Population: 71]:
 - =  . - =  . . + =  . . - =  . . . - =  . . . . - =  . . . . . - =  . . . . . . - =  . . . . . . - =  . . . . . . + =  . . . . . . - =  . . . . . . - =  . . . . . + =  . . . . . - =  . . . . . - =  . . . . . + =  . . . . . + =  . . . . . - =  . . . . . + =  . . . . . - =  . . . . + =  . . . + =  . . + =  . . - =  . . + =  . . + =  . . - =  . . + =  . + =  . - =  . . - =  . . - =  . . - =  . . . - =  . . . - =  . . . . + =  . . . . + =  . . . . + =  . . . . - =  . . . . + =  . . . . - =  . . . . + =  . . . + =  . . + =  . . + =  . + =  . - =  . . - =  . . - =  . . + =  . . + =  . . - =  . + =  . + =  . - =  . - =  + =  - =  . + =  . - =  . + =  . + =  . + =  . - =  . + =  . + =  + =  - =  + =  + = 
So there you have it. How a person consumes a massive amount of computer time for a useless endeavor and convinces over 20 other people to help. But I guess as long as it is fun to do, it isn't a waste. It wouldn't be possible to have gotten so far so fast without the IT department at Foliage Software Systems permitting me to run this experiment on their hardware. I'd like to offer my thanks to them and everyone else who has helped me on this problem.
Wish me luck!
EDIT: Originally, the number of trees tested was erroneously reported at 242 trillion instead of 13 billion. For reasons that will likely not be interesting, the number of trees tested to reach the N'th prime is the (N-1)'th triangular number. I was mistakenly triangulating twice, i.e. if the (N-1)'th triangular number is T, I was computing the T'th triangular number.