This is a copy of The Numbers Racket CD-ROM, which I prepared back in 1998. I am presenting it here, unedited, for historical purposes. Obviously most if not all of the information is 'out of date' today, but think of it as a snapshot of what the Spy Numbers scene was like in the late 1990s.

Like what you see here? Help support my work by taking a look at my software, and buy something of interest :) It's all available for download before purchase, so you can try it out.

Enjoy!


How Are Messages Sent?

There are many ways intelligence organizations could encrypt messages to be sent to their field agents. One way would be to use a simple substitution cipher: one number represents the letter A, another B, and so on. There are two problems with such a system. First, it would be very easy for another agency to break the encryption scheme. In any language, there are well defined frequencies of usage for each letter. For example, in English, E is the most commonly used letter, followed by T, and so on. If enough encrypted messages are monitored, it would be possible to build a table of the frequency of usage for each code sent. This could be compared to the tables for the language believed to be used, and the code could be broken. Second, the method is very inefficient. If one five figure group were necessary to send each letter, messages would take a very long time to send.
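To make the frequency argument concrete, here is a minimal Python sketch. The code table (A = 10001, B = 10002, and so on) and the sample message are invented purely for illustration; they are not taken from any real station. The point is only that the most frequent group in the intercepted traffic lines up with the most frequent letter of the language.

```python
from collections import Counter

# Hypothetical code table: A -> 10001, B -> 10002, ... Z -> 10026.
TABLE = {chr(ord("A") + i): str(10001 + i) for i in range(26)}

def substitution_encode(text, table):
    """Replace each letter with its fixed code group; ignore spaces and punctuation."""
    return [table[ch] for ch in text.upper() if ch in table]

def frequency_table(groups):
    """Return (group, count) pairs, most frequent first."""
    return Counter(groups).most_common()

message = "MEET ME AT THE SAME PLACE NEXT WEEK"
groups = substitution_encode(message, TABLE)
ranking = frequency_table(groups)

# An eavesdropper sees only the groups, but the top of the ranking
# can be matched against the known letter frequencies of English.
for group, count in ranking[:3]:
    print(group, count)
```

Even with this one short message, the group standing in for E is the most frequent. With more traffic the ranking only becomes clearer.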

Another technique would be to have each group represent a word or phrase. One method that has been used in the past is to employ a book as the encryption reference. Each five figure group is broken down into two parts, one three digits long, the other two digits long. The three digit part refers to a page in the book, and the two digit part specifies which word on the page to use. This has the added advantage of not requiring the agent to possess any secret decoding books, which by themselves are extremely incriminating. The system also requires a method of specifying individual letters for words that are not part of the book text. This is done by the use of spell / end spell codes. When a word must be spelled, the spell code is inserted, followed by the codes representing individual letters, and finally the end spell code. Of course, this system is subject to being broken by exhaustive examination.
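The page / word split described above can be sketched in a few lines of Python. The tiny "book" below is a made-up stand-in (a dictionary of page numbers to word lists), and counting words from 1 is my assumption; a real system would have its own conventions, plus the spell / end spell codes, which are left out here.

```python
def split_group(group):
    """Split a five figure group into (page, word position)."""
    if len(group) != 5 or not group.isdigit():
        raise ValueError(f"not a five figure group: {group!r}")
    return int(group[:3]), int(group[3:])

def decode_book_groups(groups, book):
    """Look up each group in `book`, a dict of page number -> list of words.

    Word positions are assumed to be counted from 1.
    """
    words = []
    for group in groups:
        page, position = split_group(group)
        words.append(book[page][position - 1])
    return " ".join(words)

# Hypothetical book: only two pages, a few words each.
BOOK = {
    12: ["the", "meeting", "is", "cancelled"],
    47: ["wait", "for", "instructions"],
}

print(decode_book_groups(["04701", "04702", "04703"], BOOK))
```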

It is believed that the second technique is used, but with the addition of a twist - random numbers are added to each group to be sent, as an offset. This is referred to as the "one time pad" method. The sender of the message has a listing of randomly generated numbers, called a pad. He starts with the first number on the pad, and adds it to the first number to be sent. The same is done with the second number, the third, and so on. Once the message is encrypted with the one time pad, the pad is destroyed, never to be used again.

The receiving agent has a copy of the same pad. After he copies down the message sent, he uses the pad in a reverse process, to remove the offset due to each random number that was added. Once done, he uses whatever second process is necessary to read the message, and destroys the pad.

Example:

Message to be encoded:

Happy May Day Comrade BORIS
Each word is converted to a code via the dictionary, with the name Boris converted letter by letter with the spell / end spell codes:
Happy  May    Day    Comrade  [spell]  B      O      R      I      S      [end spell]
28732  84273  18282  73982    19382    12732  42342  63822  03282  28832  58398
So our message (before the OTP is added) is:
28732  
84273  
18282  
73982    
19382    
12732  
42342  
63822  
03282  
28832  
58398
Let's assume this is our OTP:
93824 17373 95837 35373 83633 19283 
03834 53273 63823 27837 03832 73922 
84823 03483 27387 62623 80283 37493
Next the OTP is added to the coded message. Fibonacci addition is often used, where each pair of digits is added separately and any carry is discarded, to reduce the chance of arithmetic mistakes:
28732 + 93824 = 11556
84273 + 17373 = 91546
18282 + 95837 = 03019
73982 + 35373 = 08255
19382 + 83633 = 92915
12732 + 19283 = 21915
42342 + 03834 = 45176
63822 + 53273 = 16095
03282 + 63823 = 66005
28832 + 27837 = 45669
58398 + 03832 = 51120
This is the message which is sent. The recipient uses his OTP to subtract the offset added by the OTP, to get back to the original coded message. His reverse dictionary is then used to convert the codes back into the original message.
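The addition and subtraction steps of the example can be checked with a short Python sketch. The function names are my own; the groups and pad values are the ones used in the example above. Each digit is added modulo 10 with no carry, and the receiver subtracts modulo 10 with no borrow.

```python
def add_no_carry(group, key):
    """Add two equal-length digit strings digit by digit, discarding carries."""
    return "".join(str((int(g) + int(k)) % 10) for g, k in zip(group, key))

def sub_no_borrow(group, key):
    """Reverse of add_no_carry: subtract digit by digit, modulo 10."""
    return "".join(str((int(g) - int(k)) % 10) for g, k in zip(group, key))

# Coded message and the first eleven pad groups from the example.
CODED = ["28732", "84273", "18282", "73982", "19382", "12732",
         "42342", "63822", "03282", "28832", "58398"]
PAD = ["93824", "17373", "95837", "35373", "83633", "19283",
       "03834", "53273", "63823", "27837", "03832"]

sent = [add_no_carry(g, k) for g, k in zip(CODED, PAD)]
recovered = [sub_no_borrow(c, k) for c, k in zip(sent, PAD)]

assert recovered == CODED  # the receiver gets the coded message back
print(" ".join(sent))
```

Because each digit is handled on its own, a slip in one column cannot spread into the next, which is presumably why the method is favored for hand work.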

An encryption system using a one time pad is guaranteed to be unbreakable, no matter how clever the "other side" is, or how much computing power they possess. How is this possible? Codebreaking techniques rely upon the ability to analyze multiple messages, looking for patterns that give away the encryption system. Think of it this way - encryption systems essentially use a mathematical algorithm to encode the message. That means it is possible to deduce the algorithm, and break the code. One technique is to make use of any plaintext (unencoded text) that is known. By comparing this to the encrypted text, part of the message can be decoded. Once part of it is decoded, the other parts can be deduced.

With a one time pad, there is no algorithm. Each number is purely random, with no pattern. It is impossible to deduce part of the message, even if you know another part of it. There is no way a message encoded with a one time pad can be broken. Well, there is one way, but it requires stupidity on the part of the sender. If the same pad is reused, then it becomes possible to break the code. And if it is possible to get a plaintext copy of one of the messages sent, then every other message sent using that pad can also be read. During and after WWII, the Soviets made this mistake. This blunder was utilized by US intelligence organizations to read many Soviet messages, under a project called VENONA. This information was used to apprehend many Soviet agents, including the Rosenbergs, who passed atomic bomb secrets on to the Russians. The NSA has a web page up devoted to this project, which is now publicly acknowledged. The URL is http://www.nsa.gov/docs/venona/venona.html.
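Here is a small Python sketch of why reuse is fatal, assuming the digit by digit, no carry arithmetic described earlier. The two code groups and the pad group are arbitrary illustrative values. It shows two things: subtracting two messages enciphered with the same pad cancels the pad entirely, and one known plaintext is enough to recover the pad and read the other message.

```python
def add_no_carry(a, b):
    """Digit by digit addition modulo 10 (no carries)."""
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))

def sub_no_borrow(a, b):
    """Digit by digit subtraction modulo 10 (no borrows)."""
    return "".join(str((int(x) - int(y)) % 10) for x, y in zip(a, b))

pad_group = "93824"        # the same pad group, wrongly used twice
first_plain = "28732"      # code group from message one
second_plain = "84273"     # code group from message two

c1 = add_no_carry(first_plain, pad_group)
c2 = add_no_carry(second_plain, pad_group)

# 1. The pad drops out: the difference of the ciphertexts equals the
#    difference of the underlying code groups.
assert sub_no_borrow(c1, c2) == sub_no_borrow(first_plain, second_plain)

# 2. If the first plaintext is known, the pad is exposed, and with it
#    the second message.
recovered_pad = sub_no_borrow(c1, first_plain)
assert recovered_pad == pad_group
assert sub_no_borrow(c2, recovered_pad) == second_plain
```

With a pad that is used only once, neither trick is available, since there is never a second message to compare against.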

To quote from the NSA web page:

The cryptographic systems used by the KGB's First Chief Directorate involved
a codebook in which words and phrases were represented by numbers. These
numbers were then further enciphered by the addition of random number groups,
additive, taken from a so-called one-time pad. A one-time pad comprised pages
of random numbers, copies of which were used by the sender and receiver of a
message to add and remove an extra layer of encipherment. One-time pads used
properly only once are unbreakable; however, the KGB's cryptographic material
manufacturing center in the Soviet Union apparently reused some of the pages
from one-time pads. This provided Arlington Hall with an opening. 
Some more interesting facts about the Soviet encryption system:
A Soviet code clerk preparing a message first reduced its text into numeric
code groups drawn from a codebook (a kind of dictionary in which the words
and common phrases correspond to four-digit numbers). After encoding the
plain text with numeric code groups, the clerk would obscure the code groups
by adding them, digit by digit, to a string of random digits. This second
series of digits, called "additive" or "key," was known to both the sender
and receiver because it was printed on the pages of a "one-time pad."
One-time pads were periodically pouched to Soviet consular missions in
sealed packets. The pad pages--with 60 five-digit additive groups per 
page--were used in order, always starting with the group in the upper
lefthand corner (the pad-page number to be used was more or less concealed
somewhere on the face of the message). 
How did this Soviet blunder happen?
For a few months in early 1942, a time of great strain on the Soviet regime,
the KGB's cryptographic center in the Soviet Union for some unknown reason
printed duplicate copies of the "key" on more than 35,000 pages of additive
and then assembled and bound these in one-time pads. 
This information provides some valuable insights on how messages were encoded during the Second World War. We can presume that similar techniques are used today to encrypt the messages which are sent by Numbers Stations (although we assume that today, the mistake the Soviets made in re-using one time pads is not repeated!).

It is interesting to note that the encryption dictionary produces four digit groups, which are then added to five digit keys to produce the five digit groups which are sent. As we know, most numbers stations today transmit five digit groups, although some in the past (notably the CIA) have used four digit groups. The CIA now uses five digit groups, although they are sent as three digits, pause, two digits. This may just be to give the listener some extra time to write down the group.

If the dictionary produces four digit groups, I do not know that there is a benefit to using five digit keys. There is, I think, a disadvantage to using a key shorter than the group length, since that would mean that the final group would always be close to the group before encryption, except for a small amount of "noise" added by the random additive. Of course, it is possible that dictionaries today actually produce five digit groups.
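The short key problem can be illustrated with a brief Python sketch. It assumes non-carrying addition with the key lined up under the right-hand digits of the group, which is my own reading of how a short key would be applied; the values are arbitrary.

```python
def add_short_key(group, key):
    """Non-carrying addition with a shorter key right-aligned under the group.

    Digits of the group with no key digit beneath them pass through unchanged.
    """
    padded_key = key.rjust(len(group), "0")
    return "".join(str((int(g) + int(k)) % 10) for g, k in zip(group, padded_key))

group = "73982"   # five digit code group
key = "5373"      # four digit key, one digit too short

enciphered = add_short_key(group, key)
print(group, "->", enciphered)

# The leading digit is sent in the clear every time.
assert enciphered[0] == group[0]
```

Under this assumption one digit of every group is never disguised at all, which would give an analyst a foothold.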

In the past, most Numbers Stations transmitted messages of variable length. Some still do today, although there is a trend toward messages of a fixed length. What is the reason for this? There is a cryptographic technique called "traffic analysis", which involves examining the quantity and characteristics of messages, even if the actual text is not known. A lot can still be learned from this.

For example, let's say that I'm sending you encrypted messages. Most of the time, I will have little to say. Perhaps "No Instructions", or "Happy May Day", or "The check's in the mail". Nice short messages. But then I send a very long message, say a list of contacts. This longer message would certainly stand out. It would tip off "the other side", that something is up. But if I always send messages of a fixed length, padding short messages with null information, then a longer message text won't be obvious.
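A minimal Python sketch of padding to a fixed length follows. How real stations mark the end of the genuine text is not known to me, so the reserved end-of-text group below is purely hypothetical, as is the choice of random filler. The padding is applied before the one time pad step, so the filler is indistinguishable from real traffic once enciphered.

```python
import secrets

END_OF_TEXT = "99999"  # hypothetical reserved group meaning "message ends here"

def random_group():
    """Return one random five digit filler group."""
    return f"{secrets.randbelow(100000):05d}"

def pad_message(groups, length=150):
    """Extend a list of code groups to exactly `length` groups."""
    if len(groups) + 1 > length:
        raise ValueError("message too long for one fixed length transmission")
    filler = [random_group() for _ in range(length - len(groups) - 1)]
    return groups + [END_OF_TEXT] + filler

def strip_padding(groups):
    """Drop the end-of-text marker and everything after it."""
    return groups[:groups.index(END_OF_TEXT)]

short_message = ["28732", "84273", "18282"]   # "Happy May Day"
padded = pad_message(short_message)

assert len(padded) == 150
assert strip_padding(padded) == short_message
```

A listener counting groups now sees 150 every time, whether the text was three groups or a hundred.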

The Atencion stations used to transmit messages of variable length, usually a few dozen groups. They then started sending messages of a fixed 150 count length. More recently, starting January 1, 1997, they began sending three 150 count messages per transmission. There still are some variable length Atencion transmissions, but they are increasingly rare. The Lincolnshire Poacher and Cherry Ripe both use fixed length messages. The MOSSAD transmissions, however, are always of variable length.

Another obfuscation technique that Numbers Stations probably use is bogus messages. Most listeners agree that many, if not most, transmissions from Numbers Stations consist of meaningless traffic designed to confuse the other side and consume its intelligence resources. It also provides the illusion of a much larger agent network than actually exists. Another possibility is that many of the messages sent are for training purposes only. It is natural to assume that agents need to be instructed in the art of copying down the messages, and require some practice exercises. This might explain some transmissions which otherwise make little sense.

For example, the CIA's Counting Station often goes through periods when it increases activity, and transmits from Remington, Virginia (outside Washington, DC) during the daytime on frequencies in the 3 and 4 MHz bands. These transmissions cannot be expected to be heard outside eastern North America. I think it is highly probable that these transmissions are meant for "students", perhaps those attending the US Department of State DTS training at the nearby Warrenton Training Center. Either that, or we have secret agents conducting espionage in Canada!

What about MOSSAD? These transmissions are rather unique: they use the NATO phonetic alphabet to convey messages. (There are unrelated stations which use the pre-WWII phonetic alphabet, such as the Nancy-Adam-Susan station.) It is probable that a similar encryption technique is used. A dictionary is used to produce the coded groups, and a one time pad is used for the additive. However, the OTP would have a series of letters rather than numbers for each group. It is also possible that the dictionary itself produces letters.

Another possibility is that MOSSAD does not use a dictionary, but rather just uses the OTP to encrypt the messages. There are some problems with this theory however. First, it would be impossible to transmit numbers, only letters could be sent (unless A-J is used to mean 0-9, and the receiver must decide it is a number based on the context). Second, only short messages could be easily sent. Most MOSSAD messages are a few dozen groups long, some are as short as five. However, this is not a serious problem. I have noticed some 90+ count messages, which is a good sentence or two. Since the Hebrew alphabet has 22 letters, which fit within the 26 letters of the phonetic alphabet, messages could be sent in Hebrew. I think it's a safe bet to assume that all MOSSAD agents know Hebrew! One advantage is that the recipient would not need both a codebook and a set of one time pads. The less incriminating evidence an agent has to possess, the better.
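If letters are enciphered directly with a letter pad, the arithmetic would presumably be the same idea as with digits, only modulo 26 instead of modulo 10. The Python sketch below is speculation in code form: the pad letters are made up, and nothing here is known to match what any real station does.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ... Z=25

def encrypt_letters(plain, pad):
    """Add each pad letter to the matching plaintext letter, modulo 26."""
    if len(pad) < len(plain):
        raise ValueError("pad is shorter than the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plain, pad)
    )

def decrypt_letters(cipher, pad):
    """Subtract each pad letter from the matching ciphertext letter, modulo 26."""
    if len(pad) < len(cipher):
        raise ValueError("pad is shorter than the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(cipher, pad)
    )

plain = "HAPPYMAYDAY"
pad = "QXZRTLMBVCK"      # hypothetical pad letters

cipher = encrypt_letters(plain, pad)
assert decrypt_letters(cipher, pad) == plain
print(cipher)
```

Each letter of the result would then be read out phonetically, five letters to a group.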

Not all Numbers Stations use voice transmissions. In fact, the vast majority use Morse code. A female voice reading out a series of numbers stands out like a sore thumb in the shortwave bands. But there are thousands of CW stations. A listener is much less likely to be fascinated enough by a CW station transmitting a series of numbers to stop and listen. In addition, Morse code is able to penetrate interference much better than audio. Voice stations probably exist to relay messages to those agents who cannot (or prefer not to) learn Morse code.

There exists a third transmission mode - tones. These transmissions consist of a series of tones that are repeated for a short while; this is believed to be the call-up for the agent. Then, a more rapid burst of tones is sent, presumably the message. A special machine must be used to decode these transmissions. The Soviet version of this machine was called the luminaire. The audio from the radio receiver was fed into the machine, which decoded the tones and displayed the digits by illuminating a series of lamps. The digits could then be further decoded by hand using the one time pad. One of these machines was captured when Soviet agent Hugh Hambleton was arrested in Canada in 1982. They were apparently reserved for only the most important agents. The disadvantage of course is that if an agent is discovered with such a machine, that fact alone is enough to incriminate him.

These tonal transmissions are still heard, although they have always been extremely elusive.

What role do computers and other forms of high technology play in the Numbers Station world today? It would be quite possible to create a modern version of the luminaire, that could decode not only tones but also Morse code or even voice! In fact, it is rumored that the reason the MOSSAD stations use a highly inflected voice is to help differentiate between similar sounding phonetics, to make it easier for a computer to recognize them.

It would also be possible of course to put the entire set of one time pads inside the device, in a small memory. This would speed up decryption times, and reduce the chances of error due to an agent incorrectly decoding a message. The big risk however is that the decoding device falls into enemy hands, who would then have a tool for decoding other messages. Some steps could be taken to reduce this risk and the problems associated with it. First, each agent would have his own set of one time pads. Should the device fall into the wrong hands, only his messages would be compromised. Second, the device would be fitted with a "self destruct mechanism". A special password would have to be entered to activate the device. An incorrect password would destroy the device. If the agent felt that the authorities were closing in, he could activate the self destruct mechanism.

Of course, a one time pad is also a very incriminating piece of evidence. Agents must take great precautions to make sure that they are not caught with one. Apparently many of them are printed on paper which may be dissolved in water, or otherwise easily disposed of. Having a one time pad fall into enemy hands would be just as damaging as the above-mentioned hypothetical device.


Copyright 1997 Chris Smolinski. All Rights Reserved.
Last Update: October 26, 1997.
