Unplugged: What the End of Metadata Collection Means for Intelligence
The future of the NSA’s bulk metadata collection program is in serious doubt, which raises the questions: how useful is it to the intelligence community, and what will they do if it goes away?
Recently President Obama endorsed a bill to end the government’s practice of sucking up and storing one kind of electronic data: information about Americans’ telephone calls. The bill, called the Freedom Act, passed the House on Wednesday by a vote of 338 to 88. Meanwhile, senators from both parties threatened to filibuster the renewal of the Patriot Act, the law the government relies on to justify the collection. And one federal court has ruled the practice unlawful.
What would ending bulk collection mean for intelligence collection? Last year, the National Academy of Sciences, or NAS, began looking at what would happen if US spies no longer had easy access to phone records. In January, it issued its findings in a report titled Bulk Collection of Signals Intelligence: Technical Options. The 80-page report draws open a curtain on the probable future of electronic spying.
The most important conclusion comes early on: a huge database of the phone records of millions of people is quite valuable to signals intelligence, or SIGINT. “There is no software technique that will fully substitute for bulk collection where it is relied on to answer queries about the past after new targets become known,” the NAS report says.
But the report also suggests that there may be ways to at least partially replace this. “It may be possible to improve targeted collection [as opposed to bulk collection] to the point where it provides a viable substitute for bulk collection in at least some cases, using profiles of potential targets that are compiled from a wide range of information.”
Metadata can be defined, broadly, as all the data that you produce through your exchanges with digital devices. On its most literal level, it’s “a set of data that describes and gives information about other data.” For example, metadata about a phone conversation might include the numbers of the caller and the recipient, where each one was when they began the call, and who they called previously or next—but not an actual recording of the conversation.
The NSA sweeps up vast amounts of these digital identifiers, including information about the activities of millions of Americans not related to any particular investigation or target, and holds them for up to five years. It does this under the authority of Section 215 of the USA Patriot Act, enacted in 2001, which amended the Foreign Intelligence Surveillance Act (FISA).
What does the intelligence community get from all the metadata it collects and stores? The report outlines three broad areas, starting with “contact chaining.” Phone numbers, time logs, and other pieces of metadata can be used to find hidden connections between people: middlemen, hidden contacts, or simply mutual acquaintances. Next is alternate identifier discovery, aka finding digital aliases: names, usernames, and even different communications methods a person uses. Lastly, there is triage, or ranking the urgency of threats.
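Contact chaining, as described here, amounts to walking outward from a known number through a graph of who called whom. The sketch below is only an illustration of that idea, not anything taken from the report: the call records are made up, and the function names `build_contact_graph` and `contact_chain` and the `max_hops` limit are assumptions introduced for the example.

```python
from collections import defaultdict, deque

def build_contact_graph(call_records):
    """Map each number to the set of numbers it exchanged calls with."""
    graph = defaultdict(set)
    for caller, recipient in call_records:
        graph[caller].add(recipient)
        graph[recipient].add(caller)
    return graph

def contact_chain(graph, seed, max_hops=2):
    """Breadth-first search: return {number: hops} within max_hops of seed."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    del hops[seed]  # the seed itself is not a contact
    return hops

# Hypothetical (caller, recipient) records; letters stand in for phone numbers.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "A")]
graph = build_contact_graph(records)
chain = contact_chain(graph, "A", max_hops=2)
print(chain)
```

Starting from "A" with a two-hop limit, the search reaches "B" and "E" directly and "C" through "B", while "D" stays out of reach. The catch the report points to is that this only works on calls already in storage when the seed number becomes known.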
Spotify for Spies: Replacing A Valuable Tool
Concerns about privacy and civil society aside, the NAS report says (passive-aggressively) that ending bulk collection would eliminate a unique and valuable tool.
“If the past events are unique or if delay in obtaining results is unacceptable (because of an imminent threat or perhaps because of press coverage or public demand), then the intelligence will not be as complete. So restricting bulk collection will make intelligence less effective, and technology cannot do anything about this; whether the gain in privacy is worth the loss of information is a policy question that the committee does not address,” the authors write.
But the report also explores a partial remedy: a conceptual system that would give analysts access to stored metadata only under certain circumstances and under tightly controlled limits. Think of a streaming music service like Pandora or Spotify that allows you to listen to a song but prevents you from downloading it. In the consumer world, this is called “digital rights management”; in the context of intelligence collection, it’s “usage control.”
This raises another question: how should analysts figure out what portions of the data to ask for? It’s hard to ask for the “right” puzzle piece if you don’t know what they all look like.
Again, the report offers a partial solution: create a new generation of artificial intelligence agents to assess the relevance of data either in real time or in storage. “More powerful automation could improve the precision, robustness, efficiency, and transparency of the controls, while also reducing the burden of controls on analysts,” the report says.
Such a system would, in theory, prevent unlawful fishing expeditions by limiting the scope, queries, and even personnel able to access the data.
Finally, the program would keep a log of all the above so that overseers could make sure that analysts weren’t abusing the system … or absconding with the royal jewels to Hong Kong.
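To make the usage-control idea concrete, here is a minimal sketch of a store that answers a query only when both the analyst and the search term have been cleared, and that logs every attempt either way. Everything in it is hypothetical: the class name `UsageControlledStore`, its fields, and the sample records are assumptions for illustration, and the report does not specify a design at this level.

```python
from dataclasses import dataclass, field

@dataclass
class UsageControlledStore:
    records: list              # stored (caller, recipient) pairs
    authorized_analysts: set   # personnel allowed to query at all
    approved_selectors: set    # numbers cleared as query terms
    audit_log: list = field(default_factory=list)

    def query(self, analyst, selector):
        """Return records touching selector, but only for cleared requests."""
        allowed = (analyst in self.authorized_analysts
                   and selector in self.approved_selectors)
        # Log every attempt, permitted or not, so overseers can review it.
        self.audit_log.append(
            {"analyst": analyst, "selector": selector, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError("query not authorized")
        return [r for r in self.records if selector in r]

# Hypothetical data; letters stand in for phone numbers.
store = UsageControlledStore(
    records=[("A", "B"), ("B", "C"), ("C", "D")],
    authorized_analysts={"analyst_1"},
    approved_selectors={"B"},
)
results = store.query("analyst_1", "B")  # cleared analyst, cleared selector
try:
    store.query("analyst_1", "D")        # "D" was never approved
    denied = False
except PermissionError:
    denied = True
print(results, denied, len(store.audit_log))
```

The refused query still lands in the audit log, which is the point: the record of who asked for what exists whether or not the data was released.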
Now Google vs. NSA
Technically, this problem isn’t substantially different from other AI challenges that have recently occupied the best minds of Silicon Valley. Machines have learned to find YouTube videos featuring cats and to recognize how some queries may indicate a user’s interest in a particular product. Both are Google projects; the latter allows the search giant to send you targeted ads in Gmail without identifying you to advertisers.
Yet this raises another question: when does an algorithm become so smart that it constitutes a human-like threat to privacy?
It depends on the degree to which the program understands the data that it’s collecting. A 2013 article in The Futurist magazine outlines how a supercomputer, under a Defense Advanced Research Projects Agency grant, was taught to comprehend the meaning of sentences based on grammar and syntax. It performed at 85% accuracy.
What happens to your phone metadata, whether it stays with your phone companies or goes to the NSA data storage center in Utah, is a matter of temporary concern. The future of intelligence collection belongs to the machines. That’s where the debate is headed as well.
DefenseOne: http://bit.ly/1EGAAcz