
The Internet: The power of 'bigmeta'

by UYLESS BLACK/Special to The Press
January 8, 2016 8:00 PM

How is it possible that marketing firms and intelligence agencies seem to know so much about us? Advertisements appear on our screens about products we actually like. Intelligence agents know where we are going before we arrive at our destination. Part of this powerful intuition is based on technologies known collectively as Big Data and metadata.

During the past few years, the terms Big Data and metadata have found their way into the Internet lexicon. (I do not know why the words Big Data begin in caps.) A more accurate term is Much Data, but Big Data is used in this article because it is the popular term.

Big Data and metadata processing detect trends, identify tendencies, and find relationships in very large sets of data. In many situations, the amount of data is so huge that it does not lend itself to conventional analysis, in which a data set of limited size might be examined by a single computer to extract information.

Big Data considers a few million characters of data a paltry amount. As one example, the NSA Utah Data Center is reported to be capable of storing exabytes of data. One exabyte is 1,000,000,000,000,000,000 bytes, a number almost beyond comprehension (in this example, a byte represents a character or a numeral). Clearly, this amount of data cannot be manipulated, much less analyzed, with conventional computer processing methods.
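To give a rough sense of that scale, here is a back-of-the-envelope sketch in Python. The read rate of 200 million bytes per second for a single machine and the pool of 10,000 machines are assumptions chosen for illustration only; neither figure comes from the NSA or from any source cited in this article.

```python
EXABYTE = 10**18          # bytes, the figure quoted in the article
READ_RATE = 200 * 10**6   # bytes per second; an ASSUMED single-machine speed
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_scan(total_bytes, bytes_per_second, machines=1):
    """Years needed to read every byte once, split evenly across machines."""
    seconds = total_bytes / (bytes_per_second * machines)
    return seconds / SECONDS_PER_YEAR


# One machine versus a hypothetical pool of 10,000 machines.
print(round(years_to_scan(EXABYTE, READ_RATE)), "years on one machine")
print(round(years_to_scan(EXABYTE, READ_RATE, machines=10_000) * 365, 1),
      "days on 10,000 machines")
```

Under these assumptions, a single computer would need well over a century just to read one exabyte once, while thousands of cooperating machines bring the job down to days.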

For metadata, the most common definition is: “data about data.” To clarify this term, see the figure included with this article. The information circled on the left side of the figure is metadata. The information circled on the right side of the figure is data: user content.

See Figure 10-1, Metadata and user content, above

The text “24 MONTH FIXED RATE CD ***6259” is data about data. It identifies account number 6259, which is a 24-month fixed rate certificate of deposit. The data itself, user content, is the value of the CD: “$14,458.27.”

Metadata can contain considerable information about the owner of the metadata. In this example, if the metadata were made available to parties such as Internet vendors (Google, for example), government agencies (NSA, as an example), a certificate-of-deposit thief, a terrorist group, or a former spouse, the owner of the metadata would be vulnerable to having his or her private transaction with a bank disclosed.

Of course, this one metadata record reveals only tidbits about the owner of this CD. However, if a snooper can capture all the banking records of this party, the snooper can manipulate them and infer a considerable amount of intelligence from this so-called non-personal data. This example is restricted to bank deposits. Yet metadata exists for practically any subject, such as medical information, sexual interests, shopping habits, marital accords and discords, etc.

Bigmeta

To process billions of pieces of data and metadata, enormous computer resources are required. Some organizations have thousands of computers networked together to analyze both data and metadata. I have coined a new term to identify the use of huge computer resources to massage both data and metadata: bigmeta. Granted, it is a contrived word, but it succinctly identifies two technologies and the associated computer power.

How does bigmeta work?

The relationships among the data elements in bigmeta files are important. For example, if during the examination of a massive set of data, a suspected drug dealer is discovered to be calling a number often, this phone number will be examined further. At a minimum, the analysis will determine the identity of the called party and correlate this party with the other calls it makes and receives. Some bigmeta systems then predict the likelihood of traits that might be associated with the people using these data elements.

According to bigmeta experts, with other information (movements, location, habits, etc.), accurate assessments can be made about these people being (or not being) in the drug trade. I emphasize “with other information,” because bigmeta is often able to show relationships between seemingly unrelated events.
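The call-pattern step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the call records, the phone numbers, and the threshold of three calls are all made up, and real systems work on billions of records with far more elaborate methods. Note that the sketch uses metadata only (who called whom), never the content of a call.

```python
from collections import Counter

# Hypothetical call records as (caller, callee) pairs. Invented for illustration.
CALLS = [
    ("suspect", "555-0101"),
    ("suspect", "555-0101"),
    ("suspect", "555-0101"),
    ("suspect", "555-0199"),
    ("555-0101", "555-0142"),
    ("555-0101", "555-0142"),
    ("555-0177", "555-0101"),
]


def frequent_contacts(calls, person, min_calls=3):
    """Return numbers that `person` called at least `min_calls` times."""
    counts = Counter(callee for caller, callee in calls if caller == person)
    return [number for number, n in counts.items() if n >= min_calls]


def contacts_of(calls, number):
    """Return every party that `number` called or was called by."""
    linked = set()
    for caller, callee in calls:
        if caller == number:
            linked.add(callee)
        elif callee == number:
            linked.add(caller)
    return linked


# Step 1: find the numbers the suspect calls often.
# Step 2: list everyone connected to each of those numbers.
for number in frequent_contacts(CALLS, "suspect"):
    print(number, sorted(contacts_of(CALLS, number)))
```

Here the number called three times is flagged, and the second step surfaces two further parties who have never been in direct contact with the suspect. That is the kind of relationship that would then be combined with "other information."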

Hammering data yields extraordinary results

Some parties call this approach hammering the data. The slang term conveys the idea that with enough data and enough computer processing power, information can be gleaned from what might appear to be irrelevant data relationships. This hammering has been shown to produce results. Alex Pentland, an MIT scientist, is the author of a new book on a subject he calls “social physics.”

“The power of [bigmeta],” he says, “is that it is information about people’s behavior instead of information about their beliefs. It’s about the behavior of customers, employees, and prospects for your new business. It’s not about the things you post on Facebook, and it’s not about your searches on Google, which is what most people think about, and it’s not data from internal company processes. …This sort of [bigmeta] comes from things like location data [from] your cell phone or credit card; it’s the little data breadcrumbs that you leave behind you as you move around in the world.”

“What those breadcrumbs tell,” he continues, “is the story of your life. It tells what you’ve chosen to do. That’s very different from what you put on Facebook. What you put on Facebook is what you would like to tell people, edited according to the standards of the day. Who you actually are is determined by where you spend time, and which things you buy. [Bigmeta] is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you’re likely to get diabetes.”

Metadata alone: A powerful solo actor

David Cole wrote an article on the power of metadata (see http://www.nybooks.com/daily/2014/05/10/we-kill-people-based-metadata/). Cole’s article states:

But metadata alone can provide an extremely detailed picture of a person’s most intimate associations and interests, and it’s actually much easier as a technological matter to search huge amounts of metadata [the Bigmeta approach] than to listen to millions of phone calls. As NSA General Counsel Stewart Baker has said, “Metadata absolutely tells you everything about somebody’s life. If you have enough metadata, you don’t really need content.” When I quoted Baker at a recent debate at Johns Hopkins University, my opponent, General Michael Hayden, former director of the NSA and the CIA, called Baker’s comment “absolutely correct,” and raised him one, asserting, “We kill people based on metadata.”

Big Data, metadata, and thousands of cooperating computers yield bigmeta. Their inferential power is astounding. They protect ordinary citizens from potential harm from terrorists. They give terrorists information about their intended targets. They protect enterprises from hackers. They give hackers additional tools to penetrate enterprises.

For you, me, and groups of organizations, our data and metadata are fodder for the digital farms belonging to Internet vendors and surveillance agencies. We are the silage that feeds their bigmeta organisms.

Does this aspect of our online world bother you? Is it your concern that in the future, Orwell’s 1984 could come to pass in 2084? It’s once again reflective of the old saying: “Where you stand depends on where you sit.”

The Internet advertisers and surveillance organizations are happy as larks about bigmeta technology. How its use evolves will be a key part of how our societies cope with protecting our safety and, at the same time, protecting our privacy.

Uyless Black is an award-winning author who has written 40 books on a variety of subjects. His latest book is titled “2084 and Beyond,” a work on the origins and consequences of human aggression. He resides in Coeur d’Alene.