E. Todd - December 2018
Humans domesticated wild cattle roughly 10,500 years ago. The genotype of cattle has been modified more than that of any other species of domestic livestock. These modifications have enabled us to keep them in a wide variety of conditions and environments conducive to the mass production of meat and related byproducts. The modern cattle industry produces about 25 billion pounds of meat each year and is valued at $200 billion. Byproducts include leather, bone china, glue, film, soap, pharmaceuticals, insulin and gelatin.
These quantities of beef and related byproducts have been achieved through the selective breeding of traits found to be most beneficial for our needs. Throughout the domestication process, humans have provided basic requirements such as food, water, veterinary care and a suitable environment, but have taken away the freedoms cattle would have in the wild, such as choice of mate and feed, and freedom of movement.
Domestication has reduced their longevity and contributed to various abnormalities of health and well-being absent in their free-roaming ancestors. We have created a pliant animal reliant on our care, forever altered by the environmental experiences, genetic changes and goal-oriented breeding programs thrust upon it.
Now you may be wondering how the domestication of cattle relates to data collection, privacy and artificial intelligence. Historian and best-selling author Yuval Noah Harari, in his latest book, 21 Lessons for the 21st Century, aptly sums up the unlikely connection. Harari states,
"Humans are similar to other domesticated animals. We have bred docile cows that produce enormous amounts of milk, but are otherwise far inferior to their wild ancestors. They are less curious and less resourceful. We are now creating tame humans that produce enormous amounts of data and function as very efficient chips in a huge data-processing mechanism, but these data-cows hardly maximize the human potential."
The amount of data we produce daily is astounding. Some 2.5 quintillion bytes of data are created each day, and 90% of all the data in the world has been generated over the last two years. Google alone processes more than 40,000 searches every second, or 3.5 billion searches a day. More than 300 million photos are uploaded to Facebook per day; every minute, 510,000 comments are posted and 293,000 statuses are updated on the platform. We send 16 million text messages every minute. All of this data is being collected under the auspices of free services: everything from email and search to entertainment and cat videos.
On the surface, all of these free services operate under the business model of what Harari calls “attention merchants”.
"Their true business isn’t to sell advertisements at all. Rather, by capturing our attention they manage to accumulate immense amounts of data about us, which is worth more than any advertising revenue. We aren’t their customers - we are their product."
Like our large mammal friends, we are being led into digital corrals that harvest our most personal and valuable information, largely for uses unknown. As we grow more reliant on the services of the data giants, it may become increasingly difficult to detach ourselves from their platforms. We now use various apps to check the weather, track our fitness, communicate with friends and family, get directions, and more. These services have granted us greater connectivity and access to streams of information once the purview of an elite few. It has been argued that we have democratized access to information for all, but what are the future implications for self-determination if we no longer maintain certain intrinsic privacies?
Will our future employability be determined by how often we update our LinkedIn page? If an insurance company purchases an ancestry DNA site to which you’ve submitted genetic material in exchange for information about your ancestry, and it finds that you have pre-existing conditions, will this prevent you from obtaining insurance? Will marginalized communities’ access to social services be determined by how useful certain algorithms judge their districts to be?
Similar to the byproducts created from domesticated cattle, the mass collection of our personal data has the potential to yield numerous byproducts. Whether these byproducts will serve the greater good or avaricious motives is yet to be fully understood.
Juxtaposed against these privacy concerns is the reality that algorithms require large data sets in order to detect patterns and improve. Artificial intelligence expert and author Kai-Fu Lee, in his book AI Superpowers: China, Silicon Valley, and the New World Order, states,
"In this age of implementation, data is the core. That’s because once computing power and engineering talent reach a certain threshold, the quantity of data becomes decisive in determining the overall power and accuracy of an algorithm."
For society to harness the full potential of artificial intelligence, access to our data is crucial. Lee goes on to mention,
"In deep learning, there’s no data like more data. The more examples of a given phenomenon a network is exposed to, the more accurately it can pick out patterns and identify things in the real world. Deep learning’s relationship with data fosters a virtuous circle for strengthening the best products and companies: more data leads to better products, which in turn attract more users, who generate more data that further improves the product."
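Lee’s “no data like more data” claim has a simple statistical root: weak patterns that are invisible in small samples become reliably measurable in large ones. A minimal sketch in Python (the scenario and all of its numbers are invented for illustration):

```python
import random

random.seed(0)

# Hypothetical hidden pattern: some behavioural feature raises a user's
# "success" rate from 50% to 55% -- a lift of 0.05 that we would like
# to detect from logged data.
TRUE_LIFT = 0.05

def observed_lift(n):
    """Estimate the lift from n logged events per group."""
    with_feature = sum(random.random() < 0.55 for _ in range(n)) / n
    without_feature = sum(random.random() < 0.50 for _ in range(n)) / n
    return with_feature - without_feature

def mean_abs_error(n, trials=200):
    """Average estimation error across repeated experiments of size n."""
    return sum(abs(observed_lift(n) - TRUE_LIFT) for _ in range(trials)) / trials

small_sample_error = mean_abs_error(100)     # roughly 0.06: the pattern is lost in noise
large_sample_error = mean_abs_error(10_000)  # roughly 0.006: the pattern stands out
```

With a hundred events the estimated lift swings wildly around zero; with ten thousand it pins down the true effect. That is why quantity of data translates so directly into the accuracy of a learned model.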
As we nurture artificial intelligence and increase its access to our data, corporations and nation states will benefit immensely from the patterns and knowledge these algorithms bestow upon them. Interrelationships that would normally be overlooked, given the limits of our brains against such massive streams of data, will now be sorted, contextualized and applied in fascinating new ways.
Lee discusses Smart Finance, an AI-powered app that relies exclusively on algorithms to make millions of small loans.
"Instead of asking borrowers to enter how much money they make, it simply requests access to some of the data on a potential borrower’s phone. That data forms a kind of digital fingerprint, one with an astonishing ability to predict whether the borrower will pay back a loan of $300. Smart Finance’s deep learning algorithms don’t just look to the obvious metrics like how much money is in your account. Instead, it derives predictive power from data points that would seem irrelevant to a human loan officer. For instance, it considers the speed at which you typed your date of birth, how much battery power is left on your phone, and thousands of other parameters."
Lee further illustrates the patterns of the unconsidered,
"What does an applicant’s phone battery have to do with creditworthiness? This is the kind of question that can’t be answered in terms of simple cause and effect. But that’s not a sign of the limitations of AI. It’s a sign of the limitations of our own minds at recognizing correlations hidden within massive streams of data. By training its algorithms on millions of loans, many that got paid back and some that didn’t, Smart Finance has discovered thousands of weak features that are correlated to creditworthiness, even if those correlations can’t be explained in a simple way humans can understand. Those offbeat metrics constitute what Smart Finance founder Ke Jiao calls ‘a new standard of beauty’ for lending, one to replace the crude metrics of income, zip code, and even credit score."
The potential discoveries made possible by increased access to our data may be too tantalizing to forgo.
In part II, I will conclude these thoughts on what all of this may mean for society, personal privacy, artificial intelligence and the individual.