Under-the-Radar Health Information Markets: the Supply, the Demand, and the Exploited.

Nowadays, it is not a secret that healthcare providers — such as hospitals — can store and utilize individuals’ health information. Hospitals keep records of individuals so that the diagnosis can be based on more information, and some countries even have a health information exchange system among different hospitals for the same purpose.

Yet, there are also some unnoticeable health information markets that are growing rapidly by consuming your health data without your awareness or explicit consent. In the following paragraphs, I will examine the players in the under-the-radar health information market from the view of supply and demand. I will then wrap up the article by raising awareness of the high risks that individuals face.

The Supply: Who is accessing and supplying your health information without your consent?

Health Data Brokerage Industry

In general, data brokers are entities that collect information about individuals and sell it to other data brokers, companies, and individuals. Health data brokers, accordingly, are those that focus specifically on health information. In the US, health data brokers can legally buy and sell anonymous (de-identified) data under the Health Insurance Portability and Accountability Act (HIPAA), as well as non-anonymous health data not covered by that privacy standard, including what you type into search engines and health websites [1].

“Your medical data is for sale — all of it.”

— The Guardian

One of the biggest health data brokers in the field is IMS Health (now called “IQVIA” after its merger with Quintiles). According to Forbes, IMS claimed it “processes data from more than 45 billion healthcare transactions annually and collects information from more than 780,000 different streams of data worldwide” [2]. It is noteworthy that data brokers have no direct relationship with the people whose data they collect, so people tend to be unaware that their data is being collected and sold.

Health Data Breaches

Throughout history, one of the most common ways for criminals to get something valuable has been stealing, and in the age of the internet, that means data breaches. According to Forbes, healthcare is now the most cyber-attacked industry. In the United States alone, between 2009 and 2017, there were 2,181 healthcare data breaches that exposed 176,709,305 healthcare records, a number equal to 54.25% of the US population [3]. In 2016, nine times more medical records than financial records were breached [4]. It is also noteworthy that 75% of those records were exposed or stolen as a result of hacking or IT incidents, signaling how much value criminals see in them [5].

Every year, with the exception of 2015, the number of healthcare data breaches (in the USA) has increased, rising from 199 breaches in 2010 to 344 breaches in 2017.

Apart from the United States, Australia and Singapore have also recently faced serious health data breaches. The Office of the Australian Information Commissioner revealed in July 2018 that there had been more than 300 major data breaches that year, with the healthcare sector the worst hit at 49 major breaches [6]. Singapore, meanwhile, suffered its worst cyber attack in history this year: hackers invaded the computers of SingHealth, Singapore’s largest group of healthcare institutions, and stole the health records of 1.5 million patients, including Prime Minister Lee Hsien Loong [7].

Darknet Market

Darknet markets, online black markets hosted on the so-called “dark web”, are where many of the health records from the previously mentioned data breaches end up for sale.

“Stolen health credentials can go for $10 each, about 10 or 20 times the value of a U.S. credit card number.” — PhishLabs.

On the dark web, complete health records normally contain an individual’s name, date of birth, social security number, and medical information. Such records can sell for as much as $60 apiece, whereas stolen credit card numbers sell for just $1 to $3 [8]. Prices vary with the number of items in the package, the characteristics of the victim, the source of the stolen data, and the underground reputation of the seller [9].

Source: Redsocks Malicious Threat Detection (11th Apr 2018), Dark Web: The Harmful Business of Medical Data. Available at: https://www.redsocks.eu/blog-2/dark-web-the-harmful-business-of-medical-data/

According to The Guardian, a darknet trader even claimed to have access to any Australian’s Medicare details and to be able to supply them upon request. The price for an Australian’s Medicare card details was 0.0089 bitcoin, equivalent to US$22 at the time [10].


The Demand: Who is buying your health information without your consent?

Medical Identity Theft

Medical identity theft, as defined by the World Privacy Forum, occurs when “someone uses a person’s identity without the person’s consent to obtain medical services or goods, or uses the person’s identity information to make false claims for medical services or goods” [11].

In the US, medical records have been in great demand from cybercriminals because they contain valuable personal information — such as name, address, date of birth and Social Security Number — all in one record [12]. With such information, criminals can access specific medical equipment or drugs available upon prescription — and then later sell them on the black market.

Pharmaceutical Companies

The pharmaceutical industry has traditionally depended on aggressive marketing to promote its products. However, these traditional commercial methods no longer seem to do the trick. In particular, companies are failing to engage with patients when they look for information about symptoms in the early stages [13]. By accessing more health information about individuals, companies can gain better insights into the market and into how best to interact with patients and consumers [14].

Besides the marketing aspect, pharmaceutical companies have over the past decade started to use real-world data in clinical studies to prove the value of their drugs. Between 2010 and 2016, the average cost of bringing a drug to market increased by 33%, yet average peak sales decreased by 49%. Meanwhile, the market for precision medicine is expected to grow from $39 billion in 2015 to $87.7 billion by 2023 [15]. IMS Health, for instance, states that pharmaceutical sales and marketing are a key part of its business, and its data also helps big pharma justify drug prices by demonstrating their effectiveness [16].


The Exploited: High risks, yet low (if any) returns for individuals

Your health information cannot exist without you. Yet, other people are benefiting from it instead of you.

All the health information mentioned above, whether exposed in a data breach or purchased by pharmaceutical companies, is generated by individuals. I therefore believe it is fair to argue that individuals, rather than the data brokers or the hackers, have the most at stake, yet, as shown above, they receive the least benefit from the market.

Privacy is at stake

Most of the current legal protections (e.g. HIPAA) focus on removing personally identifiable information, such as name, phone number, address, and date of birth, from health records. Health data brokers, for instance, tend to deal only in such de-identified health information. However, it is critical to realize that this method is no longer enough to secure one’s privacy, as de-identified data can often be re-identified. One popular way to do so is by combining databases to fill in the blanks, a technique also known as “mosaicking” [17].
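As a toy sketch of this mosaicking idea (every record below is fabricated for illustration), joining a de-identified medical dataset with a named public dataset on shared attributes such as ZIP code, date of birth, and sex can be enough to fill in the blanks:

```python
# Toy illustration of "mosaicking": all records below are fabricated.
# The medical data has no names, yet joining it with a named public
# dataset on shared attributes re-identifies a patient.
deidentified_medical = [
    {"zip": "02139", "dob": "1960-07-31", "sex": "M", "diagnosis": "flu"},
    {"zip": "02139", "dob": "1955-01-02", "sex": "F", "diagnosis": "asthma"},
    {"zip": "94103", "dob": "1960-07-31", "sex": "M", "diagnosis": "diabetes"},
]
public_roll = [  # e.g. an electoral roll, sold with names attached
    {"name": "J. Doe", "zip": "02139", "dob": "1960-07-31", "sex": "M"},
]

def mosaic(medical, roll):
    """Link de-identified rows to named rows via shared attributes."""
    linked = []
    for person in roll:
        hits = [r for r in medical
                if (r["zip"], r["dob"], r["sex"]) ==
                   (person["zip"], person["dob"], person["sex"])]
        if len(hits) == 1:  # a unique match means re-identification
            linked.append((person["name"], hits[0]["diagnosis"]))
    return linked

print(mosaic(deidentified_medical, public_roll))  # [('J. Doe', 'flu')]
```

With realistic attributes such as a full date of birth, sex, and ZIP code, unique matches like this are the norm rather than the exception, which is why removing names alone does not anonymize a dataset.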

“Enough anonymous data gathered over time will eventually contain enough clues to re-identify nearly anyone who has received medical care, posing a big potential threat to privacy [18].”

The Australian government, for instance, published medical billing records covering 2.9 million people on its open data website, and the data was later found to be re-identifiable using known information about the individuals [19]. With the increasing popularity of consumer genomics, research has found that “more than 60 per cent of Americans with European ancestry can be identified through their DNA using open genetic genealogy databases, regardless of whether they’ve ever sent in a spit kit” [20]. In the graph below, Bloomberg shows how someone can successfully re-identify your medical records in 5 simple steps.

Source: Bloomberg Research

Paying a high price for being a medical identity theft victim

In the US, resolving a case of medical identity theft is estimated to cost the victim about $13,500 [21]. Unlike traditional financial identity theft, medical identity theft is more difficult to discover and resolve. One of the main reasons is that health information tends to be very private and unchangeable: one cannot simply cancel one’s demographic data, family history, insurance information, or medications.

Once you become a victim of medical identity theft, doctors might update your health records with the imposter’s medical information, which can lead to false treatment for you and medical bills that you have to pay for [22].

What’s in it for the individuals?

Bearing such costs and risks, one would assume that there must be something in it for the individuals. Yet in reality, I have never been rewarded (in any form) by hospitals, pharmaceutical companies, or health data brokers for the use of my valuable health information, and I believe that is the experience of almost everyone out there.

To conclude, our health information, in its many forms, is in fact traded around more than we expect, both legally and illegally. From data brokers to hackers, entities get hold of valuable and sensitive health information and profit from it. I believe the very first step is to raise public awareness and to empower individuals to request better control over their health information.


Reference:

[1] Fast Company (1st Apr 2018). Can this app that lets you sell your health data cut your health costs? Available at: https://www.fastcompany.com/40512559/can-this-app-that-lets-you-sell-your-health-data-cut-your-health-costs
[2] Forbes (6th Jan 2014). Company that knows what drugs everyone takes going public. Available at: https://www.forbes.com/sites/adamtanner/2014/01/06/company-that-knows-what-drugs-everyone-takes-going-public/#2f37caf24c90
[3] HIPAA Journal. Healthcare Data Breach Statistics. Available at: https://www.hipaajournal.com/healthcare-data-breach-statistics/
[4] Forbes (Dec 2017). The Real Threat Of Identity Theft Is In Your Medical Records, Not Credit Cards. Available at: https://www.forbes.com/sites/forbestechcouncil/2017/12/15/the-real-threat-of-identity-theft-is-in-your-medical-records-not-credit-cards/#5c7f7fa01b59
[5] HIPAA Journal (Sep 2018). Study Reveals 70% Increase in Healthcare Data Breaches Between 2010 and 2017. Available at: https://www.hipaajournal.com/study-reveals-70-increase-in-healthcare-data-breaches-between-2010-and-2017/
[6] News.com.au (31st Jul 2018). Health sector tops the list as Australians hit by 300 data breaches since February. Available at: https://www.news.com.au/technology/online/hacking/health-sector-tops-the-list-as-australians-hit-by-300-data-breaches-since-february/news-story/5e95c47694418ad072bf34d872e22124
[7] The Straits Times (Jul 2018). Personal info of 1.5m SingHealth patients, including PM Lee, stolen in Singapore’s worst cyber attack. Available at: https://www.straitstimes.com/singapore/personal-info-of-15m-singhealth-patients-including-pm-lee-stolen-in-singapores-most
[8] Fast Company (2016). On the Dark Web, Medical Records Are a Hot Commodity. Available at: https://www.fastcompany.com/3061543/on-the-dark-web-medical-records-are-a-hot-commodity
[9] Redsocks Malicious Threat Detection (Apr 2018). Dark Web: The Harmful Business of Medical Data. Available at: https://www.redsocks.eu/blog-2/dark-web-the-harmful-business-of-medical-data/
[10] The Guardian (Jul 2017). The Medicare machine: patient details of ‘any Australian’ for sale on darknet. Available at: https://www.theguardian.com/australia-news/2017/jul/04/the-medicare-machine-patient-details-of-any-australian-for-sale-on-darknet
[11] World Privacy Forum. Medical Identity Theft. Available at: https://www.worldprivacyforum.org/category/med-id-theft/
[12] Entefy (Dec 2017). Medical records fetch a premium on the black market. Then along comes blockchain. Available at: https://www.entefy.com/blog/post/500/medical-records-fetch-a-premium-on-the-black-market-then-along-comes-blockchain
[13] McKinsey & Company (May 2016). How pharma companies can better understand patients. Available at: https://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-pharma-companies-can-better-understand-patients
[14] Lewis, R. J., Weintraub, S., Sitler, B., McHugh, J., Zan, R., & Morales, S. (2015). Results: The Future of Pharmaceutical and Healthcare Marketing.
[15] Deloitte (2017). Life Sciences and Health Care Predictions 2022. Available at: https://www2.deloitte.com/uk/en/pages/life-sciences-and-healthcare/articles/healthcare-and-life-sciences-predictions.html
[16] Fortune (9th Feb 2016). This Little-Known Firm Is Getting Rich Off Your Medical Data. Available at: http://fortune.com/2016/02/09/ims-health-privacy-medical-data/
[17] Forbes (2016). The Big Data Era of Mosaicked Deidentification: Can We Anonymize Data Anymore? Available at: https://www.forbes.com/sites/kalevleetaru/2016/08/24/the-big-data-era-of-mosaicked-deidentification-can-we-anonymize-data-anymore/#802d2be3f1e2
[18] The Century Foundation (2017). Strengthening Protection of Patient Medical Data. Available at: https://tcf.org/content/report/strengthening-protection-patient-medical-data/?agreed=1
[19] The Guardian (Jul 2018). ‘Data is a fingerprint’: why you aren’t as anonymous as you think online. Available at: https://www.theguardian.com/world/2018/jul/13/anonymous-browsing-data-medical-records-identity-privacy
[20] Wired (2018). Genome Hackers Show No One’s DNA Is Anonymous Anymore. Available at: https://www.wired.com/story/genome-hackers-show-no-ones-dna-is-anonymous-anymore/
[21] AARP (2017). Medical Identity Theft: It Can Cost You Thousands. Available at: https://states.aarp.org/medical-identity-theft-can-cost-thousands/
[22] Panda Security. Identity Theft. Available at: https://www.pandasecurity.com/mediacenter/news/identity-theft-statistics/

By Hsiang-Yun L. on July 01, 2019.

With Data Anonymization Becoming A Myth, How Do We Protect Ourselves In This World Of Data?

With humanity moving into the world of big data, it has become increasingly challenging, if not impossible, for individuals to “stay anonymous”.

Every day we generate large amounts of data, all of which represent many aspects of our lives. We are constantly told that our data is magically safe to release as long as it is “de-identified”. However, in reality, our data and privacy are constantly exposed and abused. In this article, I will discuss the risks of de-identified data and then examine the extent to which existing regulations effectively secure privacy. Lastly, I will argue that individuals should take more proactive roles in claiming rights over the data they generate, regardless of how identifiable it is.

What can go wrong with “de-identified” data?

Most institutions, companies, and governments collect personal information. When it comes to data privacy and protection, many of them assure customers that only “de-identified” data will be shared or released. However, it is critical to realize that de-identification is no magic process and cannot fully prevent someone from linking data back to individuals, for example via linkage attacks. There are also new types of personal data, like genomic data, that simply cannot be de-identified.

Linkage attacks can re-identify you by combining datasets.

A linkage attack takes place when someone uses indirect identifiers, also called quasi-identifiers, to re-identify individuals in an anonymized dataset by combining that data with another dataset. Quasi-identifiers are pieces of information that are not themselves unique identifiers but become significant when combined with other quasi-identifiers [1].
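A minimal sketch of why quasi-identifiers matter (using made-up records): counting how many rows share each quasi-identifier combination reveals which “anonymous” records are in fact unique in the dataset, and therefore vulnerable to exactly this kind of linkage.

```python
from collections import Counter

# Made-up "anonymized" dataset: direct identifiers removed, but the
# quasi-identifiers (ZIP code, birth year, sex) remain.
records = [
    ("02139", 1960, "M"),
    ("02139", 1960, "M"),
    ("02139", 1955, "F"),
    ("94103", 1982, "F"),
]

def anonymity_set_sizes(rows):
    """For each row, count how many rows share its quasi-identifier tuple.
    A count of 1 means the row is unique in the dataset and can be
    re-identified by anyone holding a named dataset with the same fields."""
    counts = Counter(rows)
    return [counts[row] for row in rows]

print(anonymity_set_sizes(records))  # [2, 2, 1, 1]
```

The last two records are unique on their quasi-identifiers alone, which is the property real attacks exploit when joining against a second, named dataset.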

One of the earliest linkage attacks happened in the United States in 1997. The Massachusetts State Group Insurance Commission released hospital visit data to researchers for the purpose of improving healthcare and controlling costs. The governor at the time, William Weld, reassured the public that patient privacy was well protected, as direct identifiers were deleted. However, Latanya Sweeney, an MIT graduate student at the time, was able to find William Weld’s personal health records by combining this hospital visit database with an electoral database she bought for only US$ 20 [2].

Another famous case of a linkage attack is the Netflix Prize. In October 2006, Netflix announced a one-million-dollar prize for improving its movie recommendation service. It published movie ratings from around 500,000 customers between 1998 and 2005 [3]. Netflix, much like the governor of Massachusetts, reassured customers that there were no privacy concerns because “all identifying information has been removed”. However, A. Narayanan and V. Shmatikov later published the paper “How To Break Anonymity of the Netflix Prize Dataset”, showing how they successfully identified the Netflix records of non-anonymous IMDb users, uncovering information that could not be determined from their public IMDb ratings [4].

Some, if not all, data can never be truly anonymous.

Genomic data is some of the most sensitive and personal information one can possibly have. With the price of and time required to sequence a human genome dropping rapidly over the past 20 years, people now only need to pay about US$1,000 and wait less than two weeks to have their genome sequenced [5]. Other companies, such as 23andMe, offer even cheaper and faster genotyping services that tell customers about their ancestry, health, traits, etc. [6]. It has never been easier or cheaper for individuals to generate their genomic data, but this convenience also creates unprecedented risks.

Unlike blood test results, which have an expiration date, genomic data changes little over an individual’s lifetime and therefore has long-lived value [7]. Moreover, genomic data is highly distinguishable, and various scientific papers have shown that it is impossible to make it fully anonymous. For instance, Gymrek et al. (2013) demonstrate that surnames can be recovered from personal genomes by linking “anonymous” genomes to public genetic databases [8]. Lippert et al. (2017) also challenge current concepts of genomic privacy by showing that de-identified genomes can be identified by inferring phenotypic measurements such as physical traits and demographic information [9]. In short, once someone has your genome sequence, regardless of the level of identifiability, your most personal data is out of your hands for good, unless you could change your genome the way you would apply for a new credit card or email address.

That is to say, we, as individuals, have to acknowledge the reality that simply because our data is de-identified doesn’t mean that our privacy or identity is secured. We must learn from linkage attacks and genomic scientists that what used to be considered anonymous might be easily re-identified using new technologies and tools. Therefore, we should proactively own and protect all of our data before, not after, our privacy is irreversibly out of the window.

Unfortunately, existing laws and privacy policies might protect your data far less than you imagine.

Having understood how NOT anonymous your data really is, one might then wonder how existing laws and regulations keep de-identified data safe. The answer, surprisingly, is that they don’t.

Due to the common misunderstanding that de-identification magically makes personal data safe to release, most regulations, at both the national and company levels, do not cover data that doesn’t relate to an identifiable person.

At the national level

In the United States, the Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) protects all “Individually Identifiable Health Information” (or Protected Health Information, PHI) held or transmitted by a covered entity or its business associate, in any form or medium. PHI includes many common identifiers such as name, address, birth date, and Social Security Number [10]. However, it is noteworthy that there are no restrictions on the use or disclosure of de-identified health information. In Taiwan, one of the leading democratic countries in Asia, the Personal Information Protection Act covers personal information such as name, date of birth, ID number, passport number, characteristics, fingerprints, marital status, family, education, occupation, medical records, and medical treatment [11]. However, the Act likewise doesn’t clarify the rights concerning “de-identified” data. Even the European Union, which has some of the most comprehensive data protection legislation, states in its General Data Protection Regulation (GDPR) that “the principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable” [12].

Source: Privacy on iPhone — Private Side (https://www.youtube.com/watch?v=A_6uV9A12ok)

At the company level

A company’s privacy policy is, to some extent, the last resort for protecting an individual’s rights to data. Whenever we use an application or device, we are compelled to agree to its privacy policy and express our consent. However, the biggest technology companies, whose businesses largely depend on utilizing users’ data, tend to exclude “de-identified data” from their privacy policies as well.

Apple, despite positioning itself as one of the biggest champions of data privacy, states in its privacy policy that it may “collect, use, transfer, and disclose non-personal information for any purpose” [13]. Google also mentions that it may share non-personally identifiable information publicly and with partners, such as publishers, advertisers, developers, or rights holders [14]. Facebook, the company that has caused massive privacy concerns over the past year, openly states that it provides advertisers with reports about the kinds of people seeing their ads and how their ads are performing, while assuring users that it doesn’t share information that personally identifies them. Fitbit, which is reported to have 150 billion hours of anonymized heart data from its users [15], states that it may share non-personal information that is “aggregated or de-identified so that it cannot reasonably be used to identify an individual” [16].

Overall, no government or company currently protects the de-identified data of individuals, despite the foreseeable risk of privacy abuses if and when such data gets linked back to individuals in the future. In other words, none of those institutions can be held accountable by law if de-identified data is later re-identified. The risks fall solely on individuals.

An individual should have full control of, and legal recourse over, the data he/she generates, regardless of identifiability level.

Acknowledging that the advancement of technology in fields like artificial intelligence makes complete anonymity less and less possible, I argue that all data generated by an individual should be seen as personal data despite the current levels of identifiability. In a rule-of-law and democratic society, such a new way of viewing personal data will need to come from both bottom-up public awareness and top-down regulations.

As the saying goes, “preventing diseases is better than curing them.” Institutions should focus on preventing foreseeable privacy violations when “anonymous” data gets re-identified. One of the first steps can be publicly recognizing the risks of de-identified data and including it in data security discussions. Ultimately, institutions will be expected to establish and abide by data regulations that apply to all types of personally generated data regardless of identifiability.

As for individuals, who generate data every day, they should take their digital lives much more seriously than before and be proactive in understanding their rights. As stated previously, when supposedly anonymous data is somehow linked back to somebody, it is the individual, not the institution, who bears the cost of the privacy violation. Therefore, as more new apps and devices come along, individuals need to go beyond simply accepting terms and conditions without reading them, and acknowledge the degree of privacy and risk to which they are agreeing. Non-profit organizations such as Privacy International, Tactical Technology Collective, and the Electronic Frontier Foundation are a good place to start learning more about these issues.

Overall, as we continue to navigate the ever-changing technological landscape, individuals can no longer afford to ignore the power of data and the risks it can bring. The data anonymity problems addressed in this article are just several examples of what we are exposed to in our everyday lives. Therefore, it is critical for people to claim and request full control of and adequate legal protections for their data. Only by doing so can humanity truly enjoy the convenience of innovative technologies without compromising our fundamental rights and freedom.

Reference

[1] Privitar (Feb 2017). Think your ‘anonymised’ data is secure? Think again. Available at: https://www.privitar.com/listing/think-your-anonymised-data-is-secure-think-again
[2] Privitar (Feb 2017). Think your ‘anonymised’ data is secure? Think again. Available at: https://www.privitar.com/listing/think-your-anonymised-data-is-secure-think-again
[3] A. Narayanan and V. Shmatikov (2008). Robust De-anonymization of Large Sparse Datasets. Available at: https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
[4] A. Narayanan and V. Shmatikov (2007). How To Break Anonymity of the Netflix Prize Dataset. Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.100.3581&rep=rep1&type=pdf
[5] Helix. Support Page. Available at: https://support.helix.com/s/article/How-long-does-it-take-to-sequence-my-sample
[6] 23andMe Official Website. Available at: https://www.23andme.com/
[7] F. Dankar et al. (2018). The development of large-scale de-identified biomedical databases in the age of genomics: principles and challenges. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5894154/
[8] Gymrek et al. (2013). Identifying personal genomes by surname inference. Available at: https://www.ncbi.nlm.nih.gov/pubmed/23329047
[9] Lippert et al. (2017). Identification of individuals by trait prediction using whole-genome sequencing data. Available at: https://www.pnas.org/content/pnas/early/2017/08/29/1711125114.full.pdf
[10] US Department of Health and Human Services. Summary of the HIPAA Privacy Rule. Available at: https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html
[11] Laws and Regulations of the ROC. Personal Information Protection Act. Available at: https://law.moj.gov.tw/Eng/LawClass/LawAll.aspx?PCode=I0050021
[12] GDPR. Recital 26. Available at: https://gdpr-info.eu/recitals/no-26/
[13] Apple Inc. Privacy Policy. Available at: https://www.apple.com/legal/privacy/en-ww/
[14] Google. Privacy & Terms (effective Jan 2019). Available at: https://policies.google.com/privacy?hl=en&gl=tw#footnote-info
[15] BoingBoing (Sep 2018). Fitbit has 150 billion hours of “anonymized” health data. Available at: https://boingboing.net/2018/09/05/fitbit-has-150-billions-hours.html
[16] Fitbit. Privacy Policy (effective Sep 2018). Available at: https://www.fitbit.com/legal/privacy-policy#info-we-collect

By Hsiang-Yun L. on April 29, 2019.

I can’t put the toothpaste back in the tube, what I found when I downloaded my Facebook data

Above is the first post ever written on my Facebook wall. I’m sure it was intended as an innocent, warm welcome, but now it reads more like someone escorting me into my own personal version of the “Hotel California” line: “You can check out any time you like, but you can never leave…”

I’m not so sure Facebook is such a wonderful addiction anymore. Since writing this post I’ve been off Facebook for two months and I want to make a commencement speech to all the youth out there called “It Gets Better.” Perhaps it will be subtitled: “I Didn’t Realize How Much Anxiety Social Media Gave Me Until I Quit It Entirely.”

I recently downloaded all of my Facebook data in preparation to delete my account. I’m not a super-user, and I’m not addicted to it. I have genuinely loved using Facebook to connect with my personal network for 11 years, but it has come to feel like the tide has shifted and now Facebook is using me. Namely, Facebook is using my data against me, collecting my every move, like, interest and conversation, all without my explicit consent or knowledge of how my data is shared, and with whom. I’m finally, really, actually not okay with that.

I have worked in marketing, advertising and public communications for the last 18 years (read: I know how to target communities with FB ads in my sleep). I have been a community organizer for various organizations in my local community (read: I’ve created loads of FB events and FB campaigns). I live in Silicon Valley (read: I’m an early-ish adopter). Plus I have an insatiable curiosity when it comes to stories, people and social trends. What all of this adds up to is I’ve loved Facebook — as a personal and professional tool to get connected to headlines, to culture, to people, to trends, to news, to events, to stories, to gossip, to all of it. Facebook has been a great platform for me to get the connection I desire: I can get the scoop, I can share my scoop, and I can freely lurk around the scoops that everyone else is talking about.

This year something changed. I began to realize the vast amounts of information that Facebook tracks, uses and shares with others. Sure, everyone has had the experience of doing a browser search for … I don’t know … “Best brand of skinny jeans” and then, 2 minutes later, seeing 5 different brands of skinny jean ads in your FB feed. Those targeted ads seemed creepy but mostly harmless, until I realized that’s just the surface. It dawned on me how much information I’ve freely handed to a massive private company, and I am just beginning to comprehend the dangers of not being able to put that toothpaste back in its tube.

Facebook has gathered years of my daily activities, online and offline (here’s a list of 98 points it tracks), and spread it across advertisers, companies, websites, lists, etc.

After doing a tiny bit of research on best practices for deleting a Facebook account, I promptly found that I can download all of my Facebook data.

Sifting through this download was astonishing. Remember, I’ve spent years of my life helping brands advertise on FB, I understand how to leverage the information folks list in their profile to target ads, but I was not prepared for what I found in my downloaded data.

Facebook stores literally everything that makes up your profile, everything on the devices you connect from, every move you make, from start to finish:

  • Of course there are all the videos, photos, birthday messages, direct messages, and ephemera you’ve uploaded. That’s great, thanks for holding those for me, FB!
  • There is a log of all the times you signed in, including the IP address from where you logged in, and the geolocation from where you logged in. This means it always knows where you are. Okay, fine, what’s the rub? Well, FB can make logical leaps based on the intersection between the things you “Like” on the platform and your location: stores you walk into, businesses you frequent, the gym, the market, the office, the bar, the ____ you’re in every day. That’s a bit stalker-ish.
  • There is a list of all of the personal contact information that has ever been on a device from which you’ve connected to Facebook. (This is where it starts to get super creepy.) My FB data download has every single phone number I’ve ever had in my phone, any of my phones, any of my devices. I know for a fact that I’ve never listed my phone number on Facebook (privately or publicly), not once. Yet in my FB data download, I found contact information for all my friends and family members, not to mention loads of people I don’t actually know or care to know: my old landlords, my neighbor’s dad, the lady who bought my table from a Craigslist ad, the nail salon I went to once, phone numbers I know I’ve deleted from my contacts. Literally every number is still stored in FB.
  • There is a giant list of all the apps you’ve ever connected to Facebook. Obviously it has a log from Spotify, and with that how many times I’ve listened to Billy Joel over the last 10 years, #notevenalittlesorry. But I don’t even recognize most of the app names in my Facebook data, and now I know all of these companies have my personal information, too. With some more digging (and research) I’ve learned that this web of my personal data extends even farther: the apps your “friends” are connected to also collect your information. Even if you rarely use FB or are strict about how you connect your profile, any of your “friends” might be handing your info out without you knowing it.
  • There is a longer list of all of the ads you’ve ever clicked. Honestly, I don’t remember clicking on a single ad while on FB, and yet the list in my data is loooooong.
  • Get this, there’s an even bigger list of all of the companies and advertisers who have your demographic and personal information. I don’t know half of the companies listed in my data, but apparently they know me! And with undisclosed data leaks in the news lately, this is alarming.

Some people say, “But I don’t have anything to hide! I don’t care if FB knows where I live.” You may not care right now, today. But this is an unprecedented amount of permanent, undeniable information that we have about ourselves. When teenagers used to talk on the phone with each other all night, there weren’t private companies tracking the content, location, diction, or patterns of their chats. This is a new age, and we have no idea in what ways this information will be used by Facebook in the future. What if this information is used against you 30 years from now, in a court of law, where you’re the main suspect in the murder of a coffee shop barista, only because you went to that coffee shop every day for two years straight, which FB knows because you always check FB when you’re waiting for your coffee? But you didn’t even know the barista who was murdered, and you coincidentally stopped going two days after he was murdered because a better coffee shop popped up two blocks in the other direction. Maybe that’s far-fetched, but it doesn’t take much to imagine a circumstance when you will find yourself wishing you’d limited how much traceable information is available and beyond your control.

Currently, there are no settings that allow me to limit what data is collected, how, why, or when it is shared, and zero control over whom it is shared with. This is the number one reason that I’m leaving Facebook: I’m not allowed to control which or how much data is collected from my profile.

Now, I’m no privacy vegan (thank you, Eva, for the term) — like I said, connection is key to my life. I don’t want to live off the grid, I don’t want to go away, I don’t want to miss precious pictures of my many nephews who live in disparate parts of the country, but I do see the speed train we are on here, heading toward a data economy, and I definitely want to be able to choose how, when and by whom my data is collected. I want a digital environment where I can exercise control over the data that I generate, over the conversations, preferences and interests I have. I’m not comfortable with a platform that knows so much about me and my daily habits and also uses that information at their private-multimillion-dollar-company whim.

This is my number two reason to leave Facebook: The data that FB collects is used without my knowledge or consent in whatever way the company chooses to use it.

Reading Facebook’s terms of service and privacy policy is disheartening. It states, “We use all of the information we have to help us provide and support our Services.”

Let me reiterate: We use. All. The information. We have.

There it is, their license to freely share, use, collect, keep, destroy, sell, manipulate my data, however they choose.

There’s one more reason that I gotta disconnect: Facebook has partnered with so many websites and data mongers that whether you are logged in or not, Facebook is constantly tracking your every move. “It’s alerted every time you load a page with a ‘Like’ or ‘share’ button, or an advertisement sourced from its Atlas network,” says the Washington Post. The Facebook eyes are everywhere, and I don’t want them watching me anymore.

Case in point: while I was researching and writing this article, I was served many “security” posts from Facebook.

Facebook, I’m sure, is set to serve these “security” notices when it notices you’ve been Googling “How to delete Facebook” and “What data does Facebook collect” and “Privacy policy, Facebook.” But these notices aren’t about security, or data, or the platform. They are about how to control who in your network can see what, which is a very superficial version of “privacy controls” and not enough for me.

Getting the latest scoop is certainly not worth the privacy concerns I have, imho.

Of course Facebook may not be the biggest or worst culprit of data manipulation, but surely it is in the top echelon because of the sheer size, reach, and global influence of the company. For me, closing this digital chapter is step one in choosing to live a more empowered digital existence. Seriously, it gets better.

By Maureen Walsh on January 05, 2018.

Defining Property in the Digital Environment. Part Two.

The First Principles of Digital Property

In Part One of the series we took a look at the history of property. In this post we begin with a fundamental question: What is property?

What is Property?

At its simplest level, a property is an asset plus a property title. While most people probably consider property to be the stuff that they own, property is technically defined as the rules governing access to and control of assets, whether those assets are land, means of production, inventions, or other creative works. Within every society, laws known as property rights regulate which entities can assert ownership claims to which assets and what rights come with such property claims:

Property rights.

A valid ownership claim functions as a “bundle of rights” for a specific property and can include such rights as:

  • the right to exclusive possession
  • the right to exclusive use and enclosure
  • the right to transfer ownership (conveyance)
  • the right to use as collateral to secure a debt (hypothecation)
  • the right to subdivide (partition)

Property rights are neither absolute nor static; they can vary widely across different societies and can change over time. In medieval Europe, common law treated all water resources as statically tied to the land rights in which they were located, such that landholders owned parts of rivers with full accompanying rights. Over time, property rights for water resources have generally changed from being land-based to use-based, thereby allowing non-landowners to hold enforceable property rights. Also consider how different national flavors of political and economic ideologies, such as capitalism, socialism, and communism, have differently dictated who can own which properties, e.g., communism mandating that all means of production can only be owned by the state.

Within most property rights regimes, a property title is the legal instrument by which an entity claims ownership of an asset. Property titles are often embodied in a formal legal document, such as a real estate deed or a motor vehicle title, which serve as physical evidence of the possessor’s claim to property rights.

Property titles are the clearest legal means for defining private property rights.

One function of the property title is to uniquely identify the asset being claimed, most commonly by recording distinctive feature sets, such as geographic coordinates or geological features for land, or serial numbers, such as vehicle identification numbers (VIN) for motor vehicles. The moment that properties lose this unique identification, they become interchangeable commodities that behave more like money than like property. In order for money to circulate seamlessly and easily within a community, it must be completely fungible: It needs to be mutually interchangeable, functionally indistinguishable, and completely impersonal. The moment someone values one dollar bill more than another is the moment the dollar bill ceases to be money and starts to be property. However, the opposite is true for property. To establish an enduring record of a property’s authenticity, an asset’s unique identifier must be recorded in the property title as a permanent and immutable pointer to the asset, such that the asset can always be identified from its corresponding property title.
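
For digital assets, one common way to obtain the kind of "permanent and immutable pointer" described above is a cryptographic hash of the asset's bytes. The article does not prescribe a mechanism, so the following Python sketch is purely illustrative:

```python
import hashlib

# Illustrative only: a content hash can play the role for a digital asset
# that a VIN plays for a vehicle -- a permanent, immutable pointer from
# the title to the asset it claims.
def asset_fingerprint(asset_bytes: bytes) -> str:
    return hashlib.sha256(asset_bytes).hexdigest()

original = b"my digital artwork, v1"
fp = asset_fingerprint(original)

# The title can always be checked against the asset it claims:
print(asset_fingerprint(original) == fp)                    # True
# Any change to the asset breaks the link, making the pointer tamper-evident:
print(asset_fingerprint(b"my digital artwork, v2") == fp)   # False
```

Note how this preserves the money/property distinction drawn above: two identical dollar bills hash the same way and remain fungible, while any uniquely identified asset yields a fingerprint that belongs to it alone.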

A second function of a property title is to make the bundle of property rights portable by acting as a container that allows its rights to be transferred from one owner to another. For assets that require a property title, transfers of ownership must be publicly recorded via a centralized government entity, such as a county land registrar or a state department of motor vehicles, in order for the transfer of property rights to be legally recognized. This history of ownership, or provenance, is most often tracked via a formal property system, which records the complete provenance of every registered property:
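
A provenance record of this kind can be sketched as an append-only chain in which each transfer commits to the hash of the previous entry, so the complete history of ownership can be replayed and verified. The function and field names below are hypothetical, not a description of any actual registry:

```python
import hashlib
import json

# Hash an ownership record deterministically (sorted keys for stability).
def record_hash(record: dict) -> str:
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

# Append a transfer that points back at the hash of the previous record.
def transfer(provenance: list, new_owner: str) -> list:
    prev = provenance[-1]
    entry = {"asset_id": prev["asset_id"], "owner": new_owner,
             "prev_hash": record_hash(prev)}
    return provenance + [entry]

# Replay the chain: every entry must reference its predecessor's hash.
def verify(provenance: list) -> bool:
    return all(provenance[i]["prev_hash"] == record_hash(provenance[i - 1])
               for i in range(1, len(provenance)))

chain = [{"asset_id": "parcel-42", "owner": "registrar", "prev_hash": None}]
chain = transfer(chain, "alice")
chain = transfer(chain, "bob")
print(verify(chain))          # True
chain[1]["owner"] = "mallory" # rewriting history...
print(verify(chain))          # ...breaks verification: False
```

This is the "container" function in miniature: the rights travel with the title, and the registry's job reduces to keeping the chain of transfers intact and publicly checkable.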

Property systems.

Piracy and Property Rights

In an ideal world, every property would have a property title. Property titles are the clearest legal means for defining private property rights. At the simplest level, property is provenance. The ability to demonstrate clean title is what protects one’s investment in a property by guaranteeing strong provenance. However, current property systems suffer from high transaction costs, which is why property titles traditionally have been reserved for physical properties whose valuations are high enough to justify the property title costs, such as real estate, vehicles, or works of art. If these transaction costs could be reduced to near zero, however, property titles could be issued for any asset, thereby clarifying property rights and further reducing negative externalities resulting from ambiguous ownership claims.

The Peruvian economist Hernando de Soto Polar has gone so far as to argue that, particularly in developing regions, the lack of access to robust property rights systems is the primary underlying cause of many nations’ most urgent negative externalities. According to de Soto, this inability to demonstrate legal ownership of assets compels many citizens, particularly small entrepreneurs, to seek extralegal remedies for their business problems since traditional means of judicial redress are only available to legal property owners. This massive exclusion from property rights systems results in the emergence of two parallel economies with disparate rules and risks: the official legal economy and a makeshift extralegal economy. It is the flourishing of extralegal shadow economies that generates many of the widespread negative externalities for their larger societies. De Soto coined the term “dead capital” to describe assets locked into such extralegal economies since their lack of property rights explicitly excludes them from becoming wealth-generating property within the larger global economy.

Within the digital environment, there exists a similar extralegal shadow economy in the form of online piracy of copyrighted works. While it is tempting to depict the rise of piracy as an unfortunate side effect of contemporary digital technologies, copyright infringement is as old as copyright itself. However, the recent prevalence of online piracy raises the question: What is it about the current state of digital assets that impels people, who in any other context would never commit crimes of piracy or theft, to engage in acts of piracy? While there are undoubtedly cases of piracy that are a simple matter of people wanting to get something without paying for it, de Soto’s research suggests that, more often than not, such recourse to extralegal solutions stems from too few property rights rather than too many. In the absence of readily available property rights for desired digital assets, otherwise law-abiding citizens resort to piracy to get what they want. Consider that a large portion of piracy occurs in countries that lack international licensing agreements to access high-demand digital assets. A more inclusive and universally accessible property system with low transaction costs that establishes clear property rights for digital assets could radically reshape the current piracy landscape by transforming disenfranchised pirates into invested property owners.

Privacy in the Digital Environment

Finally, it is important to recognize that, in the case of digital assets, there is a significant convergence of private property rights and rights to personal privacy. These seemingly unrelated sets of rights were once intrinsically linked. Historically, the ability to circumscribe an area of land as one’s own created an adequate level of protection of personal privacy through defense against unsolicited trespass. Thus, the fundamental right to private property also served as protection to personal privacy by clearly defining exclusive access rights to properties.

As new technologies have developed, courts have continuously needed to reinterpret the relationship between private property and privacy rights beyond the boundaries of physical properties by extending privacy protections to “people, not places.” These protections have included rights to privacy for posted correspondence, phone conversations, and any form of personal communication in which the content is presumed to be private. Unfortunately, however, these core personal privacy protections have not been as reliably applied to the Internet and personal data. A primary reason for these shortcomings is that most privacy laws are focused solely on protecting the content of digital communication while totally disregarding privacy protections for user metadata, which is often more revealing than the actual content itself. As an example, consider the fact that a mobile device’s detailed log of user location data is usually not protected, despite the fact that the ability to surveil someone’s daily movement patterns is, in most cases, a much more threatening privacy intrusion than monitoring any authored content transmitted from the device.

Online data privacy faces an additional complication with the continued popularization of social media applications and a growing trend towards centralized, third-party cloud computing platforms, both of which customarily require users to voluntarily store personal data on their remote servers. Under many legal systems, the act of voluntarily giving private information to third parties is considered an explicit forfeiture of any expectation to privacy rights over that information. The result of this voluntary surrender of privacy is that government authorities have been permitted to bypass traditional protections against search and seizure without first demonstrating probable cause and obtaining judicial search warrants. Within the context of digital data assets, this doctrine has been interpreted such that any third-party Internet service that stores user data — including everything from Internet service providers, cellular data providers, social media websites, and cloud storage services — must comply with government requests to access that data, thereby significantly weakening privacy protections across nearly every category of contemporary digital communication practices.

The ability to convert digital assets into properties offers a way out of this privacy dilemma by realigning rights to private property and rights to personal privacy—

—that is, essentially creating the digital equivalent of a fence that affords digital property the same bundle of private property and privacy rights historically attached to land. It is in this potential to protect digital property that we most clearly recognize that private property and privacy are two sides of the same coin.

A property system for digital properties must therefore offer both legal and technical affordances for protecting property rights and privacy rights. At the legal level, the property system must integrate into existing property rights frameworks to such an extent as to guarantee exclusionary access to the data in the same way that exclusionary access is afforded to physical properties. At the technical level, the property system must provide a minimum capacity for heightened security and privacy through strong encryption practices and other barriers to unauthorized access in the same way that security fences or monitoring systems provide an added measure of privacy for physical properties.

In the third and final part of the series we’ll introduce how Bitmark is cleaning up the digital environment by bringing real property rights to digital assets and data.

Sign up to stay up to date on our public beta release.

By Bitmark Inc. on February 22, 2017.