With Data Anonymization Becoming A Myth, How Do We Protect Ourselves In This World Of Data?

With humanity moving into the world of big data, it has become increasingly challenging, if not impossible, for individuals to “stay anonymous”.

Every day we generate large amounts of data, all of which represent many aspects of our lives. We are constantly told that our data is magically safe to release as long as it is “de-identified”. However, in reality, our data and privacy are constantly exposed and abused. In this article, I will discuss the risks of de-identified data and then examine the extent to which existing regulations effectively secure privacy. Lastly, I will argue that individuals should take a more proactive role in claiming rights over the data they generate, regardless of how identifiable it is.

What can go wrong with “de-identified” data?

Most institutions, companies, and governments collect personal information. When it comes to data privacy and protection, many of them assure customers that only “de-identified” data will be shared or released. However, it is critical to realize that de-identification is no magic process and cannot fully prevent someone from linking data back to individuals, for example via linkage attacks. Moreover, there are new types of personal data, like genomic data, that simply cannot be de-identified.

Linkage attacks can re-identify you by combining datasets.

A linkage attack takes place when someone uses indirect identifiers, also called quasi-identifiers, to re-identify individuals in an anonymized dataset by combining that data with another dataset. The quasi-identifiers here refer to the pieces of information that are not themselves unique identifiers but can become significant when combined with other quasi-identifiers [1].
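The mechanics are simple enough to fit in a few lines. Below is a toy sketch of a linkage attack: an "anonymized" medical dataset is joined with a public voter roll on shared quasi-identifiers. All records, names, and field values here are fabricated for illustration.

```python
# "De-identified" hospital data: direct identifiers removed,
# but quasi-identifiers (ZIP code, birth date, sex) remain.
hospital_records = [
    {"zip": "02138", "birth_date": "1945-07-31", "sex": "M", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1980-01-15", "sex": "F", "diagnosis": "asthma"},
]

# Public voter roll: names plus the same quasi-identifiers.
voter_roll = [
    {"name": "W. Weld", "zip": "02138", "birth_date": "1945-07-31", "sex": "M"},
    {"name": "J. Doe",  "zip": "02139", "birth_date": "1990-03-02", "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def link(anonymized, identified, keys=QUASI_IDENTIFIERS):
    """Return (name, record) pairs whose quasi-identifiers match exactly."""
    index = {tuple(p[k] for k in keys): p["name"] for p in identified}
    matches = []
    for record in anonymized:
        name = index.get(tuple(record[k] for k in keys))
        if name is not None:
            matches.append((name, record))
    return matches

print(link(hospital_records, voter_roll))
# The first hospital record links to "W. Weld": ZIP code, birth date,
# and sex together single him out, even with his name deleted.
```

Neither dataset is sensitive on its own; it is the join that re-identifies. This is essentially what happened in the Massachusetts case described next.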

One of the earliest linkage attacks happened in the United States in 1997. The Massachusetts State Group Insurance Commission released hospital visit data to researchers for the purpose of improving healthcare and controlling costs. The governor at the time, William Weld, reassured the public that patient privacy was well protected, as direct identifiers were deleted. However, Latanya Sweeney, an MIT graduate student at the time, was able to find William Weld’s personal health records by combining this hospital visit database with an electoral database she bought for only US$ 20 [2].

Another famous case of a linkage attack is the Netflix Prize. In October 2006, Netflix announced a one-million-dollar prize for improving its movie recommendation service. It published movie-rating data from around 500,000 customers, collected between 1998 and 2005 [3]. Netflix, much like the governor of Massachusetts, reassured customers that there were no privacy concerns because “all identifying information has been removed”. However, A. Narayanan and V. Shmatikov later published the research paper “How To Break Anonymity of the Netflix Prize Dataset”, showing how they successfully identified the Netflix records of non-anonymous IMDb users, uncovering information that could not be determined from those users’ public IMDb ratings [4].
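The core idea of that attack can be sketched in miniature. An attacker who knows a few of a target's (movie, rating) pairs, say from public IMDb reviews, scores every record in the "anonymized" dataset and picks the best match, with rare movies counting more than blockbusters. The data and weighting below are illustrative stand-ins, not the paper's exact algorithm.

```python
from math import log

# "Anonymized" dataset: pseudonymous user id -> {movie: rating}
dataset = {
    "user_001": {"Titanic": 5, "Obscure Indie Film": 4, "The Matrix": 3},
    "user_002": {"Titanic": 4, "The Matrix": 5},
    "user_003": {"Titanic": 5, "Obscure Indie Film": 4},
}

def movie_rarity(movie):
    """Weight a movie by how few users rated it (rarer -> more identifying)."""
    count = sum(1 for ratings in dataset.values() if movie in ratings)
    return log(len(dataset) / count) + 1.0

def score(known, candidate):
    """Sum rarity weights over the attacker's known ratings that match."""
    return sum(
        movie_rarity(m) for m, r in known.items()
        if candidate.get(m) == r
    )

# Attacker's auxiliary knowledge, e.g. scraped from a public IMDb profile:
aux = {"Obscure Indie Film": 4, "The Matrix": 3}

best = max(dataset, key=lambda uid: score(aux, dataset[uid]))
print(best)  # user_001: the only record matching both known ratings
```

Because rating vectors are sparse and individual tastes are distinctive, even a handful of known ratings is usually enough to pin down a unique record in a dataset of hundreds of thousands.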

Some, if not all, data can never be truly anonymous.

Genomic data is some of the most sensitive and personal information one can possibly have. With the price of and time required to sequence a human genome dropping rapidly over the past 20 years, people now only need to pay about US$1,000 and wait less than two weeks to have their genome sequenced [5]. Many other companies, such as 23andMe, offer even cheaper and faster genotyping services that tell customers about their ancestry, health, traits, etc. [6]. It has never been easier or cheaper for individuals to generate their genomic data, but this convenience also creates unprecedented risks.

Unlike blood test results, which have an expiration date, genomic data changes little over an individual’s lifetime and therefore has long-lived value [7]. Moreover, genomic data is highly distinguishable, and various scientific papers have shown that it is impossible to make it fully anonymous. For instance, Gymrek et al. (2013) demonstrated that surnames can be recovered from personal genomes by linking “anonymous” genomes to public genetic databases [8]. Lippert et al. (2017) also challenged current concepts of genomic privacy by showing that de-identified genomes can be identified by inferring phenotypic measurements such as physical traits and demographic information [9]. In short, once someone has your genome sequence, regardless of its level of identifiability, your most personal data is out of your hands for good — unless you could change your genome the way you would apply for a new credit card or email address.

That is to say, we as individuals have to acknowledge that just because our data is de-identified doesn’t mean our privacy or identity is secure. We must learn from linkage attacks and genomic scientists that what used to be considered anonymous might be easily re-identified using new technologies and tools. Therefore, we should proactively own and protect all of our data before, not after, our privacy is irreversibly out the window.

Unfortunately, existing laws and privacy policies might protect your data far less than you imagine.

Understanding how NOT anonymous your data really is, you might then wonder how existing laws and regulations keep de-identified data safe. The answer, surprisingly, is that they don’t.

Due to the common misunderstanding that de-identification can magically make personal data safe to release, most regulations, at both the national and company levels, do not cover data that doesn’t relate to an identifiable person.

At the national level

In the United States, the Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) protects all “individually identifiable health information” (or protected health information, PHI) held or transmitted by a covered entity or its business associates, in any form or medium. PHI includes many common identifiers such as name, address, birth date, and Social Security number [10]. However, it is noteworthy that there are no restrictions on the use or disclosure of de-identified health information. In Taiwan, one of the leading democratic countries in Asia, the Personal Information Protection Act covers personal information such as name, date of birth, ID number, passport number, characteristics, fingerprints, marital status, family, education, occupation, medical records, and medical treatment [11]. However, the Act likewise doesn’t clarify the rights concerning “de-identified” data. Even the European Union, which has some of the most comprehensive data-protection legislation, states in its General Data Protection Regulation (GDPR) that “the principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable” [12].

Source: Privacy on iPhone — Private Side (https://www.youtube.com/watch?v=A_6uV9A12ok)

At the company level

A company’s privacy policy is, to some extent, the last resort for protecting an individual’s rights to data. Whenever we use an application or device, we are compelled to agree to its privacy policy and express our consent. However, for some of the biggest technology companies, whose business largely depends on utilizing users’ data, privacy policies tend to likewise exclude “de-identified” data.

Apple, despite positioning itself as one of the biggest champions of data privacy, states in its privacy policy that Apple may “collect, use, transfer, and disclose non-personal information for any purpose” [13]. Google also mentions that it may share non-personally identifiable information publicly and with partners like publishers, advertisers, developers, or rights holders [14]. Facebook, the company that has caused massive privacy concerns over the past year, openly states that it provides advertisers with reports about the kinds of people seeing their ads and how their ads are performing, while assuring users that Facebook doesn’t share information that personally identifies them. Fitbit, which reportedly holds 150 billion hours of anonymized heart data from its users [15], states that it may share non-personal information that is “aggregated or de-identified so that it cannot reasonably be used to identify an individual” [16].

Overall, no government or company currently protects the de-identified data of individuals, despite the foreseeable risk of privacy abuse if and when such data gets linked back to individuals in the future. In other words, none of these institutions can be held accountable by law if de-identified data is later re-identified. The risks fall solely on individuals.

An individual should have full control over, and legal recourse regarding, the data he/she generates, regardless of its level of identifiability.

Acknowledging that advances in technologies like artificial intelligence make complete anonymity less and less possible, I argue that all data generated by an individual should be treated as personal data, regardless of its current level of identifiability. In a democratic, rule-of-law society, such a new way of viewing personal data will need to come from both bottom-up public awareness and top-down regulation.

As the saying goes, “preventing diseases is better than curing them.” Institutions should focus on preventing foreseeable privacy violations when “anonymous” data gets re-identified. One of the first steps can be publicly recognizing the risks of de-identified data and including it in data security discussions. Ultimately, institutions will be expected to establish and abide by data regulations that apply to all types of personally generated data regardless of identifiability.

As for individuals, who generate data every day, they should take their digital lives much more seriously than before and be proactive in understanding their rights. As stated previously, when supposedly anonymous data is somehow linked back to somebody, it is the individual, not the institution, who bears the cost of the privacy violation. Therefore, with ever more new apps and devices appearing, individuals need to go beyond blindly accepting terms and conditions without reading them, and acknowledge the degree of privacy risk to which they are agreeing. Non-profit organizations such as Privacy International, Tactical Technology Collective, and the Electronic Frontier Foundation are a good place to start learning more about these issues.

Overall, as we continue to navigate the ever-changing technological landscape, individuals can no longer afford to ignore the power of data and the risks it can bring. The data anonymity problems addressed in this article are just several examples of what we are exposed to in our everyday lives. Therefore, it is critical for people to claim and request full control of and adequate legal protections for their data. Only by doing so can humanity truly enjoy the convenience of innovative technologies without compromising our fundamental rights and freedom.


[1] Privitar (Feb 2017). Think your ‘anonymised’ data is secure? Think again. Available at: https://www.privitar.com/listing/think-your-anonymised-data-is-secure-think-again
[2] Privitar (Feb 2017). Think your ‘anonymised’ data is secure? Think again. Available at: https://www.privitar.com/listing/think-your-anonymised-data-is-secure-think-again
[3] A. Narayanan and V. Shmatikov (2008). Robust De-anonymization of Large Sparse Datasets. Available at: https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
[4] A. Narayanan and V. Shmatikov (2007). How To Break Anonymity of the Netflix Prize Dataset. Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=
[5] Helix. Support Page. Available at: https://support.helix.com/s/article/How-long-does-it-take-to-sequence-my-sample
[6] 23andMe Official Website. Available at: https://www.23andme.com/
[7] F. Dankar et al. (2018). The development of large-scale de-identified biomedical databases in the age of genomics — principles and challenges. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5894154/
[8] Gymrek et al. (2013). Identifying personal genomes by surname inference. Available at: https://www.ncbi.nlm.nih.gov/pubmed/23329047
[9] Lippert et al. (2017). Identification of individuals by trait prediction using whole-genome sequencing data. Available at: https://www.pnas.org/content/pnas/early/2017/08/29/1711125114.full.pdf
[10] US Department of Health and Human Services. Summary of the HIPAA Privacy Rule. Available at: https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html
[11] Laws and Regulations of the ROC. Personal Information Protection Act. Available at: https://law.moj.gov.tw/Eng/LawClass/LawAll.aspx?PCode=I0050021
[12] GDPR. Recital 26. Available at: https://gdpr-info.eu/recitals/no-26/
[13] Apple Inc. Privacy Policy. Available at: https://www.apple.com/legal/privacy/en-ww/
[14] Google. Privacy & Terms (effective Jan 2019). Available at: https://policies.google.com/privacy?hl=en&gl=tw#footnote-info
[15] BoingBoing (Sep 2018). Fitbit has 150 billion hours of “anonymized” health data. Available at: https://boingboing.net/2018/09/05/fitbit-has-150-billions-hours.html
[16] Fitbit. Privacy Policy (effective Sep 2018). Available at: https://www.fitbit.com/legal/privacy-policy#info-we-collect

By Hsiang-Yun L. on April 29, 2019.

Coase Theorem in the World of Data Breaches

“This is a really serious security issue, and we’re taking it really seriously… I’m glad we found this, but it definitely is an issue that this happened in the first place.”

— Facebook CEO Mark Zuckerberg

(after the company’s security breach that exposed the personal information of 30 million users.[1])

We now live in a world of data. Every single day, each one of us generates some very personal data about what we see, where we go, who we talk to, what we think and even who we are. Data is quickly becoming one of the most critical factors of production in the current market economy. Yet, it also brings negative externalities that cannot and should not be ignored for the market to function effectively. Many economists have proposed theories and tools to tackle the problem of externalities. In this article, I am going to specifically focus on the solution proposed by Ronald Coase in 1960, and show how the theory can be applied to the modern world of data.

When the Market Fails

Before diving into the Coase Theorem, we first need to talk about “externality”, which can be defined as a positive or negative consequence of economic activity on third parties [2]. An externality is considered a form of market failure, as it is a spillover effect of the consumption or production of a good that is not reflected in the price of that good [3]. That is, the market equilibrium fails to capture and reflect the real cost or benefit of the economic activity. Everyday examples of externalities include air pollution and cigarette smoking. Another classic example of a negative externality is described by Garrett Hardin in his paper “The Tragedy of the Commons”, which discusses how individuals tend to exploit shared resources until demand greatly outweighs supply and the resource becomes unavailable to everyone [4].

Pollution is a classic example of a negative externality.

Coase Theorem: Assigning Property Rights to Tackle Externalities

Prior to Ronald H. Coase, who was awarded the Nobel Prize in Economics in 1991, economists were prone to consider corrective government action the solution to externalities: for instance, setting numerical limits on activities with external effects (command-and-control regulation), subsidizing activities with positive externalities, or internalizing externalities through the price system (a Pigouvian tax). However, in his 1960 publication “The Problem of Social Cost”, Coase argues that there is a real danger that such government intervention in the economic system in fact leads to the protection of those responsible for harmful effects [5]. Instead, he suggests that the market can potentially solve the problem of externalities by itself if property rights are complete and parties can negotiate costlessly.

“We may speak of a person owning land and using it as a factor of production but what the land-owner in fact possesses is the right to carry out a circumscribed list of actions.”

— Ronald Coase (1960). The Problem of Social Cost.

To see how this economic theory can be applied to a real-world problem, let’s take a quick look into the Cap-and-Trade system.

Cap-and-Trade: A real-world application of Coase Theorem

Facing the global challenge of climate change, the European Union created the world’s first international Emissions Trading System (ETS) in 2005 with the goal of reducing greenhouse gas emissions. The EU ETS works on a cap-and-trade principle: a cap is set on the total amount of certain greenhouse gases that can be emitted by installations in the system. The cap is reduced over time so that total emissions fall. Within the cap, companies can receive or buy emission allowances, which they can trade with one another as needed [6]. In other words, the cap in effect represents the right to emit certain greenhouse gases, whereas the trading reflects the negotiations that Coase argues can lead to a more efficient market allocation.
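A small worked example, with made-up numbers, shows why trading allowances cuts emissions where it is cheapest. Suppose two firms must each cut emissions by 10 tonnes under the cap, but abatement costs them different amounts per tonne:

```python
abatement_cost = {"FirmA": 20, "FirmB": 50}  # cost per tonne abated (hypothetical)
required_cut = {"FirmA": 10, "FirmB": 10}    # tonnes each must cut under the cap

# Without trading: each firm abates its own share.
cost_no_trade = sum(abatement_cost[f] * required_cut[f] for f in abatement_cost)

# With trading: FirmB buys 10 allowances from FirmA at a price between
# their costs (say 30/tonne), and low-cost FirmA abates 20 tonnes instead.
allowance_price, traded = 30, 10
cost_a = abatement_cost["FirmA"] * (required_cut["FirmA"] + traded) - allowance_price * traded
cost_b = allowance_price * traded
cost_with_trade = cost_a + cost_b

print(cost_no_trade, cost_with_trade)  # 700 vs 400: same 20-tonne cut, lower total cost
```

The total emissions cut is identical in both scenarios; trading simply reallocates it to the firm that can abate most cheaply, leaving both firms better off (FirmA nets 100 instead of 200, FirmB pays 300 instead of 500). This is the Coasean bargain in miniature.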

“Trading brings flexibility that ensures emissions are cut where it costs least to do so. A robust carbon price also promotes investment in clean, low-carbon technologies.”

— The European Commission

According to the EU, the ETS has shown good results: the cap on emissions from power stations and other fixed installations is being reduced by 1.74% every year between 2013 and 2020 [7], and emissions in 2030 are projected to be 43% lower than in 2005 [8].

Coase Theorem in the World of Data Breaches

Living in the age of big data, we find data breaches increasingly common in our daily lives. According to the Identity Theft Resource Center, the number of significant breaches at US businesses, government agencies, and other organizations reached 1,300 in 2017, compared to fewer than 200 in 2005 [9]. This increase is partly due to the fact that the world’s volume of data has grown exponentially over the past decade, giving cybercriminals greater opportunity to expose massive volumes of data in a single breach [10].

Although a data breach is normally defined as an “incident” in which information is stolen or taken from a system without the knowledge or authorization of the system’s owner [11], I suggest viewing data breaches (especially those involving personal information) as a modern form of negative externality. When the data that institutions collect from individuals to run their businesses gets breached, individuals suffer the spillover effects in the form of privacy and financial losses. Yet the liability for such harm is not clearly defined and is therefore not taken into account within the market mechanism.

“We found no evidence that any developer was aware of this bug, or abusing the API, and we found no evidence that any Profile data was misused.”

— Google Official Statement disclosing the data leak affecting up to 500,000 accounts [12].

Take Facebook’s security breach in September 2018 as an example. 30 million people (more than the entire population of Australia) had their names and contact details leaked, and 14 million of them also had sensitive information (including gender, locale/language, relationship status, religion, hometown, self-reported current city, birthdate, education, work, etc.) exposed to the attackers [13]. Given the significant harm this “incident” brought to people’s privacy, what Facebook did was apologize, saying that it was “a breach of trust” and that they “promise to do better” for the users [14]. Yet no matter how sincere those apologies might be, they cannot and will not solve the core of the problem.

Data breaches cause great harm to society as well as individuals. However, such negative externalities are not well captured and reflected in the market.

This is not, however, to suggest solving data breaches with one-size-fits-all government regulation, because, according to Coase, we need to recognize the reciprocal nature of the problem. That is, a data breach cannot happen without Facebook failing to secure its data, but it also cannot take place without users willingly entrusting their data to the platform. So what is missing here, based on the Coase Theorem, is a clear definition of the rights to data.

In the World of Data where Property Rights are defined and defended

Based on the Coase Theorem, the property rights to data refer, in fact, to the rights to carry out a circumscribed list of actions. Examples of such actions include:

  • Right to control access to one’s data
  • Right to monetize one’s data
  • Right to donate/give away one’s data
  • Right to defend the privacy of one’s data
  • …….

When the above rights to data are clearly defined, individuals are empowered with legal recourse and bargaining power against a “data breach incident” that infringes their rights. In the case of Facebook, for example, users would be able to confront Facebook in court for failing, intentionally or not, to defend their data privacy and to use the data only for permitted purposes. Even before a data breach breaks out, which seems inevitable for centralized data storage, users could already negotiate terms with Facebook for the potential risks that Facebook exposes them to. Facing such confrontation and consequences, Facebook would be forced to better capture the costs and risks it bears when storing and utilizing its users’ data. This might lead to a change of business model for Facebook, or to a new user-platform relationship in which Facebook openly compensates users for the risks they are exposed to.

In short, as argued by Coase, once the rights to data are clarified, parties can openly negotiate terms and compensation for the resulting negative externalities, just as we do with greenhouse gases, and thereby reach a better market equilibrium.

Baby step at a time to tackle market failures in the world of data

The Facebook data breach is not the first of its kind, and unfortunately will not be the last. In fact, data breaches are expected to become more frequent, bigger, and more expensive in the near future. Therefore, although the Coase Theorem, like all economic theories, has its limitations in real-world application, it still sheds light on how defining the rights to data can be the first step toward solving digital-world negative externalities such as data breaches and enabling a better-functioning market mechanism in the long term.


[1] The New York Times (Sep 2018). Facebook Security Breach Exposes Accounts of 50 Million Users. Available at: https://www.nytimes.com/2018/09/28/technology/facebook-hack-data-breach.html

[2] Quickonomics. Positive Externalities vs Negative Externalities. Available at: https://quickonomics.com/positive-externalities-vs-negative-externalities/

[3] Intelligent Economist. Introduction to externalities. Available at: https://www.intelligenteconomist.com/externalities/

[4] Investopedia. Tragedy Of The Commons. Available at: https://www.investopedia.com/terms/t/tragedy-of-the-commons.asp

[5] Coase, Ronald (1960). The Problem of Social Cost.

[6] European Commission. EU Emissions Trading System (EU ETS). Available at: https://ec.europa.eu/clima/policies/ets_en

[7] European Commission. EU Emissions Trading System (EU ETS). Available at: https://ec.europa.eu/clima/sites/clima/files/factsheet_ets_en.pdf

[8] European Commission. EU Emissions Trading System (EU ETS). Available at: https://ec.europa.eu/clima/policies/ets_en

[9] Priceonomics. Why Security Breaches Just Keep Getting Bigger and More Expensive. Available at: https://priceonomics.com/why-security-breaches-just-keep-getting-bigger-and/

[10] Digital Guardian (Jan 2019). The History of Data Breaches. Available at: https://digitalguardian.com/blog/history-data-breaches

[11] Trend Micro. Data Breach. Available at: http://www.trendmicro.tw/vinfo/us/security/definition/data-breach

[12] Google (Oct 2018). Project Strobe: Protecting your data, improving our third-party APIs, and sunsetting consumer Google+. Available at: https://www.blog.google/technology/safety-security/project-strobe/

[13] Facebook Newsroom (Oct 2018), An Update on the Security Issue. Available at: https://newsroom.fb.com/news/2018/10/update-on-security-issue/

[14] The Verge (Mar 2018). Mark Zuckerberg apologizes for Facebook’s data privacy scandal in full-page newspaper ads. Available at: https://www.theverge.com/2018/3/25/17161398/facebook-mark-zuckerberg-apology-cambridge-analytica-full-page-newspapers-ads

By Hsiang-Yun L. on February 26, 2019.

I can’t put the toothpaste back in the tube, what I found when I downloaded my Facebook data

Above is the first post ever written on my Facebook wall. I’m sure it was intended as an innocent, warm welcome, but now it reads more like someone escorting me into my own personal version of the “Hotel California” line: “You can check out any time you like, but you can never leave…”

I’m not so sure Facebook is such a wonderful addiction anymore. Since writing this post I’ve been off Facebook for two months and I want to make a commencement speech to all the youth out there called “It Gets Better.” Perhaps it will be subtitled: “I Didn’t Realize How Much Anxiety Social Media Gave Me Until I Quit It Entirely.”

I recently downloaded all of my Facebook data in preparation to delete my account. I’m not a super-user, I’m not addicted to it. I have genuinely loved using Facebook to connect with my personal network for 11 years, but it’s come to feel like the tide has shifted and now Facebook is using me. Namely Facebook is using my data against me — collecting my every move, likes, interests, conversations, all without my explicit consent or knowledge of how my data is shared, and with whom — I’m finally, really, actually not okay with that.

I have worked in marketing, advertising and public communications for the last 18 years (read: I know how to target communities with FB ads in my sleep). I have been a community organizer for various organizations in my local community (read: I’ve created loads of FB events and FB campaigns). I live in Silicon Valley (read: I’m an early-ish adopter). Plus I have an insatiable curiosity when it comes to stories, people and social trends. What all of this adds up to is I’ve loved Facebook — as a personal and professional tool to get connected to headlines, to culture, to people, to trends, to news, to events, to stories, to gossip, to all of it. Facebook has been a great platform for me to get the connection I desire: I can get the scoop, I can share my scoop, and I can freely lurk around the scoops that everyone else is talking about.

This year something changed. I began to realize the vast amounts of information that Facebook tracks, uses and shares with others. Sure, everyone has had the experience of doing a browser search for … I don’t know … “Best brand of skinny jeans” and then 2 minutes later, there are 5 different brands of skinny jean ads showing in your FB feed. Those targeted ads seemed creepy but mostly harmless, until I realized that’s just the surface. It dawned on me how much information I’ve freely handed to a massive private company, and I am just beginning to comprehend the dangers of not being able to put that toothpaste back in its tube.

Facebook has gathered years of my daily activities, online and offline (here’s a list of 98 points it tracks), and spread it across advertisers, companies, websites, lists, etc.

After doing a tiny bit of research on best practices for deleting a Facebook account, I promptly found that I can download all of my Facebook data.

Sifting through this download was astonishing. Remember, I’ve spent years of my life helping brands advertise on FB, I understand how to leverage the information folks list in their profile to target ads, but I was not prepared for what I found in my downloaded data.

Facebook stores literally everything that makes up your profile, everything on the devices you connect from, every move you make, from start to finish:

  • Of course there are all the videos, photos, birthday messages, direct messages, and ephemera you’ve uploaded. That’s great, thanks for holding those for me FB!
  • There is a log of all the times you signed in, including the IP address and the geolocation from which you logged in. This means it always knows where you are. Okay, fine, what’s the rub? Well, FB can make logical leaps based on the intersection between the things you “Like” on the platform and your location: stores you walk into, businesses you frequent, the gym, the market, the office, the bar, the ____ you’re in every day. That’s a bit stalker-ish.
  • There is a list of all of the personal contact information that has ever been on a device from which you’ve connected to Facebook. (This is where it starts to get super creepy.) My FB data download has every single phone number I’ve ever had in my phone, any of my phones, any of my devices. I know for a fact that I’ve never listed my phone number on Facebook (privately or publicly), not once. And yet, in my FB data download, I found contact information for all my friends and family members, not to mention loads of people I don’t actually know or care to know: my old landlords, my neighbor’s dad, the lady who bought my table from a Craigslist ad, the nail salon I went to once, phone numbers I know I’ve deleted from my contacts. Literally every number is still stored in FB.
  • There is a giant list of all the apps you’ve ever connected to Facebook. Obviously it has a log from Spotify, and with that how many times I’ve listened to Billy Joel over the last 10 years, #notevenalittlesorry. But I don’t even recognize most of the app names in my Facebook data, and now I know all of these companies have my personal information, too. With some more digging (and research) I’ve learned that this web of my personal data extends even further: the apps your “friends” are connected to also collect your information. Even if you rarely use FB or are strict about how you connect your profile, any of your “friends” might be giving your info out without you knowing it.
  • There is a longer list of all of the ads you’ve ever clicked. Honestly, I don’t remember clicking on a single ad while on FB, and yet the list in my data is loooooong.
  • Get this, there’s an even bigger list of all of the companies and advertisers who have your demographic and personal information. I don’t know half of the companies listed in my data, but apparently they know me! And with undisclosed data leaks in the news lately, this is alarming.

Some people say, “But I don’t have anything to hide! I don’t care if FB knows where I live.” You may not care right now, today. But this is an unprecedented amount of permanently undeniable information that we have about ourselves. When teenagers used to talk on the phone with each other all night there weren’t private companies tracking the content, location, diction, or patterns of their chats. This is a new age, and we have no idea in what ways this information will be used by Facebook in the future. What if this information is used against you 30 years from now, in a court of law, where you’re the main suspect in the murder of a coffee shop barista, only because you went to that coffee shop every day for two years straight, which FB knows because you always check FB when you’re waiting for your coffee? But you didn’t even know the barista who was murdered and you coincidentally stopped going two days after he was murdered because a better coffee shop popped up two blocks in the other direction. Maybe that’s far-fetched, but it doesn’t take much to imagine a circumstance when you will find yourself wishing you’d limited how much traceable information is available and beyond your control.

Currently, there are no settings that allow me to limit what data is collected, how, why, or when it is shared, and no control over whom it is shared with. This is the number one reason that I’m leaving Facebook: I’m not allowed to control which or how much data is collected from my profile.

Now, I’m no privacy vegan (thank you, Eva, for the term) — like I said, connection is key to my life. I don’t want to live off the grid, I don’t want to go away, I don’t want to miss precious pictures of my many nephews who live in disparate parts of the country, but I do see the speed train we are on here, heading toward a data economy, and I definitely want to be able to choose how, when and by whom my data is collected. I want a digital environment where I can exercise control over the data that I generate, over the conversations, preferences and interests I have. I’m not comfortable with a platform that knows so much about me and my daily habits and also uses that information at their private-multimillion-dollar-company whim.

This is my number two reason to leave Facebook: The data that FB collects is used without my knowledge or consent in whatever way the company chooses to use it.

Reading Facebook’s terms of service and privacy policy is disheartening. It states, “We use all of the information we have to help us provide and support our Services.”

Let me reiterate: We use. All. The information. We have.

There it is: their license to freely share, use, collect, keep, destroy, sell, and manipulate my data, however they choose.

There’s one more reason that I gotta disconnect: Facebook has partnered with so many websites and data mongers that whether you are logged in or not, Facebook is constantly tracking your every move. “It’s alerted every time you load a page with a ‘Like’ or ‘share’ button, or an advertisement sourced from its Atlas network,” says the Washington Post. The Facebook eyes are everywhere, and I don’t want them watching me anymore.

Case in point, while I was researching and writing this article, I was served many “security” posts from Facebook:

Facebook, I’m sure, is set to serve these “security” notices when it notices you’ve been Googling “How to delete Facebook” and “What data does Facebook collect” and “Privacy policy, Facebook.” But these notices aren’t about security, or data, or the platform. They are about how to control who in your network can see what, which is a very superficial version of “privacy controls” and not enough for me.

Getting the latest scoop is certainly not worth the privacy concerns I have, imho.

Of course Facebook may not be the biggest or worst culprit of data manipulation, but surely it is in the top echelon because of the sheer size, reach, and global influence of the company. For me, closing this digital chapter is step one in choosing to live a more empowered digital existence. Seriously, it gets better.

By Maureen Walsh on January 05, 2018.

What to expect when you’re using Bitmark

In modern society, freedom and economic prosperity means property ownership and accumulation of wealth. But increasingly the stuff we create and value most exists only in our digital lives, where there’s no system for individual ownership. The Bitmark mission is to empower universal digital ownership so we can live free online. To get there, we are making simple tools that help you to interact with the Bitmark property ownership system. These tools will enable a new level of control and freedom over your digital life. You can:

  • Buy digital works, directly from their creators, that give you — the owner — full rights to use, sell, and trade.
  • Donate your health data, trapped in your devices, to researchers advancing the frontiers of public health.
  • Begin a digital estate to protect your online legacy and wealth.

How to get started

Bitmark will be easily folded into your everyday patterns. We’ve integrated with IFTTT to make asserting ownership over your digital life seamless:

If this, then bitmark

Once you’ve established a Bitmark account, quickly connect it to IFTTT and choose which applets will help you most. Some popular options involve collecting your photos across social media and storing them in one simple place. Your articles can be bitmarked and shared across the web with protected attribution. Or, you can donate the health data from your phone to aid researchers advancing public health.

When your personal data and digital assets can be automatically converted into your own property, the opportunities are almost endless!

When you’re interested in transferring your data to someone else, or perhaps someone asks to receive your digital property, the steps to get there are very simple. Since the individual is central to the Bitmark mission, we use the most direct interface possible. Here is an example of a photograph entitled Dune in California that Xarene created, then shared over Instagram and bitmarked (assigned property ownership to it) using IFTTT:

Everyone in the world can view the bitmark for this property:

A bitmark contains all the important information necessary to authenticate the ownership claims over a digital property. It also includes the provenance: the full ownership history of the property.
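As a rough illustration of the idea (the field names here are my own assumptions, not Bitmark’s actual schema), a bitmark record might look something like this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TransferRecord:
    """One link in the chain of ownership."""
    previous_owner: str  # account that signed the property away
    new_owner: str       # account receiving the property
    signature: str       # previous owner's signature authorizing the transfer

@dataclass
class Bitmark:
    """A property title for a digital asset (illustrative only)."""
    asset_fingerprint: str  # hash identifying the underlying data
    issuer: str             # account that first registered the asset
    provenance: List[TransferRecord] = field(default_factory=list)

    def current_owner(self) -> str:
        # With no transfers yet, the issuer still owns the property.
        return self.provenance[-1].new_owner if self.provenance else self.issuer

# Usage: register a property, then transfer it once.
b = Bitmark(asset_fingerprint="sha3:ab12...", issuer="xarene")
assert b.current_owner() == "xarene"
b.provenance.append(TransferRecord("xarene", "collector01", signature="sig..."))
assert b.current_owner() == "collector01"
```

Because every transfer appends to the provenance list, anyone can walk the chain back to the original issuer to verify a claim of ownership.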

Is Bitmark secure?

A good way to understand security is by asking “Who is in charge of security?” The Bitmark system was designed to put individuals in control.

Your data is first encrypted and then stored in cloud storage that you control. That is totally private. The property title for your asset, known as the bitmark, is stored in the public Bitmark blockchain. That is public so it can be authenticated by anyone.

When you want to transfer ownership of your property (e.g., sell, donate, or bequeath it), technically you are transferring ownership of the bitmark to a new owner, who will then have rights to the data. Transfer records form a “chain of ownership” acting as a public record that protects both parties to the transaction. Once the ownership transfer is recorded in the blockchain, the data is re-encrypted for the new owner and then transferred to them.
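A minimal sketch of that flow, with toy stand-ins for the real cryptography and ledger (none of these names come from Bitmark’s actual API, and the XOR cipher is only a placeholder for real public-key encryption):

```python
import hashlib

def toy_encrypt(data: bytes, owner_key: str) -> bytes:
    """Stand-in for real encryption: XOR with a key-derived stream."""
    stream = hashlib.sha256(owner_key.encode()).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))

def toy_decrypt(blob: bytes, owner_key: str) -> bytes:
    return toy_encrypt(blob, owner_key)  # XOR is its own inverse

ledger = []  # stand-in for the public Bitmark blockchain

def transfer(bitmark_id: str, blob: bytes, old_key: str, new_key: str) -> bytes:
    # 1. Record the ownership transfer in the public ledger.
    ledger.append({"bitmark": bitmark_id, "from": old_key, "to": new_key})
    # 2. Re-encrypt the private data for the new owner.
    plaintext = toy_decrypt(blob, old_key)
    return toy_encrypt(plaintext, new_key)

# Usage: Alice's encrypted photo becomes readable only with Bob's key.
photo = b"dune-in-california.jpg bytes"
alices_copy = toy_encrypt(photo, "alice-key")
bobs_copy = transfer("bitmark-001", alices_copy, "alice-key", "bob-key")
assert toy_decrypt(bobs_copy, "bob-key") == photo
assert len(ledger) == 1
```

The split mirrors the design described above: the transfer record is public and verifiable by anyone, while the data itself stays encrypted and readable only by the current owner.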

You don’t need to know how blockchains work to use Bitmark. Conceptually, what’s important to understand is that blockchains allow for the exchange of value without central intermediaries. This is something that was thought to be impossible. It’s why Bitcoin, the blockchain for money, is significant: it’s money, issued by the people (not governments or corporations), that can’t be forged or censored. Similarly, Bitmark, the blockchain for property, is significant because data can become property, owned, stored, and transferred by individuals freely as they choose.

What is this all for?

At the simplest level, property is provenance: the “chaining of ownership” from the present all the way to the origin. The ability to demonstrate clean title is what protects one’s investment in a property by guaranteeing strong provenance.

In today’s digital economy, individuals create so much value, yet we have little to no ownership over the content we create and the data we generate.

Bitmark’s tools define a new digital freedom by providing an economic framework of standardized property rights, rules, and infrastructure.

The system will allow individuals to derive value from digital property just as we do from the things we own in the physical world, including selling, buying, transferring, donating, licensing, passing down, protecting, and much more.

When you create your Bitmark account, you will be part of the digital revolution: you will own your piece of the digital economy. Once you’re in the network you will be able to:

  • Feel relief if you’re an artist because your images, artwork, doodles, messages, prints, photos, and work will be yours and yours alone.
  • Take comfort in the security that your digital files, statements, contracts, and information will be safe and tracked if something should happen to you.
  • Have peace of mind that you can legitimately share the pieces of your digital world with anyone you choose to and that no one can take that away.
  • Feel like the cool kid on the block because you have just stepped onto the cutting edge of the new digital infrastructure.

Join us!

Here’s our IFTTT service: https://ifttt.com/bitmark

By Bitmark Inc. on June 15, 2017.

UC Berkeley and Bitmark partner to bring data donation to public health studies


Bitmark technology allows users to take ownership of their digital lives and help advance the frontiers of public health

Our phones and Fitbits track our steps, calories, sleep cycles, and more. This data is empowering and helps improve our wellbeing. This data can also aid research in myriad areas. What if you could safely donate your data directly to those who are advancing the frontiers of public health?

Today, we are extremely excited to announce our partnership with UC Berkeley School of Public Health to explore how to accomplish exactly that.

Bitmark will fund two School of Public Health research fellows to conduct studies that securely incorporate personal data from our phones and other devices. When students return in the fall semester, they will have the opportunity to transition from passive internet users to active participants, taking ownership of their digital lives, and helping to advance public health.

“The School of Public Health at UC Berkeley is excited to partner with Bitmark Inc. on this research fellowship. It is a great opportunity for our young researchers to gain valuable hands-on experience at the intersection of public health and technology.”

— Lauren Goldstein, PhD, Director of Research Development, School of Public Health, UC Berkeley.

Announcing the Berkeley fellows!

Bitmark is pleased to announce Madelena Ng and Victor Villalobos as the fellows who will be using Bitmark technology in their future studies as part of this new partnership. In addition to funding support, Bitmark has committed to providing dedicated engineering resources to work closely with fellowship recipients to help them realize their research goals. Here are the brief abstracts about each of their research project plans:

Study 1: Ameliorating Recruitment Challenges for Women’s Health: Madelena Ng

Madelena is a doctoral student in Public Health. Her research aims to evaluate whether digital health technologies alleviate existing challenges in clinical research.

Objective: This study assesses whether recruitment and data collection into a women’s health focused digital study is optimized by leveraging Bitmark’s capability with securing personal data ownership.

Hypothesis: Physical activity and quality of sleep are consistently reported to promote better overall health; we hypothesize these factors are positively correlated with a telling element of women’s health — regular menstrual cycles. In addition, we hypothesize that educating potential participants about data ownership and the Bitmark app will lead to improved participation and participant experience in the proposed digital study.

Methods: Personal health data will be sourced entirely from digital health technologies, specifically Fitbit and Clue, to assess the effects of health behaviors (e.g., physical activity and sleep) on the menstrual cycle. In addition, eligible participants will be randomly assigned to either receive or not receive an education module about data ownership and the goals of the Bitmark app.

Study 2: Improving Diabetes Care Protocols by studying remission cases: Victor Villalobos

Victor is a doctoral candidate in Public Health; his expertise is in behavioral design, biostatistics, and lifestyle interventions.

Objective: The objective of the diabetes remission registry is to refine and improve diabetes care protocols through the study of diabetes remission cases.

Methods: Participants will be recruited through digital and traditional channels. After informed consent and verification of clinical improvement, we will apply qualitative research instruments to document the natural history of their remission. With the use of Bitmark, participants will be able to share detailed information regarding their lifestyles — i.e. dietary composition and frequency; physical activity intensity, duration, and frequency; sleeping patterns — and physiological indicators collected through their smartphones and connected devices (e.g. weight scales, wrist heart-rate monitors).

Expected Results: We expect to generate insights on the dietary, physical activity and psychological strategies that increase the probability to achieve and maintain diabetes remission.

Further details of this research can be found at diabetesremission.org.

How the studies will work

Bitmark is developing simple tools that connect researchers to potential data donors through popular messaging apps such as Facebook Messenger and WeChat. These tools, known as “bots,” automate the entire donation process:

  1. discovering available studies,
  2. extracting personal data and converting it into digital property,
  3. recording consent such that a researcher can use the valuable digital property in their study.

Berkeley students will know exactly where their data is being used and for what purposes; researchers can directly confirm the provenance of data and the students’ consent to use it. Behind the scenes, the Bitmark bot interfaces with the Bitmark blockchain to provide a verifiable record of data donations, protecting both the researcher and data donor, without relying on central intermediaries.
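The three steps above might be strung together roughly like this (every function and value here is a hypothetical placeholder for illustration, not the real bot’s API):

```python
def discover_studies():
    """Step 1: list studies currently accepting donations (hardcoded here)."""
    return [{"id": "womens-health", "needs": ["fitbit", "clue"]}]

def bitmark_personal_data(source: str, raw: bytes) -> dict:
    """Step 2: convert extracted personal data into a digital property record."""
    return {"source": source, "size": len(raw), "bitmark_id": f"bmk-{source}"}

def record_consent(donor: str, prop: dict, study_id: str) -> dict:
    """Step 3: record consent so the researcher can verify provenance."""
    return {"donor": donor, "bitmark_id": prop["bitmark_id"], "study": study_id}

# Usage: one complete donation, end to end.
study = discover_studies()[0]
prop = bitmark_personal_data("fitbit", b"steps,sleep,...")
consent = record_consent("student42", prop, study["id"])
assert consent["study"] == "womens-health"
```

The key property of the real system is that the consent record, like the ownership record, lands on the public blockchain, so both donor and researcher can verify it independently.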

Studies will collect two main categories of data:

  1. iOS HealthKit data — such as characteristics (birth date, blood type,…), basic samples (height, weight, body fat,…), sleep samples, food samples (calories, vitamins,…), exercise samples (steps, flights climbed,…) and reproductive samples.
  2. Health tech wearables, devices, and sensor data from over 300 different data streams — such as Fitbit, Nest, Aware, and more.

Individuals can also ask the Bitmark bot simple questions such as, “How is my data being used?” and get back instant answers. At any time, participants can opt out of donating data.

About Bitmark

Earlier this year, Bitmark launched its technology in private beta to allow all individuals to own and share their digital data, and take advantage of the value they create online. Currently our personal data is being held in our smartphones, fitness tracking devices, and more; with Bitmark, individuals have the freedom to share their data with other individuals, companies, non-profits, schools, and more.

Bitmark is still in a private beta; if you would like to keep up with when the technology will be publicly released, please sign up to receive infrequent emails here.

By Bitmark Inc. on May 23, 2017.