MIT-White House Big Data Privacy Workshop

Monday, March 3, 2014

As prepared for delivery

Judging by our turnout this morning, it is clear that the question of Big Data and privacy is not only very interesting and important, but it is so to a very broad range of people. This gathering includes leaders from industry, from universities, from government. We have here among us experts on policy, technology and politics. And our speakers will explore Big Data privacy issues in terms of health care and education, civil liberties and national security – and more.

And we need this diversity of perspectives, because the subjects we explore today affect the whole spectrum of society.

We are together this morning because President Obama has focused his Administration on finding the best path forward for the nation on the complex and urgent questions around Big Data.

This workshop is the first of three university-based events that the White House is co-sponsoring, to bring the most important issues to the surface. Our assignment is to look at privacy: How do we capitalize on Big Data’s potential for good, while maintaining essential privacy protections? And how do we design future technologies, policies and practices, to get that balance right for our society?

And we are here because MIT brings substantial strength in this field, strength magnified today with the insight and experience of our brilliant colleagues from other sectors and institutions.

With all this talent here today, we have been asked to help define the terms of the national conversation, to help set its direction and to raise its ambitions. Central to this conversation are the tensions around Big Data and privacy: the tremendous opportunities, and the profound and pervasive risks.

At this gathering of experts, on a day that promises superb speakers on these topics, I will be brief! But I want to offer one example of how those tensions affect our work at MIT. Not our work on Big Data, but our work in other domains where Big Data is a fact of life, from biomedicine and health care, to energy, to digital learning. In these fields and many others, figuring out how to use Big Data, for the most benefit, with the least risk of harm, presents a fascinating and deeply important challenge.

For instance, MIT is helping to push the frontiers of digital learning – both for a global audience of millions, and for our own students at MIT.

(Later this morning, we will hear from Anant Agarwal, president of edX, the open-source global learning platform that MIT and Harvard launched two years ago. I will just touch on the issues for the digital courses created by MIT faculty on the edX platform, which we call “MITx.”)

For us as educators, today’s digital learning technologies offer an incredible new opportunity to learn about learning. To give you a sense of scale and scope:

  • MIT has about 129,000 living alumni.
  • But in less than two years, MITx has attracted more than 760,000 unique registered learners, from more than 190 countries. And they have already generated more than 700 million records of student interactions with the edX platform.


We want to study the huge quantities of data about how MITx students interact with our digital courses. We want to measure what really works. We want to use what we learn to improve the way we teach – and to advance the science of teaching overall. And there is so much to learn that we also want others to be able to study our data: We intend for our MITx data to be constructed and curated as a public trust. All worthy goals, I believe.

But at the same time, we value privacy. And so does the federal government: MITx student data are governed by the Family Educational Rights and Privacy Act, or FERPA. (FERPA is the same law that says that students over 18 have the right to keep their academic records private – even from the parents paying their tuition!) That means that any efforts to use MITx data run into significant practical challenges, as well as serious ethical constraints.  

For instance, legally speaking, in a “Massive Open Online Course,” who counts as a “student,” with protections under FERPA? Those who register, but never view course content? Those who view about half of the course content? Those who explore the course deeply, but don’t take the final exam? Or only those who actually earn a certificate? Are all these subpopulations “students,” in the FERPA sense?

And, to allow for research, can their data be suitably de-identified? MITx classes include a public forum component. To date, MITx users have posted more than 423,000 forum entries. Their postings often include a great deal of personally identifying information. Correlating forum data with institutionally released de-identified data could produce serious breaches of personal information. How do we set the boundaries, and balance the competing interests?

If you believe in the potential of digital learning, you have to care about the larger question: How can we harness this flood of data to generate positive change – without destroying the very idea of privacy? Parallel questions hover over our work in field after field.

Fortunately, many people in this room are better qualified to answer these questions than I am, including our keynote speaker. So it is my pleasure to introduce someone who has been instrumental in shaping US policy on digital privacy since the 1990s – and the person President Obama has trusted to lead the national conversation on Big Data today.

From 1998 until 2001, John Podesta served President Bill Clinton as chief of staff, with responsibility for all White House policy development, daily operations, Congressional relations and staff activities. He coordinated the work of cabinet agencies around federal budget and tax policy. And he served in the President’s Cabinet and as a principal on the National Security Council.

Previously, he held positions of influence in Washington on a wide variety of subjects, including agriculture, patents, regulatory reform, telecommunications, security, terrorism, government information and privacy. A lawyer by training, Mr. Podesta is a visiting professor of law at the Georgetown University Law Center.

In 2003, he launched the Center for American Progress, a leading progressive think tank that focuses on “21st century challenges such as energy, national security, economic growth and opportunity, immigration, education, and health care.”

In 2008, President Obama chose John Podesta to lead his transition to office. After that assignment, Mr. Podesta returned to the Center for American Progress as president and CEO. Then last December, President Obama asked him to rejoin the White House team. In his current role as Counselor to the President, Mr. Podesta is driving forward a variety of key initiatives, from workforce education and the President’s Climate Action Plan to this new work around Big Data. 

We are very fortunate, and very grateful, to be able to count on his wisdom today. Unfortunately, this storm hit DC much harder than Boston, so Mr. Podesta will be joining us by phone. Please join me in welcoming the voice from DC of Counselor to the President, John Podesta.

*          *          *          *          *          *          *          *          *          *          *         

Good afternoon. Welcome back!

Our lunch conversations set the stage for this afternoon’s outstanding speakers. It is my pleasure to introduce the first of them.

But before I do that, I want to acknowledge two other special guests, whose presence illustrates how important these discussions are and increases the odds of our arriving at meaningful policy answers.

  • One is Massachusetts Secretary of Housing and Economic Development Greg Bialecki. As Governor Patrick’s chief advisor on business development, regulation and consumer affairs, he has been an active supporter of our policy work on Big Data.
  • And I also want to recognize a new member of the MIT community – former Acting Secretary and General Counsel to the US Department of Commerce, Cameron Kerry. Among his many achievements, he led the creation of the Consumer Privacy Bill of Rights and negotiated the key US data sharing and privacy agreements with the European Union and China. Cameron Kerry recently joined MIT’s Media Lab as a Visiting Scholar, where he works with Sandy Pentland on the evolving technology and policy issues around Big Data. We are honored and delighted to have him with us.


Now, a few words of introduction for our keynote speaker, US Secretary of Commerce Penny Pritzker.

Before taking office last June, Secretary Pritzker built a career as a civic and business leader, in industries from real estate and hospitality to financial services. She has served on many corporate boards, and is past executive chairman of TransUnion, a global financial services information company. And President Obama previously appointed her to serve on the Council for Jobs and Competitiveness, and the Economic Recovery Advisory Board.

Her civic interests focus on improving public education. A former member of the Chicago Board of Education, she is also past chair of the Chicago Public Education Fund, the first venture philanthropy to raise private equity to invest in public schools.

The Department of Commerce is the keeper of several big government data sets, and Secretary Pritzker has made unleashing the power of Commerce data a top priority of her “Open for Business Agenda.” At MIT, we also know Secretary Pritzker as a strong advocate for the national Advanced Manufacturing Partnership, known as “AMP 2.0.” It is my honor to serve her as co-chair of that effort – and it is my pleasure to introduce her now, on her first visit to MIT. Please join me in welcoming Secretary Penny Pritzker.