What are the most interesting insights you've uncovered by tracking our government officials' browsing activities?
So far we don’t have enough data to be able to say anything concrete (at least not publicly).
However, the coolest insights for me would be tracking major news sites. We would be able to see where politicians are getting their news, and potentially how that informs the actions of our federal government as a whole.
It would also be really cool to get the tool onto social media sites like Facebook, along with a ton of forums and blogs, to try to paint a picture of what kinds of things our representatives do in their free time. It could paint a more human picture of the most powerful people in the country.
When and how do you plan on releasing the data you have gathered?
We are actually treading carefully around what data we release and how we release it.
There are a few lines of reasoning behind this:
We want to balance releasing timely insights about our data with the ability of Congress to use that to identify and work around our collection methods.
We want to limit our legal exposure from the different methods of releasing the data.
We want to determine whether there are any ways that releasing certain data could undermine our fundamental message. (We are trying to shed light on the ISP Privacy Law that passed in April.)
We want to stay low key and gather as much data as possible to get a more complete picture. Similar to how investigative journalists frequently operate-- if they find a potentially interesting story, they don't publish immediately. They'll keep pulling threads while staying low key to try to get as complete a picture as possible.
Is this legal? Will you get in trouble for this?
We've had a lot of help from really cool and awesome legal experts, such as Ben Wizner of the ACLU and Anne Klinefelter of UNC, to name a few. What we've found is that our exposure in this project is minimal, and in some cases, it's the sites that use the plugin that may be at risk.
Can you really tell anything from all the random/raw data you collect?
The cool thing about having raw data is that once we hit a critical mass, it becomes exponentially easier to learn interesting information from the data.
We don't want to release this data and then have the internet collectively weaponize this data against our representatives. So, we need to be really careful with how we use it.
Are you going to make this tool open source, for everyone to use?
Yes. In fact, we will make it open source in a variety of languages and formats.
Right now, we are focused on testing the technical aspects of our tool and data management pipeline. We are trying to hit the sweet spot of a few different targets before we release it to the public:
2. It needs to be robust enough that it can't easily be disabled. This actually conflicts with our client-side only goals, because the more we allow for server-side implementation, the more it can circumvent methods to disable it.
3. It needs to leave as minimal a privacy footprint as possible. We want to avoid sucking in any non-government data, and that means putting a lot of code in the tool that makes it bulkier and more vulnerable to disabling.
So, it sounds like the tool would require cooperation from webmasters (and their developers) to implement at all, right? Have you had much success in getting sites to implement the client-side code?
There are a few hurdles to getting a site to implement a tool, but way more hurdles when getting a business to implement the tool on its site.
We're trying to hit a simplicity-robustness-privacy sweet spot, where we want to make a tool that's easy to install, hard to circumvent, and extremely selective in who it tracks.
How will you deal with paring down everyone's data into just the population you're seeking to collect on?
We are basically parsing the data through a series of filters similar to this:
Only Federal IPs
Some low-tech solutions that we found to easily filter out non-government data
Some slightly-higher-tech solutions that identify data that definitely belongs to Congress/FCC/White House Administration
Some classifiers that will be able to further segment the data the more we receive
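To make the idea of chained filters concrete, here is a minimal sketch in Python. It is not the project's actual pipeline: the IP ranges are placeholder documentation blocks (not real federal allocations), the `public_hotspot` flag and the record layout are hypothetical, and only the first two stages are shown.

```python
import ipaddress

# ASSUMPTION: placeholder ranges (RFC 5737 documentation blocks),
# standing in for whatever federal IP blocks the project really uses.
FEDERAL_BLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_federal_ip(record):
    """Stage 1: keep only records whose IP falls inside a listed block."""
    addr = ipaddress.ip_address(record["ip"])
    return any(addr in block for block in FEDERAL_BLOCKS)


def passes_low_tech_checks(record):
    """Stage 2 (hypothetical stand-in): drop records flagged as public hotspots."""
    return not record.get("public_hotspot", False)


def run_filters(records, filters):
    """Apply each filter in order, keeping only records that pass all of them."""
    for keep in filters:
        records = [r for r in records if keep(r)]
    return records


FILTERS = [is_federal_ip, passes_low_tech_checks]

sample = [
    {"ip": "192.0.2.17", "url": "https://example.com/news"},
    {"ip": "203.0.113.5", "url": "https://example.com/blog"},
    {"ip": "198.51.100.9", "url": "https://example.com/forum", "public_hotspot": True},
]

kept = run_filters(sample, FILTERS)
for record in kept:
    print(record["ip"], record["url"])
```

Later stages (the higher-tech checks and the classifiers) would simply be more functions appended to `FILTERS`, so each one only ever sees data that survived the stages before it.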
Isn't this mostly just tracking what underpaid interns are doing while they're supposed to be running to Starbucks?
Yes and no.
Just because interns exist doesn't mean reps are immune from tracking. The irony here is that the ISP privacy law was based on the legal argument that ISPs are not utilities, and so are exempt from regulations that apply to utility companies.
If that's so, then Congress should be able to work around having to use the internet (and being tracked on it) in the same way they expect us to, and not have it impact their job.
If they can't, then it's a pretty clear indication that ISPs are providing a public utility, and should have to safeguard our data in the same way utilities do.
What we want to do is narrow the scope of our data to Reps and their offices. Especially since much of the policy research is done through their staff, it’s important we try to paint a complete picture of the behaviors of their administrations.
So, what is the ethical justification for Congress Web History? Isn’t this morally wrong somehow?
In order to answer that question (both for you and for myself), I needed to try to pin down the set of axioms that govern my understanding of the right to privacy, and the role the gov't plays in protecting it. To do that, I needed to lay out what my beliefs about privacy were, at least as they apply to Members of Congress, figure out whether those beliefs were logically consistent, and then try to validate why each of those beliefs worked.
So here is my explicit understanding of our right to privacy on the internet as it relates to the question at hand.
1. All private citizens have a right to internet privacy.
2. Private citizens, when acting on behalf of another entity, may give up some of their rights to privacy to that entity.
3. However, the concession is limited in scope to the activities performed on behalf of the second entity-- that is to say, if a private citizen is acting on behalf of another organization, that organization does not have the right to view the activities of that citizen when they are acting privately.
A concrete example of 3 would be: if you are using a work computer at work, your employer may be able to view your activities at work. However, that doesn't give them the right to see what you are doing on your computer at home, on your own time.
I work a day job, and while at that job, I understand that everything I'm doing there is monitored. I am ethically fine with that, because while I'm at work, I am acting on behalf of my employer, and not as a private individual, so I do not have a right to privacy separate from that of my employer.
I don't think everyone shares that same viewpoint; however, I think most would agree that this is at least a consistent/coherent philosophy of internet privacy.
Coming back to the broader point about Congress-- when a representative goes to work on Capitol Hill, they are not acting as private individuals. They are acting as public officials and representatives of their constituents, and the federal government. As such, they do not have a right to privacy separate from that of the Federal Government. And the general consensus (and the law) is that, roughly, the activities of the Federal gov't should be as transparent as possible, insofar as that transparency does not put the United States at risk.
As a result, their browsing habits are not protected by their individual right to privacy, because they are not acting as private citizens during that time. Which means that their browsing history is something that can (and many would say should) be available to the general public, without it being a violation of privacy.
However, there are limits to this, as per item 3. Just because we are tracking their browsing patterns on Capitol Hill does not grant us the right to then try and track their online behavior when they go home--they are no longer acting on behalf of the U.S. government, and so their private right to privacy protects them from us snooping through their browsing patterns.
Are you affiliated at all with the Congress-Edits Twitter account that tracks Wikipedia page edits from Congress? If not, are you aware of their account/code?
We are not affiliated, but yes, we are aware. Ed Summers is one of the inspirations for this project, and we have been in dialogue with him about our own project. Being able to look through his data has helped us better understand what we should track.
How do you separate out the data that comes from publicly available hotspots?
A combination of things-- not all of which we want to talk about. However, one way is that the IPs within federal blocks will be categorized into 3 groups: private, public, and mixed. We have a number of techniques to figure out which is which, but we'll give you a hint for one: when do Congressional offices close to the public? ;)
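Purely as an illustration of where that hint could lead, here is a rough Python sketch that labels an IP by how much of its traffic falls outside public visiting hours. This is a guess at one possible approach, not our actual method: the opening hours, the thresholds, and the function names are all made-up assumptions.

```python
from collections import defaultdict
from datetime import datetime

# ASSUMPTION: hypothetical public hours, 9:00-17:00 on weekdays.
PUBLIC_OPEN_HOUR = 9
PUBLIC_CLOSE_HOUR = 17


def is_public_hours(ts):
    """True if the timestamp falls inside the assumed public visiting hours."""
    return ts.weekday() < 5 and PUBLIC_OPEN_HOUR <= ts.hour < PUBLIC_CLOSE_HOUR


def categorize(hits, low=0.2, high=0.8):
    """Label each IP by its share of after-hours traffic.

    hits: iterable of (ip, datetime) pairs.
    low/high: arbitrary illustrative thresholds, not tuned values.
    """
    totals = defaultdict(int)
    after_hours = defaultdict(int)
    for ip, ts in hits:
        totals[ip] += 1
        if not is_public_hours(ts):
            after_hours[ip] += 1

    labels = {}
    for ip, total in totals.items():
        share = after_hours[ip] / total
        if share >= high:
            labels[ip] = "private"  # mostly active when the public is gone
        elif share <= low:
            labels[ip] = "public"   # essentially only active during open hours
        else:
            labels[ip] = "mixed"
    return labels


# Monday, 5 June 2017, using placeholder documentation IPs.
hits = [
    ("192.0.2.10", datetime(2017, 6, 5, 10, 0)),
    ("192.0.2.10", datetime(2017, 6, 5, 14, 30)),
    ("192.0.2.20", datetime(2017, 6, 5, 21, 0)),
    ("192.0.2.20", datetime(2017, 6, 5, 22, 15)),
    ("192.0.2.30", datetime(2017, 6, 5, 11, 0)),
    ("192.0.2.30", datetime(2017, 6, 5, 20, 0)),
]

labels = categorize(hits)
print(labels)
```

A single signal like this would be noisy on its own (staff also browse during the day), which is part of why it would only ever be one technique among several.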
Do you think that if members of Congress find themselves under the same lens they're trying to turn on us, they'll reconsider their views on privacy?
I can't remember who specifically this was, but someone on Quora said it really well:
We're probably not going to change the minds of any of the hard-core anti-privacy individuals (i.e. most of the people who voted for this law).
But in many ways, we're not trying to. We’re trying to educate around them. They may not change, but I'm sure there are plenty of people in their constituencies who do care and just haven't heard about what is going on. We are trying to reach those individuals, because if they know about this law and are angry about it, they can reach out to their reps about it.
And if their reps don't change, they can vote the reps out.