Unless you’ve been living under a rock, you’ll know by now that government agencies around the world are watching everything you do online, collecting this data and using it for various undisclosed purposes. Even before then, we knew that various private companies were harvesting data on us, and we could only hope that the worst they wanted to do was sell us things.
To say I wasn’t comfortable with this arrangement is something of an understatement.
So, I used all this as a spur to get my data out of NSA/Big Corporate controlled systems and onto FOSS based platforms that I own and control.
Starting Point
I was somewhat fortunate in my starting point. I had never bought into Gmail, so my email accounts are hosted on a private mail server, to which I connect over an encrypted link. My main server, which hosts, among other things, this website, is run off a private server in Germany.
My main computers at home are Linux based, and I already make extensive use of encryption: I use DNSCrypt to secure my DNS lookups from prying eyes, have HTTPS Everywhere and Adblock Plus installed on every browser, and secure my sites with HTTPS (made considerably more affordable by StartSSL’s provision of free SSL certificates). Private code is hosted on my gitolite (née gitosis) install rather than GitHub.
However, I still made use of services like Dropbox and Google Drive, talked on Google chat, and used Google Analytics for tracking.
The low hanging fruit…
The first thing I did was to grab and install a whole bunch of free certificates from StartSSL to remove the browser warnings from a bunch of the non-user-facing sites that I run. This was important because the browser warning encouraged people to click through errors, and since the site always generated an error (even though the traffic was being encrypted), nobody could tell the expected warning from one caused by an attacker, which left it very vulnerable to MITM attacks.
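One way to confirm that a site no longer relies on a click-through is to connect to it with strict certificate verification turned on. What follows is only a minimal sketch using Python's standard library, not the procedure actually used here; the function names strict_context and certificate_is_trusted are made up for illustration.

```python
import socket
import ssl


def strict_context() -> ssl.SSLContext:
    """Return a TLS context that refuses untrusted or mismatched certificates."""
    context = ssl.create_default_context()
    # These are already the defaults; they are set explicitly to make the intent obvious.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def certificate_is_trusted(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if the host presents a certificate the system trust store accepts.

    Hypothetical helper: any TLS or network failure is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with strict_context().wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        return False
```

Calling certificate_is_trusted with the host name of one of your own sites should return True once a certificate from a recognised authority is installed, and False while the site still presents one that browsers would warn about.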
Once this was accomplished I installed ownCloud, with the client software configured to talk only to the HTTPS endpoints. This was painless, and basically just a matter of downloading the server software and installing it on its own subdomain (the subdomain isn’t strictly necessary, but I like having things separate like that). The ownCloud client works exactly like the Dropbox one, and is available for Linux, OS X and Windows (there is also a paid-for one for iOS, presumably to drum up some money for the project, but it’s only a few pence).
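The "HTTPS only" rule is easy to enforce in any script that talks to the server as well. The following is a small sketch, assuming a hypothetical helper called require_https and a placeholder host name; it is not part of ownCloud or its client.

```python
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if it is an HTTPS URL, otherwise raise ValueError."""
    parts = urlsplit(url)
    # Reject plain HTTP, other schemes, and URLs with no host part at all.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"refusing non-HTTPS endpoint: {url!r}")
    return url


# cloud.example.org is a placeholder, not a real install.
endpoint = require_https("https://cloud.example.org/remote.php/webdav/")
```

Failing loudly on a plain http:// address means a mistyped URL can't silently downgrade the connection.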
Next, I started moving my sites away from Google Analytics. The open source world has moved a long way since I last looked at this, and Piwik, the best of breed, is very performant. Again, it was just a matter of installing the software on my server and then changing the embed code on the various sites. WordPress has a very functional plugin that integrates nicely with most themes.
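Under the hood, the embed code reports each page view to the Piwik server as an ordinary HTTP request. As a rough illustration of what such a request looks like, here is a sketch that builds one by hand. It assumes Piwik's tracking HTTP API as I understand it (a piwik.php endpoint taking idsite, rec=1, url and action_name); the host names, site id and function name are invented for the example.

```python
from urllib.parse import urlencode


def piwik_tracking_url(base_url: str, site_id: int, page_url: str, title: str) -> str:
    """Build the URL for a single page-view tracking request.

    Hypothetical helper; parameter names follow Piwik's tracking HTTP API.
    """
    params = {
        "idsite": site_id,     # numeric id of the site inside Piwik
        "rec": 1,              # tells Piwik to actually record the request
        "url": page_url,       # the page being viewed
        "action_name": title,  # the page title shown in reports
    }
    return f"{base_url.rstrip('/')}/piwik.php?{urlencode(params)}"


# stats.example.org and example.org are placeholders.
tracking_url = piwik_tracking_url("https://stats.example.org/", 1, "https://example.org/", "Home")
```

Because the request goes to a server you run yourself, the visitor data never leaves your own infrastructure.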
The last easy thing I did was to change my browser’s default search engine from Google to Startpage. I picked Startpage over DuckDuckGo (which is the other main alternative) for two reasons. First, the engine piggybacks off Google (but with identifiers removed), and despite Google profiling you for the NSA, they still built a damn good search engine. Second, as a US company based in Pennsylvania, DDG falls squarely under the sinister shadow of the US Patriot Act and FISA so, regardless of what they do now, they could still be forced to start spying.
Next, the harder stuff…
Update: while at the time of writing the events in the pressure cooker article, linked above, were believed to be the result of active surveillance on the part of Google, it now turns out to have been the result of an employer tipoff. Nevertheless, it seems highly unlikely that this honeypot of profiling data isn’t being actively monitored, given how much other stuff is, although at the moment we have no evidence. This is one of the things that makes the Snowden revelations so frustrating.
Methinks you did not read the pressure cooker article carefully. The article states it was employer tipoff, not Google profiling or government spying, that led to the cops showing up.
Hmm! I’m sure they stated google monitoring when I read the article, oh well thanks for the correction, will edit!
“Performant?” I thought. “Is that a word?” Turns out there’s a slew of evidence. Drafting suggestion for PERFORMANT adj. made!
Hah! Do I get credit? 😉
Brothers in arms in this war… One aspect of prism breaking that I’ve been mulling over for a very long time is search. I’ve been in personal dialogue with the core team at duckduckgo about integrating their search results into the Minds.com search (built on elasticsearch) to which they replied:
We don’t have syndication rights from our partners for the web results
and therefore can’t hand it out to other sites:
https://dukgo.com/help/en_US/company/partnerships
Overall they have been great in communicating but ultimately the project just isn’t free enough! Similarly startpage has privacy features but no API that I can see. Now, yandex has an API but it isn’t open source so even if we use that (which ddg does) then we still don’t know who’s watching us. Perhaps Minds could do something similar to what startpage does just so we can at least get SOMETHING of web results in our search but it isn’t ideal.
yacy.net is more pure decentralized search which I am looking at now but their api works with solr. Curious on thoughts.
Yes, search is a hard problem to crack, and centralised solutions are always going to be open to abuse.
In my ideal world, search would be entirely peer to peer and distributed, so something like yacy.net is something I’ll certainly be looking at for my own use. It’s a hard problem, and mass adoption would require any such technology to be easy to use and produce as good a result as google.
Regarding minds: I think yacy might be something that’s worth experimenting with, depending on how good the results are vs how hard it is to set up and run… it is certainly the most Free of the options. Writing a proxy is another option, but I wonder whether startpage would be open to working out some sort of deal. Maybe having a “powered by startpage” on every minds install would be enough to have them expose a search.json endpoint? No harm in asking 🙂