As if Google’s StreetView wasn’t under enough suspicion from privacy advocates, the company has recently discovered that it had been collecting information from open WiFi networks… such as snippets of e-mail, the web pages people were viewing, or photos. This has apparently been happening for the last three years.
First things first: this only applies to open, unencrypted networks. So if you have WiFi in your home, for goodness’ sake, put a password on it! If nothing else, it’ll stop the dodgy guy across the road from downloading gigabytes of porn on your connection!
But how could this have happened in the first place? Surely Google must know what’s going into their code and, therefore, what their cars are capable of? Well, it’s not quite that simple.
Let the engineers play
Engineers at Google are encouraged to pursue projects that interest them: they’re allowed to devote twenty percent of their time to projects they’re passionate about, and that has given rise to some interesting products like Google Suggest, AdSense for Content and Orkut.
An engineer was working on a project to glean information from unsecured WiFi networks and that code somehow found its way into the StreetView software. I don’t have any evidence to back this up, but I suspect the original project was a 20% effort.
Software engineering often reuses old code
Good software engineering makes use of previously written code, so when Google decided to map WiFi networks for location tracking (think of the pseudo-GPS on the iPhone before it had an actual GPS receiver), it would have made sense to use code from a previous project that was able to log the details of WiFi networks. What Google hadn’t banked on was that the code also downloaded sample data from unsecured networks. Whether it was a failure in the quality assurance cycle, miscommunication, or some other problem, the StreetView cars were doing more than they were intended to.
Is that even possible though? Well, let me tell you a story. At one of my previous jobs I was part of a team working on some banking software. It was designed to levy charges on people’s bank accounts if they went overdrawn or tried to withdraw their savings without giving the correct amount of notice (yeah, I know, “Booooo!”). We always reused code if it was available, because writing a ten-thousand-line program from scratch is just stupid if there’s already something that can be adapted.
We got to the testing stage and were looking through the data when we noticed that the charges had a destination account number attached to them. That is, it looked like they were being transferred to another account rather than just being deducted. Remember, this is at the testing stage – it hadn’t gone live with real accounts. We realised that a piece of reused code was stripping the bank account information from earlier in the batch process, and inserting it as the destination account. It turns out it wouldn’t actually have made a difference, because of the way charges were handled, but it gave us a scare and made us realise we needed to be careful with reused code.
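To make that a bit more concrete, here is a made-up Python sketch of the same kind of surprise. It is not the real banking code and has nothing to do with Google's software: every name in it (Posting, build_posting_reused, build_charge, unexpected_destinations, the ACC-… account numbers) is invented for illustration. It assumes a helper originally written for transfers that quietly remembers the previous account in the batch and fills it in as a destination.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Posting:
    account: str                        # account the charge is levied on
    amount_pence: int                   # charge amount, in pence
    destination: Optional[str] = None   # should stay empty for a plain charge


def build_posting_reused(account, amount_pence, context):
    """Hypothetical helper written for transfers, then reused for charges.

    It carries behaviour from its old job: it fills in a destination
    using whichever account appeared earlier in the batch.
    """
    posting = Posting(account, amount_pence, context.get("last_account"))
    context["last_account"] = account   # remembered for the next record
    return posting


def build_charge(account, amount_pence):
    """What the charging job actually needed: a plain deduction."""
    return Posting(account, amount_pence)


def unexpected_destinations(postings):
    """Test-stage check: a charge should never carry a destination."""
    return [p for p in postings if p.destination is not None]


batch = [("ACC-001", 2500), ("ACC-002", 2500), ("ACC-003", 1000)]

context = {}
reused = [build_posting_reused(acc, amt, context) for acc, amt in batch]
clean = [build_charge(acc, amt) for acc, amt in batch]

for p in unexpected_destinations(reused):
    print(f"charge on {p.account} unexpectedly points at {p.destination}")
```

The point of the sketch is the check at the end: the reused helper does everything the charging job asked for, and the extra behaviour only shows up when you look at the output for things that shouldn't be there.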
The upshot of that is I can fully believe that the StreetView project reused a program from elsewhere and got functionality they didn’t want, alongside the behaviour they did. Does that make it OK? No, of course not, but it means I’m willing to believe it wasn’t deliberate.
And Google’s response?
Google’s response is detailed on their blog:
As soon as we became aware of this problem, we grounded our Street View cars and segregated the data on our network, which we then disconnected to make it inaccessible. We want to delete this data as soon as possible, and are currently reaching out to regulators in the relevant countries about how to quickly dispose of it.
Maintaining people’s trust is crucial to everything we do, and in this case we fell short. So we will be:
- Asking a third party to review the software at issue, how it worked and what data it gathered, as well as to confirm that we deleted the data appropriately; and
- Internally reviewing our procedures to ensure that our controls are sufficiently robust to address these kinds of problems in the future.
In addition, given the concerns raised, we have decided that it’s best to stop our Street View cars collecting WiFi network data entirely.
While all this is understandable and, probably, just a bit of a blunder, it does make you think about how much trust we put in companies. I trust Google with my e-mail, RSS reader, many documents, and photographs. I actually stalked the StreetView car when it did our town to see if I could get on the map (I failed… and that admission probably says something about my mental state). Sure, everyone can make a mistake, but when there’s so much information in the hands of one company they almost get to the point where they just can’t afford to make any.
Mistake or not – privacy peeps will be worried by this. Hopefully Google’s response will solve the problem… at least for a while.
What do you think?
Do you trust Google? Are there any companies you really trust with your data? Do you think this was an innocent mistake, a deliberate ploy, or just bad engineering? Let us know in the comments.
Post image by FanIntoFlames – used under Creative Commons License.