Should freedom of information apply to algorithms?

Governments increasingly use data analysis to make decisions that affect citizens. But how transparent are these practices? In a study summarised here, Nicholas Diakopoulos had students file freedom of information requests to obtain, among other things, the algorithms behind government decision-making. Most requests were denied, for a variety of reasons. Some states claimed algorithms aren’t «documents» covered by FOI legislation; others said they were copyrighted.

The article reminded me of the risk profiles Dutch municipal welfare agencies use to decide whom to subject to rigorous checks - including very intrusive home searches. As early as 2006, I was involved in a survey by Dutch trade union FNV which found that two in five municipalities used risk profiles for that purpose:

This has the advantage that for a large group of people, unnecessary routine checks can be dispensed with. However, there’s virtually no debate about what criteria can be used without causing unacceptable unequal treatment. Is it ok to select people because they’ve worked in the catering industry, or as a self-employed person? Or because of their nationality?

When the government uses algorithms to exert control over citizens (or when it outsources that task, for that matter), there should be accountability. So would it be possible to obtain such algorithms through an FOI request?

I found one decision that suggests that algorithms aren’t a priori excluded from FOI requests - at least in the eyes of the Utrecht municipality (I used Open State’s FOI search engine to find it). But welfare recipients’ organisation Bijstandsbond informed me that an FOI request was filed in the past to obtain the risk profiles used by the Amsterdam municipal welfare agency. The request was denied.


Amsterdam has room for another 2.1 million bicycle racks


Amsterdam has a persistent shortage of bicycle racks. Bicycle professor Marco te Brömmelstroet argues that this is really a matter of making choices: the space occupied by four parked cars could easily accommodate 30 bicycle racks.

Amsterdam is a compact city where space is limited. An important goal of the city administration is to create more room for pedestrians and cyclists, but also for green areas.

It so happens that the city of Amsterdam has recently published open data on on-street parking spaces. The data confirms what we already knew: parking spaces for cars occupy a huge amount of public space. The streets of Amsterdam are littered with as many as 265,225 parking spaces. If you exclude the ones with signs (spaces for charging car batteries; car sharing; etcetera), there are still 260,834 of them.

Assuming that each of them could accommodate at least 8 bicycle racks, there’s room for another 2.1 million bicycle racks. Now you probably wouldn’t want to remove all parking spaces and replace them with bicycle racks, but it does illustrate some of the choices that are available regarding the use of public space.

Map detail here.


The open data on on-street parking spaces is available in WFS format which is meant for creating maps but can also be used for downloading data - here’s a Python script that will do the job. I set the location of the parking spaces to the centre of the surrounding envelope.
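Taking the centre of the surrounding envelope boils down to averaging the extremes of the bounding box. The following is a minimal sketch of that step only, not the script mentioned above; the function name and the sample coordinates are made up for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def envelope_centre(points: Iterable[Point]) -> Point:
    """Return the centre of the bounding box (envelope) around the points."""
    xs, ys = zip(*points)
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# Hypothetical outline of one parking space, in arbitrary map units.
outline = [(0, 0), (5, 0), (5, 2), (0, 2)]
centre = envelope_centre(outline)
print(centre)
```

Note that for irregular shapes the centre of the envelope is not necessarily the same as the centroid of the polygon, but for roughly rectangular parking spaces the difference should be small.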

I would have liked to display the data on an interactive map using Leaflet and D3.js, but I’m afraid the quarter million data points would crash the browser. Instead I used OSM map data in combination with QGIS to display the parking spaces. Unfortunately, this means you can’t zoom in.

As for the parking space to bicycle rack ratio: I’m assuming a typical parking space takes up 12 to 14 m2. Cyclists’ organisation Fietsersbond has calculated that regular bicycle racks take up between 0.84 and 1.18 m2 per bicycle. The city of Amsterdam is a bit more conservative and estimates that a bicycle rack takes up about 1.5 m2, including the room needed to remove the bicycle. Going by the city’s estimate, the number of bicycle racks that could be created per parking space lies somewhere between 8 and 9.3.
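The arithmetic behind these figures can be checked in a few lines. This is only a sketch using the numbers quoted in this post (the city's 1.5 m2 per rack and 12 to 14 m2 per parking space); the variable names are illustrative.

```python
# Figures taken from the post.
PARKING_SPACES = 260_834          # on-street spaces, excluding those with signs
SPACE_M2_LOW, SPACE_M2_HIGH = 12, 14
RACK_M2_CITY = 1.5                # city of Amsterdam estimate per bicycle rack

# Racks per parking space, using the city's estimate.
racks_low = SPACE_M2_LOW / RACK_M2_CITY
racks_high = SPACE_M2_HIGH / RACK_M2_CITY

# Total extra racks if every space held the lower bound of 8 racks.
extra_racks = PARKING_SPACES * int(racks_low)

print(f"{racks_low:.1f} to {racks_high:.1f} racks per parking space")
print(f"{extra_racks:,} extra racks, or about {extra_racks / 1e6:.1f} million")
```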

Web scrapping

Search volume for web scrapping and web scraping according to Google Trends (52-week moving average).

Search volume for web scrapping as percentage of volume for both terms, according to Google Trends (52-week moving average).
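A rough sketch of how such a smoothed share could be computed from two weekly series. This is an assumption about the method, not the code used for the charts; the function names and the sample values are made up.

```python
def moving_average(values, window=52):
    """Trailing moving average; the first window - 1 points are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def share_series(scrapping, scraping):
    """Volume for 'scrapping' as a percentage of both terms, per week."""
    return [100 * a / (a + b) for a, b in zip(scrapping, scraping)]


# Made-up weekly search volumes, only to show the mechanics.
scrapping = [10, 12, 9, 8]
scraping = [30, 36, 41, 42]
smoothed_share = moving_average(share_series(scrapping, scraping), window=2)
print(smoothed_share)
```

Here the share is computed per week and then smoothed; smoothing both series first and then taking the share is an equally defensible choice and gives slightly different numbers.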

I came across the following quote on the Web Scraping website:

I searched my email and found over the last few years I received 76 messages from clients containing the text Web Scrapping rather than the usual spelling Web Scraping. And this is not unique to my clients - currently Google has 122,000 results for “Web Scrapping” compared to 447,000 results for “Web Scraping” - the correct spelling returns only 4x the number of results. So in light of this common spelling mistake I registered the domain and redirected it here.

I thought this had to be a joke - but it wasn’t. The domain redirect actually works, and there does appear to be a persistent search volume for web scrapping, even if its share of total web scra[p]{1,2}ing searches has declined considerably.
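For what it's worth, the pattern scra[p]{1,2}ing does work as a regular expression. A small sketch using Python's re module, with the result counts from the quote; the function name is hypothetical.

```python
import re

# Matches both spellings: one or two p's.
PATTERN = re.compile(r"web scra[p]{1,2}ing", re.IGNORECASE)


def spelling(text):
    """Return 'scraping', 'scrapping', or None if the text matches neither."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return "scrapping" if "pp" in match.group().lower() else "scraping"


# Result counts as given in the quote.
results_scrapping = 122_000
results_scraping = 447_000

share_scrapping = 100 * results_scrapping / (results_scrapping + results_scraping)
ratio = results_scraping / results_scrapping

print(spelling("Web Scrapping"))
print(f"{share_scrapping:.1f}% of results use the double p")
print(f"the correct spelling returns {ratio:.1f}x as many results")
```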


The moralism and hypocrisy around ad blockers

With my new iPhone, I can finally install ad blockers. When I tried to find information about the available options, I was struck by the moralism and hypocrisy of many articles on the subject. This subtitle says it all: How to use ad-blockers in iOS 9 (and why you shouldn’t).

Sure, the article makes some valid points. One may question Apple’s motives for allowing ad blockers. And certainly, one may question Adblock’s policy to allow «acceptable ads» from companies that pay them a fee (so use an alternative like the open source uBlock Origin instead). But the claim that ad blocking could «kill journalism as we know it» seems a bit over the top.

The advertising industry tries to frame ad blocking as an attack on «the little guy», by which they mean small, independent publishers. Their strategy is similar to the Home taping is killing music campaign of the 1980s, by which the music industry tried to make us believe that home taping was bad for musicians. In reality, home taping was killing the profits of the very industry that was exploiting those musicians in the first place.

Journalists should be paid for their work, but I’m not convinced advertising is the solution. Ads are annoying, they slow down the internet, they waste valuable surface on mobile screens, they often come with scripts that track you and sometimes they spread malware. Perhaps even more importantly: ideally, journalists shouldn’t depend on advertising in the first place, because advertising is killing independent journalism.

So how should journalists get paid? I’m not sure there’s an easy answer. One way is to pay collectively, which may work rather well (BBC), but it does entail some degree of state regulation. Another way is to buy subscriptions from each site or publisher that publishes interesting articles - but that’s rather cumbersome.

A practical alternative is a subscription service like Blendle - described as «the Netflix or Spotify for journalism» (although it’s more like iTunes in that you pay per article). Blendle is an interesting initiative, but there’s reason for caution.

If successful, services like Blendle may well develop into large corporations that try to control access to news stories - much like Spotify tries to control access to music (and Facebook tries to control access to news stories). The outcome could be that subscription services become profitable by exploiting journalists. Also, subscription services could amass an unhealthy degree of control over what we read, and could introduce opaque algorithms similar to the ones Facebook uses to decide what content we get to see.

Things might get interesting if journalists drew inspiration from musicians and set up cooperatives. These could take the form of not-for-profit Blendle alternatives that offer independent quality journalism at a fair price, produced by journalists who are paid a fair wage for their work.

For now, ad blockers not only offer practical benefits; they also force the internet to address its unhealthy dependency on advertising.


Twitter discovers Steven Kruijswijk

Tweets mentioning names of riders, as percentage of all tweets with hashtag #giro, per day (smoothing applied). Data updated every hour.
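The underlying calculation is simple: per day, count the tweets that mention a rider and divide by all #giro tweets that day. A minimal sketch under that assumption, without the smoothing step; the function name and the sample tweets are made up and this is not the code behind the chart.

```python
from collections import defaultdict


def daily_mention_share(tweets, name):
    """Percentage of tweets per day that mention the given name.

    tweets is an iterable of (day, text) pairs.
    """
    totals = defaultdict(int)
    mentions = defaultdict(int)
    for day, text in tweets:
        totals[day] += 1
        if name.lower() in text.lower():
            mentions[day] += 1
    return {day: 100 * mentions[day] / totals[day] for day in sorted(totals)}


# Made-up tweets, only to show the mechanics.
sample = [
    ("20 May", "Tough stage today #giro"),
    ("21 May", "Kruijswijk in pink! #giro"),
    ("21 May", "What a day in the mountains #giro"),
]
print(daily_mention_share(sample, "Kruijswijk"))
```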

If all goes well, Steven Kruijswijk might just be the first Dutch rider in ages to conquer the podium in a large race, journalist Thijs Zonneveld wrote on Friday 20 May. At that point, Twitter hadn’t really discovered Kruijswijk yet. That changed on Saturday, when Kruijswijk won the pink jersey.