champagne anarchist | armchair activist

Is it still ok to ridicule pie charts?

Workers without job security as a percentage of all working people in the Netherlands. The pink slice shows the percentage in 2003; the red slice how much this has increased since. Data Statistics Netherlands, chart dirkmjk.nl.

In a series of articles that caused a bit of a commotion among chart geeks, Robert Kosara summarised the findings of a number of studies on pie charts. In one of the articles, he observes:

Pie charts are generally looked down on in visualization, and many people pride themselves on saying mean things about them and the people who use them.

I guess I’m one of those people who look down on pie charts. Sure, I’m not as outspoken as the respected Edward Tufte, who famously wrote that «the only worse design than a pie chart is several of them». I’m not always against pie charts and I’ve even experimented with animated pie charts to illustrate change in a proportion. But I’m not above making lame jokes about pie charts either. My rule of thumb would be: don’t use pie charts - unless you can come up with a good reason why you should use one in a particular situation.

Kosara describes a number of studies in which he measured how accurately people interpret pie charts and other charts showing a proportion, e.g. 27%. According to his findings, exploded pie charts do worse than regular pie charts (phew!) and square pie charts do better. Interestingly, a stacked bar chart appears to do worse than a regular pie chart (note that a stacked bar chart depicting a single proportion looks much like a progress bar).

It’ll be interesting to see how this holds up in future studies. But for now, the finding that (stacked) bar charts are doing worse than pie charts may come as a bit of a shock, for there appears to be a sort of consensus that bar charts are generally better than pie charts. Question is, better at what?

Workers without job security as a percentage of all working people in the Netherlands. Data Statistics Netherlands, chart dirkmjk.nl.

A bar chart is quite good at showing that the level of workers without job security in the Netherlands was higher in 2015 than in 2014. But which chart type is better at showing how much the share has increased between 2003 and 2015? Until recently I would have said «the bar chart» without hesitation, but now I’m not so sure anymore.

That said - I think it’s still ok to ridicule 3D exploded pie charts.

Robert Kosara summarises his findings here and here. The recent studies were done in collaboration with Drew Skau; an older study in collaboration with Caroline Ziemkiewicz. The Tufte quote is from his book The Visual Display of Quantitative Information. The charts above show workers with permanent jobs and a fixed number of hours per week, as a percentage of all working people in the Netherlands (not just employees), source CBS.

Should freedom of information apply to algorithms?

[Update below] - Governments increasingly use data analysis to make decisions that affect citizens. But how transparent are these practices? In a study summarised here, Nicholas Diakopoulos had students file freedom of information requests to obtain, among other things, the algorithms behind government decision-making. Most requests were denied, for a variety of reasons. Some states claimed algorithms aren’t «documents» covered by FOI legislation; others said they were copyrighted.

The article reminded me of the risk profiles Dutch municipal welfare agencies use to decide who to submit to rigorous checks - including very intrusive home searches. As early as 2006, I was involved in a survey by Dutch trade union FNV which found that two in five municipalities used risk profiles for that purpose:

This has the advantage that for a large group of people, unnecessary routine checks can be dispensed with. However, there’s virtually no debate about what criteria can be used without causing unacceptable unequal treatment. Is it ok to select people because they’ve worked in the catering industry, or as a self-employed person? Or because of their nationality?

When the government uses algorithms to exert control over citizens (or when it outsources that task, for that matter), there should be accountability. So would it be possible to obtain such algorithms through an FOI request?

I found one decision that suggests that algorithms aren’t a priori excluded from FOI requests - at least in the eyes of the Utrecht municipality (I used Open State’s FOI search engine to find it). But welfare recipients’ organisation Bijstandsbond informed me that an FOI request was filed in the past to obtain the risk profiles used by the Amsterdam municipal welfare agency. The request was denied.

[Update 2 July 2016] - Aside from the question of whether you can FOI an algorithm, in Europe it may become possible to ask for «an explanation of the decision reached after [algorithmic] assessment» as a result of the EU’s General Data Protection Regulation, according to this analysis. Not only would this create more transparency; it would also put technical constraints on programmers in that their algorithms have to be interpretable.

Amsterdam has room for another 2.1 million bicycle racks

Amsterdam has a persistent shortage of bicycle racks. Bicycle professor Marco te Brömmelstroet argues that this is really a matter of making choices: the space occupied by four parked cars could easily accommodate 30 bicycle racks.

Amsterdam is a compact city where space is limited. An important goal of the city administration is to create more room for pedestrians and cyclists, but also for green areas.

It so happens that the city of Amsterdam has recently published open data on on-street parking spaces. The data confirms what we already knew: parking spaces for cars occupy a huge amount of public space. The streets of Amsterdam are littered with as many as 265,225 parking spaces. If you exclude the ones with signs (spaces for charging car batteries; car sharing; etcetera), there are still 260,834 of them.

Assuming that each of them could accommodate at least 8 bicycle racks, there’s room for another 2.1 million bicycle racks. Now you probably wouldn’t want to remove all parking spaces and replace them with bicycle racks, but it does illustrate some of the choices that are available regarding the use of public space.

Map detail here.

Method

The open data on on-street parking spaces is available in WFS format, which is meant for creating maps but can also be used for downloading data - here’s a Python script that will do the job. I set the location of each parking space to the centre of its surrounding envelope.
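For readers who haven’t worked with WFS before, here’s a minimal sketch of the two steps involved: building a GetFeature request and taking the centre of a shape’s envelope (bounding box). This is not the script mentioned above. The endpoint and layer name are placeholders I made up, and GeoJSON output is something many WFS servers offer but the standard doesn’t guarantee.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name, for illustration only; the real
# values would come from the city's open data catalogue.
WFS_ENDPOINT = "https://example.org/wfs"
LAYER_NAME = "parking_spaces"


def build_getfeature_url(endpoint, layer, count=1000, start_index=0):
    """Build a WFS 2.0.0 GetFeature URL for one page of features."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer,
        "count": count,
        "startIndex": start_index,
        # Assumption: the server supports GeoJSON output (a common extension).
        "outputFormat": "application/json",
    }
    return endpoint + "?" + urlencode(params)


def envelope_centre(coordinates):
    """Return the centre of the bounding box around a list of (x, y) pairs."""
    xs = [x for x, _ in coordinates]
    ys = [y for _, y in coordinates]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# A made-up rectangular parking space, 4 units long and 2 units wide.
print(build_getfeature_url(WFS_ENDPOINT, LAYER_NAME, count=10))
print(envelope_centre([(0, 0), (4, 0), (4, 2), (0, 2)]))
```

Paging with count and startIndex keeps each response small, which helps with a dataset of a quarter million features.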

I would have liked to display the data on an interactive map using Leaflet and D3.js, but I’m afraid the quarter million data points would crash the browser. Instead I used OSM map data in combination with QGIS to display the parking spaces. Unfortunately, this means you can’t zoom in.

As for the parking space to bicycle rack ratio: I’m assuming a typical parking space takes up 12 to 14 m². Cyclists’ organisation Fietsersbond has calculated that regular bicycle racks take up between 0.84 and 1.18 m² per bicycle. The city of Amsterdam is a bit more conservative and estimates that a bicycle rack takes up about 1.5 m², including the room needed to remove the bicycle. This suggests that the number of bicycle racks that could be created per parking space lies somewhere between 8 and 9.3.
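The arithmetic can be checked in a few lines. All numbers come from the text above; the variable names are mine, and rounding the ratio down to whole racks is my own conservative choice.

```python
PARKING_SPACES = 260_834   # on-street spaces without signs
SPACE_AREA_M2 = (12, 14)   # assumed area of a typical parking space
RACK_AREA_M2 = 1.5         # city of Amsterdam estimate per bicycle rack

# Racks per parking space at the low and high end of the assumed area.
low = SPACE_AREA_M2[0] / RACK_AREA_M2
high = SPACE_AREA_M2[1] / RACK_AREA_M2
print(f"Racks per parking space: {low:.1f} to {high:.1f}")

# Use the low end, rounded down to whole racks, for the city-wide total.
total = PARKING_SPACES * int(low)
print(f"Extra racks: {total:,} (about {total / 1e6:.1f} million)")
```

With the Fietsersbond figures of 0.84 to 1.18 m² per bicycle the ratio would come out higher, so 8 racks per space is the cautious end of the range.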

[Update 3 July 2016] - The city of Nijmegen reckons it can fit as many as 10 bicycle racks on a parking space.

Web scrapping

Search volume for web scrapping and web scraping according to Google Trends (52-week moving average).

Search volume for web scrapping as percentage of volume for both terms, according to Google Trends (52-week moving average).
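A minimal sketch of the calculation behind these two captions, assuming weekly search-interest numbers for each spelling. The numbers below are made up and are not Google Trends data; I’m also assuming the series are smoothed first and the share is taken afterwards, which may not be the exact order used for the charts.

```python
def moving_average(values, window):
    """Trailing moving average; returns one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def share_of_total(misspelled, correct):
    """Misspelled volume as a percentage of the volume for both terms."""
    return [100 * m / (m + c) for m, c in zip(misspelled, correct)]


# Made-up weekly values, for illustration only.
scrapping = [30, 20, 10, 20]
scraping = [70, 80, 90, 80]

# A window of 2 keeps the toy example short; the charts use 52 weeks.
smooth_scrapping = moving_average(scrapping, window=2)
smooth_scraping = moving_average(scraping, window=2)
share = share_of_total(smooth_scrapping, smooth_scraping)
print(share)
```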

I came across the following quote on the Web Scraping website:

I searched my email and found over the last few years I received 76 messages from clients containing the text Web Scrapping rather than the usual spelling Web Scraping. And this is not unique to my clients - currently Google has 122,000 results for “Web Scrapping” compared to 447,000 results for “Web Scraping” - the correct spelling returns only 4x the number of results. So in light of this common spelling mistake I registered the domain webscrapping.com and redirected it here.

I thought this had to be a joke - but it wasn’t. The domain redirect actually works, and there does appear to be a persistent search volume for web scrapping, even if its share of total web scra[p]{1,2}ing searches has declined considerably.
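For those who don’t read regular expressions: [p]{1,2} means one or two p’s, so the pattern covers both spellings and nothing else. A quick check in Python (the test phrases are my own):

```python
import re

# One or two p's between "scra" and "ing"; case-insensitive.
PATTERN = re.compile(r"web scra[p]{1,2}ing", re.IGNORECASE)

for phrase in ["web scraping", "Web Scrapping", "web scrappping", "web scraing"]:
    print(phrase, bool(PATTERN.fullmatch(phrase)))
```

Only the first two phrases match; three p’s or none at all are rejected.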

The moralism and hypocrisy around ad blockers

With my new iPhone, I can finally install ad blockers. When I tried to find information about the available options, I was struck by the moralism and hypocrisy of many articles on the subject. This subtitle says it all: How to use ad-blockers in iOS 9 (and why you shouldn’t).

Sure, the article makes some valid points. One may question Apple’s motives for allowing ad blockers. And certainly, one may question Adblock’s policy of allowing «acceptable ads» from companies that pay them a fee (so use an alternative like the open source uBlock Origin instead). But the claim that ad blocking could «kill journalism as we know it» seems a bit over the top.

The advertising industry tries to frame ad blocking as an attack on «the little guy», by which they mean small, independent publishers. Their strategy is similar to the Home taping is killing music campaign of the 1980s, by which the music industry tried to make us believe that home taping was bad for musicians. In reality, home taping was killing the profits of the very industry that was exploiting those musicians in the first place.

Journalists should be paid for their work, but I’m not convinced advertising is the solution. Ads are annoying, they slow down the internet, they waste valuable surface on mobile screens, they often come with scripts that track you and sometimes they spread malware. Perhaps even more importantly: ideally, journalists shouldn’t depend on advertising in the first place, because advertising is killing independent journalism.

So how should journalists get paid? I’m not sure there’s an easy answer. One way is to pay collectively, which may work rather well (BBC), but it does entail some degree of state regulation. Another way is to buy a subscription from every site or publisher that publishes interesting articles - but that’s rather cumbersome.

A practical alternative is a subscription service like Blendle - described as «the Netflix or Spotify for journalism» (although it’s more like iTunes in that you pay per article). Blendle is an interesting initiative, but there’s reason for caution.

If successful, services like Blendle may well develop into large corporations that try to control access to news stories - much like Spotify tries to control access to music (and Facebook tries to control access to news stories). The outcome could be that subscription services become profitable by exploiting journalists. Also, subscription services could amass an unhealthy degree of control over what we read, and could introduce opaque algorithms similar to the ones Facebook uses to decide what content we get to see.

Things might get interesting if journalists drew inspiration from musicians and set up cooperatives. These could take the form of not-for-profit Blendle alternatives that offer independent quality journalism at a fair price, produced by journalists who are paid a fair wage for their work.

For now, ad blockers not only offer practical benefits; they also force the internet to address its unhealthy dependency on advertising.
