Identifying “Underproduced” Software

I wrote this blog post with Kaylea Champion and a version of this post was originally posted on the Community Data Science Collective blog.

Critical software we all rely on can silently crumble away beneath us. Unfortunately, we often don’t find out software infrastructure is in poor condition until it is too late. Over the last year or so, I have been supporting Kaylea Champion on a project my group announced earlier to measure software underproduction—a term we use to describe software that is low in quality but high in importance.

Underproduction reflects an important type of risk in widely used free/libre open source software (FLOSS) because participants often choose their own projects and tasks. Because FLOSS contributors work as volunteers and choose what they work on, important projects aren’t always the ones to which FLOSS developers devote the most attention. Even when developers want to work on important projects, relative neglect among important projects is often difficult for FLOSS contributors to see.

Given all this, what can we do to detect problems in FLOSS infrastructure before major failures occur? Kaylea Champion and I recently published a paper laying out our new method for measuring underproduction at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2021 that we believe provides one important answer to this question.

A conceptual diagram of underproduction. The x-axis shows relative importance, the y-axis relative quality. The top left area of the graph described by these axes is 'overproduction' -- high quality, low importance. The diagonal is Alignment: quality and importance are approximately the same. The lower right depicts underproduction -- high importance, low quality -- the area of potential risk.
Conceptual diagram showing how our conception of underproduction relates to quality and importance of software.

In the paper, we describe a general approach for detecting “underproduced” software infrastructure that consists of five steps: (1) identifying a body of digital infrastructure (like a code repository); (2) identifying a measure of quality (like the time to takes to fix bugs); (3) identifying a measure of importance (like install base); (4) specifying a hypothesized relationship linking quality and importance if quality and importance are in perfect alignment; and (5) quantifying deviation from this theoretical baseline to find relative underproduction.

To show how our method works in practice, we applied the technique to an important collection of FLOSS infrastructure: 21,902 packages in the Debian GNU/Linux distribution. Although there are many ways to measure quality, we used a measure of how quickly Debian maintainers have historically dealt with 461,656 bugs that have been filed over the last three decades. To measure importance, we used data from Debian’s Popularity Contest opt-in survey. After some statistical machinations that are documented in our paper, the result was an estimate of relative underproduction for the 21,902 packages in Debian we looked at.

One of our key findings is that underproduction is very common in Debian. By our estimates, at least 4,327 packages in Debian are underproduced. As you can see in the list of the “most underproduced” packages—again, as estimated using just one more measure—many of the most at risk packages are associated with the desktop and windowing environments where there are many users but also many extremely tricky integration-related bugs.

This table shows the 30 packages with the most severe underproduction problem in Debian, shown as a series of boxplots.
These 30 packages have the highest level of underproduction in Debian according to our analysis.

We hope these results are useful to folks at Debian and the Debian QA team. We also hope that the basic method we’ve laid out is something that others will build off in other contexts and apply to other software repositories.

In addition to the paper itself and the video of the conference presentation on Youtube by Kaylea, we’ve put a repository with all our code and data in an archival repository Harvard Dataverse and we’d love to work with others interested in applying our approach in other software ecosytems.


For more details, check out the full paper which is available as a freely accessible preprint.

This project was supported by the Ford/Sloan Digital Infrastructure Initiative. Wm Salt Hale of the Community Data Science Collective and Debian Developers Paul Wise and Don Armstrong provided valuable assistance in accessing and interpreting Debian bug data. René Just generously provided insight and feedback on the manuscript.

Paper Citation: Kaylea Champion and Benjamin Mako Hill. 2021. “Underproduction: An Approach for Measuring Risk in Open Source Software.” In Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021). IEEE.

Contact Kaylea Champion (kaylea@uw.edu) with any questions or if you are interested in following up.

The Free Software Foundation and Richard Stallman

I served as a director and as a voting member of the Free Software Foundation for more than a decade. I left both positions over the last 18 months and currently have no formal authority in the organization.

So although it is now just my personal opinion, I will publicly add my voice to the chorus of people who are expressing their strong opposition to Richard Stallman’s return to leadership in the FSF and to his continued leadership in the free software movement. The current situation makes me unbelievably sad.

I stuck around the FSF for a long time (maybe too long) and worked hard (I regret I didn’t accomplish more) to try and make the FSF better because I believe that it is important to have voices advocating for social justice inside our movement’s most important institutions. I believe this is especially true when one is unhappy with the existing state of affairs. I am frustrated and sad that I concluded that I could no longer be part of any process of organizational growth and transformation at FSF.

I have nothing but compassion, empathy, and gratitude for those who are still at the FSF—especially the staff—who are continuing to work silently toward making the FSF better under intense public pressure. I still hope that the FSF will emerge from this as a better organization.

How markets coopted free software’s most powerful weapon (LibrePlanet 2018 Keynote)

Several months ago, I gave the closing keynote address at LibrePlanet 2018. The talk was about the thing that scares me most about the future of free culture, free software, and peer production.

A video of the talk is online on Youtube and available as WebM video file (both links should skip the first 3m 19s of thanks and introductions).

Here’s a summary of the talk:

App stores and the so-called “sharing economy” are two examples of business models that rely on techniques for the mass aggregation of distributed participation over the Internet and that simply didn’t exist a decade ago. In my talk, I argue that the firms pioneering these new models have learned and adapted processes from commons-based peer production projects like free software, Wikipedia, and CouchSurfing.

The result is an important shift: A decade ago,  the kind of mass collaboration that made Wikipedia, GNU/Linux, or Couchsurfing possible was the exclusive domain of people producing freely and openly in commons. Not only is this no longer true, new proprietary, firm-controlled, and money-based models are increasingly replacing, displacing, outcompeting, and potentially reducing what’s available in the commons. For example, the number of people joining Couchsurfing to host others seems to have been in decline since Airbnb began its own meteoric growth.

In the talk, I talk about how this happened and what I think it means for folks of that are committed to working in commons. I talk a little bit about the free culture and free software should do now that mass collaboration, these communities’ most powerful weapon, is being used against them.

I’m very much interested in feedback provided any way you want to reach me including in person, over email, in comments on my blog, on Mastodon, on Twitter, etc.


Work on the research that is reflected and described in this talk was supported by the National Science Foundation (awards IIS-1617129 and IIS-1617468). Some of the initial ideas behind this talk were developed while working on this paper (official link) which was led by Maximilian Klein and contributed to by Jinhao Zhao, Jiajun Ni, Isaac Johnson, and Haiyi Zhu.

Access Without Empowerment (LibrePlanet 2015 Keynote)

At LibrePlanet 2015 (the FSF’s annual conference), I gave a talk called “Access Without Empowerment” as one of the conference keynote addresses. As I did for my 2013 LibrePlanet talk, I’ve edited together a version that includes the slides and I’ve posted it online in WebM and on YouTube.

Here’s the summary written up in the LibrePlanet program:

The free software movement has twin goals: promoting access to software through users’ freedom to share, and empowering users by giving them control over their technology. For all our movement’s success, we have been much more successful at the former. I will use data from free software and from several related movements to explain why promoting empowerment is systematically more difficult than promoting access and I will explore how our movement might address the second challenge in the future.

In related news, registration is open for LibrePlanet 2016 and that it’s free for FSF members. If you’re not an FSF member, the FSF annual fundraiser is currently going on so now would be a great time to join.

TheSetup ChangeLog

Several years ago, I did a long interview with TheSetup — a fantastic website that posts of interviews with nerdy people that ask the same four questions:

  1. Who are you, and what do you do?
  2. What hardware are you using?
  3. And what software?
  4. What would be your dream setup?

Because I have a very carefully considered — but admittedly quite idiosyncratic — setup, I spent a lot of time preparing my answers. Many people have told me that they found my write-up useful. I recently spoke with several students who said it had been assigned in one of their classes!

Of course, my setup has changed since 2012. Although the vast majority is still the same, there is a growing list of modifications and additions. To address this, I’ve been keeping a changelog on my wiki where I detail every major change and addition I’ve made to the setup that I described in the original interview.

DRM on Streaming Services

For the 2015 International Day Against DRM, I wrote a short essay on DRM for streaming services posted on the Defective by Design website. I’m republishing it here.

Between 2003 and 2009, most music purchased through Apple’s iTunes store was locked using Apple’s FairPlay digital restrictions management (DRM) software, which is designed to prevent users from copying music they purchased. Apple did not seem particularly concerned by the fact that FairPlay was never effective at stopping unauthorized distribution and was easily removed with publicly available tools. After all, FairPlay was effective at preventing most users from playing their purchased music on devices that were not made by Apple.

No user ever requested FairPlay. Apple did not build the system because music buyers complained that CDs purchased from Sony would play on Panasonic players or that discs could be played on an unlimited number of devices (FairPlay allowed five). Like all DRM systems, FairPlay was forced on users by a recording industry paranoid about file sharing and, perhaps more importantly, by technology companies like Apple, who were eager to control the digital infrastructure of music distribution and consumption. In 2007, Apple began charging users 30 percent extra for music files not processed with FairPlay. In 2009, after lawsuits were filed in Europe and the US, and after several years of protests, Apple capitulated to their customers’ complaints and removed DRM from the vast majority of the iTunes music catalog.

Fundamentally, DRM for downloaded music failed because it is what I’ve called an antifeature. Like features, antifeatures are functionality created at enormous cost to technology developers. That said, unlike features which users clamor to pay extra for, users pay to have antifeatures removed. You can think of antifeatures as a technological mob protection racket. Apple charges more for music without DRM and independent music distributors often use “DRM-free” as a primary selling point for their products.

Unfortunately, after being defeated a half-decade ago, DRM for digital music is becoming the norm again through the growth of music streaming services like Pandora and Spotify, which nearly all use DRM. Impressed by the convenience of these services, many people have forgotten the lessons we learned in the fight against FairPlay. Once again, the justification for DRM is both familiar and similarly disingenuous. Although the stated goal is still to prevent unauthorized copying, tools for “stripping” DRM from services continue to be widely available. Of course, the very need for DRM on these services is reduced because users don’t normally store copies of music and because the same music is now available for download without DRM on services like iTunes.

We should remember that, like ten years ago, the real effect of DRM is to allow technology companies to capture value by creating dependence in their customers and by blocking innovation and competition. For example, DRM in streaming services blocks third-party apps from playing music from services, just as FairPlay ensured that iTunes music would only play on Apple devices. DRM in streaming services means that listening to music requires one to use special proprietary clients. For example, even with a premium account, a subscriber cannot listen to music from their catalog using an alternative or modified music player. It means that their television, car, or mobile device manufacturer must cut deals with their service to allow each paying customer to play the catalog they have subscribed to. Although streaming services are able to capture and control value more effectively, this comes at the cost of reduced freedom, choice, and flexibility for users and at higher prices paid by subscribers.

A decade ago, arguments against DRM for downloaded music focused on the claim that users should have control over the music they purchase. Although these arguments may not seem to apply to subscription services, it is worth remembering that DRM is fundamentally a problem because it means that we do not have control of the technology we use to play our music, and because the firms aiming to control us are using DRM to push antifeatures, raise prices, and block innovation. In all of these senses, DRM in streaming services is exactly as bad as FairPlay, and we should continue to demand better.

Community Data Science Workshops Post-Mortem

Earlier this year, I helped plan and run the Community Data Science Workshops: a series of three (and a half) day-long workshops designed to help people learn basic programming and tools for data science tools in order to ask and answer questions about online communities like Wikipedia and Twitter. You can read our initial announcement for more about the vision.

The workshops were organized by myself, Jonathan Morgan from the Wikimedia Foundation, long-time Software Carpentry teacher Tommy Guy, and a group of 15 volunteer “mentors” who taught project-based afternoon sessions and worked one-on-one with more than 50 participants. With overwhelming interest, we were ultimately constrained by the number of mentors who volunteered. Unfortunately, this meant that we had to turn away most of the people who applied. Although it was not emphasized in recruiting or used as a selection criteria, a majority of the participants were women.

The workshops were all free of charge and sponsored by the UW Department of Communication, who provided space, and the eScience Institute, who provided food.

cdsw_combo_images-1The curriculum for all four session session is online:

The workshops were designed for people with no previous programming experience. Although most our participants were from the University of Washington, we had non-UW participants from as far away as Vancouver, BC.

Feedback we collected suggests that the sessions were a huge success, that participants learned enormously, and that the workshops filled a real need in the Seattle community. Between workshops, participants organized meet-ups to practice their programming skills.

Most excitingly, just as we based our curriculum for the first session on the Boston Python Workshop’s, others have been building off our curriculum. Elana Hashman, who was a mentor at the CDSW, is coordinating a set of Python Workshops for Beginners with a group at the University of Waterloo and with sponsorship from the Python Software Foundation using curriculum based on ours. I also know of two university classes that are tentatively being planned around the curriculum.

Because a growing number of groups have been contacting us about running their own events based on the CDSW — and because we are currently making plans to run another round of workshops in Seattle late this fall — I coordinated with a number of other mentors to go over participant feedback and to put together a long write-up of our reflections in the form of a post-mortem. Although our emphasis is on things we might do differently, we provide a broad range of information that might be useful to people running a CDSW (e.g., our budget). Please let me know if you are planning to run an event so we can coordinate going forward.

Installing GNU/Linux on a 2014 Lenovo Thinkpad X1 Carbon

I recently bought a new Lenovo X1 Carbon. It is the new second-generation, type “20A7” laptop, based on Intel’s Haswell microarchiteture with the adaptive keyboard. It is the version released in 2014. I also ordered the Thinkpad OneLink Dock which I have returned for the OneLink Pro Dock which I have not yet received.

The system is still very new, challenging, and different, but seems to support GNU/Linux reasonably well if you are willing to run a bleeding edge version and/or patch your kernel and if you are not afraid to spend an afternoon or two tweaking things. What follows are my installation notes for Debian testing (jessie) when I installed it in early May 2014. My general impressions about the laptop as a GNU/Linux system — and overall — are at the end of this write-up.

System Description

The X1 Carbon I ordered included the 512GB SSD, the 14.0 inch WQHD (2560×1440) 260 nit touchscreen, and the maximum 8GB of memory. I believe the rest is not particularly negotiable but includes a 720p HD Camera, a 45.2Wh battery, and an Intel Dual Band Wireless 7260AC with Bluetooth 4.0.

For those that are curious Here is the output of lspci on the system:

00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 0b)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller (rev 0b)
00:14.0 USB controller: Intel Corporation Lynx Point-LP USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation Lynx Point-LP HECI #0 (rev 04)
00:16.3 Serial controller: Intel Corporation Lynx Point-LP HECI KT (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I218-LM (rev 04)
00:1b.0 Audio device: Intel Corporation Lynx Point-LP HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 6 (rev e4)
00:1c.1 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 3 (rev e4)
00:1d.0 USB controller: Intel Corporation Lynx Point-LP USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Lynx Point-LP LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation Lynx Point-LP SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation Lynx Point-LP SMBus Controller (rev 04)

BIOS/Firmware

The BIOS firmware is non-free and proprietary as it the case with all ThinkPads and nearly all laptops. According to this thread there is a bug in the default BIOS that means that suspend to RAM is broken in GNU/Linux.

You can get updated BIOS at the Lenovo’s ThinkPad X1 Carbon (Type 20A7, 20A8) Drivers and software page by looking in the the “BIOS” section. Honestly, the easiest approach is probably to download the Windows BIOS Update utility (documentation is here) which you can use to run the BIOS update from within Windows before you install GNU/Linux.

If that’s not an option (e.g., if you’ve already installed GNU/Linux) the best method is to download the bootable CD ISO from the same page. Of course, since the X1 Carbon has no optical media, you have to find another way to boot the CD image. I struggled to get the ISO to boot from USB using the usually reliable dd method. This message suggest that the issue had to do with the El Torito wrapper:

“I had to dump the eltorito image from the ISO they provide, after that I was able to dd the resulting image to a flash drive and the bios update went well, no cdrom needed.”

I updated to version 1.13 of the BIOS which fixes the suspend/resume bug. By the time you read this, there may be newer versions that fix other things so check the Lenovo website.

Installing Debian

I installed Debian testing using the March 19, 2014 “Alpha 1” release of the Debian Installer for Jessie (currently testing). I installed in graphical mode. With the WQHD screen, everything was extremely tiny but it worked flawlessly.

I downloaded the amd64 net install image from the normal place and installed the rest of the system using the built-in Ethernet port which required no firmware or extra drivers. I did the normal dd if=FILENAME.iso of=/dev/sdX method of getting the installer onto the a USB stick to boot. I turned off restricted boot in BIOS first. In general, the latest version of the Debian installation guide is always a good source of guidance on installing Debian.

I used the Debian installer wizard to partition and selected “Use entire disk and partition it for LVM and encrypted data” which kept the UEFI partitions around. The system installed with no errors or issues and booted up normally afterward. The grub menu is hilariously narrow on the WQHD screen.

If you want to use the built-in wireless and/or Bluetooth, you will need to install the non-free iwlwifi firmware package. It is very lame that we still have to do this to use hardware we have purchased.

What Works and Doesn’t

The following stuff works the first time I booted into the GNOME 3 desktop and logged in:

  • The WQHD 2560×1440 screen
  • The touchscreen
  • Both the TrackPoint and the touchpad
  • Built-in e1000e Ethernet using the dongle
  • The keyboard plus the “adaptive” row of F1-F12 keys.
  • External monitor using the full HDMI or mini-DisplayPort connectors
  • Audio (both speakers and microphone)
  • The camera/webcam

The following stuff works if you install non-free firmware:

  • Internal Wireless
  • Bluetooth 4.0

The following stuff works with qualifications:

  • Suspend to RAM — Works once you have updated the firmware.
  • The adaptive keyboard — The F1-F12 keys work but the “button” that theoretically lets you switch to different sets of function buttons (e.g., volume, brightness) does nothing.
  • Disabling the touchpad — There is a BIOS option to disable the touchpad. It works in Windows and does nothing at all in GNU/Linux.

I have not tried:

  • The fingerprint reader

Disabling the touchpad

As a long-term ThinkPad user, I love the TrackPoint pointing stick. If you plan on using this, the built-in touchpad is incredibly aggravating because it is very easy to brush against it while using the TrackPoint.

In BIOS, there is an option to disable the touchpad. Although this works in Windows, it does absolutely nothing in GNU/Linux. Part of the issue is that, unlike the older X1 Carbon and other ThinkPads, there are no TrackPoint buttons. Instead of buttons, there are regions at the top of the touchpad which are configured, in software, to act like buttons. If you want to be able to click, the touchpad can never be truly turned off.

This is not problem unique to the Haswell X1 Carbon and a number of people have been struggling with this issue on other Lenovo laptops. Essentially, what you need to do is configure your touchpad so that the buttons are where you want them and so that it ignores any input for the purposes of cursor movement.

There are a few ways of doing this but this answer from an askubuntu.com question has the solution I ended up using:

Open file /etc/X11/xorg.conf.

Add a section “InputClass” with identifier “Default clickpad buttons”.

Create an option for SoftButtonAreas to values 70% 0 1 42% 36% 70% 1 42%, this is size of the right and middle button.

Enable option AreaBottomEdge and change value to 1, this will disable touchpad movement.

If everything done right, your class should looks like:

Section "InputClass"
     Identifier "Default clickpad buttons"
     MatchDriver "synaptics"
     Option "SoftButtonAreas" "70% 0 1 42% 36% 70% 1 42%"
     Option "AreaBottomEdge" "1"
EndSection

Essentially, the first Option line will create a middle button that is 36% of the width and 42% of the height, and a right button that is 34% of the width and 42% of the height. The synaptics manpage (man synpatics) will give you more detail on the general way this works.

Fixing the Adaptive Keyboard

The most wild feature of the laptop is the adaptive keyboard strip. The strip is a back-lit LCD that looks almost like E Ink screen and acts as a touchscreen keyboard. The default mode gives you the F1-F12 keys. If you “press” the keys (since they aren’t buttons, you just put your finger on top of them) they act like normal F-keys. You can Ctrl-Alt-F1, etc., to switch to virtual terminals out of the box. There are four modes: “Function” (i.e., normal F-keys), Home, Web, and Chat. The last three overlap quite a bit (e.g., they all have brightness and volume). You can play with an example on the Lenovo homepage.

In Windows, switching programs will apparently change these “keys” so that an appropriate set of buttons is shown for the application you are using. You can also change these keys manually with a big “Fn” button at the far left of the adaptive keyboard strip.

As I write this this, released kernels do not support the adaptive keyboard Fn button which means you cannot use anything other than the F-keys out of the box. I believe it also means that resuming from suspend to RAM breaks these keys.

That said, Shuduo Sang from Canonical has released several versions of a patch to to the thinkpad_acpi kernel module which adds support for the Home mode. The other modes (web and chat) do not seem to be supported. The latest version of the patch is on on the Linux Kernel Mailing List and the relevant commits are:

330947b save and restore adaptive keyboard mode for suspend and,resume
3a9d20b support Thinkpad X1 Carbon 2nd generation's adaptive keyboard

Although this is not supported in Debian testing at the time of writing, a bug was filed in Debian and quickly fixed by Ben Hutchings in Debian kernel version 3.14.2-1 which is currently in sid/unstable. As a result, if you install the latest version kernel from Debian unstable (3.14.2-1 or later), the adaptive keyboard just works.

If you aren’t using Debian and if kernel you are using does not have support, you might be patching your kernel.

General Impressions

As I have described in my interview with The Setup, I have been a user of ThinkPad X-series laptops for many years. This is my sixth X-series ThinkPad.

Overall, I quite like the hardware! Once things mature a little bit, I think that this will be a great laptop for running GNU/Linux. That said, I ordered the laptop without realizing that the X1 Carbon had gone through a major revision! The keyboard was quite a suprise. I think that changing a system so radically without changing the model name/number is a very bad move on Lenovo’s part.

There are two remaining issues with the system I’m still struggling with: (1) the keyboard layout is freaky and weird, and (2) the super high resolution screen breaks many things.

The quality of the keyboard itself is great and worthy of the ThinkPad name. That said, there are two ways in which it is strange. The first is the adaptive keyboard strip. Overall, it works surprisingly well and I think it is a clever idea. My sense is that the strip is more annoying in Windows because it changes out from under you all the time. In GNU/Linux, only manual changing of modes is supported. This, in my opinion, is a feature. I do miss the real feedback you get from pressing keys but for F-keys and volume-keys that I don’t use often this isn’t too important. On the downside, I have realized several times that I had been holding down a “button” for several seconds and not noticed.

The more annoying issue with the keyboard is the way that the other keys have moved around. Getting rid of the CapsLock is wonderful! How has this taken so long? Replacing it with a split Home and End keys is nuts. I’ve remapped the Home and End to put Control back where it should be. My right Control to now Home but I still don’t have an End key. The split Backspace and Delete is not a problem for me. The tilde/apostrophe is in a very bad place. There is no Insert, Print Screen/SysRq, Scroll Lock, Pause/Break or NumLock. They are all just gone. Surprisingly, I haven’t missed any of them.

The second issue is the 2560×1440 resolution on the 14 inch screen. I use a 27 inch external monitor with the same native resolution laptop but, by my arithmetic, the pixel density on the laptop is 210 DPI instead 109 DPI on the external monitor. The result is “the scaling problem” and it’s a huge pain that seems mostly unsolved on any operating system.

Fonts and widgets that look good on the laptop look huge on my external monitor. Stuff that looks good on my external monitor looks minuscule on the laptop. I routinely move windows between my laptop screen and my large monitor. Until I find a display system that can handle this kind of scaling effectively, this requires changing font size and zooming all the time. At the moment, I’m shrinking and expanding my font size using the built in hot keys in Emacs, Gnome Terminal, and Firefox/Iceweasel. I love the high resolution screen but the current situation is crazy-making.

Finally, this setup will not get you into the Church of Emacs and it’s not about to find its way onto the FSF’s list of endorsed hardware. For one, I paid the Windows tax. Beyond that, there is the non-free BIOS and the need for non-free firmware to use the wireless and Bluetooth. This is standard for ThinkPads but it isn’t getting any easier to swallow. There are alternatives in the form of Gluglug’s X60 laptops running CoreBoot, Lemote Yeelong laptops, Bunnie Huang’s Novena and others that are better in these regards. I am very excited for these projects but, for a number of reasons, these just weren’t an option for the laptop I use for my research computing.

Update: I’ve changed he configuration option for the synaptics touchpad to match what I’m now actually doing.

Community Data Science Workshops in Seattle

Photo from the Boston Python Workshop – a similar workshop run in Boston that has inspired and provided a template for the CDSW.
Photo from the Boston Python Workshop – a similar workshop run in Boston that has inspired and provided a template for the CDSW.

On three Saturdays in April and May, I will be helping run three day-long project-based workshops at the University of Washington in Seattle. The workshops are for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like Wikipedia, Twitter, free  and open source software, and civic media.

The workshops are for people with no previous programming experience and the goal is to bring together researchers as well as participants and leaders in online communities.  The workshops will all be free of charge and open to the public given availability of space.

Our goal is that, after the three workshops, participants will be able to use data to produce numbers, hypothesis tests, tables, and graphical visualizations to answer questions like:

  • Are new contributors to an article in Wikipedia sticking around longer or contributing more than people who joined last year?
  • Who are the most active or influential users of a particular Twitter hashtag?
  • Are people who participated in a Wikipedia outreach event staying involved? How do they compare to people that joined the project outside of the event?

If you are interested in participating, fill out our registration form here. The deadline to register is Wednesday March 26th.  We will let participants know if we have room for them by Saturday March 29th. Space is limited and will depend on how many mentors we can recruit for the sessions.

If you already have experience with Python, please consider helping out at the sessions as a mentor. Being a mentor will involve working with participants and talking them through the challenges they encounter in programming. No special preparation is required.  If you’re interested,  send me an email.

“When Free Software Isn’t Better” Talk

In late October, the FSF posted this video of a talk called When Free Software Isn’t (Practically) Better that I gave at LibrePlanet earlier in the year. I noticed it was public when, out of the blue, I started getting both a bunch of positive feedback about the talk as well as many people pointing out that my slides (which were rather important) were not visible in the video!

Finally, I’ve managed to edit together a version that includes the slides and posted it online and on Youtube.

The talk is very roughly based on this 2010 article and I argue that, despite our advocacy, free software isn’t always (or even often) better in practical terms. The talk moves beyond the article and tries to be more constructive by pointing to a series of inherent practical benefits grounded in software freedom principles and practice.

Most important to me though, the talk reflects my first serious attempt to bring together some of the findings from my day job as a social scientist with my work as a free software advocate. I present some nuggets from my own research and talk about about what they mean for free software and its advocates.

In related news, it also seems worth noting that I’m planning on being back at LibrePlanet this March and that the FSF annual fundraiser is currently going on.

Iceowl’s Awesome New Icon

If you’re a Debian user, you are probably already familiar with some of the awesome icons for IceWeasel (rebranded Mozilla Firefox), IceDove (rebranded Mozilla Thunderbird) and IceApe (rebranded Mozilla SeaMonkey).

iceweasel_icon-200pxicedove_icon-200px    iceape_icon-200px

I was pretty ambivalent about the decision to rebrand Firefox until I saw some of proposed the IceWeasel icons which — in my humble opinion — were just too cute, and awesome, to pass up.

iceweasel-old

Until very recently however, IceOwl (rebranded Mozilla Sundbird) had no such awesome icon. Quite a while ago, I filed bug #658664 in Debian complaining that “iceowl does not include awesome icy owl icons.” I wrote:

I was extremely disappointed when I installed Iceowl and discovered that it does not ship with an awesome logo or icons showing a picture of an “IceOwl.” Instead, it seems to be represented by picture of a (boring) paper calendar which is very generic and not awesome at all.

IceWeasel, IceDove, and IceApe each include extremely awesome logos/icons that have really cool looking white illustrations of “icy” weasels, doves, and apes. IceOwl needs a similarly awesome logo to use as its icon.

This bug seems particularly egregious because owls actually live in icy climates and come in white versions! For example:

https://commons.wikimedia.org/wiki/File:Snowy_Owl_-_Schnee-Eule.jpg

While illustrators need to imagine what an “ice ape” or “ice weasel” might look like, there is no such need for imagination in the case of an ice owl!

As far as I’m concerned, this bug should be release critical. Hopefully, someone will upload a patch quickly!

Finally, after many months of all of us suffering in silence, Nick Morrott came along and fixed the bug with the creation of this new, incredibly awesome, icy owl logo!

iceowl_icon-350px

Resurrecting Debian Seattle

seattle_skyline_night      debian_logo

When I last lived in Seattle, nearly a decade ago, I hosted the “Debian Seattle Social” email list. When I left the city, the mailing list eventually fell victim to bitrot.

When Allison Randall asked me about the list a couple months ago, I decided that moving back to Seattle was a good excuse to work with Allison and some others to revive the community. Toward that end, I’ve put up a little website and created a new mailing list. It’s hosted on Alioth this time which will be reliable than me. Since it has been years, we have not moved over the old subscriber list so you’ll have to sign up again if you were on it before.

If you’re a Debian developer or user and you’d like to hear about infrequent Debian social gatherings in the Seattle area, you should sign up on the list!

Conversation on Freedom and Openness in Learning

On Monday, I was a visitor and guest speaker in a session on “Open Learning” in a class on Learning Creative Learning which aims to offer “a course for designers, technologists, and educators.” The class is being offered publicly by the combination — surprising but very close to my heart — of Peer 2 Peer University and the MIT Media Lab.

The hour-long session was facilitated by Philipp Schmidt and was mostly structured around a conversation with Audrey Watters and myself. The rest of the course materials and other video lectures are on the course website.

You can watch the video on YouTube or below. I thought it was a thought-provoking conversation!

If you’re interested in alternative approaches to learning and free software philosophy, I would also urge you to check out an essay I wrote in 2002: The Geek Shall Inherit the Earth: My Story of Unlearning. Keep in mind that the essay is probably the most personal thing I have ever published and I wrote it more than a decade ago it as a twenty-one year old undergraduate at Hampshire College. Although I’ve grown and learned enormously in the last ten years, and although I would not write the same document today, I am still proud of it.

Goodbye PyBlosxom, Hello WordPress

Since 2004, I’ve used the blogging software PyBlosxom. Over that time, the software has served me well and I have even written a series of patches and plugins.

PyBloxsom is blog software designed for hackers. It assumes you already have a text editor you love and relies on features of a POSIX filesystem instead of a relational database. It’s designed so you can keep your blog under revision control (since 2004 I’ve used GNU Arch, baz, bzr and now git). It is also hackers’ software in the sense that you should expect to write code to use it (e.g., the configuration is pure Python). I love it.

What PyBlosxom does not have is a large community. This summer, the most recent long-time maintainer of the project, Will Kahn-Greene, stepped down. Although the project eventually found a new maintainer, the reality is the project entered maintenance mode years ago and many features available in more popular blogging platforms are unlikely to make it PyBlosxom. The situation with comment spam is particularly dire. I’ve written several antispam plugins but time has shown that I don’t have the either the expertise or the time to make them as awesome as they need to be to really work in today’s web.

So, after many months of weighing, waffling, and planning I’ve switched to WordPress — a great piece of free software with an enormous and established community As you’ll know if you’ve read my interview on The Setup, I think a lot about the technology I surround myself with. I considered WordPress when I started my blog back in 2004 and rejected it soundly. Eight years ago, I would have laughed at you if you told me I’d be using it today; If PyBlosxom is for hackers, WordPress is designed for everyone else. But the way I evaluate software has changed over that period.

In the nineties, I used to download every new version of the Linux kernel to compile it — it took hours! — to try out the latest features. Configurabilty, hackability, and the ability to write my own features was — after a point — more important than the features the software came with. Today, I’m much more aware of the fact that for all the freedom that my software gives me, I simply do not have the time, energy, or inclination to take advantage of that freedom to hack very often.

Today, I give much more value to software that is not just free, but that is maintained by a community of people who can do all the work that I would do if I had unlimited time. Although I don’t have the time or experience to make WordPress do everything I would like, the collective of all WordPress users does. And they’ve done a lot of it already!

The flip side matters as well: Today, I give more value to other people using my software. When WordPress doesn’t do something and I write a plugin or patch, there are tons of people ready to pick it up and use it and perhaps even to collaborate on it with me. Want to guess how many patches my PyBlosxom plugins have received? None, if my memory serves me.

In the past, I’ve written about how free software is a victory even when it doesn’t build a community. I still believe that. But the large communities at the heart of the most successful free software communities (the promise of “open source”) are deeply important in a way that I increasingly value.

In that spirit: If you want to make the jump from PyBlosxom to WordPress, I’ve shared a Git repository with the scripts I used and wrote for the transition.

Freedom for Users, Not for Software

I finally published a short essay I wrote about a year ago: Freedom for Users, Not for Software.

Anybody who has hung around the free software community for a while will be familiar with the confusion created by the ambiguity between "free as in price" versus "free as freedom." In the essay I argue that there is a less appreciated semantic ambiguity that arises when we begin to think that what matters is that software is free. Software doesn’t need freedom, of course; Users of software need freedom. My essay looks at how the focus on free software, as opposed to on free users, has created challenges and divisions in the free software movement.

The essay was recently published in Wealth of the Commons: A World Beyond Market and State, a book edited by David Bollier and Silke Helfrich and published by Levellers Press. The book includes essays by 73 authors that include some other folks from the free software and free culture communities along with a ton of people working on very different types of commons.

My essay is short and has two parts: The first is basically a short introduction to free software movement. The second lays out what I see as major challenges for free software. I will point out that these are some of the areas that I am working most closely with the FSF — who are having their annual fundraiser at the moment — to support and build advocacy programs around.