Advancing A Definition of Free Culture

(C) Copyright 2007 -- Benjamin Mako Hill
Distributed under the terms of the CC BY-SA License

Presentation at DebConf 7

Introduction

This will be less a single talk than a series of tons of little lightning talks.

Please interrupt me in the middle, as I don't think there will be much room for questions at the end.

For each area, I'm going to give:

Note

3 minutes

Software Engineering Research

There are researchers, mostly academic, who are interested in how software gets produced.

Many work in business schools and focus on information technology management.

Many study Debian.

Note

4 minutes

Universidad Rey Juan Carlos de Madrid

Jesús M. González-Barahona et al. at the Universidad Rey Juan Carlos de Madrid have put together a bunch of research, with papers including:

  • Measuring Libre Software Using Debian 3.1 (Sarge) as a Case Study: Preliminary Results
  • Counting potatoes: The size of Debian 2.2

I won't spend too much time on it because Javier talked about this in depth last Sunday.

The major question is one of gathering statistics and, more importantly, of how big and economically important free software is.

David Wheeler wrote a paper looking at Red Hat in which he concluded it was "a gigabuck". This group at Universidad Rey Juan Carlos has been demonstrating the much larger community in Debian and has also produced a growing amount of information about how software is produced.

Research is primarily descriptive.

The swelling size of Debian is good for something. :)

Note

7 minutes

ETH Zurich

Sampling in Open Source Software Development: The case for using the Debian GNU/Linux Distribution by Sebastian Spaeth, Matthias Stuermer, Stefan Haefliger, Georg von Krogh (ETH Zurich)

Similar to the previous argument, Spaeth et al asked:

How can we do useful samples of the free software community?

The usual answer is SourceForge (70,000 projects) which has lots of problems:

  • Most projects have failed;
  • Very incomplete;
  • Contains few of the biggest or most important projects;

They are primarily interested in "reuse" of components and the way that reuse might work.

We argue that a GNU/Linux distribution, such as Debian, is better suited for the sampling of projects because it avoids biases and contains unique information only available in an integrated environment. Especially research on the reuse of components can build on dependency information inherent in the Debian GNU/Linux packaging system.
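As a rough illustration of how dependency information could feed a reuse study, here is a minimal sketch that counts how many packages declare a dependency on each other package. It is not the method used by Spaeth et al. The package names and the SAMPLE text are hypothetical, written in the style of a Debian package index, and the parser is deliberately simplified: it ignores continuation lines and architecture qualifiers, and counts every alternative in an "a | b" clause.

```python
import re
from collections import Counter

# Hypothetical sample in the style of a Debian package index.
# These package names are invented for illustration only.
SAMPLE = """\
Package: app-one
Depends: libfoo1 (>= 1.2), libbar2 | libbar1

Package: app-two
Depends: libfoo1, libbaz0

Package: libfoo1
Depends: libc6 (>= 2.3)
"""


def parse_stanzas(text):
    """Yield one {field: value} dict per blank-line-separated stanza."""
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        yield fields


def dependency_names(depends):
    """Drop version constraints and split alternatives into bare names."""
    names = set()
    for clause in depends.split(","):
        for alternative in clause.split("|"):
            name = re.sub(r"\(.*?\)", "", alternative).strip()
            if name:
                names.add(name)
    return names


def reuse_counts(text):
    """Count, for each package name, how many stanzas depend on it."""
    counts = Counter()
    for stanza in parse_stanzas(text):
        for name in dependency_names(stanza.get("Depends", "")):
            counts[name] += 1
    return counts


counts = reuse_counts(SAMPLE)
for name, n in counts.most_common():
    print(name, n)
```

A package that many others depend on is a candidate "reused component"; a real study would read the archive's actual index files and decide how to treat alternatives and virtual packages.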

Debian is useful because:

  • It assembles a comprehensive, usually complete list;
  • It contains programs that are actually used, since at least one person did the work to make each one compatible with a version of the distribution and include it in the repository;
  • Rules and guidelines ensure a standard for adding software;
  • Packages in one version of the distribution are designed to be compatible;
  • Sections and tags provide a bunch of useful metadata;
  • Maintainers provide a contact person;
  • Only 58% of 157 randomly selected Debian libraries were listed on freshmeat, and only 39% were hosted on a collaboration platform;
  • It does not include unsuccessful projects or Windows projects;

Note

10 minutes

Martin Michlmayr

Martin Michlmayr (former DPL) has recently finished a thesis that used Debian as one of seven case studies:

Quality Improvement in Volunteer Free and Open Source Software Projects: Exploring the Impact of Release Management

Major questions asked:

Can volunteer teams, with their high volatility regarding project collaborators ensure consistent levels of quality in their output?

The dissertation:

  • Begins with a discussion of quality management;
  • Ends up with a longer discussion of release management in particular and does this with 7 in-depth case studies;

Methodologically, the dissertation focuses primarily on qualitative data (e.g., interviews, "following projects over two years", mailing lists, documents, direct observation).

Results and conclusions:

Argues strongly for time-based releases, not feature-based releases:

This dissertation has shown that feature based release management in FOSS projects is often associated with lack of planning, which leads to problems, such as delays and low levels of quality.

Especially important for volunteers:

Time based releases are associated with two factors that act as important coordination mechanisms:

  1. Regularity: the production of releases according to a specific interval allows projects to create regular reference points which show contributors what kind of changes other members of the project have made. Regularity also contributes to familiarity with the release process, and it leads to more disciplined processes.
  2. Schedules: by using time rather than features as the orientation for a release, planning becomes possible in voluntary projects. Time based projects can create schedules which describe important deadlines and which contain dependency information between different work items and actors.

Note

14 minutes

Quality and Reliance on Volunteers

Quality and the Reliance on Individuals in Free Software Projects

Martin Michlmayr and Benjamin Mako Hill

What are the problems and solutions associated with voluntary labor on free software projects?

Problems with single maintainership: people get busy, etc. NMU becomes stigmatized.

Problems with group maintainerships: lots of commits but no uploads.

Solutions: Situations like the current uploader situation.

Note

17 minutes

Martin Krafft

Will speak on his own research.

Note

20 minutes

Robles et al.

Quantitative analysis of volunteers working in the Debian project...

They asked a variety of questions. I'll handle them one by one:

Number of maintainers: Growing by ~35% each year.

Team maintainership: From 14 (1.3%) to 600 (7.4%) packages in 6.5 years. Also impressive even when not including the Debian QA team.

Tracking individual maintainers: There were 216 contributors to 2.0. In 2004, only 121 (55%) remained. A rough half-life of 7.5 years seems to persist. People still involved maintain as many or even more packages. Burnout seems to be an on or off thing.

Maintainers who leave: More than 60% of packages are adopted. If a package falls out of a release, it is very unlikely to be added again. How to build a maturity model and determine which packages might fall apart is not clear.

Experience and importance: More experienced maintainers tend to maintain software that is more important. New maintainers tend to maintain less widely used software.
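The half-life figure under "Tracking individual maintainers" can be sanity-checked with a little arithmetic. This is a minimal sketch, not the authors' method: it assumes attrition is exponential (a constant fraction leaves each year) and it assumes an observation window of 6.5 years, borrowed from the span quoted for team maintainership. Neither assumption is stated in the notes.

```python
import math


def half_life(initial, remaining, years_elapsed):
    """Half-life in years, assuming exponential (constant-rate) attrition."""
    fraction = remaining / initial
    return years_elapsed * math.log(2) / math.log(1 / fraction)


# Figures quoted in the notes: 216 maintainers in 2.0, 121 remaining in 2004.
INITIAL = 216
REMAINING = 121

# Assumption: 6.5 years between the two measurements (not stated for this
# figure; taken from the team maintainership span).
YEARS = 6.5

retained = REMAINING / INITIAL
print(f"retained fraction: {retained:.3f}")
print(f"half-life from raw counts: {half_life(INITIAL, REMAINING, YEARS):.1f} years")
print(f"half-life from the rounded 55%: {half_life(100, 55, YEARS):.1f} years")
```

Under these assumptions the rounded 55% figure gives a half-life close to the quoted 7.5 years, and the raw counts give a slightly higher value.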

Other results:

Debian is not adding maintainers as fast as it adds packages. The ratio of packages to maintainers is increasing, which presents a potential problem.

Note

24 minutes

Social Scientists

Siobhan

O'Mahony, S., Ferraro, F. (2004). Managing the Boundary of an “Open” Project. (March 30, 2004). Harvard NOM Working Paper No. 03-60.

In-depth analysis of the creation of the Debian new maintainer process.

They looked closely at the factors that increased the likelihood of becoming an AM, and the likelihood that someone who started, or expressed interest in starting, the process would become a developer.

The greatest predictor they found for someone becoming an AM was the number of connections within the network (having five signatures increased the likelihood by 65%). Tenure had a negative effect. New members were more interested in NM and AM related issues.

In other words, new, well-connected members formed the NM committee and made being well connected a new requirement for participation, encouraging people to be at least as connected as they were. The cycle repeats.

The process became increasingly strict.

Note

27 minutes

Biella

Will speak on her own research.

Note

31 minutes

Lawyers

GFDL

Background:

  • Debian-Legal criticized the GFDL as non-free, singling out the use of invariant sections in particular.
  • FSF-Debian committee created with myself and Don Armstrong.
  • The new draft of the GFDL does not include invariant sections and fixes most of the other issues. The result seems to be documents that will be DFSG-free.

Note

34 minutes

CC 3.0

Evan Prodromou wrote up a summary of a long discussion on CC.

  1. Limit scope of requests to remove references.
  2. Waive attribution after request to remove references.
  3. Allow access-controlled private distribution.
  4. Allow distribution of rights-restricted copies of works if unrestricted copies are also made available.
  5. Require "credit for comparable authorship" rather than "comparable authorship credit".
  6. Specify "other credit."
  7. More clearly identify non-license trademark restrictions.
  8. Rephrase overreaching trademark restrictions.

They approached us and asked us to be part of the process.

The result: we got almost everything we wanted except (4), (7), and (8), which we ultimately decided were not real freedom issues.

Note

37 minutes

GPLv3

Don, Branden, Greg Pomerantz, and others were explicitly brought into the committee process, probably more people than from any other single project (except GNU).

This is precisely because Debian has built up a reputation for a strong ability to engage in licensing critique and a strong commitment to software freedom.

The FSF might not always agree but they have a deep respect for the Debian approach to these issues.

AGPLv3

Debian called the AGPLv2 non-free because it involved barriers to modification. Through this critique and other decisions, a new version has been released that does not include barriers to modification at all.

Note

40 minutes

Trademark Policies

Debian and SPI have done work and thinking on an open use trademark license. While we don't know what we're doing, we're respected.

Note

40 minutes

Derivers

When people go to create a distribution, it is usually based on Debian. There will be a derivers workshop tomorrow that talks about these things.

Summary

Debian seems important as:

Note

45 minutes