The Cost of Collaboration for Code and Art

This post was written with Andrés Monroy-Hernández for the Follow the Crowd Research Blog. The post is a summary of a paper forthcoming in Computer-Supported Cooperative Work 2013. You can also read the full paper: The Cost of Collaboration for Code and Art: Evidence from Remixing. It is part of a series of papers I have written with Monroy-Hernández using data from Scratch. You can find the others on my academic website.

Does collaboration result in higher quality creative works than individuals working alone? Is working in groups better for functional works like code than for creative works like art? Although these questions lie at the heart of conversations about collaborative production on the Internet and peer production, it can be hard to find research settings where you can compare across both individual and group work and across both code and art. We set out to tackle these questions in the context of a very large remixing community.

Example of a remix in the Scratch online community, and the project it is based off. The orange arrows indicate pieces which were present in the original and reused in the remix.

Remixing platforms provide an ideal setting to answer these questions. Most support the sharing, and collaborative rating, of both individually and collaboratively authored creative works. They also frequently combine code with artistic media like sound and graphics.

We know that increased collaboration often leads to higher quality products. For example, studies of Wikipedia have suggested that vandalism is detected and removed within minutes, and that high quality articles in Wikipedia, by several measures, tend to be produced by more collaboration. That said, we also know that collaborative work is not always better: for example, brainstorming produces fewer good ideas when done in groups. We attempt to answer this broad question, asked many times before, in the context of remixing: Which is the better description, “the wisdom of crowds” or “too many cooks spoil the broth”? That, fundamentally, forms our paper’s first research question: Are remixes, on average, higher quality than single-authored works?

A number of critics of peer production, and some fans, have suggested that mass collaboration on the Internet might work much better for certain kinds of works. The argument is that free software and Wikipedia can be built by a crowd because they are functional. But more creative works (like music, a novel, or a drawing) might benefit less from, or even be hurt by, participation by a crowd. Our second research question tries to get at this possibility: Are code-intensive remixes higher quality than media-intensive remixes?

We try to answer these questions using a detailed dataset from Scratch, a large online remixing community where young people build, share, and collaborate on interactive animations and video games. The community was built to support users of the Scratch programming environment: a desktop application with functionality similar to Flash created by the Lifelong Kindergarten Group at the MIT Media Lab. Scratch is designed to allow users to build projects by integrating images, music, sound and other media with programming code. Scratch is used by more than a million, mostly young, users.

Measuring quality is tricky and we acknowledge that there are many ways to do it. In the paper, we rely most heavily on a measure of peer ratings in Scratch called loveits, which are very similar to “likes” on Facebook. We find similar results with several other metrics and we control for the number of views a project receives.

In answering our first research question, we find that remixes are, on average, rated as being of lower quality than works of single authorship. This finding was surprising to us but holds up across a number of alternative tests and robustness checks.

In answering our second question, we find rough support for the common wisdom that remixing tends to be more effective for functional works than for artistic media. The more code-intensive a project is, on average, the smaller the gap is between a remix and a work of single authorship. But the more media-intensive a project is, the bigger the gap. You can see the relationships that our model predicts in the graph below.

Two plots of estimated values for prototypical projects showing the predicted number of loveits using our estimates. In the left panel, the x-axis varies the number of blocks while holding media intensity at the sample median. The right panel varies the number of media elements while holding the number of blocks at the sample median. Ranges for each are from 0 to the 90th percentile.

Both of us are supporters and advocates of remixing. As a result, we were initially a little troubled by our result in this paper. We think the finding suggests an important limit to the broadest claims of the benefit of collaboration in remixing and peer production.

That said, we also reject the blind repetition of the mantra that collaboration is always better — for every definition of “better,” and for every type of work. We think it’s crucial to learn and understand the limitations and challenges associated with remixing and we’re optimistic that this work can influence the design of social media and collaboration systems to help remixing and peer production thrive.

For more, see our full paper, The Cost of Collaboration for Code and Art: Evidence from Remixing.

Heading West

University of Washington Quad in Cherry Blossom Season

This week, I accepted a job on the faculty at the University of Washington Department of Communication. I’ve arranged for a post-doc during the 2013-2014 academic year which I will spend at UW as an Acting Assistant Professor. I’ll start the tenure-track Assistant Professor position in September 2014. The hire is part of a "big data" push across UW. I will be setting up a lab and research projects, as well as easing into a teaching program, over the next couple years.

I’m not going to try to list all the great people in the department, but UW Communication has an incredible faculty with a strong background in studying the effect of communication technology on society, looking at political communication, engagement, and collective action, and tracing out the implications of new communication technologies, in addition to very strong work in other areas. Years ago, I nearly joined the department as a graduate student. I am unbelievably happy that their faculty has invited me to join as a colleague.

Outside of my new department, the University of Washington has a superb group of folks working across the school on issues of quantitative and computational social science, human-computer interaction, and computer-supported cooperative work. They are hiring a whole bunch of folks, across the university, who specialize in data-driven social science. I already have a bunch of relationships with UW faculty and students and am looking forward to expanding and deepening those.

On a personal level, Mika and I are also very excited to return to Seattle. I grew up in the city and I’ve missed it, deeply, since I left — now nearly half my lifetime ago! It will be wonderful to be much closer to many of my family members.

But I know that I will miss the community of friends and colleagues that I’ve built in Boston over the last 7+ years just as deeply. I’m going to miss the intellectual resources, and the intellectual community, that folks in Cambridge get to take for granted. That said, I plan to maintain affiliations and collaborations with folks at Harvard and MIT and will have resources that let me spend time in Boston doing that.

If you are curious what I’m going to be up to — and what the future is likely to hold in terms of my research — you should check the material I’ve put online as part of the job market this year. I’ve posted just about everything on my academic website. This includes a little four page research statement which describes the work I’ve done and the directions I’ve been thinking about taking it.

The academic job market is challenging and confusing. But it’s given me a lot of opportunity to reflect, at length, on both the substance of my research and the academy and its structures and processes. I’ve got a list of blog topics queued up based on that thinking. I’ll be posting them here on my blog over the next few months.

Cultivated Disinterest in Professional Sports

Like many of my friends, I have treated professional sports with cultivated indifference. But a year and a half ago, I decided to become a football fan.

Several years ago, I was at a talk by Michael Albert at MIT where he chastised American intellectuals for what he claimed was cultivated disdain of professional sports. Albert suggested that sports are the go-to topic for small talk and building rapport across class and context. But he suggested that almost everybody who used the term "working class struggle" was incapable of making small talk with members of the working class because, unlike most working class people (and most people in general), educated people systematically cultivate ignorance of sports.

Professional sports are deeply popular. In the US, Sunday Night Football is now the most popular television show among women in its time slot and the third most popular television show in America among 18-49 year old women. That it is also the most popular television show in general is old news. There are very few things that anywhere near half of Americans have in common. Interest in football is one of them. An enormous proportion of the US population watches the Super Bowl each year.

I recognized myself in Albert’s critique. So I decided to follow a local team. I picked football because it is the most popular sport in America and because its strong revenue sharing system means that either team has a chance to win any given match. My local team is the New England Patriots and I’ve watched many of the team’s games or highlights over the last season and a half. I’ve also followed a couple football blogs.

A year and a half in, I can call myself a football fan. And I’ve learned a few things in the process:

  1. With a little effort, getting into sports is easy. Although learning the rules of a sport can be complicated, sports are popular because people, in general, find them fun to watch. If you watch a few games with someone who can explain the rules, and if you begin to cheer for a team, you will find yourself getting emotionally invested and excited.
  2. Sports really do, as Albert implied, allow one to build rapport and small talk across society. I used to dread the local cab driver who would try to make small talk by mentioning Tom Brady or the Red Sox. No more! Some of these conversations turn into broader conversations about life and politics.
  3. Interest in sports can expand or shrink to fill the time you’re willing to give it. It can mean just glancing through the sports sections of the paper and watching some highlights here or there. Or it can turn into a lifestyle.
  4. It’s not all great. Football, like most professional sports, is deeply permeated with advertisements, commercialism, and money. Like other sports, it is also violent. I don’t think I could ever get behind a fight sport where the goal is to hurt someone else. The machoness and absence of women in the highest levels of most professional sports bother me deeply.

I’ve also tried to think a lot about why I, like most of my friends, avoided sports in the past. Disinterest in sports among academics and the highly educated is, in my experience, far from passive. I’ve heard people almost compete to explain the depth of their ignorance of sports: one doesn’t even know the rules, one doesn’t own a television, one doesn’t know the first thing about the game. I did the same thing myself.

Bethany Bryson, a sociologist at JMU, has shown that increased education is associated with increased inclusiveness in musical taste (i.e., highly educated people like more types of music) but that these people are most likely to reject music that is highly favored by the least educated people. Her paper’s title sums up the attitude: "Anything But Heavy Metal". For highly educated folks, it’s a sign of cultivation to be eclectic in one’s tastes. But to signal to others that you belong in the intellectual elite, it can pay in cultural capital to dislike things, like sports, that are enormously popular among the least educated parts of society.

This ignorance among highly educated people limits our ability to communicate, bond, and build relationships across different segments of society. It limits our ability to engage in conversations and build a common culture that crosses our highly stratified and segmented societies. Sports are not politically or culturally unproblematic. But they provide an easy, and enjoyable, way to build common ground with our neighbors and fellow citizens that transcends social boundaries.

Time to Boot

Last weekend, my friend Andrés Monroy-Hernández pointed out something that I’ve been noticing as well. Although the last decade has seen a huge decrease in the time my laptop takes to boot, the same cannot be said for the increasingly powerful computer in my pocket that is my phone.

Graph showing increasing boot-times for phones and decreasing boot-times for laptops.

As the graph indicates, I think my cross-over was around 2010 when I acquired an SSD for my laptop.

Pregnant with Suspense

A couple days ago, I woke up to this exciting series of text messages from an unfamiliar phone number.

Text messages describing the birth of a child, a picture of a newborn, and a response at the end asking who it is and if it was a wrong number.

Because I’ve not received a reply in the last couple days, because it was a Seattle phone number but I haven’t lived in Seattle for years, and because I don’t know of anyone in Seattle who was about to give birth, I’m pretty confident that this was indeed a case of misdirected text messages!

But whoever you are: Congratulations! I know it was a mistake, but that really made my day!

Open Brands

In late July, the Awesome Foundations invited me to participate in an interesting conversation about open brands at their conference. Awesome is a young collection of organizations struggling with the question of whether, and how, they want to try to control who gets to call themselves Awesome. I was asked to talk about how the free software community approaches the issue.

Guidance from free software is surprisingly unclear. I have watched and participated in struggles over issues of branding in every successful free software project I’ve worked in. Many years ago, Greg Pomerantz and I wrote a draft trademark policy for the Debian distribution over a couple beers. Over the last year, I’ve been working with Debian Project Leader Stefano Zacchiroli and lawyers at the Software Freedom Law Center to help draft a trademark policy for the Debian project.

Through that process, I’ve come up with three principles which I think lead to clearer discussion about whether a free culture or free software project should register a trademark and, if they do, how they should think about licensing it. I’ve listed those principles below in order of importance.

1. We want people to use our brands. Conversations about trademarks seem to turn into an exercise in imagining all the horrible ways in which a brand might be misused. This is silly and wrong. It is worth being extremely clear on this point: Our problem is not that people will misuse our brands. Our problem is that not enough people will use them at all. The most important goal of a trademark policy should be to make legitimate use possible and easy.

We want people to make t-shirts with our logos. We want people to write books about our products. We want people to create user groups and hold conferences. We want people to use, talk about, and promote our projects both commercially and non-commercially.

Trademarks will limit the diffusion of our brand and, in that way, will hurt our projects. Sometimes, after carefully considering these drawbacks, we think the trade-off is worth making. And sometimes it is. However, projects are generally overly risk averse and, as a result, almost always err on the side of too much control. I am confident that free software and free culture projects’ desire to control their brands has done more damage than all brand misuse put together.

2. We want our projects to be able to evolve. The creation of a trademark puts legal power to control a brand in the hands of an individual, firm, or a non-profit. Although it might not seem like such a big deal, this power is, fundamentally, the ability to determine what a project is and is not. By doing this, it creates a single point of failure and a new position of authority and, in that process, limits projects’ ability to shift and grow organically over time.

I’ve heard that in US politics, there is no trademark for the terms Republican or Democrat and that you do not need permission to create an organization that claims to be part of either party. And that does not mean that everybody is confused. Through social and organizational structures, it is clear who is in, who is out, and who is on the fringes.

More importantly, this structure allows for new branches and groups outside of the orthodoxy to grow and develop on the margins. Both parties have been around since the nineteenth century, have swapped places on the political spectrum on a large number of issues, and have played host to major internal ideological disagreements. Almost any organization should aspire to such longevity, internal debate, and flexibility.

3. We should not confuse our communities. Although they are often abused, trademarks are fundamentally pro-consumer. The point of legally protected brands is to keep consumers from being confused as to the source of a product or service. Users might love software from the Debian project, or might hate it, but it’s nice for them to be able to know that they’re getting "Debian Quality" when they download a distribution.

Of course, legally protected trademarks aren’t the only way to ensure this. Domain names, internal policies, and laws against fraud and misrepresentation all serve this purpose as well. The Open Source Initiative applied for a trademark on the term open source and had their application rejected. The lack of a registered trademark has not kept folks from policing use of the term. Folks try to call their stuff "open source" when it is not and are kept in line by a community of folks who know better.

And since lawyers are rarely involved, it is hardly clear that a registered trademark would help in the vast majority of these situations. It is also the case that most free software/culture organizations lack the money, lawyers, or time to enforce trademarks in any case. Keeping your communities of users and developers clear on what is, and what isn’t, your product and your project is deeply important. But how we choose to do this is something we should never take for granted.

A Model of Free Software Success

Last week I helped organize the Open and User Innovation Conference at Harvard Business School. One of many interesting papers presented there was an essay on Institutional Change and Information Production by Fabio Landini from the University of Siena.

At the core of the paper is an economic model of the relationship between rights protection and technologies that affect the way that cognitive labor can be divided and aggregated. Although that may sound very abstract (and it is in the paper), it is basically a theory that tries to explain the growth of free software.

The old story about free software and free culture (at least among economists and many other academics) is that the movements surged to prominence over the last decade because improvements in communication technology made new forms of mass-collaboration — like GNU/Linux and Wikipedia — possible. "Possible", for these types of models, usually means profit-maximizing for rational, profit-seeking, actors like capitalist firms. You can basically think of these attempts as trying to explain why open source claims that free licensing leads to "better quality, higher reliability, more flexibility, lower cost" are correct: new technology makes possible an open development process which leads to collaboration which leads to higher quality work which leads to profit.

Landini suggests there are problems with this story. One problem is that it treats technology as being taken for granted and technological changes as effectively being dropped in from outside (i.e., exogenous). Landini points out that software businesses build an enormous amount of technology to help organize their work and to help themselves succeed in what they see as their ideal property rights regime. The key feature of Landini’s alternate model is that it considers this possibility. What comes out the other end of the model is a prediction for a multiple equilibrium system — a situation where there are several strategies that can be stable and profitable. This can help explain why, although free software has succeeded in some areas, its success has hardly been total and usually has not led to change within existing proprietary software firms. After all, there are still plenty of companies selling proprietary software. In Landini’s model, free is just one of several winning options.

But Landini’s model raises what might be an even bigger question. If free software can be as efficient as proprietary software, how would anybody ever find out? If all the successful software companies out there are doing proprietary software, which greedy capitalist is going to take the risk of seeing if they could also be successful by throwing exclusive rights out the window? In the early days, new paths are always unclear, unsure, and unproven.

Landini suggests that ethically motivated free software hackers provide what he calls a "cultural subsidy." Essentially, a few hackers are motivated enough by the ethical principles behind free software that they are willing to contribute to it even when it isn’t clearly better than proprietary alternatives. And in fact, historically speaking, many free software hackers were willing to contribute to free software even when they thought it was likely less profitable than the proprietary alternative models. As Landini suggests, this group was able to build technological platforms and find new social and business arrangements where the free model actually is competitive.

I think that the idea of a "cultural subsidy" is a nice way to think about the important role that ethical arguments play in movements like free software and free culture. "Open source" style efficiency arguments persuade a lot of people. Especially when they are true. But those arguments are only ever true because a group of ethically motivated people fought to find a way to make them true. Free software didn’t start out as competitive with proprietary software. It became so only because a bunch of ethically motivated hackers were willing to "subsidize" the movement with their failed, and successful, attempts at free software and free culture projects and businesses.

Of course, the folks attracted by "open source" style superiority arguments can find the ethically motivated folks shrill, off-putting, and annoying. The ethically motivated folks often think the "efficiency" group is shortsighted and mercenary. But as awkward as this marriage might be, it has some huge upsides. In Landini’s model, the ethical folks can build their better world without convincing everyone else that they are right and by relying, at least in part, on the self-interest of others who don’t share their principles. Just as the free software movement has done.

I think that Landini’s paper is a good description of the critically important role that the free software movement, and the FSF in particular, can play. The influence and importance of individuals motivated by principles can go far beyond the groups of people who take an ethical stand. They can make involvement possible for large groups of people who do not think that taking a stand on a particular ethical issue is even a good idea.

User Innovation on NPR Radio

I was invited onto NPR in Boston this week for a segment on user innovation alongside Eric von Hippel (my advisor at MIT) and Carliss Baldwin from Harvard Business School.

I talked about innovation that has happened on the CHDK platform (a cool firmware hack for Canon cameras that I use as an example in some of my teaching), plus a little bit about free software, the democratization of development and design tools, and a little bit about user communities that LEGO has cultivated.

I would have liked the conversation and terminology to do more to emphasize user freedom and free software, but I’m otherwise pretty happy with the result. The segment will be aired again on NPR in Boston this weekend and is available on the WGBH website.

The Global Iron Blogger Network

Since last November, I’ve been participating in and coordinating Iron Blogger: a drinking club where you pay $5 to a "beer" pool if you fail to blog weekly.
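The rule is simple enough to sketch in a few lines of Python. This is only an illustration and not the actual Iron Blogger software: the function name `fines_owed`, the `WEEKLY_FINE` constant, and the idea of counting weeks as consecutive seven-day windows from a start date are all assumptions made for the example.

```python
from datetime import date, timedelta

WEEKLY_FINE = 5  # dollars paid into the pool for each week without a post


def fines_owed(post_dates, start, weeks):
    """Return the dollars owed over `weeks` consecutive 7-day windows from `start`."""
    missed = 0
    for i in range(weeks):
        week_start = start + timedelta(weeks=i)
        week_end = week_start + timedelta(weeks=1)
        # A week counts as missed if no post falls inside [week_start, week_end).
        if not any(week_start <= d < week_end for d in post_dates):
            missed += 1
    return missed * WEEKLY_FINE


# Hypothetical blogger who posts in the first and third week but skips the second.
posts = [date(2012, 7, 2), date(2012, 7, 18)]
owed = fines_owed(posts, date(2012, 7, 1), 3)
print(owed)
```

How a real instance decides where a week begins, or what counts as a post, is a policy choice for each club.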

The revival of Iron Blogger in Boston has been a big success. Even more exciting, however, is that the Iron Blogger concept has spread. There are now two other Iron Blogger instances: in San Francisco coordinated by Parker Higgens, and in Berlin run by Nicole Ebber and Michelle Thorne.

Yesterday, we convened a virtual meeting of the Global Iron Blogger Council (i.e., an email thread) and we all agreed on a new Iron Blogger rule that might sweeten the deal for jet-setting prospective Iron Bloggers: any paid-up member of any Iron Blogger club can attend meet-ups in any other Iron Blogger cities if they happen to be in town for one. Because We Are One.

If you want to join us in Boston, we have some room through attrition. Rust bloggers, perhaps? If you’d like to join, you should contact me.

And if you’d like to set up your own in a different city, the code is in git. One warning, however. As those of us that have set it up have figured out, the documentation for the software to run Iron Blogger is between poor and non-existent. If you do want to set up your own instance, please get in touch. I’m happy to give you some pointers that you’ll probably need but, more importantly, I’d like to work with the next brave soul to put together documentation of the setup process along the way.

Wiki Conferencing

I am in Berlin for the Wikipedia Academy, a very cool hybrid free culture community plus refereed academic conference organized, in part, by Wikimedia Deutschland. On Friday, I was very excited to have been invited to give the conference’s opening keynote based on my own hybrid take on learning from failures in peer production and incorporating a bunch of my own research. Today, I was on a panel at the conference about free culture and sharing practices. I’ll post talk materials and videos when the conference puts them online.

I will be in Berlin for the next week or so before I head directly to Washington, DC for Wikimania between the 11th and 15th. I’ll be giving three talks there:

Between then and now, I’m taking the next week in Berlin to catch up on work, and with friends. If you’re in either place and want to meet up, please get in touch and let’s try to arrange something.

Why Facebook’s Network Effects are Overrated

A lot of people interested in free software, user autonomy, and network services are very worried about Facebook. Folks are worried for the same reason that so many investors are interested: the network effects brought by hundreds of millions of folks signed up to use the service.

Network effects — the concept that a good or service increases in value as more people use it — are not a new problem for free software. Software developers target Microsoft Windows because that is where the large majority of users are. Users with no love for Microsoft and who are otherwise sympathetic to free software use Windows because programs they need will only run there.

Folks worried about Facebook are afraid for similar reasons. Sure, you can close down your Facebook account and move to Diaspora. But who will you talk to there? You can already hear people complaining about Facebook the same way they’ve been complaining about Windows or Office for years. People feel that their hands are tied and that their software, and their social network, will be determined by what everybody is doing.

I’m worried about Facebook. But I’m not too intimidated by Facebook’s network effects for two reasons.

First, using Facebook doesn’t preclude using anything else.

Twitter has enormous overlapping functionality with Facebook. Sure, people use the systems very differently. But they both ask you to create lists of friends and followers and are designed around sending and receiving short status messages. Millions of people do both and both systems are thriving. For the millions of people who use both Facebook and Twitter, the two services have had to negotiate their marginal utility in a world they share with the other one. People decide that Twitter is for certain types of short messages and Facebook is for others. But these arrangements shift over time.

And the relationships between services aren’t always peaceful coexistence. Remember Friendster? Remember Orkut? Remember Tribe? Remember MySpace? MySpace, and all the others, are great examples of how social networks die. They very slowly fade away. MySpace users signed up for Facebook accounts and used both. They almost never just switched. Over time, as one platform became more attractive than the other, for many complicated reasons, attention and activity shifted. People logged in on MySpace less and Facebook more and, eventually, realized they were effectively no longer MySpace users. Anyone that has been on the Internet long enough to watch a few of these shifts from one platform to another knows that they’re not abrupt, even if they can be set in motion by a particular event or action. Users of social networking sites simply don’t have to choose in the way that a person choosing to boot Windows or GNU/Linux does.

I’m sure the vast majority of people with Diaspora accounts use Facebook actively. This is not a problem for Diaspora. It is how Diaspora — or whatever else eventually achieves what many of us hoped Diaspora would — could win.

Second, Facebook is for the ephemeral.

Facebook is primarily used for information that was produced very recently. This week if not today. If not this hour. Facebook has an enormous amount of data that users have fed it that may be hard to get out and move somewhere else. But most people don’t care very much about having any regular access to the large majority of this information. What people care deeply about is having access to the data that they and their friends created today. And that data can just as easily be created somewhere else tomorrow. Or, with the right tools, created just as easily in both places.

Compare this to something like Windows where moving away would require learning, converting, and perhaps even writing, new software. Perhaps even in new programming languages that most developers don’t know yet. Compared to Windows, a migration away from Facebook will be easy.

Facebook’s photo galleries are an example of an important place where this holds less well. Social network information — i.e., the list of who is friends with who — is another example of something that is persistently valuable. That said, people really enjoy the act of finding and friending. Indeed, this process was part of the initial draw of Facebook and other social networks.

None of this means that Facebook is over. It doesn’t even mean that its ascendancy will be slowed. What it does mean is that Facebook is vulnerable to the next thing more than many technology firms that have benefited from network effects in the past. If users are given compelling reasons to switch to something else, they can with less trouble and they will.

That compelling reason might be a new social network with better features or an awesome distributed architecture that allows freedom for users and the ability of those users to benefit from new and fantastic things that Facebook’s overseers would never let them have and without the things Facebook’s users suffer through today. Or it might be a sexier proprietary box to store users’ private information. It doesn’t mean that I’m not worried about Facebook. I remain deeply worried. It’s just not very hard for me to imagine the end.

Date Arithmetic

When I set an alarm, my clock, now running on the computer in my pocket, is smart enough to tell me how much time will pass until the alarm is scheduled to sound. This has eliminated the old problem of sleeping past meetings before being surprised by an alarm precisely half a day after I had originally planned to wake.

The price has been having to know exactly how little I will sleep: a usually depressing fact that had previously been obscured by my difficulty doing time arithmetic in my most somnolent moments.
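The arithmetic my clock does for me is easy to sketch in Python. This is a minimal illustration and not how any particular phone implements it: the function name `time_until_alarm` and the example times are made up, and it assumes an alarm that rings at the next occurrence of a given hour and minute.

```python
from datetime import datetime, timedelta


def time_until_alarm(now, hour, minute):
    """Return the time from `now` until the next occurrence of hour:minute."""
    alarm = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if alarm <= now:
        # That time has already passed today, so the alarm rings tomorrow.
        alarm += timedelta(days=1)
    return alarm - now


# Setting an alarm at 11:30pm, hypothetically.
now = datetime(2013, 1, 1, 23, 30)
intended = time_until_alarm(now, 7, 0)   # meant to set 7:00am
mistaken = time_until_alarm(now, 19, 0)  # accidentally set 7:00pm
print(intended, mistaken, mistaken - intended)
```

Seeing the two durations side by side is what catches the half-day mistake, and also what delivers the bad news about how short the night will be.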