copyrighteous

June 10, 2017June 14, 2017

The Wikipedia Adventure

I recently finished a paper that presents a novel social computing system called the Wikipedia Adventure. The system was a gamified tutorial for new Wikipedia editors. Working with the tutorial creators, we conducted both a survey of its users and a randomized field experiment testing its effectiveness in encouraging subsequent contributions. We found that although users loved it, it did not affect subsequent participation rates.

A major concern that many online communities face is how to attract and retain new contributors. Despite it’s success, Wikipedia is no different. In fact, researchers have shown that after experiencing a massive initial surge in activity, the number of active editors on Wikipedia has been in slow decline since 2007.

The number of active, registered editors (≥5 edits per month) to Wikipedia over time. From Halfaker, Geiger, and Morgan 2012.

Research has attributed a large part of this decline to the hostile environment that newcomers experience when begin contributing. New editors often attempt to make contributions which are subsequently reverted by more experienced editors for not following Wikipedia’s increasingly long list of rules and guidelines for effective participation.

This problem has led many researchers and Wikipedians to wonder how to more effectively onboard newcomers to the community. How do you ensure that new editors Wikipedia quickly gain the knowledge they need in order to make contributions that are in line with community norms?

To this end, Jake Orlowitz and Jonathan Morgan from the Wikimedia Foundation worked with a team of Wikipedians to create a structured, interactive tutorial called The Wikipedia Adventure. The idea behind this system was that new editors would be invited to use it shortly after creating a new account on Wikipedia, and it would provide a step-by-step overview of the basics of editing.

The Wikipedia Adventure was designed to address issues that new editors frequently encountered while learning how to contribute to Wikipedia. It is structured into different ‘missions’ that guide users through various aspects of participation on Wikipedia, including how to communicate with other editors, how to cite sources, and how to ensure that edits present a neutral point of view. The sequence of the missions gives newbies an overview of what they need to know instead of having to figure everything out themselves. Additionally, the theme and tone of the tutorial sought to engage new users, rather than just redirecting them to the troves of policy pages.

Those who play the tutorial receive automated badges on their user page for every mission they complete. This signals to veteran editors that the user is acting in good-faith by attempting to learn the norms of Wikipedia.

An example of a badge that a user receives after demonstrating the skills to communicate with other users on Wikipedia.

Once the system was built, we were interested in knowing whether people enjoyed using it and found it helpful. So we conducted a survey asking editors who played the Wikipedia Adventure a number of questions about its design and educational effectiveness. Overall, we found that users had a very favorable opinion of the system and found it useful.

Survey responses about how users felt about TWA.

Survey responses about what users learned through TWA.

We were heartened by these results. We’d sought to build an orientation system that was engaging and educational, and our survey responses suggested that we succeeded on that front. This led us to ask the question – could an intervention like the Wikipedia Adventure help reverse the trend of a declining editor base on Wikipedia? In particular, would exposing new editors to the Wikipedia Adventure lead them to make more contributions to the community?

To find out, we conducted a field experiment on a population of new editors on Wikipedia. We identified 1,967 newly created accounts that passed a basic test of making good-faith edits. We then randomly invited 1,751 of these users via their talk page to play the Wikipedia Adventure. The rest were sent no invitation. Out of those who were invited, 386 completed at least some portion of the tutorial.

We were interested in knowing whether those we invited to play the tutorial (our treatment group) and those we didn’t (our control group) contributed differently in the first six months after they created accounts on Wikipedia. Specifically, we wanted to know whether there was a difference in the total number of edits they made to Wikipedia, the number of edits they made to talk pages, and the average quality of their edits as measured by content persistence.

We conducted two kinds of analyses on our dataset. First, we estimated the effect of inviting users to play the Wikipedia Adventure on our three outcomes of interest. Second, we estimated the effect of playing the Wikipedia Adventure, conditional on having been invited to do so, on those same outcomes.

To our surprise, we found that in both cases there were no significant effects on any of the outcomes of interest. Being invited to play the Wikipedia Adventure therefore had no effect on new users’ volume of participation either on Wikipedia in general, or on talk pages specifically, nor did it have any effect on the average quality of edits made by the users in our study. Despite the very positive feedback that the system received in the survey evaluation stage, it did not produce a significant change in newcomer contribution behavior. We concluded that the system by itself could not reverse the trend of newcomer attrition on Wikipedia.

Why would a system that was received so positively ultimately produce no aggregate effect on newcomer participation? We’ve identified a few possible reasons. One is that perhaps a tutorial by itself would not be sufficient to counter hostile behavior that newcomers might experience from experienced editors. Indeed, the friendly, welcoming tone of the Wikipedia Adventure might contrast with strongly worded messages that new editors receive from veteran editors or bots. Another explanation might be that users enjoyed playing the Wikipedia Adventure, but did not enjoy editing Wikipedia. After all, the two activities draw on different kinds of motivations. Finally, the system required new users to choose to play the tutorial. Maybe people who chose to play would have gone on to edit in similar ways without the tutorial.

Ultimately, this work shows us the importance of testing systems outside of lab studies. The Wikipedia Adventure was built by community members to address known gaps in the onboarding process, and our survey showed that users responded well to its design.

While it would have been easy to declare victory at that stage, the field deployment study painted a different picture. Systems like the Wikipedia Adventure may inform the design of future orientation systems. That said, more profound changes to the interface or modes of interaction between editors might also be needed to increase contributions from newcomers.

This blog post, and the open access paper that it describes, is a collaborative project with Sneha Narayan, Jake Orlowitz, Jonathan Morgan, and Aaron Shaw. Financial support came from the US National Science Foundation (grants IIS-1617129 and IIS-1617468), Northwestern University, and the University of Washington. We also published all the data and code necessary to reproduce our analysis in a repository in the Harvard Dataverse. Sneha posted the material in this blog post over on the Community Data Science Collective Blog.

May 18, 2017June 14, 2017

Children’s Perspectives on Critical Data Literacies

Last week, we presented a new paper that describes how children are thinking through some of the implications of new forms of data collection and analysis. The presentation was given at the ACM CHI conference in Denver last week and the paper is open access and online.

Over the last couple years, we’ve worked on a large project to support children in doing — and not just learning about — data science. We built a system, Scratch Community Blocks, that allows the 18 million users of the Scratch online community to write their own computer programs — in Scratch of course — to analyze data about their own learning and social interactions. An example of one of those programs to find how many of one’s follower in Scratch are not from the United States is shown below.

Last year, we deployed Scratch Community Blocks to 2,500 active Scratch users who, over a period of several months, used the system to create more than 1,600 projects.

As children used the system, Samantha Hautea, a student in UW’s Communication Leadership program, led a group of us in an online ethnography. We visited the projects children were creating and sharing. We followed the forums where users discussed the blocks. We read comment threads left on projects. We combined Samantha’s detailed field notes with the text of comments and forum posts, with ethnographic interviews of several users, and with notes from two in-person workshops. We used a technique called grounded theory to analyze these data.

What we found surprised us. We expected children to reflect on being challenged by — and hopefully overcoming — the technical parts of doing data science. Although we certainly saw this happen, what emerged much more strongly from our analysis was detailed discussion among children about the social implications of data collection and analysis.

In our analysis, we grouped children’s comments into five major themes that represented what we called “critical data literacies.” These literacies reflect things that children felt were important implications of social media data collection and analysis.

First, children reflected on the way that programmatic access to data — even data that was technically public — introduced privacy concerns. One user described the ability to analyze data as, “creepy”, but at the same time, “very cool.” Children expressed concern that programmatic access to data could lead to “stalking“ and suggested that the system should ask for permission.

Second, children recognized that data analysis requires skepticism and interpretation. For example, Scratch Community Blocks introduced a bug where the block that returned data about followers included users with disabled accounts. One user, in an interview described to us how he managed to figure out the inconsistency:

At one point the follower blocks, it said I have slightly more followers than I do. And, that was kind of confusing when I was trying to make the project. […] I pulled up a second [browser] tab and compared the [data from Scratch Community Blocks and the data in my profile].

Third, children discussed the hidden assumptions and decisions that drive the construction of metrics. For example, the number of views received for each project in Scratch is counted using an algorithm that tries to minimize the impact of gaming the system (similar to, for example, Youtube). As children started to build programs with data, they started to uncover and speculate about the decisions behind metrics. For example, they guessed that the view count might only include “unique” views and that view counts may include users who do not have accounts on the website.

Fourth, children building projects with Scratch Community Blocks realized that an algorithm driven by social data may cause certain users to be excluded. For example, a 13-year-old expressed concern that the system could be used to exclude users with few social connections saying:

I love these new Scratch Blocks! However I did notice that they could be used to exclude new Scratchers or Scratchers with not a lot of followers by using a code: like this:

when flag clicked

if then user’s followers < 300

stop all.

I do not think this a big problem as it would be easy to remove this code but I did just want to bring this to your attention in case this not what you would want the blocks to be used for.

Fifth, children were concerned about the possibility that measurement might distort the Scratch community’s values. While giving feedback on the new system, a user expressed concern that by making it easier to measure and compare followers, the system could elevate popularity over creativity, collaboration, and respect as a marker of success in Scratch.

I think this was a great idea! I am just a bit worried that people will make these projects and take it the wrong way, saying that followers are the most important thing in on Scratch.

Kids’ conversations around Scratch Community Blocks are good news for educators who are starting to think about how to engage young learners in thinking critically about the implications of data. Although no kid using Scratch Community Blocks discussed each of the five literacies described above, the themes reflect starting points for educators designing ways to engage kids in thinking critically about data.

Our work shows that if children are given opportunities to actively engage and build with social and behavioral data, they might not only learn how to do data analysis, but also reflect on its implications.

This blog-post and the work that it describes is a collaborative project by Samantha Hautea, Sayamindu Dasgupta, and Benjamin Mako Hill. We have also received support and feedback from members of the Scratch team at MIT (especially Mitch Resnick and Natalie Rusk), as well as from Hal Abelson from MIT CSAIL. Financial support came from the US National Science Foundation.

May 9, 2017June 14, 2017

Surviving an “Eternal September:” How an Online Community Managed a Surge of Newcomers

Attracting newcomers is among the most widely studied problems in online community research. However, with all the attention paid to challenge of getting new users, much less research has studied the flip side of that coin: large influxes of newcomers can pose major problems as well!

The most widely known example of problems caused by an influx of newcomers into an online community occurred in Usenet. Every September, new university students connecting to the Internet for the first time would wreak havoc in the Usenet discussion forums. When AOL connected its users to the Usenet in 1994, it disrupted the community for so long that it became widely known as “The September that never ended”.

Our study considered a similar influx in NoSleep—an online community within Reddit where writers share original horror stories and readers comment and vote on them. With strict rules requiring that all members of the community suspend disbelief, NoSleep thrives off the fact that readers experience an immersive storytelling environment. Breaking the rules is as easy as questioning the truth of someone’s story. Socializing newcomers represents a major challenge for NoSleep.

Number of subscribers and moderators on /r/NoSleep over time.

On May 7th, 2014, NoSleep became a “default subreddit”—i.e., every new user to Reddit automatically joined NoSleep. After gradually accumulating roughly 240,000 members from 2010 to 2014, the NoSleep community grew to over 2 million subscribers in a year. That said, NoSleep appeared to largely hold things together. This reflects the major question that motivated our study: How did NoSleep withstand such a massive influx of newcomers without enduring their own Eternal September?

To answer this question, we interviewed a number of NoSleep participants, writers, moderators, and admins. After transcribing, coding, and analyzing the results, we proposed that NoSleep survived because of three inter-connected systems that helped protect the community’s norms and overall immersive environment.

First, there was a strong and organized team of moderators who enforced the rules no matter what. They recruited new moderators knowing the community’s population was going to surge. They utilized a private subreddit for NoSleep’s staff. They were able to socialize and educate new moderators effectively. Although issuing sanctions against community members was often difficult, our interviewees explained that NoSleep’s moderators were deeply committed and largely uncompromising.

That commitment resonates within the second system that protected NoSleep: regulation by normal community members. From our interviews, we found that the participants felt a shared sense of community that motivated them both to socialize newcomers themselves as well as to report inappropriate comments and downvote people who violate the community’s norms.

Finally, we found that the technological systems protected the community as well. For instance, post-throttling was instituted to limit the frequency at which a writer could post their stories. Additionally, Reddit’s “Automoderator”, a programmable AI bot, was used to issue sanctions against obvious norm violators while running in the background. Participants also pointed to the tools available to them—the report feature and voting system in particular—to explain how easy it was for them to report and regulate the community’s disruptors.

This blog post was written with Charlie Kiene. The paper and work this post describes is collaborative work with Charlie Kiene and Andrés Monroy-Hernández. The paper was published in the Proceedings of CHI 2016 and is released as open access so anyone can read the entire paper here. A version of this post was published on the Community Data Science Collective blog.

February 3, 2017June 14, 2017

New Dataset: Five Years of Longitudinal Data from Scratch

Scratch is a block-based programming language created by the Lifelong Kindergarten Group (LLK) at the MIT Media Lab. Scratch gives kids the power to use programming to create their own interactive animations and computer games. Since 2007, the online community that allows Scratch programmers to share, remix, and socialize around their projects has drawn more than 16 million users who have shared nearly 20 million projects and more than 100 million comments. It is one of the most popular ways for kids to learn programming and among the larger online communities for kids in general.

Front page of the Scratch online community (https://scratch.mit.edu) during the period covered by the dataset.

Since 2010, I have published a series of papers using quantitative data collected from the database behind the Scratch online community. As the source of data for many of my first quantitative and data scientific papers, it’s not a major exaggeration to say that I have built my academic career on the dataset.

I was able to do this work because I happened to be doing my masters in a research group that shared a physical space (“The Cube”) with LLK and because I was friends with Andrés Monroy-Hernández, who started in my masters cohort at the Media Lab. A year or so after we met, Andrés conceived of the Scratch online community and created the first version for his masters thesis project. Because I was at MIT and because I knew the right people, I was able to get added to the IRB protocols and jump through the hoops necessary to get access to the database.

Over the years, Andrés and I have heard over and over, in conversation and in reviews of our papers, that we were privileged to have access to such a rich dataset. More than three years ago, Andrés and I began trying to figure out how we might broaden this access. Andrés had the idea of taking advantage of the launch of Scratch 2.0 in 2013 to focus on trying to release the first five years of Scratch 1.x online community data (March 2007 through March 2012) — most of the period that the codebase he had written ran the site.

After more work than I have put into any single research paper or project, Andrés and I have published a data descriptor in Nature’s new journal Scientific Data. This means that the data is now accessible to other researchers. The data includes five years of detailed longitudinal data organized in 32 tables with information drawn from more than 1 million Scratch users, nearly 2 million Scratch projects, more than 10 million comments, more than 30 million visits to Scratch projects, and much more. The dataset includes metadata on user behavior as well the full source code for every project. Alongside the data is the source code for all of the software that ran the website and that users used to create the projects as well as the code used to produce the dataset we’ve released.

Releasing the dataset was a complicated process. First, we had navigate important ethical concerns about the the impact that a release of any data might have on Scratch’s users. Toward that end, we worked closely with the Scratch team and the the ethics board at MIT to design a protocol for the release that balanced these risks with the benefit of a release. The most important features of our approach in this regard is that the dataset we’re releasing is limited to only public data. Although the data is public, we understand that computational access to data is different in important ways to access via a browser or API. As a result, we’re requiring anybody interested in the data to tell us who they are and agree to a detailed usage agreement. The Scratch team will vet these applicants. Although we’re worried that this creates a barrier to access, we think this approach strikes a reasonable balance.

Beyond the the social and ethical issues, creating the dataset was an enormous task. Andrés and I spent Sunday afternoons over much of the last three years going column-by-column through the MySQL database that ran Scratch. We looked through the source code and the version control system to figure out how the data was created. We spent an enormous amount of time trying to figure out which columns and rows were public. Most of our work went into creating detailed codebooks and documentation that we hope makes the process of using this data much easier for others (the data descriptor is just a brief overview of what’s available). Serializing some of the larger tables took days of computer time.

In this process, we had a huge amount of help from many others including an enormous amount of time and support from Mitch Resnick, Natalie Rusk, Sayamindu Dasgupta, and Benjamin Berg at MIT as well as from many other on the Scratch Team. We also had an enormous amount of feedback from a group of a couple dozen researchers who tested the release as well as others who helped us work through through the technical, social, and ethical challenges. The National Science Foundation funded both my work on the project and the creation of Scratch itself.

Because access to data has been limited, there has been less research on Scratch than the importance of the system warrants. We hope our work will change this. We can imagine studies using the dataset by scholars in communication, computer science, education, sociology, network science, and beyond. We’re hoping that by opening up this dataset to others, scholars with different interests, different questions, and in different fields can benefit in the way that Andrés and I have. I suspect that there are other careers waiting to be made with this dataset and I’m excited by the prospect of watching those careers develop.

You can find out more about the dataset, and how to apply for access, by reading the data descriptor on Nature’s website.

January 30, 2017January 30, 2017

Supporting children in doing data science

As children use digital media to learn and socialize, others are collecting and analyzing data about these activities. In school and at play, these children find that they are the subjects of data science. As believers in the power of data analysis, we believe that this approach falls short of data science’s potential to promote innovation, learning, and power.

Motivated by this fact, we have been working over the last three years as part of a team at the MIT Media Lab and the University of Washington to design and build a system that attempts to support an alternative vision: children as data scientists. The system we have built is described in a new paper—Scratch Community Blocks: Supporting Children as Data Scientists—that will be published in the proceedings of CHI 2017.

Our system is built on top of Scratch, a visual, block-based programming language designed for children and youth. Scratch is also an online community with over 15 million registered members who share their Scratch projects, remix each others’ work, have conversations, provide feedback, bookmark or “love” projects they like, follow other users, and more. Over the last decade, researchers—including us—have used the Scratch online community’s database to study the youth using Scratch. With Scratch Community Blocks, we attempt to put the power to programmatically analyze these data into the hands of the users themselves.

To do so, our new system adds a set of new programming primitives (blocks) to Scratch so that users can access public data from the Scratch website from inside Scratch. Blocks in the new system gives users access to project and user metadata, information about social interaction, and data about what types of code are used in projects. The full palette of blocks to access different categories of data is shown below.

The new blocks allow users to programmatically access, filter, and analyze data about their own participation in the community. For example, with the simple script below, we can find whether we have followers in Scratch who report themselves to be from Spain, and what their usernames are.

Simple demonstration of Scratch Community Blocks

In designing the system, we had two primary motivations. First, we wanted to support avenues through which children can engage in curiosity-driven, creative explorations of public Scratch data. Second, we wanted to foster self-reflection with data. As children looked back upon their own participation and coding activity in Scratch through the project they and their peers made, we wanted them to reflect on their own behavior and learning in ways that shaped their future behavior and promoted exploration.

After designing and building the system over 2014 and 2015, we invited a group of active Scratch users to beta test the system in early 2016. Over four months, 700 users created more than 1,600 projects. The diversity and depth of users creativity with the new blocks surprised us. Children created projects that gave the viewer of the project a personalized doughnut-chart visualization of their coding vocabulary on Scratch, rendered the viewer’s number of followers as scoops of ice-cream on a cone, attempted to find whether “love-its” for projects are more common on Scratch than “favorites”, and told users how “talkative” they were by counting the cumulative string-length of project titles and descriptions.

We found that children, rather than making canonical visualizations such as pie-charts or bar-graphs, frequently made information representations that spoke to their own identities and aesthetic sensibilities. A 13-year-old girl had made a virtual doll dress-up game where the player’s ability to buy virtual clothes and accessories for the doll was determined by the level of their activity in the Scratch community. When we asked about her motivation for making such a project, she said:

I was trying to think of something that somebody hadn’t done yet, and I didn’t see that. And also I really like to do art on Scratch and that was a good opportunity to use that and mix the two [art and data] together.

We also found at least some evidence that the system supported self-reflection with data. For example, after seeing a project that showed its viewers a visualization of their past coding vocabulary, a 15-year-old realized that he does not do much programming with the pen-related primitives in Scratch, and wrote in a comment, “epic! looks like we need to use more pen blocks. :D.”

Additionally, we noted that that as children made and interacted with projects made with Scratch Community Blocks, they started to critically think about the implications of data collection and analysis. These conversations are the subject of another paper (also being published in CHI 2017).

In a 1971 article called “Teaching Children to be Mathematicians vs. Teaching About Mathematics”, Seymour Papert argued for the need for children doing mathematics vs. learning about it. He showed how Logo, the programming language he was developing at that time with his colleagues, could offer children a space to use and engage with mathematical ideas in creative and personally motivated ways. This, he argued, enabled children to go beyond knowing about mathematics to “doing” mathematics, as a mathematician would.

Scratch Community Blocks has not yet been launched for all Scratch users and has several important limitations we discuss in the paper. That said, we feel that the projects created by children in our the beta test demonstrate the real potential for children to do data science, and not just know about it, provide data for it, and to have their behavior nudged and shaped by it.

This blog post and the paper it describes are collaborative work with Sayamindu Dasgupta. We have also received support and feedback from members of the Scratch team at MIT (especially Mitch Resnick and Natalie Rusk), as well as from Hal Abelson. Financial support came from the US National Science Foundation. The paper itself is open access so anyone can read the entire paper here. This blog post was also posted on Sayamindu Dasgupta’s blog, on the Community Data Science Collective blog, and in several other places.

July 4, 2016

Studying the relationship between remixing & learning

With more than 10 million users, the Scratch online community is the largest online community where kids learn to program. Since it was created, a central goal of the community has been to promote “remixing” — the reworking and recombination of existing creative artifacts. As the video above shows, remixing programming projects in the current web-based version of Scratch is as easy is as clicking on the “see inside” button in a project web-page, and then clicking on the “remix” button in the web-based code editor. Today, close to 30% of projects on Scratch are remixes.

Remixing plays such a central role in Scratch because its designers believed that remixing can play an important role in learning. After all, Scratch was designed first and foremost as a learning community with its roots in the Constructionist framework developed at MIT by Seymour Papert and his colleagues. The design of the Scratch online community was inspired by Papert’s vision of a learning community similar to Brazilian Samba schools (Henry Jenkins writes about his experience of Samba schools in the context of Papert’s vision here), and a comment Marvin Minsky made in 1984:

Adults worry a lot these days. Especially, they worry about how to make other people learn more about computers. They want to make us all “computer-literate.” Literacy means both reading and writing, but most books and courses about computers only tell you about writing programs. Worse, they only tell about commands and instructions and programming-language grammar rules. They hardly ever give examples. But real languages are more than words and grammar rules. There’s also literature – what people use the language for. No one ever learns a language from being told its grammar rules. We always start with stories about things that interest us.

In a new paper — titled “Remixing as a pathway to Computational Thinking” — that was recently published at the ACM Conference on Computer Supported Collaborative Work and Social Computing (CSCW) conference, we used a series of quantitative measures of online behavior to try to uncover evidence that might support the theory that remixing in Scratch is positively associated with learning.

Of course, because Scratch is an informal environment with no set path for users, no lesson plan, and no quizzes, measuring learning is an open problem. In our study, we built on two different approaches to measure learning in Scratch. The first approach considers the number of distinct types of programming blocks available in Scratch that a user has used over her lifetime in Scratch (there are 120 in total) — something that can be thought of as a block repertoire or vocabulary. This measure has been used to model informal learning in Scratch in an earlier study. Using this approach, we hypothesized that users who remix more will have a faster rate of growth for their code vocabulary.

Controlling for a number of factors (e.g. age of user, the general level of activity) we found evidence of a small, but positive relationship between the number of remixes a user has shared and her block vocabulary as measured by the unique blocks she used in her non-remix projects. Intriguingly, we also found a strong association between the number of downloads by a user and her vocabulary growth. One interpretation is that this learning might also be associated with less active forms of appropriation, like the process of reading source code described by Minksy.

The second approach we used considered specific concepts in programming, such as loops, or event-handling. To measure this, we utilized a mapping of Scratch blocks to key programming concepts found in this paper by Karen Brennan and Mitchel Resnick. For example, in the image below are all the Scratch blocks mapped to the concept of “loop”.

We looked at six concepts in total (conditionals, data, events, loops, operators, and parallelism). In each case, we hypothesized that if someone has had never used a given concept before, they would be more likely to use that concept after encountering it while remixing an existing project.

Using this second approach, we found that users who had never used a concept were more likely to do so if they had been exposed to the concept through remixing. Although some concepts were more widely used than others, we found a positive relationship between concept use and exposure through remixing for each of the six concepts. We found that this relationship was true even if we ignored obvious examples of cutting and pasting of blocks of code. In all of these models, we found what we believe is evidence of learning through remixing.

Of course, there are many limitations in this work. What we found are all positive correlations — we do not know if these relationships are causal. Moreover, our measures do not really tell us whether someone has “understood” the usage of a given block or programming concept.However, even with these limitations, we are excited by the results of our work, and we plan to build on what we have. Our next steps include developing and utilizing better measures of learning, as well as looking at other methods of appropriation like viewing the source code of a project.

This blog post and the paper it describes are collaborative work with Sayamindu Dasgupta, Andrés Monroy-Hernández, and William Hale. The paper is released as open access so anyone can read the entire paper here. This blog post was also posted on Sayamindu Dasgupta’s blog and on Medium by the MIT Media Lab.

February 11, 2016

Unhappy Birthday Suspended

More than 10 years ago, I launched Unhappy Birthday in a fit of copyrighteous exuberance. In the last decade, I have been interviewed on the CBC show WireTap and have received an unrelenting stream of hate mail from random strangers.

With a recently announced settlement suggesting that “Happy Birthday” is on its way into the public domain, it’s not possible for even the highest-protectionist in me to justify the continuation of the campaign in its original form. As a result, I’ve suspended the campaign while I plan my next move. Here’s the full text of the notice I posted on the Unhappy Birthday website:

Unfortunately, a series of recent legal rulings have forced us to suspend our campaign. In 2015, Time Warner’s copyright claim to “Happy Birthday” was declared invalid. In 2016, a settlement was announced that calls for a judge to officially declare that the song is in the public domain.

This is horrible news for the future of music. It is horrible news for anybody who cares that creators, their heirs, etc., are fairly remunerated when their work is performed. What incentive will there be for anybody to pen the next “Happy Birthday” knowing that less than a century after their deaths — their estates and the large multinational companies that buy their estates — might not be able to reap the financial rewards from their hard work and creativity?

We are currently planning a campaign to push for a retroactive extension of copyright law to place “Happy Birthday,” and other works, back into the private domain where they belong! We believe this is a winnable fight. After all, copyright has been retroactively extended before! Stay tuned! In the meantime, we’ll keep this page here for historical purposes.

—“Copyrighteous“ Benjamin Mako Hill (2016-02-11)

February 3, 2016

Welcome Back Poster

My office door is on the second floor in front the major staircase in my building. I work with my door open so that my colleagues and my students know when I’m in. The only time I consider deviating from this policy is the first week of the quarter when I’m faced with a stream of students, usually lost on their way to class and that, embarrassingly, I am usually unable to help.

I made this poster so that these conversations can, in a way, continue even when I am not in the office.

January 4, 2016January 4, 2016

Celebrate Aaron Swartz in Seattle (or Atlanta, Chicago, Dallas, NYC, SF)

I’m organizing an event at the University of Washington in Seattle that involves a reading, the screening of a documentary film, and a Q&A about Aaron Swartz. The event coincides with the third anniversary of Aaron’s death and the release of a new book of Swartz’s writing that I contributed to.

The event is free and open the public and details are below:

WHEN: Wednesday, January 13 at 6:30-9:30 p.m.

WHERE: Communications Building (CMU) 120, University of Washington

We invite you to celebrate the life and activism efforts of Aaron Swartz, hosted by UW Communication professor Benjamin Mako Hill. The event is next week and will consist of a short book reading, a screening of a documentary about Aaron’s life, and a Q&A with Mako who knew Aaron well – details are below. No RSVP required; we hope you can join us.

Aaron Swartz was a programming prodigy, entrepreneur, and information activist who contributed to the core Internet protocol RSS and co-founded Reddit, among other groundbreaking work. However, it was his efforts in social justice and political organizing combined with his aggressive approach to promoting increased access to information that entangled him in a two-year legal nightmare that ended with the taking of his own life at the age of 26.

January 11, 2016 marks the third anniversary of his death. Join us two days later for a reading from a new posthumous collection of Swartz’s writing published by New Press, a showing of “The Internet’s Own Boy” (a documentary about his life), and a Q&A with UW Communication professor Benjamin Mako Hill – a former roommate and friend of Swartz and a contributor to and co-editor of the first section of the new book.

If you’re not in Seattle, there are events with similar programs being organized in Atlanta, Chicago, Dallas, New York, and San Francisco. All of these other events will be on Monday January 11 and registration is required for all of them. I will be speaking at the event in San Francisco.

January 3, 2016

The Boy Who Could Change the World: The Writings of Aaron Swartz

The New Press has published a new collection of Aaron Swartz’s writing called The Boy Who Could Change the World: The Writings of Aaron Swartz. I worked with Seth Schoen to introduce and help edit the opening section of book that includes Aaron’s writings on free culture, access to information and knowledge, and copyright. Seth and I have put our introduction online under an appropriately free license (CC BY-SA).

Over the last week, I’ve read the whole book again. I think the book really is a wonderful snapshot of Aaron’s thought and personality. It’s got bits that make me roll my eyes, bits that make me want to shout in support, and bits that continue to challenge me. It all makes me miss Aaron terribly. I strongly recommend the book.

Because the publication is post-humous, it’s meant that folks like me are doing media work for the book. In honor of naming the book their “progressive pick” of the week, Truthout has also published an interview with me about Aaron and the book.

Other folks who introduced and/or edited topical sections in the book are David Auerbach (Computers), David Segal (Politics), Cory Doctorow (Media), James Grimmelmann (Books and Culture), and Astra Taylor (Unschool). The book is introduced by Larry Lessig.

January 2, 2016

Access Without Empowerment (LibrePlanet 2015 Keynote)

At LibrePlanet 2015 (the FSF’s annual conference), I gave a talk called “Access Without Empowerment” as one of the conference keynote addresses. As I did for my 2013 LibrePlanet talk, I’ve edited together a version that includes the slides and I’ve posted it online in WebM and on YouTube.

Here’s the summary written up in the LibrePlanet program:

The free software movement has twin goals: promoting access to software through users’ freedom to share, and empowering users by giving them control over their technology. For all our movement’s success, we have been much more successful at the former. I will use data from free software and from several related movements to explain why promoting empowerment is systematically more difficult than promoting access and I will explore how our movement might address the second challenge in the future.

In related news, registration is open for LibrePlanet 2016 and that it’s free for FSF members. If you’re not an FSF member, the FSF annual fundraiser is currently going on so now would be a great time to join.

December 26, 2015December 31, 2015

Trust your technolust.

If you’ve ever lusted for a “Trust your technolust.” poster like the one seen in background of the climactic sequence in the 1995 film Hackers, you’re in luck. Just print this PDF template (also an SVG) onto a piece of yellow US letter paper.

Although I’m not even the first person I know to reproduce the poster, I did spend some time making sure that I got the typeface, kerning, wordspacing, and placement on the page just right. I figured I would share.

December 16, 2015

TheSetup ChangeLog

Several years ago, I did a long interview with TheSetup — a fantastic website that posts of interviews with nerdy people that ask the same four questions:

Who are you, and what do you do?
What hardware are you using?
And what software?
What would be your dream setup?

Because I have a very carefully considered — but admittedly quite idiosyncratic — setup, I spent a lot of time preparing my answers. Many people have told me that they found my write-up useful. I recently spoke with several students who said it had been assigned in one of their classes!

Of course, my setup has changed since 2012. Although the vast majority is still the same, there is a growing list of modifications and additions. To address this, I’ve been keeping a changelog on my wiki where I detail every major change and addition I’ve made to the setup that I described in the original interview.

August 1, 2015August 1, 2015

Understanding Hydroplane Races for the New Seattleite

It’s Seafair weekend in Seattle. As always, the centerpiece is the H1 Unlimited hydroplane races on Lake Washington.

In my social circle, I’m nearly the only person I know who grew up in area. None of the newcomers I know had heard of hydroplane racing before moving to Seattle. Even after I explain it to them — i.e., boats with 3,000+ horse power airplane engines that fly just above the water at more than 320kph (200mph) leaving 10m+ (30ft) wakes behind them! — most people seem more puzzled than interested.

I grew up near the shore of Lake Washington and could see (and hear!) the races from my house. I don’t follow hydroplane racing throughout the year but I do enjoy watching the races at Seafair. Here’s my attempt to explain and make the case for the races to new Seattleites.

Before Microsoft, Amazon, Starbucks, etc., there were basically three major Seattle industries: (1) logging and lumber based industries like paper manufacturing; (2) maritime industries like fishing, shipbuilding, shipping, and the navy; (3) aerospace (i.e., Boeing). Vintage hydroplane racing represented the Seattle trifecta: Wooden boats with airplane engines!

The wooden U-60 Miss Thriftway circa 1955 (Thriftway is a Washinton-based supermarket that nobody outside has heard of) below is a picture of old-Seattle awesomeness. Modern hydroplanes are now made of fiberglass but two out of three isn’t bad.

Although the boats are racing this year in events in Indiana, San Diego, and Detroit in addition to the two races in Washington, hydroplane racing retains deep ties to the region. Most of the drivers are from the Seattle area. Many or most of the teams and boats are based in Washington throughout the year. Many of the sponsors are unknown outside of the state. This parochialness itself cultivates a certain kind of appeal among locals.

In addition to old-Seattle/new-Seattle cultural divide, there’s a class divide that I think is also worth challenging. Although the demographics of hydro-racing fans is surprisingly broad, it can seem like Formula One or NASCAR on the water. It seems safe to suggest that many of the demographic groups moving to Seattle for jobs in the tech industry are not big into motorsports. Although I’m no follower of motorsports in general, I’ve written before cultivated disinterest in professional sports, and it remains something that I believe is worth taking on.

It’s not all great. In particular, the close relationship between Seafair and the military makes me very uneasy. That said, even with the military-heavy airshow, I enjoy the way that Seafair weekend provides a little pocket of old-Seattle that remains effectively unchanged from when I was a kid. I’d encourage others to enjoy it as well!

May 6, 2015May 6, 2015

Books Room

Is the locked “books room” at McMahon Hall at UW a metaphor for DRM in the academy? Could it be, like so many things in Seattle, sponsored by Amazon?

Mika noticed the room several weeks ago but felt that today’s International Day Against DRM was a opportune time to raise the questions in front of a wider audience.