How markets coopted free software’s most powerful weapon (LibrePlanet 2018 Keynote)

Several months ago, I gave the closing keynote address at LibrePlanet 2018. The talk was about the thing that scares me most about the future of free culture, free software, and peer production.

A video of the talk is online on Youtube and available as WebM video file (both links should skip the first 3m 19s of thanks and introductions).

Here’s a summary of the talk:

App stores and the so-called “sharing economy” are two examples of business models that rely on techniques for the mass aggregation of distributed participation over the Internet and that simply didn’t exist a decade ago. In my talk, I argue that the firms pioneering these new models have learned and adapted processes from commons-based peer production projects like free software, Wikipedia, and CouchSurfing.

The result is an important shift: A decade ago,  the kind of mass collaboration that made Wikipedia, GNU/Linux, or Couchsurfing possible was the exclusive domain of people producing freely and openly in commons. Not only is this no longer true, new proprietary, firm-controlled, and money-based models are increasingly replacing, displacing, outcompeting, and potentially reducing what’s available in the commons. For example, the number of people joining Couchsurfing to host others seems to have been in decline since Airbnb began its own meteoric growth.

In the talk, I talk about how this happened and what I think it means for folks of that are committed to working in commons. I talk a little bit about the free culture and free software should do now that mass collaboration, these communities’ most powerful weapon, is being used against them.

I’m very much interested in feedback provided any way you want to reach me including in person, over email, in comments on my blog, on Mastodon, on Twitter, etc.


Work on the research that is reflected and described in this talk was supported by the National Science Foundation (awards IIS-1617129 and IIS-1617468). Some of the initial ideas behind this talk were developed while working on this paper (official link) which was led by Maximilian Klein and contributed to by Jinhao Zhao, Jiajun Ni, Isaac Johnson, and Haiyi Zhu.

Honey Buckets

When I was growing up in Washington state, a company called Honey Bucket held a dominant position in the local portable toilet market. Their toilets are still a common sight in the American West.

Honey Bucket brand portable toilet. Photo by donielle. (CC BY-SA)

They were so widespread when I was a child that I didn’t know that “Honey Bucket” was the name of a company at all until I moved to Massachusetts for college. I thought “honey bucket” was just the generic term for toilets that could be moved from place-to-place!

So for the first five years that I lived in Massachusetts, I continued to call all portable toilets “honey buckets.”

Until somebody asked me why I called them that—five years after moving!—all my friends in Massachusetts thought that “honey bucket” was just a personal, idiosyncratic, and somewhat gross, euphemism.

Natural experiment showing how “wide walls” can support engagement and learning

Seymour Papert is credited as saying that tools to support learning should have “high ceilings” and “low floors.” The phrase is meant to suggest that tools should allow learners to do complex and intellectually sophisticated things but should also be easy to begin using quickly. Mitchel Resnick extended the metaphor to argue that learning toolkits should also have “wide walls” in that they should appeal to diverse groups of learners and allow for a broad variety of creative outcomes. In a new paper, Sayamindu Dasgupta and I attempted to provide an empirical test of Resnick’s wide walls theory. Using a natural experiment in the Scratch online community, we found causal evidence that “widening walls” can, as Resnick suggested, increase both engagement and learning.

Over the last ten years, the “wide walls” design principle has been widely cited in the design of new systems. For example, Resnick and his collaborators relied heavily on the principle in the design of the Scratch programming language. Scratch allows young learners to produce not only games, but also interactive art, music videos, greetings card, stories, and much more. As part of that team, Sayamindu was guided by “wide walls” principle when he designed and implemented the Scratch cloud variables system in 2011-2012.

While designing the system, Sayamindu hoped to “widen walls” by supporting a broader range of ways to use variables and data structures in Scratch. Scratch cloud variables extend the affordances of the normal Scratch variable by adding persistence and shared-ness. A simple example of something possible with cloud variables, but not without them, is a global high-score leaderboard in a game (example code is below). After the system was launched, we saw many young Scratch users using the system to engage with data structures in new and incredibly creative ways.

cloud-variable-script
Example of Scratch code that uses a cloud variable to keep track of high-scores among all players of a game.

Although these examples reflected powerful anecdotal evidence, we were also interested in using quantitative data to reflect the causal effect of the system. Understanding the causal effect of a new design in real world settings is a major challenge. To do so, we took advantage of a “natural experiment” and some clever techniques from econometrics to measure how learners’ behavior changed when they were given access to a wider design space.

Understanding the design of our study requires understanding a little bit about how access to the Scratch cloud variable system is granted. Although the system has been accessible to Scratch users since 2013, new Scratch users do not get access immediately. They are granted access only after a certain amount of time and activity on the website (the specific criteria are not public). Our “experiment” involved a sudden change in policy that altered the criteria for who gets access to the cloud variable feature. Through no act of their own, more than 14,000 users were given access to feature, literally overnight. We looked at these Scratch users immediately before and after the policy change to estimate the effect of access to the broader design space that cloud variables afforded.

We found that use of data-related features was, as predicted, increased by both access to and use of cloud variables. We also found that this increase was not only an effect of projects that use cloud variables themselves. In other words, learners with access to cloud variables—and especially those who had used it—were more likely to use “plain-old” data-structures in their projects as well.

The graph below visualizes the results of one of the statistical models in our paper and suggests that we would expect that 33% of projects by a prototypical “average” Scratch user would use data structures if the user in question had never used used cloud variables but that we would expect that 60% of projects by a similar user would if they had used the system.

Model-predicted probability that a project made by a prototypical Scratch user will contain data structures (w/o counting projects with cloud variables)

It is important to note that the estimated effective above is a “local average effect” among people who used the system because they were granted access by the sudden change in policy (this is a subtle but important point that we explain this in some depth in the paper). Although we urge care and skepticism in interpreting our numbers, we believe our results are encouraging evidence in support of the “wide walls” design principle.

Of course, our work is not without important limitations. Critically, we also found that rate of adoption of cloud variables was very low. Although it is hard to pinpoint the exact reason for this from the data we observed, it has been suggested that widening walls may have a potential negative side-effect of making it harder for learners to imagine what the new creative possibilities might be in the absence of targeted support and scaffolding. Also important to remember is that our study measures “wide walls” in a specific way in a specific context and that it is hard to know how well our findings will generalize to other contexts and communities. We discuss these caveats, as well as our methods, models, and theoretical background in detail in our paper which now available for download as an open-access piece from the ACM digital library.


This blog post, and the open access paper that it describes, is a collaborative project with Sayamindu Dasgupta. Financial support came from the eScience Institute and the Department of Communication at the University of Washington. Quantitative analyses for this project were completed using the Hyak high performance computing cluster at the University of Washington.

Climbing Mount Rainier

Mount Rainier is an enormous glaciated volcano in Washington state. It’s  4,392 meters tall (14,410 ft) and extraordinary prominent. The mountain is 87 km (54m) away from Seattle. On clear days, it dominates the skyline.

Drumheller Fountain and Mt. Rainier on the University of Washington Campus
Drumheller Fountain and Mt. Rainier on the University of Washington Campus (Photo by Frank Fujimoto)

Rainier’s presence has shaped the layout and structure of Seattle. Important roads are built to line up with it. The buildings on the University of Washington’s campus, where I work, are laid out to frame it along the central promenade. People in Seattle typically refer to Rainier simply as “the mountain.”  It is common to here Seattlites ask “is the mountain out?”

Having grown up in Seattle, I have an deep emotional connection to the mountain that’s difficult to explain to people who aren’t from here. I’ve seen Rainier thousands of times and every single time it takes my breath away. Every single day when I bike to work, I stop along UW’s “Rainier Vista” and look back to see if the mountain is out. If it is, I always—even if I’m running late for a meeting—stop for a moment to look at it. When I lived elsewhere and would fly to visit Seattle, seeing Rainier above the clouds from the plane was the moment that I felt like I was home.

Given this connection, I’ve always been interested in climbing Mt. Rainier.  Doing so typically takes at least a couple days and is difficult. About half of people who attempt typically fail to reach the top. For me, climbing rainier required an enormous amount of training and gear because, until recently, I had no experience with mountaineering. I’m not particularly interested in climbing mountains in general. I am interested in Rainier.

On Tuesday, Mika and I made our first climbing attempt and we both successfully made it to the summit. Due to the -15°C (5°F) temperatures and 88kph (55mph) winds at the top, I couldn’t get a picture at the top. But I feel like I’ve built a deeper connection with an old friend.


Other than the picture from UW campus, photos were all from my climb and taken by (in order): Jennifer Marie, Jonathan Neubauer, Mika Matsuzaki, Jonathan Neubauer, Jonathan Neubauer, Mika Matsuzaki, and Jake Holthaus.

Is English Wikipedia’s ‘rise and decline’ typical?

This graph shows the number of people contributing to Wikipedia over time:

The Rise and Decline of Wikipedia The number of active Wikipedia contributors exploded, suddenly stalled, and then began gradually declining. (Figure taken from Halfaker et al. 2013)

The figure comes from “The Rise and Decline of an Open Collaboration System,” a well-known 2013 paper that argued that Wikipedia’s transition from rapid growth to slow decline in 2007 was driven by an increase in quality control systems. Although many people have treated the paper’s finding as representative of broader patterns in online communities, Wikipedia is a very unusual community in many respects. Do other online communities follow Wikipedia’s pattern of rise and decline? Does increased use of quality control systems coincide with community decline elsewhere?

In a paper that my student Nathan TeBlunthuis is presenting Thursday morning at the Association for Computing Machinery (ACM) Conference on Human Factors in Computing Systems (CHI),  a group of us have replicated and extended the 2013 paper’s analysis in 769 other large wikis. We find that the dynamics observed in Wikipedia are a strikingly good description of the average Wikia wiki. They appear to reoccur again and again in many communities.

The original “Rise and Decline” paper (we’ll abbreviate it “RAD”) was written by Aaron Halfaker, R. Stuart Geiger, Jonathan T. Morgan, and John Riedl. They analyzed data from English Wikipedia and found that Wikipedia’s transition from rise to decline was accompanied by increasing rates of newcomer rejection as well as the growth of bots and algorithmic quality control tools. They also showed that newcomers whose contributions were rejected were less likely to continue editing and that community policies and norms became more difficult to change over time, especially for newer editors.

Our paper, just published in the CHI 2018 proceedings, replicates most of RAD’s analysis on a dataset of 769 of the  largest wikis from Wikia that were active between 2002 to 2010.  We find that RAD’s findings generalize to this large and diverse sample of communities.

We can walk you through some of the key findings. First, the growth trajectory of the average wiki in our sample is similar to that of English Wikipedia. As shown in the figure below, an initial period of growth stabilizes and leads to decline several years later.

Rise and Decline on Wikia The average Wikia wikia also experience a period of growth followed by stabilization and decline (from TeBlunthuis, Shaw, and Hill 2018).

We also found that newcomers on Wikia wikis were reverted more and continued editing less. As on Wikipedia, the two processes were related. Similar to RAD, we also found that newer editors were more likely to have their contributions to the “project namespace” (where policy pages are located) undone as wikis got older. Indeed, the specific estimates from our statistical models are very similar to RAD’s for most of these findings!

There were some parts of the RAD analysis that we couldn’t reproduce in our context. For example, there are not enough bots or algorithmic editing tools in Wikia to support statistical claims about their effects on newcomers.

At the same time, we were able to do some things that the RAD authors could not.  Most importantly, our findings discount some Wikipedia-specific explanations for a rise and decline. For example, English Wikipedia’s decline coincided with the rise of Facebook, smartphones, and other social media platforms. In theory, any of these factors could have caused the decline. Because the wikis in our sample experienced rises and declines at similar points in their life-cycle but at different points in time, the rise and decline findings we report seem unlikely to be caused by underlying temporal trends.

The big communities we study seem to have consistent “life cycles” where stabilization and/or decay follows an initial period of growth. The fact that the same kinds of patterns happen on English Wikipedia and other online groups implies a more general set of social dynamics at work that we do not think existing research (including ours) explains in a satisfying way. What drives the rise and decline of communities more generally? Our findings make it clear that this is a big, important question that deserves more attention.

We hope you’ll read the paper and get in touch by commenting on this post or emailing Nate if you’d like to learn or talk more. The paper is available online and has been published under an open access license. If you really want to get into the weeds of the analysis, we will soon publish all the data and code necessary to reproduce our work in a repository on the Harvard Dataverse.

Nate TeBlunthuis will be presenting the project this week at CHI in Montréal on Thursday April 26 at 9am in room 517D.  For those of you not familiar with CHI, it is the top venue for Human-Computer Interaction. All CHI submissions go through double-blind peer review and the papers that make it into the proceedings are considered published (same as journal articles in most other scientific fields). Please feel free to cite our paper and send it around to your friends!


This blog post, and the open access paper that it describes, is a collaborative project with Aaron Shaw, that was led by Nate TeBlunthuis. A version of this blog post was originally posted on the Community Data Science Collective blog. Financial support came from the US National Science Foundation (grants IIS-1617129,  IIS-1617468, and GRFP-2016220885 ), Northwestern University, the Center for Advanced Study in the Behavioral Sciences at Stanford University, and the University of Washington. This project was completed using the Hyak high performance computing cluster at the University of Washington.

Mako Hate

I recently discovered a prolific and sustained community of meme-makers on Tumblr dedicated to expressing their strong dislike for “Mako.”

Two tags with examples are #mako hate and #anti mako but there are many others.

I’ve also discovered Tumblrs entirely dedicated to the topic!

For example, Let’s Roast Mako describes itself “A place to beat up Mako. In peace. It’s an inspiration to everyone!

The second is the Fuck Mako Blog which describes itself with series of tag-lines including “Mako can fuck right off and we’re really not sorry about that,” “Welcome aboard the SS Fuck-Mako;” and “Because Mako is unnecessary.” Sub-pages of the site include:

I’ll admit I’m a little disquieted.

UW Stationery in LaTeX

The University of Washington’s brand page recently started publishing letterhead templates that departments and faculty can use for official communication. Unfortunately, they only provide them in Microsoft Word DOCX format.

Because my research group works in TeX for everything, Sayamindu Dasgupta and I worked together to create a LaTeX version of the “Matrix Department Signature Template” (the DOCX file is available here). We figured other folks at UW might be interested in it as well.

The best way to get the template to use it yourself is to clone it from git (git clone git://code.communitydata.cc/uw_tex_letterhead.git). If you notice issues or if you want to create branches with either of the other two types of official UW stationary, patches are always welcome (instructions on how to make and send patches is here)!

Because the template relies on two OpenType fonts, it requires XeTeX. A detailed list of the dependencies is provided in the README file. We’ve only run it on GNU/Linux (Debian and Arch) but it should work well on any operating system that can run XeTeX as well as web-based TeX systems like ShareLaTeX.

And although we created the template, keep in mind that we don’t manage UW’s brand identity in anyway. If you have any questions or concerns about if and when you should use the letterhead, you should contact brand and creative services with the contact information on the stationery page.

XORcise

https://mako.cc/copyrighteous/wp-content/uploads/2018/02/384px-Venn0110.svg_.png

XORcise (ɛɡ.zɔʁ.siz) verb 1. To remove observations from a dataset if they satisfy one of two criteria, but not both. [e.g., After XORcising adults and citizens, only foreign children and adult citizens were left.]

“Stop Mang Fun of Me”

Somebody recently asked me if I am the star of bash.org quote #75514 (a snippet of online chat from a large collaboratively built collection):

<mako> my letter "eye" stopped worng
<luca> k, too?
<mako> yeah
<luca> sounds like a mountain dew spill
<mako> and comma
<mako> those three
<mako> ths s horrble
<luca> tme for a new eyboard
<luca> 've successfully taen my eyboard apart
       and fxed t by cleanng t wth alcohol
<mako> stop mang fun of me
<mako> ths s a laptop!!

It was me. A circuit on my laptop had just blown out my I, K, ,, and 8 keys. At the time I didn’t think it was very funny.

I no idea anyone had saved a log and had forgotten about the experience until I saw the bash.org quote. I appreciate it now so I’m glad somebody did!

This was unrelated to the time that I poured water into two computers in front of 1,500 people and the time that I carefully placed my laptop into a full bucket of water.

My Kuro5hin Diary Entries

Kuro5hin logo

Kuro5hin (pronounced “corrosion” and abbreviated K5) was a website created in 1999 that was popular in the early 2000s. K5 users could post stories to be voted upon as well as entries to their personal diaries.

I posted a couple dozen diary entries between 2002 and 2003 during my final year of college and the months immediately after.

K5 was taken off-line in 2016 and the Internet Archive doesn’t seem to have snagged comments or full texts of most diary entries. Luckily, someone managed to scrape most of them before they went offline.

Thanks to this archive, you can now once again hear from 21-year-old-me in the form of my old K5 diary entries which I’ve imported to my blog Copyrighteous. I fixed the obvious spelling errors but otherwise restrained myself and left them intact.

If you’re interested in preserving your own K5 diaries, I wrote some Python code to parse the K5 HTML files for diary pages and import them into WordPress using it’s XML-RPC API. You’ll need to tweak the code to use it but it’s pretty straightforward.

Introducing Computational Methods to Social Media Scientists

The ubiquity of large-scale data and improvements in computational hardware and algorithms have provided enabled researchers to apply computational approaches to the study of human behavior. One of the richest contexts for this kind of work is social media datasets like Facebook, Twitter, and Reddit.

We were invited by Jean BurgessAlice Marwick, and Thomas Poell to write a chapter about computational methods for the Sage Handbook of Social Media. Rather than simply listing what sorts of computational research has been done with social media data, we decided to use the chapter to both introduce a few computational methods and to use those methods in order to analyze the field of social media research.

A “hairball” diagram from the chapter illustrating how research on social media clusters into distinct citation network neighborhoods.

Explanations and Examples

In the chapter, we start by describing the process of obtaining data from web APIs and use as a case study our process for obtaining bibliographic data about social media publications from Elsevier’s Scopus API.  We follow this same strategy in discussing social network analysis, topic modeling, and prediction. For each, we discuss some of the benefits and drawbacks of the approach and then provide an example analysis using the bibliographic data.

We think that our analyses provide some interesting insight into the emerging field of social media research. For example, we found that social network analysis and computer science drove much of the early research, while recently consumer analysis and health research have become more prominent.

More importantly though, we hope that the chapter provides an accessible introduction to computational social science and encourages more social scientists to incorporate computational methods in their work, either by gaining computational skills themselves or by partnering with more technical colleagues. While there are dangers and downsides (some of which we discuss in the chapter), we see the use of computational tools as one of the most important and exciting developments in the social sciences.

Steal this paper!

One of the great benefits of computational methods is their transparency and their reproducibility. The entire process—from data collection to data processing to data analysis—can often be made accessible to others. This has both scientific benefits and pedagogical benefits.

To aid in the training of new computational social scientists, and as an example of the benefits of transparency, we worked to make our chapter pedagogically reproducible. We have created a permanent website for the chapter at https://communitydata.cc/social-media-chapter/ and uploaded all the code, data, and material we used to produce the paper itself to an archive in the Harvard Dataverse.

Through our website, you can download all of the raw data that we used to create the paper, together with code and instructions for how to obtain, clean, process, and analyze the data. Our website walks through what we have found to be an efficient and useful workflow for doing computational research on large datasets. This workflow even includes the paper itself, which is written using LaTeX + knitr. These tools let changes to data or code propagate through the entire workflow and be reflected automatically in the paper itself.

If you  use our chapter for teaching about computational methods—or if you find bugs or errors in our work—please let us know! We want this chapter to be a useful resource, will happily consider any changes, and have even created a git repository to help with managing these changes!


The book chapter and this blog post were written with Jeremy Foote and Aaron Shaw. You can read the book chapter here. This blog post was originally published on the Community Data Science Collective blog.

OpenSym 2017 Program Postmortem

The International Symposium on Open Collaboration (OpenSym, formerly WikiSym) is the premier academic venue exclusively focused on scholarly research into open collaboration. OpenSym is an ACM conference which means that, like conferences in computer science, it’s really more like a journal that gets published once a year than it is like most social science conferences. The “journal”, in this case, is called the Proceedings of the International Symposium on Open Collaboration and it consists of final copies of papers which are typically also presented at the conference. Like journal articles, papers that are published in the proceedings are not typically published elsewhere.

Along with Claudia Müller-Birn from the Freie Universtät Berlin, I served as the Program Chair for OpenSym 2017. For the social scientists reading this, the role of program chair is similar to being an editor for a journal. My job was not to organize keynotes or logistics at the conference—that is the job of the General Chair. Indeed, in the end I didn’t even attend the conference! Along with Claudia, my role as Program Chair was to recruit submissions, recruit reviewers, coordinate and manage the review process, make final decisions on papers, and ensure that everything makes it into the published proceedings in good shape.

In OpenSym 2017, we made several changes to the way the conference has been run:

  • In previous years, OpenSym had tracks on topics like free/open source software, wikis, open innovation, open education, and so on. In 2017, we used a single track model.
  • Because we eliminated tracks, we also eliminated track-level chairs. Instead, we appointed Associate Chairs or ACs.
  • We eliminated page limits and the distinction between full papers and notes.
  • We allowed authors to write rebuttals before reviews were finalized. Reviewers and ACs were allowed to modify their reviews and decisions based on rebuttals.
  • To assist in assigning papers to ACs and reviewers, we made extensive use of bidding. This means we had to recruit the pool of reviewers before papers were submitted.

Although each of these things have been tried in other conferences, or even piloted within individual tracks in OpenSym, all were new to OpenSym in general.

Overview

Statistics
Papers submitted 44
Papers accepted 20
Acceptance rate 45%
Posters submitted 2
Posters presented 9
Associate Chairs 8
PC Members 59
Authors 108
Author countries 20

The program was similar in size to the ones in the last 2-3 years in terms of the number of submissions. OpenSym is a small but mature and stable venue for research on open collaboration. This year was also similar, although slightly more competitive, in terms of the conference acceptance rate (45%—it had been slightly above 50% in previous years).

As in recent years, there were more posters presented than submitted because the PC found that some rejected work, although not ready to be published in the proceedings, was promising and advanced enough to be presented as a poster at the conference. Authors of posters submitted 4-page extended abstracts for their projects which were published in a “Companion to the Proceedings.”

Topics

Over the years, OpenSym has established a clear set of niches. Although we eliminated tracks, we asked authors to choose from a set of categories when submitting their work. These categories are similar to the tracks at OpenSym 2016. Interestingly, a number of authors selected more than one category. This would have led to difficult decisions in the old track-based system.

distribution of papers across topics with breakdown by accept/poster/reject

The figure above shows a breakdown of papers in terms of these categories as well as indicators of how many papers in each group were accepted. Papers in multiple categories are counted multiple times. Research on FLOSS and Wikimedia/Wikipedia continue to make up a sizable chunk of OpenSym’s submissions and publications. That said, these now make up a minority of total submissions. Although Wikipedia and Wikimedia research made up a smaller proportion of the submission pool, it was accepted at a higher rate. Also notable is the fact that 2017 saw an uptick in the number of papers on open innovation. I suspect this was due, at least in part, to work by the General Chair Lorraine Morgan’s involvement (she specializes in that area). Somewhat surprisingly to me, we had a number of submission about Bitcoin and blockchains. These are natural areas of growth for OpenSym but have never been a big part of work in our community in the past.

Scores and Reviews

As in previous years, review was single blind in that reviewers’ identities are hidden but authors identities are not. Each paper received between 3 and 4 reviews plus a metareview by the Associate Chair assigned to the paper. All papers received 3 reviews but ACs were encouraged to call in a 4th reviewer at any point in the process. In addition to the text of the reviews, we used a -3 to +3 scoring system where papers that are seen as borderline will be scored as 0. Reviewers scored papers using full-point increments.

scores for each paper submitted to opensym 2017: average, distribution, etc

The figure above shows scores for each paper submitted. The vertical grey lines reflect the distribution of scores where the minimum and maximum scores for each paper are the ends of the lines. The colored dots show the arithmetic mean for each score (unweighted by reviewer confidence). Colors show whether the papers were accepted, rejected, or presented as a poster. It’s important to keep in mind that two papers were submitted as posters.

Although Associate Chairs made the final decisions on a case-by-case basis, every paper that had an average score of less than 0 (the horizontal orange line) was rejected or presented as a poster and most (but not all) papers with positive average scores were accepted. Although a positive average score seemed to be a requirement for publication, negative individual scores weren’t necessary showstoppers. We accepted 6 papers with at least one negative score. We ultimately accepted 20 papers—45% of those submitted.

Rebuttals

This was the first time that OpenSym used a rebuttal or author response and we are thrilled with how it went. Although they were entirely optional, almost every team of authors used it! Authors of 40 of our 46 submissions (87%!) submitted rebuttals.

Lower Unchanged Higher
6 24 10

The table above shows how average scores changed after authors submitted rebuttals. The table shows that rebuttals’ effect was typically neutral or positive. Most average scores stayed the same but nearly two times as many average scores increased as decreased in the post-rebuttal period. We hope that this made the process feel more fair for authors and I feel, having read them all, that it led to improvements in the quality of final papers.

Page Lengths

In previous years, OpenSym followed most other venues in computer science by allowing submission of two kinds of papers: full papers which could be up to 10 pages long and short papers which could be up to 4. Following some other conferences, we eliminated page limits altogether. This is the text we used in the OpenSym 2017 CFP:

There is no minimum or maximum length for submitted papers. Rather, reviewers will be instructed to weigh the contribution of a paper relative to its length. Papers should report research thoroughly but succinctly: brevity is a virtue. A typical length of a “long research paper” is 10 pages (formerly the maximum length limit and the limit on OpenSym tracks), but may be shorter if the contribution can be described and supported in fewer pages— shorter, more focused papers (called “short research papers” previously) are encouraged and will be reviewed like any other paper. While we will review papers longer than 10 pages, the contribution must warrant the extra length. Reviewers will be instructed to reject papers whose length is incommensurate with the size of their contribution.

The following graph shows the distribution of page lengths across papers in our final program.

histogram of paper lengths for final accepted papersIn the end 3 of 20 published papers (15%) were over 10 pages. More surprisingly, 11 of the accepted papers (55%) were below the old 10-page limit. Fears that some have expressed that page limits are the only thing keeping OpenSym from publshing enormous rambling manuscripts seems to be unwarranted—at least so far.

Bidding

Although, I won’t post any analysis or graphs, bidding worked well. With only two exceptions, every single assigned review was to someone who had bid “yes” or “maybe” for the paper in question and the vast majority went to people that had bid “yes.” However, this comes with one major proviso: people that did not bid at all were marked as “maybe” for every single paper.

Given a reviewer pool whose diversity of expertise matches that in your pool of authors, bidding works fantastically. But everybody needs to bid. The only problems with reviewers we had were with people that had failed to bid. It might be reviewers who don’t bid are less committed to the conference, more overextended, more likely to drop things in general, etc. It might also be that reviewers who fail to bid get poor matches which cause them to become less interested, willing, or able to do their reviews well and on time.

Having used bidding twice as chair or track-chair, my sense is that bidding is a fantastic thing to incorporate into any conference review process. The major limitations are that you need to build a program committee (PC) before the conference (rather than finding the perfect reviewers for specific papers) and you have to find ways to incentivize or communicate the importance of getting your PC members to bid.

Conclusions

The final results were a fantastic collection of published papers. Of course, it couldn’t have been possible without the huge collection of conference chairs, associate chairs, program committee members, external reviewers, and staff supporters.

Although we tried quite a lot of new things, my sense is that nothing we changed made things worse and many changes made things smoother or better. Although I’m not directly involved in organizing OpenSym 2018, I am on the OpenSym steering committee. My sense is that most of the changes we made are going to be carried over this year.

Finally, it’s also been announced that OpenSym 2018 will be in Paris on August 22-24. The call for papers should be out soon and the OpenSym 2018 paper deadline has already been announced as March 15, 2018. You should consider submitting! I hope to see you in Paris!

This Analysis

OpenSym used the gratis version of EasyChair to manage the conference which doesn’t allow chairs to export data. As a result, data used in this this postmortem was scraped from EasyChair using two Python scripts. Numbers and graphs were created using a knitr file that combines R visualization and analysis code with markdown to create the HTML directly from the datasets. I’ve made all the code I used to produce this analysis available in this git repository. I hope someone else finds it useful. Because the data contains sensitive information on the review process, I’m not publishing the data.


This blog post was originally posted on the Community Data Science Collective blog.