Assignment 3: Let's Crowdsource An Exam

    Questions due Saturday May 4 at 12:00pm

    Remixes due Wednesday May 8 at 11:59pm

    Votes due Saturday May 11 at 11:59pm

    [Break for project milestone and exam]

    Reflection due Tuesday May 28 at 11:59pm [late days allowed]

Goal

Crowdsourcing can be a powerful means of constructing collective wisdom. However, as economist Andrew Lo puts it, the wisdom of crowds can also descend into the madness of mobs. The goal in this assignment is to try our hand at wielding the wisdom edge of this particular double-edged sword, and to learn how challenging it can be to control it.

The output of this assignment will be a public question bank that the staff samples from to construct the exam for CS 278. After writing and revising the questions, the class will vote on which questions constitute the best exam. We will publish the top ~10% of questions based on those votes as an exam question bank so you can study from it. Questions on the exam will be drawn from this question bank, with a few staff-written questions added as well. You will automatically get full credit on any exam question that you contributed to, either by writing the original draft or the remixed revision. We're all writing our own exam here, so make sure it's good!

This assignment will be unlike most other assignments in that it will unfurl across multiple cascading deadlines. This setup is because most crowdsourcing algorithms are pipelines, so one part needs to finish before the next part can start. We are disallowing late days on most of these deadlines because it would negatively impact the other students whose questions you are remixing. Writing questions and remixes, and the final reflection, may each take a few hours; voting should take approximately one hour.

Stage 1: Write questions

In this stage, you will write questions for consideration in the class exam. Your goal is to generate three exam questions from three different lectures from the class so far. In Stage 2, another student will remix your questions, and you will remix other students' questions. To ensure there is enough time for classmates to remix your questions, no late days will be allowed for Stage 1.

First, use this link to be assigned three lectures out of the lectures that will be covered in the exam. Guest visitor days are not included in this sampling process.

  1. Going Viral
  2. Bustling Streets and Ghost Towns
  3. Norms
  4. Cold Start
  5. Growing Pains
  6. Feed Ranking
  7. Strong and Weak
  8. Group Collaboration
  9. The Wisdom of Crowds
  10. Peer Production
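
The assignment link handles the sampling for you, and its implementation is not described here. Purely as an illustration of what "assigned three lectures out of the lectures covered in the exam" could look like, here is a minimal sketch that draws three distinct lectures without replacement. The function name `assign_lectures` and the idea of seeding by a student ID are assumptions made for this example, not details of the course system.

```python
import random

# Lecture titles as listed in this assignment (guest visitor days excluded).
LECTURES = [
    "Going Viral",
    "Bustling Streets and Ghost Towns",
    "Norms",
    "Cold Start",
    "Growing Pains",
    "Feed Ranking",
    "Strong and Weak",
    "Group Collaboration",
    "The Wisdom of Crowds",
    "Peer Production",
]


def assign_lectures(student_id: str, k: int = 3) -> list[str]:
    """Pick k distinct lectures for one student.

    Seeding the generator with the student ID (an assumption for this
    sketch) makes the draw repeatable if the link is opened twice.
    """
    rng = random.Random(student_id)
    return rng.sample(LECTURES, k)  # sampling without replacement


if __name__ == "__main__":
    print(assign_lectures("student-42"))
```

Sampling without replacement is what guarantees the three questions come from three different lectures.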

Write one excellent exam question per sampled lecture. The exam will be in class, closed-book and closed-notes, so your question needs to work as a closed-book question. Don't focus on simple recall questions: instead, test whether people deeply understand and can apply a concept from the lecture.

The question should be answerable in one to two short paragraphs: no essay questions, no multiple choice, no true/false. For each question you write, make sure that it focuses on one or two concepts from a single lecture — don't go too broad, or it won't be answerable in a short paragraph.

Furthermore, because everyone in the class is familiar with a different set of socio-technical systems, don't assume that the reader of the question knows about or uses any system you mention in your question — explain anything about the system that would be needed to answer the question, or name multiple possible socio-technical systems in case the student doesn't know about the one you're referencing. Finally, when we are sampling questions for the exam, we will be looking for questions that engage with concepts on a deeper level, and potentially draw on important ideas beyond the "one big concept" in each lecture—so if you’re aiming to have your question chosen and get full credit for it on the exam, go beyond the surface.

Aim for your question to separate the students who understand the concept at an A grade level from the students who understand the concept at a B grade level. In other words, someone who had only studied to a B level understanding of this content would get the question wrong, but someone who had studied to an A level understanding would get it right. Ideas for questions might include:

  • Given a concrete social computing system or situation, can the student apply the right social science concept from class to explain why that situation is occurring?
  • Given a problem with a social computing system or a design goal, can the student recommend the right design intervention from class that will address the problem or achieve the design goal?
  • Could the student predict which of two alternative social computing systems might be more likely to achieve a given design goal, and ground their rationale in a concept from the class?
  • Could the student predict what undesired outcome might arise from a particular social computing design?

Questions that will not get chosen by the TA:

  • If everybody writes their question about the same exact topic for a lecture, any individual question will be unlikely to be chosen since there will be so many covering the same thing. Dig a bit deeper.
  • What is [concept]?
  • How does [concept] work on [system that not everyone has heard of]?
  • Questions that require deep Stanford-specific knowledge (remember, SCPD students may not be familiar with Stanford culture)
  • Questions that (a) have an excessive number of sub-questions, (b) have more than a few sub-questions, and (c) are split into many sub-questions.

To give you a sense, here are some sample questions from previous years. Obviously, don't use these:

  • Pick a socio-technical platform: either one with an anonymous environment, or one that uses real names. What are the benefits of having the user's identity being anonymous/real name in the context of the socio-technical platform you chose? Explain how these norms may change if the use of identity on the site were reversed (e.g., if a platform replaces anonymous names with usernames, or vice versa).
  • You are building a new social platform called Bernster, which lets you connect to other people interested in human-computer interaction. Your co-founder, Michael, just wants to launch and let users do whatever they want. What problems might arise if you don’t explicitly set up norms for user behavior?

Submit your three questions, one per lecture, on our online system. You will have the opportunity to resubmit your questions as many times as you like before the deadline.

Q: What if I forget to submit something for Stage 1? Notify course staff. As long as you are assigned lectures, continue with Stage 2.

Stage 2: Remix questions

Crowdsourcing would be easy if everything always went exactly as you intended. But other people aren't in your head, so things can go in unexpected directions. Now, you're going to be entrusting your peers with your questions, and hoping that they turn out the way you wanted. And you, likewise, will be remixing other students' questions, trying to improve them in order to make the best exam possible. These remixed questions are the ones that will be voted on and sampled from to create the exam. Because other students' extra credit depends on your remixes of their questions being entered into the system before the voting goes live, you may not take any late days on Stage 2.

When Stage 2 launches, you will receive an announcement with a link to receive three questions written by others. For each question, explain the learning goal of the question and remix it to generate three alternative rewrites:

  • Medium-difficulty question: Rewrite each question to improve the original question, keeping the original intended difficulty level. We will call these questions medium difficulty, in that they are separating a student with a B level understanding of the concept from a student with an A level understanding of the concept. Your question remix might clarify the wording, tune the difficulty so that it better fits the medium-level difficulty criteria, make sure that the question really drills into a desired concept and not too many concepts at once, ensure that it can be answered in one to two paragraphs, or anything else that you think needs to be improved. While the changes may be minor wording fixes if that's the best path, you may not leave the question as is even if you think it's good: such is the nature of many crowdsourcing workflows.
  • Easy question: consider the question that you just improved to be a Medium difficulty question, because it separates the students with a B level understanding of the concept from students with an A level understanding of the concept. Now, create a remixed version that is an Easy question, able to separate students with a C level understanding of the concept from students with a B level understanding of the concept. This remixed question should be testing the same general learning goal or concept in the same style as the original question, but simpler.
  • Hard question: The final remixed question should be Hard, able to separate the students with an A-minus level understanding of the concept from students with an A level understanding of the concept. Again, this should be a remix of the original question, rather than a complete rewrite from scratch.

You now have three remixes of the same question: an Easy question, a Medium question, and a Hard question.

Submit all three question remixes for each original question (a total of 3 remixes/original question * 3 original questions = 9 remixes) and the learning goal for each question to our online system. You will have the opportunity to resubmit your remixes as many times as you like before the deadline.

Q: What if I forget to submit something for Stage 2? Notify course staff. Continue with Stage 3.

Stage 3: Vote on questions for the exam

Now, it's time to identify which questions we think will make the most effective exam questions. This will have massive implications for your exam. You have less time for this part of the assignment, but it should be doable in an hour or less. We cannot delay the release of the question bank, because the class needs it to study for the exam; for this reason, you may not take late days on Stage 3.

You will complete 50 paired comparison votes. Each comparison will be asking for your opinion on two possible exam questions. Decide which of the two will make for a better exam question:

  • Does the question effectively test the concept in question?
  • Is the question a good fit for a closed-book exam?
  • Is the question at its stated difficulty level?

Comparisons will always be sampled from the same group (e.g., compare two Easy questions in one round, compare two Hard questions in another round). You will not be able to vote on any questions that you authored in Stage 2, or any questions that were written by remixing your questions from Stage 1. We will rank questions using TrueSkill scores from these paired comparisons, like with the meme ranking from Assignment 1.
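
To make the mechanics above concrete, here is a rough sketch of the two steps: drawing a pair from a single difficulty group while excluding questions the voter contributed to, and updating ratings after a vote. The update is a simplified two-player, no-draw rule in the style of TrueSkill; it omits the draw margin and the dynamics factor, so it is not the staff's actual scoring code. The `Question` class, the `contributors` field, and both function names are hypothetical, and the starting values (mean 25, uncertainty 25/3, beta 25/6) are the commonly used TrueSkill defaults rather than values stated in this assignment.

```python
import random
from dataclasses import dataclass
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


@dataclass
class Question:
    qid: str
    difficulty: str          # "Easy", "Medium", or "Hard"
    contributors: frozenset  # hypothetical: IDs of the original author and the remixer
    mu: float = 25.0         # rating mean (assumed default)
    sigma: float = 25.0 / 3  # rating uncertainty (assumed default)


def sample_pair(questions, voter, difficulty, rng):
    """Draw two questions of one difficulty that the voter did not contribute to."""
    eligible = [
        q for q in questions
        if q.difficulty == difficulty and voter not in q.contributors
    ]
    return rng.sample(eligible, 2)


def record_vote(winner, loser, beta=25.0 / 6):
    """Simplified TrueSkill-style update for one paired comparison (no draws)."""
    w_var, l_var = winner.sigma ** 2, loser.sigma ** 2
    c = (2 * beta ** 2 + w_var + l_var) ** 0.5
    t = (winner.mu - loser.mu) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)  # size of the mean shift
    w = v * (v + t)                            # size of the variance shrink
    winner.mu += w_var / c * v
    loser.mu -= l_var / c * v
    winner.sigma = (w_var * (1 - w_var / c ** 2 * w)) ** 0.5
    loser.sigma = (l_var * (1 - l_var / c ** 2 * w)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    bank = [
        Question("q1", "Easy", frozenset({"alice", "bob"})),
        Question("q2", "Easy", frozenset({"carol", "dave"})),
        Question("q3", "Easy", frozenset({"erin", "frank"})),
    ]
    first, second = sample_pair(bank, "alice", "Easy", rng)
    record_vote(winner=first, loser=second)
    print(first, second)
```

An unexpected result (a low-rated question beating a high-rated one) moves the means more than an expected one, and every vote shrinks the uncertainty of both questions.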

Since this is a crowdsourcing assignment, we will follow the crowdsourcing strategy of including gold standard tasks in the question set as attention checks. Gold standard questions are pairs of options that the TAs have handpicked to have clear correct answers as to which question is better. If you fail the gold standard questions, you will be asked to redo your votes. Just like real crowdsourcing workflows, we include these attention checks because, in the past, some students would just click randomly in order to get their votes over and done with.
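
As an illustration of how a gold standard check of this kind can be scored, the sketch below compares a voter's choices on the staff-picked pairs against the staff's answers. The function name, the dictionary layout, and the `min_accuracy` threshold are all assumptions for this example; the assignment does not state how many gold standard questions there are or how many you must get right.

```python
def passes_gold_checks(
    votes: dict[str, str],
    gold: dict[str, str],
    min_accuracy: float = 1.0,
) -> bool:
    """Return True if the voter agreed with staff on enough gold standard pairs.

    votes maps a pair ID to the question ID the voter chose.
    gold maps a pair ID to the question ID the staff judged better.
    min_accuracy is an assumed pass threshold, not a course policy.
    """
    checked = [pair for pair in gold if pair in votes]
    if not checked:
        # The voter has not seen any gold pairs yet; treat as not verified (assumption).
        return False
    correct = sum(votes[pair] == gold[pair] for pair in checked)
    return correct / len(checked) >= min_accuracy


if __name__ == "__main__":
    gold = {"pair-7": "qA", "pair-19": "qD"}
    votes = {"pair-1": "qX", "pair-7": "qA", "pair-19": "qC"}
    print(passes_gold_checks(votes, gold))                    # strict threshold
    print(passes_gold_checks(votes, gold, min_accuracy=0.5))  # lenient threshold
```

Only the gold pairs are scored; ordinary comparisons have no "correct" answer, so they are ignored by the check.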

The staff will release the exam question bank after the votes close.

Q: What if I forget to submit something for Stage 3? You will still be able to do the reflection using random results in Stage 4.

Stage 4: Final reflections

What happened to your questions? Visit this link to see how your question and remix submissions did. If you missed any submission stages (i.e., submitting original questions and/or remixes), you can go to this link to view a random question set's progression through the crowdsourcing pipeline to help you with the reflection.

Please submit a PDF of up to 500 words containing:

  • Your original, unremixed questions (not counted towards word count)
  • Your remixes of other students' questions (not counted towards word count)
  • The remixed versions (written by other students) of your original questions (not counted towards word count)
  • A description of what you were aiming for with your original questions
  • A reflection on what happened when your classmates remixed your original questions, including whether they made them better or worse, and why that might have happened
  • An analysis of why you think the voting on your questions turned out the way that it did, including whether you agree with the final scores assigned to those questions.
  • Go look again at the published question bank, which captured the class's highest-voted questions, and compare those questions to yours. Based on this and reflections of what happened to your questions in the crowdsourcing process, write brief overall reflections on crowdsourcing as a technique for eliciting collective wisdom. Draw on course concepts in your reflection.

Honor Code

As the crowdsourcing lectures made clear, for crowdsourcing to be effective, we need to aggregate independent judgments. So, it is critical that you not collaborate on this assignment. This includes sharing questions with others, or communicating about what to vote for. Imagine that this were any other CS class with coding: in those classes, you would be disallowed from sharing your code with fellow students, or creating a massive Google Drive of shared code to work off of each other. Apply those norms from other classes here. Attempts to "game the system" by upvoting your friends' questions, or upvoting questions for any reasons other than the three criteria in the assignment, are in violation of the principles described in the course and of the Honor Code. Such votes would bias the final question pool in a way that unfairly disadvantages people who genuinely worked hard to write good questions and advantages people with large social networks in the class.

After the question bank is published, you are welcome to study with others for the exam. You may hold study sessions with other students to discuss the questions in the question bank before the exam, but you may not collaborate on, or share, written notes or answers. The exam will be closed-book, closed-note.

Extra Credit

You get automatic full credit on any exam question that you can take credit for. If you created the original question in Stage 1, or if you created the remixed version in Stage 2, and the question is included on the exam, you will get free credit on that question in the exam. This means that if a remixed question that you wrote (or a question remixed from your original question) gets voted highly in the TrueSkill rankings, you have a stronger chance of getting those free points on the exam.

Grading

You will be graded primarily on two factors: (1) your completion of the process; and (2) your analysis of the crowdsourcing pipeline and what happened to your questions.

Grading Rubric

Each reflection category is scored at one of four levels: Insufficiency, Adequacy, Proficiency, or Mastery.

Questions, Remixes, Votes (12 points)

Points for on-time submission of each stage of the crowdsourcing process. Four points for each of: (1) Write and submit three questions on the given lectures; (2) Remix three questions into Easy, Medium, and Hard variants; (3) Vote on 50 pairs of questions.

Reflection: Remixing (4 points)

  • Insufficiency: Incomplete or inadequate completion of the reflection.
  • Adequacy: Reports what happened to the questions, but no reflection on whether they were improved, or reasons why the changes occurred as they did.
  • Proficiency: Explains whether the remixed questions improved or weakened the initial intent, but with only surface-level reflection as to why.
  • Mastery: Puts forward a plausible and thoughtful analysis as to why the questions were remixed the way that they were.

Reflection: Voting (4 points)

  • Insufficiency: Incomplete or inadequate completion of the reflection.
  • Adequacy: Reports what happened during voting, but no reflection on why or whether you agree.
  • Proficiency: Only surface-level exploration of why the voting turned out the way that it did (e.g., "I guess people didn't like it").
  • Mastery: Puts forward a plausible and thoughtful analysis as to why the vote results occurred, and why they agree or disagree with the outcome.

Reflection: Crowdsourcing Process (4 points)

  • Insufficiency: Reflection does not engage with the crowdsourcing outcomes.
  • Adequacy: Reflection offers a point of view but does not integrate course concepts.
  • Proficiency: Reflection offers a point of view on crowdsourcing that engages surface-level with course concepts.
  • Mastery: Strong integration of course concepts to ground an opinion on crowdsourcing as a process of eliciting collective wisdom.