# What product managers can learn from hospice nurses

I wasn’t expecting to learn anything about product management and software development from a book about how we age and care for aging family, but I did. I have just finished reading Being Mortal by Atul Gawande, who also wrote The Checklist Manifesto and conducted the research behind the transformational changes it describes. (Image credit: Public domain: https://commons.wikimedia.org/wiki/File:Stethoscope-2.png)

From doctor-knows-best to patient-knows-best. Gawande talks about the changing role of doctors over time. In our grandparents’ era, ‘doctors knew best’. That was the age of the authoritarian doctor who made the decisions and was trusted to do so. Now, in contrast, for the most part, doctors are considered technical experts who can share information, but decisions ultimately rest with the patient. The idea is that the patient knows best, when given all the facts. However, when patients face important crossroads in their treatment and there are many uncertainties, neither approach works well and both lead to escalating interventions and, often, miserable people.

Both lead to suffering. The authoritarian path didn’t work well for patients because it didn’t take the patient’s fears and hopes into account at all. Without the patient’s preferences, doctors recommend actions within their own sphere of expertise. Surgeons recommend surgery rather than hospice because surgery is what they know best. But Gawande, who was trained in the current technical-expert-sharing-information model of doctoring, illustrates how information sharing goes just as wrong when it comes to delivering the experience that patients would want. He tells several stories from his own practice in which patients clearly said, “I don’t want to suffer, and I don’t want pointless heroics,” but then chose to proceed through many, many rounds of painful procedures with a very low probability of success. Why is that?

Software development has the same duality. The authoritarian model: Which employee is the ‘owner’ of this product? They should make all the decisions about what to develop, and be responsible for the consequences. Or, alternatively, the ‘information’ model: let’s have the expert (the product manager) gather the facts and present choices to leaders and other stakeholders, or let’s develop objective metrics to guide us. And similarly, it can feel like we see-saw between decisions made with too little information and decisions where the information was there but was never pulled together into the right decision. So I was very interested in why neither approach is working for doctors and patients, and what might be a better approach.

Why the informed patient model still fails. Gawande’s analysis of the informed patient that still makes the wrong decisions is that they don’t have the experience or the medical model to make the decision, even when they have all the facts. So, assuming you aren’t an astronaut or experienced physicist, think about it this way. If I put you in a space capsule, ask you where you want to go (which is what the doctors-know-best doctors forgot to do), and I tell you a bunch of readouts and their percentage likelihood of being correct, and then ask you whether to launch, you still aren’t going to be able to make a good decision, because you don’t have a model in your head about how all those measurements add up. Gawande specifically talks about how a patient might be imagining that a particular procedure with a high likelihood of success could extend their life by years, when in reality it is likely to be weeks, not years. The patient doesn’t have the experience to put all the information together into a coherent model and make a good choice.

So now what? So, is there a middle ground? Gawande describes a model where doctors gather even MORE information about what a patient WANTS by using four questions that come from the world of hospice. Then, the doctor combines the patient’s answers with their professional experience to guide patients in making decisions that are consistent with the patient’s own desires.

Learning from hospice nurses. The four questions are also interesting. 1. What do you understand about your situation? 2. What do you fear? 3. What do you hope for? 4. What trade offs are you willing and not willing to make?

Ask, tell, or guide? A middle path to product development. In the world of software development, many organizations (including the one I work for) hire Product Managers to lead product development. Specifically, Product Managers are responsible for deciding what features should be added to a product and with what priority. So the question is “Are Product Managers owners (the authoritarian model), expert consultants (the information sharing model), or expert guides (the new model Gawande proposes)?”

I would posit that the same insight Gawande has about doctoring is the right insight for product development. Product Managers don’t own their product. The product isn’t FOR them and there are too many critical stakeholders for them to be owners. But they also can’t present information and expect decisions from business leadership, precisely because the business leaders don’t have the full context and understanding of the detailed workings of the product and market. The product managers DO have that context.

My key insight from Being Mortal is those four questions that hospice nurses taught doctors to use to help them guide their patients. I am curious about whether those questions can be adapted to gather the right information, especially from business leadership and organizational stakeholders that don’t directly interact with the product, to allow Product Managers to wisely incorporate their requirements into good decisions.

So here are the four questions again:

1. What do you understand about your situation?
2. What do you fear?
3. What do you hope for?
4. What trade offs are you willing and not willing to make?

Frankly, they almost work as is. I would only change the phrase ‘your situation’ in the first one to match the context. It could be ‘what do you understand about our goals’, ‘what do you understand about our revenue position’, ‘what do you understand about our strategy for …’

I am going to give these a try. Let me know if you do too!

# Getting to the truth, the ground truth, and nothing but the ground truth.

## Takeaways for learning from HCOMP 2019, Part 2

At HCOMP 2019, there was a lot of information about machine learning that I found relevant to building educational technology. Surprisingly to me, I didn’t find other ed-tech companies and organizations at the Fairness, Accountability, and Transparency conference I went to last year in Atlanta or the 2019 HCOMP conference. Maybe ed-tech organizations don’t have research groups that are publishing openly and thus they don’t come to these academic conferences. Maybe readers of this blog will send me pointers to who I missed!

Mini machine learning terminology primer from a novice (skippable if you already know these): To train a machine learning algorithm that is going to decide something or categorize something, you need to start out with a set of things for which you already know the correct decisions or categories. Those are the ‘ground-truths’ that you use to train the algorithm. You can think of the algorithm as a toddler. If you want the algorithm to recognize and distinguish dogs from cats, you need to show it a bunch of dogs and cats and tell it what they are. Mom and Dad say —  “look, a kitty”; “see the puppy?” An algorithm can be ‘over-fitted’ to the ground truth you give it. The toddler example is when your toddler knows the animals you showed them (that Fifi is a cat and Fido is a dog), but doesn’t know what new animals are, for example the neighbor’s pet cat. To add a further wrinkle, if you are creating a ground-truth, it is always great if you have Mom and Dad to create the labels, but sometimes all you can get are toddlers (novices) labeling. Using novices to train is related to the idea of wisdom of the crowd, where the opinion of a collection of people is used rather than a single expert.  You can also introduce bias into your algorithm by showing it only calico cats in the training phase, causing it to only label calicos as “cats” later on. Recent real world examples of training bias come from facial recognition algorithms that were trained on light-skinned people and therefore have trouble recognizing black and brown faces.

Creating ground truth: A whole chunk of the talks were about different ways of creating ‘ground truths’ using ‘wisdom of the crowd’ techniques. Ed-tech needs quite a bit of ground-truth about the world to train algorithms to help students learn effectively. “How difficult is this task or problem?” “What concepts are needed to do this task/problem?” “What concepts are a part of this text/example/explanation/video?” “Is this solution to this task/problem correct, partially correct, displaying a misconception, or just plain wrong?”

Finding the best-of-the-crowd: Several of the presentations were about finding and motivating the best of the crowd. If you can find and/or train ‘experts’ in the crowd, you can get to the ground-truth at lower cost (in time or money). I am hoping that ed-tech can use these techniques to crowdsource effective practice exercises, examples, solutions, and explanations.

1. Wisdom of the toddlers. Heinecke et al. (https://aaai.org/ojs/index.php/HCOMP/article/view/5279) described a three-step method for obtaining a ground truth from non-experts. First, they used a large number of people and expensive mathematical methods to obtain a small ground truth. (If we are sticking with the cats and dogs example from the primer above, you have a large number of toddlers tell you whether a few animals are cats or dogs and use math to decide which animals ARE cats and ARE dogs using wisdom of the toddlers.) From there, step 2 is to find the small subset of those people who were the best at determining the ground truth, and use them to create more ground truth. (Find a group of toddlers who together labeled the cats and dogs correctly, and use them to label a whole bunch more cats and dogs.) Finally, you use the large set of ground truth to train a machine learning algorithm. I think this is very exciting for learning content because we have students and faculty doing their day-to-day work and we might be able to find sets of them that can help answer the questions above.
2. Misconceptions of the herd: One complicating factor in educational technology ground truths is the prominent presence of misconceptions. The Best Paper winner at the conference, Simoiu et al. (https://aaai.org/ojs/index.php/HCOMP/article/view/5271), found an interesting, relevant, and in hindsight unsurprising result. This group did a systematic study of crowds answering 1000 questions from 50 different topical domains. They found that averaging the crowd’s answers almost always yields significantly better results than the average (50th percentile) person. They also wanted to see the effects of social influences on the crowd. When they showed the ‘consensus’ answer (the current three most popular answers) to individuals, the crowd would be swayed by early wrong answers and thus did NOT perform on average better than the average unswayed person. Since misconceptions (wrong answers due to faulty understanding) are a well-known phenomenon in learning and are particularly resistant to change (if you haven’t seen Derek Muller’s wonderful 6 minute TED talk about this, go see it now!), we need to be particularly careful not to aid their contagion when introducing social features.
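
The three-step method in item 1 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: plain majority voting stands in for the "expensive mathematical methods" in the paper, the labeler names and votes are made up, and step 3 (training the model) is left out entirely.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label in a list of votes."""
    return Counter(votes).most_common(1)[0][0]


def seed_ground_truth(votes_by_item):
    """Step 1: many labelers vote on a few items.

    Majority vote is an assumed stand-in for the paper's heavier math.
    """
    return {item: majority_label(list(votes.values()))
            for item, votes in votes_by_item.items()}


def best_labelers(votes_by_item, truth, top_k):
    """Step 2a: rank labelers by how often they agreed with the seed truth."""
    correct, total = Counter(), Counter()
    for item, votes in votes_by_item.items():
        for labeler, label in votes.items():
            total[labeler] += 1
            if label == truth[item]:
                correct[labeler] += 1
    ranked = sorted(total, key=lambda name: (-correct[name] / total[name], name))
    return ranked[:top_k]


def label_more(new_votes_by_item, trusted):
    """Step 2b: only the trusted labelers get a say on new items."""
    labels = {}
    for item, votes in new_votes_by_item.items():
        trusted_votes = [label for labeler, label in votes.items()
                         if labeler in trusted]
        if trusted_votes:
            labels[item] = majority_label(trusted_votes)
    return labels


# Hypothetical toddlers and animals, invented for this example.
seed_votes = {
    "fifi": {"ann": "cat", "bo": "cat", "cy": "dog", "di": "cat"},
    "fido": {"ann": "dog", "bo": "dog", "cy": "dog", "di": "cat"},
    "tom": {"ann": "cat", "bo": "cat", "cy": "cat", "di": "dog"},
}
seed_truth = seed_ground_truth(seed_votes)
trusted = best_labelers(seed_votes, seed_truth, top_k=2)

new_votes = {
    "rex": {"ann": "dog", "bo": "dog", "di": "cat"},
    "kit": {"ann": "cat", "bo": "cat", "cy": "dog"},
}
more_truth = label_more(new_votes, set(trusted))
```

The larger `more_truth` set is what would then feed step 3. Note that the ranking in step 2a only measures agreement with the crowd, so a shared misconception (item 2) would pass straight through it.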

Are misconceptions like overfitting in machine learning? As an aside, my friend and colleague Sidney Burrus told an interesting story that sheds light on the persistence of misconceptions. Sidney talked about how, during the initial transition point between an earth-centered and sun-centered model of the solar system, the earth-centered model was much better at predicting orbits, because people had spent a lot of time adding detail to the model to help it correctly predict known phenomena. The first sun-centered models, however, used circular orbits and did a poor job of prediction, even though they had more ‘truth’ in them ultimately. Those heavily refined earth-centered models were tightly ‘fitted’ to the known orbits. They would not have been good at predicting new orbits, just like an overfitted machine learning model will fail on new data.

# Key meetings and the power of “What’s up with that?”

One of the biggest privileges of leadership is building on the brilliance and creativity of others. What I can do on my own is small compared to what I can do as part of a team. And, hopefully, what I have learned over the years helps the ambitious, creative, brilliant people that work with me achieve meaningful goals. It is definitely a messy business because people are messy, precisely because of the unique talents and perspectives we all bring. The following are not exclusively my ideas, but I have tested and tested and tested them over again, and they have proven their value. Where possible, I will tell you what sources I based these strategies on.

1. Some meetings are key. You have to meet with people one on one every week for at least 30 minutes.
   1. Why one on one? Because when something is uncomfortable or going wrong, it either won’t come out in a group meeting or will come out sideways and create the additional need for understanding and repair with a lot more people.
   2. Why once a week? Because if the frequency is less than that, the vacations, travel, and illness that occasionally derail these meetings create breaks that can span three to four weeks, and a month is definitely too long for a problem to fester.
   3. Why 30 minutes? It takes 30 minutes to talk about a complicated subject. I have actually found that when I meet with someone about projects AND people (including themself), I need to schedule 45 minutes minimum, because the complicated subject often comes up after some simpler things get discussed.
2. The power of “What is up with that?” I learned this respectful strategy for recovering from failures from The 7 Habits of Highly Effective People by Stephen Covey and the wonderful parenting book How to Talk So Kids Will Listen & Listen So Kids Will Talk by Adele Faber and Elaine Mazlish. The basic idea is that, when you think something has gone wrong, the most important factor in resolving the problem is understanding it. The first step is to describe the problem non-judgementally. Then ask “What is up with that?” (For non-American English speakers, you can try “What happened with that?” or “What is going on with that?”) So, for example, the non-judgemental description part might be “I noticed that we aren’t going to deliver feature-x by time-y,” or “I noticed that we had an outage yesterday,” or “People looked tense in meeting z.” Then you ask “What is up with that?” of the person most likely to be responsible and/or have critical information.

As a leader or manager it is your responsibility to discuss any setbacks your team faces, but no leader is perfect at handling those conversations. This technique works for anyone, but it is particularly useful for both under- and over-reactors. For those of you who might prefer to avoid conflicts, it gives you a way to discuss something in a simple, factual, non-confrontational way. It also works for those who tend to jump to conclusions and overreact, which could intimidate the receiver of the message, because it gives the receiver space and respect to respond.

The key to this technique is not to overload either the description or the question with whatever stories you might be telling yourself about why it happened or who is to blame. Leaving those stories out preserves trust, maintains a team atmosphere, and keeps the person you are asking on your side and un-defensive. The form of the question is important because it is neutral and open to the information the person you are asking has. You will be surprised what you learn, from a heart-felt “I screwed up and here is how I am fixing it” to “Oh, because we got an opportunity for even better feature la-di-da” to “We didn’t have this important thing we need. Can we work together to figure out how to get that?” It takes a ton of practice, but the good part of this is that the more you do it, the more you trust that you really do have colleagues working with you and not against you.

# HCOMP 2019 Part 1 – Motivation isn’t all about credit.

## HCOMP 2019: Humans and machines as teams – takeaways for learning

The HCOMP (Human Computation) 2019 conference was about humans and machines working as teams and, in particular, combining ‘crowd workers’ (like those on Mechanical Turk and Figure Eight) and machine learning effectively to solve problems. I came to the conference to ‘map the field’ to learn about what people are researching and exploring in this area and to find relevant tools for building effective educational technology (ed-tech). I had an idea that this conference could be useful because ed-tech often combines the efforts of large numbers of educators and learners with machine learning recommendations and assistance. I wasn’t disappointed. The next few posts contain a few of the things that I took away from the conference.

Pay/Credit vs. Quality/Learning. Finding the sweet spot. Ed-tech innovators and crowd work researchers have a similar optimization problem: finding the sweet spot between fairness and accuracy. For crowd workers, the tension comes from a need to pay fairly for time worked, without inadvertently incentivizing lower quality work. The sweet spot is fair pay for repeatably high quality work. We have an almost identical optimization problem with student learning, if you consider student “pay” to be credit for work, and student “quality” to be learning outcomes. The good news is that while the two are often in tension with each other, those sweet spots can be found. Two groups in particular found interesting results in this area.

1. Quality without rejection: One group investigating the repeatability of crowd work (Qarout et al.) found a difference in quality of about 10% between work produced on Figure Eight and on Amazon Mechanical Turk (AT). AT allows requesters to reject work they deem low-quality and Figure Eight doesn’t, and it was the AT workers who completed tasks at the higher quality. However, the AT workers also reported higher stress. Students also report high levels of stress over graded work and fear making mistakes, both of which can result in detriments to learning, but we have found that students on average put in less effort when work is graded for completion rather than correctness. Qarout et al. tried a simple equalizer. At the beginning of the job, on both platforms, they explicitly said that no work would be rejected, but that quality work would be bonused. This adjustment brought both platforms up to the original AT quality, and these modified AT tasks were chosen faster than the original ones because the work was more appealing once rejection was off the table. It makes me think we should be spending a lot of research time on how to optimize incentives for students expending productive effort without overly relying on credit for correctness. If we can find an optimal incentive, we have a chance to both increase learning and decrease stress at the same time. Now that is a sweet spot.

2. Paying fairly using the wisdom of the crowd: A second exploration that has implications for learning is FairWork (Whiting et al.). This group at Stanford created a way for those wishing to pay Amazon Mechanical Turk workers $15/hour to algorithmically make sure that people are paid an average of $15/hour. Figuring out how long a task takes on AT is hard, similar to figuring out how long a homework assignment takes, so what the Stanford group did was ask workers to report how long their task took, then throw out outliers and average the remaining times. They then used Amazon’s bonusing mechanism to auto-bonus work up to $15/hour. The researchers used some integrated tools to time a sample of workers (with permission) to see if the self-reported averages were accurate and found that they were. They plan to continue to research how well this works over time. For student work, we want to know whether students are spending enough effort to learn and we want them to get fair credit for their work. So it makes sense to try having students self-report their study time, and using some form of bonusing for correctness to balance incentivizing effort without penalizing the normal failure that is part of trying and learning.
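
The arithmetic behind that auto-bonus is simple enough to sketch. The snippet below is a minimal illustration, not the FairWork implementation: the outlier rule (drop reports more than three times the median or less than a third of it), the function names, and the sample numbers are all assumptions made for the example. Only the $15/hour target comes from the description above.

```python
from statistics import mean, median

TARGET_HOURLY_RATE = 15.00  # dollars per hour, the target described above


def typical_minutes(reported_minutes, tolerance=3.0):
    """Average the self-reported task times after dropping outliers.

    Assumed outlier rule: keep only reports within a factor of
    `tolerance` of the median report.
    """
    mid = median(reported_minutes)
    kept = [m for m in reported_minutes
            if mid / tolerance <= m <= mid * tolerance]
    return mean(kept)


def bonus_per_task(base_pay, reported_minutes):
    """Bonus that lifts the average pay for a task to the target rate."""
    fair_pay = TARGET_HOURLY_RATE * typical_minutes(reported_minutes) / 60
    return round(max(0.0, fair_pay - base_pay), 2)


# Hypothetical reports: four workers took about 5 minutes, one says an hour.
reports = [4, 5, 6, 5, 60]
bonus = bonus_per_task(base_pay=0.50, reported_minutes=reports)
```

With these made-up numbers the 60-minute report is discarded, the typical time is 5 minutes, fair pay is $1.25, and a task that paid $0.50 gets a $0.75 bonus. The same shape would apply to homework: swap pay for credit and task time for self-reported study time.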

# Accessibility Sprint – Part 3: Giving non-visual feedback for learning from interacting with PhET simulations

This is the third part of a series of blog posts about a coding sprint about creating interactive online learning that is usable for people with disabilities.

The first post gives an overview of the coding sprint. Each of these subsequent posts describes the work of one team.

### Sims Team Goal

Make the University of Colorado Boulder’s well respected, freely available, open-source PhET simulations more accessible for students who cannot see the simulation. By providing just the right amount of aural feedback about what is happening in the simulation after an action taken by a learner, blind and low-vision students could interact with the simulation, hear the results, and try additional actions to understand the underlying physics principles.

For example, PhET has been working on making their Balloons and Static Electricity simulation accessible by including scene descriptions that screen readers read aloud in order to orient learners who can’t just look around to see what looks controllable. The controls are all accessible via keyboard actions. But when a learner takes an action, for instance removing a charged wall that is keeping the balloon steady, the resulting balloon movement must be described. It would overwhelm the listener if small changes were repetitively described, and it can be confusing if messages end up being read out of logical order. For instance, messages about the balloon’s movement might end up being read after a message describing its reaching an object and stopping.

This group decided to work on extending the messaging being reported by this balloon sim, in order to better report very dynamic events, such as moving the balloon, or the balloon moving itself (attracted to the sweater), without overwhelming and overlapping messages. To do this, they designed an UtteranceQueue, which is a FIFO (first in, first out) message queue with certain rules: it takes an object that contains an utterance, an object the utterance is associated with, an expected utterance time (to delay before the next utterance), and a callback that returns a boolean, to allow the utterance to be cancelled, rather than spoken, when it reaches the top of the queue. This should allow a simulation programmer to design the set of messages a particular object should report. For example, the balloon would report being moved, as well as its state of charge, and whether it is stuck to something. The callback would allow, for example, the balloon movement messages to cancel themselves if the balloon is in fact now stuck to the sweater or wall.
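
To make the queue rules concrete, here is a small Python sketch of the idea. It is not the team's code (PhET simulations are not written in Python), and every name in it other than UtteranceQueue is hypothetical. It also assumes the callback returns True when the message should still be spoken; the sprint design may use the opposite convention.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Utterance:
    text: str                            # what the screen reader should say
    source: object                       # the sim object the message is about
    expected_seconds: float              # delay before the next utterance
    still_relevant: Callable[[], bool]   # assumed polarity: False cancels it


class UtteranceQueue:
    """First in, first out queue that drops stale messages when they surface."""

    def __init__(self, speak):
        self._pending = deque()
        self._speak = speak  # hypothetical hook that hands text to a screen reader

    def add(self, utterance):
        self._pending.append(utterance)

    def step(self):
        """Speak the oldest message that is still relevant.

        Returns how long to wait before calling step() again, or 0.0
        if nothing was spoken.
        """
        while self._pending:
            utterance = self._pending.popleft()
            if utterance.still_relevant():
                self._speak(utterance.text)
                return utterance.expected_seconds
        return 0.0


# Made-up balloon state and messages, loosely following the wall-removal case.
spoken = []
queue = UtteranceQueue(speak=spoken.append)
balloon = {"stuck_to_sweater": False}


def balloon_is_moving():
    return not balloon["stuck_to_sweater"]


queue.add(Utterance("Balloon moves toward sweater.", balloon, 2.0, balloon_is_moving))
queue.add(Utterance("Balloon still moving toward sweater.", balloon, 2.0, balloon_is_moving))
queue.add(Utterance("Balloon sticks to sweater.", balloon, 2.0, lambda: True))

first_delay = queue.step()            # speaks the first movement message
balloon["stuck_to_sweater"] = True    # the balloon arrives before the next turn
queue.step()                          # stale movement message is skipped
```

Checking relevance only when a message reaches the front of the queue, rather than when it is added, is what lets the second movement message cancel itself once the balloon has already stuck.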

### Testing (of the earlier version)

While the above development was occurring, one of the team members, Kelly, tested the feedback announcer function in the existing version of the balloon sim (the one before the code sprint) and got some user feedback for the group. The person she tested with had worked with the sim before, but not with the new scene narration. Her test subject found the narration verbosity to be just about right. He did, however, want to have a way to repeat some narration.

### Demo

At the end of the day, this group demonstrated the operation of the new UtteranceQueue when the wall is removed and the balloon starts drifting toward the sweater. The movement was described (and not overly repetitive) and when the balloon got to the sweater that event was narrated. No other messages followed.

People who worked in this group: Jesse Greenberg, Darron Guinness, Ross Reedstrom, Kelly Lancaster

# Accessibility sprint – part 2: Creating a mobile-friendly and accessible Infobox for maps

This is the second part of a series of blog posts about a coding sprint that happened the day before CSUN 17. The sprint was about creating interactive online learning that is usable for people with disabilities. This whole software area is called accessibility, and is also known as inclusive design.

The first post gives an overview of the coding sprint. Each of these subsequent posts describes the work of one team.

## Creating a mobile-friendly and accessible Infobox for maps

### Team Goal

Create a widget for helping people who are blind or have low vision explore maps that display statistical information (think popular vote winners in the US). This type of map is called a choropleth.

The existing infobox widget takes statistical data in a simple format and works with hot spots on an SVG map to bring up an info box as a user mouses over or tabs to different regions on the map. The current version, however, isn’t accessible for low vision, doesn’t work well with screen readers, and doesn’t work on mobile. The team worked on improving these aspects of the widget (which can be reused for any statistical map).

## United States 2016 presidential race: Popular vote by state.

### Demo at the end of the day

Doug Schepers demonstrated the improvements. The demo showed the map tool improved for low vision and screen reader access. For low vision, the state selection outline was thickened, the info box was given higher contrast and made resizable, and the info box placement was adjusted to make sure the selected state was not covered. The ability to select the next state by tabbing through the states was added. Selection is currently in alphabetical order, and a better system would improve this navigation as well. He also demonstrated using a screen reader to select a state and hear it read the info box for each state. The widget uses ARIA Live Regions to announce updates. The statistical data is formatted as simple name-value pairs.

The ultimate goal is to define a simple standard for describing statistical map data and provide an open-source, reusable, accessible widget for interacting with these maps.

Doug Schepers and Derek Riemer worked together.

The code is available here: https://github.com/benetech/Accessible-Interactives-Dev/tree/master/MapInteractives

# CSUN 17 Accessibility Coding Sprint for People with Disabilities (Making learning accessible) – Part 1

Last week, my OpenStax colleagues Phil Schatz and Ross Reedstrom and I attended the 2nd annual pre-CSUN (but third overall) accessibility coding sprint to help make learning materials usable by people with disabilities.

## Prior accessibility coding sprints

The first took place in 2013 and was jointly sponsored by my Shuttleworth Foundation fellowship and Benetech and held at the offices of SRI. You can read more about that one in these earlier posts (2013-accessibility-post-1, post-2, post-3, post-4, and post-5). The second took place last year before the CSUN 2016 Accessibility Technology Conference in sunny San Diego and was again sponsored by funds from my Shuttleworth Foundation fellowship and by Benetech. That one focused specifically on tools for creating accessible math. Read more in Benetech’s blog post under “Sprinting towards accessible math”, Murray Sargent’s follow-up post on accessible trees, and Jamie Teh’s post about creating an open-source proof-of-concept extension of the math speech rules used by the NVDA screen reader to make them sound more natural.

## This year’s sprint

This one again took place in not-quite-as-sunny San Diego (California has been getting lots of rain) before this year’s CSUN-17 conference. The focus was on making interactive learning content accessible. And the very cool thing from my perspective is that my fellowship had nothing to do with the organization of this one. Benetech and Macmillan Learning sponsored and organized this one. The attendance was the largest ever, with 30-ish in-person participants and 5 or so attending remotely. We had several developers who both create accessible software and use assistive technology themselves.

Like previous sprints, we spent time initially getting to know each other and brainstorming and then divided into multiple teams ranging from a single person to five people working together to prototype, explore, or make progress on a particular accessibility feature. In upcoming posts, I will highlight each team’s goals and what they demonstrated at the end of the day.