How to Make Tech Interviews a Little Less Awful | by Rachel Thomas

https://medium.com/@racheltho/how-to-make-tech-interviews-a-little-less-awful-c29f35431987

Everything about the current interview process in tech is broken. I suspect that, no matter what, being evaluated and making judgements on others for a decision that has a big impact on you both is never going to be fun — especially since both of you only have a limited amount of time for the process. However, I think there is plenty of room to make tech interviews slightly less awful than they currently are.

The dreaded white-board interview! Photo from #WOCinTech chat

What is your goal in interviewing candidates? If you answered “to hire the best candidate” you should reconsider. Forming the best team is a goal that will better serve your company. Will finding candidates with the same strengths and same background as yourself add a lot to your company? Or might these similar candidates also have the same weaknesses, same blind spots, and same skills gaps that you have? What research on team performance should you be using when deciding who and how to hire?

A team of CMU and MIT researchers published a series of studies in Science showing that the smartest individuals did not result in the smartest teams, and that individual IQs did not map directly to the collective intelligence of a team. The smartest teams had the following 3 characteristics:

members contributed more equally rather than letting 1 or 2 people dominate
members scored higher at reading complex emotional states from people’s faces
the team had more women. That’s right, having more women on a team predicted that the team would perform better.

Even if you could gather a team consisting of only those mythical “10x rockstar ninjas,” it would perform worse than a team of qualified women who are good at reading emotional states.

So, what do we know about how hiring tends to work in practice? Are companies’ behaviours in line with this research? The answer, as we’ll see, is 100% “no”!

Triplebyte, a technical recruiting company, conducts technical interviews and then sends the best candidates on to interview at the most prestigious startups. They have collected extensive data, both from over 300 technical interviews and on how those candidates fare in earning job offers. The #1 finding from Triplebyte’s research is that “the types of programmers that each company looks for often have little to do with what the company needs or does. Rather, they reflect company culture and the backgrounds of the founders.”

Triplebyte’s key advice for job seekers is to read the bios of founders and apply to companies where the CTO shares your background. Since only 3% of VC funding goes to women and less than 1% goes to Black founders, this advice will be hard to follow for applicants from underrepresented groups or with non-traditional backgrounds. How rare is it for a founder to think that a Black woman candidate reminds him of himself? Before you suggest that the solution is just for women and people of color to found more companies, remember that investors prefer ideas pitched by a man more than an identical pitch from a woman, and that out of funded companies, those with a Black woman founder raise an average of just $36,000 compared to an average of $1.3 million for companies founded by White men.

Photo from #WOCinTech Chat

People think that being smart means knowing the same things they know. Interviewers don’t realize that they’re bad at judging how much a question tests intelligence and how much it tests non-essential familiarity with particular concepts from the interviewer’s expertise. For crafting interview questions, this means that you must get an engineer with a very different background from the person who created a question to vet that question and confirm how difficult it would be for someone without a background in [insert non-essential framework or concept].

I was once on a team where the manager frequently ignored the team’s feedback from interviews to unilaterally make hiring decisions. The only pattern I could find was that he liked candidates who were similar to himself, and it was demoralizing for the team to have our feedback ignored.

Powerful voices in tech have given the advice to “avoid false positives” (accidentally hiring employees who turn out not to be good) at all costs, even though this will result in more false negatives (accidentally rejecting candidates who would’ve been great). Steve Yegge, senior staff engineer and manager at Google for 11 years, writes, “Google has a well-known false negative rate, which means we sometimes turn away qualified people, because that’s considered better than sometimes hiring unqualified people. This is actually an industry-wide thing, but the dial gets turned differently at different companies. At Google the false-negative rate is pretty high.” Max Levchin, cofounder and former CTO of PayPal, says of the early days of PayPal, “There are some legendary-ish tales of me not hiring people because they used the wrong word in an interview… I’m sure we had lots of false negatives, but we have very few false positives.”

Joel Spolsky, cofounder of Stack Overflow, echoes this advice, adding “if you reject a good candidate, I mean, I guess in some existential sense an injustice has been done, but, hey, if they’re so smart, don’t worry, they’ll get lots of good job offers.” This would be reassuring if false negatives were random and nobody experienced more than a few. However, if an entire industry’s interview processes are biased against a particular group of people, members of that group will have a hard time getting hired anywhere, regardless of how talented they are. Consider false negatives in the light of the following two stories about the job searches of a Black engineer and a trans engineer.

Justin Webb, a software engineer with a decade of experience, who had just graduated from Hack Reactor (the most selective coding academy, and one that boasts a 99% job placement rate) was the only Black student in his class and the only one who couldn’t find a job within 6 months of graduating. Webb describes his interview process at GitHub as typical of what he faced at many companies. At one point, what he believed to be the final interview for the job turned out to just be the 4th round in a 6-step process. GitHub eventually rejected him, saying that they didn’t see him as a good fit, despite his having passed their most recent interview challenge. “If the way I passed the test wasn’t right enough, I could’ve learned the right way to get the answers right,” Webb said. “But hearing ‘right’ wasn’t right enough was frustrating.”

February Keeney, an engineering manager at GitHub and a trans woman, writes “my career has become an A/B Test in gender. With the clear ‘winner’ being male.” For the first 15 years of her career in tech, she presented as male and received an offer every single time she had an on-site interview. After she transitioned, she spent a year job searching, receiving numerous rejections. Keeney reports that mid-transition, “Finally, one day, I gave up. I went to an interview without nail polish, no lip gloss. I presented as male as possible. Lo and behold: I got an offer.”

Keeney recounts, “On a couple of occasions, I noticed a clear antagonistic shift when the interviewer realized I was trans. The questions got unfairly difficult and the tone more deeply interrogatory. It is not hard to ensure a candidate does poorly on an interview if you are really determined to undermine them.” She writes of the frustration of interviews that went well, where there were multiple rounds and onsites, and then finally at the end she would be rejected based on something that was discussed as a non-issue in the very first phone screen.

It’s easy for Spolsky to say that false negatives are not a big deal, but brutal for Webb or Keeney to endure a discouraging 12 months of job searching, receiving surprising rejections after interviews that went well. Companies with biased interview processes are not aware of how many great Black, Latino, and trans people they are rejecting who would have made excellent employees, and everybody is losing out.

Photo from #WOCinTech Chat

If at all possible, track the people that you reject to see if any of those rejections were false negatives. Trek Glowacki, a software engineer at marketing startup Popular Pays, wrote “I’ve been twitter following the careers of people we interviewed but passed on at my last gig. Turns out we were almost always wrong. We had a group called ‘Bar Raisers’ who mainly torpedoed candidates that lacked ‘CS Fundamentals’. We passed on so many good people”.

Marco Rogers, an engineering manager at tech startup Clover Health, aptly pointed out “Sometimes the reality of privilege is as subtle as a lack of extra scrutiny when you’re trying to get a job”.

It is common in tech for interviewers to all have their own pet interview question, and for each candidate for a position to be interviewed by different sets of interviewers, thus receiving different questions. At many companies, interviewers produce a simple binary hire/no-hire vote (as opposed to gathering data across several criteria). Some of the interview questions may have little, if anything, to do with the day-to-day work required for the position.

Your interview process should follow four simple guidelines:

Resemble actual work the candidate would be doing in their job
Clear rubrics
Consistent and standardized
Don’t have an elite-candidate fast path

Let’s look at each of these in more detail.

Resemble actual work the candidate would be doing in their job.

Developer Yegor Bugayenko (the author of a highly rated book about object oriented programming) writes about the huge waste of time it was when Amazon flew him out for an interview that consisted of 4 hours of whiteboarding algorithm questions. If the recruiter had said, “We’re looking for an algorithm expert,” he would have declined and both he and Amazon could have avoided wasting their time. “Clearly, I’m not an expert in algorithms. There is no point in giving me binary-tree-traversing questions; I don’t know those answers and will never be interested in learning them,” Bugayenko writes.

Photo from #WOCinTech Chat

Thomas and Erin Ptacek, formerly founders and principals at Matasano, suggest constructing a test for a web developer position by taking an application you’ve written, removing some features (such as search or the customer order update), bundling up the app with all its assets onto a virtual machine, giving it to the candidate, and having them add back the feature you removed. Popular communications app Slack has built an engineering team that is more diverse than many other tech companies, and Slack Director of Engineering Julia Grace describes a similar process of a real world coding challenge as part of the Slack engineering interview. Although Matasano and Slack give these challenges as take-home assignments, I think they could also work as part of an onsite interview [see the next section for the take-home debate]. For instance, the Airbnb data science team gives candidates a project to work on during the onsite, in which they have a few hours to work by themselves but can ask questions of the team during this time.

Lever, a tech start-up that builds popular recruiting software used by Netflix, Reddit, Yelp, and Lyft, describes mock code reviews as a key part of its own interview process for software engineers. Lever engineer Zach Millman writes, “Problem-solving and getting things done are important, but ideally we’d also hire engineers who are great at API design, naming, testing, maintenance, scaling, extensibility, etc. — all of those unglamorous things that actually make software projects successful and sustainable” and that mock code reviews are designed to get at these values. Importantly, mock code reviews assess a candidate’s communication skills and prioritization.

Clear rubrics

Research has found that the less well-defined the criteria for a hiring decision, the more that bias and post-hoc justifications play a role in decisions. A study by Yale researchers titled “Constructed Criteria: Redefining Merit to Justify Discrimination” found that when a male candidate had more practical experience and a female candidate had more academic experience, people chose the male candidate and ranked practical experience as more important. However, when the roles were flipped (the man had more academic experience and the woman more practical experience), people still chose the male candidate and instead ranked academic experience as more important. This effect was mitigated by forcing people to decide on their criteria before looking at the applications.

Elena Grewal, interim head of data science at Airbnb, identified that women were being disproportionately weeded out during the take-home exercise part of the interview process. “We took a close look at this and realized that the people who were grading the exercise had no clear rubric, so we changed this and made it clear what we were looking for, we made the grading consistent, and if a person was successful they were moved to the next round.” Since instituting that change, the percentage of women on the data science team has doubled from 15% to 30%.

Slack’s exercise is graded against a set of over 30 predetermined criteria. As discussed in the Yale study above, deciding on a clear scoring rubric before evaluating candidates reduces bias. The Ptaceks recommend collecting as much data as possible, and that this data be objective facts, such as: unit test coverage, algorithmic complexity, and handling a known corner case.
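
As a rough illustration of deciding on criteria before evaluating anyone, here is a minimal sketch in Python. It is not Slack’s or the Ptaceks’ actual process: the criterion names are borrowed from the examples above, while the point values, `RUBRIC`, and `score_submission` are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    max_points: int


# Hypothetical rubric, written down before any submission is read.
RUBRIC = (
    Criterion("unit_test_coverage", 3),
    Criterion("algorithmic_complexity", 3),
    Criterion("handles_known_corner_case", 2),
)


def score_submission(marks: dict[str, int]) -> int:
    """Sum a grader's marks, requiring exactly one valid mark per criterion."""
    expected = {c.name for c in RUBRIC}
    if set(marks) != expected:
        # No skipping a criterion and no inventing one after the fact.
        raise ValueError(f"marks must cover exactly: {sorted(expected)}")
    total = 0
    for criterion in RUBRIC:
        mark = marks[criterion.name]
        if not 0 <= mark <= criterion.max_points:
            raise ValueError(
                f"{criterion.name} must be between 0 and {criterion.max_points}"
            )
        total += mark
    return total
```

Because the function rejects marks for criteria that are not in the rubric, a grader cannot quietly add a new reason to prefer one candidate once the applications are in front of them.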

Consistent and standardized

Every candidate should receive the same test, be asked the same questions, and should be graded in the same way. One way that Slack further attempts to remove bias and standardize the results is by hiding the name and resume of the person submitting the test from the grader. (Reminder: resumes with traditionally African-American names are viewed more negatively than resumes with stereotypically White names, and job applications with female names are viewed more negatively than job applications with male names).
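
One way to picture hiding names from graders is the small Python sketch below. It is an assumption about how such blinding could be done, not a description of Slack’s tooling: `blind_submissions` and the candidate names in the usage note are hypothetical.

```python
import secrets


def blind_submissions(
    submissions: dict[str, str],
) -> tuple[dict[str, str], dict[str, str]]:
    """Replace candidate names with random IDs.

    Returns (blinded, key): `blinded` maps random ID -> submitted work and is
    what graders see; `key` maps random ID -> candidate name and stays with
    someone who is not grading.
    """
    blinded: dict[str, str] = {}
    key: dict[str, str] = {}
    for name, work in submissions.items():
        anon_id = secrets.token_hex(4)
        while anon_id in key:  # regenerate in the unlikely event of a clash
            anon_id = secrets.token_hex(4)
        blinded[anon_id] = work
        key[anon_id] = name
    return blinded, key
```

For example, `blinded, key = blind_submissions({"Candidate A": "...", "Candidate B": "..."})` gives graders only `blinded`; the scores are matched back to people through `key` after grading is finished.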

Photo from #WOCinTech Chat

To reduce bias, tests don’t just need to be graded consistently; the entire process of evaluating a candidate needs to be consistent. Professor Lauren Rivera of Northwestern’s Business School “played a fly on the wall during hiring meetings at one consulting firm. She found that the team paid little attention when white men blew the math test but close attention when women and Black people did. Because decision makers (deliberately or not) cherry-picked results, the testing amplified bias rather than quashed it.”

Don’t have an elite-candidate fast path

It’s valuable to know what your tests say about a candidate you know you want to hire. Seeing how your elite candidates perform may help you discover that some of your tests/questions/criteria aren’t useful. An HR director at a food company discovered “that white managers were making only strangers — most of them minorities — take supervisor tests and hiring white friends without testing them.” Hiring through the friend networks of current employees is one of the primary ways that tech companies recruit, and there can be a lot of variation in interview experience depending on who at the company you know and how well you know them. Given that three-quarters of White people have no non-White friends, we can guess the race of most referrals in tech companies with predominantly White employees.

We often mistake confidence for competence. Women perceive their abilities as being worse than they are, whereas men have an inflated sense of their abilities. This difference causes groups to be more likely to pick male leaders because of their over-confidence, compared to more qualified women who are less confident. Tech interview practices lead us to rate confident candidates more highly than less confident candidates. The solution is not just to tell women to be more confident, because women are perceived negatively for displaying traits associated with confidence in men (negotiating for themselves, speaking more in meetings).

No. Performing under adversarial pressure from co-workers is not inherently part of the job of being a developer. Even for positions that would involve frequent on-calls or racing to fix failing production systems, developers can be a collaborative team uniting around an external issue, not adversaries against their fellow co-workers. If your company has a work environment of adversarial competition or hostility, you will have trouble retaining and attracting employees and should rethink your culture.

Even the best interview process in the world will not be enough to overcome a biased or toxic culture. This is not just a theoretical issue — even a year before the most recent revelations about Uber’s toxic culture, engineers were sharing on Medium the responses they send to recruiters declining to even interview with Uber (see Tess Rinerson’s Dear Uber Recruiter and Tara Adiseshan’s Dear Uber Recruiter Part 2, both from March 2016). Note that it is not just Uber — there are many, many companies losing applicants as women warn each other about their experiences with toxic companies.

Photo from #WOCinTech Chat

Whether to use take-home assignments is a highly contested question, and there are reasonable arguments on both sides.

One of the primary concerns about work sample tests is that they can be very time-consuming, which puts people with less time (parents, those caring for elderly or sick relatives, people with health challenges, or anyone needing to work multiple jobs) at an unfair disadvantage. On the other hand, given that any recruiting process must by definition take up a significant chunk of an applicant’s time, the flexibility of a take-home assignment can be the best option for many. Furthermore, it’s likely to more closely reflect the real work that an applicant will be doing, compared to white-board coding in 45 minutes with someone looking over their shoulder making comments.

If you do use a take-home assignment, you should:

Have a time limit (with candidates choosing when they will receive the assignment, and then agreeing to email it in 1–3 hours later), and keep the assignment short. I once spent two consecutive full weekends on take-home coding assignments, while working full-time, and I felt so exhausted afterwards that I planned to take a month off from my job search (fortunately, those assignments eventually led to offers). And this was before I had a child!
Have a reasonable screen beforehand. It’s not fair to ask someone to do hours of work if there’s little chance you’ll invite them for an onsite. I feel angry when I hear about friends spending weeks on multiple take-home challenges (for a single company) and then being told that they don’t have enough experience. That is messed up! Companies should know whether a candidate has enough experience from the resume or initial phone screen, before wasting dozens of hours of time.
Consider paying candidates for time spent. I have the financial means to hire a babysitter for my toddler if I wanted to do a take-home assignment for a company that I was excited about, but not everyone does.
Realize that some people will choose not to apply. I’ve decided against applying to a number of companies once I realized how long or poorly constructed the take-home assignment was.
Consider giving people options, because personal preferences vary. I find pair-programming interviews to be the most anxiety producing (they make me very self-conscious), but many people prefer them. I enjoy whiteboarding because of my math background. If you offer candidates a choice, be sure to gather data and make sure that you are fair across challenge-type.

Both Slack’s engineering team and Airbnb’s data science team make use of take-homes, and they are doing better at diversity hiring than most tech companies. However, these teams also have rigorously pre-defined criteria for grading, so perhaps that is the reason behind their diversity. Maybe the reason I love the Ptaceks’ blog post isn’t that they use take-homes, but that it is so infused with respect, thoughtfulness and consideration for the interviewee. Perhaps any interviewing format could work as long as it is thoughtfully constructed, involves consideration and respect for the interviewee (and their time), has clear criteria, is consistent and fair, and gathers data.

Hopefully, it is abundantly clear by now that if you want to increase diversity at your company, you will need to completely overhaul your interview process (and that if you don’t increase diversity, your organizational performance will suffer). However, the very first step to increasing diversity, which you MUST do before you do anything else, is to make sure you are treating everyone, including the women and people of color who are already working at your company, VERY WELL. This includes helping them prepare for promotions, listening when they speak, and firing their harassers. Unfortunately, this is rare. You can read more here.

Interviewing is less fun than coding, and designing a thoughtful interview process is even less fun, but it is crucial that you put time into doing it well. Otherwise, you will not notice awesome candidates right under your nose, and your company will not assemble the best team possible. At most companies, technical interviewers receive no training, and interviewing is a thankless task, considered a distraction from real work. Doing it better will take more time than you are currently putting into interviewing, and you need to appreciate your employees who do it well.

This post is part 3 in a series. Here is part 1 (about bullshit diversity strategies) and part 2 (about why women quit tech).