The ideal and reality of algorithmic interviews

Large tech companies often advocate algorithmic interviews on the grounds that they are too large to afford the huge cost of inefficient code. But can an algorithmic interview really reflect a candidate's true ability?

Author | Dan Luu
Translator | Meniscus · Editor | Tu Min · Produced by | CSDN (ID: CSDNnews)

The following is a translation:

Whenever I ask people at trendy big tech companies why algorithm tests are necessary, they usually say: "Our systems are too large; we can't have someone accidentally write an O(n^2) algorithm and bring the whole system down." The funny thing about this is that, although a large part of my value to these companies has come from solving phone-screen-level algorithm problems, I often can't pass the interviews myself! When I say this, people tend to assume I mean I failed half of my interviews. In fact, I failed more than half.

When I wrote up my interview experiences as an article, some readers found it boring precisely because so many of the interviews ended in failure. They suggested I summarize the failures in a table, since nobody wants to read a 10,000-word article that mostly recounts a string of failures. I went through about 40 interviews, and probably passed only one or two (possibly zero).

To illustrate what I mean by "phone-screen-level algorithm problems," let's first look at a few examples.

I worked for a large company where one team maintained a core library implementing a mutable (growable) array. Whenever the array's backing store overflowed, this implementation allocated a new array larger by a fixed number of elements and copied the old contents into it. Growing by a constant increment is the classic counter-example for implementing a growable array: it makes the total resizing cost grow quadratically with the number of appends (amortized linear per append, rather than amortized constant). It is probably the most commonly cited example in amortized analysis.
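To make the cost difference concrete, here is a minimal sketch (the names and numbers are mine, not the company's code, and their library was JVM code rather than Python) comparing fixed-increment growth with the usual geometric growth, by counting how many element copies each strategy performs:

```python
def copies_when_growing(n_appends, grow):
    """Simulate n_appends appends; grow(cap) returns the new capacity on overflow."""
    cap = size = copies = 0
    for _ in range(n_appends):
        if size == cap:
            cap = grow(cap)
            copies += size  # copy the old contents into the newly allocated array
        size += 1
    return copies

# Fixed increment: total copies grow quadratically with the number of appends.
fixed = copies_when_growing(10_000, lambda cap: cap + 16)
# Doubling: total copies stay linear, so each append is amortized O(1).
doubling = copies_when_growing(10_000, lambda cap: max(1, cap * 2))
```

For 10,000 appends, the fixed-increment strategy performs roughly 3.1 million element copies, while doubling performs about 16,000.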

Some people may be unfamiliar with the algorithm problems in big tech companies' phone screens. The kinds I have encountered include:

* A "simple" programming/algorithm problem, perhaps preceded by a "very easy" warm-up problem.

* A series of "very easy" programming/algorithm problems.

* A bunch of "brainteaser" puzzles (rare for generalist positions, but not rare for low-level or performance-related positions).

Implementing such an array is considered a very simple problem, so it falls into the "very easy" category: in a "real" phone screen it would be just the warm-up, followed by several similarly simple problems. Yet this mutable array accounted for about 1% of all garbage-collection pressure across the company's JVM code (it was the second-largest source of memory allocation in the entire codebase), and it also consumed a noticeable amount of CPU. Fortunately, it was not used as a general-purpose mutable array, only in a semi-specialized component, and therefore "only" caused 1% of all GC work. If this question were asked in an interview, I suspect most members of that team could implement it correctly. But fixing it in production saved my employer more money per year than I will make in my lifetime.

This problem was the second-largest source of memory allocation; the largest was converting pairs of long values into byte arrays, in the same core library. Apparently someone had written (perhaps copy-pasted) a hash function that took a byte array as input, then modified it to take two byte arrays and operate on them in sequence, so the hash function's interface was (byte[], byte[]). To call it with two longs, they used a convenient conversion function from a utility library that turned a long into a byte[]. Besides allocating a byte[] and writing the long into it, this function also reversed the bytes of the long (it seems the function was originally intended to convert long values to network byte order).

Unfortunately, properly untangling all this would have been a big project, so my fix was to change the hash function's interface to accept a pair of longs instead of a pair of byte arrays, and to have the hash function do the byte reversal itself rather than as a separate step (since the hash function already shuffles the bytes, this requires no extra work). Removing these unnecessary allocations saved my employer more money per year than I will make in my lifetime.
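Here is a rough sketch of the shape of that fix (all function names and the mixing step are hypothetical, and the real code was Java on the JVM, not Python): instead of packing each long into a freshly allocated byte array just to feed a `(byte[], byte[])` hash interface, the hash can consume the two longs directly, folding the byte traversal into itself so no intermediate arrays are allocated.

```python
import struct

def mix(h, x):
    """A toy 64-bit mixing step (illustrative only, not the company's hash)."""
    return ((h ^ x) * 0x100000001B3) % (1 << 64)

# Before: allocate a big-endian byte array per long, then hash the bytes.
def hash_pair_via_bytes(a, b):
    h = 0xCBF29CE484222325
    for buf in (struct.pack(">q", a), struct.pack(">q", b)):  # one allocation each
        for byte in buf:
            h = mix(h, byte)
    return h

# After: hash the longs directly; the byte-order traversal is folded into
# the hash itself, so no intermediate arrays are allocated.
def hash_pair_direct(a, b):
    h = 0xCBF29CE484222325
    for x in (a, b):
        x &= (1 << 64) - 1            # view the signed long as unsigned bits
        for shift in range(56, -8, -8):  # walk the bytes most-significant first
            h = mix(h, (x >> shift) & 0xFF)
    return h
```

Both functions produce identical hashes; the second simply skips the per-call allocations.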

In theory, constant-factor speedups aren't "algorithm problems," but they come up in algorithm interviews anyway. As a follow-up, interviewers often ask: "Can you make your code faster?" The answer usually involves simple optimizations and performance improvements.

An example from my own interviews: IDs were stored as full integers, but from the context of the problem you knew the IDs were densely packed small values, so you could store them in just a few bits each. The difference between this interview question and the real-world redundant-array example is that in an interview, if your solution is far from the expected answer, no one will ask you for a constant-factor speedup; more likely, your interview has already failed by that point.
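A minimal sketch of that kind of constant-factor optimization (the class, widths, and values are invented for illustration): if you know each ID fits in, say, 20 bits, you can pack the IDs together instead of spending a full machine word on each.

```python
class PackedIds:
    """Store fixed-width small IDs packed into one big integer.

    Illustrative sketch; a real implementation would pack into a byte buffer
    or word array rather than a Python int.
    """
    def __init__(self, bits_per_id):
        self.bits = bits_per_id
        self.mask = (1 << bits_per_id) - 1
        self.data = 0
        self.count = 0

    def append(self, id_):
        assert 0 <= id_ <= self.mask, "ID must fit in the declared bit width"
        self.data |= id_ << (self.count * self.bits)
        self.count += 1

    def get(self, i):
        return (self.data >> (i * self.bits)) & self.mask

ids = PackedIds(bits_per_id=20)
for v in (7, 123456, 99):
    ids.append(v)
# Three 20-bit IDs now occupy 60 bits instead of 3 x 64 bits.
```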

The next example, from another company, is a configuration problem in BitFunnel (the search index used in Bing); it, too, is a problem at the level of an interview algorithm question.

I can't fully explain the problem in a few sentences here. In brief, you need to configure a set of Bloom filters. One approach is to write a black-box optimization function that searches for a good configuration via gradient descent. As far as I know, that method always produces odd artifacts, and the resulting configuration is not ideal; the problem can only be papered over by making the Bloom filters less tight (that is, by throwing more resources, which is to say more money, at it).

To build a better solution, you can observe that the fundamental operation in BitFunnel is equivalent to multiplying probabilities, so for any particular configuration, you can estimate its performance by multiplying a handful of probabilities. Since the configuration space is not very large, a few for loops suffice to iterate over the possible configurations and pick the best one. This is not entirely correct, because multiplying probabilities assumes an independence that does not hold in reality, but the method seems to work well, much as naive Bayes spam filters work well despite assuming that word occurrences in an email are independent. A complete solution would require working out the non-independent probabilities in detail, though that is beyond the scope of an interview.
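A toy sketch of that brute-force idea (the cost model, parameter ranges, and target below are invented for illustration; the real BitFunnel model is considerably more involved): treat each candidate configuration's false-positive rate as a product of independent per-hash probabilities, loop over the whole configuration space, and keep the cheapest configuration that meets a target rate.

```python
import math

def false_positive_rate(m_bits, k_hashes, n_items):
    """Classic Bloom filter estimate, assuming independent hash bits."""
    p_bit_set = 1 - math.exp(-k_hashes * n_items / m_bits)
    return p_bit_set ** k_hashes  # multiply k independent per-hash probabilities

def best_config(n_items, target_fp):
    """Exhaustively search a small configuration space with plain for loops."""
    best = None
    for m_bits in range(n_items, 32 * n_items, n_items):  # memory budget, ascending
        for k_hashes in range(1, 16):                     # number of hash functions
            fp = false_positive_rate(m_bits, k_hashes, n_items)
            if fp <= target_fp and (best is None or m_bits < best[0]):
                best = (m_bits, k_hashes, fp)
    return best

# Smallest memory budget (in whole multiples of n_items bits) hitting a 1% rate.
m, k, fp = best_config(n_items=1000, target_fp=0.01)
```

Because the loop visits memory budgets in ascending order, the first configuration that meets the target is also the cheapest one retained.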

Those are the three examples that come to mind. Whenever I try to summarize the most common interview questions, these are the ones I think of. If I sat down and wrote out every such problem I have run into, the list would probably exceed a hundred, and I personally know other people who have run into more than a hundred as well. Whether among the examples in this article or the ones I have not described in detail, these problems share the following features:

* These problems can be phrased as interview questions.

* If phrased as an interview question, you would expect most (or all) members of the relevant team to be able to produce the correct answer within the time allotted for an interview.

* The annual savings from solving each of these problems exceed what I will earn in my lifetime.

* We can reasonably assume these problems had persisted for a long time; otherwise they would not make such good examples.

At the beginning of this article, I mentioned that large tech companies advocate algorithmic interviews on the grounds that they are too big to afford the huge cost of inefficient code. In my experience, every such company I have worked for had exactly this kind of inefficient code. If the goal is to get people to actually solve algorithm problems at work, algorithm questions in interviews alone will not achieve it.

One reason is that even if a large company succeeds in ensuring that its employees can solve algorithm problems, its incentives push many, even most, developers away from ever converting that ability into money.

Of the three solutions in the examples above, two made it into production and one did not. If you switch teams without pushing stubbornly (that is, unless at least one of the following holds: I have reason to believe my team will accept my solution; the team wants me to step up and propose one; or I keep pushing until they adopt it), then the odds of a solution actually being applied are about what I saw: two in three.

You could argue that a two-thirds success rate is already high. But if I joined a team at random, chances are that efficiency would be neither that team's goal nor the organization's. The company has probably spent a lot of energy aligning the team's incentives with its goals (otherwise, what would be the point of setting goals?). Asking the team to accept my solution means asking them to test, integrate, and deploy the change, and to bear the deployment risk (because no deployment's risk is zero). That amounts to asking the team to take on risk for work that is worthless to them. People do sometimes ignore incentives and act differently, but they are unlikely to spend much of their free time chasing efficiency improvements, and their normal working hours will go to work aligned with the team's goals.

Suppose a company's goal were not to ensure that developers can pass algorithm tests, but to incentivize developers to actually use reasonably efficient algorithms. I do not think the first two problems above would have survived in that situation; yet after so many years, they remained unsolved. Also suppose the company profiled its code, looking at the most frequently called paths to find the most computationally expensive parts of its libraries. Under those two assumptions, the "trick" to the first two examples involves no algorithmic insight at all: you only have to look, and incentives can take care of the rest. The third example is not quite like that, because no standard tool will point you at the problem. You could credit its solution to some algorithmic insight, since the example sits at the core of a paper that won a Best Paper Award in the IR field, but in fact the "trick" uses only high-school math, which means the real trick is having enough time to apply high-school math.

Among the companies I have worked for, there is indeed one that "does not ask algorithm questions in interviews and pays more attention to things that actually benefit the company." While working there, I found only one problem that fit the pattern above (if the company had been larger, all the conditions could have been met, but because of its limited size, efficiency mattered less: even with a lot of effort, the annual savings to the company would not have exceeded what I will make in my lifetime).

I think the main reason I found only one such example is that many people there saw their job as making the company better, so solutions with directly obvious high value mostly did not exist: the system was designed that way, and easy wins were hard to find. There are exceptions, of course; people who try to do the right thing rather than bow to the company's immediate incentives may simply have solved these problems before I got there.

This company's algorithm/programming interviews were easier, much easier than big tech companies' phone screens, and we essentially did not conduct system design interviews.

For a while, we tried difficult on-site algorithm questions of the kind that are actually common in big companies' phone screens. We later dropped them, because not one of the new graduates we interviewed could answer them (we never gave such questions to experienced candidates). Perhaps our company's reputation was not big enough to attract people who could answer them, so we simply could not filter candidates the fashionable way other companies do. In contemporary discussions about interviewing, our approach is often called "lowering the bar," but I do not see why we should hold a bar high when the bar the job actually requires is low (sometimes there is barely a bar at all). And in some cases, even where you do want a bar, the bar is only two inches off the ground, and people step right over it.

In terms of actual productivity, this company was the most productive I have worked for. I think the reasons are cultural and too complex to fully explore in this article, but two things helped: we understood that we could not screen for perfect candidates with algorithm tests, and we assumed that if our culture rewarded doing the right thing rather than chasing immediate targets, people would step up and do the work.

If companies want people to solve interview-level algorithm problems at work, they can achieve that through incentives. Doing so takes extra effort and different methods, not just filtering interviewees by whether they can solve algorithm problems on a whiteboard.


This article is a CSDN translation. Please indicate the source when reprinting it.

