CoRecursive #065

From Competitive Programming to APL

With Conor Hoekstra

Today on the show, we have solving algorithmic programming problems. You know when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard using C and in constant memory space? It’s that kind of thing. These problems have their roots in algorithmic programming contests. And our guest, Conor Hoekstra, is a former competitor.

Conor’s story started in university when he was competing in programming contests. But it’s interesting how if you follow a single idea long and hard enough, you end up in exciting places like writing software for GPUs.

For Conor, that idea is finding the perfect way to express the solution to a programming problem. His journey will take him into LeetCode, into getting his dream job and all the way to APL, which is one of the more unusual and obscure programming languages.

Transcript

Note: This podcast is designed to be heard. If you are able, we strongly encourage you to listen to the audio, which includes emphasis that's not on the page

Chasing the Perfect Solution

Adam: Hello and welcome to CoRecursive. I’m Adam Gordon Bell, the host. Each episode, someone shares the fascinating story behind some piece of software being built.

And today on the show, we have solving algorithmic programming problems. You know when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard using C and in constant memory space, it’s that kind of thing. These problems have the roots in algorithmic programming contests. And our guest is a former competitor.

Conor: My name is Conor Hoekstra and I currently am a senior library software engineer working for NVidia on the rapids QDF Library.

Adam: Conor’s story starts in university when he was competing in programming contests. But it’s interesting how if you follow a single idea long and hard enough, you end up in interesting places like writing software for GPUs.

But I think I’m getting ahead of myself, this story is starts in 2012, when Conor was still in university.

Free Pizza and a Programming Contest

Conor: I wasn’t a computer science student at the time, but I was taking some computer science classes and I just happened to walk by the computer science lab. And there was a poster on the wall that said competitive programming contest, free pizza, meet Saturday at lunch. And I like competing in things and it sounded fun. I had really no idea what it was about. So, I showed up. So, I studied at Simon Fraser University, which is in Vancouver, BC. We had a bunch of computer labs there. So, it took place there.

If I recall, there’s probably about 20 to 30 people that were competing. And you basically just sit down at a Linux workstation. They give you a couple instructions. You code and any of the list of languages, but most of the popular ones, C++, Java, Python, et cetera, are all usable.

You code up your solution from start to finish. This was a qualifier for a contest called ICPC, the Inter Collegiate Programming Contest. You have to code a whole text file from start to scratch.

So, if you’re doing in C++, you have to include all the include statements. You have to have a main function. And typically, they give you a specification as your input will be one integer that represents T, the number of test cases and then five lines per test case, where each line represents the data that you need to solve it. So, you have to be responsible for reading in that data and then outputting to whatever standard out to solve it correctly.

Adam: Then you submit the file, like it’s the single file, you submit it, and then they compile it. And they hit it with test cases through standard.

Conor: Yeah. So typically, they’ll give you two or three test cases. So, they’ll give the problem statement. They’ll give you two or three simple ones that you can test to see whether you’ve gotten the basic cases right.

And then you submit a.cpp for C++ or .py for Python, and then it’s going to run against a bunch of tests, and you might get back a failure. So, if you’ve missed some corner case, maybe it was an empty list or an empty string that you didn’t take care of, it’s not going to tell you what test case you failed on, it’s just going to tell you that you failed.

And you have to go and stare at your code and figure out, “Okay, what’s the corner case I missed.” And then resubmit once you think you found it. And then finally, you’ll get like a pass and then you move on to the next problem.

The way competitive programming works is you have a set amount of time. For this contest, it was two hours. You have a set of problems. I think there was six or eight of them. And the goal is to solve as many as possible as fast as possible.

I, in my first attempt at this, didn’t know how to actually submit the problems. So, I solved I think three or four of them and didn’t understand that there was a time penalty the longer you waited. So, when there was 10 minutes left, I went up to the individual, the prof was running the contest and said, “Hey, I’m being unsuccessful with submitting these.”

He sort of looked at me strangely like, “That’s the whole aim of this. What are you talking about?” He said, “How many have you submitted?” I said, “Well, I’ve solved for, but I haven’t submitted any.” And so, he just sort of shook his head, came over. I ended up submitting a couple of them, but then ran out of time.

But luckily, there was a second qualifier. And so, I sort of went back to my dorm room, figured out how to actually compete in these. And then in the second one, I did well enough to get on to one of the university teams. It was like the B or the C tier team.

Adam: So, you joined the team, were you excited? Did you have to meet the team or what’s the process after you made it onto the team?

Conor: So, yeah, once we joined the team, it was three members. And we basically had practice contests once every couple weeks. And this was just over a semester. And then we had we ended up all three teams like the A team, the B team, and the C team ended up going to the regionals.

And we didn’t end up doing that well because we were competing with people that had been studying their whole university careers at this, just practicing at competitive coding. Because it’s one thing to get past the beginner level, it’s just solving simple algorithmic things. The intermediate level, you need to know basically general classes of algorithms. But to get to the super advanced level, you basically are memorizing all these custom different, like searching algorithms.

And so, you’ll read a problem and know like, “Oh, that’s the Floyd insert name, insert name, specific algorithm that you need to use.” And that just takes time and energy. You’re actually allowed to bring binders of prewritten code into these contests.

So, as long as you can recognize what algorithm is needed to solve it, a lot of times, it’s just time and energy of studying these esoteric algorithms that otherwise you probably wouldn’t end up needing to know.

Three Coders, One Keyboard

Adam: So, you’ll have a binder with all these tabs on it? And you’ll be flip to that and start just typing that in?

Conor: Yeah, yeah. And typically, or not typically, the way that ICPC works, once you’re in the regionals, you have teams of three people and one computer. So, the way that it’s typically working is that you’ve got one person coding up their solution that they’ve already handwritten and solved sort of on paper. And so, two people are handwriting their code on paper and then you swap out to type.

So, you basically need to be able to solve your problem without a keyboard. And a lot of the times, when you’re doing that, if you can just pull a piece of paper out and say, “Oh, this is the algorithm that I need,” you can just scribble some notes around it for reading in and reading out or writing out your answer, and you’re good to go.

Adam: The algorithm can’t just be correct, it also has to be performant. Because every test case has to be handled in two seconds of CPU execution time.

Conor: And for simple problems, it doesn’t matter because the input size is not that large.

But when you get to the harder problems, one of the first things you do is you look at what’s the size, so it’s a 10,000 array, you know that basically, you only have 1 times 10 to the eight operations. So, if you’ve got 1 times 10 to the five elements in your array, a quadratic algorithm, you know is going to get what’s called a TLE, a Time Limit Exceeded.

So, right off the bat, when you’re at these harder questions, you’re looking at the size of the input, and automatically ruling out some brute force quadratic algorithm, which is sort of a neat part of the problem, too, is some people say, “Oh, how come you don’t just do this?” And it’s like, “Well, we know that that’s not going to work.” And that leads to I think 90% plus of folks are using C++ because in those problems, the difference between using Python and C++ is going to be an accepted solution.

Adam: I guess it would give you a neat insight into complexity of solving problems because you have practice that sort of cutting things out just based on knowing that it won’t be this barrier.

Conor: Yeah. It’s interesting because that doesn’t typically come up in the real world where you know you have two seconds to solve something. So, then, “Oh, the algorithm that I’m choosing, or the code that I’m writing is going to be completely driven by time complexity.”

Typically, you write the code, and then you optimize it later if you find that something’s too slow. You’re not starting with time complexity and then working backwards. But it definitely is a good way to learn the difference between linear quadratic and log in and stuff like that.

Balloons, Then a Bloomberg Disaster

Adam: Conor’s team went on to compete at the Pacific Northwest Regionals. The regionals take place in a large computer lab.

Conor: Everyone is huddled around computers. And what they do is they have colored balloons for each problem. So, problem A will have a red balloon, problem B will have an orange balloon.

And every time a team solves a problem, they come and bring these helium-filled balloons and tie it to your computer. So, you can look around the lab where the live contest is happening and see just by how many balloons are hanging above someone’s computer, basically, who’s winning.

So, that whole part of it, I thought was super, super cool. But then yeah, so after that one regional contest, I don’t think any of the SFU teams, I think probably UBC and Stanford, they’re typically the two teams that do the best. Sometimes, Berkeley pops up there. And after that, yeah, that was basically sort of the end of my university competitive programming career.

Adam: Five years later, competitive coding long behind them, Conor got invited to a job interview at Bloomberg for a software developer role.

Conor: And at the time, I wasn’t a software engineer per se. I was a programmer that used code at my day job as an actuary.

So, I worked for an insurance firm. Bloomberg ended up reaching out to me. And I said, “Sure, I’d love to interview.” Even if it’s just to learn what it’s like. And they said, “Okay, we’re going to do our interview on a site called HackerRank.” They’ve got this sort of side site called CodePair, which enables sort of this live streaming sort of like a Google Doc, but specifically with an editor.

And you can actually test your code live. So, you write a function and then you can test it on the sample input. And they said, “If you’re not familiar, check out HackerRank. Here’s a couple problems.” And that was the first time I’d heard of HackerRank.

So, I ended up solving a few of the problems, thought it was awesome. And I was like, “Oh, yeah, this is just like what I used to do five years ago.” And then ended up just absolutely bombing that interview. It was so bad. Definitely one of the worst interviews I’ve ever had in my life.

Adam: What happened?

Conor: We were doing CodePair. I don’t remember the exact question. I’m pretty sure the answer to it was using a stood partition, which is just a single liner from the standard algorithm library, which basically, partition means something different in every language. But in C++, you’d give partition, basically a container, a sequence of elements, and a unary predicate. And it puts all of the elements that return true for the unary predicate at the beginning and all the ones that return false at the end.

So, if you have the numbers 1 to 10, and you do a partition, which is even, the even numbers 2, 4, 6, 8 balls show up at the beginning, followed by 1, 3, 5, 7. So, I’m pretty sure that was what the answer to the question was. I didn’t even know what a partition was at that point of time. So, I wasn’t going to be writing that down.

But it got to this point where I needed to know how to, I think, swap two iterators or something like that. And I asked if there was a built-in function for swapping two iterators. There is. It’s called iter_swap. But at the time, I didn’t know about it. And the interviewer said, “Yes, there is.” And I said, “Do you mind telling me what it is?” And he said, “No.” And I’m pretty sure this was just the warm up question. And I stumbled my way through it. But I’d that was sort of the end of the interview.

And just, so it was just a lack of my knowledge of the standard library. It was just sort of eye opening too, because I was not used to the classic big tech company where they just ask you like algorithms and data structures questions.

Is the Coding Interview Just a Hoop?

Every single company, Google, Facebook, you name it, that are getting thousands and thousands of applicants, they basically have a standard format, where they phone screen you. If you do well enough, you go to onsite. And then on the onsite, you’re going to have four out of five of the interviews purely being on data structures and algorithms.

And they don’t care what language you do it in. But yeah, you have to get good at answering these sort of questions that aren’t going to ever show up in your day job. You’re going to be being hired to even write some frontend code that doesn’t make use of this stuff, but that’s how they test you.

So, I basically went on a journey of over a year just trying to get really, really good at what they call LeetCode problems or AKA competitive programming problems.

Adam: I had a similar experience, at some point. I interviewed, I don’t know how long this was, I suspect I’m older than you. But I interviewed at Stack Overflow, the website where they tell you answers to questions. And I wasn’t expecting, yeah, one of these algorithmic questions and I just bombed out.

And the guy who was doing it was, he was weary of the world. He probably had four of these a day and was barely paying attention. And he was like, the first guy. They probably have this giant funnel where just most people never make it past that first stage. And he barely had effort to tell me like, “Yeah, this isn’t going well.” It’s like, “We’re just going to hang on here. I’m going to say no. And then I’m on to the next meeting.”

But I found it super painful to be rejected like that. Because I felt like I was a good software developer, just not that. I mean, did you have that kind of response?

Conor: I honestly can’t recall exactly how I felt. I think probably in the moment for me, it was a bit demoralizing because you’re so excited about this opportunity.

At the same time, though, my perspective on myself was I wasn’t really a software engineer. I didn’t even think that I qualified for this interview. I was just taking it because I thought, what the hell, what’s the worst that could happen? And then probably the worst that could happen did happen.

But more of my thought was that I had gone through a few of the HackerRank questions. And I knew that that’s something that I can be extremely good at. And I saw the interview format. And so, it sucks to bomb, but I knew it’s just a matter of time and energy. Those types of questions compared to what you have to do in actuarial science and these crazy five-hour exams that they make you take, that seemed a lot easier to do. It’s just time and energy that you need to put in.

And there’s a lot of people that will tell you that is just disheartening that it has nothing to do with your day job. And it takes time and energy and people a lot of the times are like, “Well, why do I have to go and learn this sort of extra skill that is passing these interview questions in order to do something that I already know how to do.” I completely agree with that.

It’s a broken model, but I feel like it’s the same quote about democracy is the best worst form of government or whatever that we have. How do you do a detailed interview process that is not prejudiced against certain demographics of people?

And that sort of it’s quasi standardized, but also looking at more than just your ability, it’s a hard problem to solve and this is the best solution that these big tech companies have at the moment.

Adam: Do you know the concept of signaling like from economics?

Conor: I do not. I don’t think.

Adam: Economists sometimes talk about whether university, whether it’s actually valuable or whether it’s just a form of signaling. For an education to work as signaling could work as this, I want to hire people, but I want them to be in the top 20% of smartness.

So, I just find some bar, getting a physics degree, that has nothing to do with the job that I’m prepared to do. But it effectively acts as a screen. In a pure signaling world, the person going into the physics program leaves no smarter, and no better to do the job that I’d like to hire them from.

But by me hiring from that pool, all those people are prescreened. The person who goes into physics, they learn nothing in the program, but they’re buying the signal that says I was able to pass this program. So, there can be actually no educational value in a pure signaling world. But it can still be an effective method.

It’s probably not good in the greater sense. But both parties in an economic sense could be motivated to do something like that. I think like a coding challenge, you could view it that way and say, like, “This won’t have anything to do with your job. But we know that the people who pass these seem to be able to do their jobs.”

Conor: Yeah. There’s probably a lot of that behind the process, even for so many of the credentials behind certain professions. You’ll hear people talk about like, “Well, this test, it’s not something that I’m actually learning skills to do, whatever the domain is, to do on my day job. It’s just a hoop.”

But they make folks jump through this hoop because of that reason. The folks that can jump through the hoop typically, it’s got some correlation with being able to execute well on whatever it is that that profession is responsible for doing.

Saturday Nights Chasing a Rating

Adam: Now, Conor knows what the hoop is, he just has to jump through it. And because of his competitive coding experience, he actually has a great strategy for getting where he needs to be to get through these job interviews.

Conor: Basically, LeetCode has a contest every Saturday, depending on what time zone you’re in. It’s typically in the evening on the Eastern Standard Time zone. And I would do that contest every week.

And then all the other websites, HackerRank, CodeForces, TopCoder, HackerEarth, there’s a couple others, they have contests going on sometimes they’re monthly, sometimes they’re weekly. So, there’s no shortage of contests that you can actually take part in, which is, I think, the best for simulating an interview experience. But there’s also just thousands of questions in the backlog. You can go look at all these past contests and just solve whichever ones you want.

I slowly but surely just started competing and practicing. And whenever I would fail to get a question, go back and look at other folks’ answers to the questions and slowly but surely, you just you start to improve.

Adam: So, when you were competing on the Saturdays, did you get your computer set up? You brew yourself a coffee and sit there? Do you have your binder? I don’t know, what’s the process?

Conor: Definitely not coffee, because they typically are at like 9:30 or 10:30 at night. So, if you’re having coffee that late, you’re not going to sleep anytime soon.

Adam: Yeah.

Conor: But yeah, so that LeetCode contest is four questions, 90 minutes, and it’s crazy. The top people finish all four questions in like 20 minutes, all of them. And they get the first one within 30 seconds. I don’t even know how they read them in that amount of time.

And so, yeah, you typically start writing … On a given contest, when I was sort of at the height, I would usually get either three to four, or four to four. Obviously, the contests, where you are able to solve them all, it’s super awesome.

Probably the highlight was every once in a while, I would get the first one or two super quickly within a few minutes. And then they show you a leaderboard of the current people that are in the lead. So, a couple times, I managed to be in the top 10 for the first couple problems. I think the highest that I ever ranked on a contest was like 60th or 70th out of a few thousand people. So, I’m definitely not an expert level competitive programmer.

And so, yeah, you solve the first two problems rather quickly. I mean, this past contest that just happened two days ago, the first problem I can’t actually recall because it was super easy. And the second problem was called max ice cream. And it was basically you’re given integer value and then a list of numbers. And the list of numbers represent prices of different ice creams. And then the integer value you get is a total amount of money.

So, you say you’re given $6, and then you’ve got ice cream prices 1, 2, 3 and 4, but it doesn’t need to be sorted. It can be in any order. So, the question is, what’s the maximum number of ice cream cones you can buy with the money that you have?

So, in an imperative language, typically the way that I saw being solved was you sort all the numbers. And then you have a for loop that keeps track of how many ice creams you bought and the total money that you have left. And so, then you just you keep buying the cheapest ice cream that you can until you run out of money and then you return the integer.

But there’s different ways you can solve that functionally. There’s a super nice way you can solve that in array programming language. And note that LeetCode doesn’t actually have any array. So, they only give you a list of I think 10 or 15 languages you can solve from.

So, if the website doesn’t support it, you can’t code in it. But all the popular languages, Python, C++, Java, Rust. I saw, they added Racket the other day, which is not as popular, I would say. So, they are starting to add more and more languages. And yeah, the 90 minutes are up, and then you see where you’re ranked at the end of the day.

And then, the really cool thing about all these sites that I haven’t mentioned is that they keep track of your rating, which is, for me probably was what drove me the most. Obviously, I wanted to do well in the interviews and stuff. But seeing a graph of your rating over time, which hopefully is going up, is super motivating. And it shows you like what percentile you’re in.

So, I think I’m in the 98th percentile or something. But when I started out, I obviously wasn’t. And so, each time that you compete, if you finish higher than what your average finish has been up to that point, your rating is going to go up. And so, over time, your rating slowly goes up. And I don’t know, for me, that’s a very encouraging and motivating thing is to see that you’re doing better. And “Oh, it’s Saturday will my rating go up? We’ll see.”

Adam: Yeah. What was your goal here? Did you imagine yourself, I’m going to hit number one on the leaderboard, and then what were you picturing?

Conor: I definitely had zero goals to be number one, not even, I think top 100 if you can get that, it’s amazing. It’s sort of entering a marathon or half marathon or 10k. Your goal can’t be to be number one. There’s just people that are going to beat you. But having some goal of being in the top 5% or something, that’s an achievable goal, depending on where you’re starting out from.

So, over time, obviously, you want to see your rating go up. But ultimately, the reason I was spending all this time practicing these problems was to get a job at a technology first company. So, that’s the thing is, while I was working at my first career as an actuary, I was falling in love with the software engineering side of things.

And at a certain point, I decided what I wanted to do was to switch to a company that was sort of technology first, big tech companies like Google and Facebook. They are technology companies first.

One Year of Prep for One Interview

Conor: I have friends that have sort of done the same thing, and their strategy has been going back to school. I had talked to folks, and they just said, if you can just learn how to get good at these interviews, you can probably land a job. And so, over the period of a year, I just kept practicing.

Adam: The plus side of it is what you just mentioned there. It does make it an open field for people to apply for these jobs. It’s not like you need to have a computer science degree from a certain place. There is a clear path that people can take, which can be empowering.

Conor: On one hand, yes. There’s no “discrimination” based on credentials. At the same time, though, it did take me a year of practicing this stuff.

There are definitely arguments that say that, depending on where you are at in your life, you don’t have the ability on top of a full-time job to then go and sort of train yourself in these sort of interview type questions. But I do agree that in some sense, it is very meritocratic given that you have the time to go and sort of train yourself on how to answer these types of questions.

Adam: Conor didn’t just train himself, he started a YouTube channel where he would explain solutions to others. And eventually, he got an interview at a technology first company, Amazon.

Conor: It’s funny, because at that point, I had progressively been getting better and better at the interviews. And I went into the Amazon interview. And if I recall, I think like three out of the five interviews, I thought I absolutely crushed. But then one of them I thought I didn’t do as well in, the fifth interview that wasn’t algorithms and data structures was a behavioral question.

And in the Amazon interviews, they ask you not to sort of retail the same … You get behavioral questions, like small behavioral questions in each interview, and Amazon has these 14 leadership principles that you’re supposed to weave into your answers, like customer obsession, think big, and stuff like that. So, if you’ve done your research, you’re supposed to reverse engineer, what’s the leadership principle they’re trying to figure out here. And then weave that exact leadership principle into your answer.

And so, the last interview I thought went awfully. I had answered the question, and then she told me that I didn’t answer the question. And I said, “Well, I have another example that I think is way better. But I sort of talked about it in a previous interview.” She’s like, “Okay, just go with that one.”

And then, after I answered it that way, she’s like, “Why didn’t you just say that from the beginning?” And I was like, “Well, I didn’t want to say the same thing.” And she’s like, “You need to learn how to market yourself better.” So, she started talking to me about how I wasn’t approaching and answering these questions the right way. And she’s like, “That was a fantastic answer. And if you had just started with it.”

Anyways, it was a very, very odd experience. And at the end of it, I thought that it went well, but not amazing, but then I ended up getting the offered.

From YouTube Solutions to Haskell

Adam: Conor had reached his goal, but he was sort of hooked. He kept his YouTube channel going and people kept leaving comments.

Conor: You should show Python, you should show Java, but I was a C++ developer. So, I was focusing on C++. And then at some point, in order to appease all the folks leaving comments, I added Java and Python. So, I was solving solutions and I was solving problems in all three of those languages.

And then at some point, probably it sort of started while I was at Amazon, we started a study group. A principal engineer there wanted to learn more about lambdas and functional programming. So, we started a small study group with five or six people. And we started working through a Coursera course that was taught by Erik Meijer.

Adam: Is this Erik Meijer with like the tie-dyed shirt?

Conor: It is. Yes, it is. Yeah. I think the course is called introduction of functional programming. We ended up using their textbook, and then switched to a Real World Haskell, which only we only got halfway through. But yeah, it was probably the highlight of being an Amazon was this lunch study group where we meet once a week.

Because yeah, it was basically like a C++ dev, the principal engineer. He was sort of a polyglot, but mostly C, C++, Java, that kind of stuff. And then other folks were Java developers. So, it was mostly just a bunch of people with experience in imperative or procedural languages, none with any sort of functional experience. And stumbling through this textbook, having our eyes open was very, very cool.

Adam: Was there a moment you recall where it was like, this is really cool, this is something different?

Conor: Yeah, I can’t remember what chapter it was. But when they start to introduce function composition with a dot operator in Haskell, in C++ 20 now, we’re getting ranges, which are sort of more composable algorithms, but pre C++ 20, which it was 2018 at the time, I didn’t know about any of that.

And so, seeing that like I said, if you want to sum the even numbers in a list in Haskell, that’s just sum.filtereven. And it’s technically a point free solution because you’re not mentioning your arguments. And the equivalent solution in C++, even if you are making use of algorithms is, it’s going to be to different algorithm calls.

And so, it’s just a lot more noisy in C++. It might be a way more performant, but it’s a lot more noisy. And then just seeing, it’s less than a single line. It’s three functions. It’s sum.filtereven. Evens are built-in in predicate, so you don’t have to build up a custom lambda.

At least for me, it blew my mind how simple function composition and having functions as first class in your language, how that can change the simplicity of the solutions you come up with.

APL Keeps Finding Conor

Adam: Now, Conor is not just on LeetCode, but I’m finding these elegant mind-expanding solutions to LeetCode problems. And that leads him in an unexpected direction.

Conor: So, I heard about APL five different times between the year 2010 and 2019.

Adam: I like that you kept the count.

Conor: Well, I wasn’t explicitly keeping track. I found it curious that once every couple years, this random language that I heard about first in university was sort of popping up. So, the very first time was in my second year of university 2010. And I was in an actuarial mathematics course taught by the best professor I’ve ever had, Dr. Gary Parker. He was at the time, the chair of the program.

And he just made a small remark during one of the lectures. And he said, “Oh, back in the day, all actuaries we all use this language called APL. And it was beautiful. It was just this super elegant language, that it was built for actuaries, and scientists, and mathematicians.”

The problem was though is that you could write in a single line what, in another language, might take literally a hundred. The problem though is that when you went to go back and read that or someone else was trying to read what you did, they could never understand it. And so, it’s sort of got this reputation of being a write-only language. And so, then it ended up sort of dying. But that was the first time that I ever heard of APL, and nothing more was said about it.

And then fast forward four years to 2014, my oldest sister is an engineer and at the time was working at an engineering consulting company. And she just randomly was complaining about this software program called METSIM, which is short for Metallurgical Simulation.

So, she was an engineer that specialized in metallurgy. And they were using this simulation program for fluid dynamics. And apparently, the scripting language or the language that you use to build up these little programs, and it was APL. And I was like, “Huh, that’s interesting. I’ve heard about APL four years ago.” And she’s like, “Yeah, it’s the worst. It doesn’t make any sense.”

It was the worst part of her job was this METSIM and having to figure out what APL is doing. She says it looked like hieroglyphics. And so, I thought it was interesting. The second time I heard about it.

And then two years later, my youngest sister, she’s a quant that works at a private asset management firm in Canada called Conor, Clark & Lunn. She was remarking that she was using this database called KDB+ and a language with a single letter called Q.

And at that time, I was into programming and on my way to becoming a software engineer. And I looked up Q and I was like, that’s interesting. I’ve never heard of Q. I don’t know all the languages, but it seems super esoteric. And then on the Q Wikipedia page, it mentioned that Q was basically a wordified version of K. And K is a derivative of APL. And I was like, “What the heck?” So, now two of my sisters are doing something that’s APL adjacent.

And then fast forward a couple more years, there was three podcasts on functional geekery, which is next to CoRecursive is one of my favorite podcasts. And they, Episode 64, 65, and I think 76, all had guests talking about APL. And it was at this point in time … Or actually I skipped the fourth one. There was a really funny story in January 2019 called iota shaming, where basically this Twitter battle erupted between C++ developers, because someone had used iota in a piece of example code.

And iota is an algorithm that generates a sequence of numbers that increased by one. So, if you go iota zero to 10, or whatever, you get 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. And the name iota comes from APL.

And so, this big Twitter war erupted when someone said, “Oh, look how smart I am for knowing what iota is.” And was trying to sort of snarkily saying that we shouldn’t be borrowing these really awful names from old dead languages. And then another individual came out and said, “Well, it’s actually historical name and it has meaning.” Anyways, that was the fourth instance.

The fifth instance was the podcast. And at that point, I was actually on a trip to Yosemite at the time, and my partner was asleep in the car, and I would listen to podcasts while driving into Yosemite because we were staying just outside the park. And I’m listening to these people just gush about APL, and how it’s this beautiful quasi functional language.

And then I finally decided to check it out. I went to a site called tryapl.org, where you can just sort of mess around with the language. And then at that point, I immediately fell in love with the language within 10 minutes of using it. Yeah, that was a long-winded story.

Falling in Love With Array Programming

Adam: Well, what made you fall in love with it?

Conor: So, the context is that this is June 2019. And I had just finished giving a talk, my very first conference talk called Algorithm Intuition, which sort of gave me this reputation of being an algorithm expert or someone that knew a lot about algorithms. And this was in May of 2019.

So, literally, a month later, I’m back from Yosemite on this website. And one of the first things that I do is plus reduction, which is just summing up the numbers in a list. And then in a reduction in APL is just a slash. So, you type +/ and then a few numbers, and it adds them up.

And there’s another algorithm in APL called the scan, which is just the other slash, so I’m not sure which ones forward or backward, but one slash is a reduction and the other slash is a scan. And what a scan does is it’s basically a reduction, but it outputs all the incremental results as well.

So, if you have five numbers 1, 1, 1, 1, 1, and you’d do a plus reduction, the resulting answer is going to be five. If you do a plus scan on 1, 1, 1, 1, 1, you get 1, 2, 3, 4, 5. And so, one of the properties of a scan is that the last element is always equal to the equivalent reduction. So, it’s an algorithm that actually exists in C++ by the name of partial sum.

So, our reduction in C++ is called a stood accumulate. And our scan is called the partial sum. And because accumulate and partial sum seemingly from a sort of naming point of view, have nothing to do with each other, I never made the connection that basically they are sibling algorithms. One just gives you the last element, the other one gives you all the incremental results.

So, a scan and a reduction are extremely closely tied together. And I had never made the connection. And so, I supposedly just gave this talk, where I’m talking about algorithms and saying how they’re so awesome.

And then a month later, I’m at this website that has got all these crazy looking characters. And I know nothing about the language, but I made this observation just based on what the characters look like. They’re basically vertically symmetric. That one’s the opposite of the other one.

And it was just like, so it happened in like 10 minutes. And my mind sort of was just blown. I was like, “How did I not make this observation before?” I spent at this point two years learning about algorithms, practicing these competitive programming problems, how have I never made this observation?

And I actually went and searched in all the different languages, what do languages call these? And there’s only two languages where there’s any symmetry between the scan and the reduce name. One of them is closure, which causes their reduction reduce, and they call their scan reductions. So, you can sort of see that like, “Oh, it’s actually just saying you’re just doing incremental reductions.”

And then in D, which is sort of a competitor with C++, they call theirs, I believe, fold and cumulative fold. Every other language, there’s absolutely zero symmetry between the naming. Haskell calls them fold L and fold R and then scan L and scan R. And scan and fold are really good names for both of those algorithms.

But unless if you sort of are able to see that relationship between the two, it’s there to be discovered. You’re not going to think about it just because of the names of these algorithms. And because it happens so quickly, I realized that whoever designed this language clearly put a lot of thought into what these characters visually mean. That there’s a relationship between if a character looks similar to another character, it’s probably for a reason.

And so, people think about this language as an unreadable language, but it’s really not fair. I think we chat about this before. But I like to say that it’s the same thing that people say about Chinese. If you don’t read or speak Chinese, you don’t say that, “Oh, Chinese is unreadable.” You just acknowledge the fact that you don’t read or speak Chinese. For someone that does read or speak Chinese or any Asian language that doesn’t have sort of the A to Zed alphabet, those languages are readable if you read them.

And that’s the same thing with APL. The problem is that every other programming language typically looks similar to each other and uses ASCII characters to represent words and variable names. So, APL looks so different.

So, we have this like gut reaction to say, “Oh, that’s the most unreadable thing ever.” But if you take the time to learn how to read APL, it’s honestly my opinion, the most beautiful language that I’ve ever seen. And the things that you can express where every Unicode symbol represents an algorithm. It’s just absolutely beautiful.

Adam: Part of this beauty, Conor says, is the simplicity. In APL, everything is an array.

Conor: So, a one rank array is just like a list or a vector, a list of numbers. A two-rank array is a matrix. A three-rank array is a cube of numbers, if you will, and it keeps going on like that, even a single number is technically a single element.

There’s no such thing as other data structures. You don’t have like a hash map that you can build up. We’re going to represent a hash map as some sort of two-column array, where the first column are the keys and the second column is the values.

APL Glyphs Mirror the Math

Adam: Another part of the beauty is the symmetry between what an operator looks like and the action it performs. So, imagine you have a matrix of numbers, a five by five grid, and you want to reverse the elements in the matrix. So, you have five rows of 1, 2, 3, 4, 5, and you want 5, 4, 3, 2, 1. You do that with the reverse operator.

Conor: So, reverse is a circle with a straight line through it. If you can, you can visualize reversing the reverse symbol because it is vertically symmetric. If you flip it 180 degrees, if you picture it in like a 3D land, it’s the same thing after you flip it 180 degrees, along that sort of vertical line.

Adam: And this circle with a vertical line through it is actually a visual description of the operation. If you imagine in your mind’s eye making a vertical line through the center column of our five by five matrix, every element in the middle will stay there when we reverse it. But everything else rotates around it 180 degrees so that the first element becomes the last and the last element becomes the first.

Conor: Transpose is the exact same thing. Instead of a straight line, it has a line at like a 45-degree angle. And a transpose is flipping a matrix basically along that lines. And so, the way you can think of it is the top right number is going to get swapped with the bottom left number, and every other number sort of is going to follow that pattern. So, if you’re on the diagonal from the top left to the bottom right, those numbers all stay in the same place because that’s the line that defines the 45-degree angle.

Adam: You have a matrix, it’s like a square, and you’re folding it diagonally at 45, and taking the top part to the bottom and the bottom to the top.

Conor: Yup, exactly. And that’s what the transpose glyph looks like. So, it’s a circle with that same line. But instead of it being vertical, it’s now along that diagonal.

And so, the way you can think about the reverse and the transpose algorithms or symbols is that they are basically visual representations of the axis that you’re “rotating” for lack of a better word around. And what’s crazy is that I didn’t see this for months. I just was like, “Oh, that’s transposed. That’s reverse.”

And then at some point, I was reading a blog article where it pointed out that like, oh, yeah, once you see that the reverse and the transpose, and there’s also another algorithm called reverse first, which has a circle with a horizontal line through it, which is basically doing a reverse. But instead of from left to right, it’s doing it from top to bottom. And the blog said, “Once you see it, you can never unsee it.” And I was like, “See what?” And then I looked at it and I was like, “Oh my goodness, how did I not see that? That’s amazing.”

And that’s the thing is, so many of the APL symbols have stuff like that. So, there’s an article by Roger Hui, who was the primary implementer of the J language, which is a daughter language or derivative language of APL. He has a blog called My Favorite Character, and his favorite character is a character called Log, which is for logarithms.

And he says it’s its favorite for multiple reasons, but one of them is it’s because it’s a circle with a star inscribed in it. So, there’s a star in APL, which is for exponents. So, if you want to go one or 10 to the power of two, you go 10 star two.

But for logarithms, it’s a circle with that same star inside of it. And he points out that if you actually think about cutting a log is sort of via profile, what you’re really going to end up with is like a circle, and then circles for the rings inside of it. But that circle with a star in it is sort of an approximation for that profile view of a log. And once again, that’s like something that I never thought of in a million years.

Adam: You mentioned Chinese, is there’s some sort of similarity between this idea of like ideograms or something, it’s visually…

Conor: Yeah, yeah, absolutely. The reason I know this is I lived in China for a year, and at one point was not fluent, but I could speak colloquially and get around day to day. And I had to study written Chinese as well, is the radicals typically have meanings.

So, the character for tree basically looks like a tree. Think of a T, where underneath each of the arms of the T has like two more lines that sort of form a triangle. So, it looks like it has a sort of relationship to what a tree looks like. And an easier example is kou, which is K-O-U in Pinyin, that is the word, and it’s just a square, that is the word for both mouth and door, I believe. So, it’s a square that sort of represents what a door might look like.

If you look at a newspaper, it all just seems like overwhelming, like, well, I need to go and learn 2000 characters. How am I ever going to learn how to read it? But if you just look at a single character at a time and you break apart the radicals … So, I believe it’s mu or shu is the word for wood or tree. And that radical that looks like a tree shows up in a lot of wooden things. So, if you look at what the character is for chair or for table or things that are made of wood, it contains that wood radical.

So, Chinese is actually very, very beautiful in certain respects because it has that same kind of visual meaning. Even if you have no idea what a certain character is, if you break it apart, and you know what each of the radicals are, you can basically play a game of what is wood plus something plus something, and then take a guess at what this character actually represents. And you’re probably will be wrong, but you might be close to actually what it is, which is really cool.

Kenneth Iverson’s Radical Notation

Adam: Although there are some similarities between Chinese characters and APL, APL was created by a Canadian, Kenneth Iverson in a book called A Programming Language.

Conor: Which is where APL gets its name from. And at that time, APL, or as it was called at the time, Iverson Notation, it was a tool for teaching. It had nothing to do with programming languages. It was a tool for expressing algorithms and that’s how it was used in a programming language, the 1962 text.

Fast forward a few years to 1966 and IBM had decided that they wanted to actually implement the notation as a language. And the IBM 360 is actually implemented in APL 360. And so, 1966 was the first implementation. And then to shorten the story, fast forward to 1980, Kenneth Iverson ended up going to a company called IP Sharp that was headquartered in Toronto.

And histories that have could have played out so much differently. In 1982, Bill Gates was doing a whirlwind tour of North America and swung by IP Sharp and Associates in Toronto. And there was a meeting between a few of the IP Sharp folks and Bill Gates. And Bill Gates was getting ready to make the PC amongst APLers and array programmers. There’s this anecdote of sort of Bill Gates for a long time was a big fan of APL and had an APL sort of handbook in his desk.

And right before the PC, there was a computer called the IBM 5100 that actually had a switch on it for both basic mode and APL mode. So, you can just toggle between the two.

So, there was a time in history right at the late ’70s where APL was just as prominent as basic. Unfortunately, the folks at IP Sharp and Associates thought that mainframes were just here to stay because that was their business was mainframe computing and didn’t really see what was coming with the personal computer.

And so, the PC that Bill Gates ended up creating came with basic, not with APL and never ended up shipping an APL and because of that, we live in a world where a lot of languages, I think look closer to sort of the imperative style basics. But there is an alternate universe where Bill Gates decided to put APL on the PC instead of basic and then we’d be living in a way different world today.

A Lie, a Plane Ticket, and Arthur

Adam: So, Conor started doing his LeetCode videos using APL. And he also reached out to this interesting character, and Iverson protege named Arthur Whitney. Arthur was involved in creating A, K and J, all APL-inspired programming languages. The finance press refers to him as the reclusive genius of banking IT.

Conor: I learned a lot about Arthur because of all my research and then I randomly reached out to him, and just asked if he would have time to chat. And then he responded, this was in early 2019, before COVID really showed up. And he said, “Well, do you ever come down to New York?” And I totally lied and said, “Yeah, all the time.” And then he said, “Well, if you ever find yourself down here, let me know and we can have dinner and chat a bit.”

And so, I booked a plane ticket and booked a hotel room and spent two or three days just around the corner from where he works. And then yeah, I got to have dinner with him and chat with them over those couple of days. And he’s just an absolute genius.

Adam: So, I had heard of him, I think through Bryan Cantrill, who’s interviewed him. I think he went to his house and interviewed him. I don’t know if it was Brian or somebody else was saying that he made a deal with the devil that he would be the best programmer of all time. But that the tradeoff was that nobody would be able to read his code. I found online the original J interpreter or something like that, and it’s written in C, and it’s just like, I don’t know what it is.

Conor: Yeah. I started a project a couple months ago. I’m slowly porting that code base, the J source code to C++ 20. And yeah, I think there should be a name for these types of code bases, because it’s not C, it’s like a macro variant of C, where it’s a CDSL, where 80% of your “library” is macros. I did a search, there’s 10,000 macros in the source code, and those macros are used like functions.

Adam: Yeah. That seems like there’s a vague relationship to what you were saying before about the competitive coders using like a whole bunch of shorthands to do their coding.

Conor: It’s actually funny that you mentioned that because that is literally what the best competitive programmers do. The first thing that they do if they’re not in actual ICPC and they’re just competing on Top Coder or CodeForces is they copy and paste in a template, which includes all the headers, includes the main function, but then also has a ton of macros defined. A bunch of which are actually like macros for loops, because you can reduce like 10 or 15 characters that you have to type out with a simple do macro.

And I never actually noticed that relationship, that some of the exact same macros that exist in the J code base are exactly what competitive programmers use.

Can APL Run on a Thousand Cores?

Adam: It’s not just competitive coding that has a similarity to Arthur’s array programming style. But data science is also very APL like in some ways. And through a strange coincidence, while Conor was deep into APL, he left Amazon for NVidia to work on a GPU-accelerated data science library.

Conor: Yeah, it’s crazy that this happened, that now I work on an open source library that is inspired by pandas and pandas was written by an individual named Wes McKinney who used to work at a firm called AQR, which is a quant firm. And he was directly influenced by J, which is the daughter language of APL.

Adam: At NVidia, Conor’s ability to solve something in an APL style is coming in very handy for writing code that will distribute across the thousands of cores of a modern GPU.

Conor: For example, we explained reductions in scans earlier. A reduction is something that’s basically you’re taking an element and applying a binary operation to it. You can do that in serial, just taking one element at a time. And if you’ve got the five numbers, 1, 1, 1, 1, 1, you can just go through and one plus one is two, two plus one is three, et cetera. That’s a serial algorithm.

But you can also do something called a tree reduction. You can basically construct like a tree, add two elements at a time, and then in the next pass, take the results of two of the first operations and add those. And then you end up at the end of the day getting the exact same answer, but you didn’t do things in order. So, obviously, if you only have a five-element array, you’re not going to see much of a difference.

But if you’ve got like a five billion element array, then putting that on a GPU is basically going to chunk that up. It’s going to sum up all the individual chunks, and then it’s going to come and do some sort of three reduction, and you’re going to get the same result.

And because it’s doing that over thousands of course, it’s going to be way, way, way faster than if you’re doing it serially on a GPU where you can only take two numbers at a time in sequence.

Adam: Why does an array language, why is that helpful for expressing these?

Conor: Because branching doesn’t really fit super well into the GPU model. And array programming inherently avoids that. And so many of the primitives in APL are trivially parallelizable.

So, reductions and scans are two of the most common operations in APL. And if you look at there’s a library that NVidia puts out open source called Thrust, which is basically just GPU-accelerated versions of the standard library and C++, they basically have a stood reduce or thrust reduce and a thrust inclusive and exclusive scan, which are just two different versions of scans. And those are bread and butter parallel algorithms. So, so many of the primitives in APL are just inherently parallelizable.

So, you can map basically the primitives directly to a parallel algorithm that runs on the GPU or is it other languages, the way you solve things, if you’re doing branches, they don’t truly map to the GPU, whereas array programming languages, it’s basically the way they’re designed is operating on these huge arrays or huge matrices. It makes it like the match made in heaven from my point of view.

Adam: It’s not actually a fluke. It’s like this language was designed by Iverson to be able to take in these arrays and matrixes. And because of that, it ends up having this form where it ends up being data parallel. And then like GPUs exist. And isn’t that convenient? GPUs are great if you can do things data parallel.

Conor: Exactly. Yeah, everything’s in array. So, it works out perfectly.

Learning Programming Paradigms

Adam: There’s a cliché about different programming paradigms changing the way you think. I think sometimes when I hear it, I roll my eyes a little bit, but there’s definitely truth to it. And it’s hard to put your finger on exactly where the truth lies.

To me, it’s a little bit like living abroad. I was an exchange student in university. I did a small part of my undergrad degree in Australia. And when you travel, you eventually go home. But for me, what changed based on my travels was how I view my homeland. Just living somewhere else for long enough to experience life there, you kind of start to question little assumptions about how you lived before.

Conor is still working in C++. But when he’s doing that, sometimes he’s thinking about how he would solve a problem in APL. And I think that, that different perspective that you bring home to your primary programming language, that’s the power of learning how to solve problems in a different paradigm.

Outro

So, that was the show. I actually talked to Conor for quite a while, and I made him walk me through the ice cream problem from LeetCode that we talked about using APL. I’m going to try to figure out how to put a video of that up on the episode page, which you can find in the show notes in your podcast player.

Also, Conor and Bob Theriault and several others started the podcasts all about array programming languages. I’ll put a link to that on the episode page as well. Until next time, thank you so much for listening.

Excerpt of Array Programming

. **Conor:** Or actually that was right, but I should have said shift four.

Adam: Backtick shift four, there we go.

Conor: Yeah.

Adam: I’m looking at a weird arrow that point … What is it? Is this triangle?

Conor: This is a triangle with a line through it. Yeah.

Adam: With a line through, okay.

Conor: And then type X again and right bracket.

Adam: Right bracket.

Conor: And then, hit Enter.

Adam: Okay. Let me just see, up arrow right.

Conor: Or yeah, up arrow might work as well. And then if you put a six at the beginning of the expression, and then backtick shift six. That’s the transpose.

Adam: Backtick.

Conor: So, yeah, backtick shift six. Maybe I’ve got this wrong. Oh, no, sorry, without the shift. Backtick six.

Adam: Oops, backtick six. There we go.

Conor: And so, if you hit Enter on this now, we’re going to get our Boolean array.

Adam: Oh, okay.

Support CoRecursive

Hello,
I make CoRecursive because I love it when someone shares the details behind some project, some bug, or some incident with me.

No other podcast was telling stories quite like I wanted to hear.

Right now this is all done by just me and I love doing it, but it's also exhausting.

Recommending the show to others and contributing to this patreon are the biggest things you can do to help out.

Whatever you can do to help, I truly appreciate it!

Thanks! Adam Gordon Bell

Support The Podcast