AI Exposes a Major Problem with Universities

I’ve heard that AI, or more properly, Large Language Models (LLMs), are a disaster for colleges and universities. Many people take this to be an indictment of the students, and there is some truth to that, but they’re missing the degree to which this is a damning indictment of Academia. If your tests give excellent grades to statistical text generators, you weren’t testing what you thought you were and the grades you gave didn’t mean what you thought they meant.

Of course, it’s been an open secret that grades have meant less and less over the years. The quality of both students and professors has been going down, though no one wants to admit it. This is, however, a simple consequence of the number of students and professors growing so much over the last 50 or so years. In the USA, something like 60% of people over the age of 25 have attended college with close to 40% of them having a degree. 60% of people can’t all be in the top 1%. 40% of people also can’t all be in the top 1%. At most, in fact, 1% of people can be in the top 1%. When a thing becomes widespread, it must trend toward mediocrity.

So this really isn’t a surprise. Nor, frankly, is it a surprise that Universities held on to prestige for so much longer than they deserved it—very few human beings have the honesty to give up the good opinion of others that they don’t deserve, and the more people who pile onto a Ponzi scheme, the more people have a strong interest in trying to keep up the pretence.

Which is probably why Academics are reacting so desperately and so foolishly to the existence of ChatGPT and other LLMs. They’re desperately trying to prevent people from using the tools in the hope that this will keep up their social status. But this is a doomed enterprise. The mere fact that the statistical text generator can get excellent grades means that the grades are no longer worth more than the statistical text generator. And to be clear: this is not a blow for humanity, only for grades.

To explain what I mean, let me tell you about my recent experiences with using LLM-powered tools for writing software. (For those who don’t know, my day job is being head of the programming department at a small company.) I’ve been using several, mostly preferring GitHub Copilot for inline suggestions and Aider using DeepSeek V3 0324 for larger amounts of code generation. They’re extremely useful tools, but also extremely limited. Kind of in the way that a backhoe can dig an enormous amount of dirt compared to a shovel, but it still needs an operator to decide what to dig.

What I and all of my programmer friends who have been trying LLM-powered tools have found is that “vibe coding,” where you just tell the LLM what you want and it designs it, tends to be an unmaintainable disaster above a low level of complexity. However, where it shines is in implementing the “leaf nodes” of a decision tree. A decision tree is a name for how human beings handle complex problems: we can’t actually solve complex problems, but we can break them down into a series of simpler problems that, when they’re all solved, solve the complex problem. But usually these simpler problems are still complex, and so they need to be broken down into yet-simpler problems. And this process of breaking each sub-problem down eventually ends in problems simple enough that any (competent) idiot can just directly solve them. These are the leaf nodes of the decision tree. And these simple problems are what LLMs are actually good at.
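To make the idea of leaf nodes concrete, here is a minimal sketch in Python. The `Problem` class, the `leaf_nodes` function, and the example report task are all hypothetical names invented for illustration; nothing here comes from a real project. It only shows that the design work lives in the shape of the tree, while the leaves are the small, self-contained tasks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Problem:
    """One node in the breakdown of a larger problem (hypothetical structure)."""
    name: str
    subproblems: List["Problem"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf is a problem that was not broken down any further.
        return not self.subproblems


def leaf_nodes(problem: Problem) -> List[str]:
    """Collect the names of the directly solvable problems, left to right."""
    if problem.is_leaf():
        return [problem.name]
    leaves: List[str] = []
    for sub in problem.subproblems:
        leaves.extend(leaf_nodes(sub))
    return leaves


# An invented example: the human decides on this breakdown.
tree = Problem("export monthly report", [
    Problem("load the data", [
        Problem("parse a CSV row"),
        Problem("validate a date string"),
    ]),
    Problem("summarise the data", [
        Problem("sum a column"),
    ]),
    Problem("write the output file"),
])

for name in leaf_nodes(tree):
    print(name)
```

In this sketch the four printed leaves are the kind of task one might hand to an LLM, while deciding that the tree should look like this at all remains the operator's job.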

This is because what LLMs actually do is transforms in highly multi-dimensional spaces, or in less technical language, they reproduce patterns that existed in their training data. They excel at any problem which can be modeled as taking input and turning it into a pattern that existed in their training data, but with the details of the input substituted for the details in the training data. This is why they’re so good at solving the problems that any competent idiot can solve—solutions to those problems were abundant in their training data.

The LLMs will, of course, produce code for more complex things for which the solution did not already exist in their training data, but the quality of these solutions usually ranges from terrible to not-even-a-solution. (There are lots of people who will take your money and promise you more than this; there are always people who will use hype to try to separate people from their money. I’ve yet to hear of the case where they are not best ignored.)

Now, I’ve encountered the exact problem of a test being rendered obsolete by LLMs. In hiring programmers, I’ve had excellent results making the first interview a programming sample specification that people had 5 business days to complete. (To prove good faith, I’d give them my implementation of it right after they submitted theirs.) It was a single page, fairly detailed specification, but it left room for creativity, too. However, you can throw it into any high-end LLM these days and get a perfectly workmanlike result. This is obviously not useful as a first interview anymore.

One possible response would be to try to prevent the use of LLMs, such as by asking people to write it in front of me (e.g. during a video call with a shared screen). But what would be the point of that? If we hired the person, I’d expect them to use LLMs as a tool at work. (Used properly, they increase productivity and decrease stress.)

It only took a minute or two of thinking about this to realize that the problem is not that LLMs can implement the programming sample, but that the programming sample was only slightly getting at what I wanted to find out about the person. What I want to know is whether they can design good software, not whether they can rapidly implement the same kind of code that everyone (competent) has written ten times at least.

So I came up with a different first interview sample. Instead of having people do something which is 10% what I want to see and 90% detail work, I have switched to asking the candidates to write a data format for our products, focusing on size efficiency balanced with robustness and future expansion based on where they think our products might go in the future. This actually gets at what I want to know—what is the person’s judgement like—and uses very little of their time doing anything an LLM could do faster.
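For readers who haven’t designed a data format before, here is a rough sketch in Python of the kind of trade-offs involved. This is not the interview exercise or any real product format; the `MAGIC` marker, the header layout, and the `encode`/`decode` names are all assumptions made up for illustration. It spends a few bytes on a version number (future expansion) and a CRC32 checksum (robustness) while keeping the overhead small (size efficiency).

```python
import struct
import zlib

MAGIC = b"EX"    # hypothetical 2-byte marker identifying the format
VERSION = 1      # lets a future reader tell old records from new ones
# Little-endian, unpadded header: magic (2 bytes), version (1), payload length (2).
HEADER = struct.Struct("<2sBH")
CHECKSUM = struct.Struct("<I")   # CRC32 of header + payload


def encode(payload: bytes) -> bytes:
    """Wrap a payload in the illustrative header and checksum."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length field")
    header = HEADER.pack(MAGIC, VERSION, len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + CHECKSUM.pack(crc)


def decode(record: bytes):
    """Return (version, payload), raising ValueError on any damage."""
    if len(record) < HEADER.size + CHECKSUM.size:
        raise ValueError("record too short")
    magic, version, length = HEADER.unpack_from(record)
    if magic != MAGIC:
        raise ValueError("not a record in this format")
    end = HEADER.size + length
    if len(record) != end + CHECKSUM.size:
        raise ValueError("length field does not match record size")
    (crc,) = CHECKSUM.unpack_from(record, end)
    if zlib.crc32(record[:end]) != crc:
        raise ValueError("checksum mismatch")
    return version, record[HEADER.size:end]


record = encode(b"hello")
print(len(record), decode(record))
```

Every choice here is a judgement call of the sort the exercise is meant to surface: nine bytes of overhead per record, a hard 65,535-byte payload limit, and a version byte that only pays off if the format actually changes.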

I haven’t hired anyone since making this change, so I’m not in a position to say how well this particular solution to the problem works. I’m only bringing it up to show the kind of thinking that is necessary—asking yourself what it is that you are actually trying to get at, rather than just assuming that your approach is getting at that. (In my defense, it did work quite a lot better for the intended purpose than FizzBuzz, which we had used before. So it was very much a step in the right direction.)

That Academia’s response to LLMs is to try to just get rid of them, rather than to use them to figure out what the weaknesses in their testing have been, tells you quite a lot about what a hollow shell Academia has become.

Testing Computer Programs

My oldest son, who does not yet know how to program, told me a great joke about programmers testing the programs they’ve written:

A programmer writes the implementation of a bartender. He then goes into the bar and orders one beer. He then orders two beers. He orders 256 beers. He orders 257 beers. He orders 9,999 beers. He orders 0.1 beers. He orders zero beers. He orders -1 beers. Everything works properly.

A customer walks in and asks where the bathroom is. The bar catches fire.

It’s funny ’cause it’s true.

It’s easy, when you design a tool, to test that it works for the purpose the tool exists for. What it’s very easy to miss is all of the other possible uses of the tool. To take a simple example: when you’re making a screwdriver, it’s obvious to test the thing for driving screws. It’s less obvious to test it as a pry bar, a chisel, an awl, or a tape dispenser.
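The joke translates fairly directly into code. Below is a small Python sketch with a made-up `order_beers` function and a made-up `handle_request` entry point; both are hypothetical and exist only to illustrate the point. The first group of checks is what the programmer in the joke thought to test. The last one is the customer.

```python
def order_beers(count):
    """Hypothetical bartender core: returns the number of beers served."""
    if isinstance(count, bool) or not isinstance(count, int):
        raise ValueError("beers are ordered in whole numbers")
    if count < 0:
        raise ValueError("cannot serve a negative number of beers")
    return count


def handle_request(request):
    """Hypothetical entry point that real customers use: free-form text."""
    words = request.lower().split()
    if len(words) == 3 and words[0] == "order" and words[2] in ("beer", "beers"):
        try:
            count = int(words[1])
        except ValueError:
            return "Sorry, how many was that?"
        try:
            served = order_beers(count)
        except ValueError:
            return "Sorry, I can't do that."
        return f"Served {served} beer(s)."
    # Anything that is not a beer order lands here instead of starting a fire.
    return "Sorry, I can only take beer orders."


# What the programmer tested: the intended use and its edge cases.
for text in ["order 1 beer", "order 256 beers", "order 0 beers",
             "order 0.1 beers", "order -1 beers"]:
    print(text, "->", handle_request(text))

# What the customer did: a use nobody designed for.
print(handle_request("Where is the bathroom?"))
```

The final fallback line in `handle_request` is the part that is easy to forget, because no test written from the designer's point of view ever reaches it.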

This disparity is inherent in the nature of making tools versus using them. Tools are made by tool-makers. The best tool makers use their own tools, but they are only one person. Each person has his way of solving a problem, and he tends to stick to that way because he’s gotten good at it. When he goes to make a tool, he makes it work well for how he will use it, and often adds features for variations on how he can think to use it to solve the problems he’s making the tool to solve. If he’s fortunate enough to have the resources to talk to other people who will use the tool, he’ll ask them and probably get some good ideas on alternative ways to use it. But he can’t talk to everyone, and he especially can’t talk to the people who haven’t even considered using the tool he hasn’t made yet.

That last group is especially difficult, since there’s no way to know what they will need. But they will come, because once the tool exists, people who have problems where this new tool will at least partially solve their problem will start using it to do so, since they’re better off with it than they were before, even though the tool was never meant to do that.

This isn’t much of a problem with simple tools like a screwdriver, since it doesn’t really have any subtleties to it. This can be a big problem with complex tools, and especially with software. When it comes to software design, you can talk to a bunch of people, but mostly you have to deal with this through trial-and-error, with people reporting “bugs” and you going, “why on earth would you do that?” and then you figure it out and (probably) make changes to make that use case work.

The flip side is a bit more generally practical, though: when considering tools, you will usually have the most success with them if you use them for what they were designed to do. The more you are using the tool for some other purpose, the more likely you are to run into problems with it and discover bugs.

For me this comes up a lot when picking software libraries. Naive programmers will look at a library and ask, “can I use this to do what I want?” With more experience, you learn to ask, “was this library designed to do what I want to do?” Code re-use is a great thing, as is not re-inventing the wheel, but this needs to be balanced out against whether the tool was designed for the use for which you want to use it, or whether you’re going to be constantly fighting it. You can use the fact that a car’s differential means that its drive wheels will spin in the mud to dig holes, but that will stop working when car manufacturers come out with limited-slip differentials because they’re making cars for transportation, not digging holes.

That’s not to say that one should never be creative in one’s use of a tool. Certainly there are books which work better for propping up a table than they do for being read. Just be careful with it.

Twitter Trending Is One of the Worst Ideas Ever

I’ve talked before about how bad social media is (in its current forms) in Social Media is Doomed and talked about some ways to deal with it (in its current forms) in Staying Sane on Social Media. Today I want to talk a little bit about how Twitter Trending is either designed or might as well be designed to amplify the worst aspects of social media. (If you’re not familiar with it, Twitter Trending shows you a realtime-updated list of hot topics that a lot of people are discussing this minute.)

Twitter Trending, since it is a snapshot of what is being discussed in high volumes, necessarily captures what people are not taking the time to think about. When people take time to think about a subject, they do not all take the same amount of time to think, and so they will not post at the same time. To post at the same time, people must be posting almost immediately upon hearing about the subject. (There is some complex stochastic mathematics I’m oversimplifying, but the conclusion is the same.) To post upon hearing something, one must either be a subject matter expert who can instantly recognize context people will need in order to understand the hot topic, or else one must be fool enough to think that one’s immediate, unthinking reaction is worth other people’s time. The latter will naturally predominate among the people posting immediately, for the simple reason that subject matter experts are rare.

So we have a collection of posts, mostly by fools. How to make this work? How about using a criterion for what to show people which has nothing to do with quality? Most recent, most viewed, and most responded-to would all do well to give the highest likelihood of not getting the best tweets (or are they called xits, now?) without having to laboriously rate all of the tweets for quality then pick the lowest.

Now that we’ve selected what may well be the worst of the worst, and is at best the average of the worst, Twitter Trending now adds one more layer of awful: importance. The very act of showing people these randomly (with respect to quality) selected tweets makes them important. Since they’re likely to be the dumbest comments of fools, this will naturally spark outrage, because it is particularly bad when the worst that fools have to offer is elevated within society. Worse still, Twitter Trending presents this, not as a window into the dregs of what humanity has to offer, but as something neutral. Since, among non-psychopaths, the default reason to call someone’s attention to something is because their life will be better for it, Twitter Trending implicitly calls this garbage, good.

Some day Twitter will be able to use AI to show people a curated feed of the worst things ever tweeted, but until then, Twitter Trending is about the closest humanity can currently come.

There is, however, some good news. At least if you use Chrome, or one of its derivatives, like Brave (which is what I use): Twitter Control Panel. It removes a bunch of the worst features of Twitter, as well as doing some other stuff I don’t much care about (mostly changing the rebranding of Twitter to X). It’s still social media, but it helps to limit the worst excesses of present-day social media.

(Note, because internet: so far as I know Twitter Control Panel is not a commercial enterprise and I have no affiliation with whoever it is who makes it.)

Good Afternoon December 31, 2016

Good afternoon on this the thirty-first day of December, in the year of our Lord’s incarnation 2016.

Fair warning, I won’t be talking about the changing of the year because as a mathematician I just can’t get worked up about rolling over of numbers in arbitrarily chosen bases. We all have our weaknesses.

I put together a second computer, for my second son, so that he and his older brother can play Minecraft together. He had been borrowing an old laptop of mine which was very slow (“laggy” is the term my eldest son has picked up from the Minecraft youtubers he watches, though to my ears that refers more to events happening after the fact rather than low framerates). Interestingly he’s a bit young for most cooperative games because in Minecraft they tend to be puzzle games, and he’s not old enough to really understand puzzles. Herobrine’s Mansion, which is a classic RPG hack-and-slash adventure, has proven to be nearly ideal since it is cooperative but the cooperation is all about hitting undead things with your sword until they’ve been restored to a normal state of being dead. It’s a wholesome activity—demon-infested corpses should (in general) be put down quickly and inhumanely—and simple enough even for young children to get the idea and not screw up the game for their older siblings. (It’s also really cute to hear him screaming, “Naughtie zombie! Naughtie skeleton!” as he bashes them with his preferred simulation of a re-dead maker.) I suspect in another year or so they’ll be able to play the games with logic puzzles in them, which will be very awesome to watch. Incidentally, I really enjoy playing Herobrine’s Mansion with them. Hack-and-slash games are some of my favorites, and have been since I was a child.

Which brings me back to the topic of restored continuity. With new technology people keep recreating old games, both for nostalgia and because the old games were good—I was going to say, “and just lacked good graphics”, but sometimes the graphics were good (if mostly by being skilfully suggestive), and in Minecraft unless you’re using a high quality resource pack (like Chromahills) together with a shader pack like the SEUS shaders, Minecraft doesn’t have good graphics. Anyway, there was a huge disruption of culture in the late 1800s and the first half or three quarters of the 1900s, but I think things are settling down. My parents, I believe, felt somewhat disconnected from me, and their parents—again, I believe—felt somewhat disconnected from them. But I don’t feel disconnected from my children. I don’t mean in a complete sense, of course; all parents have a strong connection to their children. I just mean culturally. My children play the same sorts of games that I played as a child, if perhaps as a somewhat older child than they are now. Then again I listened to the same sort of music my parents did—I was a big Simon & Garfunkel fan as a child—so there is probably some similarity there too. And the things I used to do that they don’t, I mostly don’t do now either. I don’t feel any loss of cultural connection because we had land-line phones and kids these days only use cell phones. I only use cell phones these days too. I think this is classed in, “my children are better off than I was”, even though I can’t take any credit for it unlike the immigrants who worked their fingers to the bone so their children would grow up with a good education and an easy job that doesn’t leave them weary and sore in the evening.

It’s an interesting subject, because for example audio codecs have gotten slightly better than mp3s, but it doesn’t matter much and the music is still the same (especially pop music with its eternal three chords). And amusingly phones are getting HD voices at a time when people increasingly text rather than call each other. That in particular I find a fascinating trend. As soon as the technology in cell phones got good enough, we didn’t move to video calling or video+smell-o-vision or whatever. We “regressed” to pure text. Bad news for the blind, perhaps; great news for the deaf, and for most of us much more convenient. But we’re not going to see HD text which is radically different from the texts we send now. I’m not such a fool as to think that life will be unchanged in 100 years—in A Stitch in Space I certainly gave a vision of progressed technology with implantables that overlay on top of our optic and auditory nerves, though I didn’t flesh its implications out all the way—but I suspect that we’re going to see technological progress appearing to slow down because of human preferences. We will preferentially adopt new technology which doesn’t require much change of us, and so new technology will often emulate old technology with improvements, and the people who grew up with that old technology will feel that things haven’t changed all that much. We’ll see, of course. Nothing is so hard to predict as the future. But at the very least I sure am enjoying it as my kids do the things that I did as a kid, or those things with mild variations.

God bless you.

Good Morning December 10th, 2016

Good morning on this the tenth day of December in the year of our Lord 2016.

Winter is clearly here in force now. I was waiting for my oldest son at the bus stop and felt like I was slowly turning into an icicle. And I deal better with cold than with heat. There’s something fascinating about the cycle of how in northern climes the world dies off then comes back to life again. It’s an interesting metaphor, anyway. It also raises a curious question about fiction set in lands that are in permanent snow: what’s the basis of life there? It varies with the fictional environment, of course, but perhaps the ones I find most interesting are the ones where there are warm lands nearby, so the basis of life is something like fish which wander into the colds to birth their young where there are fewer predators. It can make for some very pretty images.

As I’m working on the video response to my friend’s nephew which I mentioned before, I did a quick video which is just a short description of how to use a neodymium magnet as a stud finder, in place of an electronic stud finder or knocking with one’s finger and judging how hollow the sound is. The video wasn’t great but came out alright. As the British would say, it’s fit for purpose. But there’s one mistake in it where I want to put a text overlay and for some reason the video editor just isn’t playing sound. I’m sure I’ll fix it eventually, but it’s a reminder of the continual frustration of using technology. None of it works very well. Chesterton complained about this in What’s Wrong With the World:

Cast your eye round the room in which you sit, and select some three or four things that have been with man almost since his beginning; which at least we hear of early in the centuries and often among the tribes. Let me suppose that you see a knife on the table, a stick in the corner, or a fire on the hearth. About each of these you will notice one speciality; that not one of them is special. Each of these ancestral things is a universal thing; made to supply many different needs; and while tottering pedants nose about to find the cause and origin of some old custom, the truth is that it had fifty causes or a hundred origins. The knife is meant to cut wood, to cut cheese, to cut pencils, to cut throats; for a myriad ingenious or innocent human objects. The stick is meant partly to hold a man up, partly to knock a man down; partly to point with like a finger-post, partly to balance with like a balancing pole, partly to trifle with like a cigarette, partly to kill with like a club of a giant; it is a crutch and a cudgel; an elongated finger and an extra leg. The case is the same, of course, with the fire; about which the strangest modern views have arisen. A queer fancy seems to be current that a fire exists to warm people. It exists to warm people, to light their darkness, to raise their spirits, to toast their muffins, to air their rooms, to cook their chestnuts, to tell stories to their children, to make checkered shadows on their walls, to boil their hurried kettles, and to be the red heart of a man’s house and that hearth for which, as the great heathens said, a man should die.

Now it is the great mark of our modernity that people are always proposing substitutes for these old things; and these substitutes always answer one purpose where the old thing answered ten. The modern man will wave a cigarette instead of a stick; he will cut his pencil with a little screwing pencil-sharpener instead of a knife; and he will even boldly offer to be warmed by hot water pipes instead of a fire. I have my doubts about pencil-sharpeners even for sharpening pencils; and about hot water pipes even for heat. But when we think of all those other requirements that these institutions answered, there opens before us the whole horrible harlequinade of our civilization. We see as in a vision a world where a man tries to cut his throat with a pencil-sharpener; where a man must learn single-stick with a cigarette; where a man must try to toast muffins at electric lamps, and see red and golden castles in the surface of hot water pipes.

This is not precisely the complaint that modern technology doesn’t work, but it’s tied to it, for modern technology being more complicated, it is more prone to failure. And nowhere is this more true than in computers, which in general barely work. (I say this as a professional programmer.) But even when they barely work, they are marvelous things, allowing us to do all sorts of marvelous things like write and read blog posts. And whenever these things which barely worked in the first place do fail, we get very frustrated by it. Which is natural enough; but I try to remind myself of how close all modern technology comes to not working, and to remember that even if computers and phones and such work 99% of the time, it is still when they work that is the exception, not when they fail. For all their success is snatched from failure. It is really a miracle that they work at all. It’s not accurate to the small picture, exactly, but it is accurate to the big picture. We live in an enchanted world, and it’s healthy to remember all the millions of men who have lived and died without ever having placed a single telephone call, or whose computer never booted up at all because it would be several centuries until the invention of electricity on demand.

God bless you.

Science, Magic, and Technology

There is an interesting observation made by Arthur C. Clarke:

Any sufficiently advanced technology is indistinguishable from magic.

This has been applied many times in science fiction to produce some form of techno-mage, but what’s more interesting is that the origins of modern science were in magic, specifically in astrology and alchemy. The goals of science were the same as that of magic: to control the natural elements. If you really study the history, it’s not even clear how to distinguish modern science from renaissance magic; in many ways the only real dividing line is success. There is some truth to the idea that alchemists whose techniques worked got called chemists to distinguish them from the alchemists whose ideas didn’t work. This is by no means a complete picture, because there was also at the same time natural philosophy, i.e. the desire to learn how the natural world worked purely for the sake of knowledge.

Natural philosophy has existed since the Greeks—Aristotle did no little amount of it—but it especially flourished in the renaissance with the development of optics which allowed for the creation of microscopes and telescopes. Probably more than anything else this marked the shift towards what we think of as modern science. As Edward Feser argues, the hallmark of modern science is viewing nature as a hostile witness. The ancients and medievals looked at the empirical evidence which nature gave, but they tended to trust it. Modern science tends to assume that nature is a liar. Probably more than any other single cause, being able to look at nature on scales we could not before and seeing that it looked different resulted in this shift towards distrusting nature. Some people feel a sense of wonder when looking through a microscope, but many people feel a sense of betrayal.

Another significant historical event was when the makers of technology started using the knowledge of natural philosophy in order to make better technology. This may sound strange to modern ears, who are used to thinking of technology as applied science, but in fact technological advancements very rarely rely on any new information about how the world works which was gained by disinterested researchers who published their results for the sake of curiosity. Technology mostly advances by trial and error modifying existing technology, and especially by trial and error on materials and techniques. In fact, no small amount of science has consisted of investigating why technology actually works.

But sometimes technology really does follow fairly directly from basic scientific research. One of the great examples is radio waves, which were discovered because Maxwell’s theory of electromagnetism predicted that they existed. Another of the great examples of technology following from basic scientific research is the atomic bomb.

I suspect that these as well as other, lesser, examples, helped to solidify the identification between science and engineering. And I don’t want to overstate the distinction. In some cases the views of the natural world brought about by science have certainly helped engineers to direct their investigations into suitable materials and designs for the technology they were creating. But counterfactuals are very difficult to consider well, and it is by no means clear that the material properties which were discovered by direct investigation but also explained by scientific theories would not have been discovered at roughly the same time, or perhaps only a little later.

However that would have gone, the association between science and technology is presently a very strong one, and I think that this is why Dawkinsian atheists so often announce an almost religious devotion to science. I’ve seen it expressed like this (not an exact quote):

Science has given us cars and smartphones, so I’m going to side with science.

Anyone who actually knows anything about orthodox Christianity knows that there is no antipathy between science and religion. Though it is important to note that I mean this in the sense of there being no antipathy between natural philosophy and religion. In this sense, Christianity has been a great friend to science, providing no small amount of the faith that the universe operates according to laws (i.e. that being a creature it has a nature) and that these laws are intelligible to human reason. Moreover, the world having been created by God, it is interesting, since to learn about creation is to learn about the creator. It is no accident that plenty of scientists have been Catholic priests. The world is a profoundly interesting place to a Christian.

But there is a sense in which the Dawkinsian atheist is right, because he doesn’t really care about natural philosophy. What he cares about is technology, and when he talks about science he really means the scheme of conquering nature and bending it to our will. And this is something towards which Christianity is sometimes antagonistic. Not really to the practice, since technology is mostly a legitimate extension of our role as stewards of nature, but to the spirit. And it is antagonistic because this spirit is an idolatrous one.

The great difference between pagan worship and Christian worship is that Christian worship is an act of love, whereas pagan worship is a trade. Pagan deities gain something by being worshiped, and are willing to give benefits in exchange for it. This relationship is utterly obvious in both the Iliad and the Odyssey, but it is actually nowhere so obvious as when the Israelites worshiped the golden calf. For whatever reason this often seems to be taken to be a reversion to polytheism, where the golden calf is an alternative god to Yahweh. That is not what it is at all. If you read the text, after the Israelites gave up their gold and it was cast into the shape of a calf, they worshiped it and said:

Here is your God, O Israel, who brought you out of the land of Egypt.

The Israelites were not worshipping some new god, or some old god, but the same god who brought them out of Egypt. The problem was that they were worshiping him not as God, but as a god. That is, they were not entering into a covenant with him, but were trying to control him in order to get as much as they could out of him. Granted, as in all of paganism it was control through flattery, but at its root flattery has no regard for its object.

And this is the spirit which I think we can see in the people who say, “Science has given me the car and the iPhone, I will stick with Science.” They are pledging their allegiance to their god, because they hope it will continue to give them favors. And it is their intention to make sacrifices at its altars. This is where scientists become the (mostly unwitting) high priests of this religion; the masses do not ordinarily make sacrifices themselves, but give the sacrifices to the priests of the god to make sacrifice on their behalf. And so scientists are given money (i.e. funded) as an offering.

To be clear, this is not the primary reason science gets funded. Dawkinsian atheists (and other worshipers of science) tend to be less powerful (and less numerous) than they imagine themselves. Still, this is, I think, how they view the world, except without the appropriate terminology because they look down on all other pagans.

And I think that it is largely this, and not the silly battles with fundamentalists and other young-earth creationists, that results in their perception of a war between science and religion. There were other historical reasons for the belief in a war between science and religion, but I am coming to suspect that they had their historical time and then waned, and Dawkinsian atheism is resurrecting the battle for other reasons. They are idolaters, and they know Christianity is not friendly to idolatry. And idolaters always fear what will happen if their god does not get what it wants.