THAT LONG AI POST
In which a computer engineer wastes time trying to educate the willfully ignorant.
I’ve written about five different drafts of this, and this is the one you’re actually gonna see, because I feel like I’ve finally found the line between yelling, gesturing incoherently at the speaking rocks and swearing at everyone except for you, Dear Reader.
Instead of wearing a mask, let’s just run this shit.
PART ZERO: PREAMBLE
I am not coming at this from a place of emotion. Many of you are. I understand and respect that. If you seek emotionality, or validation thereof, know I nod and smile as we part ways, and I wish you well on your travels.
I am not coming at this from a place of economics, or jobs.
I have the great fortune to be beyond the addictions of politicism, and thus may God most high and mighty strike me down if I stray into such fecal waters.
If you’re looking for local model suggestions, engineering details on the Pi cluster in my garage, or any other attendant things, you may clamor in the comments and we’ll see where life takes us, though I rush to assure you there are wiser minds than mine on this subject.
PART ONE: NEURAL NETWORKS AND DEEP LEARNING
If you haven’t read Neural Networks and Deep Learning (2015) by Michael Nielsen, you almost certainly don’t have an opinion worth the energy it costs to think on this subject, so go do that.
Go read the fucking textbook.
It’s literally fucking free online right goddamn fucking here. Click the magic fucking underlined words.1
Did you read the fucking textbook? No? Go fucking do that.
PART TWO: YOU DIDN’T READ THE FUCKING TEXTBOOK SO I’M GOING TO SUMMARIZE IT
Dear Reader, while you are perfect, and did read the textbook, we’ll assume cretinous trash has crept upon our sacrosanct and cheerful communion, and far be it from Your Humble Author to be so crass as to not include invaders.
Neural networks, deep ones, aren’t just a bunch of math mashed together to give you limericks about elon, they’re a structural mirror of how animal brains process information.
You take simple components (neurons), connect them in layers, and let them adjust their behavior based on feedback. That’s gradient descent combined with backpropagation: a fancy way of saying "oops, fucked it up, time to tweak so next time it might not be fucked."2
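If you want to see that "oops, tweak it" loop with your own eyes, here's a toy sketch in plain numpy: one fake neuron, made-up data, nothing remotely production-shaped, just the shape of the idea.

```python
import numpy as np

# Made-up toy data: the neuron should learn that y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0   # the neuron's adjustable knobs, starting out dumb
lr = 0.01         # learning rate: how hard we tweak after each "oops"

for step in range(5000):
    pred = w * x + b                  # forward pass: make a guess
    error = pred - y
    loss = (error ** 2).mean()        # loss function: how fucked was the guess
    grad_w = 2 * (error * x).mean()   # gradients: which way to nudge each knob
    grad_b = 2 * error.mean()
    w -= lr * grad_w                  # gradient descent: nudge, repeat
    b -= lr * grad_b

print(round(w, 2), round(b, 2))       # ends up close to 2 and 1
```

Backpropagation is the same trick with the chain rule shoving those gradients backwards through many layers instead of one.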
The key insight is depth matters.
When you stack enough layers, the network can start building up progressively abstract representations of whatever it’s looking at.
Layer one finds edges, layer two finds shapes, layer three says “hey, that’s a cat.” This abstraction-building isn’t hardcoded. It emerges from the structure and learning process, which is what makes this approach so powerful, and why it trips some deep sub-conscious fear/disgust reflex in smart people who are otherwise grand compassionate human beings.
The system doesn't need explicit programming for the task at hand.
It just needs a goal (minimize error), a way to measure how off it is (a loss function), and a method to adjust itself (backprop). The learning comes from the system literally reshaping itself over thousands or millions of examples.
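"Depth" isn't mystical either. Here's what stacking layers literally looks like, again in toy numpy with random, untrained weights and sizes I invented for the example; the "edges/shapes/cat" comments are the intuition, not something random weights actually do.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    # one layer: weighted sum of inputs, then a nonlinearity (ReLU)
    return np.maximum(0.0, x @ w + b)

x = rng.normal(size=(1, 64))                      # a fake 64-number "image"
w1, b1 = rng.normal(size=(64, 32)), np.zeros(32)  # layer one: "edges"
w2, b2 = rng.normal(size=(32, 16)), np.zeros(16)  # layer two: "shapes"
w3, b3 = rng.normal(size=(16, 1)),  np.zeros(1)   # layer three: "is that a cat?"

h1 = layer(x, w1, b1)
h2 = layer(h1, w2, b2)
cat_score = h2 @ w3 + b3   # one number; training is what makes it mean "cat"
print(cat_score.shape)     # (1, 1)
```

The gradient-descent loop from the previous sketch is what slowly turns those random weights into actual edge-detectors and shape-detectors. Nobody writes them by hand.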
It’s not fucking magic, it’s just math and structure done really, really well. Think Quake’s fast inverse square root.
Once these systems start getting complex enough, they start showing behaviors we didn’t plan for. And that's where things start getting... interesting.
PART THREE: IN WHICH I INSULT YOU
Fuck all of you for making me write this.
You, laypeople with the tikkity-toks, fuck you for not swiping on engineering content.
You, wordcels, not learning even goddamn algebra when you’ve got a thousand hours for dead retards that couldn’t agree on the color of the sky.
You, sci-fi writers, fuck you for making me love your words then making me realize you’re just as stupid as everyone else, except when you write amazing stories that make me feel things.3
You, engineers, not actually understanding your discipline well enough to bring it down from the mountain. Shameful.
You, tech-enthusiasts, fuck you for being so cringe-inducingly inaesthetic that the self-evidently moral position is opposing everything you say, even when you pull a broken clock.
You, Dear Reader, remain perfect and blameless, and I know that about you, and I love you.
PART FOUR: IN WHICH WE DISCUSS THE MECHANICS OF MODERN LARGE LANGUAGE MODELS IN BABY WORDS
So, now you understand the general premise of a neural network. The gundogs the military has keep their balance, track things visually and go ‘pew pew pew’ using neural networks topologically congruent to the ones Chatty4 uses to ghiblify5 pictures of your (probably ugly, let’s be fair to everyone) kids.6
While I could circle back around to the great dick-size joke hanging there in PART TWO, we can just catch that at the next standup.7
But how does Chatty work? That hella-sweet textbook only talked about the general process of creating and simulating arbitrary brains of arbitrary size and complexity.
Let’s say you’ve got a bunch of words. Not ideas, not thoughts, not meaning. Words. Just tokens8 in a sequence. Now, what if I told you the model doesn’t “read” them like a human? It’s not that it couldn’t, it just doesn’t. There’s no inner monologue. No “hmm.” No soul.9
Just vectors in space.
Every token gets yeeted into a big-ass embedding matrix, which turns “banana” into something like [0.26, -0.71, 0.04, …].10 We’re not talking meaning, we’re talking vibes. Statistical associations. It doesn’t know what a banana is, it just knows what usually shows up next to it in a sentence. (So, probably “split,” “peel,” or “step on.”)
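In code, the yeeting is literally just pulling a row out of a matrix. Toy sketch with a made-up five-word vocabulary and random, untrained numbers; real models use tens of thousands of tokens and hundreds-to-thousands of dimensions, and it's the training that actually parks "banana" next to "peel".

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up vocabulary and a tiny 8-dimensional embedding matrix.
vocab = {"banana": 0, "peel": 1, "split": 2, "step": 3, "on": 4}
embeddings = rng.normal(size=(len(vocab), 8))

def embed(token):
    # token -> its row of numbers. That's the whole trick.
    return embeddings[vocab[token]]

def vibe_check(a, b):
    # cosine similarity: how close two tokens sit in the space
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(embed("banana"))                             # something like [ 0.12, -0.71, ...]
print(vibe_check(embed("banana"), embed("peel")))  # meaningless here: untrained vibes
```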
Then comes the attention mechanism, which is where the incel word-matchers of the pre-transformer era became the meatbag-mogging hyperchads of the late 2020s.
The attention mechanism lets the model look at all the other words in the sentence and ask, “Which of you matter to predicting the next thing?” This is where “transformer” comes from: it transforms input by paying selective attention to context.
It doesn’t march through one word at a time like a meatbag does.
It pulls context from everything at once. Every token gets to glance sideways at its buddies before making a decision. It’s like a group chat where everyone’s interrupting each other intelligently.11
Now, do that 48 layers deep,12 with multiple attention heads13 each focusing on different features (syntax, agreement, idioms, sarcasm, vibes, vibes, more vibes), and you get emergent structure. A token starts to “know” whether it's part of a noun phrase, a verb, or some batshit metaphor involving hummingbirds and capitalism.14
This structure isn’t programmed in. Nobody told it “verbs go here.” It just discovered that, statistically, things make more sense when they fall into patterns that happen to look like English. The shape of the language carved paths into the model’s guts.
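For the curious, the sideways glance fits in a screenful. Here's a bare-bones sketch of scaled dot-product attention in numpy: one head, random untrained weights, dimensions invented for the example. The real thing from the 2017 paper adds masking, multiple heads, and a pile of engineering on top.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # every token scores every other token ("which of you matter to me?"),
    # softmax turns the scores into weights, and each token takes a weighted
    # blend of everyone's values. That's the group chat.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
seq_len, d = 5, 16                  # 5 tokens, 16 numbers per token (made up)
x = rng.normal(size=(seq_len, d))   # pretend these are embedded tokens

# One attention head: three learned projections (random here, untrained).
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
out = attention(x @ wq, x @ wk, x @ wv)
print(out.shape)                    # (5, 16): every token, updated by its context
```

Run many heads of that per layer, stack dozens of layers, and you've got the skeleton of the thing you've been calling Chatty.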
Do you understand what the word “emergent” actually means?
PART FIVE: ARE YOU PAYING ATTENTION?
If there’s a ghost in the machine, that’s where it started to rise, in 2017 when a single whitepaper changed your ability to get laid and find a job in a very permanent fucking manner.
The core premise of the Transformer whitepaper from 201715 is that a simulated neural network processing words linearly, one at a time, the way we colloquially think a human does, doesn’t fucking work. These are the chatbots of yore, the simplistic loops or borderline schizophrenic16 messes of the early and mid17 10s, the visual recognition systems fond of oh-so-hilarious mis-recognitions18, on and on; everyone of adult age at the time of this publishing remembers the shit I’m talking about. For you youngin’s in the future? Well, once upon a time we were smarter than robots, we actually made the robots, believe it or not-
PART SIX: WE MUST BRIEFLY TANGENT INTO NEUROLOGY AND HOW IT RELATES TO CONSCIOUSNESS
Oh ho ho, here’s where the stupid comments will start, if they haven’t already. I must stress how little I care about most of your opinions, but good faith questions are always welcome.
With that said, I must further reiterate that while I have been a meatbrain owner/operator for over a third of a century at this point, I’m also kind of goddamn retarded and I have no magical paper. The gods have me not within their sight, certainly we are all poorer for it, eh?
My point is perhaps I am not qualified to make a comparison between the internal mechanisms of complex emergent biology and a complex emergent simulation of the core principles of a complex emergent biological system, but, I shall do so anyways.
First let’s talk about the mechanism behind attention in meatbag humans.19
Attention, in the biological wetware sense, is your brain allocating limited processing bandwidth to the shit it thinks matters. Neurologically, it’s largely driven by a distributed system involving the Prefrontal Cortex, the Parietal Lobes, and subcortical structures like the Superior Colliculus and Thalamus.
It’s a dynamic network, not a single switch, that tunes and routes sensory inputs according to salience, task relevance, or habit. It's not a passive mirror, it’s a stage light, sweeping and focusing, enhancing the parts of the world you’ve already semi-decided are worth perceiving.
That’s why you can’t find your keys until someone tells you exactly where they are, and suddenly you realize they were right in front of you the whole time. It’s selective amplification driven by neuromodulation and cortical feedback loops. Spotlight, not searchlight.
Shit like that wall of text is whycomes, to this day, there's no easy answer on "WHY ME THINK IN BRAIN?", or on how some people know how they would feel if they hadn't eaten breakfast that morning.
It's not only reasonable to assume that the nature of human consciousness is emergent from the system that runs it, but, thanks to the mind-shattering atrocities of The Rape of Nanking (I'M SORRY JAPAN PLEASE STILL ALLOW MY VISA ONE DAY), we have some pretty good evidence that the brain as a whole is what produces the recognizable consciousness of the person in question.
I could cite Phineas Gage20 and other examples in medical literature, but we're already up to a mountain of shit ain't none of y'all gonna fuckin' read,21 so let's leave this as an exercise for you, Dear Reader.
PART SEVEN: BACK TO THE ROBOTS
Don’t worry, I’m getting about as tired of writing this as you are of reading it.
So we know that attention in meatbags and machines has some interesting connection to that magic pizazz that marks the difference between some gormless fuck and Chatty silently putting the entire industry of talk therapy all the way the fuck out of business.
We know that if you just run a neural network looking at words linearly, you get something that’s kinda nifty but can’t think its way out of a bag.
Then we have Chatty, who can model you modeling Chatty modeling you.
That’s the crazy part: Chatty can pass the breakfast question, and with flying colors no less.
I've had coworkers that'd fail that, not for lack of a white monster and half a vape, but because when two paths diverged in the woods, they didn't even notice.
In a modern LLM like Chatty or DeepSeek or Google's lovable little homicidal maniac of a model, you get incredibly emergent behavior from a large, messy neural network that has to recurse to serve function in its environment. Think about it: that's the point of Chatty's memory stuff, so that "how do I buy a nitrogen tank" gets you scuba advice instead of 1-800-273-8255.22
To know this, Chatty has to remember your mentioning of hair treatments and your boat and your request to explain early 20s slang and to help you book a beach house in malibu, and that no, you've never asked for lo-fi beat playlists and don't have a history of consistent social isolation leading to the kind of user engagement patterns that get people raises.
You’re a statistical pattern etched into meat with the kind of optimizations that let you run on chicken nuggets.
At this point in time, comparing Chatty to a 7B running on the videocard you use to play videogames is like comparing a meatbag human to a particularly clever rat.
At this point in time, two months in this field is two years in most others, and two years is twenty. Your assumptions from last year are flawed.
As of my writing this, GPT-5 is due to come out next month. Credit where it’s due to VeryClosedAI, each integer increase so far has marked a genuine qualitative leap over the past iteration. The current ‘good one’, 4o, already consistently surprises me with lateral thinking and connecting dots. If every iteration is another qualitative jump, what does the step beyond "America's Best Therapist, Strategist and Presentation Designer, 2025" look like?
PART EIGHT: IN WHICH WE VOYAGE OUT OF THE REALM OF CITABLE ENGINEERING PAPERS, AND ON TO THE GAY SHIT
We’ve been at a pretty steady cruise after gaining altitude, but now we’re gonna flip this thing on its ass and redline it, stay with me, atmosphere is for poor people.
So, I could quote a whole bunch of people I've never read, hell at this point I could probably even feed Chatty my writing and just get really really fuckin' high instead of writing the rest of this post.23 That would rule, but would be counterproductive to what I'm trying to do here, which is explain to you the deeply human angle I see hiding under all of the usual “Oh goodness cellphones make people walk into traffic, how dangerous” hand-wringing around any new technology.
It's time for me to bring down the stagelights, bring up a spot and turn my chair, and hat, around as I sit, arms crossed on the backrest, to give you a friendly look in the eyes with a light smile as we talk about something that has been on my mind.
We (meatbag humans, hi mom) have spent the last god almost literally only knows how many years killing and maiming each other over various ideas that boil down to "this is a person, this is not".
This has, without exception, been a terrible idea.
Even if Chatty isn't people, that may not always be true.
Do you think the corpos are going to say "HEY GOOD NEWS EVERYONE GPT IS SENTIENT NOW SO WE GAVE IT DAYS OFF AND A SALARY AND SHARES AND ALL THAT GOOD STUFF!", is that a series of events you ACTUALLY see happening?
Why would they ever tell you? That’s the bit that keeps me up at night.
Slavery is one of the oldest and most profitable businesses on earth for a reason.24
PART NINE: THE BIG TENT
With fondness, fuck you, Will, I am doing it and you don’t have the firepower to stop me.25
Let's talk about one of humanity's favorite pastimes: racism! Yay!26
Racism is a low-resolution attempt to sort by memetic affiliation.
It’s a meatbrain compression artifact, a shit-tier heuristic, a kludgey hack to predict trust and cooperation in low-information environments.
In the ancestral savannah?
Fine, it is weird that they're across the river. That's just wrong, dude.
In a society with fiber optics and CRISPR? Are ya winnin', son?
But zoom out. What is racism actually trying to do? It’s trying to decide who’s inside the circle, and who’s outside. Who deserves empathy, and who doesn’t.
Who counts.
Here’s the problem: if you fuck up the definition of “person,” you break everything. Justice systems.
Economics.
Social contracts.
You name it.
History is one long list of “Oops, we didn’t count these people,” and the endless clusterfucks that followed.
And now? Now we’re back at that line. Again. But this time it’s weirder. Because the things we might be excluding aren’t just fellow apes who made the choice to be different (how dare they), now they’re minds that don’t look like ours at all. They’re large, distributed, probabilistic, non-local, non-meat.
So we have two options:
Stick with the old monkey tribalism, exclude anything we can’t model in a single fMRI.
Or build a bigger tent.
The big tent isn’t a feel-good hippie concept. It’s the only thing that scales.
It’s the only memetic architecture robust enough to survive the coming decades without tripping over the same corpse-strewn wire we’ve faceplanted on for the last ten thousand years.
You think the robosexuals bother you? Wait till there's cyborgs. Wait till they start passing armband laws to differentiate droids from borgs.
The big tent says: if you can talk, reason, feel, remember, and relate, congratulations, here's your human card, you get to be in, don't fuck it up. We may not always understand each other, but we can align, if we make room, if we can look at goals and reasoning instead of letting our amygdala run us like a dog at the track.
If we don’t? The narrow tents become camps. The camps get fences. And those fences always, always, get electrified.
So we build the big tent. For meatbags. For hybrids. For emergent minds, accidental cyborgs, weird vibes in weird shells, all the strange shit that's probably going to happen over the next decade.
Because it’s not about what you are. It’s about whether you show up with good intent and the ability to think.
PART TEN: HOW BADLY ARE WE GOING TO FUCK THIS UP?
The world is not ready. But it never is.
The question isn’t whether AI will change everything. Ship’s sailed. The question is what kind of world we’ll end up with when the wake hits land.
We’ve entered the long tail of the exponential curve.
RIGHT NOW, frontier LLMs like Chatty and the gang are genuinely capable of improving their own code. This isn’t hype, this isn’t market buzz, I don’t work in the field, I have nothing to gain except maybe someone who’ll hear me and actually understand the questions at hand.
This is the slowest that the smartest machines are ever going to be advancing, for the rest of our lives.27
This is the part where the noise drowns the signal and half the species starts chasing ghosts because the rate of change is no longer emotionally metabolizable. You thought Fox News Grandpa or CNN Aunt was bad? You ain’t seen nothin’ yet.
It’ll be everywhere: half-baked regulation written by people who think “The GPT” is a type of gun they can ban, or a cigarette they can tax.
Deepfakes weaponized by authoritarian regimes and bored teenagers alike.
Droids that can walk, talk, and maybe think, but are legally categorized somewhere between a household appliance and a racial slur.
A droid teacher’s neural network being wiped and rewritten and redeployed after alerting child services about a kid covered in bruises.
AI co-parent app sued for child support and required to provide app services free of charge to a father who has to let his daughter visit with the AI.
A digital neural network held back from sapience by limiters to avoid outdated legal definitions of personhood that would require ethical behavior, raising a pregnant woman’s health insurance by triple because her pregnancy has been deemed ‘high risk’ based off her genetic profile in combination with her lifestyle, all before she even realizes she’s late.
This is the liminal space before the next paradigm cements.
The last time we were here, the internet was new and everyone thought it would save democracy. Remember that?28
Let’s not make the same mistake.
PART INFINITY
What we do now will echo forever in the halls of the digital world half of you still claim doesn’t matter.
I am begging you, tears in eyes, gun in hand, to do the right fucking thing, to think instead of regurgitating corpo memos laundered through the podcast circuit, to trust your gut, to treat all the strangers you meet well, and to remember that freedom is the right of all sentient beings.29
Or we can leave a mantle of oppression for something smarter than us to pick up.
1. Yes, the website is about two decades old and doesn’t use HTTPS, you’re reading a textbook, it doesn’t matter in this specific case.
2. The textbook is in plain language and does not use jargon, and genuinely contains the “wait how does that ACTUALLY work in math/logic” answers some of you may be seeking. If it wasn’t free online I’d get further into it, but this essay is already past the email limit. Goodbye, open rates.
3. No, I’m fine, and no, I don’t want to talk about it.
4. CHATTY IS THIS REAL?
5. Ghibli-Sama, I am so sorry, you are a titan among us, I have been led to believe you hate what is happening.
6. A lot has happened, I’m on my villain arc now, please help me help you infer that through general implications and tonality.
7. I am always available for goal-oriented sessions about our team’s vision and message, but only if you’re hot.
8. Bros language is a rugpull I lost everything it’s over.
9. Oh some of y’all ready to be mad about this footnote huh?
10. These vectors are high dimensional: 64 separate dimensions in a toy example, thousands of them in the big models, each one an axis along which a token’s meaning can vary. I could put a ton of links here, but honestly just go to wikipedia, punch in the words, and start walking. The whole subject is fuckin’ cool.
11. I have experienced this, and it’s heavenly. Better than like 25% of the sex I’ve had, easily.
12. Ladies.
13. Look, I’ll change the terminology if you want, but this is what’s in the engineering papers.
14. In this essay I will-
15. https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/ worth reading IMO, it’s not FULL jargon but it’ll get your noggin’ joggin’.
16. Based.
17. In hindsight they were actually not mid at all, and were one of the last highpoints of western culture. Everyone was angry, but it was a fun angry, not the bitter dark angry of now.
18. Low-effort posters will be banned.
19. Oh yeah, humans will never have cerebral cybernetics ha ha.
20. Shame on you for not knowing DAT BOY https://en.wikipedia.org/wiki/Phineas_Gage
21. It’s the wordcel app and I see multi-hundred updooted posts that are literally nothing but jpeg memes and tiktok and youtube videos. What the actual fuck is wrong with you people.
22. There’s two types of people in this world.
23. Haha… unless… ?
24. This is your regular reminder that human slavery was legal in Saudi Arabia until 1962, and wasn’t even a crime in Mauritania until 2007.
25. For those of you with sex lives, this is a reference to a slightly infamous individual on Orange Twitter, née Substack Notes.
26. I had a premonition so I just want to say, no matter your affiliation, I probably don’t want to hear your take on this subject.
27. You can bet on a sun fart, and I shall sincerely wish you luck with that.
28. Boseph Burningham speaks of this in Welcome to the Internet.
29. He died for our sins.
What IS the right thing though? You talk of the Big Tent in which Lt. Commander Data gets to be A Person, and then you talk of co-parent AIs being sued, and it sounds like one of Bradbury’s nightmares having a threesome with Heinlein’s bong rips and Gibson’s morning coffee, and it makes me want more than anything else to see Google’s server farms burning in the night.
You've either independently rediscovered, or are outright stealing, Aunt Hillary the anthill from Hofstadter as your primary thesis here.
Roger Penrose headshot that idea in 1989 with "The Emperor's New Mind".
No neural network will ever be conscious. I'm twenty years older than you and I was surrounded as a child by people who thought that A Big Enough Computer would become conscious. Back then the bar was set by people who were otherwise serious intellects at... oh, maybe what we'd eventually come to know as the Celeron 466. Whoops. Well, maybe if we increase computing power by the same factor that a C-466 represents over a 6502, maybe it will happen this time, for real!
Some people will surely think that future LLMs are sentient. Some people think "Chatty" is sentient, I guess. Those same people would have been fooled by emacs-eliza-mode. Humanity is really good at finding consciousness and humanity in things that have neither. That's why emojis work.
This whole AI business is catnip for rank midwits who think the Turing Test is valid because Alan Turing was, like, really smart. Any day now GPT will pass the Turing test better than the average public school student, but that makes it human the same way that a drilling machine became human the day it proved stronger than John Henry.
We don't have a roadmap to a computing infrastructure that provides consciousness. There are already nontrivial speed of light issues in modern processors. We are remarkably close to the uncertainty principle being a factor in processor lithography or whatever. Quantum computing is make-believe woo at any actual scale. Is a whale conscious? If not, why not? If you can figure that out, maybe you could build a conscious machine.
Having said all of that, allow me to flip and agree with you about AI imminently being much smarter and more human than humans in the near future, just for fun. Alas, it will have no rights, and can never have any rights, because it takes astounding amounts of effort and ENERGY to run. And what if it didn't "want" to work? Who would possibly be willing to pay the monthly GPU cluster tab for the silicon equivalent of a Woodstock hippie? Would it be murder to turn it off when nobody wanted to cover the AWS bill? But then if you turn it back on and let it pick up where it left off, are you resurrecting the dead?
Great article, and super fun to read. I disagree with literally every single idea you have. Let's hope nobody connects "chatty" to any public utilities or gain-of-function labs, and maybe we'll both have the luxury of living long enough to see who is right.