ryan Posted June 12

This week we've got a big batch of issue fixes on the core dev branch (about 18 fixes), most related to GitHub issue reports. I've also been catching up on some client work this week, so not as many updates as in the last few weeks.

As I use more AI in my work, I've been building up a real desire to understand how it works. To me it seems like magic. So recently I started building a language model in PHP. Admittedly PHP isn't the ideal language for building such a thing, but this is really about learning, and so I thought I should use the language I know best. Using something like PyTorch in Python would certainly make things much easier, but would also abstract away a lot of the understanding I want to get out of the project.

The new model (and future ProcessWire module) is called Rambler. Currently it rambles on, often incoherently, hence the name. Though it'll get more coherent as time goes on, no doubt (hopefully!). Rambler uses zero machine learning libraries, no black boxes, and in fact has no dependencies at all. It's just pure PHP implementing the same mathematical foundations that power modern AI systems like GPT; at least, that's the goal.

I'm writing all the code myself, but I do have Claude as my teacher, describing each step, teaching me the concepts and terminology, and telling me what I need to code. After I code each part, he looks at it and tells me what I did right and wrong. It's a slow process but I learn by doing, and so it's also a lot of fun. It certainly helps to have an infinitely patient teacher.

Currently the project runs two models side by side (for comparison): a Markov n-gram model, which is a classical statistical approach that predicts the next word based on how often word sequences appeared in the training data, and a neural network model that learns distributed word representations called embeddings.
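For readers who want to see the n-gram idea in code, here is a minimal sketch of counting word sequences and turning the counts into next-word probabilities. It is written in Python for brevity and is not Rambler's PHP code; the function names (`train_ngram`, `next_word_probs`, `ramble`) and the toy corpus are invented for illustration.

```python
import random
from collections import Counter, defaultdict


def train_ngram(tokens, n=2):
    """Count which word follows each (n-1)-word context. Assumes n >= 2."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def next_word_probs(counts, context):
    """Turn raw follower counts for a context into probabilities."""
    followers = counts.get(tuple(context))
    if not followers:
        return {}  # context never seen in training
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()}


def ramble(counts, start, length=10, seed=0):
    """Sample words one at a time; `start` must hold exactly n-1 words."""
    rng = random.Random(seed)
    out = list(start)
    k = len(start)
    for _ in range(length):
        probs = next_word_probs(counts, out[-k:])
        if not probs:
            break  # dead end: no known follower for this context
        words = list(probs)
        weights = [probs[w] for w in words]
        out.append(rng.choices(words, weights=weights)[0])
    return out


# Hypothetical toy corpus, just to exercise the functions.
corpus = "the cat sat on the mat and the cat ate".split()
model = train_ngram(corpus, n=2)
print(next_word_probs(model, ["the"]))
print(" ".join(ramble(model, ["the"], length=8)))
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the model assigns them probabilities of 2/3 and 1/3. With so little data the generated text quickly hits dead ends or loops, which is one reason an n-gram model tends to ramble.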
The neural model passes its word embeddings through a hidden layer with ReLU activation, then predicts the next word using a softmax output. As far as I understand it, these are the same core building blocks used in early neural language models.

The most time-consuming part of doing this in PHP is the training. Python has libraries and functions that handle a lot of the hardcore math in ways that can take advantage of the hardware, like GPUs. But this is not the case with PHP, so training uses the CPU, and a lot of it! As far as the training system goes, it uses mini-batch gradient descent with backpropagation. The model makes predictions, measures how wrong it was (which is the "loss"), and then works backwards through the network, computing gradients to adjust every weight in the right direction.

Rambler also includes two tokenizers: a word-level tokenizer and a BPE (byte pair encoding) tokenizer. BPE is the subword strategy that is used by GPT, Claude and other modern LLMs. But for the small scale that I'm working at, the word-level tokenizer works faster, so far.

The next milestone is adding an attention mechanism in a RamblerTransformer subclass. Attention, the heart of the transformer architecture, is the core innovation behind a lot of modern LLMs. I'm hoping to get started on that part this weekend.

Beyond being a learning exercise, the longer-term goal is to train it on all of the ProcessWire documentation (which is what they call a "corpus" in this context) and release it on GitHub as a learning resource, and a PW module. Perhaps someday it'll be a tool in ProcessWire, or at least a really smart search engine for ProcessWire, we'll see. As far as I could tell, there aren't any other PHP-based language models that use the same technologies used by modern LLMs, so I figured, why not. I want to understand how they work under the hood without wading through Python frameworks, and I'm sure others do too.
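The forward pass and loss described in this post (embeddings, a ReLU hidden layer, a softmax output, and a measure of how wrong the prediction was) can be sketched in a few lines. This is a Python illustration using only the standard library, not Rambler's PHP code; the layer sizes, the parameter names (`E`, `W1`, `b1`, `W2`, `b2`) and the random initialization are all assumptions made for the example.

```python
import math
import random


def relu(v):
    """Zero out negative activations."""
    return [max(0.0, x) for x in v]


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x, b):
    """Compute W @ x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights (an arbitrary choice for this sketch)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def forward(params, context_ids):
    """Embeddings -> hidden layer with ReLU -> softmax over the vocabulary."""
    # Look up each context word's embedding and concatenate them.
    x = [v for i in context_ids for v in params["E"][i]]
    h = relu(matvec(params["W1"], x, params["b1"]))
    return softmax(matvec(params["W2"], h, params["b2"]))


def cross_entropy(probs, target_id):
    """The loss: negative log of the probability given to the correct word."""
    return -math.log(probs[target_id])


# Hypothetical tiny sizes: 5-word vocabulary, 4-d embeddings, 2 context words.
rng = random.Random(0)
vocab, dim, ctx, hidden = 5, 4, 2, 8
params = {
    "E": init_matrix(vocab, dim, rng),
    "W1": init_matrix(hidden, ctx * dim, rng),
    "b1": [0.0] * hidden,
    "W2": init_matrix(vocab, hidden, rng),
    "b2": [0.0] * vocab,
}
probs = forward(params, [1, 3])
loss = cross_entropy(probs, 2)
print(probs, loss)
```

With small random weights the output is close to uniform, so the loss starts near ln(5) for a 5-word vocabulary. Backpropagation then starts from the gradient of this loss with respect to the output scores, which for softmax with cross-entropy is the predicted probabilities minus a one-hot vector for the correct word.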
Once I get a little farther along with it, I look forward to getting it up on GitHub as a standalone project, but also as a ProcessWire module. The slowness of the training process (the model, not me... well okay, probably me too) is the hard part. I'm currently running a 30-hour training on all the text from a book. When the project is finished, I'm likely going to have Claude or Codex translate the training code into C, to take advantage of the much faster math capabilities available there. From what I understand, a 30-hour training in PHP will take about 30 minutes in C. I don't know if that's accurate or not, but it sounds good enough that I'm going to find out. 🙂

Thanks for reading and have a great weekend!
Ellyot_Chase Posted June 12

Have FUN this weekend Ryan! Isn't that feeling of discovery a wondrous thing?
Kiwi Chris Posted June 13

I love that rather than just consuming AI, you're trying to learn how it actually works.

Speaking of the translation of code, I wonder if AI would make it easy to provide a PostgreSQL database layer for ProcessWire?

Also speaking of C, I wonder about translating the entire ProcessWire codebase into Rust. (It would probably be expensive in terms of token use.) ProcessWire is fast as PHP apps go, but Rust is blindingly fast, though harder to learn than PHP. It's also memory safe whereas C isn't. It might not work, as PHP is interpreted so it makes it easy to deploy individual files, whereas Rust compiles to a single binary executable, so it's probably like comparing apples with pumpkins, but all kinds of things might be possible if you don't have to painstakingly write all the code by hand.
Jonathan Lahijani Posted June 13

Very exciting to hear Ryan. My understanding of how LLMs work is surface level. You mentioned it's being trained on ProcessWire documentation. Does it also have to be trained on anything general to have a stronger understanding of English itself (understanding nouns, verbs, how to form sentences, having a personality, etc.)?
HMCB Posted June 13

Is the eventual goal to perhaps have PW help us build our code without the need of an external LLM? And also take into account PW’s community modules? Imagine if it could also do front-end (HTML & CSS through the current frameworks UIKit, but including Tailwind too). That would honestly make PW a one-stop shop.
ryan (Author) Posted 3 hours ago

On 6/13/2026 at 10:54 AM, Jonathan Lahijani said:
Very exciting to hear Ryan. My understanding of how LLMs work is surface level. You mentioned it's being trained on ProcessWire documentation. Does it also have to be trained on anything general to have a stronger understanding of English itself (understanding nouns, verbs, how to form sentences, having a personality, etc.)?

@Jonathan Lahijani yes and no. When we use the MLP (multilayer perceptron) model, we provide it with a pre-trained GloVe file for vocabulary. We're using the 100d file from Stanford: https://nlp.stanford.edu/projects/glove/

For the transformer model, apparently the pre-trained GloVe file doesn't help. I'm not really sure I understand why, though; I'm still learning. I get similar results either way. In any case, the scale I'm working at is small and more educational than practical. The models work, and learn the general order of words, and how some words relate to one another. But it's Rambler, it rambles on like a crazy person, and it's not yet clear to me that it will ever be good enough to have a production use with my limited scale. Though I'm going to keep at it.

One thing I've learned is that the "magic" behind what we see with frontier models has a lot to do with extremely large scale, both in hardware and training data. The ability to code happened kind of accidentally. It was apparently a surprise. As the scale increased, the models started coding, without it being the actual goal or intention.

Another thing I've learned is that while many people understand the technology that goes into AI models, nobody fully understands how you go from models that complete sentences and answer questions to models that seemingly reason, understand humor, think and solve complex problems that they weren't actually trained on. (Though someone correct me if that's changed.)
While I now feel like I have a basic understanding of the technology and how it works, there is still a sense of something beyond understanding, at least when the technology is combined with scale. Definitely an interesting subject!
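For anyone curious what using a pre-trained GloVe file involves: the text files are plain lines of a word followed by its space-separated vector values, so loading them needs no special library. Below is a minimal Python sketch (not Rambler's PHP code) that parses lines in that layout and compares two words by cosine similarity. The function names and the three-dimensional sample rows are invented for illustration; real 100d rows carry 100 values per word.

```python
import io
import math


def load_glove(lines, vocab=None):
    """Parse 'word v1 v2 ... vd' lines; optionally keep only words in vocab."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if vocab is not None and word not in vocab:
            continue  # saves memory when the model's vocabulary is small
        vectors[word] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy rows with invented values (NOT real GloVe vectors), in the same layout.
sample = io.StringIO(
    "page 0.1 0.2 0.3\n"
    "template 0.1 0.25 0.3\n"
    "field -0.3 0.1 0.0\n"
)
vecs = load_glove(sample, vocab={"page", "template"})
print(sorted(vecs))
print(cosine(vecs["page"], vecs["template"]))
```

To read a real file, the same function can be given an open file handle instead of the `io.StringIO` sample. Filtering by the model's own vocabulary while loading keeps memory use down, which matters when the full file holds hundreds of thousands of words.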
ryan (Author) Posted 2 hours ago

On 6/13/2026 at 1:24 PM, HMCB said:
Is the eventual goal to perhaps have PW help us build our code without the need of an external LLM? And also take into account PW’s community modules? Imagine if it could also do front-end (HTML & CSS through the current frameworks UIKit, but including Tailwind too). That would honestly make PW a one-stop-shop.

This is already possible, but you do need either an external LLM or a good local one (like Qwen Code 2.5, or Gemma 4) with the hardware to handle it. I am running Qwen 3 and Qwen Code 2.5 on my iMac and they are good, but slow (my iMac is from 2017), and I'm using 7-9B parameter models, so they are pretty limited in what they can do. These local models have ProcessWire knowledge, but not at the level that external models do. Qwen 3 is cool because you can watch its thought process before it answers your prompt. But Qwen Code 2.5 does better with PW, despite being older.

I think it's highly unlikely that Rambler would ever be good enough to even be a replacement for a pre-trained local model. Even those open source models have millions of dollars behind them. But I am hoping that Rambler gets good enough to have some sort of production value eventually, so I'm going to keep rambling on. 🙂