Leaderboard

Popular Content

Showing content with the highest reputation on 06/14/2026 in all areas

  1. This week we've got a big batch of issue fixes on the core dev branch (about 18 fixes), most related to GitHub issue reports. I've also been catching up on some client work this week, so not as many updates as in the last few weeks.

As I use more AI in my work, I've been building up a real desire to understand how it works. To me it seems like magic. So recently I started building a language model in PHP. Admittedly PHP isn't the ideal language for building such a thing, but this is really about learning, and so I thought I should use the language I know best. Using something like PyTorch in Python would certainly make things much easier, but would also abstract away a lot of the understanding I want to get out of the project.

The new model (and future ProcessWire module) is called Rambler. Currently it rambles on, often incoherently, hence the name Rambler. Though it'll get more coherent as time goes on, no doubt (hopefully!).

Rambler uses zero machine learning libraries, no black boxes, and in fact has no dependencies at all. It's just pure PHP implementing the same mathematical foundations that power modern AI systems like GPT, at least that's the goal. I'm writing all the code myself, but I do have Claude as my teacher, describing each step, teaching me the concepts and terminology, and telling me what I need to code. After I code each part, he looks at it and tells me what I did right and wrong. It's a slow process but I learn by doing, and so it's also a lot of fun. It certainly helps to have an infinitely patient teacher.

Currently the project runs two models side by side (for comparison):

  • A Markov n-gram model, which is a classical statistical approach that predicts the next word based on how often word sequences appeared in training data.
  • A neural network model that learns distributed word representations called embeddings.
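For readers who want to see the n-gram idea in code, here is a minimal sketch. It is written in dependency-free Python rather than Rambler's PHP, and the function names (`train_ngram`, `next_word_probs`, `generate`) and the toy corpus are made up for illustration; this is not Rambler's actual code.

```python
import random
from collections import Counter, defaultdict

def train_ngram(tokens, n=2):
    """Count which word follows each (n-1)-word context. Assumes n >= 2."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts

def next_word_probs(counts, context):
    """Turn raw follower counts for a context into probabilities."""
    followers = counts.get(tuple(context))
    if not followers:
        return {}  # context never seen in training
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()}

def generate(counts, start, length=10, seed=0):
    """Sample words one at a time, each conditioned on the last len(start) words."""
    rng = random.Random(seed)
    out = list(start)
    k = len(start)
    for _ in range(length):
        probs = next_word_probs(counts, out[-k:])
        if not probs:
            break  # dead end: no known continuation
        words = list(probs)
        out.append(rng.choices(words, weights=[probs[w] for w in words])[0])
    return out

# Hypothetical toy corpus; a real run would use a much larger text.
corpus = "the cat sat on the mat and the cat ran".split()
model = train_ngram(corpus, n=2)
print(next_word_probs(model, ["the"]))
print(" ".join(generate(model, ["the"], length=6)))
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the model assigns them probabilities of 2/3 and 1/3. The model has no notion of meaning, only of which sequences it has counted.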
The neural model passes those embeddings through a hidden layer with ReLU activation, then predicts the next word using a softmax output. As far as I understand it, these are the same core building blocks used in early neural language models.

The most time-consuming part of doing this in PHP is the training. Python has libraries and functions that handle a lot of the hardcore math in ways that can take advantage of the hardware, like GPUs. But this is not the case with PHP, so training uses the CPU, and a lot of it!

As far as the training system goes, it uses mini-batch gradient descent with backpropagation. The model makes predictions, measures how wrong it was (which is the "loss"), and then works backwards through the network, computing gradients to adjust every weight in the direction that reduces the loss.

Rambler also includes two tokenizers: a word-level tokenizer and a BPE (byte pair encoding) tokenizer. BPE is the subword strategy that is used by GPT, Claude and other modern LLMs. But for the small scale that I'm working at, the word-level tokenizer works faster, so far.

The next milestone is adding an attention mechanism in a RamblerTransformer subclass. Attention is the heart of the transformer architecture, which is the core innovation behind a lot of modern LLMs. I'm hoping to get started on that part this weekend.

Beyond being a learning exercise, the longer-term goal is to train it on all of the ProcessWire documentation (which is what they call a "corpus" in this context) and release it on GitHub as a learning resource, and a PW module. Perhaps someday it'll be a tool in ProcessWire, or at least a really smart search engine for ProcessWire, we'll see.

As far as I could tell, there aren't any other PHP-based language models that use the same technologies used by modern LLMs, so I figured, why not. I want to understand how they work under the hood without wading through Python frameworks, and I'm sure others do too.
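The neural pipeline described above (embedding, hidden layer with ReLU, softmax output, loss, backpropagation) can be sketched in a few dozen lines. This is a rough illustration in dependency-free Python, not Rambler's PHP code. The class name `TinyLM`, the layer sizes, the learning rate and the toy corpus are all assumptions, and for brevity it updates the weights after every single example instead of after a mini-batch.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

class TinyLM:
    """Hypothetical one-word-context model: embedding -> ReLU hidden -> softmax."""

    def __init__(self, vocab_size, embed_dim=4, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.E = init(vocab_size, embed_dim)    # one embedding row per word
        self.W1 = init(hidden_dim, embed_dim)   # embedding -> hidden
        self.b1 = [0.0] * hidden_dim
        self.W2 = init(vocab_size, hidden_dim)  # hidden -> vocabulary scores
        self.b2 = [0.0] * vocab_size

    def forward(self, word_id):
        x = self.E[word_id]
        pre = [sum(w * xi for w, xi in zip(row, x)) + b
               for row, b in zip(self.W1, self.b1)]
        h = [max(0.0, p) for p in pre]          # ReLU
        logits = [sum(w * hi for w, hi in zip(row, h)) + b
                  for row, b in zip(self.W2, self.b2)]
        return x, pre, h, softmax(logits)

    def train_step(self, word_id, target_id, lr=0.1):
        x, pre, h, probs = self.forward(word_id)
        loss = -math.log(probs[target_id])      # cross-entropy loss
        # For softmax + cross-entropy, dLoss/dlogits = probs - one_hot(target).
        dlogits = probs[:]
        dlogits[target_id] -= 1.0
        # Backpropagate through the layers before changing any weights.
        dh = [sum(dlogits[k] * self.W2[k][j] for k in range(len(dlogits)))
              for j in range(len(h))]
        dpre = [g if p > 0 else 0.0 for g, p in zip(dh, pre)]  # ReLU gradient
        dx = [sum(dpre[j] * self.W1[j][i] for j in range(len(dpre)))
              for i in range(len(x))]
        # Gradient descent: move each weight against its gradient.
        for k, g in enumerate(dlogits):
            for j in range(len(h)):
                self.W2[k][j] -= lr * g * h[j]
            self.b2[k] -= lr * g
        for j, g in enumerate(dpre):
            for i in range(len(x)):
                self.W1[j][i] -= lr * g * x[i]
            self.b1[j] -= lr * g
        for i, g in enumerate(dx):
            self.E[word_id][i] -= lr * g
        return loss

# Hypothetical toy data: predict each word from the one before it.
words = "the cat sat on the mat".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(words))}
pairs = [(vocab[a], vocab[b]) for a, b in zip(words, words[1:])]

lm = TinyLM(len(vocab))
losses = []
for epoch in range(200):
    losses.append(sum(lm.train_step(a, b) for a, b in pairs) / len(pairs))
print(f"average loss: first epoch {losses[0]:.3f}, last epoch {losses[-1]:.3f}")
```

Every nested loop here is a matrix operation that a library like PyTorch would hand off to optimized native code, which goes some way toward explaining why training in a plain interpreted language takes so long.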
Once I get a little farther along with it, I look forward to getting it up on GitHub as a standalone project, but also as a ProcessWire module. The slowness of the training process (the model, not me... well okay, probably me too) is the hard part. I'm currently running a 30-hour training run on all the text from a book.

When the project is finished, I'm likely going to have Claude or Codex do a translation of the training code into C, which takes advantage of the much faster math capabilities available there. From what I understand, a 30-hour training run in PHP will take about 30 minutes in C. I don't know if that's accurate or not, but it sounds good enough that I'm going to find out. 🙂

Thanks for reading and have a great weekend!
    4 points
  2. No Num-Pad you say? Then you aren't ready for this... Vortex Core, Cherry MX Brown switches...
    2 points
  3. Vox 1.3.0: Textformatter, forum layout and inline forms inside content

I have updated Vox with a new set of content-friendly integration options.

The main addition is TextformatterVox. It allows editors to place Vox widgets directly inside textarea or rich-text fields, without touching template files. Examples:

  • [[vox:forum]]
  • [[vox:reviews]]
  • [[vox:questions]]
  • [[vox:discussions]]
  • [[vox:all]]

There is also a new inline form mode for editorial pages. You can now insert a compact discussion, question or review form between paragraphs, similar to how media sites place participation blocks inside articles:

  • [[vox:form]]
  • [[vox:discussion-form]]
  • [[vox:question-form]]
  • [[vox:review-form]]

Custom copy is supported:

  • [[vox:form type="question" title="Ask the editors" intro="We will answer useful questions here." button="Send question"]]

This makes it easier to add community interaction exactly where it makes sense in the content flow, instead of only placing a widget at the end of the page.

Other recent additions include a forum-style overview template:

  • [[vox:forum]]

It shows categories, recommended threads, newest discussions, search and a start-discussion form.

The current version is 1.3.0.

Changelog highlights

  • Added TextformatterVox.
  • Added forum-style landing view.
  • Added compact inline forms for discussion, question and review posting.
  • Added Textformatter documentation and install-screen examples.
  • Updated public styles for inline editorial forms.

Repository: https://github.com/mxmsmnv/Vox
    1 point
  4. Hi @Stefanowitsch,

Yes, this is now possible. I added a global Title Format setting in Ichiban settings under:

Settings → Rendering → Title Format

You can use it like this:

  • {meta_title} | {site_name}

or, if you prefer a fixed suffix:

  • {meta_title} | my-company.com

Supported placeholders are:

  • {meta_title}
  • {site_name}
  • {entity_name}
  • {host}

So your examples would render as:

  • Home | my-company.com
  • Our Services | my-company.com

The title format is applied to the rendered <title> tag, and Ichiban’s Audit / Bulk Editor title length checks now include the formatted title length as well. After changing the format, rebuild the audit index so the stored title length checks are refreshed.
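To make the placeholder substitution concrete, here is a small sketch of how a format string like this could be expanded. It is plain Python for illustration only and is not Ichiban's implementation. The `render_title` function is hypothetical, and leaving unknown placeholders untouched is an assumption, not documented behavior.

```python
import re

# The four placeholders named in the post.
SUPPORTED = {"meta_title", "site_name", "entity_name", "host"}

def render_title(fmt, values):
    """Replace each supported {placeholder} in fmt with its value."""
    def replace(match):
        key = match.group(1)
        if key in SUPPORTED and key in values:
            return values[key]
        return match.group(0)  # assumption: unknown placeholders stay as written
    return re.sub(r"\{(\w+)\}", replace, fmt)

page = {"meta_title": "Home", "site_name": "my-company.com"}
title = render_title("{meta_title} | {site_name}", page)
print(title)       # Home | my-company.com
print(len(title))  # the formatted length is what a title length check would measure
```

A fixed suffix works with the same function, since text outside the braces passes through unchanged: `render_title("{meta_title} | my-company.com", {"meta_title": "Our Services"})` gives `Our Services | my-company.com`.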
    1 point
  5. Update (v 0.58.0): tag-vocabulary management & a "Used in" column

Hi, everyone! We've been enjoying developing the module a bit further this week. Since the last update the module has grown from 0.55 to 0.58. Two bigger additions:

🏷️ Tag-vocabulary management

You can now curate a field's predefined tag vocabulary right inside the tag picker, with no trip to the field settings:

  • Rename or delete a predefined tag library-wide, inline (no modal-on-modal).
  • Table rows live-refresh immediately after a library-wide rename/delete.
  • Newly entered tags can be promoted into the field's predefined list. This is only available when the image field's file tags setting is set to "User selects from list of predefined tags + can input their own".
  • Alphabetical ordering, and a mobile-friendly single-column tag chooser with touch controls.

🔎 "Used in": a content-based where-used column

A new column answers the question you actually have at a glance: which pages embed this image? It scans rich-text content (not just the field relations), so it also catches images placed via "Insert from library":

  • Cross-page "Insert from library" embeds are resolved to their true source image, and embeds inside (Matrix)Repeaters are attributed to the owning page.
  • The count is a plain link that opens a dialog listing every page and the fields the image is embedded in.
  • The "Used in" and "Variations" columns are now sortable, and integer columns are centre-aligned.

Next up (in development): nested collections. Group collections into subgroups with a drag-and-drop manager, cascading fly-out menus in the bar, and touch-friendly curation.

Have a nice weekend!

Cheers,
Mike
    1 point
  6. Hi everyone,

Accessibility overlays have a bad reputation, mostly because they're sold as SaaS, phone home to third-party servers, and charge monthly fees for something that should be built-in. Ally is different: self-hosted, MIT, no external requests at runtime.

GitHub: https://github.com/mxmsmnv/Ally

What it does

Adds an accessibility panel to your site's frontend, powered by Sienna (MIT). The JS bundle and OpenDyslexic font ship with the module and are served from your own server; nothing loads from external CDNs at runtime.

  • Font size adjustment
  • Dark, light, and high contrast modes
  • High/low saturation, monochrome
  • Dyslexia-friendly font (OpenDyslexic, bundled locally)
  • Highlight links and headings
  • Letter spacing, line height, bold text
  • Reading guide, stop animations, big cursor
  • 53 languages with auto-detection from html[lang] or browser settings
  • Full ProcessWire multi-language support: maps $user->language to the correct locale automatically
  • Configurable position, offset, button size, and accent color
  • Skips admin pages and Chrome Lighthouse by default
  • No build step: prebuilt JS bundle included

One caveat: the widget is injected via a Page::render hook. If you serve pages through ProCache static HTML, the hook doesn't run on cached pages, so exclude those pages from ProCache if you need the widget there.

Overlay widgets supplement, but do not replace, accessible markup. Use Ally alongside good semantic HTML, not instead of it.

Requirements: ProcessWire 3.0.200+, PHP 8.1+

MIT License.
    1 point