• Content Count

  • Joined

  • Last visited

Community Reputation

44 Excellent

About mtwebit

  • Rank

Recent Profile Visitors

308 profile views
  1. mtwebit

    Thanks for the screencast, @flydev. I really appreciate it. Regarding the pcntl on Windows... Unfortunately, I don't have experience in this field. Windows has no signals (as far as I know) so I'd use a virtualization tool to overcome this limitation. You can try an API/kernel virtualization (e.g. Linux subsys for Windows) or use Docker. Also thanks for the pull req. I'll check your suggestions.
  2. mtwebit

    I was thinking about this too... There was a dev branch that dropped the [file + rules in description] scheme and introduced a fieldset of [rule + (optional) file]. It turned out to be too complicated and it did not work well so I dropped it. An easy solution is to allow source location override. So... see this commit and use the input:location configuration option. Not the best solution as it still requires a (dummy) file to be uploaded (to create the import rules in its description), but it works. You can even use this solution to refer to files uploaded to other pages using this URL scheme: wire://pageid/filename Hope it helps. That's different. It downloads data for a single field (e.g. a file to be stored in a filefield) not for an entire DataSet.
  3. mtwebit

    I've created a set of modules for importing (manipulating and displaying) data from external resources. A key requirement was to handle large (100k+) number of pages easily. Main features import data from CSV and XML sources in the background (using Tasker) purge, update or overwrite existing pages using selectors user configurable input <-> field mappings on-the-fly data conversion and composition (e.g. joining CSV columns into a single field) download external resources (files, images) during import handle page references by any (even numeric) fields How it works You can upload CSV or XML files to DataSet pages and specify import rules in their description. The module imports the content of the file and creates/updates child pages automatically. How to use it Create a DataSet page that stores the source file. The file's description field specifies how the import should be done: After saving the DataSet page an import button should appear below the file description. When you start the import the DataSet module creates a task (executed by Tasker) that will import the data in the background. You can monitor its execution and check its logs for errors. See the module's wiki for more details. The module was already used in three projects to import and handle large XML and CSV datasets. It has some rough edges and I'm sure it needs improvement so comments are welcome.
  4. mtwebit

    Thanks for the feedback. I'll definitely have a look at InputfieldMarkup and wirequeue. It is possible to integrate the UI elements into other pages. There's a renderTaskList() method to perform this: if(wire('modules')->isInstalled("TaskerAdmin")) { $out .= '<h3>Tasks</h3>'; $out .= wire('modules')->get('TaskerAdmin')->renderTaskList($selector); } It can render a task list or a detailed task monitoring div including JS code to show the progressbar or even to execute the task when displaying the page. It needs improvement, however, as its links point to the TaskerAdmin page atm and the list UI is not really customizable.
  5. mtwebit

    Tasker is a module to handle and execute long-running jobs in Processwire. It provides a simple API to create tasks (stored as PW pages), to set and query their state (Active, Waiting, Suspended etc.), and to execute them via Cron, LazyCron or HTTP calls. Creating a task $task = wire('modules')->Tasker->createTask($class, $method, $page, 'Task title', $arguments); where $class and $method specify the function that performs the job, $page is the task's parent page and $arguments provide optional configuration for the task. Executing a task You need to activate a task first wire('modules')->Tasker->activateTask($task); then Tasker will automatically execute it using one of its schedulers: Unix cron, LazyCron or TaskerAdmin's REST API + JS client. Getting the job done Your method that performs the task looks like public function longTask($page, &$taskData, $params) { ... } where $taskData is a persistent storage and $params are run-time options for the task. Monitoring progress, management The TaskerAdmin module provides a Javascript-based front-end to list tasks, to change their state and to monitor their progress (using a JQuery progressbar and a debug log area). It also allows the on-line execution of tasks using periodic HTTP calls performed by Javascript. Monitoring task progress (and log messages if debug mode is active) Task data and log Detailed info (setup, task dependencies, time limits, REST API etc.) and examples can be found on GitHub. This is my first public PW module. I'm sure it needs improvement
  6. I was looking for a module that allows the execution of long-running tasks (working with tens of thousands of pages) and could not find a suitable solution. It started with the import problem. I have lots of data in XML form (20k+ complex entries) that I want to import into ProcessWire. XMLReader() works fine but it takes a very long time to import all data so a simple "upload + process data on page save" would not work. So I've created a new module for this. Meet my first (well, third) PW module: Tasker. It's a simple module that executes long tasks in the background (using Cron or LazyCron) and reports their status to the user using a JQuery progressbar. Any suggestions are welcome. How do you solve similar problems? E.g. which is the best way to delete large number of pages? (max_exec_time will expire so I check it before the delete() call.) $children = $page->children('template='.$this->template.',include=all'); // creates lot of page objects, may not have time to delete all + ...->delete() OR $childIDs = $this->pages->findIDs('parent='.$page->id.',template='.$this->template.',include=all'); // will create page objects later + $pages->get(>delete() Be nice (I know you're always), I'm a PW-newbie, just started working with ProcessWire. I was developing sites with Drupal for a very long time but their non-existent module upgrade path finally has driven me away from it.
  7. mtwebit

    I face a similar problem, 20k+ entries from external sources. XMLReader() is quite fast and has low memory usage. It's also pretty simple if you understand its logic. $xml = new \XMLReader(); $xml->open($file->filename); while($xml->next('tagName')) { if ($xml->nodeType != \XMLReader::ELEMENT) continue; // skip the end element ... process attributes ... $xml->getAttribute('attr') ... inner or outer XML ... $xml->readOuterXML() ... you can even convert it to SimpleXML or DOM }