ProcessWire modules for importing and handling large data sets.

DataSet

DataSet is a set of ProcessWire modules for importing, manipulating, and displaying large data sets (50k+ entries).
The software was developed for the [Mikes-dictionary] and other Digital Humanities projects.

Main features


  • import data from CSV and XML sources
  • user configurable input <-> field mappings
  • on-the-fly field data composition
  • supports downloading external resources (files, images)
  • purge, extend or overwrite existing data (PW pages and their fields)
  • handle page references and option fields
  • fairly low resource requirements (uses Tasker to execute long-running jobs)
  • and many more (filtering, limits, default values etc.)
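
The mapping, composition, default-value and limit features listed above can be pictured with a small, generic sketch. This is not the module's code or its configuration format (the modules themselves run inside ProcessWire; see the wiki for the real settings). The names FIELD_MAP, DEFAULTS, map_row and import_csv, and the sample column names, are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical mapping: target field name -> template built from CSV column names.
# The second entry composes one field from two input columns.
FIELD_MAP = {
    "title": "{headword}",
    "summary": "{headword} ({pos})",
}

# Hypothetical default values used when an input column is empty.
DEFAULTS = {"pos": "unknown"}


def map_row(row, field_map=FIELD_MAP, defaults=DEFAULTS):
    """Fill empty columns with defaults, then compose each target field."""
    # Drop the None key that csv.DictReader uses for surplus columns.
    values = {key: value for key, value in row.items() if key is not None}
    for key, default in defaults.items():
        if not values.get(key):
            values[key] = default
    return {field: template.format(**values) for field, template in field_map.items()}


def import_csv(text, limit=None):
    """Map CSV rows to field dictionaries, stopping after `limit` rows if given."""
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for index, row in enumerate(reader):
        if limit is not None and index >= limit:
            break
        records.append(map_row(row))
    return records


SAMPLE = "headword,pos\nalma,noun\nfut,\n"
records = import_csv(SAMPLE)
```

With the sample input, the second row has an empty pos column, so its summary field is composed using the default value instead.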

How to use it


See the wiki.

Important notice


This module is under development.
It is now considered fairly stable, but things may still break and the internal API may change at any time.

History


The first version was created in 2017 to import a large XML dataset into ProcessWire pages.
The CSV import sub-module was created in 2018. It was tested by importing a large data set containing 200k+ entries and many kinds of references between them.
The CSV + PDF import was developed in 2019 to create a complete digital library using a single CSV upload.

License


The "github-version" of the software is licensed under MPL 2.0.

Install and use modules at your own risk. Always have a site and database backup before installing new modules.
