ryan

Module: Import Pages from CSV file

Here is a new module for ProcessWire 2.1 that imports pages from a CSV file. By default it creates new pages from the data in the CSV file, but you can also configure it to modify existing pages (those that have the same title).

Please give it a try and let me know how it works for you and whether you run into any issues with it. This module is something I've had in the works for a while and regularly use on various projects, so I figured I should clean it up a bit and release it. Also attached are a couple of screenshots from it.

How to Install:

1. Download from: https://github.com/r.../ImportPagesCSV

2. Place the file ImportPagesCSV.module in your /site/modules/ directory.

3. In the ProcessWire admin, click 'Modules' and then 'Check for new modules'.

4. Click 'install' next to the 'Import Pages CSV' module (under heading 'Import').

Following that, you'll see a new menu option for this module on your Admin > Setup menu.

Supported field types for importing:*

  • PageTitle
  • Text
  • Textarea (including normal or TinyMCE)
  • Integer
  • Float
  • Email
  • URL
  • Checkbox (single)

*I'll be adding support for multi-value, page-reference and file-based Fieldtypes in a future version.

post-1-132614278648_thumb.png

post-1-13261427868_thumb.png


Very cool module! Thanks a lot Ryan! I'll have a look and try out soon.


Looks great and will definitely come in use! Will test and learn from the code later on.


Looking great! Seems like a very useful module. Maybe even something to include by default, because most if not all projects I can think of require you to do some importing.

I have made a reference to this thread/module over here: http://processwire.com/talk/index.php/topic,24.0.html

Because my code-reading skills suck, I do have a question: in 'Step 2', how do you come up with the list of fields found in the CSV? I'm guessing you need to have a header line in your CSV file?


Yep, it requires a header row: "The list of field names must be provided as the first row in the CSV file."

Stupid, I totally missed that.


I could modify it to not require a header row. I guess I just figured every CSV file I've ever come across had one.


Usually they do. And since the import works by uploading a file, anyone can add those titles to the file.

I think it's fine to require the header row. Without it there would be no way to do Step 2 in its current form.

If a header row is missing you can easily add it, because you have control over the file you choose to use.
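For a file that arrives without a header row, adding one before upload takes only a few lines. Here is a minimal Python sketch, separate from the module itself. The function name `add_header_row`, the column names and the sample rows are all made up for illustration; use whatever names make sense for your own fields.

```python
import csv
import io

def add_header_row(csv_text, field_names):
    """Return csv_text with field_names prepended as the first row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Stop early if any data row does not match the header's column count
    for number, row in enumerate(rows, start=1):
        if len(row) != len(field_names):
            raise ValueError(
                f"row {number} has {len(row)} columns, expected {len(field_names)}"
            )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(field_names)  # the header row the import expects first
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical headerless data with two columns
headerless = "Hotel Stockholm,A short summary\nHotel Malmo,Another summary\n"
with_header = add_header_row(headerless, ["title", "summary"])
print(with_header)
```

The column-count check is there so a mismatched header is caught before the file is uploaded rather than during the import.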


Ryan,

How did you know I would need something like this today?

I came to this forum to ask a question about creating pages from a feed and here it is... (OK, I was focused on an XML feed, but CSV will work as well, I guess.)

Thanks!

/Jasper


Hi!

I just started testing this module and it works great to import a product feed. I love it!

The only minor thing I noticed concerns images: my product feed contains URLs to an image, but I can't select the image fields in the import module.

Is there a reason for this? I tried to manually add a URL into the field_image and it worked great, it saved a copy (or multiple copies in different sizes) of the image on my server.

/Jasper

The following is written in Ryan's opening post:

Supported field types for importing:*

  • PageTitle
  • Text
  • Textarea (including normal or TinyMCE)
  • Integer
  • Float
  • Email
  • URL
  • Checkbox (single)

*I'll be adding support for multi-value, page-reference and file-based Fieldtypes in a future version.

So I guess you'll have to wait for a future version (I don't know if Ryan has this planned anytime soon), or you could have a look at the code and possibly add something yourself.


Oops, I missed that the image field wasn't listed. Sorry about that. :-\

I am not sure if my coding skills are sufficient to add the image field import myself. But it's of course worth a try.  :)

A workaround could of course be using the URL field type until Ryan releases a future version.

/Jasper


I don't think a multi-image field will work, but a single image field may very well work if you want to try it.

To try it, back up your PW database and site first. If it doesn't work, you want to be able to restore to where you were. Chances are you won't have to restore any files, but you can never be too safe.

Next, make sure you are dealing with a single image field. Edit your field (in Setup > Fields) and make sure its "max number of files" is set to "1".

Next, edit the ImportPagesCSV.module file, locate this array (near the top), and add the two lines indicated at the bottom.

<?php
// Fragment from the ImportPagesCSV class, near the top of ImportPagesCSV.module
protected $fieldtypes = array(
        'FieldtypePageTitle',
        'FieldtypeText',
        'FieldtypeTextarea',
        'FieldtypeInteger',
        'FieldtypeFloat',
        'FieldtypeEmail',
        'FieldtypeURL',
        'FieldtypeCheckbox',
        'FieldtypeFile',  // add this line
        'FieldtypeImage', // add this line
        );

Save, and try it out. Let us know if it worked?

It didn't really work. It looks like the page needs to be created before an image can be added.

I received the following error:

TemplateFile: New page '/shs/verblijf-in-stockholm/0/' must be saved before files can be accessed from it
#0 C:\xampp\htdocs\shs\wire\core\PagefilesManager.php(133): PagefilesManager->path()
#1 C:\xampp\htdocs\shs\wire\core\PagefilesManager.php(43): PagefilesManager->createPath()
#2 C:\xampp\htdocs\shs\wire\core\PagefilesManager.php(32): PagefilesManager->init(Object(Page))
#3 C:\xampp\htdocs\shs\wire\core\Page.php(1132): PagefilesManager->__construct(Object(Page))
#4 C:\xampp\htdocs\shs\wire\core\Pagefiles.php(63): Page->filesManager()
#5 C:\xampp\htdocs\shs\wire\core\Pagefiles.php(47): Pagefiles->setPage(Object(Page))
#6 C:\xampp\htdocs\shs\wire\modules\Fieldtype\FieldtypeImage.module(33): Pagefiles->__construct(Object(Page))
#7 C:\xampp\htdocs\shs\wire\core\Fieldtype.php(289): FieldtypeImage->getBlankValue(Object(Page), Object(Field))
#8 C:\xampp\htdocs\shs\wire\core\Page.php(523): Fieldtype->getDefaultValue(Object(Page), Object(Field))
#9 C:\xampp\htdocs\shs\wire\core\Page.php(467): Page->getFieldValue('hotel_image')
#10 C:\xampp\htdocs\shs\wire\core\Page.php(364): Page->get('hotel_image')
#11 C:\xampp\htdocs\shs\wire\core\Page.php(308): Page->setFieldValue('hotel_image', 'http://images.t...', true)
#12 C:\xampp\htdocs\shs\site\modules\ImportPagesCSV.module(385): Page->set('hotel_image', 'http://images.t...')
#13 C:\xampp\htdocs\shs\site\modules\ImportPagesCSV.module(346): ImportPagesCSV->importPage(Array, Object(InputfieldForm))
#14 C:\xampp\htdocs\shs\site\modules\ImportPagesCSV.module(126): ImportPagesCSV->processForm2(Object(InputfieldForm))
#15 [internal function]: ImportPagesCSV->___executeFields()
#16 C:\xampp\htdocs\shs\wire\core\Wire.php(267): call_user_func_array(Array, Array)
#17 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('executeFields', Array)
#18 C:\xampp\htdocs\shs\wire\core\ProcessController.php(194): Wire->__call('executeFields', Array)
#19 C:\xampp\htdocs\shs\wire\core\ProcessController.php(194): ImportPagesCSV->executeFields()
#20 [internal function]: ProcessController->___execute()
#21 C:\xampp\htdocs\shs\wire\core\Wire.php(267): call_user_func_array(Array, Array)
#22 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('execute', Array)
#23 C:\xampp\htdocs\shs\wire\core\admin.php(42): Wire->__call('execute', Array)
#24 C:\xampp\htdocs\shs\wire\core\admin.php(42): ProcessController->execute()
#25 C:\xampp\htdocs\shs\wire\templates-admin\controller.php(13): require('C:\xampp\htdocs...')
#26 C:\xampp\htdocs\shs\site\templates\admin.php(13): require('C:\xampp\htdocs...')
#27 C:\xampp\htdocs\shs\wire\core\TemplateFile.php(88): require('C:\xampp\htdocs...')
#28 [internal function]: TemplateFile->___render()
#29 C:\xampp\htdocs\shs\wire\core\Wire.php(267): call_user_func_array(Array, Array)
#30 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('render', Array)
#31 C:\xampp\htdocs\shs\wire\modules\PageRender.module(194): Wire->__call('render', Array)
#32 C:\xampp\htdocs\shs\wire\modules\PageRender.module(194): TemplateFile->render()
#33 [internal function]: PageRender->___renderPage(Object(HookEvent))
#34 C:\xampp\htdocs\shs\wire\core\Wire.php(267): call_user_func_array(Array, Array)
#35 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('renderPage', Array)
#36 C:\xampp\htdocs\shs\wire\core\Wire.php(289): Wire->__call('renderPage', Array)
#37 C:\xampp\htdocs\shs\wire\core\Wire.php(289): PageRender->renderPage(Object(HookEvent))
#38 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('render', Array)
#39 C:\xampp\htdocs\shs\wire\modules\Process\ProcessPageView.module(73): Wire->__call('render', Array)
#40 C:\xampp\htdocs\shs\wire\modules\Process\ProcessPageView.module(73): Page->render()
#41 [internal function]: ProcessPageView->___execute()
#42 C:\xampp\htdocs\shs\wire\core\Wire.php(267): call_user_func_array(Array, Array)
#43 C:\xampp\htdocs\shs\wire\core\Wire.php(229): Wire->runHooks('execute', Array)
#44 C:\xampp\htdocs\shs\index.php(170): Wire->__call('execute', Array)
#45 C:\xampp\htdocs\shs\index.php(170): ProcessPageView->execute()
#46 {main}

/Jasper


Sorry, I neglected to think about that before (how pages have to be created before you can add images to them). One possible way around it is to try to first import without the image field, then import the same spreadsheet again but with the image field (updating the existing pages that were created). You'd need to choose the option to "modify existing page." But it's possible that might work.


I tried that as well, but it didn't work.  :(

Even when the pages exist, the import module will not add the images.

The error/warning is exactly the same as in my previous reply.

/Jasper


Jasper, I didn't want to leave you empty-handed, especially after you've tried this a few times and my suggestions didn't work. Here's an updated version of the ImportPagesCSV module that supports file and image field importing. It supports both single and multi-file fields, so there aren't any limitations in that area.

https://github.com/ryancramerdesign/ImportPagesCSV

To import a multi-file field, place each filename or URL on its own line in your spreadsheet, OR separate each by a pipe "|", OR separate each by a tab (you decide). Basically, you are delimiting the filenames/URLs within the field. In my own tests, I used the pipe "|" to separate the URLs and all seemed to work well. Of course, if there is only one image/file, you don't need anything other than the filename or URL (no delimiter necessary).
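To make the delimiting concrete, here is a small Python sketch of how one multi-file cell could be split into individual filenames or URLs. It is an illustration only and not the module's actual code (the module is written in PHP); the function name `split_file_values` and the example URLs are made up.

```python
import re

def split_file_values(cell):
    """Split one CSV cell into filenames/URLs on newline, pipe, or tab."""
    parts = re.split(r"[\r\n|\t]+", cell)
    # Drop empty pieces and any surrounding spaces
    return [part.strip() for part in parts if part.strip()]

# Pipe-delimited, as in the tests described above
print(split_file_values("http://example.com/a.jpg|http://example.com/b.jpg"))
# Newline- and tab-delimited values in the same cell
print(split_file_values("a.jpg\nb.jpg\tc.jpg"))
# A single value needs no delimiter
print(split_file_values("single.jpg"))
```

When building the spreadsheet from a script, the reverse step is just `"|".join(urls)` for each row's list of URLs.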

I ended up changing quite a bit of code, so please let me know if you run into any error messages or anything – it may not be perfect yet, but hopefully close.


Wow, that's great Ryan. Thank you!

I am going to try it later today and will tell you how it worked.

/Jasper


It works great. I hoped to be able to add one image, but suddenly I can add even more.

Thanks Ryan

(I owe you a beer  :D)


I have found the import CSV module to be really useful, and it has become a big part of my site-building workflow with ProcessWire. It's much easier to collect and manage lots of data in Excel / Numbers and then import it directly into ProcessWire. This is also great for updating content with a re-import.

I think data journalists and data hackers will find this sort of functionality very useful for doing quick data visualisation mashups (especially when you add on the friendly PW API).

I think content strategists will like this workflow, as most of them use Excel for collecting and organising site content at the start of projects.

And of course the clients will like it. This will be a nice way to start a new project: once the basic site structure is agreed, we can give the client an Excel template (or Google Doc) and tell them to start adding content. Wouldn't it be great to start a project with actual real content!

I hope you continue adding support for other PW fields like page references and dates (I use pages extensively for category management).

Could this module also work in the opposite direction and export the data to CSV?

Then we just need to work out a way to import / export fields and templates :) As much as I admire the ProcessWire interface, I still think managing data / settings / fields is much quicker and easier in text files, at least at the beginning of a project.

ProcessWire - a CMS that gets out of the way.


This module looks like exactly what I need for migrating a couple of larger sites to PW. Would it be possible to hack this for adding users as well?

Thanks,

Stephen


Thanks for the message, mjmurphy. Glad that you like this module. I use it quite a bit myself too. I will definitely be adding support for more fieldtypes to it. Actually, I think dates and page references are the only two that we don't support yet. Adding dates will be easy, but page references are a little more complex. However, I need page references functional in the near future, so I will likely be adding both of those types soon.

An ExportCSV module is also planned for sure.

"Then we just need to work out a way to import / export fields and templates"

This won't be a module; it is already planned for the core (likely in 2.3/2.4). This feature was in PW1, but just hasn't made it into PW2 yet. But it's a very useful thing to have, I agree.

"This module looks like exactly what I need for migrating a couple of larger sites to PW. Would it be possible to hack this for adding users as well?"

I think it should work now, though I haven't tried. It looks like I need to add support for FieldtypePassword (another one I missed) if you want to import passwords. That should be easy to add, though, so I've added it to my list.


I want to import some users. So I chose "user" as "Template to use for imported pages" and "Users" as "parent page". But after I upload the .csv and have to choose the matching fields, I can't choose "title". The only option is "email". And if I submit without a title, there is an error saying "Unable to import page because it has no required 'title' field or it is blank."

So how can I import users?

Greets,

Nico


I haven't actually tried to import users yet, but it should be possible. Go ahead and add the 'title' field to your user template and populate it with something (email address?) to see if that makes the import happy. I plan to make the import a little smarter in this regard in the next update.
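If the CSV itself has no title column, one can be added before upload. Here is a minimal Python sketch that copies each row's email into a new `title` column. It assumes the file has an `email` column; the `name` column, the sample row and the function name `add_title_from_email` are hypothetical, and whether the import then accepts the users is untested here.

```python
import csv
import io

def add_title_from_email(csv_text):
    """Return csv_text with a leading 'title' column copied from 'email'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if "email" not in reader.fieldnames:
        raise ValueError("expected an 'email' column in the header row")
    if "title" in reader.fieldnames:
        raise ValueError("the file already has a 'title' column")
    fieldnames = ["title"] + list(reader.fieldnames)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["title"] = row["email"]  # reuse the email address as the title
        writer.writerow(row)
    return out.getvalue()

# Hypothetical user export with a header row
users = "name,email\nnico,nico@example.com\n"
result = add_title_from_email(users)
print(result)
```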
