Jump to content
johnstephens

Importing content from Textpattern

Recommended Posts

I'd like to read up on importing content from Textpattern into a fresh ProcessWire installation. I know there are more than a few Textpatrons here: Is there a tutorial or thread or blog post covering this?

I've used Textpattern for years. I'm very familiar with it's database schema and pretty comfortable exporting and manipulating it with MySQL. ProcessWire's schema uses a totally different paradigm, and I'm not confident that I could simply dump the data and import it into ProcessWire with the same facility.

I'm happy with Textpattern for most of the sites in which I use it, but there are a few that I think using ProcessWire would be a significant boon.

Thanks in advance!

Share this post


Link to post
Share on other sites

You could always use an bootstrap script, which does connect to the Textpattern database, reads the data by mysql queries and then use the processwire api to fill those data into the pages you need.

  • Like 4

Share this post


Link to post
Share on other sites

Or you could build a module like MigratorWordpress: https://github.com/NicoKnoll/MigratorWordpress that gets the Textpattern content and creates JSON that can be imported into PW.

While it might be a little more work to build, it would be useful for other future users coming from TextPattern.

  • Like 4

Share this post


Link to post
Share on other sites

Hey johnstephens,

How large/complex is the Textpattern site? I moved a bunch of sites a few years back, and they were all pretty simple transfers. I didn't try to write any kind of importer, I was done with the transfers manually faster than I could write the importer.

  • Like 1

Share this post


Link to post
Share on other sites

Concepts like "bootstrap script" and writing my own importer were a bit much for me when I initially posted this question, but now that I'm facing a real need I might have to wrap my mind around them.

 

On 11/5/2015 at 8:53 AM, renobird said:

Hey johnstephens,

How large/complex is the Textpattern site? I moved a bunch of sites a few years back, and they were all pretty simple transfers. I didn't try to write any kind of importer, I was done with the transfers manually faster than I could write the importer.

At the time I wrote, I didn't have a specific site in mind to convert, but now I do. It's one of my more complex Textpattern sites, but it's not insane. Here's what I have:

- 7 major content "sections", some of which would be nested in a ProcessWire site. I also have some utility sections used for things like search and site map.
- 660 "articles", most of which belong to a knowledge base.
- 39 categories in two major divisions. Each article in the knowledge base belongs to two categories.
- The site uses Textpattern's MLP plugin/hack to offer content in two languages. The 660 article IDs include the renditions in both languages. On the front end, that means you can click on a language link for any article and view it's rendition in the other language. The URL changes from domain.tld/en/{section}/{url-title} domain.tld/es/{section}/{url-title}.
- The site uses 15 custom fields, with the glz_custom_fields plugin offering support for different field types. Articles in different sections use different combinations of the custom fields, but all the custom field data lives in the same table with the articles.

I think that covers all the complicating factors.

In a vanilla Textpattern installation, all the article content lives in one table, and I suppose importing it would involve some way of mapping the columns of that table to pages and fields in ProcessWire, with the "section" field designating the rootParent page. But this site has the added convolution of a localization table that maps articles in both languages so they can be linked appropriately.

Tom, when you say you transferred the content manually, do you mean you opened the article in Textpattern's editor (the Write tab) and simply copied and pasted each field to ProcessWire? Is that the method you'd recommend for this site, or is there some way of automating it that you'd suggest?

Thanks in advance for any guidance you can offer!

Share this post


Link to post
Share on other sites

Bootstrapping means 'require'-ing ProcessWire website from elsewhere, like this:

<?php
  require_once '/absolute/path/to/your/PW/site/index.php';

  foreach( \ProcessWire\wire('pages')->get('/')->children() as $ch ) {
    echo "{$ch->title}\n";
  }

I've just tried this so I won't lie to you: I literally just created this in in my home directory, ran it in terminal with PHP and got a list of page titles in the console.

So by writing your own importer, people mean something like this:

<?php require_once '/path/to/processwire/index.php'; 

$wire = \ProcessWire::wire();

// get stuff you want to import
$articlesToImport = \OldCMS\DB::query($oldSiteDB, 'select * from articles join whatever');

// iterate over it and create pages for it
foreach($articlesToImport->next() as $oldArticle){
	$newArticle = new \ProcessWire\Page();
	$newArticle->template = 'article';
	$newArticle->parent = $wire->get('/articles/');
	$newArticle->title = $oldArticle->get('title');
	$newArticle->body = $oldArticle->get('html');
	$newArticle->save();
}

And that's it. You've just imported some articles from your old CMS to your brand new ProcessWire installation.

Of course, on "real" pages you'll want to think about this for a bit; Maybe create pages for authors first, some categories, some tags, and then maybe add the pages in parts, so you don't timeout mid-import, or turn off timeout.

  • Like 2

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By jds43
      Hello,
      Does anyone have experience with migrating content from Django to Processwire? Or are there any suggestions for achieving this?
    • By maba
      Hello,
      I need to import regularly - every 15 or 30 days - a big .xslx file into my PW installation.
      This file now has 14 columns, 5.000 rows and grows every month.
      I'll need to group, order and work with these data to:
      analyse User monthly costs analyse User costs per Asset ... User (real AD account) has to match with a PW user - I can't join to the domain - but as you can see I have some services users (start with sca_*) or no user at all. Those rows have to be assigned to a specific user, e.g. account100.
      And:
      I would like to be able to have a kind of diff function to compare User assets between this and last month (and so on) other request is to have a notification when something change for a User between actual and latest import First request: which is the best solution to store those data in your opinion? Page, Table, Repeater Matrix, ...?
      Those are very repetitive data and I think a page reference is better than to import all the data every time but I have to understand how to manage those "dynamic" groups of software (AccType Det), hardware (Asset), ... For example Price will be imported and not stored with the description because it could be change in the future and I'll not have any control on it.
      Thanks!
      User,OE,productNmr,AccType1,AccType Det,Count,Price (€),Sum,ASNA,CC,AccType Info,Asset,AccGroup,,,,,,,,,,,,,
    • By dragan
      Is it by design that a site/ready.php is not included when creating a new site profile? Is it possible to include it with a hook? Or are there any security thoughts? (I don't want to redistribute it in public, it's just so I have my own boilerplate)
    • By Brawlz
      Hi,
      I hope this is the correct section for my problem.
      All I need is a connection to an external Database and a query gettings some data. I do this in a processwire Page-Template. I am honestly not sure if it is a problem with processwire or my code:
      $host = ‚XXXXX’; $user = ‚XXXXX‘; $pass = ‚XXXXX‘; $db = ‚XXXXX‘; $port = ‚3306‘; $mydb = new Database($host, $user, $pass, $db , $port);  $result = $mydb->query("SELECT * FROM char“);  while($row = $result->fetch_assoc()) {  print_r($row);  }  
      Produces the following error:
      Error: Exception: DB connect error 2002 - Connection timed out (in /customers/9/4/e/XXXX.de/httpd.www/wire/core/Database.php line 79)
       
      I also tried connecting without the $port variable but got the same error.
    • By Mobiletrooper
      Hey Ryan, hey friends,
      we, Mobile Trooper a digital agency based in Germany, use ProcessWire for an Enterprise-grade Intranet publishing portal which is under heavy development for over 3 years now. Over the years not only the user base grew but also the platform in general. We introduced lots and lots of features thanks to ProcessWire's absurd flexibility. We came along many CMS (or CMFs for that matter) that don't even come close to ProcessWire. Closest we came across was Locomotive (Rails-based) and Pimcore (PHP based).
      So this is not your typical ProcessWire installation in terms of size.
      Currently we count:
      140 Templates (Some have 1 page, some have >6000 pages)
      313 Fields
      ~ 15k Users (For an intranet portal? That's heavy.)
      ~ 195 431 Pages (At least that's the current AUTOINCREMENT)
       
      I think we came to a point where ProcessWire isn't as scalable anymore as it used to be. Our latest research measured over 20 seconds of load time (the time PHP spent scambling the HTML together). That's unacceptable unfortunately. We've implemented common performance strategies like:
      We're running on fat machines (DB server has 32 gigs RAM, Prod Web server has 32gigs as well. Both are running on quadcores (xeons) hosted by Azure.
      We have load balancing in place, but still, a single server needs up to 20 sec to respond to a single request averaging at around about 12 sec.
      In our research we came across pages that sent over 1000 SQL queries with lots of JOINs. This is obviously needed because of PWs architecture (a field a table) but does this slow mySQL down much? For the start page we need to get somewhere around 60-80 pages, each page needs to be queried for ~12 fields to be displayed correctly, is this too much? There are many different fields involved like multiple Page-fields which hold tags, categories etc.
      We installed Profiler Pro but it does not seem to show us the real bottleneck, it just says that everything is kinda slow and sums up to the grand total we mentioned above.
      ProCache does not help us because every user is seeing something different, so we can cache some fragments but they usually measure at around 10ms. We can't spend time optimising if we can't expect an affordable benefit. Therefore we opted against ProCache and used our own module which generates these cache fragments lazily. 
      That speeds up the whole page rendering to ~7 sec, this is acceptable compared to 20sec but still ridiculously long.
      Our page consists of mainly dynamic parts changing every 2-5 minutes. It's different across multiple users based on their location, language and other preferences.
      We also have about 120 people working on the processwire backend the whole day concurrently.
       
      What do you guys think?
      Here are my questions, hopefully we can collect these in a wiki or something because I'm sure more and more people will hit that break sooner than they hoped they would:
       
      - Should we opt for optimising the database? Since >2k per request is a lot even for a mysql server, webserver cpu is basically idling at that time.
      - Do you think at this point it makes sense to use ProcessWire as a simple REST API?
      - In your experience, what fieldtypes are expensive? Page? RepeaterMatrix?
      - Ryan, what do you consider as the primary bottleneck of processwire?
      - Is the amount of fields too much? Would it be better if we would try to reuse fields as much as possible?
      - Is there an option to hook onto ProcessWires SQL builder? So we can write custom SQL for some selectors?
       
      Thanks and lots of wishes,
      Pascal from Mobile Trooper
       
       
×
×
  • Create New...