Hi there,

I've got some code that bulk uploads the contents of tab-delimited files into our ProcessWire CMS.

It looks like the $page->save() call stops working after a certain number of rows have been processed.

Haven't worked out where it starts failing yet but all the rows towards the bottom have certainly failed.

Unfortunately, there isn't anything in the error logs (at least the php error log) to see if there is an issue.

I have tested the rows that were not working by putting them individually into separate TSV files. That proved that the code works, the data is clean and the CMS actually gets updated.

I was wondering if there's an internal limit on how many page saves the CMS can handle?
 
Thanks in advance.
 
Below is the copy of the code I'm using to do a bulk upload.
I'm using a 3rd party module called SimpleExcel to help with the reading and processing of the tab delimited files.
 
        $_rowStart = 13; // first data row in the file

        // Column numbers, with the spreadsheet column letter in the comment
        $_colTitle     = 1;  // A
        $_colMeta      = 10; // J  Meta Title = SEO Title
        $_colDesc      = 11; // K
        $_colURL       = 12; // L
        $_colURLnew    = 13; // M
        $_colCanonical = 14; // N
        $_colAlt       = 15; // O
        $_colKeywords  = 16; // P

        $excel = new SimpleExcel('TSV');
        $excel->parser->loadFile(__DIR__ . "/{$file}");

        $row = $_rowStart;
        while (true)
        {
            $url = $excel->parser->getCell($row, $_colURL);

            // Stop at the first row without a URL, before trying to process it.
            // Note: a row that reads back blank (e.g. broken data) also ends the import here.
            if (IsNullOrEmptyString($url))
            {
                break;
            }

            $title     = $excel->parser->getCell($row, $_colTitle);
            $meta      = $excel->parser->getCell($row, $_colMeta);
            $desc      = $excel->parser->getCell($row, $_colDesc);
            $urlnew    = $excel->parser->getCell($row, $_colURLnew);
            $canonical = $excel->parser->getCell($row, $_colCanonical);
            $alt       = $excel->parser->getCell($row, $_colAlt);
            $keywords  = $excel->parser->getCell($row, $_colKeywords);

            // Turn the absolute URL into a site-relative path
            $internal = str_replace("http://sprachspielspass.de", '', $url);

            // The API variable name must be quoted: wire('pages'), not wire(pages)
            $page = wire('pages')->get("path=$internal");

            //TODO: Error handling and logging
            if (!IsNullPage($page))
            {
                $page->setOutputFormatting(false);

                $page->title = $title;
                $page->seo_title = $meta;
                $page->seo_description = $desc;
                // url is normally derived from the page name and parent,
                // so check that this assignment really persists
                $page->url = $urlnew;
                $page->seo_canonical = $canonical;
                $page->image_alt = $alt;
                $page->seo_keywords = $keywords;

                $page->save();

                WriteLog("{$internal} saved");
            }
            else
            {
                WriteLog("{$internal} NOT FOUND!");
            }

            $row++;
        }
Edited by LostKobrakai
Please use the code blocks the forum does provide


Are you calling this from the command line or via http? In the latter case, my guess would go towards script execution time constraints.

Are you getting a "saved" message for those failed pages?

I don't think there's a limit for the number of page saves. I know I've imported a few thousand pages in one go in my last project, with each import resulting in more than two page saves.


I've tried setting ini_set('max_execution_time', 600);

Unfortunately it hasn't fixed my issue.

I will try and add more debugging info into the code and try and get to the bottom of it.

I did a rough count, and the number of rows in the logs does not match the number of rows in the TSV files.

I will have to take a closer look.

Thanks for the heads-up about the script execution time and the row count.
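
One way to narrow down which rows never reached the log is to compare the paths from the TSV file against the log lines. This is only a rough sketch: it assumes the log contains exactly the "{path} saved" and "{path} NOT FOUND!" messages written by the import script above, and that both the paths and the log lines have already been loaded into arrays. pathsMissingFromLog is a made-up helper name, not part of ProcessWire or SimpleExcel.

```php
<?php
/**
 * Return the paths that have no matching entry in the log.
 * Hypothetical helper; the log format is assumed from the WriteLog() calls above.
 */
function pathsMissingFromLog(array $paths, array $logLines): array
{
    $logged = [];
    foreach ($logLines as $line) {
        // Strip the suffix that the import script appends to each path.
        $path = preg_replace('/ (saved|NOT FOUND!)$/', '', trim($line));
        $logged[$path] = true;
    }

    $missing = [];
    foreach ($paths as $path) {
        if (!isset($logged[$path])) {
            $missing[] = $path;
        }
    }
    return $missing;
}

// Example data (invented for illustration)
$paths = ['/spiele/', '/lernen/', '/kontakt/'];
$log   = ['/spiele/ saved', '/lernen/ NOT FOUND!'];

$missing = pathsMissingFromLog($paths, $log);
// $missing is ['/kontakt/']
```

Any path this returns was never attempted at all, which points at the reading of the file rather than at the page save.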


Try setting the max execution time within the loop itself, inside your if block:
 

$page->setOutputFormatting(false);
set_time_limit(60); 

 
If memory is a problem, you might also want to free some memory by adding the following after WriteLog.

WriteLog("{$internal} saved");
wire('pages')->uncacheAll(); // free some memory 

Unfortunately, it seems to be an encoding issue.

The file was encoded in UTF8-BOM which has been fine reading German Umlaut characters up until now.

But it seems the newest entries are returning blanks even though there are entries in the tab-delimited file.
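
To rule the BOM in or out, the file contents can be stripped of it before parsing. A minimal sketch in plain PHP, independent of SimpleExcel; stripUtf8Bom is a hypothetical helper name, and the sample strings are invented.

```php
<?php
/**
 * Remove a leading UTF-8 byte order mark (bytes EF BB BF) if present.
 * Hypothetical helper, not part of ProcessWire or SimpleExcel.
 */
function stripUtf8Bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($text, $bom, 3) === 0) {
        return substr($text, 3);
    }
    return $text;
}

// Umlauts after the BOM are left untouched; only the three BOM bytes go.
$withBom = "\xEF\xBB\xBFTitel\tÜbung";
$clean   = stripUtf8Bom($withBom);
// $clean is "Titel\tÜbung"
```

If the blanks still appear after the BOM is removed, the encoding is probably not the cause.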


It turned out to be a line break in some of the rows which I couldn't spot with the various text editors I was using .... Aaargh .... I'll probably have to strip them out before I process the file again.
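
For anyone hitting the same thing: a stray line break splits one record into two physical lines, each with fewer tabs than a complete row, so counting columns per line can make the broken rows visible. A rough sketch in plain PHP, assuming every complete row has the same number of columns and that fields are not quoted; findSuspectLines and the sample data are made up for illustration.

```php
<?php
/**
 * Return the 1-based line numbers whose tab-separated column count
 * differs from $expectedColumns.
 * Hypothetical helper, not part of ProcessWire or SimpleExcel.
 */
function findSuspectLines(string $tsv, int $expectedColumns): array
{
    $suspect = [];
    // Split on any line-ending style: \r\n, \n or a lone \r.
    $lines = preg_split('/\r\n|\n|\r/', $tsv);
    foreach ($lines as $index => $line) {
        if ($line === '') {
            continue; // skip blank lines, e.g. after the final line break
        }
        if (count(explode("\t", $line)) !== $expectedColumns) {
            $suspect[] = $index + 1;
        }
    }
    return $suspect;
}

// Three-column sample in which the second record is broken across lines 2 and 3.
$sample  = "a\tb\tc\nd\te\nf\ng\th\ti\n";
$suspect = findSuspectLines($sample, 3);
// $suspect is [2, 3]
```

Splitting on a lone \r as well is deliberate, since some editors hide that kind of line break.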

Again, thanks for the prompt responses.

You guys are the best!

