Jason Huck

September 16, 2016

I have a project that uses a variant of the "main.inc" template strategy. All templates are set to use a single "main.php" file. That file uses output buffering to include page-specific views and insert them into a "base" template. Generally, this works great and allows me to structure my files exactly the way I want. However, I've found that if I want to manually throw a Wire404Exception, it just bubbles up uncaught rather than being handled.

The 404 handler works fine when I'm not throwing the exception manually, but fails when I am, specifically when I'm dealing with urlSegments. I'm not that familiar with PHP's exception handling in general or ProcessWire's in particular, so I'm not sure how to troubleshoot further.

Some abbreviated pseudo-code to illustrate:

main.php:

ob_start();
include('./views/'.$view.'/'.$view.'.inc');
$layout = ob_get_clean();

ob_start();
include('./views/base/base.inc');
$template = ob_get_clean();

echo $template;

With this setup, if I try to throw a Wire404Exception within any "view", the exception isn't handled, so logged-in admins see the error trace and regular users get an internal server error. For example:

view.inc:

if($input->urlSegment1){
    // look for stuff...
    if(!$match){
        throw new Wire404Exception();
    }
}

Any thoughts?
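
One thing that might be worth ruling out, and this is only a guess, is whether the open output buffer gets in the way when the exception fires mid-include. Below is a minimal sketch in plain PHP that discards the partial buffer and rethrows, so whatever handler sits further up still receives the exception. NotFoundException and captureView() are hypothetical names used for illustration, not part of the ProcessWire API, and I have not confirmed that this resolves the problem described above.

```php
<?php
// Hypothetical stand-in for Wire404Exception so the sketch runs outside ProcessWire.
class NotFoundException extends Exception {}

// Run $render inside an output buffer. If it throws, discard any partial
// output captured so far and rethrow, so an outer handler still sees it.
function captureView(callable $render) {
    $level = ob_get_level(); // remember the nesting depth before we start
    ob_start();
    try {
        $render();
        return ob_get_clean();
    } catch (Exception $e) {
        // close only the buffers opened since this function was entered
        while (ob_get_level() > $level) {
            ob_end_clean();
        }
        throw $e;
    }
}
```

In main.php the two ob_start()/include/ob_get_clean() groups would each become a call along the lines of captureView(function() use ($view) { include('./views/'.$view.'/'.$view.'.inc'); }), with the caveat that variables such as $input would need to be passed into the closure.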

September 13, 2016

3 minutes ago, Mike Rockett said:

That's the problem right there - the request should not be http404, which is the built in 404 page. That should show the original request. As it isn't doing that, it means that something is redirecting to the 404 page before Jumplinks can do its thing. Perhaps there's another module hooking to the 404 event in play? Template code, maybe?

Okay, that makes perfect sense, and gives me something to go on. I can't think of anything off the top of my head that would be interfering, but I'll have a fresh look. Thanks!

September 13, 2016

There's not much to see. If I hit a URL like

about/locations

...which should redirect to:

locations

...I instead end up on:

http404

The debug log starts off like this:

404 Page Not Found
Checked Tue, 13 Sep 2016 13:40:14 -0400
Request: http://xxx.xxxxxxx.xxx/http404/
ProcessWire Version: 2.6.1

Scanning for jumplinks...

[Checking jumplink #1]
- Original Source Path:       about/community
- Compiled Source Path:       about/community

No match there...

...then iterates through all of the jumplinks, including the one it *should* match on:

[Checking jumplink #3]
- Original Source Path:       about/locations
- Compiled Source Path:       about/locations

No match there...

...and ends like this:

No matches, sorry. We'll let your 404 error page take over when Debug Mode is turned off.

September 13, 2016

I've just discovered that Jumplinks has stopped working on an older site. Same behavior on both dev and production -- the admin UI works fine, but none of the actual redirects happen on the front end; users are just bounced to the 404 page. When I enable (Jumplinks') debug mode, I see it skip right past the relevant entry for a given URL as if it couldn't find a match. The site is running PW 2.6.1 and Jumplinks 1.3. I tried upgrading Jumplinks to the latest version in dev, but it didn't help. Any suggestions on how to further diagnose the issue?

August 8, 2016

2 hours ago, adrian said:

Have you tried fixing the cause of that warning: https://github.com/USSliberty/Processwire-site-indexer/blob/master/Indexer.module#L186

It would be good to know if that warning is gone whether everything works as expected without debug mode on.

Patched the Indexer module to get rid of the warning, but otherwise no change in behavior. Works fine with debug mode on, returns 502 with debug mode off.

August 8, 2016

Here's something. I've just discovered that if I enable debug mode in PW, the script executes without error. I do get a warning on each line from the indexer module:

Warning: trim() expects parameter 1 to be string, array given in /var/www/html/HLND/site/modules/Indexer/Indexer.module on line 304

...but otherwise everything works as expected. If I turn debug mode off, I get the 502 Bad Gateway. I just toggled it on and off a few times to verify, and it's 100% reproducible.

What does enabling debug mode do that allows the script to complete?

August 7, 2016

16 hours ago, netcarver said:

If HAProxy's server connection times out before Apache finishes, you'll get a 50x error reported (not sure if 502 or 504) in the browser yet the Apache process will probably complete without error behind the dropped connection. Have you looked at the "timeout server" setting (and the other timeout settings) in your HAProxy config file (/etc/haproxy/haproxy.cfg)? I think you'll need to up that beyond where you have the timeout set for the PHP script execution.

Edited to add: Have you thought about moving this processing load into a background task?

I've been exploring the HAProxy side of it as well, though nothing I have tried so far has made any difference. It seems very similar to what's described here:

http://serverfault.com/questions/664658/server-timeout-and-retry-in-haproxy-502

I've seen some evidence of a second request in my situation as well. I haven't looked into making it a background task yet.

August 6, 2016

55 minutes ago, netcarver said:

Is your server running php-fpm by any chance?

No. The dev server is a CentOS 7 virtual machine running stock Apache and PHP. Earlier I mistakenly said it was running behind nginx. It's actually running behind haproxy.

August 6, 2016

So that was a bit of a red herring. For some reason, putting realpath in that position caused all of the file writes to fail, so it was just skipping the image field handling completely. What I'm seeing right now is that it can handle roughly 180 rows at a time. I can skip any number of rows at the beginning, and process any consecutive batch of rows successfully right up to around 180. It sometimes varies by a few rows. It's not an issue with any particular row either -- e.g., it fails at row 181, but if I skip a row it processes row 181 just fine and fails on row 182.
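
For reference, the skip-and-batch testing described above can be sketched as a small helper. This is an illustration only: readCsvBatch() and its parameters are hypothetical names, and it reads rows without touching any pages.

```php
<?php
// Read up to $limit data rows from a CSV file, skipping the header row
// and the first $offset data rows. Names here are illustrative only.
function readCsvBatch($path, $offset, $limit) {
    $rows = [];
    if (($f = fopen($path, 'r')) === false) {
        throw new RuntimeException('Could not open ' . $path);
    }
    fgetcsv($f, 0, ',', '"', '\\'); // discard the header row
    $index = 0;
    while (count($rows) < $limit && ($line = fgetcsv($f, 0, ',', '"', '\\')) !== false) {
        if ($index++ < $offset) continue; // still inside the skipped range
        $rows[] = $line;
    }
    fclose($f);
    return $rows;
}
```

Calling readCsvBatch($filepath, 180, 100) would return data rows 181 through 280, which makes it easy to check whether the failure follows the row count or a particular row.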

August 6, 2016

@netcarver Actually it looks like that did sort the issue. I'll have to test some more, but I'm guessing that verifying the result of file_put_contents, which I should have been doing in the first place, is the key. Without that, the script was trying to add and remove nonexistent image files. Thanks for pointing that out!
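
A minimal sketch of the kind of check I mean, assuming the image arrives as a "0x"-prefixed hex string as it does in my CSV. writeHexImage() is a hypothetical helper name, not something from ProcessWire.

```php
<?php
// Decode a "0x"-prefixed hex string and write it to $path.
// Throws instead of failing silently, so the caller can skip image handling.
function writeHexImage($path, $hex) {
    if (strncmp($hex, '0x', 2) !== 0) {
        throw new InvalidArgumentException('Missing 0x prefix');
    }
    $payload = (string) substr($hex, 2); // drop only the two-character prefix
    if ($payload === '' || strlen($payload) % 2 !== 0 || !ctype_xdigit($payload)) {
        throw new InvalidArgumentException('Invalid hex payload');
    }
    $data = hex2bin($payload);
    // file_put_contents() returns false on failure, which is easy to miss
    if (@file_put_contents($path, $data) === false) {
        throw new RuntimeException('Could not write ' . $path);
    }
    return strlen($data); // number of bytes written
}
```

Inside the import loop, a thrown exception would land in the existing catch block, so the row is logged as failed and the script never tries to add or unlink a file that was not written.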

August 6, 2016

Increasing the max_execution_time value has no effect. The script still returns a 502 in the same amount of time. It does run successfully if I limit the number of rows.

I'm going to play with increasing the rows and/or skipping initial rows to try and see if it's an issue with the total amount of data, or a specific row that's causing problems.

I'll also try just running the ->deleteAll() in isolation to see if the behavior changes.

Thanks!

August 6, 2016

54 minutes ago, teppo said:

Just a quick comment on this one: if you enter just one dimension, doesn't that automatically update the other one? At least in my case it does

If it's a fixed pixel dimension, yes. If it's a percentage (at least, last time I tried it in a stock CKEditor install) it leaves the other value blank.

August 6, 2016

Sure, here's the whole thing. This is called from a custom admin page. These actions do occur in a loop (iterating over the lines of a CSV file). I changed ->parent to ->id in case of extra overhead (retrieving and creating a page object vs. retrieving an integer value). I do set the name, title, and status of each page. Switching to a normal image fieldtype vs. CroppableImage makes no difference. This is PW 2.8.28. Thanks for taking a look!

<?php
	include($config->paths->templates.'init/functions.inc');

	function importProductData($filepath){
		$result = new StdClass();
		$result->lines_total = 0;
		$result->lines_added = 0;
		$result->lines_failed = 0;
	
		$sizes = [
			5 => '¾”',
			6 => '1”',
			7 => '1¼”',
			8 => '1½”',
			9 => '2”',
			11 => '3”'
		];			
	
		// open the file for reading
		if(!ini_get('auto_detect_line_endings')) ini_set('auto_detect_line_endings', true);
		
		if(($f = fopen($filepath, 'r')) !== false){		
			$header = true;
		
			// parse as CSV
			while(($line = fgetcsv($f, 0, ',', '"')) !== false){
				// skip the header
				if($header == true){
					$header = false;
					continue;
				}
	
				$result->lines_total++;
	
				// process a line
				try{	
					$title = $line[1];
					$brand = ''; // TODO: Get brand data.
					$product_type = ''; // TODO: Determine product type.
					$prod_cat = (int) str_replace('F', '', $line[4]);
					$size = $sizes[$prod_cat];
					
					$uom_raw = $line[2];
					if($uom_raw == 'EA' || $uom_raw == 'PC'){
						$uom = $uom_raw;
						$uom_value = null;
					}else{
						$uom = 'FT';
						$uom_value = (int) $uom_raw;
					}
					
					$weight = (float) $line[10];
					
					$itemno_mill_standard = (int) $line[0];
					$itemno_mill_stainless = (int) $line[11];
					$itemno_anodized_standard = (int) $line[12];
					$itemno_anodized_stainless = (int) $line[13];
					$active = ($line[5] == 'A');
				
					$image_data_raw = $line[17];
					
					if(startsWith($image_data_raw, '0x')){
						$has_image = true;
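						// ltrim() treats '0x' as a character list and would also strip leading zero digits from the data, so drop just the two-character prefix
						$image_data = hex2bin(substr($image_data_raw, 2));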
						$image_path = wire('config')->paths->templates.'tmp/'.$line[18];
						file_put_contents($image_path, $image_data);
					}else{
						$has_image = false;
					}
				
					// look for a page with this primary item number
					$product = wire('pages')->get('template=product-detail,itemno_mill_standard='.$itemno_mill_standard.',include=all,limit=1');
										
					// if none is found, create a new page
					if(!$product->id){			
						$product = new Page();
						$product->template = 'product-detail';
						$product->parent = wire('pages')->get(1048);
					}
					
					// disable output formatting so that the page can be saved
					$product->of(false);
					
					// set fields
					$product->name = wire('sanitizer')->pageName($title.' '.$line[3]); // slug is never seen, so append UPC code to ensure it is unique
					$product->title = $title;
					$product->brand = ''; // TODO: Get brand data.
					$product->product_type = ''; // TODO: Determine product type.
					$product->size = $size;
					$product->uom = $uom;
					$product->uom_value = $uom_value;
					$product->weight = $weight;
					$product->itemno_mill_standard = $itemno_mill_standard;
					$product->itemno_mill_stainless = $itemno_mill_stainless;
					$product->itemno_anodized_standard = $itemno_anodized_standard;
					$product->itemno_anodized_stainless = $itemno_anodized_stainless;
					
					// set published status
					($active ? $product->removeStatus(Page::statusUnpublished) : $product->addStatus(Page::statusUnpublished));
					
					// save the page
					$product->save();
				
					// add image, save page again, and remove temporary image
					if($has_image){					
						if($product->image) $product->image->deleteAll(); // get rid of anything that's already in there
						$product->save('image'); // try saving after delete to avoid 502's
						$product->image->add($image_path);
						$product->save('image'); // save just the image field instead of the whole page
						unlink(realpath($image_path));
					}

					$result->lines_added++;
					
				}catch(Exception $e){
					$result->lines_failed++;
					wire('log')->save('import-errors', $itemno_mill_standard.': '.$e->getMessage());
				}
				
				// attempt to fix 502 errors introduced by ->deleteAll() above
				wire('pages')->uncacheAll();
				
				// debugging
				// if($result->lines_total > 10) break;
			}
		
			fclose($f);
			wire('log')->prune('import-errors', 30);
		}else{
			wire('pages')->error('Could not open '.$filepath.' for reading.');
		}
		
		return $result;
	}



	// collect output
	$out = '<h2>Import Product Data</h2>';

	// build the form
	$filepath = $config->paths->templates.'tmp/';
	
	$uploadform = $modules->get('InputfieldForm');
	$uploadform->action = './';
	$uploadform->method = 'post';
	$uploadform->attr('id+name','product-import');
	
	$field = $modules->get('InputfieldFile');
	$field->label = 'CSV File';
	$field->name = 'productdata';
	$field->attr('id','productdata');
	$field->required = 1;
	$field->extensions = 'csv txt tab';
	$field->maxFiles = 1;
	$field->overwrite = true;
	$field->descriptionRows = 0;
	$field->destinationPath = $filepath;
	$field->uploadOnlyMode = true;
	$uploadform->append($field);
	
	$submit = $modules->get('InputfieldSubmit');
	$submit->attr('value','Import');
	$submit->attr('id+name','submit');
	$uploadform->append($submit);
	
	// check for a submission
	if($input->post->submit){
		// empty the tmp directory prior to starting a new import,
		// in case there were failures on a previous attempt that
		// left files there
		array_map('unlink', glob($filepath.'*.*') ?: []); // glob() can return false on error
			
		// validate the form
		$uploadform->processInput($input->post);
		
		// process the form
		$uploaded_files = $uploadform->get('productdata')->value;
		
		if(count($uploaded_files)){
			$file = $uploaded_files->first();
			$this->message('File uploaded successfully.');		
			$filepath .= $file->name;
			
			$import = importProductData($filepath);
			
			if($import->lines_failed){
				$this->error('Processed '.$import->lines_total.' lines with '.$import->lines_failed.' errors. Check the import-errors log for details.');
			}else{
				$this->message('Processed '.$import->lines_total.' lines with no errors.');
			}
			
			// delete the uploaded file
			if(unlink(realpath($filepath))){
				$this->message('Removed uploaded file.');
			}else{
				$this->error('Could not delete uploaded file.');
			}
		}else{
			$this->error('Uploaded file can\'t be found.');
		}
	}
	
	// render the form, with or without errors
	$out .= $uploadform->render(); 
	
	// display the results
	echo $out;
?>

...and for the sake of completeness, here's the contents of functions.inc:

<?php
	// Excerpts a field to $limit length
	if(!function_exists('excerpt')) {
		function excerpt($str, $limit = 400, $endstr = '…'){
			$str = strip_tags($str);
			if(strlen($str) <= $limit) return $str;
			$out = substr($str, 0, $limit);
			$pos = strrpos($out, " ");
			if($pos > 0) $out = substr($out, 0, $pos);
			return $out . $endstr;
		}
	}

	// Check if a string begins with another string.
	function startsWith($haystack, $needle){
		$length = strlen($needle);
		return (substr($haystack, 0, $length) === $needle);
	}
?>

August 5, 2016

->removeAll() and ->deleteAll() both produce a 502

August 5, 2016

Both worthwhile suggestions, thanks! Added them both, and will leave them in there, but still getting a 502.

August 5, 2016

Staring at that snippet, I am wondering if I should be saving the page in between the delete and add operations?

August 5, 2016

In this particular case, I'm iterating over lines in a CSV file, looking up pages one at a time, and updating (or creating new ones) as needed. So I'm never creating a PageArray (AFAIK, anyway). It's possible that ->deleteAll() is no less performant than other things I'm doing, but it just happens to be the thing that pushed this script over the edge. I mention it here because someone else seemed to have the same problem. The relevant code:

// look for a page with this primary item number
$product = wire('pages')->get('template=product-detail,itemno_mill_standard='.$itemno_mill_standard.',include=all,limit=1');
                    
// if none is found, create a new page
if(!$product->parent){                    
    $product = new Page();
    $product->template = 'product-detail';
    $product->parent = wire('pages')->get(1048); // magic number: ID of the parent for all product pages
}

// disable output formatting
$product->of(false);

// set individual fields (name, title, etc.)...

// set published status
($active ? $product->removeStatus(Page::statusUnpublished) : $product->addStatus(Page::statusUnpublished));

// save the page
$product->save();

// add image, save page again, and remove temporary image
if($has_image){                    
    if($product->image) $product->image->deleteAll(); // get rid of anything that's already in there
    $product->image->add($image_path);
    $product->save();
    unlink(realpath($image_path));
}

August 5, 2016

Thanks for the suggestion, not using Tracy on this site.

I've run plenty of imports similar to this one before, including one recently (on a different project) that imported over 6k pages, and never got a 502 until I added ->deleteAll(). Maybe I need to consider some sort of bulk operation to remove all the images for this template in one shot, rather than one page at a time. I doubt there's anything like that in the API, though. Plus, I'd prefer to only delete the images as needed, in case some other condition in the script causes that entire record to be skipped.

August 4, 2016

The custom dialogs for inserting links and images in CKEditor are missing some features that are available in a stock install, and their absence routinely causes me headaches. Two that come to mind right now are:

- When inserting an image, you should be able to enter just one dimension and leave the other one blank, and you should be able to use other units besides pixels (for instance, percentages).

- When inserting a mailto: link, you should be able to specify a subject line.

Would love to see these make their way into the PW dialogs.
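
For context on the second item, the target format is just a mailto: URL with a subject query parameter. A minimal sketch in PHP follows; mailtoHref() is a hypothetical helper name and has nothing to do with how the PW dialog is implemented.

```php
<?php
// Build a mailto: URL with an optional subject line.
function mailtoHref($address, $subject = '') {
    $href = 'mailto:' . $address;
    if ($subject !== '') {
        // rawurlencode() encodes spaces as %20, which mail clients expect here
        $href .= '?subject=' . rawurlencode($subject);
    }
    return $href;
}
```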

August 4, 2016

I'm using PW 2.8.28 with a CroppableImage field. I have an import routine that updates existing records when possible. The script runs fine unless I try to delete existing images before adding new ones. If I include $page->image->deleteAll() right before adding a new image (and after disabling output formatting), I get the following error (from nginx):

502 Bad Gateway

The server returned an invalid or incomplete response.

Nothing relevant is logged in PW, or Apache or Nginx access or error logs, AFAICT. Note that it does seem to work, despite the error, but it doesn't make a great impression on clients. Any suggestions?

August 3, 2016

I have a very similar situation. I'm trying to upload a CSV file to a custom admin page, created with the Admin Custom Pages module. I've constructed a simple form using the API. When I load the page, the form appears to render correctly, but it doesn't appear to be working 100% correctly.

• Drag and drop isn't working. The browser just offers to open the dragged file. There are no JS errors in the console and it looks as if all the required JS assets are included on the page. If I instead click the "Choose File" button and select a file, it does appear to select a file (if I hover over the "Choose File" button, it displays the selected filename).

• Submitting the form does upload the file, though overwrite isn't working. I get the following error when trying to re-upload the same file:

"Refused file [my filename] because it is already on the file system and owned by a different field."

August 2, 2016

I suspect that adding width and height is probably sufficient for non-SSL sites, but in this case, the entire site is served over SSL, which apparently made secure_url required as well, even though a secure URL was already provided for the base og:image value. I can't point to any documentation in that regard, it's just what finally worked for me.

August 1, 2016

Aside from the name vs. property issue with the Open Graph tags, it seems as though FB can be very picky about parsing og:image. It's easy to miss, because FB will infer the value based on other information on the page, but on a recent launch, in order to get it to "pass" the Open Graph Debugger, I had to add the following additional (supposedly optional) properties:

og:image:url
og:image:secure_url
og:image:type
og:image:width
og:image:height

It would be nice if MarkupSEO was aware of image fields and would allow users to select the preview image from one. Then it could generate all of these additional properties automatically.
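
As a rough sketch of what that generated output could look like, here is a plain PHP helper that emits the properties listed above. ogImageTags() and its parameters are hypothetical names; it does not use the MarkupSEO or ProcessWire APIs, and the caller is assumed to already know the image URL, MIME type, and dimensions.

```php
<?php
// Build the og:image tag plus the extra properties listed above.
// All values are supplied by the caller; nothing is read from an image field.
function ogImageTags($url, $mimeType, $width, $height) {
    $props = [
        'og:image' => $url,
        'og:image:url' => $url,
        'og:image:type' => $mimeType,
        'og:image:width' => (string) (int) $width,
        'og:image:height' => (string) (int) $height,
    ];
    // Only emit secure_url when the image really is served over HTTPS.
    if (stripos($url, 'https://') === 0) {
        $props['og:image:secure_url'] = $url;
    }
    $out = '';
    foreach ($props as $property => $content) {
        $out .= '<meta property="' . $property . '" content="'
            . htmlspecialchars($content, ENT_QUOTES, 'UTF-8') . '">' . "\n";
    }
    return $out;
}
```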

July 13, 2016

I see, thanks. I wonder if there's any way to circumvent that, especially since there's nothing preventing non-superusers from actually editing the page if they know the link (or guess the ID). I didn't realize the Trash also only appears for super users. I would think that would be useful to other types of users as well.

July 13, 2016

When logged into the admin as the superuser, the 404 page appears in the page tree per usual. When logged in as a "site administrator" (a custom role), the 404 page is missing. The 404 page uses the basic-page template and hasn't been edited or altered in any way. Under "who can access this page", it shows the expected permissions (site administrators can definitely edit basic pages), and in fact, if a site administrator navigates directly to the edit screen for the page, it works fine. So it's not a permissions issue per se, it just doesn't show up in the page tree for users in that role. It is a multilingual site, and English is not the default language, in case that's relevant. Any thoughts?


Posts posted by Jason Huck
