
Creating a custom webservice for mobile app


landitus

I'm building a custom web service to serve a mobile app from a PW site. I'm very excited to extend the site's functionality with PW's flexibility. I would like to check that what I'm doing is 'safe' and that I'm not exposing the site to vulnerabilities.

1. The site should serve a json file when accessing a specific URL. I like to keep my site's URLs with a /json to keep everything mirrored in my template files.

products.php

<?php

/**
 * Productos template
 *
 */

if ($page->name == 'productos' AND $input->urlSegment(1) == 'json' ) {
	include("./productos.json.php");
	die();
}
if ($page->name == 'categorias' AND $input->urlSegment(1) == 'json' ) {
	include("./categorias.json.php");
	die();
}

include("./head.inc"); ?>
// ... GOES ON

		

2. The template in charge of generating the JSON output, loosely based on Ryan's web service plugin. Is there anything dangerous in here?

products.json.php

<?php

/**
 * Productos JSON
 *
 */

// $_GET Variables from URL
$types = array(
  'limit'	=> $input->get->limit,
  'sort'	=> $input->get->sort
); 

$params = array();

foreach($types as $name => $type) {
  $value = $input->get->$name; 
  if($value === null) continue; // not specified, skip over
  if(is_string($type)) $params[$name] = $sanitizer->text($value);  // sanitize as string
    else $params[$name] = (int) $value; // sanitize as int
}

// Set defaults
if(empty($params['sort'])) $params['sort'] = '-modified'; // set a default
if(empty($params['limit'])) $params['limit'] = 10; // set a default
$params['template'] = 'producto';

// Build query string to search
$new_query_string = http_build_query($params, '', ','); 

// Find
$q = $pages->find($new_query_string);

// Results
$result = array(
	'selector' => $new_query_string,
	'total' => $q->count(),
	'limit' => (int)$params['limit'],
	'start' => 0,
	'matches' => array()
);

$categorias = $pages->get("/categorias/");
$soluciones = $pages->get("/soluciones/");
	

foreach ($q as $i) {
	
	// ...
        // PREPARING some variables
        // ...

	
	$result['matches'][] = array(
		'id'				=> $i->id,
		'template'			=> $i->template->name,
		'name' 				=> $i->name,
		'created'			=> $i->created,
		'modified'			=> $i->modified,
		'url' 				=> $i->url,
		'producto_thumb'	=> $product_thumb,
		'producto_imagen' 	=> $product_images,
		'producto_packaging'=> $product_pack,
		'producto_ficha'	=> $product_sheet,
		'title'				=> $i->title,
		'producto_bajada'	=> $i->producto_bajada,
		'body'				=> $i->body,
		'categorias'		=> $product_categories,
		'soluciones'		=> $product_solutions,
		'calc'				=> $i->calc,
		'rendimiento_min'	=> $i->rendimiento_min,
		'rendimiento_max'	=> $i->rendimiento_max
	);
}

// JSON
header('Content-Type: application/json'); 
echo(json_encode($result));

3. Finally, this seems to be working ok, but I was wondering:

a. How can I check for products modified since date X?

b. How can I notify the web service that a product has been deleted since date X? PW moves files to the trash and I would like to keep it that way but I can't imagine how to notify the webservice!


I've been able to move forward by rethinking the code. This time, I've added a custom parameter "modified_since" which, when set, filters the query by that date. I have not been able to solve how to notify the web service when a product is deleted. One option would be to hide the product, following this thread, but that would clutter the page tree.

How would you solve this problem?


It looks like you are using http_build_query to build a query for $pages->find(). That's something you'd use to insert a query string in a URL, but it's not something you want to use for a selector. Instead, you'd want to sanitize selector values with $clean = $sanitizer->selectorValue($dirty); Though if you've already sanitized something as an (int) then it's not necessary to run it through selectorValue. For your 'sort' you'd want to set up a whitelist of valid sorts and validate against that.
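As a rough sketch of the whitelist idea above (not tested against the site in question): the function name buildProductSelector and the list of allowed sorts are made up for illustration, and the 'producto' template name is taken from the first post. It only builds the selector string, so it can be tried outside of PW.

```php
<?php
// Hypothetical helper: builds a selector string from raw GET values.
// $allowedSorts is an assumed whitelist; adjust it to the fields you expose.
function buildProductSelector(array $get): string
{
    $allowedSorts = ['modified', '-modified', 'created', '-created', 'title', '-title'];

    // Only accept a sort value that appears in the whitelist.
    $sort = isset($get['sort']) && in_array($get['sort'], $allowedSorts, true)
        ? $get['sort']
        : '-modified';

    // Cast to int and clamp to 1..100, so selectorValue() is not needed here.
    $limit = isset($get['limit']) ? (int) $get['limit'] : 10;
    $limit = max(1, min($limit, 100));

    return "template=producto, sort=$sort, limit=$limit";
}
```

In the template this could be called as $pages->find(buildProductSelector(array('sort' => $input->get->sort, 'limit' => $input->get->limit))). Anything not in the whitelist simply falls back to the default sort, so a value like "title, include=all" never reaches the selector.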

Rather than doing a "die()", I suggest doing a "return;" so that you pass control back to PW rather than terminating execution. 

To find only modified products since a particular date, you can include that in your find() query. Perhaps the client can specify "modified" as a variable to the web service. If they specified it as a date string like "yyyy-mm-dd h:m:s" then you can sanitize and convert it to a timestamp with $date = strtotime($str). Then use "modified>=$date" in your selector.
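A minimal sketch of that date handling, assuming the client sends the date as a string. The function name modifiedSinceClause is hypothetical. Since strtotime() returns either an integer or false, the result is safe to place in a selector without further sanitizing.

```php
<?php
// Hypothetical helper: turns a client-supplied date string into a selector clause.
// Returns null when the string cannot be parsed, so the caller can skip the filter.
function modifiedSinceClause(string $str): ?string
{
    $timestamp = strtotime(trim($str));
    if ($timestamp === false) {
        return null; // unparseable input: do not filter by date
    }
    return "modified>=$timestamp";
}
```

The returned clause can be appended to the rest of the selector when it is not null. Note that strtotime() interprets the string in the server's default timezone unless the string itself names one, and it also accepts relative formats such as "yesterday".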

To find which pages were deleted, the client usually identifies this by the non-presence of an item that used to be there. I also sometimes like to set up a separate "exists" feed that the client can use to test their IDs against. For instance, the client might query domain.com/path/to/service/?exists=123,456,789 and the response would be handled with something like this. It returns an array keyed by ID, with TRUE for each queried page that still exists and FALSE for pages that no longer exist.

if($input->get->exists) {
  $dirty = explode(',', $input->get->exists); 
  if(count($dirty) > 200) throw new WireException("You may only check up to 200 at once"); 
  $ids = array();
  foreach($dirty as $id) {
    $id = (int) $id; 
    if($id) $ids[$id] = false; // set as not exists (default)
  }
  // $ids is keyed by page ID, so build the selector from its keys
  $results = $pages->find("template=product, id=" . implode("|", array_keys($ids))); 
  foreach($results as $result) {
    $ids[$result->id] = true; // set as exists
  }
  header("Content-type: application/json");
  echo json_encode($ids);   
}
 

  • 2 weeks later...
Would checking for products in the trash, with "include=all", be an alternative? Do you see any downsides?

The trash shouldn't be a consideration in a web service feed like this, because items in trash are targeted for deletion and we have no control over when somebody will decide to empty the trash. So for all practical purposes, items in the trash should be considered the same as deleted items here. The code above won't pull pages out of the trash, so it should already work the right way, so long as you don't add "include=all" to the selector. If you do need the behavior of include=all, then I would suggest doing this instead:

$trash = Page::statusTrash; 
$results = $pages->find("template=product, status<$trash, id=" . implode("|", array_keys($ids))); 

That should give you the same behavior as "include=all" but exclude pages in the trash. 


Hi Ryan, I've been looking into this in more depth and I have an issue with the way the mobile app is going to query the web service for modified/deleted items. It might be very specific to this app, so the solution might not be ideal or close to your answer. The thing is: checking all items to see if they exist was ruled out by the mobile app developer (memory reasons or something like that). My first idea was to flag each product as "is_deleted" if it resides in the trash. This way, if the product is modified or deleted, it will always show up in the query. Deleted items are in the trash, and won't be seen or counted in the tree. It didn't feel right to unpublish a product or set a field on a visible product instead of deleting it, as it would still be seen in the tree. For me and the client, the tree is what is visible on the site.

This approach has the disadvantages you describe above, and it feels "wrong", but I still have not found a middle ground in which the app works and the CMS/web service is bulletproof. I've been told something about virtual deletion (which sounds like the trash), where items are never actually deleted. I think it's in Drupal.

Anyhow, I was wondering if maybe there's another creative way to approach this with the limitations I've described. For now, my web service checks all products (including the trash) and tags products as deleted if they reside in the trash.

// ...
$new_query_string = "template=".$params['template'].", sort=".$params['sort'].", ".$modified." include=all";

// Find
$q = $pages->find($new_query_string);

//...

I still have to sanitise the inputs!

// Is deleted
	if($i->parent->url == '/trash/'){
		$is_deleted = true;
	} else {
		$is_deleted  = false;
	}

Maybe I should post the whole code (it's long as it builds a custom json feed!)


I still think this strategy would be fragile over time. It's not the job of a web service to keep track of deleted things, that's the job of the client accessing the service. I don't know of any major web service that attempts to bundle deleted items in a feed. The problems arise when something really does get deleted, and your web service has no record of it. If the client accessing your service expects you to keep track of all deleted items, then this will break sooner or later. You can't count on everything being in the trash. That is just a temporary container. Sooner or later it has to be deleted. Not to mention, showing deleted items in the feed means your feed has unnecessary overhead. The responsibility has to rest on the client side or else the system will break down at some point. Here's what I usually do on the client side:

Keep two extra fields per record in the feed: import_id (integer) and import_time (unix timestamp). The import_id reflects the unique ID of each item from the feed, and is needed to connect items from the feed with your local items. The import_time reflects the timestamp of when the item was last updated.

Before the feed is pulled, the client records the current timestamp and keeps it somewhere, like in a variable:

$startTime = time(); 

Then they pull the feed, iterating through each item in the feed and updating or adding an associated page item for each. Note that the import_time of each page item is updated regardless of whether any changes occurred. You are essentially identifying all the items that are still active by updating their import_time field. 

$parent = $pages->get("/path/to/items/"); 
$http = new WireHttp();
$feed = $http->getJSON('http://domain.com/path/to/feed');  

if(is_array($feed) && count($feed)) {
  foreach($feed as $data) {
    $item = $pages->get("import_id=$data[id]"); 
    if(!$item->id) {
      // add new item
      $item = new Page();
      $item->parent = $parent;
      $item->template = 'some-template';
      $item->import_id = $data['id'];
      $item->save();
    } else {
      $item->of(false); 
    }
    // populate item
    $item->title = $data['title'];
    // ...
 
    // update import_time
    $item->import_time = time();
    $item->save();
  }

  // after importing the feed is complete, find the pages that were NOT updated
  // these pages may be deleted or trashed
  foreach($parent->children("import_time<$startTime") as $item) {
    $item->trash(); 
  }
} 

I like if the web service also provides an "exists" function, to determine that an item really has been deleted. This returns a thumbs up or down for each item requested, as to whether it exists or not. It's not entirely necessary, but it adds a little extra peace of mind as a backup in case something goes wrong. This is described in one of my posts above. Combined with the code above, the part that does the $item->trash() would just be preceded by a call to the exists feed to reduce the list of $items to only those that are confirmed to be deleted: 

// find which pages weren't updated
$ids = array();
foreach($parent->children("import_time<$startTime") as $item) {
  $ids[] = $item->import_id; 
}

// send the IDs of those pages back to the feed, asking it to confirm deletion
$feed = $http->getJSON('http://domain.com/path/to/feed/exists?id=' . implode(',', $ids)); 
foreach($feed as $id => $exists) {
  if($exists === false) {
    // deletion confirmed, so page may be deleted
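    // $id is the ID from the feed, so look up the local page by its import_id
    $item = $pages->get("import_id=" . (int) $id);
    if($item->id) $item->trash();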
  }
}
 

As for memory issues: any feed dealing with a lot of data needs to provide pagination of feed data (example), otherwise it will not be able to scale and will break if there is too much data. Pagination of a feed is a lot simpler than you'd expect for both the service and the consumer. It's not much more than wrapping everything in a foreach.
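To give an idea of what the consumer side of a paginated feed could look like, here is a small sketch. It assumes the feed accepts a start offset and a limit and returns one batch of items per request; the function name iterateFeed and the $fetchPage callable are made up for illustration. It uses a generator so only one batch is held in memory at a time.

```php
<?php
// Hypothetical consumer loop: $fetchPage is an assumed callable that takes
// ($start, $limit) and returns an array with at most $limit feed items.
function iterateFeed(callable $fetchPage, int $limit = 50): Generator
{
    if ($limit < 1) {
        throw new InvalidArgumentException('limit must be at least 1');
    }
    $start = 0;
    do {
        $batch = $fetchPage($start, $limit);
        foreach ($batch as $item) {
            yield $item; // hand items to the caller one at a time
        }
        $start += $limit;
    } while (count($batch) === $limit); // a short or empty batch means the last page
}
```

In practice $fetchPage might wrap a call like $http->getJSON() with start and limit added to the URL, and the import loop from the earlier post would iterate over iterateFeed(...) instead of over the whole feed at once.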


  • 3 weeks later...
  • 2 years later...

Hi everyone! I'm facing a similar project right now, creating an API for a mobile application. This post has been very educational, though I have a question:

  • How do I handle login from the app? Even though the app will be released on stores, it is meant to be used only by certain members of an organization. I'm not very knowledgeable about the way PW handles login, so I'd appreciate it if anyone could point me to a resource that would help me implement this. Registration wouldn't be needed since the user base will be populated from PW itself.

Edit: I found this thread with posts from @apeisa and @ryan talking a bit about the login cookies. It seems I could just save the cookies in the mobile app, send them with each request, and manually check whether the cookies correspond to a session, or something along those lines?

Edit 2: Found another thread with a similar situation.
