Vineet Sawant

How to upload heavy data into Processwire?

Hi,

I'm trying to import a large amount of data into ProcessWire, but I'm not sure what the best way to do it would be.

Usually I use the CSV to Pages plugin, but this time the data set is too large (~40k rows with 10+ columns in an Excel sheet), so that plugin can't help.

I also tried the Tasker plugin, but I can't get through the setup. It requires some template setup that I have no idea how to do, so that plugin is of no use to me either.

I'd like to know how you do it, and what the best way would be to migrate thousands of rows of data into PW in the future.

Thanks.

I would also recommend writing your own shell script that bootstraps ProcessWire.

You could, for example, use the PHP League CSV Composer package and write your own import script that saves the CSV entries as pages via the API. 😉

It is not that hard, and you can import large amounts of data this way. If you want, I could post an example.

Regards, Andreas
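
To give an idea of the general shape, here is a minimal sketch of reading a large CSV one row at a time. It uses PHP's built-in fgetcsv() rather than League CSV, so it has no dependencies. The helper name readCsvRows() and the semicolon delimiter are assumptions for illustration and are not part of ProcessWire.

```php
<?php
/**
 * Yield each CSV row as an associative array keyed by the header row.
 * Hypothetical helper: only one row is held in memory at a time.
 */
function readCsvRows(string $path, string $delimiter = ";"): \Generator
{
    $handle = fopen($path, "r");
    if ($handle === false) {
        throw new \RuntimeException("Cannot open $path");
    }

    try {
        // The first line is treated as the header
        $header = fgetcsv($handle, 0, $delimiter, '"', "\\");
        if (!is_array($header)) {
            return;
        }

        while (($row = fgetcsv($handle, 0, $delimiter, '"', "\\")) !== false) {
            // Skip blank lines and rows whose column count does not match the header
            if ($row === [null] || count($row) !== count($header)) {
                continue;
            }
            yield array_combine($header, $row);
        }
    } finally {
        fclose($handle);
    }
}
```

In a bootstrapped script, each yielded row could then be turned into a page via the API (create a Page, set its fields, save it), so the CSV itself never has to be loaded into memory as a whole.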


Here is a script I wrote to import thousands of customers.

You have to save it as a shell script (e.g. sync-customers.php.sh), make it executable, and run it from the command line (./sync-customers.php.sh).

#!/usr/bin/env php
<?php
namespace {
	// Bootstrap ProcessWire
	include("./../../index.php");
}

namespace ProcessWire {

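	echo "Synchronisation started...\n";

	// $log is not set automatically in a bootstrapped script, so fetch it from the API
	$log = wire("log");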

	// Source: http://csv.thephpleague.com/
	use League\Csv\Reader;

	$csv = Reader::createFromPath("./../assets/csv/customer.csv", "r");
	$csv->setDelimiter(";");
	$csv->setHeaderOffset(0);

	// $header = $csv->getHeader();
	$records = $csv->getRecords();

	// var_dump($header);
	// var_dump($records);

	/*
	 * Save records in new array
	 */
	$recordsArr = array();

	foreach ($records as $record) {

		// Save columns in variables
		$supplierID = 				$record["SupplierID"];
		$customername1 = 			$record["Customername1"];
		$street = 					$record["Street"];
		$postcode = 				$record["Postcode"];
		$city = 					$record["City"];
		$country = 					$record["Country"];
		/*
		$customername2 = 			$record["Customername2"];
		$email1 = 					$record["Email1"];
		$email2 = 					$record["Email2"];
		$additionalInformation = 	$record["AdditionalInformation"];
		$fieldworker = 				$record["Fieldworker"];
		$indoorservice = 			$record["Indoorservice"];
		$webseite = 				$record["IF:Webseite"];
		$verband = 					$record["IF:Verband"];
		$segment = 					$record["IF:Segment"];
		$unternehmenskette = 		$record["IF:Unternehmenskette"];
		*/

		$recordsArr[] = array(
			"supplierID" => 	$supplierID,
			"customername1" => 	$customername1,
			"street" => 		$street,
			"postcode" => 		$postcode,
			"city" => 			$city,
			"country" => 		$country
		);

	}

	// Remove duplicates
	$recordsArr = array_map("unserialize", array_unique(array_map("serialize", $recordsArr)));

	// var_dump($recordsArr);
	// echo count($recordsArr) . ".\n";

	// Get customers
	$customersPage = pages()->get("template=customers");
	$customers = pages()->find("parent=$customersPage, template=customer");

	/*
	 * Delete customers
	 */
	/*
	foreach ($customers as $customer) {

		$log->save("customers", "Customer " . $customer->title . " deleted.");
		echo("Customer " . $customer->title . " deleted.\n");

		pages()->delete($customer);

	}
	*/

	/*
	 * Create or update customers
	 */
	foreach ($recordsArr as $r => $record) {

		// Save columns in variables
		$supplierID = 				$record["supplierID"];
		$customername1 = 			$record["customername1"];
		$street = 					$record["street"];
		$postcode = 				$record["postcode"];
		$city = 					$record["city"];
		$country = 					$record["country"];

		// Create customer
		if (!$customers->has("title=$supplierID")) {

			$customer = new Page();
			$customer->parent = $customersPage;
			$customer->template = "customer";
			$customer->title = $supplierID;

			$customer->of(false);

			$customer->save();

			$customer->set("customer_name", $customername1);
			$customer->set("customer_postal_code", $postcode);
			$customer->set("customer_city", $city);

			// Create distribution country if it doesn't exist
			$distributionCountry = pages()->get("title=$country, template=distribution-country");

			if (!$distributionCountry->id) {

				$distributionCountry = new Page();
				$distributionCountry->parent = pages()->get("template=distribution-countries");
				$distributionCountry->template = "distribution-country";
				$distributionCountry->title = $country;

				$distributionCountry->of(false);

				$distributionCountry->save();

				$log->save("distribution-countries", "Distribution country " . $distributionCountry->title . " created.");
				echo("Distribution country " . $distributionCountry->title . " created.\n");

			}

			$customer->set("customer_distribution_country", $country);

			$customer->save();

			$log->save("customers", "Customer " . $customer->title . " created.");
			echo("Customer " . $customer->title . " created.\n");

		// Update customer
		} else {

			// Get customer
			$customer = $customers->get("parent=$customersPage, title=$supplierID, template=customer");

			if (($customer->customer_name !== $customername1) ||
				($customer->customer_postal_code !== $postcode) ||
				($customer->customer_city !== $city) ||
				((string)$customer->customer_distribution_country->title !== $country)) {



				$customer->of(false);

				$customer->set("customer_name", $customername1);
				$customer->set("customer_postal_code", $postcode);
				$customer->set("customer_city", $city);
				$customer->set("customer_distribution_country", $country);

				$customer->save();

				$log->save("customers", "Customer " . $customer->title . " updated.");
				echo("Customer " . $customer->title . " updated.\n");

			}

		}

	}

	/*
	 * Delete leftover customers
	 */
	$savedCustomersArr = array();
	$customersArr = array();

	foreach ($customers as $customer) {

		$savedCustomersArr[] = $customer->title->getLanguageValue("default");

	}

	foreach ($records as $record) {

		// Save columns in variables
		$supplierID = $record["SupplierID"];

		$customersArr[] = $supplierID;

	}

	$deletedCustomersArr = array_diff($savedCustomersArr, $customersArr);
	$deletedCustomersArr = array_unique($deletedCustomersArr);

	// var_dump($savedCustomersArr);
	// var_dump($customersArr);
	// var_dump($deletedCustomersArr);

	foreach ($deletedCustomersArr as $deletedCustomer) {

		$customer = pages()->findOne("parent=$customersPage, title=$deletedCustomer, template=customer");

		// Skip IDs that no longer resolve to a page
		if (!$customer->id) {
			continue;
		}

		$log->save("customers", "Customer " . $deletedCustomer . " deleted.");
		echo("Customer " . $deletedCustomer . " deleted.\n");

		pages()->delete($customer);

	}

	echo "Synchronisation finished...\n";

}
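
With a file in the ~40k-row range, it may help to process rows in fixed-size batches and to work out the leftover IDs on plain arrays before touching any pages. This is a rough sketch under those assumptions. inBatches() and findLeftoverIds() are hypothetical helper names, not ProcessWire or League CSV functions.

```php
<?php
/**
 * Group any iterable into arrays of at most $size items.
 * Hypothetical helper for processing a large import in chunks.
 */
function inBatches(iterable $rows, int $size): \Generator
{
    if ($size < 1) {
        throw new \InvalidArgumentException("Batch size must be at least 1");
    }

    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }

    // Emit the final, possibly smaller batch
    if ($batch !== []) {
        yield $batch;
    }
}

/**
 * Return the IDs that are saved on the site but missing from the CSV,
 * without duplicates and with a fresh 0-based index.
 */
function findLeftoverIds(array $savedIds, array $csvIds): array
{
    return array_values(array_unique(array_diff($savedIds, $csvIds)));
}
```

In the script above, one could loop over inBatches($records, 500) and call pages()->uncacheAll() after each batch to release the pages cached during the import. The batch size of 500 is an arbitrary starting point, not a tested value.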

 

 
