fedeb

July 1, 2021

Thanks, I will take a look and run some tests to see if I gain any speed!

If I find something useful I'll update the thread.

June 30, 2021

Hi, thanks for the reply.

My problem is that I want to avoid querying the database twice. Am I doing that in the first snippet?

1 - I get all of the data from the page, that is, all of its fields, just to use the url attribute.

2 - I call $session->redirect(), so I have to retrieve all of the information for that specific page again to populate the template.

It seems like I am doing two queries.

On the other hand, the second snippet would avoid this, but I don't know how to avoid the redirect if the URL doesn't exist.

June 30, 2021

Hi!

I have a search bar that should redirect to a page if the query matches a page title. Based on suggestions I read on the forum I implemented this as follows:

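// sanitize the user input before using it in a selector
$title = $sanitizer->selectorValue($search_input);
$protein_page = $pages->get("template=protein, title=$title");

if(!($protein_page instanceof NullPage)) {
	$session->redirect($protein_page->url);
} else {
	echo "No results for that query!";
}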

Do I really need to retrieve a $page object only to get the url? I actually know the url since it is simply mysite/proteins/$search_input.

Is ProcessWire smart enough to render the page based on the info retrieved in $protein_page, or should I simply use:

$session->redirect("mysite/proteins/".$search_input);

In this case I don't know how to avoid the redirect if the url doesn't exist.

Thanks in advance!

May 21, 2021

Thanks for the tip. Adding that line did fix the current item class. I had misunderstood the use of the keyword {class}.

Still there is one last problem remaining which I couldn't solve: the <a href=...></a> of the current element does not have the class="page-link" even though the other elements have it.

I am trying to solve it myself but cannot manage to do it.

May 20, 2021

Hi, thanks for the response. Just checking the HTML output in dev tools was already helpful:

<li class="page-item">
::marker
	<a href="/cluster/page10"><span>10</span></a>
</li>
<li class="page-item">
::marker
	<a class="page-link" href="/cluster/page11">11</a>
</li>

Page 10 (active page) vs Page 11 (non active).

The difference is that the class is only "page-item" even though I specified "page-item active", and that <a></a> doesn't have the class page-link. If I add this information manually in dev tools, the page renders as it is supposed to. My guess is that I am making a mistake when defining the options array, but I don't know how to solve it.

May 20, 2021

Hi,

I am trying to style the pager navigation bar based on a simple example I found on bootstrap:

<nav aria-label="...">
  <ul class="pagination">
    <li class="page-item disabled">
      <a class="page-link" href="#" tabindex="-1" aria-disabled="true">Previous</a>
    </li>
    <li class="page-item"><a class="page-link" href="#">1</a></li>
    <li class="page-item active" aria-current="page">
      <a class="page-link" href="#">2</a>
    </li>
    <li class="page-item"><a class="page-link" href="#">3</a></li>
    <li class="page-item">
      <a class="page-link" href="#">Next</a>
    </li>
  </ul>
</nav>

The code I am using is the following:

<?php
    echo $entries->renderPager(array(
        'nextItemLabel' => "Next",
        'previousItemLabel' => "Prev",
        'listMarkup' => "<ul class='pagination'>{out}</ul>",
        'itemMarkup' => "<li class='page-item'>{out}</li>",
        'linkMarkup' => "<a class='page-link' href='{url}'>{out}</a>",
        'currentItemClass' => "page-item active",
    ));
?>

which renders this result: the current page is not styled!

If I run the original bootstrap example everything works fine, but not through the renderPager options. The problem seems trivial but since I am new to ProcessWire (also to web design) I cannot find it! My main reference is this one.

Thanks in advance!

May 5, 2021

Hi, I used the matplotlib library from Python. The configuration is the default one, so you get the plots with only a few lines!

April 20, 2021

Hi,

(almost two years later...?) I am trying out the latest update of the FieldtypeEvents announced on the latest weekly update by ryan. Since I am new to Fieldtypes (also to ProcessWire) I have some basic questions:

1) When I try to populate the Events field from a PHP script using $page->save() or $pages->saveField($page, $field), I get an error raised inside ___sleepValue():

if($event->formatted) throw new WireException('Formatted events cannot be saved');

~~The error originates because, when the information is loaded from the database ($value) into an Event, the $event is tagged as formatted by the constructor.~~ Correction: the event is not tagged as formatted in the constructor, but the page is still formatted by default.

class Event extends WireData {

	/**
	 * Construct a new Event
	 *
	 */
	public function __construct() {
		// define the fields that represent our event (and their default/blank values)
		$this->set('date', ''); 
		$this->set('title', ''); 
		$this->set('location', ''); 
		$this->set('notes', ''); 
		$this->set('formatted', false);
		parent::__construct();
	}

How can I go back to the unformatted state so that I can execute $page->save() from a PHP script? How are elements saved using inputFieldEvents?

2) Are 'location' and 'notes' additional fields that are needed by default, or are they just left over from the previous version? When calling getDatabaseSchema() they do not appear.

3) In the latest reply @gunter mentions a set() and a get() function. I do not see the get() function. Maybe it was removed in the latest update? (By the way, gunter's reply was extremely useful, thanks!)

4) Broader question: Since I am populating my events from a php script and I don't necessarily need a GUI to input data, can I avoid including inputFieldEvents?

Thanks in advance!

April 14, 2021

@ryan for the time being the data (groupID, start, end, sequence) are not supposed to be queryable. Ideally groupID should be, because I would like to display all proteins belonging to a groupID in the group page but I think I will use a workaround for this: I have a file for each group containing this information which I plan to parse when loading the group page. Individual files have at most 1000 lines (proteins). In this way I avoid querying 20+ million entries each time you try to access a particular group page.

As you suggested, I will load each entry (groupID, start, end, sequence) as a text field and then use PHP's explode() function to parse it into an array at runtime.
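A minimal sketch of that parsing step, assuming the text field holds one record per line with the four values separated by single spaces. The function name parseGroupRows() and this storage layout are assumptions made for illustration, not something fixed in the thread:

```php
<?php
// Hypothetical helper: parse a text field that stores one
// "groupID start end sequence" record per line (assumed layout).
function parseGroupRows(string $text): array
{
    $rows = [];
    foreach (explode("\n", trim($text)) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        // A limit of 4 keeps everything after the third space in the last part.
        $parts = explode(' ', $line, 4);
        if (count($parts) !== 4) {
            continue; // ignore malformed records
        }
        $rows[] = [
            'groupID'  => (int) $parts[0],
            'start'    => (int) $parts[1],
            'end'      => (int) $parts[2],
            'sequence' => $parts[3],
        ];
    }
    return $rows;
}
```

In a template this could be called on the field value, for example parseGroupRows($page->groups) if the field were named groups (a hypothetical name).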

The only doubt is probably on groupID:

Quote

For your groupID, if the same groupID is referenced by multiple proteins, and there is more information about each "group" (other than just an ID) then I think it would make sense for it to be a Page reference field. What is the max number of groupID+start+end+sequence rows that a protein can have? If there is a natural limit and it's not large, then that would open up some new storage possibilities too.

A single groupID can be referenced by multiple proteins and it does contain additional information displayed in their respective group page (I create the group pages separately). The natural limit is around 20 groups although normally this number is 2 or 3 groups per protein. With this setup is it worth using a Page reference field? What are the other storage possibilities?

In the future I think I will end up using ProFields or building a Fieldtype module. For this last approach I think I need to read a bit more about modules, since I am new to ProcessWire. This tutorial posted by bernhard is a good start.

@Hector Nguyen if you are not constrained by memory, then loading the csv into memory all at once is the way to go, right?

Thanks for all the useful suggestions.

p.s. maybe I am diverging from the original thread. If you prefer I can open a new one.

April 13, 2021

Hi,

Thanks a lot for all the feedback. I did some additional tests based on all of the suggestions you gave me and results are already amazing!!

Figure 1 shows @ryan's suggestions tested independently:

1. I created the $template variable outside the loop.

2. I created the $parent variable outside the loop. The boost in performance is surprising! Defining $parent outside the loop made a huge difference (previously I didn't assign the parent explicitly; it was already defined in the template, so the assignment was automatic).

4. I also tried this suggestion ($page->name = "protein" . $i;). Although it seems to boost performance a bit, I didn't include the plot because the results were not conclusive. I will still include this in my code.

Figure 2 is based on @horst's suggestion. I tested the impact of calling gc_collect_cycles() and $pages->uncacheAll() after every $database->commit(). I didn't test $pages->uncache($page) because I thought $pages->uncacheAll() was basically the same. Maybe this is not true (?). The results don't show any well-defined boost in performance (I guess ryan's recent reply predicted this).

I still need to try @BitPoet's suggestion, because I am sure it will boost performance. I am currently running these tests on my personal computer; I will try it when running on the dedicated server. I would also like to try generators (this is the first time I have heard of them).
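For reference, a generator-based reader could look roughly like the sketch below. It yields one parsed row at a time, so the whole csv never sits in memory. The function name readRows() is hypothetical, and the space delimiter is taken from the pseudo code in my first post:

```php
<?php
// Hypothetical helper: lazily read a delimited file, one row per iteration.
function readRows(string $path, string $delimiter = ' '): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Enclosure and escape are passed explicitly to keep the call unambiguous.
        while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv returns [null] for a blank line
            }
            yield $row;
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}
```

The import loop would then become foreach (readRows($file) as $i => $row) { ... }, with the page creation and the periodic commit unchanged inside the loop body.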

______________________________________________________________________________________________________________________________________________________________________________________________

One last thing regarding the fields in the protein template and the data structure in general (the pseudo code I posted initially was just as an example).

Proteins are classified into groups. Each protein can belong to more than one group (max. 5). My original idea was to use repeaters because for each protein I have the following information repeated:

GroupID [integer], start [integer], end [integer], sequence [text]

The idea is that from GroupID you can go to the particular group page (I have around 50k groups) but I don't necessarily need a page reference for this.

The csv is structured as follows. Note that some protein entries are repeated, which means that I shouldn't create a new page for them but rather add an entry to the repeater field.

Protein-name groupID start end sequence
A0A151DJ30 41 3 94 CPFES[...]VRQVEK
A0A151DJ30 55 119 140 PWSGD[...]NWPTYKD
A0A0L0D2B9 872 74 326 MPPRV[...]TTKWSKK
V8NIV9 919 547 648 SFKYL[...]LEAKEC
A0A1D2MNM4 927 13 109 GTRVW[...]IYTYCG
A0A1D2MNM4 999 119 437 PWSGDN[...]RQDTVT
A0A167EE16 1085 167 236 KTYLS[...]YELLTT
A0A0A0M635 1104 189 269 KADQE[...]INLVIV
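One way to handle the repeated entries is to collect consecutive rows that share a protein name before creating the page. This is only a sketch: the function name groupConsecutive() is hypothetical, and it assumes that rows for the same protein are adjacent in the file, as they are in the sample above. If they are not, the file would need to be sorted first.

```php
<?php
// Hypothetical helper: group consecutive rows of
// [name, groupID, start, end, sequence] by protein name.
// Yields "name => list of entries" once per run of identical names.
function groupConsecutive(iterable $rows): Generator
{
    $current = null;
    $entries = [];
    foreach ($rows as $row) {
        if (count($row) < 5) {
            continue; // skip incomplete rows
        }
        [$name, $groupId, $start, $end, $sequence] = $row;
        if ($current !== null && $name !== $current) {
            yield $current => $entries; // previous protein is complete
            $entries = [];
        }
        $current = $name;
        $entries[] = [
            'groupID'  => (int) $groupId,
            'start'    => (int) $start,
            'end'      => (int) $end,
            'sequence' => $sequence,
        ];
    }
    if ($current !== null) {
        yield $current => $entries; // flush the last protein
    }
}
```

Each yielded pair then maps to a single page: one save for the protein, with all of its group entries added in the same pass. Only the entries of one protein are held in memory at a time.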

Since I know repeaters also create additional overhead, I am doing all my benchmarks without them. I can always build the website without them. In the next few days I will do some benchmarks including repeaters just to see how it goes.

Once again, thanks for all the replies!

April 12, 2021

Hi,

A bit of background. I am creating a website which lets you navigate through a protein database with 20 million proteins grouped into 50 thousand categories.

The database is fixed in size, meaning no need to update/add information in the near future. Queries to the database are pretty standard.

The problem I am currently having is the time it takes to create the pages for the proteins (right now around a week). Pages are created reading the data from a csv file. Based on previous posts I found on this forum (link1, link2) I decided to use $database transactions to load the data from a php script (InnoDB engine required). This really boosts performance from 25 pages per second to 200 pages per second. Problem is performance drops as a function of pages created (see attached image).

Is this behavior expected? I tried using gc_collect_cycles() but haven't noticed any difference.

Is there a way to avoid the degradation in performance? A stable 200 pages per second would be good enough for me.

Pseudo code:


$handle = fopen($file, "r");
$trans_size = 200;   // commit to the database every $trans_size pages

try {

    $database->beginTransaction();

    for ($i = 0; $row = fgetcsv($handle, 0, " "); ++$i) 
    {

        // fields from data
        $title                  = $row[0];
        $size                   = $row[1];
        $len_avg                = $row[2];
        $len_std                = $row[3];

        
        // create page
        $page = new Page();
        $page->template          = "protein";
        $page->title             = $title;
        $page->size              = $size;
        $page->len_avg           = $len_avg;
        $page->len_std           = $len_std;
        $page->save();
        
        
        if (($i+1)%$trans_size == 0)
        {
            $database->commit();
            // $pages->uncacheAll();
            // gc_collect_cycles();
            $database->beginTransaction();

        }
    }
    $database->commit();
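} catch (\Exception $e) {
    // undo the uncommitted batch, then re-throw the error
    if ($database->inTransaction()) $database->rollBack();
    throw $e;
} finally {
    fclose($handle);
}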

I am quite new to ProcessWire, so feel free to criticize me.

Thanks in advance

March 26, 2021

Hi!

I just tried out your solution and it worked perfectly! Thanks a lot for the quick reply and the amazing explanation!

Quote

But the problem is that you have to keep track of the exact file/folder name of your temporary file. And another problem is, that you can end up with security problems, because the file content of a file can change over time whereas the link to that file could still be the one from an earlier version of another file.

Since I am creating temporary files on demand for download, I think I don't have a big security issue, nor do I need to track the files, right?

Basically there is a link to a download.php which triggers the download of the file. There is no visible link to that file.

I include the download.php code just to close the thread properly. It is probably not the best code, so if there is something I should change please let me know.

// create temp dir
$temp_dir = $files->tempDir('downloads');
$temp_dir->setRemove(false);
$temp_dir->removeExpiredDirs(dirname($temp_dir), $config->erase_tmpfiles); // remove dirs older than $config->erase_tmpfiles seconds

// create zip
$zip_file = $temp_dir . "test.zip";
$result_zip = $files->zip($zip_file, $data);

// download pop-up
if (headers_sent()) {
    echo 'HTTP header already sent';
} else {
    if (!is_file($zip_file)) {
        header($_SERVER['SERVER_PROTOCOL'].' 404 Not Found');
        echo 'File not found';
    } else if (!is_readable($zip_file)) {
        header($_SERVER['SERVER_PROTOCOL'].' 403 Forbidden');
        echo 'File not readable';
    } else {
        header($_SERVER['SERVER_PROTOCOL'].' 200 OK');
        header("Content-Type: application/zip");
        header("Content-Transfer-Encoding: Binary");
        header("Content-Length: ".filesize($zip_file));
        header("Content-Disposition: attachment; filename=\"".basename($zip_file)."\"");
        readfile($zip_file);
        exit;
    }
}

March 21, 2021

Hi,

I am trying to use wire tempDir (/site/assets/cache/WireTempDir) to store temporary files for users to download.

The idea is that the files are created on request, they exist for a few hours till the download is over and then the folder is cleaned automatically.

Based on some useful posts I found in the forum I am able to create the files on demand using the following lines:

$wire_temp_dir = $files->tempDir('temp_downloads');
$wire_temp_dir->setRemove(false);
$temp_path = (string) $wire_temp_dir;
$zip = $temp_path . "test.zip";
$result_zip = $files->zip($zip, $files_array);

The problem comes when I try to download the file from the website. I get the infamous error:

Forbidden You don't have permission to access this resource.

I am not able to reach the files from the website and I don't understand why. Permissions to the folder are correct (I created a test folder under assets with the same permissions as cache and I can reach the files there). I am using apache2.

I searched a lot before posting this but I am currently out of ideas. Keep in mind I am quite new to web development and ProcessWire.

Lastly, I have an additional problem: I notice the files are not removed after the default 120 seconds specified in the docs.

Thanks in advance!

Fede


Posts posted by fedeb, by thread (listed in the same order as the posts above):

July 1, June 30 and June 30, 2021: Best way of redirecting to a page

May 21, May 20 and May 20, 2021: Error while using custom options to style renderPager()

May 5, 2021: Creating 20 million pages

April 20, 2021: Events Fieldtype & Inputfield (How to make a table Fieldtype/Inputfield)

April 14, April 13 and April 12, 2021: Creating 20 million pages

March 26 and March 21, 2021: [solved] Using WireTempDir
