
Find most efficient solution for storing multiple values in a field


gebeer

Hi all,

I'm not sure how to best store multiple timestamps in a field for a page. As there can be 100 or more timestamps attached to the page, I need to find the most efficient way to save those timestamps in a field and later search through that field.

My use case:

I have pages for advertisements with start date, end date and start time, end time and frequency of publishing (every 15, 30, 60 minutes). From that data I calculate timestamps at which the advertisement will be available for viewing.

One advertisement can have 100 or more timestamps attached to it. Example:

title of page: Advertisement1

field "ad_publish" contains timestamps: 1413136860,1413136920,1413136980,1413137040,1413137100...(can be a lot)

Now what would be the best way of storing these multiple timestamps? IOW: what field type should I use?

I need to take into consideration that the field needs to be searchable with a *= selector like

$publishedAds = $pages->find("template=advertisement, ad_publish*={$timestamp}")

2 options that I can think of for my "ad_publish" field:

1. textarea field, store timestamps as comma delimited values.

2. page field: store each timestamp as a page

What would be more efficient and faster when it comes to searching through hundreds of timestamps?

I'm sure some of you had similar use cases in their projects and I would be happy if you could share your experience or give me pointers on how to best approach this. Thank you.


@gebeer

I'd steer towards the page field option above or go with a custom DB table handled by a module. Anyway, I'm not sure you have to be constrained to a full-text search selector. Is there some constraint on this project that's forcing you to use fulltext fields? I think you can arrange to do things with a normal "equals" selector (which is probably a lot faster), though it might require some more setup.

It would be good to benchmark the solution you are using with the ad_publish field as a text field and doing the fulltext-search in the selector and then comparing it to something like this (untested)...

  • Create a timestamp field called "publish_at" and a page field called "advert_id" which is set up to hold a single value or null in the config page.
  • Create a template called publishing_schedule and add both the "publish_at" and "advert_id" fields to it.
  • Create a parent page called "Publishing Schedule" in your page tree somewhere (it can use the basic-page template) and set up its children to use the publishing_schedule template.
  • Set up a page save hook so that when you save an advert (where you specify your start date + time, end date + time, frequency etc) you then calculate your publishing timestamps using existing code or the when library and, for each timestamp, create a new page under the "Publishing Schedule" parent page using the "publishing_schedule" template. Set the publish_at and advert_id fields as required and save the page.

You should now be able to find all ads published on a particular timestamp with a simple query like...

$published_at = $pages->find("template=publishing_schedule, publish_at=$timestamp");
foreach ($published_at as $published) {
    $advert = $published->advert_id;

    /**
     * Access your advert fields as needed.
     */
    echo $advert->title; // etc
}

You could then benchmark which one worked better for you.


Thanks a lot, netcarver, for your detailed answer.

I think I will go the pages route and save every timestamp as a page.

Following your example above, I'd need a pages field type in my advertisement template to store the references to the timestamp pages, right?

Or don't store the references and just extend your sample code:

$published_at = $pages->find("template=publishing_schedule, publish_at=$timestamp, advert_id={$page->id}");

And thanks also for pointing me to the when library. Interesting stuff. Might save me from reinventing the wheel :-)


Thanks for posting the link to recurr - looks like another interesting library. I should have posted the link to the development branch of the when library. Most of the documentation is done by example in the tests subfolder. Minutely recurrence is handled, check out the second example here. Let me know which library works out better for you as I'll be needing one of these myself really soon.


Thanks, great.

I'll test both of them over the next days and will let you know.

EDIT:

After having a quick look at both libraries, I found that both of them require a count for the recurrence of events. So I first need to calculate the count myself from the data available.

Would have been great to just throw start date/time, end date/time and interval at some library and have it figure out the count by itself. (Being lazy again :P ).

Think now I understand count. In recurr FREQ=MONTHLY;COUNT=5 means a monthly recurrence that stops after 5 occurrences.


As both when and recurr do not work with start/end dates but rather need a count for recurring events, I don't really want to use them because I can't get my head around calculating the count from my start date, end date and interval.

For now I'm using my own function derived from here to calculate my recurring events. It is not really flexible and assumes that you provide the date/time in a specific format and the interval in minutes. This function suits my job and makes use of the PHP DatePeriod Class.

function getRecurrences($start,$end,$interval) {
    $format = 'Y-m-d H:i';
    $start = DateTime::createFromFormat($format, $start);
    $end = DateTime::createFromFormat($format, $end);
    $end = $end->modify( '+1 minute' ); 

    $duration = 'PT' . $interval . 'M';
    $interval = new DateInterval($duration);
    $daterange = new DatePeriod($start, $interval ,$end);

    return $daterange;

    /* to get the timestamps use a loop like this
    foreach($daterange as $date){
        echo $date->format($format) . " : " . $date->getTimestamp() . "<br>";
    }*/
}
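
As a usage illustration of the function above, here is a minimal sketch that collects the Unix timestamps into an array. The helper name recurrenceTimestamps is hypothetical, and pinning everything to UTC is an assumption made so the result does not depend on the server's timezone or on daylight saving changes.

```php
<?php
// Hypothetical helper: collects Unix timestamps for every slot from $start to $end (inclusive).
function recurrenceTimestamps(string $start, string $end, int $minutes): array
{
    $format = 'Y-m-d H:i';
    $utc = new DateTimeZone('UTC'); // assumption: all times are treated as UTC
    $from = DateTime::createFromFormat($format, $start, $utc);
    $to = DateTime::createFromFormat($format, $end, $utc);
    if ($from === false || $to === false || $minutes < 1) {
        throw new InvalidArgumentException('Bad date format or interval');
    }
    // DatePeriod excludes the end date, so move it just past the last wanted slot.
    $to->modify('+1 minute');

    $stamps = [];
    foreach (new DatePeriod($from, new DateInterval("PT{$minutes}M"), $to) as $date) {
        $stamps[] = $date->getTimestamp();
    }
    return $stamps;
}

// One hour at a 15 minute frequency: 20:00, 20:15, 20:30, 20:45 and 21:00.
$stamps = recurrenceTimestamps('2014-10-14 20:00', '2014-10-14 21:00', 15);
echo implode(',', $stamps), PHP_EOL;
```

The input check also rejects an interval of 0 minutes, which would otherwise need separate handling before a DateInterval is built from it.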

Seems like I'm working too many hours and can't think clearly ATM.

Taking my function from above, I could easily calculate the count and use that in either when or recurr:

function getRecurrenceCount($start,$end,$interval) {
    $format = 'Y-m-d H:i';
    $start = DateTime::createFromFormat($format, $start);
    $end = DateTime::createFromFormat($format, $end);
    $end = $end->modify( '+1 minute' ); 

    $duration = 'PT' . $interval . 'M';
    $interval = new DateInterval($duration);
    $daterange = new DatePeriod($start, $interval ,$end);

    $count = iterator_count($daterange);
    return $count; 
}
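
For fixed minute intervals the count can also be worked out arithmetically, without iterating a DatePeriod. This is only a sketch under the same assumptions as the function above (dates in 'Y-m-d H:i' format, interval in whole minutes); the name recurrenceCount and the use of UTC are assumptions for the example.

```php
<?php
// Hypothetical helper: number of slots from $start to $end (inclusive), computed without a loop.
function recurrenceCount(string $start, string $end, int $minutes): int
{
    $format = 'Y-m-d H:i';
    $utc = new DateTimeZone('UTC'); // assumption: all times are treated as UTC
    $from = DateTime::createFromFormat($format, $start, $utc);
    $to = DateTime::createFromFormat($format, $end, $utc);
    if ($from === false || $to === false || $minutes < 1) {
        throw new InvalidArgumentException('Bad date format or interval');
    }
    $span = $to->getTimestamp() - $from->getTimestamp();
    if ($span < 0) {
        return 0; // end lies before start, so there are no slots
    }
    // Full intervals that fit into the span, plus one for the start slot itself.
    return intdiv($span, $minutes * 60) + 1;
}

echo recurrenceCount('2014-10-14 20:00', '2014-10-14 21:00', 15), PHP_EOL;
```

Both approaches should agree as long as the interval is a fixed number of minutes; the DatePeriod version remains the safer choice if calendar-based intervals (days, months) are ever needed.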

Update:

My function for calculating the recurrences is working fine and producing the correct timestamps.

I tried and compared both methods in terms of performance.

When I save each timestamp in an extra page together with the advert id, as suggested by netcarver, it takes quite a long time to create and save those pages if I have hundreds of timestamps for one ad. If it gets beyond 1300 or so pages, I get an error

Maximum execution time of 30 seconds exceeded

I could set a higher limit for the execution time. But then the user would have to wait quite some time until the operation is finished.

I need to delete all timestamp pages that belong to an ad when the ad gets edited and the start/end times change, before the new timestamp pages get created. This also takes quite long for 100s of pages.

So I also tried option 1 from my first post and save all timestamps comma delimited in a textarea field. This is much much faster when creating/editing an ad.

Now I have to benchmark and compare the 2 methods when it comes to searching through the timestamps either as pages or within the textarea.

Is there a PW way of showing execution times when $config->debug = true? Couldn't find anything on showing debug info in the frontend except Ryan's post here.


@gebeer,

Good to see some benchmark results for those; thanks for posting them. So, it sounds like option 1 might be the way to go for creation/deletion and I'm now curious to know what happens when searching.

There are timer helpers that you can use when in debug mode in PW's debug class.

Hope that helps!
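
Independent of ProcessWire's own helpers, a rough wall-clock measurement can be taken with plain PHP. The sketch below is not part of the ProcessWire API; timeIt is a hypothetical helper and the workload inside the closure is only a stand-in for the call being measured.

```php
<?php
// Hypothetical helper (not part of ProcessWire): runs $fn once and returns the elapsed seconds.
function timeIt(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

$elapsed = timeIt(function () {
    // Stand-in workload: build a comma delimited list of about 600 timestamps and search it.
    // Replace this with the save or $pages->find() call you want to measure.
    $list = implode(',', range(1413136860, 1413136860 + 600 * 60, 60));
    strpos($list, '1413150000');
});
printf("%.4f s\n", $elapsed);
```

A single run is noisy, so repeating the measurement a few times and comparing averages gives a fairer picture.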


Results for saving 581 timestamps:

1. save as comma delimited values in a textarea field: 0.0258 s

2. save each in a page (create and save 581 pages): 18.1477 s

So clearly, option 1 would be the way to go for saving timestamps.

Next I will benchmark search times and add them to this post.

Search results

1. Searching through 5 pages to find a timestamp inside a comma delimited list (about 600 timestamps) with this query, 5 matches:

$ads = $pages->find("template=advertisement, ad_server={$serverID}, ad_publish_at*=$ts");

0.0264 s

2. Searching through 2600 pages to find the timestamp with this query, 5 matches:

$ads = $pages->find("template=publishing_schedule, publish_at=$ts");

0.0200 s

I will have to create some more ads to see how option 1 performs when more pages need to be searched. But ATM it looks like I will go with option 1 because it is significantly faster when saving timestamps.


Just create a custom Fieldtype (module) that stores the timestamps properly in a table designed for the purpose. Saving and querying them would then be blazing fast, with no need for dirty tricks like comma delimited values in a blob field (which obviously wouldn't be very flexible to query on, especially if you need comparison operators, nor would it be very flexible when modifying the values).

If you need help implementing such Fieldtype, just let me know.
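
To illustrate the point about comparison operators: with a comma delimited text field, a range query has to be done in application code after the value has been loaded, roughly as in the sketch below. This is plain PHP rather than ProcessWire API, and timestampsInRange is a hypothetical helper. With one integer per row in a dedicated table, the same comparison could instead be left to the database.

```php
<?php
// Hypothetical helper: returns the timestamps from a comma delimited string that fall in [$from, $to].
function timestampsInRange(string $csv, int $from, int $to): array
{
    // Split the list, drop empty pieces (e.g. from a trailing comma) and cast to integers.
    $stamps = array_map('intval', array_filter(explode(',', $csv), 'strlen'));

    // Keep only values inside the range and reindex the result.
    return array_values(array_filter($stamps, function (int $ts) use ($from, $to) {
        return $ts >= $from && $ts <= $to;
    }));
}

$csv = '1413136860,1413136920,1413136980,1413137040,1413137100';
print_r(timestampsInRange($csv, 1413136900, 1413137000));
```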


Now I have searched through 500 ads to find a timestamp within comma delimited values in a textarea field using this query:

$ads = $pages->find("template=advertisement, ad_publish_at*=$ts");

500 matches. Time: 0.7691 s

As I will not have to search through more than 50-100 ads, I can live with the performance.

So I will choose option 1 and save my timestamps as comma delimited values in a textarea field.


The EventField module is very interesting. But, honestly, the code is going a little bit over my head, so I'm not quite sure how I would implement something similar for my use case. In particular, I don't quite understand yet how the storing part works. The only DB-related method I can find in the module code is getDatabaseSchema().

I'll try to describe as exactly as possible what I need.

My form is frontend and populated from a template "advertisement". Relevant fields:

-ad_publish_up: date field -> publishing start date/time

-ad_publish_down: date field -> publishing end date/time

-ad_frequency: radio (from pages) -> interval that advert should get published (0,15,30,60 min)

-ad_publish_at: this would be my custom field type that stores all publishing dates as timestamps in an extra table.

The form is working and the processing logic is implemented:

-calculate recurrences from ad_publish_up, ad_publish_down and ad_frequency as timestamps

-store those timestamps in ad_publish_at: ATM a textarea field, timestamps are stored comma delimited

What I need now, is a custom field type that stores all timestamps for an advert in a table. There's only one column required in that table.

Later I need to be able to search for adverts with a certain timestamp within my custom field with a query like

$pages->find("template=advertisement, ad_publish_at=$timestamp");

I don't need all the input field logic because the timestamps will never get entered manually through the form. They will come from an array and each item in that array will be stored as one row in the custom field table.

If you could give me any pointers on how to modify the EventField module so it will fit my needs, I'll be more than happy.

Cheers

Gerhard


@Gerhard,

Note I am also new to Fieldtypes :-) so you may want to thoroughly test/validate the attached. I have customised FieldtypeEvents (mainly just commenting out stuff actually :P) to only save timestamps (1 column table). As was native to FieldtypeEvents, each timestamp on a page is saved in its own row in the DB.

Although I have also included the inputfield, you have several options: disable its page visibility in the template field visibility settings; disassociate it from the FieldtypeTimestamps class and not install it; or install it and use it to also have a visual view of the timestamps. You also get the nice jQuery 'delete' or 'delete all' timestamps (events). If you don't want to install the Inputfield, you will also have to comment out/remove the method getInputfield() from FieldtypeTimestamps.module.

Adding timestamps via API

As shown here, adding items to the Fieldtype via the API is quite easy (similar code exists in the ___processInput() method in InputfieldEvents; in the attached this is InputfieldTimestamps).

$times = array('1521884800', '1721884800', '1571884800');//these can also be other valid PHP date/times, e.g. 20-10-2007

$page->of(false);
foreach($times as $t) {
     
      $timestamp = new Timestamp();//this is the Class Timestamp found in Timestamp.php. Included via FieldtypeTimestamps.
      $timestamp->date = $t; // note stored in column 'data' in the db of the Field
      $page->timestamps->add($timestamp);
}

  $page->save();
  $page->of(true);

To output timestamps using API

 foreach ($page->timestamps as $t) {
           echo "<p>Date: $t->date</p>";
 }

Finding items via API

See Ryan's example here; I didn't check if it needs tweaking.

Other things to note

You can change the Inputfield to show timestamps rather than the date - InputfieldTimestamps.module lines #32-33

Currently the time format is hard coded as d-m-Y in Timestamp.php line #9

Left the commented out stuff in there...you can clean them out if all goes OK :-)


FieldTypeTimestamps.zip

Edited by kongondo

Awesome! Thank you so much kongondo. PW forum is really outstanding thanks to people like you :)

I'll be on the road until the weekend but sure will find some time for implementing/testing this while I'm on the train. Will report back here how it works out.

Cheers

Gerhard


I installed the module and added a field "ad_publish_at" of type Timestamp to my advertisement template.

I also added logic for storing my timestamp values

        $times = getRecurrences($startTime,$endTime,$interval);

        foreach ($times as $t) {
            $t = $t->getTimestamp();
            $timestamp = new Timestamp();//this is the Class Timestamp found in Timestamp.php. Included via FieldtypeTimestamps.
            $timestamp->date = $t; // note stored in column 'data' in the db of the Field
            $editpage->ad_publish_at->add($timestamp);
         } 

Timestamps get calculated correctly and are saved in their own DB table :)


For debugging purposes I show the field in my frontend form. There I encounter a problem: same date is shown for all timestamps


I get a PHP Notice

Notice: A non well formed numeric value encountered in /var/www/gocinet/site/modules/FieldTypeTimestamps/InputfieldTimestamps.module on line 32

Line 32 in InputfieldTimestamps.module reads

		$date = $timestamp->date > 0 ? date(Timestamp::dateFormat, $timestamp->date) : '';//formatted date - Y-m-d H:i [see Timestamp.php]

With var_dump($timestamp->date), I get

string '2014-10-14 20:00'

Even if I set different dateFormat like d/m/Y, Y-m-d, I always get that notice and all date inputs in the form show the same dates (Unix Epoch start).

Any ideas what could cause this behaviour?


I think your $timestamp->date should not be the formatted one; it needs to be an integer because it gets compared with > 0, which is a numeric / integer comparison.

The formatted finally goes into $date, but it needs the integer timestamp to build it from. (php.net date)

I would try to get the integer (UnixTimestamp) into $t in your foreach loop.

foreach ($times as $t) {
    ...
    $timestamp = new Timestamp();//this is the Class Timestamp found in Timestamp.php. Included via FieldtypeTimestamps.
    $timestamp->date = $t; // $t should be integer here
Edited by horst

I can't delete timestamps using the trash icons in my form. This might be related to this post (towards the end).

I tried commenting out

		if("$timestamps" != "$this->value" || $numDeleted) {

in InputfieldTimestamps.module.

But still I can't delete or save changes through manual input in my form.

Looking at the module code I can't seem to find a method for deleting entries. How would I go about deleting all entries for a specific page via API?


@horst

thanks for pointing this out.

In my foreach, I already convert $t to an integer with $t = $t->getTimestamp();

I did some research on the PHP Notice about the non well formed numeric value and found a question on stackoverflow.

Following the proposed solution there, I altered line 32 in InputfieldTimestamps.module to read

$date = $timestamp->date > 0 ? date(Timestamp::dateFormat, strtotime($timestamp->date)) : '';

Now the PHP Notice is gone :)

And I get the correct dates shown
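
The notice arose because date() expects an integer timestamp as its second argument, while the field returned a formatted string such as '2014-10-14 20:00'. As a minimal sketch of a more defensive conversion, the hypothetical helper toUnixTimestamp below accepts either form. It is not part of the module, and setting UTC as the default timezone is an assumption made only to keep the example reproducible.

```php
<?php
date_default_timezone_set('UTC'); // assumption for the example; use the site's timezone in practice

// Hypothetical helper: accepts an integer timestamp, a numeric string, or a formatted date string.
function toUnixTimestamp($value): int
{
    if (is_int($value)) {
        return $value; // already a Unix timestamp
    }
    if (is_string($value) && ctype_digit($value)) {
        return (int) $value; // numeric string, which strtotime() may not parse as intended
    }
    $ts = strtotime((string) $value);
    if ($ts === false) {
        throw new InvalidArgumentException("Unparseable date: $value");
    }
    return $ts;
}

// Both calls format the same moment.
echo date('d-m-Y H:i', toUnixTimestamp('2014-10-14 20:00')), PHP_EOL;
echo date('d-m-Y H:i', toUnixTimestamp(1413316800)), PHP_EOL;
```

Wrapping the value this way means the output line keeps working whether the field hands back a raw timestamp or an already formatted date.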


