Jump to content
Hani

PW Performance with hundreds of fields

Recommended Posts

Processwire's performance has always been fantastic for the projects I've used it with - but I'm currently working on a real estate project (converting it to PW) and hitting some performance issues.

  1. I have about 800 different fields in PW (crazy, I know)
  2. I have a few templates with about 400 fields each (crazy, I know) - all of these fields, with the exception of about 10, are simple text fields.  Quite often, pages using these templates have a lot of blank fields.  One of the fields is an images field.

Several of the MySQL tables I'm pushing over into the PW template/field structure has about 400 columns in them.  (Each record is a real estate property listing.)


Important note: the site is currently on a shared host, so I'm not completely certain if that is the sole issue for the poor performance or if it is simply the fact that there are hundreds of fields to deal with.

Issues I'm facing:

  1. Whenever I go to /processwire/setup/field/ (to see the fields), it takes FOREVER to load the page (and sometimes times out).  (I know that's a lot of info to load, but what's happening in the background here?  Can something be done to make it load faster?)
  2. Creating a new page via the API and populating it with data in one shot takes quite a long time.

Questions:

  1. Can using a VPS single-handedly fix the performance issue?
  2. Is there a better way to structure the data rather than having it all in one page/template?  (Consider the fact that I have all the data at my disposal when creating a new property record and it all needs to get saved.)
  3. Is having that many fields in PW in itself not a good idea?  (Meaning, would it be better to leave the data in MySQL - outside of PW - and grab the data using wire('db') and custom SQL statements?  I would just hate to resort to creating SQL search queries for all the queries I'd be using.  I'd much rather use PW's selectors.)
  • Like 1

Share this post


Link to post
Share on other sites

hm, 400 field in one template sound really like a lot. Maybe you could optimize those templates by joining all those fields in one. If you're are not planning to use most of them for making searches and you only need to spit them when they are called by a page, crate a textarea and make a list with the info from all those fields:

field1: "content"
field2: "content"
field3: "content"
...

To present the content of this field, just explode it (but not literally, please)

  • Like 3

Share this post


Link to post
Share on other sites

800 fields!? You're crazy :D

Important note: the site is currently on a shared host,

This of course would be a important factor. I think it would all work but you'll need a strong server to handle that much fields. Even though then it may can handle it it would be still quite a load on server.

So the question would be why 800 fields (crazy) and a template with 400 fields (crazy) is a -bit- overkill. I'm not sure how it comes you need so many fields on one page.

Unfortunately I don't see much that could be done in the list fields admin that renders all fields as a list, since now it used a lot of filtering and grouping by labels it can't be easily paginated or something. Only thing I see would be to create custom fields list admin that is limit to 50 fields, or use a search (admin hotkeys can do this too) to get to a field.

post-100-0-86709100-1378547585_thumb.png

I'm also thinking if it wouldn't be possible to simplify this. As diogo mentioned you could maybe serialize the data some.

After all this sound a lot like a data heavy application (really heavy) that maybe too much to go the PW way of creating all and every field as a field and attach it to a template. This sounds more like a custom application that could use a custom database or tables.

What I would consider is, as you said, to build some custom tables for the data and use SQL queries. Or what you could try is creating some custom simple fieldtypes that store groups of data fields to a table at the end. Like a field stores prename,lastname,address,tel etc. You'd then have less fields and still could use the API and selectors. If you need help we are here to guide you.

What I also thought is (but don't see the full context) why add 400 fields to a template and not split it up to many templates and use hierarchy in the tree with child pages to store data. But then still you'd have a lot of fields. Some of the larger sites I have +-100 fields and I think it's already quite a lot (some are alternative fields _de, _fr, _en etc). So if you could let us know more details of how it comes to you creating 800 fields it may help to understand and find alternatives.

  • Like 2

Share this post


Link to post
Share on other sites

Hi,

Mysqlperformanceblog states that "Something to look out for is a high number in the rows column.

So, Hani, how about considering repeaters or Page fields?

Share this post


Link to post
Share on other sites

Hani, ProcessWire will scale pretty infinitely in terms of pages, but not in terms of fields or templates. The same can be said of databases in general–you can have millions of rows in a table, but probably don't want millions of tables. I would say you've gone beyond what may be practical for long term use and maintenance of the site. At least, I have never tested ProcessWire with any more than 200 fields (and I thought that was an insane amount). I would be curious why so many fields are necessary? It might be good for us to get a list of all the fields, which may lead to some good suggestions on alternatives. But this is a situation where I think you might really benefit from making your own Fieldtype and Inputfield combinations. It is surprisingly simple to do, and can represent any DB table, no matter how many columns it needs. There is far less overhead in having a custom Fieldtype that represents a table with 100 columns, than there is in having 100 separate fields. Have a look at Adrian's new FieldtypePhone/InputfieldPhone module for a good example of how to represent a DB table with a Fieldtype. 

Some other options I can think of relate to ProFields, a pack of modules I've been working on to release soon (probably as commercially supported like FormBuilder and ProCache). It includes FieldtypeMultipler, FieldtypeTable, FieldtypeTextArray and FieldtypeTextareas, among others. These enable you to have a single field that can to represent any quantity of inputs. Though they all do it in a little bit different ways. FieldtypeMultiplier lets you take any other Fieldtype (that isn't already multi-value) and turn it into a multi-value field. FieldtypeTable is a multi-value compound type that lets you keep a user-defined combination of fields contained in a group… not quite as powerful as a repeater, but it's only 1 field and has far less overhead. FieldtypeTextArray lets you have a defined quantity of inputs for any text field… like say you needed a field to represent 3 email addresses, for instance. Lastly, FieldtypeTextareas lets you have 1 field that can represent any number of named textareas while having them all be searchable from the same field. For example, you could have a single field called "content" that contains separate inputs for body, sidebar, summary and about_the_author or something like that, but they only consume 1 field, accessed via $page->content->body, $page->content->sidebar, etc. All of these Fieldtypes/Inputfields are aimed at really reducing the overall quantity of fields necessary to represent lots of data. I developed this set because I had a site with lots of similar fields, and this group of modules helped me to reduce the quantity of fields by more than 70%. However, these aren't ready for production use, so it may be a little while before I have them ready. But if you'd be interested in helping to beta test them, your use case might be about perfect. 

  • Like 19

Share this post


Link to post
Share on other sites

Wow Ryan - those new multi-value fields sound brilliant - can't wait to see your implementation. I'll have to use them, just to have a reason to send some money your way.

You also really got me thinking about the potential of custom fieldtype/input field combinations for an upcoming project which is extremely data heavy. I will still definitely need several custom database tables and SQL queries for the data manipulation side of things. These queries actually take quite some time to run, but only need to be run periodically to generate results. So I am thinking that I will store the manipulated results of these queries in some custom fieldtypes which will dramatically reduce the number of PW fields, yet still allow for API querying of the final results on the front-end. So many cool possibilities :)

  • Like 1

Share this post


Link to post
Share on other sites

You're crazy :D

My wife says the same thing to me all the time!   ;)

So the question would be why 800 fields (crazy) and a template with 400 fields (crazy) is a -bit- overkill. I'm not sure how it comes you need so many fields on one page.

Here's what I'm working on: I provide a service for realtors in my area that allows them to place a search engine for real estate properties on their website (with lots of functionality for their clients like saving searches, marking favorites, etc.) - the industry term for it is an "IDX System".  It's completely functional, however, it isn't using ProcessWire.  I'm attempting version 2 of the system with PW as the foundation.  (Side note - it was a side project a few years ago and I only have 1 client using it right now, but I'm giving it some focus right now.)

Now, our local MLS provides me with all the property listing data which is located in their own database.  What I, and other IDX providers, essentially do is pull all the data and store it in our own database so that we can manipulate and search the data however we want.  There are actually a few different tables, one for each different property type; residential, commercial, land, rentals, etc.  Some of the fields are common among the tables, but some are not.

Each property listing in their database has roughly 400 fields/points of information associated with it.  Such as: MLS #, Realtor ID, Street Address, Street Suffix, City, State, ZIP Code, # of Bedrooms, # of Bathrooms, Washer/Dryer, Car port, garage, Style, Description, Listing Date, Price, Modification Timestamp, Master Bedroom Area, Master Bedroom Width, Master Bedroom Length, Bedroom 1 Area, Bedroom 1 Width, etc.  Obviously TONS of individual pieces of information for each listing.

What I have now, in the non-ProcessWire version, are 5 different tables with about 400 columns each.  It's not terribly slow since when I pull data, I only pull the fields that I need

Maybe you could optimize those templates by joining all those fields in one. If you're are not planning to use most of them for making searches and you only need to spit them when they are called by a page, crate a textarea and make a list with the info from all those fields

.Now, that I think about it - you're absolutely right.  Most of those fields, while containing data that I need to be able to display, are not fields that I have to ever search.  So as diogo and Soma said, serialization and storing the data into a textarea field would be TOTALLY fine.  And it would probably cut down on 80% of the fields.

Unfortunately I don't see much that could be done in the list fields admin that renders all fields as a list, since now it used a lot of filtering and grouping by labels it can't be easily paginated or something. Only thing I see would be to create custom fields list admin that is limit to 50 fields, or use a search (admin hotkeys can do this too) to get to a field.

If I'm able to cut down on 80% of the fields, then this might not be an issue anymore.  But I have never seen your Admin Hotkeys module - that's awesome! I'm totally going to be using it!

What I also thought is (but don't see the full context) why add 400 fields to a template and not split it up to many templates and use hierarchy in the tree with child pages to store data. But then still you'd have a lot of fields.

I thought of doing that too; creating multiple pages for each property listing and splitting up the data - but like you said, I'd still have 800 fields in the system itself and whenever I do need to pull any of the pages to get the data, it would still end up using a lot of memory.

I would say you've gone beyond what may be practical for long term use and maintenance of the site. 

Based on all the feedback, it sure sounds like I have!  :)

But this is a situation where I think you might really benefit from making your own Fieldtype and Inputfield combinations. It is surprisingly simple to do, and can represent any DB table, no matter how many columns it needs. Have a look at Adrian's new FieldtypePhone/InputfieldPhone module for a good example of how to represent a DB table with a Fieldtype. 

This sounds like a great solution too.  But I suppose that since I never really need to search on so many of the fields I currently have, serialization would work for me without much effort.  BUT the benefit of this method is that I would still maintain search capability on that data (using selectors) if I really needed it.  Right?

Some other options I can think of relate to ProFields, a pack of modules I've been working on to release soon  

Those modules sound amazing and I can think of many scenarios where I'll need them.  Like Adrian, can't wait to throw my money at you! :lol:

  • Like 1

Share this post


Link to post
Share on other sites

Forgot to say...

I think it would all work but you'll need a strong server to handle that much fields. Even though then it may can handle it it would be still quite a load on server.

I am, soon enough, going to move this all over to a VPS to handle the load - especially since my goal is to have multiple realtors using the system.  It has been just fine with only one client in the current version of the system.  But the PW version will be oh-so-much better and faster.

Share this post


Link to post
Share on other sites

I wanted to make a note for anyone else that is concerned about performance and is thinking about serializing data and saving it to a textarea field:

After some research and a little testing, it seems that json_encode() is faster than serialize(). Other than being faster (at times, nearly 100% faster), it is also more easily readable without decoding and it's more portable.

This is best for simple arrays.  If you're storing objects that need to be unserialized to the correct class, you'll want to use serialize().

  • Like 5

Share this post


Link to post
Share on other sites

Thanks, Hani. PHP 5.5 dropped JSON support due to the developer's aesotheric license. But those of us who are not yet at the front line are fine :)

  • Like 1

Share this post


Link to post
Share on other sites

PHP 5.5 dropped JSON support due to the developer's aesotheric license.

I was really intrigued by that (and getting mad at lawyers!), so I Googled for more info.  Apparently, it's not quite removed. There was widespread misunderstanding of what was going on.

See here: http://iteration99.com/2013/php-json-removed-from-php-5-5/

So, developers, rejoice! Workarounds won't be necessary when upgrading to 5.5 so long as you have the JSON-C extension installed.   :lol:

  • Like 3

Share this post


Link to post
Share on other sites

It seems to me that whoever created that database in the first place is trying to be too atomic (http://en.wikipedia.org/wiki/Atomicity_(database_systems)). You could replace all the fields containing eg.

Master Bedroom Area, Master Bedroom Width, Master Bedroom Length, Bedroom 1 Area, Bedroom 1 Width, etc. 

with something like Ryan's Google Map Marker fieldtype (http://modules.processwire.com/modules/fieldtype-map-marker/), such that room description (Master Bedroom) and the 2 dimensions are contained in one fieldtype, being reused for the different rooms.

I think this might reduce the number of different fields required.

  • Like 4

Share this post


Link to post
Share on other sites

Yup - whenever I pull records from their database, it's MAINLY all or nothing.  I can pull just record IDs to see what records are still in their database - but that only serves to remove records on my end that are no longer in theirs.

GREAT call about Ryan's Google Map Marker fieldtype!  That would totally work in this case.

Share this post


Link to post
Share on other sites

This is an old topic worth resurrecting :) 

@Hani

I'm sure you've made some progress in speed and performance. Have you switched to a VPS? What did you do to reduce your number of fields?

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...