briangroce Posted February 9, 2013

Hi guys,

So, I've been working with ProcessWire quite a bit now and have transferred almost all of the development work that I possibly can to PW, and it is doing a GREAT job! It is so easy to do complicated tasks.

I have a new project that I am working on. I am building some integrations with the Amazon MWS SDK. I have built a queue system to send emails through Amazon's Simple Email Service API so that the email comes from their servers.

The purpose of the application for now is to store new orders that come in to a seller's account. When an order ships, I want to trigger emails at different intervals (24 hours later, 48 hours later, etc.).

Anyway, I have set up 3 pages in ProcessWire:

Scheduled Emails
Emails to Send
Sent Emails

I have 3 cron jobs set up.

The first cron job is a script that runs every 10 minutes, grabs all orders placed in the last 15 MINUTES from an Amazon seller's account, and stores each one under a page in ProcessWire that I call "Orders" if it isn't there already. If it finds an order that has just shipped, it looks at another script that gives instructions about when to send an email to that person. After this, the script creates a page under "Scheduled Emails" with a few fields, including the email that should be sent and the time that it should be sent.

The second cron job checks the Scheduled Emails page for children whose time-to-send field is earlier than the current time. If the time to send has passed, that page is added to the Emails to Send page.

The third cron job grabs the email pages from Emails to Send and loops through them, sending the emails from the page fields. This script also regulates the rate at which email is sent via the Amazon API.

With all of that said, the issue I am having here is this:

I'm having a scalability dilemma. If this system had say 3,000 users who are Amazon Sellers, I have a major timing problem. I cannot grab data for all 3,000 users from the API at the same time. If I get a portion of them every few minutes with the Cron instead, then I may miss orders because my query for orders in the last 15 minutes might not work. Plus, if I looped through like that, there might be a big delay before orders are actually updated. I would like to have the most up-to-date information possible!

If you guys have any suggestions or direction on how to download data from an API for this many users at the same time and keep everything running smoothly, I would really, really appreciate it!

Thank you in advance!

- Brian
apeisa Posted February 9, 2013

One way to solve that kind of problem is to use a messaging queue, like http://www.rabbitmq.com/ or http://aws.amazon.com/sqs/. There are probably simpler solutions too, but that way you don't hit scaling problems.
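A minimal sketch of the queue idea above, written in Python rather than PHP and using the standard library's in-process `queue.Queue` as a stand-in for RabbitMQ or SQS. The function `fetch_orders_for` and the worker count are hypothetical placeholders, not part of any Amazon SDK. The point is only the shape: one job per seller goes onto the queue, and a fixed pool of workers drains it, so the number of sellers no longer dictates how many API calls happen at once.

```python
import queue
import threading


def fetch_orders_for(seller_id):
    # Hypothetical placeholder for the real Amazon MWS request.
    return f"orders for seller {seller_id}"


def worker(jobs, results, lock):
    """Pull seller IDs off the queue until a None sentinel arrives."""
    while True:
        seller_id = jobs.get()
        if seller_id is None:  # sentinel: no more work for this worker
            jobs.task_done()
            break
        result = fetch_orders_for(seller_id)
        with lock:
            results[seller_id] = result
        jobs.task_done()


def run(seller_ids, worker_count=4):
    """Enqueue one job per seller and let a fixed pool of workers process them."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for seller_id in seller_ids:
        jobs.put(seller_id)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return results


results = run(range(1, 11))
print(len(results))
```

With a real broker the producer and the workers would be separate processes (for example, one cron job that enqueues sellers and several long-running consumers), but the division of labour is the same.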
ryan Posted February 9, 2013

"I'm having a scalability dilemma. If this system had say 3,000 users who are Amazon Sellers, I have a major timing problem. I cannot grab data for all 3,000 users from the API at the same time. If I get a portion of them every few minutes with the Cron instead, then I may miss orders because my query for orders in the last 15 minutes might not work."

This might be a dumb question, but... couldn't you just overlap? Pull the last 30 minutes, even if you only need 15. And then skip over any you've already processed?
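The overlap suggestion can be sketched roughly as follows, in Python for brevity. The order dictionaries, the `id` key, and the in-memory `seen` set are assumptions for illustration; in the setup described in this thread, the "already processed" check would be a lookup under the "Orders" page instead.

```python
from datetime import datetime, timedelta


def fetch_window(now, lookback_minutes=30):
    """Return (start, end) for a polling window longer than the cron interval."""
    return now - timedelta(minutes=lookback_minutes), now


def new_orders(fetched, seen_ids):
    """Keep only orders not processed before, and remember their IDs."""
    fresh = []
    for order in fetched:
        if order["id"] in seen_ids:
            continue  # already stored by an earlier, overlapping run
        seen_ids.add(order["id"])
        fresh.append(order)
    return fresh


start, end = fetch_window(datetime(2013, 2, 9, 12, 0))
print(start, end)

seen = set()
first_run = [{"id": "A1"}, {"id": "A2"}]
second_run = [{"id": "A2"}, {"id": "A3"}]  # A2 reappears because the windows overlap
print([o["id"] for o in new_orders(first_run, seen)])   # ['A1', 'A2']
print([o["id"] for o in new_orders(second_run, seen)])  # ['A3']
```

Because duplicates are dropped by ID, the lookback can be made as generous as needed without storing an order twice; the only cost is fetching some orders more than once.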
briangroce (Author) Posted February 9, 2013

apeisa and ryan, thank you for your thoughts.

"One way to solve that kind of problem is to use a messaging queue"

I took a look at those just now, but it is going to take me a while to wrap my head around what they do... They look cool, but I'm not sure how to implement that.

"couldn't you just overlap? Pull the last 30 minutes, even if you only need 15. And then skip over any you've already processed?"

Ryan, I think that is what I might do. Then, when the app really does need additional help, I can implement some sort of system that helps manage it. The only thing that concerns me with the overlap is that I will need to overlap quite a bit to make sure that I don't miss anything. Also, I want to get the information updated as fast as possible. Hopefully it will at least have everything updated once per hour...

Another thing that I thought about using once there are people using the application is the LazyCron module, which might help speed things along as long as I can put something in the code that regulates requests to Amazon (because they will only allow a certain number per second and per hour).

Anyhow, I am going to run with these suggestions and see what I can put together.

Thank you guys!
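One common way to "regulate requests" like this is a token bucket. The sketch below is in Python and is only an illustration: the class name, the rate, and the capacity are made-up values, not Amazon's actual limits, which are not given in this thread. Tokens refill at a steady rate up to a cap, and a request is only sent when a token is available.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return True when the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limits: 10 requests per second, bursts of at most 2.
bucket = TokenBucket(rate_per_sec=10, capacity=2)
sent = 0
while sent < 5:
    if bucket.try_acquire():
        sent += 1         # the real API call would go here
    else:
        time.sleep(0.05)  # wait for the bucket to refill
print(sent)
```

A per-hour quota could be handled the same way with a second bucket using a much lower refill rate, requiring a token from both before each request.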