To get things started in this forum, I'm going to post some of the Q&A from emails that people have sent me.
That's a good question. For most cases, I would not try to create a different page structure for years/months because that's more work and it invites the possibility that the links will change over time as you move pages to archive them.
After using different strategies over a few years, I think it's better to have a single container page under which all news items will be placed, permanently. Then set that container page's default sort to be by date, so that newer items always appear first. From there, the news page should focus on the newest items, and provide links to past news whether by pagination, month/year links, etc. In this respect, you can treat a page in PW2 in the same manner that you would treat a channel in EE2. There is no problem with scalability… PW2 will run just as well with 10k+ subpages as it would with 10.
When following that suggestion, these are the only scalability considerations:
1. You may occasionally have to consider the namespace with the URL "name" (aka "tag" in PW1) field. For example, if you have two news stories with substantially similar headlines, you may have very similar URL name fields. But ProcessWire will enforce that namespace, so it's not going to let you create two pages with the same name under the same parent. What that means is it'll make you modify the page's URL name to be unique, if for some reason it's not. I'm talking strictly about URL names, not headlines or anything like that.
2. When dealing with news stories (or any place that may have a large scale of pages), you want to pay attention to how many pages you load in your template API code. If you are dealing with 10k news stories, and load them all, that will slow things down for sure. So the following applies:
Avoid API calls like this:
1. $page->children
2. count($page->children)
Instead, use API calls like this:
1. $page->children("limit=10");
2. $page->numChildren
$page->numChildren is something that ProcessWire generates for every page automatically, so there is no overhead in calling that, like there is in count($page->children). But if working at a smaller scale, it doesn't really matter what method you use.
The most important thing to remember is just to use "limit" in your selector when potentially dealing with a lot of pages. That applies anywhere, especially when making things like search engines. So when you want to support unlimited scalability, these are the functions where you want to place limits:
$page->children(...)
$page->siblings(...)
$page->find(...)
$pages->find(...)
The other reason someone might like archiving to year/month is because it makes it easier to link to since you could just link to /news/2010/04/ for instance. But it would be a fairly simple matter in PW2 to make your template look for "month" and "year" get vars, or a URL segment, and produce results based on those… See the other topic in this forum for one way that you might do that.