Everything posted by Crssp
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
The thing that is escaping me with, say, the Simple HTML DOM Parser is how to output the results for anything more than a single page. Would my desired output type or collection be CSV? Newbie showing here. I don't see how it goes from scraping one page to producing batch data, and there aren't many docs on that in the wild. Thanks for the pointers.
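For what it's worth, here is a rough sketch of the "many pages in, one CSV out" idea. It is written in Python with only the standard library rather than with the Simple HTML DOM library, purely for illustration. The class and function names are made up, and it assumes the headline sits in the first span tag of each page, which may not match the real files.

```python
import csv
import io
from html.parser import HTMLParser


class TitleAndTextParser(HTMLParser):
    """Treats the text of the first <span> as the title and all other text as the body.

    Assumption: the headline is the first <span> on the page.
    """

    def __init__(self):
        super().__init__()
        self.in_span = False
        self.title = None
        self.title_parts = []
        self.body_parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first span is treated as the headline.
        if tag == "span" and self.title is None:
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "span" and self.in_span:
            self.in_span = False
            self.title = "".join(self.title_parts).strip()

    def handle_data(self, data):
        if self.in_span:
            self.title_parts.append(data)
        elif data.strip():
            self.body_parts.append(data.strip())


def scrape_page(html):
    """Return one record (a dict) for one page of HTML."""
    parser = TitleAndTextParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title or "", "body": " ".join(parser.body_parts)}


def pages_to_csv(pages, out):
    """Write one CSV row per (source_name, html) pair to the open file object `out`."""
    writer = csv.DictWriter(out, fieldnames=["source", "title", "body"])
    writer.writeheader()
    for source, html in pages:
        row = scrape_page(html)
        row["source"] = source
        writer.writerow(row)
```

The batch part is just the loop in pages_to_csv: each page becomes one row, so a folder of story files turns into a single spreadsheet-style file that an import script can read later.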
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
I could do a great deal of batch editing on the news articles with, say, Notepad++ and regular expressions, using search and replace, couldn't I? I have direct access to these files and could do that locally: strip out everything except the article and anything else useful.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Would this tutorial be beneficial for me? It covers @Soma's suggestion: http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/ Manual for the resource: http://simplehtmldom.sourceforge.net/manual.htm Worth a read for me, I'm sure. The screenshot of the folder structure in the previous post is accurate, then, and shows pretty much the only place for me to grab the data from. Again, there are no images for any of these stories whatsoever. At least it zeros me in on a plan and a path forward. Thanks again, everybody, for all of the direction and recommendations.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
You guys are blowing me away with the awesome suggestions. I think all I've got to work with for pulling the archive stories is the more complex .asp templated versions, the sports4 sample in other words. Those are the stories referenced in my first post of the thread, with the issues directory paths organized by year, then month, then day. I'm going to attach a screenshot of today's directory, dated the 11th. There are zero images to worry about in any stories, except future stories with a new system. The old site had homepage images daily, but sadly those were not archived. Unless the IT guys can surprise me and come up with the simpler text version, I don't have copies of those for import purposes. Attached is a screenshot of today's stories, then. There is no set number of stories per news, sports, or other story type; they rarely go over 10 stories per type.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Hey guys, sorry for my confusion. I have the raw text of a sample article, and the story gets a little longer here. The news2 sample is raw text that gets written to the www root of the site, but when it does, a bunch of templated .asp junk gets added to it, which brings us to the sports4 sample. That sample is a templatized version, and it is possibly how all the data exists: in these .asp extension story pages (very messy, but they might be the only data source for past archives). So news2 is the daily output of an article as it comes from the newsroom, and the second file, sports4-sample.txt, has an .asp extension in the directories and is the templatized page. The two files should be there now; it looks like the attachment uploader was not working for me in Chrome. news2-sample.txt sports4-sample.txt
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Thanks all; there was no attachment. I realized that the articles at that path, while they are live, contain some .asp templatized code bits (the bad news). The good news is you all have been a great help already. The article title is wrapped in HTML span tags, so it should be accessible to a script. Maybe today I can find the non-templatized archive. The forum works great on mobile, by the way.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Turns out I was a bit off on these. I think the true text exists elsewhere in the same format, but I have to locate that server and hence the text files. These are from an .asp page that has a bit of other stuff (junk) that gets in the way for sure, but here's the article text. The headline is pulled in as below: School leaders: Casino tax money doesn't offset cuts I've learned, though, that a second program adds some of these templating items, so I may be able to get even rawer text files than that. Those would only have line breaks and a few inline formatting tags. I got brave and asked one of the IT gents where this directory is lurking on the network. I'll get back when I know for sure what I've got to work with. The rawer text will probably not be saved with the story numbers and all the file names, but it might be easier to work with, each article on its own line. Stay tuned, and thanks for pushing me, one and all.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
How do I make that path dynamic? Will the docs show any good examples? There are quite a few variables in the path that will increment, etc. Is there some way I could just pull the date string that creates the path portion, such as 2013/Jan/08/? That is why I had the paths in my first post: /issues/2013/Jan/08/ar_news_010813_story1 Thanks, one and all, for the suggestions and schooling.
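A minimal sketch of building that path from a date, in Python for illustration only. The function name, the MONTHS list, and the idea that every month folder uses a three-letter English abbreviation are assumptions based on the single "Jan" example above.

```python
from datetime import date

# Assumption: month folders use three-letter English abbreviations, like "Jan".
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def story_path(issue_date, section, story_number, root="/issues"):
    """Build a path such as /issues/2013/Jan/08/ar_news_010813_story1."""
    # Folder portion: year / month abbreviation / zero-padded day.
    folder = f"{issue_date.year}/{MONTHS[issue_date.month - 1]}/{issue_date.day:02d}"
    # File name portion: the date again as MMDDYY.
    stamp = issue_date.strftime("%m%d%y")
    return f"{root}/{folder}/ar_{section}_{stamp}_story{story_number}"


print(story_path(date(2013, 1, 8), "news", 1))
```

Looping over a range of dates, the section names, and story numbers, then keeping only the paths that actually exist on disk, would walk the whole archive.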
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Awesome, thanks Soma, the example really gives me a much better idea. This one is pure PHP, is it not? Where and how would I run such a script, then? I'm as green as can be at this stuff.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Back to my original question about including text files for articles. It was probably poor practice of me to mention another CMS product if I wanted a creative solution. But does anybody have anything on pulling in text files dynamically? There must be something. If the old boys could do it in .asp, there must be a way, not that I'm saying it's the best practice or anything. Thanks again. Importing to a DB is the best idea so far; I'm sure there is currently no DB, at least not for loading the articles in a live page. @diogo "It shouldn't be too difficult to come up with a script to import the text in the flat files to pw pages." Were you suggesting importing to the database or to pages, then, diogo?
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
All of this brings me to another point: there could be a database somewhere, but if there is, I don't think it is going to be MySQL. One of the IT guys here might be able to answer that for me. It's a bit of a nightmare picking apart the website we have now. It's got classic .asp components and .NET article redirector doo-dads all built in. Everybody pooh-poohed PHP and open source technologies in the past. MS technologies escape me totally, and they reinvent the wheel all the time over there at that $$$ shop [insert more Microsoft rants, lol...]. I don't seem to be able to get my head around any of the data bits regardless. Everything magically works on the current site, but we are getting tasked with updating the behemoth. The current web head thinks we can pull what we need via the working RSS feed; I have my doubts about that.
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
Our digital archive goes back to 1998 or so; I'm not sure we will want all of that. Manually copying and pasting isn't practical at all, hence I was liking the flat-file way of access. Site search is a great point. We have an existing website product, but the whole thing is built on .asp technologies. We've lost the keys to that car, you might say, through staff attrition. Thanks again!
Running a Daily Newspaper website with Process Wire.
Crssp replied to Crssp's topic in Getting Started
The news articles all get pushed out from another system entirely and shoved to a server on the network. We still have a healthy print product, believe it or not. They could be imported into a database, but my designer brain blows up at that point, lol. Thanks for your reply, Joss. Most stories have no image associated in this hard-boiled news product. The stories begin their life in Adobe InCopy before being pushed into server land. Sorry for the vague answers; it's only because my understanding of parts of the process is that vague.
Hey all, I've been asking this question around in different CMS forums. Discovering a newbie on the block, aka Statamic, sort of clarified what I need to do, which is work with flat files for some of the data.
Is there a way with ProcessWire to dynamically pull in or include flat text files for the news articles from the folder structure where they are stored? Our text file stories get archived in folders with paths such as:
/issues/2013/Jan/08/ar_news_010813_story1
/issues/2013/Jan/08/ar_news_010813_story2 etc.
/issues/2013/Jan/08/ar_sports_010813_story1 and so on.
Can you see any way to use ProcessWire to pull the articles, which are wrapped in Text tags <!--Text-->, into other templates? The story text files only include strong, bold, italics, or em tags and line break HTML tags. From there you have a daily index, also archived, with read-more links to get to each individual story from a combined index/front page.
Complicated beast of a project, at least for me... Any thoughts, advice, strategies, or recommendations are welcome. Anything I can clarify, let me know...
-Crssp
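To make the "pull out the article and keep only the simple tags" step concrete, here is a hedged sketch in Python (chosen only for illustration, not a ProcessWire API). The closing marker <!--/Text--> is a guess, since only the opening <!--Text--> marker is described above, and the function names and tag whitelist are hypothetical.

```python
import re

# Tags the story files are said to contain; everything else is dropped.
ALLOWED_TAGS = {"strong", "b", "i", "em", "br"}

# Matches ordinary opening/closing HTML tags and captures the tag name.
# It does not match ASP blocks such as <% ... %>, which would need separate handling.
TAG_RE = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def extract_article(page, start_marker="<!--Text-->", end_marker="<!--/Text-->"):
    """Return the text between the markers, or None if the start marker is missing.

    Assumption: end_marker is hypothetical; if it is not found,
    everything after start_marker is returned.
    """
    start = page.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = page.find(end_marker, start)
    if end == -1:
        end = len(page)
    return page[start:end].strip()


def keep_simple_tags(html):
    """Remove every tag that is not in ALLOWED_TAGS, leaving the text itself alone."""
    def replace(match):
        return match.group(0) if match.group(1).lower() in ALLOWED_TAGS else ""
    return TAG_RE.sub(replace, html)
```

The cleaned string is what an import script would then store as the body of each page, one page per story file.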
-
Awesome find on the links for packaging and developers. Maybe I will see what I can find out from our web host about whether they selectively add scripts, or whether, once you are on the script list, the package is added and available to install on your web hosting accounts. When installing WordPress, for example, there are checkboxes for whether you want certain themes and add-ons installed with it. So effectively, for SimpleScripts the commerce is in the add-on advertising and affiliations. I like that commerce model.
-
Does anyone know what it would take to create a SimpleScripts installation script for ProcessWire? The list of existing applications you can install on your server hosting with just a couple of clicks is here: http://www.simplescr...com/script_list It would be awesome to see ProcessWire listed there and be able to use it. My web host, Bluehost, has those scripts installed; WordPress fires off without a hitch, and you can do a fresh install in just seconds. For some of the others, the configuration may need tweaking, and they are not quite as foolproof. Thanks!
-
Have there been any thoughts on how, or if, this could be a publicly available service? I don't think I would ever set up a URL for public use, with so many out there, but would it be doable? The idea now is that it is more of a backend resource than a frontend one; does that sound right?
-
I have a quick question (sorry if it was covered and I missed it). Say I install ProcessWire in a directory on my site, let's just say a folder named 'i'. Would there be a way for the short link path to leave out the i and just use the domain root? For example: Crssp.com/i/ujn3gi vs. Crssp.com/ujn3gi Not really a big deal. I'm thinking I will install ProcessWire in a folder; maybe I should even label the directory pw.
-
Sweet, thanks for going to the trouble, NetCarver. I need to change my forum settings to notify me by email of replies; I almost didn't see this. Awesome job, thank you very much! -ty
-
Intrigued by this module. Could anybody provide some screenshots of the interface in action? It would really help people decide to participate and install the URL Shortener, methinks. Thanks!