how would you extract the content of this website?


bernhard

hi,

i've got a request for an image website for a company that rents out real estate to businesses. the properties are listed on this site: http://netmakler.at/netmakler.at/index.jsp?menuId=2&internalId=search > gewerbe > Niederösterreich > Wien-Umgebung > suchen

the problem is that this website uses jsp and the overview page does NOT have its own url. the overview page looks like this:

[attached screenshot of the overview page]

i would like to keep those items in sync with my client's website. is this possible somehow? maybe using phantom.js or the like?
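whatever ends up fetching the listings, the sync step itself can be sketched as a set comparison of property ids. this is only a rough sketch: `diff_listings` is a made-up helper name, and it assumes the ids from the source site and the ids already stored on the client's site are both available as plain lists. how to obtain the remote ids is exactly the open question here.

```python
from typing import Dict, Iterable, List


def diff_listings(remote_ids: Iterable[int], local_ids: Iterable[int]) -> Dict[str, List[int]]:
    """Compare the ids listed on the source site with the ids stored locally.

    Hypothetical helper: both inputs are assumed to be collections of
    numeric property ids, however they were obtained.
    """
    remote = set(remote_ids)
    local = set(local_ids)
    return {
        "add": sorted(remote - local),     # listed remotely, missing locally
        "remove": sorted(local - remote),  # stored locally, no longer listed
        "keep": sorted(remote & local),    # present on both sides
    }
```

the three buckets map onto "create", "unpublish or delete" and "leave alone or refresh" on the client's site.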

 

the individual properties do have their own url, although it only becomes visible when you click "share via email" and copy the link at the top:

[attached image: netmakler.gif]

http://netmakler.at/netmakler.at/index.jsp?objektId=1634410
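going by that one link, a detail url seems to be just the base address plus an `objektId` query parameter. a minimal sketch of building and parsing such urls, assuming the pattern holds for every property (only this single example is known); the function names are made up for illustration:

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Base address taken from the example link; assumed to be the same for all properties.
BASE_URL = "http://netmakler.at/netmakler.at/index.jsp"


def detail_url(objekt_id: int) -> str:
    """Build the detail-page url for a property id."""
    return f"{BASE_URL}?{urlencode({'objektId': objekt_id})}"


def objekt_id_from_url(url: str) -> Optional[int]:
    """Extract the numeric objektId from a url, or None if it is absent or not a number."""
    values = parse_qs(urlparse(url).query).get("objektId")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None
```

storing only the numeric id and rebuilding the link on demand keeps the stored data independent of the exact url layout.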

any hints would be welcome :) thank you!


Perhaps not particularly helpful (sorry), but to be honest this sounds like something you should always do via an API. If the company hosting those properties intends them to be available to others this way, usually they would provide an API to said content. If they don't, it's possible that they wouldn't like anyone crawling their content either :)

With some quick googling I found hints that the company behind the actual platform (Immformer) might offer additional features, so I wouldn't be too surprised if they offered some way to export entries, either as part of the platform or as a plugin/add-on.

I don't know anything about your client's relation to the site in question, or that site's relation to the platform it runs on, so it's very difficult to say who you should contact in this case. In my opinion, though, the only future-proof, stable solution you're going to find would need co-operation from the management of the NetMakler site and perhaps from the provider of the platform they're using.

Anything else is, at the very least, very likely to break every time something changes at the site :)
