We proudly give you our best version of Mozenda yet. You asked, and we listened. So without further ado, here's a rundown of our recently released features. Enjoy!
Capacity, capacity, capacity
Here at Mozenda, we just threw open the floodgates. Before now, some scraping jobs may have been queued on high-volume, high-traffic sites. Now you can assign and run several agents at the same time on those types of websites. Go ahead, have at it.
Load a page, any page
The new Load Page action lets you load any webpage from anywhere within an agent. For example, in the middle of an agent, you might need to load a webpage without using a Click Item action. Now you can just insert a Load Page action and specify a URL. This can be helpful when, for instance, you need to be logged in to a website before your agent can perform subsequent actions.
To scrape, or not to scrape -- Anonymous
Although running agents using anonymous proxies isn't new to some Mozenda users, we thought we'd let you know how recent improvements in our anonymous data extraction can affect your web scraping capabilities. See details.
More value from your values
Now you can use the values that you've captured as variables in your agent. You can use these variables to set a user input, append to a URL, load a page (see the Load Page action above), etc.
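To picture what appending a captured value to a URL involves, here is a minimal Python sketch. It is not Mozenda's implementation, and the function name, the example site, and the captured value are all made up for illustration. The one point it shows is that a captured value should be percent-encoded before it goes into a URL.

```python
from urllib.parse import quote


def append_captured_value(base_url: str, captured_value: str) -> str:
    """Append a captured value to a URL as one percent-encoded path segment.

    Hypothetical helper for illustration only; not part of Mozenda.
    """
    # safe="" also encodes "/" so the value stays a single path segment.
    return base_url.rstrip("/") + "/" + quote(captured_value.strip(), safe="")


# Hypothetical captured value and site; neither comes from Mozenda.
print(append_captured_value("https://example.com/products", "blue widget"))
```

The resulting URL could then be handed to something like a Load Page step.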
All together now
We're really excited about our new image/file publishing capabilities. If you build an agent that downloads files or images, then you will see a new checkbox (see image below) in the Publish dialogue that says, "Publish image and file packages". If you check this box, all images (for the view you select) will be published along with your data, on the schedule that you choose.
Takes all kinds
Mozenda will now automatically detect which version of Internet Explorer your Windows machine is running (typically 7 or 8). Once the version is determined, Mozenda will designate a server that is optimized for your individual scraping needs.
I think I hear my API calling
We've introduced a whole slew of useful new API calls. Click here to see expanded details. You can also learn more about these on the API documentation page.
Note, this post has been updated as of January 2009 to reflect changes in the Web Agent Builder version 1.8.128.
I recently added this entry as a post in our new forums (which by the way, we are very excited about!) and decided it deserved some attention here as well, given the increase in queries we receive about AJAX (no, not the cleaning powder) and how to handle it in the Web Agent Builder.
----
As web technology advances, many sites are using more advanced methods to display web content. For example, when you view a list of agents in our web application and click on an agent, the list disappears and is replaced with the details about that agent. The browser does not navigate to a new page. Instead, the webpage itself requests the new information from the server in the background and displays it by changing a part of the current webpage. This technology is known as AJAX (Asynchronous JavaScript and XML; don't worry, it's not as scary as it sounds).
Because agents are primarily built using a Page structure, sites that rely heavily on AJAX to display content can be tricky. However, in most cases, sites that use AJAX do so lightly, and an agent can be designed to handle them. Here is a list of cases where AJAX is most frequently used:
1) Paging a list of results (for example, clicking next >> to get to the next set of results from a search). When paging, the next list of items simply replaces the old list without causing a new page to load. [action: Page List]
2) Clicking a list item causes the item details to appear somewhere on the same page. Often, there is a designated part of the existing page that the information appears in, or a box containing the information appears on the page covering the list (with some sort of 'close' button or link that causes it to go away). [action: Click Item]
3) Selecting a value from a drop-down causes a part of the page to change, or the values of another drop-down to populate (for example, selecting the automobile manufacturer in one drop-down causes another drop-down to populate with the available manufacturer models). [action: Set Element Value]
4) After a page loads, some of the page contents take additional time to finish loading. This often shows up when testing the agent in the builder: an Item Not Found error occurs for the first action on the page. [action: Page]
All of these cases can be handled by telling the agent to wait for AJAX to complete before proceeding to the next action. Most actions have a property titled 'Wait for AJAX to alter the current web page'. To set it, double-click the specific action (or right-click the action and choose 'properties'), then click the 'Additional Settings' button in the properties panel. If this property is checked, the action will wait up to 2 seconds for the webpage to begin making AJAX requests, and then any additional time it takes for those AJAX calls to complete. For example, with a Click Item action that has this property checked, an AJAX call might begin less than a second after the click but take a total of 5 seconds to finish. The next action will not be executed until any detected AJAX calls have completed.
On the other hand, the 'Wait x seconds before performing the next action' property of an action waits an absolute amount of time. You can also force an agent to wait an absolute number of seconds by inserting a Wait-Seconds action anywhere within your current list of actions. This can be done by right-clicking an action and choosing 'Insert a Wait-Seconds action after this action'.
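If it helps to see the AJAX wait as logic, here is a rough Python sketch of the two phases described above: wait up to 2 seconds for requests to begin, then wait as long as it takes for them to finish. This is only a conceptual model, not how the Web Agent Builder is actually written. The `pending_requests` callback, the polling interval, and the function name are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_ajax(
    pending_requests: Callable[[], int],
    start_timeout: float = 2.0,
    poll_interval: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if AJAX activity began and finished, False if none began.

    pending_requests is a hypothetical callback that reports how many
    AJAX requests the page currently has in flight.
    """
    deadline = clock() + start_timeout
    # Phase 1: give the page up to start_timeout seconds to begin a request.
    while pending_requests() == 0:
        if clock() >= deadline:
            return False  # nothing started, so move on to the next action
        sleep(poll_interval)
    # Phase 2: a request was detected; wait however long it needs to finish.
    while pending_requests() > 0:
        sleep(poll_interval)
    return True
```

By contrast, an absolute wait is just a fixed `time.sleep(x)` that pauses for the full time whether or not the page is doing anything.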
Many customers have asked if it is possible to have an agent navigate through a list of URLs. Absolutely! This is not much different from using any of the other Data List functionality that many of you are already familiar with. To do this, simply direct the Mozenda Browser to http://www.mozenda.com/builderwelcome and create an input list on the URL text box. Then create a Click Item action on the 'Go' button, which will take you directly to that URL. From there you can perform actions just like in any other agent. For additional help setting up similar agents, please call support!
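Conceptually, an agent driven by a URL input list behaves like a simple loop: take each URL from the list, go to it, and run the same actions. The Python sketch below only illustrates that idea and is not Mozenda code. `run_over_url_list` and the `scrape_page` callback are hypothetical names standing in for whatever actions your agent performs on each page.

```python
from typing import Callable, Iterable


def run_over_url_list(
    urls: Iterable[str],
    scrape_page: Callable[[str], dict],
) -> list[dict]:
    """Run the same scraping step for every URL in an input list.

    scrape_page is a hypothetical stand-in for the agent's actions on
    one page; it receives a URL and returns the captured fields.
    """
    results = []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank rows in the input list
        captured = scrape_page(url)
        # Keep the source URL next to the values captured from it.
        results.append({"url": url, **captured})
    return results
```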