We proudly give you our best version of Mozenda yet. You asked, and we listened. So without further ado, here's a rundown of our recently released features. Enjoy!
Capacity, capacity, capacity
Here at Mozenda, we just threw open the floodgates. Before now, some scraping jobs may have been queued on high-volume, high-traffic sites. Now you can assign and run several agents at the same time on those types of websites. Go ahead, have at it.
Load a page, any page
The new Load Page action lets you load any web page from anywhere within an agent. For example, in the middle of an agent, you might need to load a web page without using a Click Item action. Now you can just insert a Load Page action and specify a URL. This can be helpful when, for instance, you need to be logged in to a website before your agent can perform subsequent actions.
To scrape, or not to scrape -- Anonymous
Although running agents using anonymous proxies isn't new to some Mozenda users, we thought we'd let you know how recent improvements in our anonymous data extraction can affect your web scraping capabilities. See details.
More value from your values
Now you can use the values that you've captured as variables in your agent. You can use these variables to set a user input, append to a URL, load a page (see the Load Page action above), etc.
All together now
We're really excited about our new image/file publishing capabilities. If you build an agent that downloads files or images, then you will see a new checkbox (see image below) in the Publish dialogue that says, "Publish image and file packages". If you check this box, all images (for the view you select) will be published along with your data, on the schedule that you choose.
Takes all kinds
Mozenda will now automatically detect which version of Internet Explorer your Windows machine is running (typically 7 or 8). Once the version is determined, Mozenda will designate a server that is optimized for your individual scraping needs.
I think I hear my API calling
We've introduced a whole slew of useful new API calls. Click here to see expanded details. You can also learn more about these on the API documentation page.
The following API calls have been made available for your convenience. Enjoy!
Agent.Run - You can now have an agent running multiple times in the system. This can be accomplished by calling the Agent.Run command successively using the API; you do not have to wait for a previous job from an agent to finish before initializing a new job for the agent.
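Here is a minimal sketch of what calling Agent.Run successively might look like from a script. It only builds the request URLs and does not send them. The endpoint, the "Service" value, and the WebServiceKey and AgentID parameter names are assumptions made for illustration, as are the helper names; check the API documentation page for the exact request format.

```python
from urllib.parse import urlencode

# Assumed for illustration: the endpoint, the "Service" value, and the
# parameter names are not confirmed by this announcement.
API_ENDPOINT = "https://api.mozenda.com/rest"


def build_request_url(operation, params, key="YOUR_WEB_SERVICE_KEY"):
    """Build a GET URL for one API operation (hypothetical helper)."""
    query = {"WebServiceKey": key, "Service": "Mozenda10", "Operation": operation}
    query.update(params)
    return API_ENDPOINT + "?" + urlencode(query)


def queue_agent_runs(agent_id, count):
    """Return one Agent.Run URL per job to start.

    Each URL can be requested right away; there is no need to wait for
    the previous job of the same agent to finish.
    """
    return [build_request_url("Agent.Run", {"AgentID": agent_id}) for _ in range(count)]


# Three back-to-back runs of the same (hypothetical) agent.
run_urls = queue_agent_runs(1001, 3)
```

Sending each URL in turn with your HTTP client of choice would start three jobs for the same agent.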
In the past, you could perform the following actions only through the Web Console. Now you can perform them through the API as well:
Collection.Add - With this new call, users can add entirely new collections to their account. This is really useful now that you can tell an agent which collection to use for inputs when it runs.
Collection.AddField - Allows users to add fields to existing user-created collections.
Collection.DeleteField - Allows users to delete fields from existing user-created collections.
Collection.SetUniqueFields - Allows users to programmatically set the unique fields for the Collection.
Collection.Publish - Users can now initiate FTP publishing for an agent at any time. This is especially useful for users who retrieve large amounts of data via the View.GetItems call. Downloading millions of records through the API takes hours, whereas the Collection.Publish call can publish a file in minutes. You can then load the file on your side and process it.
View.SetFields - Allows users to select the included fields and their order in the specified view. Now you can include system fields in your views without having to go into the agent and manually edit the view in the Web Console.
Job.GetList - This call lists all the jobs on your account. You can filter the list by supplying the Job.State parameter to specify Active, Archived, or All jobs. This is useful for knowing when agents are running, so you can determine whether or not you can run an agent.
Collection.GetFields - There is a new parameter called "Include" that determines which fields you get back: either all fields or only the unique fields. It defaults to "All". In the past, you had to filter the list afterwards to determine the unique fields for a collection.
Agent.GetJobs - This call now has a Job.State parameter that can be passed in to determine whether to only get Active, Archived, or All jobs for an agent.
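The Job.State filter on Job.GetList and Agent.GetJobs can be sketched as follows. The three state values come from the descriptions above; the endpoint, the "Service" value, the WebServiceKey and AgentID parameter names, and the helper itself are assumptions for illustration only. The code builds the URLs without sending them.

```python
from urllib.parse import urlencode

# Assumed for illustration: endpoint and "Service" value are not
# confirmed by this announcement.
API_ENDPOINT = "https://api.mozenda.com/rest"

# The three filter values named in the call descriptions.
VALID_JOB_STATES = ("Active", "Archived", "All")


def build_job_query(operation, job_state, agent_id=None, key="YOUR_WEB_SERVICE_KEY"):
    """Build a URL for Job.GetList or Agent.GetJobs with a Job.State filter."""
    if job_state not in VALID_JOB_STATES:
        raise ValueError("Job.State must be one of: " + ", ".join(VALID_JOB_STATES))
    query = {
        "WebServiceKey": key,
        "Service": "Mozenda10",
        "Operation": operation,
        "Job.State": job_state,
    }
    if agent_id is not None:
        # Hypothetical parameter name for selecting a single agent.
        query["AgentID"] = agent_id
    return API_ENDPOINT + "?" + urlencode(query)


# All active jobs on the account, then archived jobs for one agent.
account_jobs_url = build_job_query("Job.GetList", "Active")
agent_jobs_url = build_job_query("Agent.GetJobs", "Archived", agent_id=1001)
```

Validating the state before building the URL keeps a typo from turning into a confusing server-side error.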
You can learn more about these and other API calls on the API documentation page: http://www.mozenda.com/api.
Our many thanks to Tom at CodeSanity.net for his killer Mozenda review:
"Mozenda is a very powerful data scraping service. If you have ever found yourself writing scripts or manually copying and pasting data from one website to another then mozenda is for you. They have a very nice, full featured REST API which will be the focus of this article." Read more...
Tom wrote a nifty CodeIgniter Library (PHP) to easily interact with our Mozenda API. Download it here.
We look forward to his launch of MyGov365.