Error Handling Settings
July 06, 2016
There are several different settings available on the Web Console that help you control the way an agent job runs on the Mozenda harvesters. This article outlines those settings, when they’re useful, and what they do.
Open an Agent’s Error Handling Settings
Open an agent in the Web Console.
Click the Tools icon.
Click the Error Handling tab.
The Error Handling tab gives you control over how the agent responds to an error. There are two types of errors that an agent may encounter: (1) an agent error, and (2) a website error.
An agent error occurs when the agent encounters something it didn’t expect or cannot find an object on the webpage. You can either ignore these errors using the settings in this window or resolve them by making changes to the agent itself.
A website error happens when the website that the agent targets is unavailable, or the webpage itself has malfunctioned. These are usually one-time events, and are most often resolved by resuming the agent job.
The settings on the Error Handling tab allow you to specify how the agent should react to the error types outlined above.
When an agent error occurs
There are two options available in this drop-down menu: (1) Attempt to ignore the error and continue, and (2) Stop the job so the agent can be fixed.
Attempt to ignore the error and continue. Use this setting if the agent will regularly encounter common error types (such as “Element not found”) and this is a known and acceptable behavior.
You may also want to use this setting if you’re scraping a large number of records and would rather lose occasional records than intervene manually every time an item is not found.
Stop the job so the agent can be fixed. Use this setting if you do not expect the agent to encounter an error and accuracy is a high priority. The agent will pause when it encounters an error, allowing you to modify the agent so that the error will no longer occur, and resume the agent where it left off.
Also choose this setting if you want the job to be postponed or canceled when an error occurs. You configure that behavior in the Instead of stopping the job with an error section.
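The trade-off between the two options can be illustrated with a short Python sketch. This is not Mozenda code: `ElementNotFound`, `extract`, `run_job`, and the `"ignore"`/`"stop"` policy names are hypothetical and exist only to mirror the two menu options.

```python
class ElementNotFound(Exception):
    """Hypothetical stand-in for an agent error such as "Element not found"."""


def extract(record):
    # Hypothetical extraction step: fails when the expected field is missing.
    if "price" not in record:
        raise ElementNotFound(record.get("id"))
    return record["price"]


def run_job(records, on_agent_error="ignore", start=0):
    """Process records beginning at index `start`.

    Returns (collected, paused_at). paused_at is None when the job finished,
    otherwise the index of the record that caused the job to stop.
    """
    collected = []
    for i in range(start, len(records)):
        try:
            collected.append(extract(records[i]))
        except ElementNotFound:
            if on_agent_error == "ignore":
                continue  # lose this record and keep going
            return collected, i  # "stop": pause so the agent can be fixed
    return collected, None


records = [{"id": 1, "price": 10}, {"id": 2}, {"id": 3, "price": 30}]
print(run_job(records, on_agent_error="ignore"))  # ([10, 30], None)
print(run_job(records, on_agent_error="stop"))    # ([10], 1)
```

With "ignore", the job finishes but the second record is silently lost. With "stop", nothing is lost, but the job waits at index 1 until someone fixes the agent and resumes it.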
When a website error occurs
There are three options available in this drop-down menu: (1) Let the system decide what to do, (2) Ignore the error and continue, and (3) Stop the job so the agent can be fixed.
Let the system decide what to do. The system may or may not stop the job with an error, depending on the type and persistence of the error (for example, navigation errors, or text that cannot be found after multiple attempts). This is the default setting.
Ignore the error and continue. Use this setting if the website frequently breaks or returns errors, but you have found that errors from that website do not necessarily make the data inaccessible.
Stop the job so the agent can be fixed. Use this setting if you anticipate that any website errors that occur (such as HTTP errors) will be caused by the way the agent runs (such as moving from page to page too quickly) and can be resolved by modifying the agent.
Instead of stopping the job with an error
This setting applies to whichever of the drop-down menus above is set to Stop the job so the agent can be fixed. If enabled, it provides an alternative to stopping the job:
Postpone the job for 5 minutes. Use this setting if resuming this agent after encountering an error has been an effective solution in the past, as this automates that process. Increase the number of minutes if you suspect that the website might catch the agent (generally not necessary if the agent uses premium harvesting).
Cancel the job so the agent will start from the beginning the next time it runs. Use this setting if you no longer need data that would appear after an error has occurred.
You may also want to use this setting if you know, after extensive testing, that the errors the agent will encounter are one-time errors that can be resolved simply by resuming the agent job, and the target data is so time-sensitive that the time it takes to fix an agent after an error occurs would render the data obsolete.
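How the three settings combine can be summarized in a small Python sketch. It is only a model of the behavior described in this article, not Mozenda's implementation: the `ErrorHandling` class, its field names, and the string values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorHandling:
    # Hypothetical field names that mirror the options on the tab.
    on_agent_error: str                        # "ignore" or "stop"
    on_website_error: str = "system"           # "system" (the default), "ignore", or "stop"
    instead_of_stopping: Optional[str] = None  # None, "postpone", or "cancel"
    postpone_minutes: int = 5


def resolve(settings, error_kind):
    """Return the action taken for an "agent" or "website" error."""
    if error_kind == "agent":
        choice = settings.on_agent_error
    else:
        choice = settings.on_website_error
    # The override only applies to a menu that is set to "stop".
    if choice != "stop":
        return choice
    if settings.instead_of_stopping == "postpone":
        return f"postpone {settings.postpone_minutes} min"
    if settings.instead_of_stopping == "cancel":
        return "cancel"
    return "stop"


settings = ErrorHandling(
    on_agent_error="stop",
    on_website_error="ignore",
    instead_of_stopping="postpone",
)
print(resolve(settings, "agent"))    # postpone 5 min
print(resolve(settings, "website"))  # ignore
```

In this example the postpone override affects agent errors only, because the website error menu is set to ignore rather than stop.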