4 Simple Ways to Future-Proof Your Agents to Collect Accurate Data
January 31, 2017
Collecting accurate data with any tool can be challenging, and missteps made early in the process dramatically increase the odds of headaches later.
We here at Mozenda understand that time is valuable, both right now and in the future. Based on user feedback, we’ve come up with a list of useful suggestions to keep your agent-building on the right track and avoid having to put in extra time to address problems later on.
1. Collect Everything Now, Trim Later
When starting an agent, some users are tempted to go in and get out with the bare minimum of data needed for a project. Although the minimalist mindset behind this approach is understandable, some information that is an afterthought now could become very important later. We routinely encourage our users to gather as much data as possible up front so that it can be cleaned up appropriately later.
For example, let’s say someone is collecting product names and prices. Additional details may not seem that important, but what if there are multiple products with the same name but different pricing due to technical specifications? Collecting more fields would be the only way to tell these data points apart. This can be done very easily with a name-value pairs action, or even by collecting the source code of the entire web page in case it needs to be referenced later.
This is similar to the “measure twice, cut once” mindset. Moving too quickly on a project can lead to problems later on. In the case of Mozenda, not gathering enough data the first time could mean wasted time and additional page credit usage if work needs to be repeated.
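To make the duplicate-name scenario concrete, here is a minimal Python sketch that runs outside of Mozenda on data that has already been exported. The sample rows, the field names (name, price, storage), and the helper ambiguous_keys are all hypothetical and only illustrate why an extra field helps.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are illustrative only.
rows = [
    {"name": "Acme Tablet", "price": 199.99, "storage": "32 GB"},
    {"name": "Acme Tablet", "price": 249.99, "storage": "64 GB"},
    {"name": "Acme Phone", "price": 299.99, "storage": "64 GB"},
]


def ambiguous_keys(rows, key_fields):
    """Return the key values that are shared by more than one row."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        groups[key].append(row)
    return sorted(key for key, members in groups.items() if len(members) > 1)


# With only the name collected, the two tablets cannot be told apart.
name_only = ambiguous_keys(rows, ["name"])

# With one extra field collected, every row is uniquely identified.
name_and_storage = ambiguous_keys(rows, ["name", "storage"])
```

If the storage field had been skipped during collection, the only way to separate the two tablet prices would be to run the agent again.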
2. Rename Images and Files During Collection
If your agent is downloading images or files to your account, the filenames could be based on an arbitrary naming convention. This can lead to a lot of time invested in sorting things out later when the files are needed.
Mozenda can be easily configured to automatically rename images and files as they are being collected. This simple setup can save a lot of time in the long run, especially when a large number of files are being collected for future use.
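As a rough illustration of what a useful naming convention can look like, the sketch below builds a predictable filename from data collected alongside the file. It is plain Python rather than Mozenda configuration, and the function build_filename, the product ID, and the example URL are assumptions made for the example.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_filename(file_url, product_id, index):
    """Build a predictable filename such as 'SKU-1001_02.jpg'."""
    # Keep the original extension from the URL path (it may be empty).
    extension = PurePosixPath(urlparse(file_url).path).suffix.lower()
    # Replace anything other than letters, digits, hyphens, and underscores.
    safe_id = re.sub(r"[^A-Za-z0-9_-]+", "-", product_id).strip("-")
    return f"{safe_id}_{index:02d}{extension}"


# An arbitrary source name becomes one that can be traced back to the product.
renamed = build_filename(
    "https://example.com/img/a8f3c9e1.JPG?size=large", "SKU 1001", 2
)
```

Tying the name to a collected field, such as a product ID, means the file can be matched back to its record without opening it.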
3. Use Direct URLs Wherever Possible
Websites that have multiple categories or locations will often use URL-based ways to navigate to a specific part of the website. This means you can either enter the direct URL using preset values or collect some basic data beforehand and use it to build the URL.
As an example, the Walmart website uses the following structure for checking store inventory:
The store number is at the tail end of the URL, and performing a search for tablets yields the following:
With a little trial and error, we can simplify the URL for the exact same results:
This process can be automated for any number of product queries at multiple store locations.
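A minimal sketch of that automation in Python is shown here. The URL pattern, the parameter names (query, store), and the store numbers are placeholders rather than any real site's structure, so substitute whatever pattern your trial and error uncovers.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical URL pattern; replace it with the structure the target site uses.
BASE_URL = "https://www.example.com/search"


def build_urls(queries, store_ids):
    """Return one direct URL for every query and store combination."""
    urls = []
    for query, store_id in product(queries, store_ids):
        # urlencode escapes spaces and special characters in the values.
        params = urlencode({"query": query, "store": store_id})
        urls.append(f"{BASE_URL}?{params}")
    return urls


# Two queries across two stores produce four direct URLs.
urls = build_urls(["tablets", "laptops"], ["1001", "1002"])
```

A list like this can then be supplied to an agent as its starting pages, so no on-site navigation is needed.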
In the end, the process of navigating a website can often be streamlined to produce reliable results with less effort. There are multiple benefits to this approach: it takes less time to build an agent (since no on-site navigation is required), the agent will process faster and likely with fewer errors, and it can potentially lead to reduced page credit usage.
4. Implement Error Handling
We’ve written about this topic before, but it bears repeating: proper error handling is the only way to guard against missing data, poorly built websites, and many other issues that can come up while building an agent.
First, you should have a solid understanding of the error handling options in the Web Console. Building an agent in the Agent Builder and running it in our cloud can yield different results, and knowing how our system handles potential hiccups can be crucial in succeeding with Mozenda.
Second, while building an agent, there are a variety of ways to work around missing data. In this past article, we covered the ways this process can be approached and which options are appropriate for certain situations. To summarize, simply marking an action as “optional” when an element is missing from a page is the wrong way to go if you want comprehensive feedback. By using a dedicated error handling page, you can have a field dedicated to providing an explanation for why some information isn’t found (e.g., “out of stock”, “no products in category”, and so on).
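The difference between silently skipping a value and explaining its absence can be sketched in a few lines of Python. This is not how Mozenda implements error handling. The page dictionary, its keys, and the function collect_price are hypothetical stand-ins for whatever an agent finds on a page.

```python
def collect_price(page):
    """Return a record holding a price or an explanation of why it is missing."""
    record = {"url": page["url"], "price": None, "note": ""}
    if page.get("price") is not None:
        record["price"] = page["price"]
    elif page.get("out_of_stock"):
        record["note"] = "out of stock"
    elif not page.get("products"):
        record["note"] = "no products in category"
    else:
        # The page has products but the expected element was not located.
        record["note"] = "price element not found"
    return record


# Hypothetical pages: one with a price, one flagged as out of stock.
found = collect_price({"url": "https://example.com/a", "price": 19.99})
missing = collect_price({"url": "https://example.com/b", "out_of_stock": True})
```

An empty price with a note attached tells you why the data is missing. An empty price on its own leaves you guessing whether the product or the agent is at fault.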
We hope that the advice in this article can help save you time and effort in gathering important data. Stay tuned for more tips and tricks from the Mozenda team!
Have questions about the above list or comments for us? Contact Mozenda support at firstname.lastname@example.org or by calling +1.801.995.4550, option 2.