Nov 18, 2009

Parsing Errors

While parsing our XML data to display it, our script would sometimes stop working. This unpredictable behavior was very strange, and the cause was difficult to find. The error occurred in the Ebay part. At first we thought it might be a problem with the Ebay server. Monitoring the parsing gave us only a small hint:
It had to be a problem with the picture URLs of the articles.
Working through the Ebay response, we realized that the tags "title", "currentPrice" and "url" are mandatory. Some tags, like "galleryURL", are not.

<item>
<title>Best 5 LED Bike Bicycle Tail Light Lamp &amp; Bike Holder</title>
<galleryURL>http://thumbs4.ebaystatic.com/pict/3204463049958080_1.jpg</galleryURL>
<viewItemURL>http://cgi.ebay.com/Best-5-LED-Bike-Bicycle-Tail-Light-Lamp-Bike-Holder_W0QQitemZ320446304995QQcmdZViewItemQQptZCycling_Parts_Accessories?hash=item4a9c1692e3</viewItemURL>
...
</item>
For parsing, we use this code:
var text="<table border='1'>";
for (var j=0;j<xmlData.getElementsByTagName("title").length;j++){
//loop for iterating over the item's title, URL, pic, and price
text=text+"<tr><td><a href='"+
//starting a new table row and starting the link
xmlData.getElementsByTagName("viewItemURL")[j].childNodes[0].nodeValue+
//gets the URL of entry j
"'target=_blank>"+xmlData.getElementsByTagName("title")[j].childNodes[0].nodeValue+"</a></td><br />"+
// gets the title of entry j
"<td><img src="+xmlData.getElementsByTagName("galleryURL")[j].childNodes[0].nodeValue+" /></td>"+
// gets the picture URL of entry j
"<td>"+xmlData.getElementsByTagName("currentPrice")[j].childNodes[0].nodeValue+"</td></tr>";}
// gets the price of entry j; the loop then starts again, or ends.
text=text+"</table>";
// closing HTML table-tag
appDiv=document.getElementById("itemContent");
appDiv.innerHTML=text;

Today we implemented Yahoo Shopping. We parse the shopping articles with this code:
for (i=0;i<Json_data.length;i++)
{
textYa=textYa+"<tr><td><a href='"+
//new tablerow, starting link
Json_data[i].Offer.Url+
//getting URL
"'>"+Json_data[i].Offer.ProductName+"</a></td>"
//getting Product Name
+"<td><img src="+Json_data[i].Offer.Thumbnail.Url+" /></td>"
//getting pic URL
+"<td>"+Json_data[i].Offer.Price+"</td></tr>";
//getting price
}
textYa=textYa+"</table>";
YahooDiv=document.getElementById("yelpContent");
YahooDiv.innerHTML=textYa;
We had the same problems with Yahoo as we have with Ebay. We discovered that Yahoo Shopping returns not only "Offers" but also "Catalogs", which contain no usable data. Now, before we try to access the data, we check for an Offer object. If there is one, we carry on; if not, we skip the entry. Checking for a picture URL works the same way: if it is undefined, we return only a blank.
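The two checks described above could look roughly like the following sketch. The function name buildYahooRows and the sample data in the usage note are made up for illustration; only the field names (Offer, Url, ProductName, Thumbnail.Url, Price) come from our code above.

```javascript
// Sketch only: builds the table rows for Yahoo Shopping results and
// skips every entry that has no Offer object (for example a Catalog).
// buildYahooRows is a hypothetical helper, not part of any Yahoo API.
function buildYahooRows(results) {
  var rows = "";
  for (var i = 0; i < results.length; i++) {
    var offer = results[i].Offer;
    if (!offer) {
      continue; // not an offer, so there is nothing usable to show
    }
    // Fall back to a blank when the offer has no thumbnail URL.
    var pic = (offer.Thumbnail && offer.Thumbnail.Url) ? offer.Thumbnail.Url : " ";
    rows += "<tr><td><a href='" + offer.Url + "'>" + offer.ProductName + "</a></td>" +
            "<td><img src='" + pic + "' /></td>" +
            "<td>" + offer.Price + "</td></tr>";
  }
  return rows;
}
```

Called with one complete offer, one Catalog entry and one offer without a Thumbnail, this returns two table rows, and the second row gets a blank picture URL instead of throwing an error.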
With Yahoo Shopping we work directly on the objects; things are not that easy with Ebay. Because we call the getElementsByTagName() function on the whole document, we pick out all tags with the given tag name.
This means that the list of galleryURL elements has no gaps. If one item does not have a galleryURL, the next item's galleryURL takes its place at that index. So as soon as at least one article has no galleryURL, the list of galleryURL elements is shorter than the list of title elements, and accessing the missing entry at the end causes the exception.
The workaround at the moment is to check for access to undefined entries and to return an empty string instead. In effect, this adds empty entries at the end of the list where needed.
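To make the index problem easier to see, here is a small simulation. It uses plain JavaScript objects instead of a real XML document, so allValues is only a stand-in that mimics how a document-wide getElementsByTagName() call collects tags; the item data is invented.

```javascript
// Simplified model of the Ebay response: each entry stands for one <item>.
// The second item has no galleryURL, because Ebay leaves the tag out entirely.
var items = [
  { title: "Lamp A", galleryURL: "a.jpg" },
  { title: "Lamp B" },
  { title: "Lamp C", galleryURL: "c.jpg" }
];

// Stand-in for a document-wide lookup: collects every value of one tag
// in document order and simply skips items that do not have the tag.
function allValues(list, tag) {
  var out = [];
  for (var i = 0; i < list.length; i++) {
    if (list[i][tag] !== undefined) {
      out.push(list[i][tag]);
    }
  }
  return out;
}

var titles = allValues(items, "title");          // 3 entries
var galleries = allValues(items, "galleryURL");  // only 2 entries

// Pair the two lists by index, as the loop in our Ebay code does.
var paired = [];
for (var j = 0; j < titles.length; j++) {
  paired.push(titles[j] + " -> " + galleries[j]);
}
// paired is now:
// ["Lamp A -> a.jpg", "Lamp B -> c.jpg", "Lamp C -> undefined"]
```

"Lamp B" ends up with the picture of "Lamp C", and the last index has no entry at all. In the real code that last entry is undefined, so reading .childNodes from it throws the exception that stopped our script.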

Maybe one of you has a good idea how to fix this bug. My idea, taking the item tags and moving through their children, does not work...

Update:
Now I have got things working. Using try-catch blocks allows us to insert a blank galleryURL every time an exception is caught.
try {
if (items[no].getElementsByTagName("galleryURL")[0].childNodes[0].nodeValue==undefined) {
gal_Url=" ";} else {
gal_Url=items[no].getElementsByTagName("galleryURL")[0].childNodes[0].nodeValue;
}
}
catch (e) {
gal_Url=" ";
}


Code written in a try block is treated specially. If an error occurs, the script does not stop; instead, execution jumps to the catch part of the code. An additional option is a finally part for clean-up work that should be performed in both the error case and the normal case.
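Here is a small sketch of how try, catch and finally play together. So that it runs without a browser, it uses a fake item object; fakeItem and readTagText are hypothetical helpers made up for this example, and the log array only exists to show that the finally part runs in both cases.

```javascript
// Minimal stand-in for a DOM element: it only knows getElementsByTagName.
function fakeItem(tags) {
  return {
    getElementsByTagName: function (name) {
      if (Object.prototype.hasOwnProperty.call(tags, name)) {
        return [{ childNodes: [{ nodeValue: tags[name] }] }];
      }
      return []; // tag is missing, like galleryURL on some Ebay items
    }
  };
}

// Reads the text of the first <tagName> inside one item.
// If the tag is missing, [0] is undefined and .childNodes throws,
// so the catch part returns the fallback instead.
function readTagText(item, tagName, fallback, log) {
  try {
    return item.getElementsByTagName(tagName)[0].childNodes[0].nodeValue;
  } catch (e) {
    return fallback;
  } finally {
    // Runs after the try part and after the catch part alike.
    log.push(tagName);
  }
}

var log = [];
var withPic = readTagText(fakeItem({ galleryURL: "a.jpg" }), "galleryURL", " ", log);
var withoutPic = readTagText(fakeItem({ title: "Lamp" }), "galleryURL", " ", log);
// withPic is "a.jpg", withoutPic is " ", and log has two entries.
```

Because the lookup happens on one item at a time, a missing galleryURL only affects that item and no longer shifts the pictures of the items after it.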

5 comments:

  1. To iterate on this issue further, look at the first image from Jassin's post. You will notice the <galleryURL> tag. When an item on Ebay doesn't have a thumbnail photo, instead of leaving the tag as <galleryURL> </galleryURL>, Ebay's XML simply doesn't include the tag at all. Therefore, if we used an if statement to handle a null value, it wouldn't work, because it wouldn't find a <galleryURL> to evaluate. This is also why Jassin's childnode idea didn't work: there is simply no child node in the absence of the <galleryURL>.

    In short, this is a frustrating problem! Any help would be great!

  2. Have you guys tried wrapping your desired if statement within another if statement that first checks if galleryURL exists? That would be how I would approach it. Does that make sense?

  3. I don't know about your problem yet, but you helped me with one of mine! I was wondering how to open a _blank document while parsing the XML document. Very useful.

  4. I looked a little closer because I want to try and get my code to use the target=_blank and I noticed you have: "'target=_blank>" Is this a typo? Do you have a "'" instead of a "<"?

  5. I finally figured out what you are doing, after playing for a bit, so I was wrong about your code. You need the ' in there to make the target=_blank document. I got my stuff working and I am very happy for your post. I don't know if I would ever have figured it out. Thank You.
