YQL opens up 3rd-party web service table definitions to developers



The Yahoo! Query Language aspires to be the last web service API that the average developer will ever have to learn. By default we implement 50+ tables that pull data from Yahoo! web services, from some 3rd-party web services, and from the web at large through our dynamic tables, which let you specify a data type and a URL. However, those dynamic tables leave the YQL user with an API that is very flexible but ultimately hard to work with, without the benefit of the structure found in the other tables that we offer.

Today YQL introduced a new feature that allows 3rd parties to define new tables and then share those table definitions with whomever they like. For example, let’s say you are the New York Times, or you are a developer who likes the New York Times APIs and would like to make them more accessible to someone using YQL. Yesterday they released the article search API, so I will use that one, among others, as an example. To get an api-key to execute these examples, go to their developer site. This is a pretty sophisticated API that allows you to search using a variety of parameters. If you were to use YQL without modification, you would simply use the dynamic JSON table to parse out the results from their service. The big issue with this is that you cannot easily construct the required URLs: you would have to write the code that collects all the parameters and builds the URL yourself. With a YQL table, those parameters are defined, the way they are expressed in the URL is codified, and you can address each key individually.

So without this ability you would use something like this:

select * from json
where url='http://api.nytimes.com/svc/search/v1/article?api-key=...&query=yahoo&begin_date=19990112&end_date=19991231'
and itemPath='json.results'
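To make the cost of the dynamic-table approach concrete, here is a minimal Python sketch of the URL-building code you would end up writing yourself. The base URL and parameter names come from the query above; the helper names `build_search_url` and `build_yql` are hypothetical and not part of YQL or the New York Times API.

```python
from urllib.parse import urlencode

# Base URL taken from the query above; the helper names are hypothetical.
NYT_ARTICLE_SEARCH = "http://api.nytimes.com/svc/search/v1/article"


def build_search_url(api_key, query, begin_date=None, end_date=None):
    """Assemble the article search URL by hand from individual parameters."""
    params = {"api-key": api_key, "query": query}
    # Only include the optional date bounds when the caller supplies them.
    if begin_date is not None:
        params["begin_date"] = begin_date
    if end_date is not None:
        params["end_date"] = end_date
    return NYT_ARTICLE_SEARCH + "?" + urlencode(params)


def build_yql(url, item_path="json.results"):
    """Wrap the URL in the dynamic json table query shown above."""
    return "select * from json where url='{}' and itemPath='{}'".format(url, item_path)
```

Every caller has to know which parameters exist and how they are spelled in the URL, which is exactly the knowledge a table definition would capture once for everyone.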

If we instead define the API using the YQL open data tables specification, you can do this (by the way, don’t read the headlines, you’ll just be depressed):

use 'http://www.javarants.com/nyt/nyt.article.search.xml' as articles;
select * from articles where apikey='…' and query='yahoo' and begin_date='19990112' and
end_date='19991231'

Why is this superior? For a number of reasons. The second form is not only easier for anyone to use, it also turns those keys from the query into columns, which allows joins that you cannot do with the first abstraction. Here is an example join:

select * from articles where apikey='…' and query in ('yahoo', 'google', 'microsoft') and begin_date='19990112' and end_date='20000101'

This will actually do 3 searches for you in parallel and then return the combined result. By creating a YQL open table we can off-load processing that you would normally do on your client or server to the YQL engine. You’ll note with that query that the user-time spent is about half or less of the actual service-time, so running the calls asynchronously drastically decreases latency.
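For comparison, this is roughly the fan-out you would otherwise write on your own client or server. It is a sketch only, not how the YQL engine is implemented: `search` is a stand-in that returns made-up rows instead of calling the article search service, and both function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search(term):
    # Stand-in for one call to the article search service; a real version
    # would issue an HTTP request here. The returned shape is made up.
    return [{"query": term, "title": "result for " + term}]


def search_all(terms):
    """Run one search per term concurrently and concatenate the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(terms))) as pool:
        # map() yields results in the order of the input terms.
        batches = list(pool.map(search, terms))
    return [row for batch in batches for row in batch]
```

Calling `search_all(["yahoo", "google", "microsoft"])` issues the three lookups concurrently and returns one combined list, which is the work the `in (...)` clause hands to YQL.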

One of the really nice APIs out there is FriendFeed’s API. It is very well designed and works easily with straightforward table definitions. Here is an example of how to get the public feed:

use 'http://www.javarants.com/friendfeed/friendfeed.feeds.xml' as ff;
select * from ff

By defining different endpoints within the same table definition, we will automatically select the correct API based on the keys included in the query. Using that same table we can also get my public entries from Twitter:

select * from ff where nickname='spullara' and service='twitter'
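One way to picture the endpoint selection is as a match between the keys in the where clause and the keys each endpoint declares. The sketch below is an assumption about the general idea, not YQL’s actual algorithm, and the endpoint list is hypothetical rather than copied from the friendfeed table definition.

```python
# Hypothetical endpoints for one table: each lists the keys it requires.
ENDPOINTS = [
    {"name": "public feed", "keys": set()},
    {"name": "user feed", "keys": {"nickname"}},
    {"name": "user feed filtered by service", "keys": {"nickname", "service"}},
]


def pick_endpoint(query_keys):
    """Return the most specific endpoint whose keys all appear in the query."""
    provided = set(query_keys)
    # An endpoint is a candidate when every key it requires was provided.
    candidates = [e for e in ENDPOINTS if e["keys"] <= provided]
    return max(candidates, key=lambda e: len(e["keys"]))
```

With no keys the sketch picks the public feed, and with `nickname` and `service` it picks the filtered user feed, mirroring the two queries above.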

This won’t include my private entries though. However, if you had my remotekey, you could generate the required authorization header and pass it to YQL, which would pass it on to authenticate the API call. Another popular request on the forums is the ability to use weather.com’s API to tease out international locations. That is actually really easy, and you can even use that data to join with their weather forecasts. Here is an example where we pull the weather for all the Moscows of the world:

use 'http://www.javarants.com/weather/weather.search.xml' as ws;
use 'http://www.javarants.com/weather/weather.local.xml' as wl;
select * from wl where location in (select id from ws where query='moscow')
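Spelled out as client code, the sub-select amounts to one location search followed by one forecast lookup per matching id. The sketch below only illustrates that shape: both functions are stand-ins for the `ws` and `wl` tables, and the ids and forecast values are invented placeholders, not weather.com data.

```python
def search_locations(query):
    # Stand-in for the ws table; the ids below are invented sample values.
    sample = {"moscow": ["LOC-1", "LOC-2"]}
    return [{"id": loc_id} for loc_id in sample.get(query, [])]


def local_weather(location):
    # Stand-in for the wl table; a real version would call the weather service.
    return {"location": location, "forecast": "unknown"}


def weather_for(query):
    """Emulate the nested query: look up ids first, then fetch weather per id."""
    ids = [row["id"] for row in search_locations(query)]  # inner select
    return [local_weather(loc_id) for loc_id in ids]  # outer select, one call per id
```

In YQL the same two-step lookup is a single statement, and the engine decides how to schedule the per-location calls.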

I’ve started a project on GitHub called yql-tables to store useful table definitions and will be taking submissions from the community. You can try them out by ‘use’ing them directly from the git repository, or by pulling them onto your own server, as long as it is accessible from the YQL servers.