Rants about Java and other internet technologies by Sam Pullara

Using Google App Engine to Extend Yahoo! Pipes

was published on April 13th, 2008 and is listed in Technology

Update: A commenter pointed out that you can

from django.utils import simplejson

instead of including it. Makes this even easier.

Yahoo! Pipes has always been a great tool for manipulating data, but you often have to go through great contortions to get it to do what you want because of its very simple data-flow programming model. Google’s App Engine opens up the possibility of extending Yahoo! Pipes in very interesting ways through Pipes’ Web Service operator. Currently this operator sees little use, as it requires you to run an external server somewhere on the internet that is always available when the Pipe executes, which is quite a high barrier to entry for the typical Pipes developer. Here is what a Pipe that uses the Web Service operator looks like; this is also our example Pipe:

[Image: Web Service Pipes Example]

With the launch of Google App Engine there is now a very simple way to get code up on the internet quickly in order to include arbitrary processing in the interior of your Pipes.

To demonstrate how this works, let’s first build a very simple web service that simply mirrors the data it receives from Pipes. If you don’t have a Google App Engine account, you can still follow along by downloading the SDK and running everything locally, though your machine will have to be accessible from the public internet if you want Pipes to send you requests.

First create a new application directory:

mkdir pipes-mirror
cd pipes-mirror 

Now create an application descriptor called app.yaml:

application: javarants
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: pipes.py

This application descriptor basically tells Google how to deploy your application. Your application name should match an application name that you create within the GAE administration tool:

[Image: Application Name]

Now we need to process the data coming from Pipes. Pipes is going to pass this web service some data in JSON format, and we need to parse it. GAE doesn’t include ‘simplejson’ as a standalone module in the Python container, so you are going to have to include it with your application (but see the update at the top of this post). I downloaded simplejson-1.8.1 and symbolically linked its simplejson directory into my application directory. When the request comes in, the JSON data will be in the ‘data’ parameter, so we are going to pull it out, parse it, grab the items array, and write it back over the wire in pipes.py:

import simplejson
import wsgiref.handlers

from google.appengine.ext import webapp


class MirrorPipesWebService(webapp.RequestHandler):
    def post(self):
        # Pipes sends its JSON payload in the "data" form parameter.
        data = self.request.get("data")
        obj = simplejson.loads(data)
        # Echo back only the items array.
        items = obj["items"]
        self.response.headers["Content-Type"] = "application/json"
        simplejson.dump(items, self.response.out)


def main():
    # Route requests for /mirror to the handler above.
    application = webapp.WSGIApplication([("/mirror", MirrorPipesWebService)],
                                         debug=True)
    wsgiref.handlers.CGIHandler().run(application)


if __name__ == "__main__":
    main()
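
The parse-and-echo step can also be tried without App Engine. What follows is a minimal sketch and not part of the original pipes.py: mirror_items is a hypothetical helper name, and the fallback to the standard library json module is an assumption for running it on a local machine (Python 2.6 or later).

```python
# Sketch only: the same parse-and-echo step as the handler, as a plain function.
try:
    # Bundled with App Engine's Python runtime, per the update at the top of the post.
    from django.utils import simplejson
except ImportError:
    # Assumption for local testing: fall back to the standard library json module.
    import json as simplejson


def mirror_items(data):
    """Parse the JSON string that arrives in the 'data' parameter and
    return its 'items' array re-serialized as JSON."""
    obj = simplejson.loads(data)
    return simplejson.dumps(obj["items"])


if __name__ == "__main__":
    sample = '{"items": [{"title": "first"}, {"title": "second"}]}'
    print(mirror_items(sample))
```

If the function raises a KeyError, the payload had no items key, which is worth checking before blaming the deployment.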

Now you should have a directory structure that looks a lot like this:

-rw-r--r--@ 1 sam  sam  106 Apr 13 18:55 app.yaml
-rw-r--r--  1 sam  sam  559 Apr 13 19:28 pipes.py
lrwxr-xr-x  1 sam  sam   47 Apr 13 17:40 simplejson -> /Users/sam/Software/simplejson-1.8.1/simplejson

Now that we have all the pieces we can deploy the application to GAE with a simple command from the GAE SDK:

appcfg.py update .

At this point you should be able to replace my web service URL that you find in my example Pipe with your application URL which will be

http://[application name].appspot.com/mirror

and get the same results as mine.
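
To check a deployed copy outside of Pipes, you can send it the same kind of request yourself. This is a sketch under assumptions: it uses Python 3's standard library, the function names are hypothetical, and the payload shape (a form field named data holding JSON with an items array) is inferred from the handler in pipes.py rather than from Pipes documentation. Note that the handler only implements post, so opening the URL in a browser (a GET) will not show any mirrored data.

```python
import json
import urllib.parse
import urllib.request


def build_mirror_request(url, items):
    """Build a form-encoded POST carrying {"items": [...]} in a 'data' field."""
    payload = json.dumps({"items": items})
    body = urllib.parse.urlencode({"data": payload}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def post_items(url, items):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_mirror_request(url, items)) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a deployed app; the application name is a placeholder):
# print(post_items("http://your-app-name.appspot.com/mirror", [{"title": "hello"}]))
```

If the service is working, the response should be the same items list you sent.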

What kind of uses can you put this great power to? I currently run a web service, deployed on my own server, that combines RSS entries from the same day into a single entry. I will likely port that to GAE, as it doesn’t require a lot of CPU and it is a pain having to administer it. In fact, most of the functionality that you see in a service like FeedBurner would be easy to build on top of this framework. More exotic use cases can be found on Y! Pipes itself, where at least one person uses web services to pass in photo URLs and return the coordinates of human faces in the images.
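
As an illustration of the kind of processing such a service could do, here is a rough sketch of merging items that share a publication day. It is not the actual service mentioned above: the function name is hypothetical, and the item fields (title, description, and an RFC 822 pubDate) are assumptions about what the feed items contain. It uses Python 3's standard library.

```python
from collections import OrderedDict
from email.utils import parsedate_to_datetime


def combine_by_day(items):
    """Merge items published on the same calendar day into one item per day.

    Assumes each item is a dict with 'title', 'description' and an
    RFC 822 'pubDate' string; these field names are illustrative.
    """
    days = OrderedDict()  # keeps days in first-seen order
    for item in items:
        day = parsedate_to_datetime(item["pubDate"]).date().isoformat()
        days.setdefault(day, []).append(item)

    combined = []
    for day, group in days.items():
        combined.append({
            "title": "Entries for %s (%d)" % (day, len(group)),
            "description": "\n".join(
                "%s: %s" % (i["title"], i.get("description", "")) for i in group
            ),
            # Reuse the first item's timestamp for the merged entry.
            "pubDate": group[0]["pubDate"],
        })
    return combined
```

A handler like the mirror service could call a function of this kind on the items array before writing the result back to Pipes.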

  • Nice article.

    By the way, can't you include simplejson using
    "from django.utils import simplejson"?
  • Very cool!
  • sam

    "Nice article. By the way, can't you include simplejson using 'from django.utils import simplejson'?"

    You sure can. I didn't know that was in there. Thanks!
  • Nice article. Oh for 'http://javarants.appspot.com/mirror'.
  • This article is real cool and practical.Keep goin!!
  • I agree with you! This was actually what I was looking for all over the net, and I am glad that I finally stumbled into your article! I love your blog and cool design you have
  • Nice article
  • Brian
    Thanks for the article. Great information!

    I copied your example and python code and uploaded it but I keep getting an error in my pipe saying * Web service failure:
    An Error Occurred
    405 Method Not Allowed

    Do you have any idea why?
  • How do you ensure that the pipe gets run with all the feed content?
    If you don't access the pipe, it looks like nothing gets passed to the web service.
  • You need to either call the pipe on demand when you want the web service to be executed or you can subscribe to the pipe in an RSS reader like My Yahoo, Google Reader, Bloglines or something similar that will poll it periodically.
  • Bo
    Does this still function? I wrote it quickly and never had it work, then went to your appspot and had a blank screen as well. What am I doing wrong?
  • I just tested the pipe that uses it at:

    http://pipes.yahoo.com/spullara/mirror

    And that Pipe seems to mirror the RSS feed just fine. What kind of error are you seeing?
  • Bo
    So if I go to http://javarant.appspot.com/mirror, what should be seen? I got nada. Blank. Which is the same I get with my own. What am I missing?
  • That link only accepts JSON POSTs so nothing really should come from it at all unless you POST JSON to it and then it will return what you post. Maybe you expected something else? You can test it out by cloning the pipe above and changing the input feed.
  • Bo
    Ok. So, let's pretend I'm 5. I copied your pipes and the GAE code and placed it on my appspot.com location, so I've got app.yaml and the xyz.py. What would I need to do to view the pipe JSON on my appspot.com site? My assumption is a wee bit of JavaScript on a page. Correct? Meaning, I need to write a little page that takes the post and makes it visible, yes? I'm very bright. If you're in a cave with no light, and blind, I'm the sun.
  • Ah, maybe we are talking past one another. The code on this page is used to make call outs in the middle of a pipe execution in order to change the final result, not actually display those results on an html page. To do that I suggest you just use the code in the badges that Yahoo! Pipes gives you when you choose "Get Pipe as Badge" on a Pipe's homepage. Here is an example:

    <script src="http://pipes.yahoo.com/js/imagebadge.js">{"pipe_id":"ZKJobpaj3BGZOew9G8evXg","_btype":"image","pipe_params":{"ticker":"YHOO,GOOG,AMTD,ETFC,V,MA,VMW,EMC,C"}}</script>

    That HTML should be enough to show that pipe on your page. If I'm still confused about what you are trying to accomplish, please tell me the end result you are looking for.
  • nodaddy__com
    Seems like you have everything you need to share an article on using pipes and GAE to collect RSS(es) and email them daily instead of relying upon google feedburner :)
  • I read the book "Google Hacks" by Rael Dornfest, Paul Bausch & Tara Calishain. Their examples are coded against the Google API, and your code is as good as theirs.
  • Thanks very much - this really helped out!
  • jlorenzen
    Great article. Thanks. I have found Yahoo Pipes to be very powerful, but like you said, a bit difficult to master. I've written about screen scraping using Yahoo Pipes in the past: http://jlorenzen.blogspot.com/2008/10/using-yah...