One of the really cool features of MobileMe and the new iPhone 3.0 OS is its ability to reach out and locate your iPhone at any time, provided you have that feature enabled. Right now, however, Apple only makes this available on the MobileMe website and does not offer a programmatic way to get hold of the information.
Since the iPhone doesn’t have background processes that could update your location for third-party applications, I thought it would be great to be able to do this anyway by leveraging the website. The first thing you need to do whenever you are going to scrape a sophisticated service like MobileMe is to collect all the relevant packets going over the wire. Since this service is entirely behind HTTPS, the easiest way to do this is within the browser client itself. To that end, I found what I believe to be the best Firefox plugin for the job, Tamper Data.
As you know, the original 4hoursearch was built using Yahoo! BOSS, YUI and Python running on Google App Engine. Although Google App Engine is a very productive environment, I was unhappy with it for a few reasons. First, it doesn’t feel snappy enough, presumably because of the security enforcement aspects of the system. Second, your code has to be written in Python, which is not my favorite environment. Third, I was showcasing a great Yahoo! API by leveraging Google infrastructure. The last point is the least valid, as Yahoo! doesn’t offer a truly equivalent environment, but I think the site will be more compelling as a showcase built entirely on Yahoo! technology. I could have moved it over to this new implementation when YQL launched, but I also wanted to wait for BOSS to offer more features so I could significantly enrich the search experience at the same time.
The Yahoo! Query Language aspires to be the last web service API that the normal developer will ever have to learn. By default we implement 50+ tables that grab data from Yahoo! web services, from some 3rd party web services and from the web at large, the last through our dynamic tables, which allow you to specify a data type and a URL. However, those dynamic tables leave the YQL user with an API that is very flexible but ultimately hard to work with, without the benefit of the structure found in the other tables that we offer.
Today YQL introduced a new feature that allows 3rd parties to define new tables and then share those table definitions with whomever they like. For example, let’s say you are the New York Times, or you are a developer who likes the New York Times APIs and would like to make them more accessible to someone using YQL. Yesterday they released the article search API, so I will use that one, among others, as an example. To get an api-key to execute these examples, go to their developer site. This is a pretty sophisticated API that allows you to search using a variety of parameters. If you were to use YQL without modification, you would simply use the dynamic JSON endpoint to parse out the results from their service. The big issue with this, though, is that you would be unable to easily construct the required URLs and would have to write the code that collects all the parameters and creates the URL yourself. If you had a YQL table, those parameters would be defined, how they are expressed in the URL would be codified, and you would be able to individually address the keys.
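To make the problem concrete, here is a rough sketch of the kind of glue code you end up writing by hand when there is no table definition. The endpoint and the parameter names (`query`, `api-key`, `offset`, `fields`) are placeholders I made up for illustration and are not the actual New York Times API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, for illustration only; not the real article search URL.
BASE_URL = "https://api.example.com/article-search"


def build_search_url(query, api_key, offset=None, fields=None):
    """Collect the named parameters and encode them into one request URL."""
    params = {"query": query, "api-key": api_key}
    if offset is not None:
        params["offset"] = offset
    if fields:
        # Assumed convention: multiple fields are sent as a comma-separated list.
        params["fields"] = ",".join(fields)
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("yql open tables", "YOUR_KEY", fields=["title", "url"])
# Decode the query string again to check what was actually sent.
decoded = parse_qs(urlsplit(url).query)
```

A table definition would move exactly this knowledge (which keys exist and how each one maps into the URL) out of every caller and into one shared place.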
One of the great things about the JAX-RS specification is that it is very extensible, and adding new providers for different MIME types is very easy. One of the interesting binary protocols out there is Google Protocol Buffers. They are designed for high-performance systems and drastically reduce the amount of over-the-wire data and also the amount of CPU spent serializing and deserializing that data. There are other similar frameworks out there, including Fast Infoset and Thrift. Extending JAX-RS to support those protocols is nearly identical, so all of the ideas I’ll talk about are generally valid for those frameworks as well. The one limitation that we will table for now is that JAX-RS only works over HTTP and will not work for raw socket protocols, so the high-performance aspect of protobufs is somewhat reduced by our dependency on the HTTP envelope. My assumption is that you have done your homework and know that message passing is your overriding bottleneck.
Posted in Java, Technology
You’ve probably read about things like Xoopit and Xobni for analyzing both online mail and your Outlook mail. As it turns out, Apple has done something great in this regard that I think has been mostly overlooked. Mail.app stores all of the metadata for your email in a file called ~/Library/Mail/Envelope Index. You might wonder what the format of this file is… well, it is a SQLite3 database. The contents are pretty easy to see. Go to the terminal and type:
macpro:~ sam$ sqlite3 ~/Library/Mail/Envelope\ Index
SQLite version 3.6.3
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
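Because the file is plain SQLite, it can also be opened from a script. This is a minimal sketch using Python's standard `sqlite3` module. The in-memory database and the `messages` table are stand-ins so the example runs anywhere; they are not the real Envelope Index schema.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in an open SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# Stand-in database. To look at real mail metadata, connect to the
# Envelope Index path instead (working on a copy of the file is safer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, subject TEXT)")  # hypothetical table
tables = list_tables(conn)
```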
Building applications for deployment to the web has evolved over the last several years to focus on dynamic behavior, separation of model/view/controller, and simplified but scalable configuration and deployment. From a performance, tools and library perspective, I’m still highly biased toward development in Java over more up-and-coming languages. However, the Java community has learned a lot from the better frameworks like Rails, and those lessons should not be ignored.
I’ve been looking for a while for that perfect combination of frameworks and libraries that would give me the expressive power I want for building web applications. There have been many contenders, from JRuby on Rails to Grails to Seam, and even just writing everything myself. Ultimately, I believe in the DRY principle (like Rails), though I don’t think many frameworks go far enough when dealing with the database. When you are building a web application, it is rare that you are going to change what database you are using. In fact, the majority of your scaling architecture is likely highly dependent on how you store your data. This is why I prefer an application framework that allows me to start with the database and construct my application’s data object model from it.
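To illustrate the database-first idea in the smallest possible form, here is a rough analogue in Python rather than the Java stack this post is about. It reads the column list of a table and generates a matching class from it. The `person` table, the type mapping and the `model_from_table` helper are all made up for this sketch.

```python
import sqlite3
from dataclasses import fields, make_dataclass

# Assumed, deliberately tiny mapping from SQLite column types to Python types.
SQL_TO_PY = {"INTEGER": int, "TEXT": str, "REAL": float}


def model_from_table(conn, table):
    """Build a dataclass whose fields mirror the columns of `table`."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so it must come from trusted code.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    spec = [
        (name, SQL_TO_PY.get(col_type.upper(), object))
        for _cid, name, col_type, *_rest in columns
    ]
    return make_dataclass(table.capitalize(), spec)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")  # hypothetical schema
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Ada')")

Person = model_from_table(conn, "person")
column_names = [f.name for f in fields(Person)]
ada = Person(*conn.execute("SELECT id, name FROM person").fetchone())
```

The schema is stated once, in the database, and the object model follows from it instead of being declared a second time by hand.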
Posted in Java, Technology
Tagged hibernate, Java, jax-rs, jersey, jpa, json, jsr-311, openjpa, orm, rest, toplink, xml
Update: I actually got the fslogger thing at the end of this entry working so I can do incremental backups. Not really a product yet but it isn’t hard to do. Here is the super rough version of it.
I can’t stand inefficiency. Time Machine is fundamentally a very inefficient mechanism for backing up large files that change. It is so bad, in fact, that most tools like Parallels and VMware disable backups of your disk images. Here is the basic algorithm:
Since I moved into the platform group at the beginning of the year, I have worked with the YAP and YQL teams to help them define their strategy and direction, but without being part of the day-to-day operations. In August, the head of the Y!OS project asked me to step in and take them through their final run to launch. It has been a great couple of months working with the teams. They both had an amazing showing at Hack Day, and today we are launching the platforms worldwide.
Posted in Technology
Tagged yahoo, yap, yos, yql
There are obviously a lot of ways to measure how well a country did at the Olympics. This post takes the view that we should look at how many people the country had to draw on in order to send its athletes to China to compete. There are a lot of problems with this, including expats competing for their home country, the vast disparity in wealth between countries, and the relative interest in the Olympic Games across cultures. One of the things that jumps out immediately is that island nations that draw on a larger related population do very well in the games. They have likely not only inherited the interest in the competition but are also wealthy enough to train and compete in the games.
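The measure itself is simple arithmetic: medals divided by population, scaled to something readable such as medals per million people. Here is a small sketch of that calculation. The two countries and their figures are placeholders I made up, not real Olympic results.

```python
def medals_per_million(medals, population):
    """Medals won per one million residents."""
    return medals / (population / 1_000_000)


# Placeholder figures for two made-up countries, not real data.
countries = {
    "Smallland": {"medals": 5, "population": 2_000_000},
    "Bigland": {"medals": 50, "population": 200_000_000},
}

# Rank countries from the highest per-capita medal rate to the lowest.
ranking = sorted(
    countries,
    key=lambda name: medals_per_million(**countries[name]),
    reverse=True,
)
```

With these made-up numbers the small country ranks first despite winning a tenth of the medals, which is the effect this way of measuring is meant to surface.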
Posted in Technology
As this was really just a demonstration of the power of Yahoo! BOSS, I have brought the site back as a demonstration site. Additionally, Yahoo! is making the source code to the new site available so anyone with a knack for Python, HTML and CSS can take a swipe at making a better search experience. In order to make a nice UI I teamed up with another Sam, Sam Lind. I put together the skeleton using Yahoo!’s amazing YUI tools and he created the look and feel. Please try it out and take advantage of Yahoo!’s open search API: