Google App Engine Thoughts

Introduction


For some time now I've been working on the Google App Engine platform.

I've built several production back ends and web services for professional applications. Along the way I've had to solve a lot of problems on App Engine, and after using it I want to share my experience with this particular platform.

First of all, I think Google makes really great software and has had to solve many problems before anyone else in the world. They've made an awesome map platform, mail platform, mobile platform, office software platform and others.

So it seemed obvious that if they made such great applications, we App Engine developers would benefit from the same improvements in our development platform. That's why we chose this hosting platform.

Data Store


Google App Engine offers an API in Java, Python and Go for developing applications. I mostly use the Python version.

The API offers a lot of services, from email sending to task queuing.

But like any information system, our application needs somewhere to save its data. This is the datastore. Google uses a proprietary datastore to store this information.

This database system is supposed to be optimized for storing and accessing data across a cluster, which makes it particularly suited to cloud computing.

It is called Bigtable and follows the NoSQL approach, so it is not a relational database. That means no many-to-many relations and no ON DELETE CASCADE.

Thankfully, it is possible to simulate such behavior using ReferenceProperty and/or parent-child relationships.
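To make the idea concrete, here is a minimal sketch in plain Python of how a many-to-many relation and a cascading delete can be simulated with references and parent links. It is an in-memory stand-in, not the App Engine API: the Store class, its methods, the books_of helper and the entity kinds (Author, Book, Authorship, Profile) are all hypothetical names chosen for illustration.

```python
class Store(object):
    """In-memory stand-in for a datastore (hypothetical, not the App Engine API)."""

    def __init__(self):
        self.entities = {}   # key -> {"parent": key or None, "props": dict}
        self._next_id = 1

    def put(self, kind, parent=None, **props):
        """Save an entity and return its key, a (kind, id) tuple."""
        key = (kind, self._next_id)
        self._next_id += 1
        self.entities[key] = {"parent": parent, "props": props}
        return key

    def referencing(self, key):
        """Keys of entities that use `key` as parent or as a property value."""
        return [k for k, e in self.entities.items()
                if e["parent"] == key or key in e["props"].values()]

    def delete_cascade(self, key):
        """Delete an entity, then everything that points at it."""
        if self.entities.pop(key, None) is None:
            return  # already gone, which also stops reference cycles
        for dependent in self.referencing(key):
            self.delete_cascade(dependent)


def books_of(store, author):
    """Follow the Authorship link entities from an author to its books."""
    return [e["props"]["book"] for k, e in store.entities.items()
            if k[0] == "Authorship" and e["props"]["author"] == author]


store = Store()
alice = store.put("Author", name="Alice")
book = store.put("Book", title="Datastore Notes")
store.put("Authorship", author=alice, book=book)   # many-to-many link entity
store.put("Profile", parent=alice, bio="writer")   # parent-child relationship
store.delete_cascade(alice)                         # removes link and profile too
```

The link entity plays the role of a join table, and the cascade has to be done by the application itself, which is the price to pay for leaving the relational model.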

MapReduce algorithm


As far as I understand, the BigTable database relies on the MapReduce algorithm to process its internal data. This is probably due to the use of GFS for storing data.

The MapReduce algorithm is very useful in distributed environments to speed up the filtering and processing of data. The drawback of this method is that it is really only applicable to background tasks.
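For readers who have never seen it, the MapReduce model can be sketched in a few lines of plain Python. This is a single-process word count with no distribution at all, and the function names (map_phase, shuffle, reduce_phase, map_reduce) are my own choices for illustration, not any framework's API.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce over a list of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


counts = map_reduce(["big table big data", "Big cluster"])
```

In a real cluster the map and reduce calls run on many machines and the shuffle moves data between them, which is why the whole job behaves like a batch task rather than an interactive query.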

Moreover, the MapReduce algorithm gives up the performance gains offered by common database features such as B-tree indexes and hash partitioning.
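The difference can be illustrated with a small sketch, assuming a list of (key, value) rows. A sorted list searched with bisect stands in here for a B-tree index (it is not a real B-tree), and the function names are hypothetical.

```python
import bisect


def scan_lookup(rows, wanted):
    """Full scan: look at every row until the key is found."""
    for key, value in rows:
        if key == wanted:
            return value
    return None


def build_index(rows):
    """Sort the rows once so that later lookups can use binary search."""
    ordered = sorted(rows, key=lambda row: row[0])
    keys = [row[0] for row in ordered]
    return keys, ordered


def index_lookup(index, wanted):
    """Indexed lookup: binary search over the sorted keys."""
    keys, ordered = index
    pos = bisect.bisect_left(keys, wanted)
    if pos < len(keys) and keys[pos] == wanted:
        return ordered[pos][1]
    return None


rows = [(user_id, "user-%d" % user_id) for user_id in range(0, 1000, 2)]
index = build_index(rows)
same = scan_lookup(rows, 998) == index_lookup(index, 998)
```

Both functions return the same answer, but the scan touches every row in the worst case while the indexed lookup needs only a handful of comparisons.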

As with filesystems, MapReduce has shown its limitations, and according to this Wired article, Google itself has decided to migrate its infrastructure to another filesystem called Colossus.

So we can expect Google to apply those changes to Google App Engine sooner or later. But for the moment, using BigTable sometimes leads to really bad performance for real-time data access in some applications.

That's why you should rely on the memcache API when you face such a scenario in one of your projects... for the moment.
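The usual way to do this is the cache-aside pattern: look in the cache first and only hit the datastore on a miss. Here is a minimal sketch in plain Python. FakeCache is a hypothetical dict-based stand-in so the example runs anywhere; on App Engine you would call memcache.get(key) and memcache.set(key, value, time=...) from google.appengine.api instead. The cached_fetch helper and its load callback are also names I made up for illustration.

```python
import time


class FakeCache(object):
    """Dict-based stand-in for a memcache client (hypothetical)."""

    def __init__(self):
        self._data = {}   # key -> (value, expiry timestamp or None)

    def get(self, key):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        """Store a value, optionally expiring after `ttl` seconds."""
        expires_at = time.time() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)


def cached_fetch(cache, key, load, ttl=60):
    """Cache-aside: try the cache, fall back to the slow loader on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)          # the slow datastore query would go here
        cache.set(key, value, ttl)
    return value
```

Note that in this sketch a stored None is indistinguishable from a miss, and that cached data can be stale until it expires, so the pattern suits data that is read far more often than it changes.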

Alternatives


Another option is to use a different cloud hosting provider, such as Amazon, Yahoo or a more open source option like OpenShift.

Some of those systems use open source technology. The ones that I like are Apache Hadoop for the filesystem and MongoDB for the database. They offer good performance and scalability, along with the flexibility in development that a lot of projects need.
