Archive for the ‘Code’ Category

Web Console and Browser Added to django-json-rpc

Saturday, November 7th, 2009

This morning I put the finishing touches on the web console and service browser for django-json-rpc. It’s a fully-bundled utility to aid in rapid development of your web service. It’s available on github now and will be published to PyPI as soon as I work out one more interface bug. A live demo is available. Check it out:

[Screenshot: jsonrpcbrowserscreen]

JSON-RPC in Objective-C

Friday, November 6th, 2009

Handling JSON in Objective-C is already pretty easy with the excellent strict parser provided by json-framework. However, interacting with a web service in a graphical application can be hard to get right. Frequently unresponsive or unpredictable behavior is a big turnoff for users. Recently I have been developing a lot for the iPhone and, more often than not, the ideas that beget these applications revolve around central web applications. Most client-server interactions boil down to the familiar CRUD apps we’re all so used to generating with Rails and Django, but they require much more effort.

Writing applications for a mobile device introduces several challenges: limited memory, slow network and a language and framework designed without an eye on modern agile web services. Recently while writing an iPhone application with heavy web interaction I got lazy doing all that work and wrote DeferredKit – a port of Twisted’s Deferred with many extensions for us Objective-C coders. DeferredKit makes asynchronous programming in Objective-C really easy with very little overhead and includes an asynchronous JSON-RPC client. It allows you to write code like this:

  - (void)doRegistration {
    [[[[[DKDeferred jsonService:SERVICE_URL]
       myApp] register:array_(self.username, self.password)] 
      addCallback:callbackTS(self, doRegistrationCompleted:)]
     addErrback:callbackTS(self, doRegistrationFailed:)];
  }
 
  - (id)doRegistrationCompleted:(id)result {
    // do something with result
    return result;
  }
 
  - (id)doRegistrationFailed:(NSError *)err {
    // do something with error
    return err;
  }

Jumping In

Getting started with DeferredKit is relatively easy. Download or pull DeferredKit from github and drop everything from DeferredKit/CocoaDeferred/Source into your project (you can also compile DeferredKit as a direct dependency in Xcode; a how-to is in the README).

First, a bit of explanation. Despite the recent addition of blocks to the Objective-C language, the blocks runtime is not available to us iPhone developers (or anybody who wants to deploy to targets older than 10.6, sadly). Also, since there is no native function object in Objective-C, functions cannot be passed around as first-class citizens, stored in collection types, persisted, etc. DeferredKit provides a function object called DKCallback. You will almost never have to touch the interface of DKCallback, since handy macros are provided to generate function objects.

A deferred is defined as an object that encapsulates a return value that is not yet available (meaning, it will be returned from the network when done downloading, or return from a thread when done computing). The deferred then dispatches the value to a series of callbacks as soon as it becomes available.
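To make the idea concrete, here is a minimal sketch of a deferred in Python 3 (the language of Twisted's original). It is an illustration only, not DeferredKit's or Twisted's implementation: the class and method names are made up for this example, and error handling (errbacks) is left out.

```python
class Deferred(object):
    """Holds a value that is not available yet and a chain of callbacks."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def add_callback(self, fn):
        # Callbacks run in the order they were added; each one receives
        # the return value of the previous one.
        self._callbacks.append(fn)
        if self._fired:
            self._run()
        return self

    def callback(self, value):
        # Called once by whoever produces the value (network, thread, ...).
        if self._fired:
            raise RuntimeError('deferred already fired')
        self._fired = True
        self._result = value
        self._run()

    def _run(self):
        while self._callbacks:
            fn = self._callbacks.pop(0)
            self._result = fn(self._result)
```

A caller attaches callbacks first and the producer fires the deferred later with `d.callback(value)`; callbacks added after that point run immediately with the current result.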

JSON-RPC is exposed in DeferredKit through a class called DKJSONServiceProxy which employs a bit of magic to enable the dynamic method calling pattern you saw above. For convenience, autorelease constructors are also provided in DKDeferred:

  id service = [DKDeferred jsonService:@"http://myservice.net/json/" name:@"myapp.sayHello"];

(note: don’t forget to #import DKDeferred+JSON.h)

Calling a method returns a deferred:

  DKDeferred *d = [service :args_array];

You must add at least one callback to the deferred for the call to be performed. In this case, callbackTS builds a callback with a target (which can be any object) and a selector to be performed. A callback could also be a function pointer, NSInvocation or any object conforming to the DKCallback protocol.

  [d addCallback:callbackTS(self, gotSayHelloResults:)];

In case your JSON-RPC server returns an error (via the error key) or the call encounters a network error or parse error, the deferred will automatically call any errbacks (“error callbacks”) you have added, passing them an NSError object.

  [d addErrback:callbackTS(self, handleSayHelloError:)];

Callbacks and errbacks are written as a function or method that takes a single argument, the result, and returns id. JSON-RPC results will always be an NSDictionary with a result key, among others.

  - (id)gotSayHelloResults:(id)results {
    self.helloLabel.text = [results objectForKey:@"result"];
    return nil;
  }

Errbacks are the same as callbacks except they will always be called with an NSError argument. The userInfo dictionary will be populated with the error dictionary produced by your JSON-RPC server. userInfo will be different for different error types, however, so be sure to check which kind of error is being returned.

  - (id)handleSayHelloError:(NSError *)err {
    [self alertError:[[[err userInfo] objectForKey:@"error"] objectForKey:@"message"]];
    return nil;
  }
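The split between the result key and the error key is a property of the JSON-RPC response envelope rather than of DeferredKit. As a rough sketch in Python 3, assuming a response shaped like the ones shown elsewhere in this archive (the `unwrap` helper and `JSONRPCError` class are hypothetical names for illustration), a client can separate the two cases like this:

```python
import json


class JSONRPCError(Exception):
    """Raised when the response carries a non-null error member."""

    def __init__(self, error):
        if isinstance(error, dict):
            message = error.get('message', 'unknown error')
        else:
            message = str(error)
        Exception.__init__(self, message)
        self.error = error


def unwrap(raw_response):
    """Return the result member, or raise if the server reported an error."""
    response = json.loads(raw_response)
    if response.get('error') is not None:
        raise JSONRPCError(response['error'])
    return response['result']
```

This mirrors the callback/errback split: the success path only ever sees the result, and the failure path gets the server's error object.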

Conclusion

I’ve only had the opportunity to touch on the basics of DeferredKit, but a post is coming soon with more details. It is a tested, lightweight framework for writing asynchronous code and can greatly improve your productivity when working with web services.

Django and JSON-RPC

Tuesday, November 3rd, 2009

In application development, I’ve yet to work on a product whose primary functionality was not aided or entirely supported by a central web application. Developers in this situation rarely have the luxury of using a common programming language for both the server and the client. Many end up rolling their own RPC, which evolves over the lifetime of the product to meet its needs. This puts unneeded strain on the developer, who has to play RPC expert as well as interface designer and expert-everything-else.

I have built many primarily-RPC web applications using Tomcat, PHP or Twisted, but by far the best way is using Python and Django. However, I have always viewed the state of RPC in Django as broken. While using Django for all sorts of other web application purposes is incredibly easy, getting started with RPC presents a lot more friction.

Introducing django-json-rpc, a very easy way to expose your web application to the world of integrated applications.

Installation is as simple as:

  easy_install django-json-rpc

Add a mount point to your urls.py file and import django-json-rpc.

  from jsonrpc import jsonrpc_site
 
  urlpatterns += patterns('', 
    (r'^json/', jsonrpc_site.dispatch)
  )

The interface to django-json-rpc is exposed through only one function – the jsonrpc_method decorator. Functions that use this decorator may be placed anywhere in the source tree and need only import jsonrpc.jsonrpc_method.

  from jsonrpc import jsonrpc_method
 
  @jsonrpc_method('app.sayHello')
  def say_hello(request, name):
    return "Hello, " + name
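If you are curious how a decorator like this can act as the entire interface, the following Python sketch shows the general registry pattern. It is not django-json-rpc's actual code: `rpc_method`, `registry` and `dispatch` are hypothetical names, and the real package does considerably more (validation, authentication, response encoding).

```python
# Maps public method names to the functions that implement them.
registry = {}


def rpc_method(name):
    """Register the decorated function under a public RPC name."""
    def decorator(func):
        registry[name] = func
        return func  # the function itself is left untouched
    return decorator


@rpc_method('app.sayHello')
def say_hello(request, name):
    return "Hello, " + name


def dispatch(request, method, params):
    """Look up a registered method by name and call it with the params."""
    return registry[method](request, *params)
```

Because registration happens as a side effect of the decorator running, a function only becomes callable once its module has been imported.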

To hook your views into the default jsonrpc site, they must simply be imported somewhere by Django. The best place for this is in your urls.py file, which should now look something like this:

  from jsonrpc import jsonrpc_site
  import myapp.views
 
  urlpatterns += patterns('', 
    (r'^json/', jsonrpc_site.dispatch)
  )

Test your service with the provided Proxy:

  ./manage.py runserver 8080
 
  ./manage.py shell
  >>> from jsonrpc.proxy import ServiceProxy
  >>> s = ServiceProxy('http://localhost:8080/json/')
  >>> s.app.sayHello('Sam')
  {u'error': None, u'id': u'jsonrpc', u'result': u'Hello, Sam'}
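For clients written in other languages, it helps to know what travels over the wire. A JSON-RPC 1.0 request is a JSON object with method, params and id members. The Python 3 sketch below builds such a body; `build_request` is a hypothetical helper, and the id value simply echoes the one seen in the response above.

```python
import json


def build_request(method, params, request_id='jsonrpc'):
    """Serialize a JSON-RPC 1.0 style request body."""
    return json.dumps({
        'method': method,        # dotted name given to jsonrpc_method
        'params': list(params),  # positional arguments
        'id': request_id,        # echoed back in the response
    })
```

POSTing the string returned by `build_request('app.sayHello', ['Sam'])` to the mount point is, in essence, what a proxy object does for you.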

HTTP GET

Django-json-rpc supports JSON-RPC version 1.1 which includes support for HTTP GET, meaning, all your REST are belong to us as well. Add the HTTP GET mount point to your urls.py file:

  from jsonrpc import jsonrpc_site
  import myapp.views
 
  urlpatterns += patterns('', 
    (r'^json/$', jsonrpc_site.dispatch),
    (r'^json/(?P<method>[a-zA-Z0-9.]+)$', jsonrpc_site.dispatch)
  )

django-json-rpc requires each method to be explicitly marked as safe for dispatch through HTTP GET. jsonrpc_method('app.sayHello') becomes:

  jsonrpc_method('app.sayHello', safe=True)

Now your method will be available at http://localhost:8080/json/app.sayHello?name=Sam. HTTP GET requests only support string and number typed arguments.
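Building such a URL by hand is error-prone once arguments contain spaces or punctuation. A small Python 3 sketch using the standard library's `urlencode` (the `get_url` helper is a hypothetical name for illustration):

```python
from urllib.parse import urlencode


def get_url(mount_point, method, **params):
    """Build an HTTP GET URL for a method marked safe=True."""
    # Sort the arguments so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return '%s%s?%s' % (mount_point, method, query)
```

Calling `get_url('http://localhost:8080/json/', 'app.sayHello', name='Sam')` produces the URL shown above.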

Authentication

The django-json-rpc package also supports authentication. By default it uses django.contrib.auth’s model backend (the User object we’re all so familiar with), but you may provide any method, including any authentication middleware. If you aren’t using middleware, username and password arguments are automatically added to the beginning of the argument list for your method.

  @jsonrpc_method('app.sayHello', authenticated=True)
  def say_hello(request):
    return "Hello, " + request.user.first_name

  >>> s.app.sayHello('samuraiblog', 'password')
  {u'error': None, u'id': u'jsonrpc', u'result': u'Hello, Sam'}

Version Agnostic

django-json-rpc will continue to work regardless of which version of JSON-RPC your client happens to be using. It supports all argument types supported in JSON-RPC versions 1.0, 1.1 and 2.0, and will respond if a client specifies the version in either the jsonrpc or version key.
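The version check itself can be very small. The Python sketch below shows one way a server might read the version from either key; `detect_version` is a hypothetical helper, and falling back to 1.0 when neither key is present is an assumption of this example, not documented behaviour of django-json-rpc.

```python
def detect_version(request):
    """Return the JSON-RPC version a decoded request declares, as a string."""
    if 'jsonrpc' in request:   # key used by JSON-RPC 2.0
        return str(request['jsonrpc'])
    if 'version' in request:   # key used by JSON-RPC 1.1
        return str(request['version'])
    return '1.0'               # assumed default: 1.0 requests carry no version
```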

Conclusion

I have used this package for the development of several commercial iPhone applications and it has greatly improved my productivity in developing a web service backend. Up next? A django-xmlrpc? Haha! Fuck that!

Compressing NSData objects using zlib

Monday, November 2nd, 2009

This entirely iPhone friendly tutorial highlights how easy it is to compress our familiar NSData objects using zlib – the most commonly used compression library on the planet. This is the same library that compresses the PNGs and PDFs, ZIPs and GZIPed files that make up your daily life.

The zlib C library, which is bundled with Mac OS X and all iPhone 2.0+ devices, offers an extensive but unobtrusive API for compressing data. The most basic functions are the stream API, which serves nicely when, for instance, dealing with large files or matching a specific compression requirement. The method presented here is a utility function built atop the streaming API. Readers familiar with Python will recognize it if they have ever handled data with the zlib module.
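For comparison, this is roughly what the same one-shot operation looks like with the Python zlib module mentioned above (Python 3; `compress_bytes` is just an illustrative wrapper name). Python manages the output buffer for you, which is exactly the part the Objective-C version has to do by hand.

```python
import zlib


def compress_bytes(data):
    """One-shot zlib compression; empty input is returned unchanged."""
    if not data:
        return data
    return zlib.compress(data)


original = b'all your base are belong to us ' * 100
packed = compress_bytes(original)
restored = zlib.decompress(packed)
```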

First, get a pointer of type const Bytef * to the bytes in our uncompressed NSData object.

  const Bytef *inBytes = (const Bytef *)[uncompressedData bytes];

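Allocate an output buffer large enough to hold the compressed data. The zlib manual requires the destination buffer to be at least 0.1% larger than the uncompressed data, plus twelve bytes; the code below is more generous and allocates 10% extra plus twelve bytes. Note that compress does not resize the output buffer: if the buffer turns out to be too small, it returns Z_BUF_ERROR.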

  uLongf outLength = ([uncompressedData length] * 1.1) + 12;
  Bytef *outBytes = (Bytef *)malloc(outLength);

Now simply call compress:

  int z_result = compress(outBytes, &outLength, inBytes, [uncompressedData length]);

compress will return one of three constants: Z_OK, Z_MEM_ERROR or Z_BUF_ERROR. To handle errors, check the return value. Depending on your situation, you could return an NSError. In this case, I return an empty NSData and check the length of the return value.

  NSData *ret;
  if (z_result == Z_OK) {
    // NSData takes ownership of outBytes and frees it when deallocated
    ret = [NSData dataWithBytesNoCopy:outBytes length:outLength freeWhenDone:YES];
  } else {
    // compression failed: release the buffer ourselves and return empty data
    free(outBytes);
    ret = [NSData data];
    switch (z_result) {
      case Z_MEM_ERROR:
        printf("compressData got Z_MEM_ERROR: out of memory. :(\n");
        break;
      case Z_BUF_ERROR:
        printf("compressData got Z_BUF_ERROR: output buffer wasn't large enough :(\n");
        break;
      default:
        break;
    }
  }

compressData function

#import <Foundation/Foundation.h>
#include <zlib.h>

NSData* compressData(NSData *uncompressedData) {
  if ([uncompressedData length] == 0) 
    return uncompressedData;
  const Bytef *inBytes = (const Bytef *)[uncompressedData bytes];
  // 10% headroom plus twelve bytes for the output buffer
  uLongf outLength = ([uncompressedData length] * 1.1) + 12;
  Bytef *outBytes = (Bytef *)malloc(outLength);
  if (outBytes == NULL)
    return [NSData data];
  int z_result = compress(outBytes, &outLength, inBytes, [uncompressedData length]);
  NSData *ret;
  if (z_result == Z_OK) {
    // NSData takes ownership of outBytes and frees it when deallocated
    ret = [NSData dataWithBytesNoCopy:outBytes length:outLength freeWhenDone:YES];
  } else {
    // compression failed: release the buffer ourselves and return empty data
    free(outBytes);
    ret = [NSData data];
    switch (z_result) {
      case Z_MEM_ERROR:
        printf("compressData got Z_MEM_ERROR: out of memory. :(\n");
        break;
      case Z_BUF_ERROR:
        printf("compressData got Z_BUF_ERROR: output buffer wasn't large enough :(\n");
        break;
      default:
        break;
    }
  }
  return ret;
}

compressData that returns an NSError.

#define ZLIB_COMPRESS_DOMAIN @"zlib_compress_domain"
 
NSError* compressData2(NSData *inData, NSData **outData) {
  if ([inData length] == 0) {
    *outData = inData;
    return nil;
  }
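  const Bytef *inBytes = (const Bytef *)[inData bytes];
  uLongf outLength = ([inData length] * 1.1) + 12;
  Bytef *outBytes = (Bytef *)malloc(outLength);
  if (outBytes == NULL) {
    return [NSError errorWithDomain:ZLIB_COMPRESS_DOMAIN 
      code:Z_MEM_ERROR userInfo:[NSDictionary dictionary]];
  }
  int z_result = compress(outBytes, &outLength, inBytes, [inData length]); 
  if (z_result == Z_OK) {
    // NSData now owns outBytes (freeWhenDone:YES), so it must not be freed here
    *outData = [NSData dataWithBytesNoCopy:outBytes length:outLength freeWhenDone:YES];
    return nil;
  }
  // compression failed: the buffer is still ours to release
  free(outBytes);
  return [NSError errorWithDomain:ZLIB_COMPRESS_DOMAIN 
    code:z_result userInfo:[NSDictionary dictionary]];
}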

NSData category

@interface NSData (CompressionAdditions)
 
- (NSData *)compress;
- (NSData *)compress:(NSError **)errorOut;
 
@end
 
 
@implementation NSData (CompressionAdditions)
 
- (NSData *)compress { 
  return compressData(self);
}
 
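- (NSData *)compress:(NSError **)errorOut {
  NSData *ret = nil;
  NSError *err = compressData2(self, &ret);
  // callers may pass NULL when they don't care about the error
  if (errorOut != NULL)
    *errorOut = err;
  return ret;
}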
 
@end

File Uploads in twisted.web

Friday, October 30th, 2009

Twisted is a Python package for building asynchronous applications. Its wide range of uses and sub-packages is daunting; I’ve developed with Twisted for several years now and still find its vast expanse of code intimidating. Twisted in 60 seconds is an excellent resource for anyone beginning with Twisted and twisted.web. Here, we’re filling yet another gap in the Twisted documentation.

Not surprisingly, handling file uploads in twisted.web is straightforward. Files are accessed through the args attribute of the request parameter to your render method. Since a given field can carry multiple files, request.args[FIELD_NAME] is always a list, whether it holds one file or several.

from twisted.internet import reactor
from twisted.web import resource, server
 
class FileUploadService(resource.Resource):  
  def render_GET(self, request):
    return '''<form enctype="multipart/form-data" method="post" action=".">
                 <input type="file" name="da_file"/>
                 <input type="submit"/>
              </form>'''
 
  def render_POST(self, request):
    return 'You submitted a file that was %i bytes' % len(request.args['da_file'][0])
 
root = resource.Resource()
root.putChild("upload", FileUploadService())
factory = server.Site(root)
reactor.listenTCP(8088, factory)
reactor.run()

According to the twisted.web documentation, this method of handling file uploads is not preferred. In fact, the preferred way is to use the more complex twisted.web2 package, which supports streaming uploads. However, the Twisted devs are working on porting most of twisted.web2’s features to twisted.web in an effort to consolidate the two. In production code, I have chosen to stick with twisted.web since it seems to be the package which will remain the most compatible with future versions of Twisted.

Create a Long-Running-Process Server for Django

Friday, March 6th, 2009

Have you ever wanted to put up a server made exclusively to handle the expensive actions in your web app? In this article I will show you how to create a server that receives commands from your web server and processes them in the context of your Django web app. This technique could be used in many places, such as a desktop GUI application or a web service architecture.

Say you have work unit WU which takes pickleable arguments (a, b, c). If your work unit is thread-safe (it modifies no global variables and causes no side effects in your application), it can be performed in another process, or even on another machine, allowing you to return control to the user immediately.
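Whether a set of arguments really is pickleable is easy to check up front. The Python 3 sketch below uses a hypothetical `is_pickleable` helper that simply attempts the serialization that would otherwise happen when the arguments cross the process boundary.

```python
import pickle
import threading


def is_pickleable(*args):
    """Return True if all arguments survive serialization with pickle."""
    try:
        pickle.dumps(args)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


plain_ok = is_pickleable(1, 'b', [3.0, {'c': None}])  # simple values are fine
lock_ok = is_pickleable(threading.Lock())             # locks cannot be pickled
```

Plain data such as numbers, strings, lists and dicts passes; objects tied to the current process, such as locks, sockets or open files, do not.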

For example, I want to add the ability for a user to import their contacts using X contacts service:

Fig 1: project/views.py

def import_contacts(request):
    from my.specialsauce.contacts import run_contacts_import
    contacts = run_contacts_import(request)
    return render_to_response('t/contacts-import.html', {'contacts': contacts})

Currently, the view retrieves and imports the user’s contacts itself, making the user wait until the operation has completed to see the returned page. What if we could implement this in a way that returns control to the user immediately, firing off run_contacts_import in another thread, another process, or even another server?

To do this we will need Django, multiprocessing and Python 2.5+. I will assume you have already gotten as far as Fig 1. Say a user with 10,000 contacts (or even 100) ambles along and imports their contacts. Either way, that user is going to be waiting much longer than the average Gmail-generation web user can be expected to wait. They will hit refresh, restarting the process and eventually crashing your server.

Since the Django request object is pickleable, we can pass it as an argument to another process using multiprocessing.

Fig 2: project/views.py

from multiprocessing import Pool

from django.http import HttpResponseRedirect
from django.shortcuts import render_to_response

m = Pool(10)

def import_contacts(request):
    # run_contacts_import will have to update request.session['CONTACT_IMPORT_STATUS']
    has_status = 'CONTACT_IMPORT_STATUS' in request.session
    status = has_status and request.session['CONTACT_IMPORT_STATUS'] == True
    if status is True:
        return HttpResponseRedirect('/import-finished/')
    elif not status and has_status:
        return render_to_response('t/contacts-import-not-finished.html', {})
    elif not has_status:
        from my.specialsauce.contacts import run_contacts_import
        m.apply_async(run_contacts_import, args=(request,))
        # the contacts aren't available yet, so render the page without them
        return render_to_response('t/contacts-import.html', {})

This seems simple enough and, upon first inspection, seems to work rather decently. However, this adds a global variable to our Django process, which is a bad idea. Using this technique, it would be best to create a lock, spawn off a new thread, and acquire/release the lock while running apply_async (blegh). Unfortunately, this can eventually make your Django server unresponsive as many users come to run the expensive process (simply via system load). We need to move this process, and possibly a few others, off our web server.

To do this, the Python multiprocessing package provides the high-level Manager class. The Manager class is used to pipe commands between managers in different instances of the python interpreter.
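Before building the real thing, it may help to see the whole round trip in one small script. The sketch below is written for Python 3 (where the authkey must be bytes) and, purely for demonstration, runs the server in a background thread of the same process instead of a separate daemon. The names `Importer`, `ServerManager`, `ClientManager` and `demo` are made up for this example.

```python
import threading
from multiprocessing.managers import BaseManager


class Importer(object):
    """The shared object; its methods run on the server side."""

    def __init__(self):
        self.jobs = []

    def run_import(self, name):
        self.jobs.append(name)
        return len(self.jobs)


importer = Importer()


class ServerManager(BaseManager):
    pass


class ClientManager(BaseManager):
    pass


# The server registers a callable that returns the shared object;
# the client only registers the name so it knows what it may ask for.
ServerManager.register('get_importer', callable=lambda: importer)
ClientManager.register('get_importer')


def demo():
    # Port 0 asks the OS for any free port; server.address reports it.
    manager = ServerManager(address=('127.0.0.1', 0), authkey=b'secret')
    server = manager.get_server()
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()

    client = ClientManager(address=server.address, authkey=b'secret')
    client.connect()
    proxy = client.get_importer()      # a proxy for the server's object
    return proxy.run_import('alice')   # executed by the server
```

The call on the proxy is sent over the socket and executed against the server's `importer` instance, which is the mechanism the rest of this article relies on.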

Create the Manager Class

First, we need to create the Manager class that will eventually be implemented as our server. Create a file called managers.py in project/managers.py.

Fig 3: project/managers.py

from multiprocessing.managers import BaseManager
 
from my.specialsauce.contacts import run_contacts_import
 
# This is the shared object. It simply implements a function which applies to the instance variable
# pool, which will be installed when the server is actually started (in a snippet below)
class ContactImporter(object):
    def __init__(self, pool=None):
        self.pool = pool
 
    def run_import(self, *args):
        self.pool.apply_async(run_contacts_import, args=args)
 
# instantiate the shared object
importer = ContactImporter()
 
# our manager
class CManager(BaseManager):
    pass
 
# you need to register a function which will return the shared object
# the shared object could be anything pickleable in python, in this case
# our new-style class descending from object
CManager.register('get_importer', callable=lambda:importer)

Create the Server

We are going to make a daemon server that we can start and stop using your project’s manage.py file. To do this, create the file project/management/commands/lrpserver.py.

Fig 4: project/management/commands/lrpserver.py

import os
import sys
import time
from optparse import make_option
from signal import SIGTERM
 
from multiprocessing import Pool
 
from django.core.management.base import BaseCommand
from django.conf import settings
 
from project.managers import CManager, importer
 
 
class Command(BaseCommand):
    option_list = BaseCommand.option_list + (
        make_option('--pidfile', dest='pidfile', default='', help="Specifies the PID file to use."),
        make_option('--start', dest='start', default='no', help='Starts the daemon'),
        make_option('--stop', dest='stop', default='no', help='Stops the daemon'),
        make_option('--restart', dest='restart', default='no', help='Restarts the daemon')
    )
    def handle(self, *args, **options):
        pidfile = options.get('pidfile')
        if not pidfile:
            print "--pidfile arg required"
            sys.exit(1)
        start = options.get('start')
        stop = options.get('stop')
        restart = options.get('restart')
        server = getattr(settings, 'LRP_SERVER_HOST', None)
        port = getattr(settings, 'LRP_SERVER_PORT', None)
        authkey = getattr(settings, 'LRP_SERVER_AUTHKEY', None)
 
        if server is None or port is None or authkey is None:
            print 'LRP_SERVER_HOST, LRP_SERVER_PORT and LRP_SERVER_AUTHKEY values required in settings file.'
            sys.exit(1)
 
        # This daemonizes our process and starts the server
        def _start():
            print 'Starting Long Running Process Server'
            from django.utils.daemonize import become_daemon
            become_daemon()
            fp = open(pidfile, "w")
            fp.write('%d\n' % os.getpid())
            fp.close()
 
            # set the pool that importer.run_import requires
            importer.pool = Pool(processes=10)
            # create an instance of CManager, get a server instance and start the server loop
            m = CManager(address=(server, port), authkey=authkey)
            s = m.get_server()
            s.serve_forever()
 
        def _stop():
            try:
                fp = open(pidfile, 'r')
                pid = int(fp.read().strip())
                fp.close()
            except IOError:
                pid = None
            if not pid:
                print 'Long Running Process Server Not Currently Running'
                return
            try:
                print 'Stopping Long Running Process Server'
                while 1:
                    os.kill(pid, SIGTERM)
                    time.sleep(0.1)
            except OSError, err:
                err = str(err)
                if err.lower().find('no such process') > 0:
                    if os.path.exists(pidfile):
                        os.remove(pidfile)
                else:
                    print err
                    sys.exit(1)
 
        def _restart():
            _stop()
            _start()
 
        if str(start) == 'yes':
            return _start()
        elif str(stop) == 'yes':
            return _stop()
        elif str(restart) == 'yes':
            return _restart()
        else:
            print 'Options: pidfile=%s start=%s stop=%s restart=%s' % (pidfile, start, stop, restart)
            print 'One of --(start|stop|restart)=yes is required.'
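The stop logic above hinges on two small operations: reading a PID back from the pidfile and finding out whether that process still exists. As a sketch for Python 3 on POSIX systems only (signal 0 behaves differently on Windows), with hypothetical helper names `read_pid` and `is_running`:

```python
import errno
import os


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if missing or malformed."""
    try:
        with open(pidfile) as fp:
            return int(fp.read().strip())
    except (IOError, ValueError):
        return None


def is_running(pid):
    """Check for a live process without disturbing it (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True
```

A status option for the command could be built from these two helpers without sending SIGTERM at all.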

Add the Required Settings to Your settings.py File

The server and client require three settings: the host name of your server, the port on which the server is listening, and the auth key required to access the server. In your settings.py file, add the following with your own values. For now, use the empty string '' as the host name. This instructs the server to serve from localhost. (Hint: if you do not know your host name, use socket.getfqdn().)

Fig 5: settings.py

LRP_SERVER_HOST = ''
LRP_SERVER_PORT = 5667
LRP_SERVER_AUTHKEY = 'password'

Modify Your View Code

To connect to the server, you must spawn a thread which creates an instance of the manager, connects and sends the command.

Fig 6: project/views.py

from threading import Thread
from multiprocessing.managers import BaseManager
 
from django.conf import settings
from django.http import HttpResponseRedirect
from django.shortcuts import render_to_response
 
 
class CManager(BaseManager):
    pass
 
 
CManager.register('get_importer')
 
# This function will become a thread which instantiates the Manager class,
# connects and starts the long running process
def _start_import_thread(request):
    gci_manager = CManager(address=(settings.LRP_SERVER_HOST, settings.LRP_SERVER_PORT),
                           authkey=settings.LRP_SERVER_AUTHKEY)
    gci_manager.connect()
    importer = gci_manager.get_importer()
    importer.run_import(request)
 
# This is not a view, it's a helper function which starts
# the thread above
def start_contacts_import(request):
    t = Thread(target=_start_import_thread, args=[request])
    t.setDaemon(True)
    t.start()
 
def import_contacts(request):
    # run_contacts_import will have to update request.session['CONTACT_IMPORT_STATUS']
    has_status = 'CONTACT_IMPORT_STATUS' in request.session
    status = has_status and request.session['CONTACT_IMPORT_STATUS'] == True
    if status is True:
        return HttpResponseRedirect('/import-finished/')
    elif not status and has_status:
        return render_to_response('t/contacts-import-not-finished.html', {})
    elif not has_status:
        start_contacts_import(request)
        # the contacts aren't available yet, so render the page without them
        return render_to_response('t/contacts-import.html', {})

Start the Server and Let it Fly!

And finally we may start the server.

python manage.py lrpserver --pidfile=/home/django/run/lrp.pid --start=yes

Now you can run your view.

Follow-Up

This process is applicable to more than just crawlers and contact importers. It can be used to minimize the load on your server for any number of tasks, such as image and other file processing or running large database updates. The server can be put on another machine simply by duplicating your Django installation there and using that machine’s socket.getfqdn() value (or IP address) as the host name in your configs. You can also add a JavaScript status updater to notify the user of progress information related to their request.
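A status updater of that kind needs something to poll. As a rough Python 3 sketch, the three states used by the import_contacts view can be reduced to one function over the session. `import_status` and `status_payload` are hypothetical helpers; a real endpoint would wrap them in a Django view and return the payload as the response body.

```python
import json


def import_status(session):
    """Map the CONTACT_IMPORT_STATUS session key onto a status string."""
    if 'CONTACT_IMPORT_STATUS' not in session:
        return 'not_started'
    if session['CONTACT_IMPORT_STATUS'] == True:
        return 'finished'
    return 'running'


def status_payload(session):
    """JSON body a polling script could consume."""
    return json.dumps({'status': import_status(session)})
```

The function only needs dictionary-style access, so it can be exercised with a plain dict in tests.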